From f464fe464a41b2449900232b15c1a5b257dfead2 Mon Sep 17 00:00:00 2001 From: Francois NOUAILLE Date: Thu, 4 Jun 2020 13:01:49 +0000 Subject: [PATCH 1/5] Change doxygen doc folder and add a new dev focused doc. --- .gitignore | 10 + Makefile | 8 +- doc/BRICK_CONCEPT.md | 95 +++++++++ doc/PG_GENERAL_CONCEPT.md | 30 +++ doc/README.md | 15 ++ doc/RXTX.md | 21 ++ doc/VHOST.md | 38 ++++ doc/VTEP.md | 42 ++++ doxygen_build/README2.md | 184 ++++++++++++++++++ {doc => doxygen_build}/check_error.sh | 0 {doc => doxygen_build}/contrib.md | 0 .../deploy_documentation.sh | 0 {doc => doxygen_build}/doxygen.conf.template | 2 +- {doc => doxygen_build}/packetgraph_doc.png | Bin {doc => doxygen_build}/sed_readme.sh | 0 15 files changed, 440 insertions(+), 5 deletions(-) create mode 100644 doc/BRICK_CONCEPT.md create mode 100644 doc/PG_GENERAL_CONCEPT.md create mode 100644 doc/README.md create mode 100644 doc/RXTX.md create mode 100644 doc/VHOST.md create mode 100644 doc/VTEP.md create mode 100644 doxygen_build/README2.md rename {doc => doxygen_build}/check_error.sh (100%) rename {doc => doxygen_build}/contrib.md (100%) rename {doc => doxygen_build}/deploy_documentation.sh (100%) rename {doc => doxygen_build}/doxygen.conf.template (99%) rename {doc => doxygen_build}/packetgraph_doc.png (100%) rename {doc => doxygen_build}/sed_readme.sh (100%) diff --git a/.gitignore b/.gitignore index 808c2a81b..cc852edeb 100644 --- a/.gitignore +++ b/.gitignore @@ -38,3 +38,13 @@ html !*.h example-* dpdk_symbols_autogen.h +tests/**/*.gcda +tests/**/*.gcno +src/**/*.gcda +src/**/*.gcno +src/npf +libpacketgraph-dev.so.17.5.0 +libpacketgraph.so.17.5.0 +packetgraph_coverage/ +coverage.xml + diff --git a/Makefile b/Makefile index 84e1b1822..5f7749729 100644 --- a/Makefile +++ b/Makefile @@ -92,14 +92,14 @@ $(PG_OBJECTS) : src/%.o : src/%.c $(PG_dev_OBJECTS): src/%-dev.o : src/%.c $(CC) -c $(PG_dev_CFLAGS) $(PG_HEADERS) $< -o $@ -doxygen.conf: $(srcdir)/doc/doxygen.conf.template +doxygen.conf: 
$(srcdir)/doxygen_build/doxygen.conf.template $(shell sed "s|PG_SRC_PATH|$(srcdir)|g" $< > $@) doc: doxygen.conf - $(srcdir)/doc/sed_readme.sh + $(srcdir)/doxygen_build/sed_readme.sh doxygen $^ - $(srcdir)/doc/check_error.sh - $(srcdir)/doc/deploy_documentation.sh + $(srcdir)/doxygen_build/check_error.sh + $(srcdir)/doxygen_build/deploy_documentation.sh style: $(srcdir)/tests/style/test.sh $(srcdir) diff --git a/doc/BRICK_CONCEPT.md b/doc/BRICK_CONCEPT.md new file mode 100644 index 000000000..c5a160818 --- /dev/null +++ b/doc/BRICK_CONCEPT.md @@ -0,0 +1,95 @@ +# Packetgraph's brick concept. + +## Overview. + +Each brick has 2 sides: East and West (except for monopole bricks, which have only one).
+Each side can have 0 or more edges (except for dipole bricks, which have one edge per side).
+Edges are stored in a brick's sides, and each edge points to another brick.
+So to create a link we need 2 edges: one from the first brick pointing to the second, and one from the second pointing back to the first.
+Note: the side notion is for the packet's source, because it goes directly to a brick.
+
+A basic dipole brick schema:
+``` + | | + edge 0 <---| +---------+ |--->edge 0 + edge ... <---|-| BRICK |-|--->edge ... + edge n <---| +---------+ |--->edge n + | | + | | + | | + | | + WEST SIDE EAST SIDE +``` +And now 2 basic bricks linked together:
+``` + +-----B-West to A----+ +---A-East to B-------+ + | | | | + | V | | | | V | + edge 0 <---| +---------+ |--->edge 0-------+ edge 0 <---| +---------+ |--->edge 0 + edge ... <---|-| BRICK A |-|--->edge ... | edge ...<---|-| BRICK B |-|--->edge ... + edge n <---| +---------+ |--->edge n +------edge n <---| +---------+ |--->edge n + | | | | + | | | | + | | | | + | | | | + WEST SIDE EAST SIDE WEST SIDE EAST SIDE +``` +
+Why have sides?
+Because it makes it easier to perform operations between two sides, such as acting as a diode or a filter.
+ +## How monopole/single edge brick works: +### Single edge: +As the following content shows it, `edge` and `edges` are in an `union` so basically one side can have `edge` OR `edges`. +``` +struct pg_brick_side { + [...] + /* Edges is use by multipoles bricks, + * and edge by dipole and monopole bricks + */ + union { + struct pg_brick_edge *edges; /* edges */ + struct pg_brick_edge edge; + }; +}; +``` +### Single side: +As the following content shows it, `side` and `sides` are in an `union` so basically one brick can have `side` OR `sides`. +``` +struct pg_brick { + [...] + union { + struct pg_brick_side sides[PG_MAX_SIDE]; + struct pg_brick_side side; + }; +}; +``` + +## Brick's common packet operations. + +Packets go through bricks via bursts. Bursts are started only from the inputs/outputs of the graph (IO bricks: VHOST, NIC, RXTX, VTEP) during a graph poll, so polling bricks that are not IO bricks makes no sense.
+ +``` +struct pg_brick { + [...] + /* Accept a packet burst */ + int (*burst)(struct pg_brick *brick, enum pg_side from, + uint16_t edge_index, struct rte_mbuf **pkts, + uint64_t pkts_mask, struct pg_error **errp); + /* polling */ + int (*poll)(struct pg_brick *brick, + uint16_t *count, struct pg_error **errp); + [...] +}; +``` + +According to the `pg_brick` structure, we have two methods dealing with packets: + +* `int burst([...]);`
+ Called to burst packets through a side and an edge to another brick, which receives them through its opposite side.
+ Example: If I burst from brick A through its East side and edge 1 (pointing to brick B),
brick B will receive the packets through its West side.
+ Bursts do not move packets! A burst passes the packets' addresses in the hugepage to the next brick's burst (each burst calls the next brick's burst until the packets leave the graph via an IO brick). +* `int poll([...]);`
+ Called during a graph poll for an input brick. Most of the time it will burst packets received/created through the graph. + + diff --git a/doc/PG_GENERAL_CONCEPT.md b/doc/PG_GENERAL_CONCEPT.md new file mode 100644 index 000000000..b34fb7b1f --- /dev/null +++ b/doc/PG_GENERAL_CONCEPT.md @@ -0,0 +1,30 @@ +# Packetgraph's general concept +## Introduction +Outscale's packetgraph is a solution to link virtual machines, over a network, with each other and/or with the outside world.
+It aims to do this fast, and for this purpose it uses [DPDK](https://www.dpdk.org/), the Data Plane Development Kit.
+The core idea is to avoid "moving" packets, which costs a lot of memory and time, so we allocate them once and for all.
+.
+ + +``` + + The outer +---Host Machine---------------------------------------------------------------------+ + World | | + | +--The GRAPH--------------------------------------+ | + | | | | + | | +-------+ +---------+ | + | | +---------------------------->| VHOST |<------------>| VM | | + | | | +-------+ +---------+ | + | | v | | + +---------+ +---------+ +-------+ +---------+ | + <-------->| NIC |<------>|Switch |<---------------------->| VHOST |<------------>| VM | | + +---------+ +---------+ +-------+ +---------+ | + | | ^ | | + | | | +--------------+ +-------+ +---------+ | + | | +----->| Firewall |<----->| VHOST |<------------>| VM | | + | | +--------------+ +-------+ +---------+ | + | | | | + | +-------------------------------------------------- | + | | + +------------------------------------------------------------------------------------+ +``` diff --git a/doc/README.md b/doc/README.md new file mode 100644 index 000000000..512eacc2d --- /dev/null +++ b/doc/README.md @@ -0,0 +1,15 @@ +# DOCUMENTATION + +Here is a documentation aiming at providing detailed information about Packetgraph's brick concept, about implemented technologies/features (with standards descriptions) and about each brick. The idea is to explain what's the purpose of each component, further optimizations and choices made.
+
+An overview of the general concept of packetgraph: +* [General concept.](PG_GENERAL_CONCEPT.md) + +Detailed brick linking information and schemas are available here: +* [Packetgraph's brick concept.](BRICK_CONCEPT.md) + +For brick-specific information and schemas: +* [VHOST brick.](VHOST.md) +* [RXTX brick.](RXTX.md) +* [VTEP brick.](VTEP.md) + diff --git a/doc/RXTX.md b/doc/RXTX.md new file mode 100644 index 000000000..f913dd0f1 --- /dev/null +++ b/doc/RXTX.md @@ -0,0 +1,21 @@ +# RXTX Brick + +## Introduction + +The RXTX brick is a monopole single-edge brick.
+It is intended mainly for testing and benchmarking purposes.
+We use it to create packets and send them through a graph and/or receive them.
+ +## Usage + +Here is the constructor: +``` +struct pg_brick *pg_rxtx_new(const char *name, + pg_rxtx_rx_callback_t rx, + pg_rxtx_tx_callback_t tx, + void *private_data) +``` +As we can see, we give it two callbacks as parameters: + +* `pg_rxtx_rx_callback_t rx`: The method that will be used to send packets. +* `pg_rxtx_tx_callback_t tx`: The method that will be called when the brick receives packets. diff --git a/doc/VHOST.md b/doc/VHOST.md new file mode 100644 index 000000000..0606f6605 --- /dev/null +++ b/doc/VHOST.md @@ -0,0 +1,38 @@ +# VHOST Brick + +## Introduction + +The VHOST brick is the brick used to make the graph communicate with VMs.
+The problem while communicating with VMs via "standard way" is that it's really slow.
+So here we use the virtio protocol implemented as vhost in DPDK.
+ + +## VHOST overview + ``` ++---------------------------+ +| | +| +-------------+ | +-------------+ +| | Graph's side| | | Host's side | +| +-------------+ | +-------------+ +| | +| | | +| | +--------+ +-------------+ +---------------+ +| edge <---|-| VHOST |<------------>| UNIX SOCKET |<----------->|Virtual-Machine| +| | +--------+ +-------------+ +---------------+ +| | | ^ ^ +| Side | | | +| ^ ^ ^ ^ ^ ^ ^ ^ | | | ++-|-|-|-|-|-|-|-|-----------+ | | ++-|-|-|-|-|-|-|-|--------------|-------------------------------------------------------|------+ +| v v v v v v v v v v | +| Host's shared memmory, aka hugepage, containing packets. | +| | ++---------------------------------------------------------------------------------------------+ +``` +As previously described, VHOST use an unix socket and a hugepage to communicate via ip.
+It manages a queue and reduces memory write/free operations.
+It's based on a client(s)/server model, meaning that one server can handle multiple connections through the socket.
Only the addresses of packets in the hugepage flow through the socket.
+ +## Current VHOST brick's status + +Currently the VHOST brick only works in SERVER mode... Which means that if packetgraph crash, we will need to reboot VMs...
Not a good thing! diff --git a/doc/VTEP.md b/doc/VTEP.md new file mode 100644 index 000000000..766ad951d --- /dev/null +++ b/doc/VTEP.md @@ -0,0 +1,42 @@ + # VTEP Brick + + ## Introduction + + The VTEP brick is intended to allow us to use the VXLAN protocol.
+ The VXLAN protocol allows us to create tunnels between multiple LANs through another network.
+ + ## VXLAN protocol + + The VXLAN (Virtual Extensible LAN) protocol works by encapsulating packets before sending them through another network.
+ It encapsulates OSI layer 2 (Ethernet) frames, carrying each one inside a UDP datagram. + + ## Usage + + Our typical use case is making tunnels between virtual networks (or packetgraph's graphs) through the host's network via a NIC brick.
+ When creating the brick, we tell it which side is the output (the tunnel's network), either East or West.
+ This brick has an empty poll function because the only way to make packets go through it is by bursting from an input/output of the graph.
+ + Example use case: making a tunnel between virtual networks 1 and 2 through the host's network.
+ (NIC bricks are described in the [NIC section](NIC.md))
+ For a better link description between bricks, see [packetgraph's brick concept.](BRICK_CONCEPT.md)
+ ``` + Virtual Network 1 | Host's network | Virtual Network 2 + | | + | | + | | | | | | + Virtual Network 1 <---| +--------+ | +---------+ +---------+ | +--------+ |---> Virtual Network 1 + Virtual Network ... <---|-| VTEP |-|------>| NIC |----------| NIC |<------|-| VTEP |-|---> Virtual Network ... + Virtual Network n <---| +--------+ | +---------+ +---------+ | +--------+ |---> Virtual Network n + | | | | | | + West Side East Side | | West Side East Side + | | + | | + + + ``` + Using the previous schema, VN1 and VN2 behave as if they were the same network.
+ + # Output redirection + The VTEP brick features its own switch. It is based on the VNI (Virtual Network Identifier) of each packet.
+ One VTEP can lead to multiple virtual networks, each one identified by its own VNI.
+ On a virtual network, there could be any kind of packetgraph brick, even another VTEP if we want to encapsulate the already encapsulated network one more time.
diff --git a/doxygen_build/README2.md b/doxygen_build/README2.md new file mode 100644 index 000000000..2e1863858 --- /dev/null +++ b/doxygen_build/README2.md @@ -0,0 +1,184 @@ +# Packetgraph ? + +Packetgraph is a library aiming to give the user a tool to build networks graph easily, It's built upon the fast [DPDK library](http://dpdk.org/). + +\htmlonly \endhtmlonly + +The goal of this library is to provide a really EASY interface to +build you own DPDK based application using [Network Function Virtualization](https://en.wikipedia.org/wiki/Network_function_virtualization) +Everyone is free to use this library to build up their own network application. + +Once you have created and connected all bricks in you network graph, +some bricks will be able to poll a burst of packets (max 64 packets) +and let the burst propagate in you graph. + +Connections between bricks don't store any packets and each burst will +propagate in the graph without any copy. + +Each graph run on one core but you can connect different graph using +Queue bricks (which are thread safe). For example, a graph can be +split on demand to be run on different core or even merged. + +If you want a graphical representation of a graph, you can generate a [dot](https://en.wikipedia.org/wiki/DOT_%28graph_description_language%29) output. 
+ +![Packetgraph features](https://osu.eu-west-2.outscale.com/jerome.jutteau/16d1bc0517de5c95aa076a0584b43af6/packetgraph_features.png "packetgraph features") + +# Available bricks (ipv4/ipv6): + +- switch: a layer 2 switch +- rxtx: setup your own callbacks to get and sent packets +- tap: classic kernel virtual interface +- vhost: allow to connect a vhost NIC to a virtual machine (virtio based) +- firewall: allow traffic filtering passing through it (based on [NPF](https://github.com/rmind/npf)) +- diode: only let packets pass in one direction +- hub: act as a hub device, passing packets to all connected bricks +- nic: allow passing packets to a NIC of the system (accelerated by DPDK) +- print: a basic print brick to show packets flowing through it +- antispoof: a basic mac checking, arp anti-spoofing and ipv6 neighbor discovery anti-spoofing +- vtep: VXLAN Virtual Terminal End Point switching packets on virtual LANs, can encapsulate packets over ipv4 or ipv6 +- queue: temporally store packets between graph +- pmtud(ipv4 only): Path MTU Discovery is an implementation of [RFC 1191](https://tools.ietf.org/html/rfc1191) +- user-dipole: setup your own callback in a dipole brick, to filter or implement your own protocol + +A lot of other bricks can be created, check our [wall](https://github.com/outscale/packetgraph/issues?q=is%3Aopen+is%3Aissue+label%3Awall) ;) + +# How should I use Packetgraph ? 
+ +Code Documentation: [doxygen link](https://outscale.github.io/packetgraph/doc/master) + +![Packetgraph usage flow](https://osu.eu-west-2.outscale.com/jerome.jutteau/16d1bc0517de5c95aa076a0584b43af6/packetgraph_flow.png "packetgraph usage flow") + +# Examples + +To build and run examples, you may first check how to build Packetgraph below and adjust your configure command before make: +``` +$ ./configure --with-examples +$ make +``` + +To run a specific example, check run scripts in tests directories: +``` +$ ./examples/switch/run_vhost.sh +$ ./examples/switch/run.sh +$ ./examples/firewall/run.sh +$ ./examples/rxtx/run.sh +$ ./examples/dperf/run.sh +... +``` + +# Building + +You will need to build DPDK before building Packetgraph. + +## Install needed tools + +You may adapt this depending on your Linux distribution: + +Ubuntu +``` +$ sudo apt-get install libpcap-dev libglib2.0-dev libjemalloc-dev libnuma-dev openssl +``` +CentOS +``` +$ sudo yum install -y glibc-devel glib2-devel libpcap-devel git wget numactl numactl-devel openssl-devel clang +$ wget http://cbs.centos.org/kojifiles/packages/jemalloc/3.6.0/8.el7.centos/x86_64/jemalloc-devel-3.6.0-8.el7.centos.x86_64.rpm +$ wget http://cbs.centos.org/kojifiles/packages/jemalloc/3.6.0/8.el7.centos/x86_64/jemalloc-3.6.0-8.el7.centos.x86_64.rpm +$ sudo rpm -i jemalloc-devel-3.6.0-8.el7.centos.x86_64.rpm jemalloc-3.6.0-8.el7.centos.x86_64.rpm +``` + +## Build DPDK + +``` +$ git clone http://dpdk.org/git/dpdk +$ cd dpdk +$ git checkout -b v19.02 v19.02 +$ make config T=x86_64-native-linuxapp-gcc +``` + +Note: use `T=x86_64-native-linuxapp-clang` to build with clang + +Edit build/.config and be sure to set the following parameters to 'y': +- CONFIG_RTE_LIBRTE_PMD_PCAP + +If you don't want to use some special PMD in DPDK requiring kernel headers, +you will have to set the following parameters to 'n': +- CONFIG_RTE_EAL_IGB_UIO +- CONFIG_RTE_KNI_KMOD + +Once your .config file is ready, you can now build dpdk as follows: + 
+``` +$ make EXTRA_CFLAGS='-fPIC' +``` + +Finally, set `RTE_SDK` environment variable: +``` +$ export RTE_SDK=$(pwd) +``` + +## Build packetgraph +``` +$ git clone https://github.com/outscale/packetgraph.git +$ cd packetgraph +$ git submodule update --init +$ ./configure +$ make +$ make install +``` + +Note: to build with clang, you can use `./configure CC=clang` + +Note 2: You need a compiler that support C11 (gcc 4.9 or superior, or clang 3.4 or superior). + +## Configure Huge Pages + +Packetgraph uses some [huge pages](https://en.wikipedia.org/wiki/Page_%28computer_memory%29#Huge_pages) +(adjust to your needs): + +- Edit your `/etc/sysctl.conf` and add some huge pages: +``` +vm.nr_hugepages=2000 +``` +- Reload your sysctl configuration: +``` +$ sudo sysctl -p /etc/sysctl.conf +``` +- Check that your huge pages are available: +``` +$ cat /proc/meminfo | grep Huge +``` +- Mount your huge pages: +``` +$ sudo mkdir -p /mnt/huge +$ sudo mount -t hugetlbfs nodev /mnt/huge +``` +- (optional) Add this mount in your `/etc/fstab`: +``` +hugetlbfs /mnt/huge hugetlbfs rw,mode=0777 0 0 +``` +# Compille Time Optimisation: + +-DPG_BRICK_NO_ATOMIC_COUNT: do not use atomic variable to count packets, if you do so, you must call `pg_brick_pkts_count_get` in the same thread you use to poll packets +-DPG_VHOST_FASTER_YET_BROKEN_POLL: change the way vhost lock the queue so it spend less time locking/unlocking the queue, but can easily deadlock if badly use. + +# Compille Time Option +-DTAP_IGNORE_ERROR: when packetgrapg can't burst or poll a tap, it return 0 instead of returning an error + +# Licence + +Packetgraph project is published under [GNU GPLv3](http://www.gnu.org/licenses/quick-guide-gplv3.en.html). +For more information, check LICENSE file. + +# Contribute + +New to packetgraph ? Want to contribute and/or create a new brick ? Some +[developer guidelines](https://github.com/outscale/packetgraph/blob/master/doc/contrib.md/) are available. + +# Question ? Contact us ! 
+ +Packetgraph is an open-source project, feel free to [chat with us on IRC](https://webchat.freenode.net/?channels=betterfly&nick=packetgraph_user) + +> server: irc.freenode.org + +> chan: \#betterfly + diff --git a/doc/check_error.sh b/doxygen_build/check_error.sh similarity index 100% rename from doc/check_error.sh rename to doxygen_build/check_error.sh diff --git a/doc/contrib.md b/doxygen_build/contrib.md similarity index 100% rename from doc/contrib.md rename to doxygen_build/contrib.md diff --git a/doc/deploy_documentation.sh b/doxygen_build/deploy_documentation.sh similarity index 100% rename from doc/deploy_documentation.sh rename to doxygen_build/deploy_documentation.sh diff --git a/doc/doxygen.conf.template b/doxygen_build/doxygen.conf.template similarity index 99% rename from doc/doxygen.conf.template rename to doxygen_build/doxygen.conf.template index f31ed03b4..2a650baec 100644 --- a/doc/doxygen.conf.template +++ b/doxygen_build/doxygen.conf.template @@ -58,7 +58,7 @@ PROJECT_LOGO = # entered, it will be relative to the location where doxygen was started. If # left blank the current directory will be used. -OUTPUT_DIRECTORY = PG_SRC_PATH/doc +OUTPUT_DIRECTORY = PG_SRC_PATH/doxygen_build # If the CREATE_SUBDIRS tag is set to YES then doxygen will create 4096 sub- # directories (in 2 levels) under the output directory of each output format and diff --git a/doc/packetgraph_doc.png b/doxygen_build/packetgraph_doc.png similarity index 100% rename from doc/packetgraph_doc.png rename to doxygen_build/packetgraph_doc.png diff --git a/doc/sed_readme.sh b/doxygen_build/sed_readme.sh similarity index 100% rename from doc/sed_readme.sh rename to doxygen_build/sed_readme.sh From b2a71cc039da43f2e17d9c1e0f364677b1a9d214 Mon Sep 17 00:00:00 2001 From: Francois NOUAILLE Date: Thu, 4 Jun 2020 13:41:20 +0000 Subject: [PATCH 2/5] doc: add a SWITCH.md documentation. 
--- doc/README.md | 1 + doc/SWITCH.md | 3 +++ 2 files changed, 4 insertions(+) create mode 100644 doc/SWITCH.md diff --git a/doc/README.md b/doc/README.md index 512eacc2d..8ff5cf86d 100644 --- a/doc/README.md +++ b/doc/README.md @@ -12,4 +12,5 @@ For specific brick'informations and shemas: * [VHOST brick.](VHOST.md) * [RXTX brick.](RXTX.md) * [VTEP brick.](VTEP.md) +* [SWITCH brick.](SWITCH.md) diff --git a/doc/SWITCH.md b/doc/SWITCH.md new file mode 100644 index 000000000..725952c88 --- /dev/null +++ b/doc/SWITCH.md @@ -0,0 +1,3 @@ +# SWITCH Brick + +## Introduction From 9fe9bdff40c5ea7f2c4f85a6cf0cb51d11bc9120 Mon Sep 17 00:00:00 2001 From: Francois NOUAILLE Date: Thu, 4 Jun 2020 13:51:56 +0000 Subject: [PATCH 3/5] wip --- doc/BRICK_CONCEPT.md | 20 ++++++++++++++++++++ doc/README.md | 4 ++++ doc/SWITCH.md | 44 ++++++++++++++++++++++++++++++++++++++++++++ doc/VHOST.md | 12 +++++++++++- 4 files changed, 79 insertions(+), 1 deletion(-) diff --git a/doc/BRICK_CONCEPT.md b/doc/BRICK_CONCEPT.md index c5a160818..0e06cd263 100644 --- a/doc/BRICK_CONCEPT.md +++ b/doc/BRICK_CONCEPT.md @@ -38,6 +38,26 @@ And now 2 basic bricks linked together:
 Why have sides?
 Because it makes it easier to perform operations between two sides, such as acting as a diode or a filter.
+### Warning! + +While creating links, make sure that there are not 2 bricks modifying packets (VXLAN, VTEP) on the same side!
+Here is why:
+To improve performance, we do not copy packets, so if a brick modifies them, they are modified for all other bricks on this side.
+
+Example:
+We want to link some VMs to a VTEP. So we need a VHOST brick for each VM and a switch.
+The VTEP must NOT be on the same side as the VHOST bricks.
+Sides are decided by the order of the arguments of the function `pg_brick_link(BRICK_A, BRICK_B)`.
+So basically we will do this: + +* `pg_brick_link(SWITCH, VTEP);` +* `pg_brick_link(VHOST_0, SWITCH);` +* `pg_brick_link(VHOST_1, SWITCH);` +* `pg_brick_link(VHOST_2, SWITCH);` + +This way we are sure that VTEP and VHOST_n are not on the same side.
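The side rule in this example can be sketched with a toy model. This is only an illustration under stated assumptions: the `toy_*` names are hypothetical and are not packetgraph's API, and the model assumes that the first argument of a link call ends up on the west of the second one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy model, NOT packetgraph's real structures. */
enum toy_side { TOY_WEST = 0, TOY_EAST = 1 };

#define TOY_MAX_EDGES 8

struct toy_brick {
	const char *name;
	/* neighbours[side][i] is the i-th brick linked on that side */
	const struct toy_brick *neighbours[2][TOY_MAX_EDGES];
	int nb_edges[2];
};

/* Toy stand-in for a link call (assumption: first argument is the west
 * brick). 'west' gets an edge on its EAST side pointing to 'east', and
 * 'east' gets an edge on its WEST side pointing back to 'west'.
 * Returns 0 on success, -1 if one of the two sides is full. */
static int toy_link(struct toy_brick *west, struct toy_brick *east)
{
	assert(west != NULL && east != NULL);
	if (west->nb_edges[TOY_EAST] >= TOY_MAX_EDGES ||
	    east->nb_edges[TOY_WEST] >= TOY_MAX_EDGES)
		return -1;
	west->neighbours[TOY_EAST][west->nb_edges[TOY_EAST]++] = east;
	east->neighbours[TOY_WEST][east->nb_edges[TOY_WEST]++] = west;
	return 0;
}

/* Return 1 if 'other' is linked on the given side of 'brick', else 0. */
static int toy_is_on_side(const struct toy_brick *brick, enum toy_side side,
			  const struct toy_brick *other)
{
	for (int i = 0; i < brick->nb_edges[side]; i++)
		if (brick->neighbours[side][i] == other)
			return 1;
	return 0;
}
```

In this model, linking `(SWITCH, VTEP)` and then each `(VHOST_n, SWITCH)` leaves the VTEP alone on the switch's east side and every VHOST on its west side, which is the isolation the warning asks for.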
+If we cannot isolate the VTEP as we want, a NOT recommended alternative is to disable the `NOCOPY` flag. + ## How monopole/single edge brick works: ### Single edge: As the following content shows it, `edge` and `edges` are in an `union` so basically one side can have `edge` OR `edges`. diff --git a/doc/README.md b/doc/README.md index 8ff5cf86d..26a03c56a 100644 --- a/doc/README.md +++ b/doc/README.md @@ -1,6 +1,7 @@ # DOCUMENTATION Here is a documentation aiming at providing detailed information about Packetgraph's brick concept, about implemented technologies/features (with standards descriptions) and about each brick. The idea is to explain what's the purpose of each component, further optimizations and choices made.
+All this documentation must be written in ASCII so we can access it through a terminal.

 An overview of the general concept of packetgraph: * [General concept.](PG_GENERAL_CONCEPT.md) @@ -14,3 +15,6 @@ For specific brick'informations and shemas: * [VTEP brick.](VTEP.md) * [SWITCH brick.](SWITCH.md) +About our testing architecture: +* `wip` + diff --git a/doc/SWITCH.md b/doc/SWITCH.md index 725952c88..9135d6edc 100644 --- a/doc/SWITCH.md +++ b/doc/SWITCH.md @@ -1,3 +1,47 @@ # SWITCH Brick ## Introduction + +The switch brick does what a switch does in "real life", as described [here](https://en.wikipedia.org/wiki/Network_switch).
+The core feature is that when we receive an incoming packet trying to reach a MAC address, there are two cases: + +* We know on which interface we can find it, so we forward it directly through the right interface. + +* We don't know where the packet's destination is, so we broadcast it on all interfaces. Then, once we get the answer, we update the MAC TABLE linking MAC addresses with INTERFACES, so the next time we will know where to forward the packet. + +Another thing is the forget feature: if an entry in the MAC table hasn't been used for a long time, we forget it!
+However it is not working yet! + +## Basic example: connect 3 VMs to a NIC. + +``` + + The outer +---Host Machine---------------------------------------------------------------------+ + World | | + | +--The GRAPH--------------------------------------+ | + | | | | + | | +-------+ +---------+ | + | | | |<------------------>| VHOST |<------------>| VM | | + | | | | +-------+ +---------+ | + | | | | | | + +---------+ | +---------+ | +-------+ +---------+ | + <-------->| NIC |<-->|---|Switch |---|<------------------>| VHOST |<------------>| VM | | + +---------+ | +---------+ | +-------+ +---------+ | + | | | | | | + | | | | +-------+ +---------+ | + | | | |<------------------>| VHOST |<------------>| VM | | + | | WEST SIDE EAST SIDE +-------+ +---------+ | + | | | | + | +-------------------------------------------------- | + | | + +------------------------------------------------------------------------------------+ +``` + +Note: it's always a good practice to link to one side all "subnet" devices and to another the "upper" device (No matter if it's EAST or WEST!).
+Here is the reason:
+Basically, if we have a brick modifying packets, such as VTEP or VXLAN, we should isolate it on a side. To manage packets faster, we do not copy them, so be careful not to modify them! They would be modified for all bricks on this side.
+Please refer to the [warning section of the brick concept's overview](BRICK_CONCEPT.md) for more informations. + +## Let's go deeper into the MAC TABLE. + +wip diff --git a/doc/VHOST.md b/doc/VHOST.md index 0606f6605..b72797043 100644 --- a/doc/VHOST.md +++ b/doc/VHOST.md @@ -5,6 +5,7 @@ The VHOST brick is the brick used to make the graph communicate with VMs.
The problem while communicating with VMs via "standard way" is that it's really slow.
So here we use the virtio protocol implemented as vhost in DPDK.
+You can find a more detailed description here: https://www.redhat.com/en/blog/hands-vhost-user-warm-welcome-dpdk.
## VHOST overview @@ -33,6 +34,15 @@ As previously described, VHOST use an unix socket and a hugepage to communicate It manages a queue and reduce memmory write/free operatons.
It's based on a cient(s)/server model, meaning that one server can handle multiple connections through the socket.
Only packet address in the hugepage are flowing through the socket.
+## How to use it + +* `pg_vhost_start("/tmp", &error)`: start the vhost driver and setup the socket's folder. +* `pg_vhost_new("vhost-0", flags, &error);`: create the brick.
The socket will be named `qemu-vhost-0`.
Here are some flags availables: + * `PG_VHOST_USER_CLIENT`: means that the brick will be the client and the qemu the server. + * `PG_VHOST_USER_DEQUEUE_ZERO_COPY`: means that we will use zero copy. #FIXME: explain more. + * `PG_VHOST_USER_NO_RECONNECT`: disable reconnection after disconnection. + ## Current VHOST brick's status -Currently the VHOST brick only works in SERVER mode... Which means that if packetgraph crash, we will need to reboot VMs...
Not a good thing! +Currently the VHOST brick only works in SERVER mode... Which means that if packetgraph crash, we will need to reboot VMs...
Not a good thing!
+However, a PR in in progress to adress this issue.
From b493743975e552750063ae449545da35f77124e3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fran=C3=A7ois=20NOUAILLE=20DEGORCE?= Date: Tue, 8 Sep 2020 06:29:31 +0000 Subject: [PATCH 4/5] wip --- doc/SWITCH.md | 15 ++++++++++++++- 1 file changed, 14 insertions(+), 1 deletion(-) diff --git a/doc/SWITCH.md b/doc/SWITCH.md index 9135d6edc..2859b12a7 100644 --- a/doc/SWITCH.md +++ b/doc/SWITCH.md @@ -44,4 +44,17 @@ Please refer to the [warning section of the brick concept's overview](BRICK_CONC ## Let's go deeper into the MAC TABLE. -wip +from `src/utils/mac-table.h`: +``` +A mac array containing pointers or elements +the idea of this mac table, is that a mac is an unique identifier, +as sure, doesn't need hashing we could just +allocate an array for each possible mac +Problem is that doing so require ~280 TByte +So I've cut the mac in 2 part, example 01.02.03.04.05.06 +will now have "01.02.03" that will serve as index of the mac table +and "04.05.06" will serve as the index of the sub mac table +if order to take advantage of Virtual Memory, we use bitmask, so we +don't have to allocate 512 MB of physical ram for each unlucky mac. + +``` From 404a9fa8e836a8ee51a449d4f66942c02c9999f3 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Fran=C3=A7ois=20NOUAILLE=20DEGORCE?= <54851383+outscale-fne@users.noreply.github.com> Date: Thu, 24 Sep 2020 14:48:08 +0200 Subject: [PATCH 5/5] Update VHOST.md --- doc/VHOST.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/doc/VHOST.md b/doc/VHOST.md index b72797043..525449e18 100644 --- a/doc/VHOST.md +++ b/doc/VHOST.md @@ -26,7 +26,7 @@ You can find a more detailed description here: https://www.redhat.com/en/blog/ha +-|-|-|-|-|-|-|-|-----------+ | | +-|-|-|-|-|-|-|-|--------------|-------------------------------------------------------|------+ | v v v v v v v v v v | -| Host's shared memmory, aka hugepage, containing packets. | +| Host's hugepage which is a shared memmory, containing packets. 
| | | +---------------------------------------------------------------------------------------------+ ```
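
The MAC table comment quoted from `src/utils/mac-table.h` in patch 4/5 describes cutting a MAC address into two 3-byte halves. One entry per possible MAC would mean 2^48 (about 2.8e14) entries, which is consistent with the ~280 TByte figure at one byte each, whereas each half only addresses 2^24 = 16,777,216 entries. A minimal sketch of that split, assuming the bytes are taken in the order they are written; the `toy_*` names are hypothetical and the real implementation may lay things out differently:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of entries addressed by one 24-bit half of a MAC address. */
#define TOY_MAC_HALF_ENTRIES (1u << 24)

/* Hypothetical pair of indexes derived from one MAC address. */
struct toy_mac_index {
	uint32_t table_idx;     /* first three bytes, e.g. 01.02.03 */
	uint32_t sub_table_idx; /* last three bytes, e.g. 04.05.06 */
};

/* Split a 6-byte MAC (bytes in the order they are written) into the
 * index of the top-level table and the index inside the sub table. */
static struct toy_mac_index toy_mac_split(const uint8_t mac[6])
{
	struct toy_mac_index idx;

	assert(mac != NULL);
	idx.table_idx = ((uint32_t)mac[0] << 16) |
			((uint32_t)mac[1] << 8) | mac[2];
	idx.sub_table_idx = ((uint32_t)mac[3] << 16) |
			    ((uint32_t)mac[4] << 8) | mac[5];
	return idx;
}
```

With the example from the comment, `01.02.03.04.05.06` gives the table index `0x010203` and the sub table index `0x040506`.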