The OLSR network spider is used to obtain configuration information from the nodes of an OLSR network. The information is obtained by querying the web-interface of each OLSR device.
To get started, the spider needs the OLSR topology information of the network to spider. This can be obtained either from a dump in text format or via the network from a given URL.
The text file should be in the format generated by the OLSR textinfo
plugin::

    Table: Topology
    Dest. IP        Last hop IP     LQ      NLQ     Cost
    <IP-Address>    <IP-Address>    1.000   1.000   1.000

In the data above, the ``<IP-Address>`` entries are valid IPv4
addresses, and the link-quality values (LQ, NLQ, Cost) are floating
point numbers or the word ``INFINITE``.
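A consumer of such a dump can extract the Topology rows with a short parser. The following is a minimal sketch in Python; the function names are assumptions for illustration, not the spider's own code:

```python
import re

# Matches one data row of the "Topology" table:
# Dest. IP, Last hop IP, LQ, NLQ, Cost
_ROW = re.compile(
    r"^(\d{1,3}(?:\.\d{1,3}){3})\s+"   # Dest. IP
    r"(\d{1,3}(?:\.\d{1,3}){3})\s+"    # Last hop IP
    r"(INFINITE|[\d.]+)\s+"            # LQ
    r"(INFINITE|[\d.]+)\s+"            # NLQ
    r"(INFINITE|[\d.]+)\s*$"           # Cost
)

def parse_topology(text):
    """Yield (dest, lasthop, lq, nlq, cost) tuples from a textinfo dump.

    ``INFINITE`` is mapped to float('inf'); tables other than
    Topology are skipped, mirroring what the spider uses.
    """
    def num(s):
        return float("inf") if s == "INFINITE" else float(s)

    in_topology = False
    for line in text.splitlines():
        if line.startswith("Table: "):
            in_topology = (line.strip() == "Table: Topology")
            continue
        if in_topology:
            m = _ROW.match(line.strip())
            if m:
                dest, hop, lq, nlq, cost = m.groups()
                yield dest, hop, num(lq), num(nlq), num(cost)
```

Header lines and other tables fall through the regular expression and are ignored, so the whole textinfo output can be fed in unchanged.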
In addition, the text format may contain other OLSR tables: ``Links``,
``Neighbors``, ``HNA``, ``MID``, and ``Routes``. The spider currently
uses only the ``Topology`` table.
When obtaining the topology information from an OLSR device, the device has to run a version of Backfire. Only the top-level URL needs to be given; the spider knows how to find the OLSR topology information.
Warning: Spidering a network can consume huge amounts of bandwidth and other system resources. Only spider a network where you have obtained permission to do so. You should limit the number of requests and the number of parallel connections.
To try out spidering with a limited number of devices (rather than
spidering the whole network), you can limit the number of devices to be
spidered using the ``-n`` or ``--limit-nodes`` option.
The number of parallel spider processes is a critical parameter that
influences the load put on the spidered network; it is specified with
the ``-p`` or ``--processes`` option. Do not set this higher than the
default of 20 unless you have obtained permission from the network
administrators and you know what you're doing.
Not all nodes in a spidered network are up and reachable all the time.
For this reason we need a timeout that specifies the maximum time to
spend on a single IP address. The timeout is specified with the ``-t``
or ``--timeout`` option.
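A node that does not answer within the timeout should not stall the whole run. The following is a minimal sketch of fetching a node's top-level page with a per-IP timeout, using only the Python standard library; the function name is hypothetical and the real spider parses the page rather than returning raw bytes:

```python
import urllib.request

def spider_one(ip, timeout, port=80):
    """Fetch the top-level page of one node within ``timeout`` seconds.

    On failure the exception is returned instead of raised, so the
    caller can record it per node (as the spider does in its dump).
    """
    url = "http://%s:%d/" % (ip, port)
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except Exception as err:
        return err
```

Returning the exception instead of raising it lets a pool of workers continue with the remaining addresses while still preserving the failure cause.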
By default the spider tries to obtain the configuration information via
the default HTTP port 80. If you know certain nodes in the network
that run their web-interface on a non-standard port, the ``-i`` or
``--ip-port`` option will allow you to specify exceptions for those
nodes. The parameter to this option takes the form ``IP-Address:port``,
where IP-Address is the numeric IP address of the node and port is the
non-standard port in use.
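Parsing such an ``IP-Address:port`` argument and falling back to port 80 for all other nodes could look like the following sketch; the function names are assumptions, not the spider's own code:

```python
def parse_ip_port(arg):
    """Split an ``IP-Address:port`` option argument into (ip, port)."""
    ip, sep, port = arg.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError("expected IP-Address:port, got %r" % (arg,))
    return ip, int(port)

def port_for(ip, exceptions, default=80):
    """Look up the port for a node, falling back to the default of 80."""
    return exceptions.get(ip, default)
```

Collecting several ``--ip-port`` arguments into a dictionary with `parse_ip_port` makes `port_for` a one-line lookup per spidered node.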
The result of the spider run is saved in a Python-specific serialisation
format called pickle. The ``-d`` or ``--dump`` option specifies the
filename of the pickle dump.
In addition to the options that influence the spider run, you can
request verbose information using the ``-v`` or ``--verbose`` option
(more ``-v`` options increase verbosity) and turn on debug output with
the ``-D`` or ``--debug`` option.
The spider currently obtains the following information from a node in the network:
The type and version of software running on the node; if possible the version of the web-interface software is also obtained, e.g.::

    Freifunk Version: v1.6.37

or::

    Backfire Version: 0xFF-Backfire Vienna 4.1 (r1201) / Luci trunk+svn9042
Interfaces configured on that node. If an interface is a wireless interface, additional parameters describing the wireless interface are also obtained if possible. In addition, the IP address(es) for these interfaces are obtained. Note that currently the information about a network interface depends on the software on the node and is not normalized. So it can happen that for one interface the netmask is specified in dotted notation while for another it is specified as the mask length after a ``/``, e.g.::

    Interface (tap0, 2, is_wlan=False)
      Inet4 (YYY.YYY.YYY.YYY/255.255.255.255, 255.255.255.255, None)
    Interface (eth1, 4, is_wlan=True)
      Net_Link (ether, XX:XX:XX:XX:XX:XX, ff:ff:ff:ff:ff:ff)
      Inet4 (YYY.YYY.YYY.YYY/22, YYY.YYY.YYY.255, global)
      WLAN_Config_Freifunk
        ( ssid=freiesnetz.www.funkfeuer.at
        , mode=Ad_Hoc
        , channel=4
        , bssid=XX:XX:XX:XX:XX:XX
        )
A list of all configured IP addresses for this node; these are the IP addresses that were already listed for the interfaces.
If a node is not reachable or errors occur during parsing, the spider
entry for that node contains the Python exception that was raised.
There are different errors, like ``ValueError`` for unknown information
in the web interface or ``Timeout_Error`` for communication timeouts.
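Because the netmask notation is not normalized, a consumer of the dump may want to convert both forms to a single representation. A small sketch using Python's standard ``ipaddress`` module (not part of the spider itself):

```python
import ipaddress

def as_prefix_len(netmask):
    """Normalize a netmask to an integer prefix length.

    Accepts either a mask length ('22') or dotted notation
    ('255.255.252.0'), the two forms seen in spider output.
    """
    s = str(netmask)
    if s.isdigit():
        return int(s)
    # IPv4Network accepts "address/netmask" and derives the prefix.
    return ipaddress.IPv4Network("0.0.0.0/" + s).prefixlen
```

With this, `255.255.255.255` and `32` compare equal, which simplifies any later per-interface comparison of spidered nodes.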
The pickle dump produced by the spider contains a dictionary indexed by
the IP address of the spidered nodes. The content of the dictionary for
each IP address is either an exception (because the spider failed to
obtain the information) or an instance of a ``Guess`` class that in
turn contains a dictionary of interfaces indexed by name and a
dictionary of IPs (the IP is the key of the dictionary).
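Reading such a dump outside of the spider is straightforward with the standard ``pickle`` module. A minimal sketch for consumers of the dump (the helper names are assumptions; ``parser.py`` offers the same via ``--read-pickle``):

```python
import pickle

def load_dump(filename):
    """Load a spider pickle dump: a dict mapping node IP -> result,
    where each result is either an Exception or a Guess instance."""
    with open(filename, "rb") as f:
        return pickle.load(f)

def split_results(dump):
    """Separate successfully spidered nodes from failed ones."""
    ok = {ip: r for ip, r in dump.items()
          if not isinstance(r, Exception)}
    failed = {ip: r for ip, r in dump.items()
              if isinstance(r, Exception)}
    return ok, failed
```

Note that unpickling the dump requires the spider's classes (e.g. ``Guess``) to be importable, since pickle stores instances by reference to their class.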
The pickle dump can be read using the ``parser.py`` script of the
spider. Reading an existing dump is performed with the ``-r`` or
``--read-pickle`` option. If several input files are read, these are
merged together. The resulting merged file can be written using the
``-o`` or ``--output-pickle`` option.
Merging of pickle dumps uses the latest valid information. If a node was available in an earlier dump (and information was obtained) and the same node is later unavailable, the information is not overwritten by the later dump. On the other hand, if a node was not available earlier, or the earlier information was different and the new information is also valid, the later information will overwrite the earlier one.
Exceptions also have a hierarchy that decides what information is
overwritten. A ``ValueError`` (indicating unparseable information on
the web interface) overwrites a ``Timeout_Error`` that indicates an
unsuccessful connection.
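The merge rules above can be sketched as a simple ranking of results: valid data outranks a ``ValueError``, which outranks a ``Timeout_Error``, and a later entry only wins if it ranks at least as high as the earlier one. This is an illustration of the described behaviour, not the code in ``parser.py``; ``Timeout_Error`` is defined here as a stand-in for the spider's own exception class:

```python
class Timeout_Error(Exception):
    """Stand-in for the spider's timeout exception (assumed name)."""

def rank(result):
    """Order results by how much information they carry."""
    if isinstance(result, Timeout_Error):
        return 0          # unsuccessful connection
    if isinstance(result, Exception):
        return 1          # page seen but unparseable (e.g. ValueError)
    return 2              # valid spidered information

def merge(earlier, later):
    """Merge two dumps; later entries win only if equally informative."""
    merged = dict(earlier)
    for ip, new in later.items():
        old = merged.get(ip)
        if old is None or rank(new) >= rank(old):
            merged[ip] = new
    return merged
```

The ``>=`` comparison gives exactly the documented behaviour: equally valid later data overwrites earlier data, while a mere timeout can never clobber information that was already obtained.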
What is considered the earlier and later version depends on the order
of the ``-r`` or ``--read-pickle`` options given.
The output read via ``-r`` can be printed using the ``-v`` or
``--verbose`` option. More ``-v`` options mean more verbose output.
In addition to merging and printing, the ``parser.py`` script can also
be used for obtaining spider information for a list of explicit IP
addresses. In that case these addresses are specified as parameters on
the command line. With the ``-p`` or ``--port`` option a non-standard
port can be specified. This port is applied to all explicit IP
addresses.
When merging IP addresses, explicitly spidered addresses given as
parameters are merged last and override (if successful) earlier results
read in via ``-r`` or ``--read-pickle`` options.