Quickstart: Hello World

ruferp edited this page Oct 11, 2017 · 23 revisions

EOS SDK Quick Start: a Hello World Agent

Introduction

This tutorial will walk you through building, installing, and running your first EOS SDK agent. By the end of this document, you will have created an agent that will say 'Hello' to a name configured via the CLI. This program, although simple, demonstrates the lifecycle of an agent along with various components of the SDK. We'll first describe how to create the agent executable, then explain how to run your agent, and finally walk you through the meat of the EOS SDK code. If you'd like to explore the code without running the agent on a switch or vEOS instance, you can skip to the usage or code explanation sections.

The complete code for the agent can be found in the examples directory, at HelloWorld.cpp for the C++ version of the agent, and HelloWorld.py for the Python implementation. Note that you can easily access the raw versions of the example files (for easy wget or scp access) by clicking the 'Raw' link in the upper right of each file's GitHub page.

Getting Started

There are two steps to running your agent on an Arista device. The first is installing the EOS SDK extension on EOS. This extension exists per-release-per-EOS-version, and contains the underlying libeos.so library which translates the SDK APIs to underlying EOS state updates. The second step is to build your agent and transfer it to a switch.

Installing EOS SDK on your switch

First, we'll need an EOS instance where we can run our new agent. That means we'll need either a vEOS virtual machine or a physical switch. We then need to install the EOS SDK RPM, which contains the binary implementation of the SDK. Follow the download and installation instructions (the "Downloading and Installing the SDK" page) for information on how to complete both of these steps.

Building your agent

If you want to use the Python version of the agent, there is no need to build anything. All you need to do is copy the script to the switch. This means you can scp the file to the switch:/mnt/flash directly, or you can use the copy CLI that EOS provides:

switch# copy <URI-of-HelloWorld.py> flash:

If your switch has internet access, <URI-of-HelloWorld.py> can use the direct link served by GitHub: https://raw.githubusercontent.com/aristanetworks/EosSdk/master/examples/HelloWorld.py.

For the C++ version of the agent, you'll need a 32-bit Linux environment with the stubs tarball downloaded, unpacked, and built. This "stubs tarball" includes the contents of this GitHub repository, and contains the same headers that the EOS SDK extension on the switch exposes. This allows you to build your code independently of any Arista-specific environment, and instead lets you focus on actual agent development. To create this environment, download and unpack the tarball from the releases page and run ./build.sh. Then, copy the HelloWorld.cpp file to your build directory and run:

bash# g++ -std=gnu++0x -o HelloWorldBinary HelloWorld.cpp -leos

This will create an executable named HelloWorldBinary in your current directory. Copy that file to your switch's /mnt/flash directory.

See the instructions on building your agent for more information on setting up your build environment.

Running your Agent

Now that we have a switch with the SDK installed along with an agent executable, let's see what this agent does! Then, in the following section, we'll actually dive into how it is done.

bash# ssh admin@myAristaSwitch
switch> enable
switch# configure
switch(config)# daemon HelloWorldAgent
switch(config-daemon-HelloWorldAgent)# exec /mnt/flash/HelloWorld.py
switch(config-daemon-HelloWorldAgent)# no shutdown

The daemon CLI allows you to run your agent in the context of EOS's process manager. This means that if the process ever dies, hangs, or otherwise runs into issues, the process manager will restart your agent to get you back into a working state. After entering the daemon configuration mode, we specify the path to the executable. In the above snippet, we assume you've copied HelloWorld.py to /mnt/flash. Any other path to an executable will work as well. Finally, we issue a no shutdown to start the agent.

When an agent starts up, it must synchronize with Sysdb to receive all state the agent cares about. Since release 4.16.x, a "mount profile" needs to be installed into /usr/lib/SysdbMountProfiles for that purpose. Without such a profile, you would see this in the agent's logfile:

[admin@myEosSwitch ~]$ cat /var/log/agents/HelloWorld-*
waiting for connection to Sysdb ..........................................

Intermezzo: Creating a Mount Profile

The first line of the profile identifies the agent the profile is for. The filename itself is irrelevant: it just needs to be unique, although using the name of the agent is best. After that first line there is one line for each "manager" that your agent uses. What a "manager" is will become evident once you see the code...

There is a brute-force/template profile (mounts everything) which can be used as a guide at /usr/lib/SysdbMountProfiles/EosSdkAll. Its first line needs to be adapted according to your agent's binary name: replace the "EosSdk" part of the first line with the basename of your binary. Then optionally remove the lines that are not needed by your agent (for performance reasons).

Here is a copy/paste way to produce your test profile from the bash prompt of the EOS switch, but first change the first line to match your agent's executable (see "HelloWorld" below):

bin=/mnt/flash/HelloWorld # <<<=== adapt this line to actual binary
# Warn if the path contains a dot (such as a file extension).
[ "${bin%.*}" == "$bin" ] || echo "Error: remove dots from binary name"
name=$(basename "$bin")
dest=/usr/lib/SysdbMountProfiles/$name
source="/usr/lib/SysdbMountProfiles/EosSdkAll"
# Rewrite only the first line of the template so that it names this agent.
sed "1s/agentName[ ]*:.*/agentName:${name}-%sliceId/" "$source" > "/tmp/tmp_$name"
# If the result is identical to the template, the substitution did not apply.
if cmp -s "/tmp/tmp_$name" "$source"; then
  echo "Error: something is wrong"
else
  sudo mv "/tmp/tmp_$name" "$dest"
fi

End of intermezzo (demo continues)

You can confirm that the program is running via the show daemon command:

switch(config-daemon-HelloWorldAgent)# show daemon
Agent: HelloWorldAgent (running)
No configuration options stored.

Status:
Data           Value
-------------- ---------------------------
greeting       Welcome! What is your name?

Looks like everything is up and running! Feel free to use Linux's excellent process introspection utilities to confirm the agent is running as well, for example by dropping to bash and issuing ps -ef | grep HelloWorld.

Let's now tell our friendly agent our name:

switch(config-daemon-HelloWorldAgent)# option name value Robert Metcalfe

With this command, we changed some state in EOS. This state is propagated to our agent, which receives an event notification. Hopefully our agent responded with a salutation; to check, run:

switch(config-daemon-HelloWorldAgent)# show daemon
Agent: HelloWorldAgent (running)
Configuration:
Option       Value
------------ ---------------
name         Robert Metcalfe

Status:
Data           Value
-------------- ----------------------
greeting       Hello Robert Metcalfe!

And, ta-da, our agent reacted to the name option and said "hi". And with that, you've just redefined "social networking."

Feel free to change your name via the option name value <new-name> command and remove your name via no option name, and observe how your newly created social network responds. When we're finished, we can stop our agent using the shutdown command:

switch(config-daemon-HelloWorldAgent)# shutdown
switch(config-daemon-HelloWorldAgent)# show daemon
Agent: HelloWorldAgent (shutdown)
Configuration:
Option       Value
------------ ---------------
name         Robert Metcalfe

Status:
Data           Value
-------------- ------
greeting       Adios!

Persist your Agent

After a reboot you will need to re-install the mount profile to /usr/lib/SysdbMountProfiles. This can be done via an "on-boot" script, or by packaging that profile into an RPM and configuring EOS to boot with it (that's the best route if your application is already installed via an RPM: just add the profile to it).

The on-boot script route looks like this:

switch(config)# event-handler HelloWorld-install
switch(config-handler)# trigger on-boot
switch(config-handler)# action bash /mnt/flash/HelloWorld-install.sh
switch(config-handler)# bash cat /mnt/flash/HelloWorld-install.sh

The RPM route looks like this (after scp-ing the RPM to /tmp on the switch):

switch(config)# copy file:/tmp/HelloWorld.i686.rpm extension:
switch(config)# extension HelloWorld.i686.rpm
switch(config)# copy installed-extensions boot-extensions

The first line puts the extension (RPM) where the second command expects it. The second line installs the RPM. The third one makes sure it will get re-installed after a reboot.

Anatomy of the HelloWorld agent

In this section, we will explore the code behind the HelloWorld C++ agent. The same explanations hold for the Python variant.

In the beginning...

The agent executable first runs when you enter no shutdown via the CLI. At this point, ProcMgr starts up an instance of your agent, using the command stored in the exec CLI. As we set exec to the path of our executable, ProcMgr will run this file, which, like all C++ programs, begins execution at the main() function:

int main(int argc, char ** argv) {
   eos::sdk sdk;
   hello_world_agent agent(sdk);
   sdk.main_loop(argc, argv);
}

The set-up for this agent is simple. We first create an instance of the SDK, using eos::sdk's default constructor. We then construct the hello_world_agent class, which contains the meat of the program's logic. Let's see what happens there:

class hello_world_agent : public eos::agent_handler {
 public:
   eos::tracer t;

   explicit hello_world_agent(eos::sdk & sdk)
         : eos::agent_handler(sdk.get_agent_mgr()),
           t("HelloWorldCppAgent") {
      t.trace0("Agent constructed");
   }
   // ...
};

The first thing to notice is that the hello_world_agent subclasses eos::agent_handler. A handler class is an EOS SDK construct which lets your agent react to state changes via overridable functions (the on_xxx() methods). There are many different types of handlers, and agents should subclass from each handler that has events they are interested in. One handler, for example, might let you know when interfaces become operational, while another will fire when an access control list (ACL) has been programmed to hardware. All agents, however, should have at least one class that inherits from the agent_handler, as this handler provides agent-specific callbacks alerting you of startup, shutdown and configuration events. As our HelloWorld agent only cares about agent options, we only need to subclass from the agent_handler.
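
To make the handler idea concrete, here is a toy model of the pattern in plain C++. It does not use the SDK: toy_handler, toy_mgr, and option_recorder are hypothetical stand-ins invented for this sketch, and the real eos::agent_handler and eos::agent_mgr do considerably more. The point is only to show how a base class with do-nothing virtual on_xxx() methods lets a subclass override just the events it cares about.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are NOT the EOS SDK classes.
class toy_handler {
 public:
   virtual ~toy_handler() {}
   // Default implementations do nothing, so a subclass overrides only
   // the events it is interested in.
   virtual void on_agent_option(std::string const &, std::string const &) {}
   virtual void on_agent_enabled(bool) {}
};

// Hypothetical manager that remembers handlers and fans events out to them.
class toy_mgr {
 public:
   void register_handler(toy_handler * handler) {
      assert(handler != nullptr);
      handlers_.push_back(handler);
   }

   // Deliver one option change to every registered handler.
   void notify_option(std::string const & name, std::string const & value) {
      for (toy_handler * handler : handlers_) {
         handler->on_agent_option(name, value);
      }
   }

 private:
   std::vector<toy_handler *> handlers_;
};

// A handler that reacts to option changes and ignores everything else.
class option_recorder : public toy_handler {
 public:
   std::vector<std::string> seen;

   void on_agent_option(std::string const & name,
                        std::string const & value) override {
      seen.push_back(name + "=" + value);
   }
};
```

Registering an option_recorder with a toy_mgr and calling notify_option("name", "Robert Metcalfe") leaves a single entry, name=Robert Metcalfe, in seen. The on_agent_enabled event is never overridden, so it falls through to the empty default.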

In the hello_world_agent constructor, we go ahead and initialize various elements:

  • our superclass, which takes an eos::agent_mgr
  • and an eos::tracer object, which lets our agent output debug trace statements to its log file when tracing is enabled.

At this point, all of the relevant classes, data structures, and logic are created, so, back in our main function, we start the main_loop. This function never returns and instead creates the continuously running event loop, managed by the SDK.

Before this point, our program has simply been a C++ class that has no connection to Sysdb. This means that none of the on_xxx() callbacks will fire, and the agent_mgr will not be able to set or read any state. Our call to start the main_loop changes this: the SDK connects to Sysdb, synchronizes any state needed by the agent_mgr, and registers itself for any relevant notifications. When this dance is complete, the agent_handler's on_initialized method is called, a method that our hello_world_agent overrides:

   void on_initialized() {
      t.trace0("Initialized");
      std::string name = get_agent_mgr()->agent_option("name");
      if(name.empty()) {
         // No name initially set.
         get_agent_mgr()->status_set("greeting", "Welcome! What is your name?");
      } else {
         // Handle initial state.
         on_agent_option("name", name);
      }
   }

The first thing we do upon initialization is check if Sysdb already has any relevant state set. For our agent, we just care if somebody has told us their name, so we go and ask the agent_mgr for the value corresponding to the agent_option called "name". If the option is not set, the call to agent_option will result in an empty string (as documented in agent.h), so we'll just set a welcome message by calling status_set (also documented in agent.h). Otherwise, there was already a name set, so we jump to the on_agent_option logic, which handles the name.

You may wonder when any initial state could have been created. There are many ways that this could have happened:

  • The user may have set the name option before running no shutdown from the CLI, thus seeding Sysdb with a name already.
  • The switch may have been rebooted, and the start-up configuration contained the command option name value <name> for this daemon, meaning the configuration was set before your agent was ever enabled.
  • A stateful switchover event happened, meaning that control transferred from one supervisor to another on a modular system. To your agent, this just looks like an agent restart where state is already set.
  • Your agent has a bug and crashed, causing ProcMgr to restart it.
  • Someone, or some other program, set that state in Sysdb.
  • One of many other causes.

In any case, your agent was started, and needs to make sure it can sanely handle what is already in Sysdb. At this point, there is nothing more to do so we return control to the event loop and wait for another event to occur.

Let's assume the co-inventor of Ethernet now enters his name at the EOS CLI: option name value Robert Metcalfe. Under the hood, this command sets a field in the CLI's state hierarchy. This state then propagates to Sysdb. Sysdb knows that our agent is synchronizing agent-related state (because we grabbed a reference to the agent_mgr), and in turn forwards this new state to us. When our agent decodes the state update, it notices that it is registered for notifications on this state, transforms this update into a std::string key/value pair and calls on_agent_option with the new value. It may sound crazy that there are three processes to handle a single state update, but this design lets the system be very resilient when something goes wrong. The vast majority of the time, though, all of this data transfer happens extraordinarily quickly.

In any case, since we overrode on_agent_option, our callback runs:

   void on_agent_option(std::string const & option_name,
                        std::string const & value) {
      if(option_name == "name") {
         if(value.empty()) {
            // User deleted the 'name' option
            t.trace3("Name deleted");
            get_agent_mgr()->status_set("greeting", "Goodbye!");
         } else {
            // Now *this* is what social networking is all
            // about. Somebody set, or changed, the name option. Let's
            // do some salutations!
            t.trace3("Saying hi to %s", value.c_str());
            get_agent_mgr()->status_set("greeting", "Hello " + value + "!");
         }
      }
   }

This method is straightforward: if the 'name' option changed, we update the greeting appropriately. We then return to the event loop to await further notifications. Once back in the event loop, our agent transforms the status update into an underlying data type and sends it to Sysdb, which then forwards the new greeting to the CLI. When the user runs show daemon, they'll see the agent's cordial welcome message.
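
If you'd like to experiment with this logic away from a switch, the sketch below replays it against an in-memory stand-in. fake_agent_mgr, option_set, status, handle_option, and handle_initialized are names made up for this example (only agent_option and status_set mirror the calls shown above), so treat it as an approximation of the agent's behavior rather than SDK code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical in-memory stand-in for the agent manager; NOT eos::agent_mgr.
class fake_agent_mgr {
 public:
   // Test helper: an empty value removes the option, like "no option name".
   void option_set(std::string const & key, std::string const & value) {
      if (value.empty()) {
         options_.erase(key);
      } else {
         options_[key] = value;
      }
   }

   // As with the call described above, an unset option reads as "".
   std::string agent_option(std::string const & key) const {
      auto it = options_.find(key);
      return it == options_.end() ? std::string() : it->second;
   }

   void status_set(std::string const & key, std::string const & value) {
      assert(!key.empty());
      statuses_[key] = value;
   }

   std::string status(std::string const & key) const {
      auto it = statuses_.find(key);
      return it == statuses_.end() ? std::string() : it->second;
   }

 private:
   std::map<std::string, std::string> options_;
   std::map<std::string, std::string> statuses_;
};

// Same decision logic as on_agent_option, written as a free function.
void handle_option(fake_agent_mgr & mgr, std::string const & option_name,
                   std::string const & value) {
   if (option_name != "name") {
      return;  // options other than 'name' are ignored
   }
   if (value.empty()) {
      mgr.status_set("greeting", "Goodbye!");
   } else {
      mgr.status_set("greeting", "Hello " + value + "!");
   }
}

// Same decision logic as on_initialized: reconcile with pre-existing state.
void handle_initialized(fake_agent_mgr & mgr) {
   std::string name = mgr.agent_option("name");
   if (name.empty()) {
      mgr.status_set("greeting", "Welcome! What is your name?");
   } else {
      handle_option(mgr, "name", name);
   }
}
```

Calling handle_initialized on a fresh fake_agent_mgr sets the welcome message, while calling it on one seeded with a name goes straight to the "Hello" greeting. Passing an empty value to handle_option afterwards switches the greeting to "Goodbye!".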

The last piece of functionality is the cleanup mechanism. If this agent is ever shut down from the CLI, we are given the chance to clean up any status. Note that this cleanup is not guaranteed to run: the underlying agent process could be abruptly terminated by the system for any number of reasons. However, if we are cleanly disabled, the agent_handler's on_agent_enabled callback will fire:

   void on_agent_enabled(bool enabled) {
      if (!enabled) {
         t.trace0("Shutting down");
         get_agent_mgr()->status_set("greeting", "Adios!");
         get_agent_mgr()->agent_shutdown_complete_is(true);
      }
   }

In this case we don't have to do much: we just publish a goodbye message and alert the agent_mgr that we have completed cleanup. Under the hood, the SDK then exits the event loop and the process exits.
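
Because the SDK exits the event loop once cleanup is reported complete, the goodbye message is published first. The following toy loop models that handshake under simplifying assumptions: toy_agent and run_toy_loop are hypothetical names, and the real main_loop is driven by Sysdb notifications rather than a queue of booleans.

```cpp
#include <cassert>
#include <deque>
#include <string>

// Toy model of the shutdown handshake; these names are NOT SDK APIs.
struct toy_agent {
   std::string greeting;
   bool shutdown_complete = false;

   void on_agent_enabled(bool enabled) {
      // No event should be delivered after cleanup has been reported.
      assert(!shutdown_complete);
      if (!enabled) {
         greeting = "Adios!";        // publish the final status first
         shutdown_complete = true;   // then report that cleanup is done
      }
   }
};

// Deliver queued enable/disable events until the agent reports that its
// cleanup is complete. Returns the number of events actually delivered.
int run_toy_loop(toy_agent & agent, std::deque<bool> events) {
   int delivered = 0;
   while (!agent.shutdown_complete && !events.empty()) {
      bool enabled = events.front();
      events.pop_front();
      agent.on_agent_enabled(enabled);
      ++delivered;
   }
   return delivered;
}
```

Feeding the loop the events true, false, true delivers only the first two: once the agent reports completion, the loop stops and the third event is never seen.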

Congratulations on finishing your first walkthrough of an EOS SDK agent!

Next Steps

At this point you may want to