Last modified on 07/25/08 19:40:02

Delegation!

A couple of scenarios can come up for Func that make it interesting to have a tiered hierarchy of Func minions as opposed to just one central overlord with a lot of direct reports.

One example is when the central overlord cannot connect directly to all of its subservient minions, because of a firewall or some other quirk of the network topology.

Another scenario (which is optimized at the moment) is a VERY large Func network in which you want to minimize the number of calls each machine has to make over the network.

Both of these scenarios can come up in a large organization, such as if you had Func installed on all desktops/servers in a corporation with all the labs, datacenters, and so forth.

The Func delegation feature allows you to accomplish both of these ends and more with a minimum of hassle and setup.

Getting Started

NOTE: This feature will be included in Func 0.22. Until then, pull down the latest Git repo and build Func yourself to try this out.

The voodoo of func-build-map

func-build-map [--append] [--only-alive] [--verbose]

Func delegation works by using a mapfile generated by func-build-map. func-build-map probes through your network topology and discovers every minion and overlord that you can connect to. The delegation feature then uses this map data to find its way through your network.

How does it work? The script builds a tree and saves it in /var/lib/func/map.

Run without arguments, func-build-map will rewrite any mapfile currently sitting in /var/lib/func. If you want to play around with it a bit, there are several other options which will change the script's behavior:

  • -a, --append: probes through Func network and appends new data to existing mapfile instead of rewriting it
  • -o, --onlyalive: builds map using test.ping() method, returning a map that contains only boxes that return a ping
  • -v, --verbose: displays some neat verbose output so you can see what you're doing

To delegate commands, this mapfile must be created and kept updated. Thus, I recommend running it as a daily cron job on your highest overlord.

Note that minions not yet in the map file will not be reached by delegation calls.

The awesometivity of a delegating Overlord client object

To get started with delegation via the Python API, try the following code:

import func.overlord.client as fc
my_overlord = fc.Overlord("<your glob>", delegate=True)

If you want to use an alternative delegation map file, you can add the argument mapfile=<your mapfile location> to the Overlord constructor to tell it to pull the mapping data out of it instead.

From this point, you can treat your delegating overlord object in the same manner as a non-delegating overlord. Minions that exist under multiple layers of overlords will appear as if they existed directly beneath your master overlord, so make some function calls and play around with it!

If you are using the command line, you can skip all of this and just run the command as follows, assuming that you have already generated the mapfile:

func "*" call --delegate command run "/bin/foo"

How It Works: Anatomy of a Delegated Call

To explain how delegation works, we'll use an example of a delegated function call from a central overlord to a fictional minion known as "Minion 6", which exists under 2 sub-overlords beneath our central overlord.

[Figure: Bitmap Diagram of a Delegated Function Call]

Here's the code we would've used to do this:

import func.overlord.client as fc
minion6 = fc.Overlord("Minion 6", delegate=True)
minion6.test.ping()

and the results:

{'Minion 6': 1}

When you instantiate a Func overlord Client object (defined in client.py) in the manner described above, the mapfile (stored in YAML format) is read and converted into a Python dictionary. When a function call is made against this Client object, the glob provided is matched against the elements in the tree, and the shortest 'call path' through the tree to each match is found. In this case, the shortest call path to Minion 6 would look something like this:

['Minion 2', 'Minion 3', 'Minion 6']
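
The path lookup described above can be sketched as a breadth-first search over the map tree. This is only an illustration, not the code in client.py: the nested-dictionary layout of MINION_MAP, the name find_call_paths, and the extra minions in the tree are all assumptions made for the example, and the real mapfile layout may differ.

```python
from collections import deque
from fnmatch import fnmatchcase

# Hypothetical map tree (an assumption, not the real mapfile layout):
# each key is a host and each value holds the hosts directly beneath it.
MINION_MAP = {
    "Minion 1": {},
    "Minion 2": {
        "Minion 3": {"Minion 6": {}},
        "Minion 4": {},
    },
    "Minion 5": {},
}


def find_call_paths(tree, glob):
    """Return {minion: shortest call path} for every minion matching glob."""
    paths = {}
    queue = deque(([name], children) for name, children in tree.items())
    while queue:
        path, children = queue.popleft()
        name = path[-1]
        # Breadth-first order guarantees that the first path recorded for a
        # name is one of the shortest paths to it.
        if fnmatchcase(name, glob) and name not in paths:
            paths[name] = path
        for child, grandchildren in children.items():
            queue.append((path + [child], grandchildren))
    return paths


print(find_call_paths(MINION_MAP, "Minion 6"))
```

For this tree the lookup yields the same three-element call path shown above. A minion sitting directly under the central overlord gets a one-element path.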

The Client code then checks the length of each call path list. If a call path contains only one element, we know that it exists directly under the current overlord and we can make the function call as we would without delegation. Otherwise, we call the delegation module and call its run() method on Minion 2.

We pass the module, method, and arguments specified by the user to the run() method, along with a new call path with 'Minion 2' stripped off. The run() method, seeing that the call path it was passed contains multiple elements, calls the delegation.run() method on Minion 3, passing along the module, method, and arguments, and a new call path with 'Minion 3' stripped off.

Minion 3's run() method, seeing that the call path contains a single element, calls the module and method on Minion 6 and passes it the user-supplied arguments. The return data from Minion 6 is passed back to Minion 3, which pulls the Minion 6 data out of the results hash and passes it back to Minion 2. Minion 2 does the same and passes the data on to the central overlord, which inserts it into the master results hash.
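
The hop-by-hop forwarding can be simulated in a few lines. This sketch stands in for the real XMLRPC calls with plain function calls, so everything in it (DIRECT_REPORTS, call_directly, run, overlord_call, and the host name "central") is a hypothetical name chosen for illustration rather than part of Func.

```python
# Hypothetical topology for the Minion 6 example: the hosts each overlord
# can reach directly. "central" is a made-up name for the central overlord.
DIRECT_REPORTS = {
    "central": ["Minion 2"],
    "Minion 2": ["Minion 3"],
    "Minion 3": ["Minion 6"],
    "Minion 6": [],
}


def check_reachable(caller, target):
    if target not in DIRECT_REPORTS[caller]:
        raise LookupError("%s cannot reach %s" % (caller, target))


def call_directly(caller, target, module, method, args):
    """Stand-in for a normal, non-delegated call; returns a results hash."""
    check_reachable(caller, target)
    if (module, method) == ("test", "ping"):
        return {target: 1}
    raise NotImplementedError("%s.%s" % (module, method))


def run(host, call_path, module, method, args):
    """Simulate the delegation run() method executing on `host`."""
    next_hop = call_path[0]
    if len(call_path) == 1:
        # Single element left: next_hop is the target, so call it for real.
        results = call_directly(host, next_hop, module, method, args)
    else:
        # Several elements left: forward to next_hop with itself stripped off.
        check_reachable(host, next_hop)
        results = {next_hop: run(next_hop, call_path[1:], module, method, args)}
    # Pull the data out of the results hash before handing it back up.
    return results[next_hop]


def overlord_call(call_path, module, method, args):
    """Simulate the central overlord handling one call path."""
    target = call_path[-1]
    if len(call_path) == 1:
        return call_directly("central", target, module, method, args)
    check_reachable("central", call_path[0])
    data = run(call_path[0], call_path[1:], module, method, args)
    return {target: data}  # entry for the master results hash


print(overlord_call(["Minion 2", "Minion 3", "Minion 6"], "test", "ping", []))
```

Each level strips one name off the call path on the way down and unwraps one results hash on the way up, so the caller sees the result keyed by the target minion only.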

This process repeats for each call path, which is a bit inefficient for large-scale Func networks. One of the areas for optimization that we'll want to target involves sending all call paths that share a common sub-overlord to that sub-overlord together, to reduce the number of calls made to it.

Notes

Asynchronous operation

Async/nforks functionality is implemented as of July 24th, 2008 in the Func git repository. This functionality was achieved by introducing polling code into the minion-side delegation.py module and the overlord-side client.py module. When you make a call asynchronously with delegation turned on, each sub-overlord in the delegation chain polls the asynchronous function calls it makes. Once all calls have returned, the resulting hash data is massaged, merged together, and returned to the user in the same manner as a call without asynchronous mode enabled.
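
The poll-then-merge pattern described above can be sketched as follows. This is not the polling code from delegation.py or client.py: FakeJob, wait_and_merge, the polling interval, and the poll limit are hypothetical stand-ins used only to show the idea.

```python
import time


class FakeJob:
    """Hypothetical stand-in for one asynchronous call made by a sub-overlord."""

    def __init__(self, result, polls_needed):
        self.result = result
        self.polls_left = polls_needed

    def poll(self):
        """Return (finished, result); result is None until the job finishes."""
        if self.polls_left > 0:
            self.polls_left -= 1
            return False, None
        return True, self.result


def wait_and_merge(jobs, interval=0.01, max_polls=100):
    """Poll every job until all have returned, then return the merged hash."""
    merged = {}
    pending = list(jobs)
    for _ in range(max_polls):
        still_pending = []
        for job in pending:
            finished, result = job.poll()
            if finished:
                merged.update(result)  # fold this job's hash into the total
            else:
                still_pending.append(job)
        pending = still_pending
        if not pending:
            return merged
        time.sleep(interval)
    raise TimeoutError("%d job(s) did not finish" % len(pending))


jobs = [FakeJob({"Minion 7": 1}, polls_needed=2), FakeJob({"Minion 11": 1}, polls_needed=0)]
print(wait_and_merge(jobs))
```

Because the merged hash has the same shape as a synchronous result, the caller one level up does not need to know whether the calls beneath it ran asynchronously.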

Optimizations

Along with the asynchronous delegation call capabilities, I've added in a few optimizations to reduce the number of XMLRPC calls made during delegation. To explain, let's say that you have a group of delegation paths that look like this:

[
['Minion 1', 'Minion 5', 'Minion 4', 'Minion 8', 'Minion 9'],
['Minion 1', 'Minion 3', 'Minion 2'],
['Minion 6', 'Minion 11'],
['Minion 10', 'Minion 7'],
['Minion 12']
]

In the unoptimized state, we would have to call Minion 1 twice to send it both of its delegation paths, which doesn't look very good from a performance standpoint. Hence, I've added in some code that filters out single-element lists (since these are direct calls) and groups the rest together by their next sequential minion. The code for this lives in group_paths(), located in func/overlord/delegation_tools.py. So, instead of that gnarly list of lists, group_paths() returns a list and a dictionary that look like this:

(
['Minion 12'],
{
 'Minion 1': [['Minion 5', 'Minion 4', 'Minion 8', 'Minion 9'], ['Minion 3', 'Minion 2']],
 'Minion 6': [['Minion 11']],
 'Minion 10': [['Minion 7']]
}
)

This path grouping occurs at each delegation point, so the number of XMLRPC calls required is minimized as much as possible.
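
The grouping step is small enough to sketch in full. The function below is an independent illustration written from the description above, not a copy of group_paths() from delegation_tools.py; the name group_paths_sketch is made up to keep the two apart.

```python
def group_paths_sketch(call_paths):
    """Split call paths into direct calls and per-sub-overlord groups.

    Returns (direct, grouped): `direct` lists minions reachable in one hop,
    and `grouped` maps each next-hop minion to the remaining paths that
    must be delegated through it.
    """
    direct = []
    grouped = {}
    for path in call_paths:
        if len(path) == 1:
            # Single-element path: a direct call, no delegation needed.
            direct.append(path[0])
        else:
            # Strip the next hop off and file the remainder under it.
            grouped.setdefault(path[0], []).append(path[1:])
    return direct, grouped


paths = [
    ["Minion 1", "Minion 5", "Minion 4", "Minion 8", "Minion 9"],
    ["Minion 1", "Minion 3", "Minion 2"],
    ["Minion 6", "Minion 11"],
    ["Minion 10", "Minion 7"],
    ["Minion 12"],
]
direct, grouped = group_paths_sketch(paths)
print(direct)
print(grouped)
```

With the grouped form, Minion 1 receives both of its remaining paths in a single call and can apply the same grouping to them before delegating further.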

Things To Do

I've done some testing, but I haven't done a lot of testing over vast Func networks. If anyone wants to test this feature and file some bug tickets, I'd be grateful for your help. If you have any comments or questions, post them to the func-list or shoot me an e-mail at ssalevan@…. Thanks!

(Currently the delegation implementation may talk to each mid-level overlord more than once, so it achieves network topology ends but may not achieve great performance ends just yet. This can be optimized out later and is on our radar. -- mpd)

--ssalevan
