vxargs: running arbitrary commands with explicit parallelism,
visualization and redirection
vxargs is inspired by xargs and pssh. It runs any command in parallel,
including ssh, rsync, scp, wget and curl. One reason to use it is to
control a large set of machines across a wide-area network. For example,
I use vxargs on PlanetLab to control hundreds of machines spread around
the globe while working on the DHARMA project.
The main features are:
- parallelism: run many jobs at the same time
- flexibility: run arbitrary commands with arbitrary options
- visualization: monitor total and per-job progress in a curses-based UI
- redirection: the stdout and stderr of each job are redirected to separate files for further analysis
- Why not use pssh?
There are a few reasons:
(1) With pssh you can only run a limited set of commands (e.g. ssh,
rsync), each with a limited set of command-line options. With vxargs
you can run any command exactly the way you like.
(2) vxargs has a curses-based user interface that monitors execution
as it happens.
(3) vxargs is a single Python script, which makes it extremely simple
to install.
- Why not use xargs?
xargs can do some of the work; check out its rarely used
--max-procs (-P) and --replace (-i) options. However, there seems to be
no easy way to keep track of which individual process is running, and
the output from all processes is mixed together. xargs also cannot set
a maximum lifetime for each process. vxargs addresses these issues.
ChangeLog
- vxargs 0.3.3, released Jul 27, 2005
- vxargs 0.3.2, released Jun 22, 2005
- vxargs 0.3.1, released March 13, 2005
- vxargs 0.3, released Feb 19, 2005
- vxargs 0.2.1, released Jan 7, 2005
- vxargs 0.2, released Dec 12, 2004
- vxargs 0.1, released Dec 6, 2004
In vxargs 0.4, multiple arguments will be supported. Send email to me to obtain the latest development version.
Thanks to Guohan Lu for the patch.
RPM package of vxargs maintained by Andras Horvath
vxargs link on freshmeat
To install vxargs, simply download the latest vxargs Python script,
rename it to whatever you like (e.g. vxargs), make sure it is
executable (e.g. chmod +x /home/username/bin/vxargs) and make sure its
directory is in your PATH. You also need Python 2.2 or above installed.
Read the man page or type vxargs --help for detailed usage. Here I'll
show several examples to explain how it works. Suppose the iplist.txt
file has the following content:
$ cat iplist.txt
216.165.109.79
#planetx.scs.cs.nyu.edu
158.130.6.254
#planetlab1.cis.upenn.edu
158.130.6.253
#planetlab2.cis.upenn.edu
128.232.103.203
#planetlab3.xeno.cl.cam.ac.uk
The IP addresses are used as the dynamic arguments in the following
examples. Each hostname preceded by '#' is a comment on the previous IP
address; comments are for eye candy only and can be omitted.
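To make the file format concrete, here is a minimal Python sketch of how such a list could be read. It is not the parser vxargs actually uses; the function name parse_arg_list and the choice to return (argument, comment) pairs are my own assumptions for illustration.

```python
def parse_arg_list(text):
    """Return (argument, comment) pairs from an iplist-style listing.

    Hypothetical helper, not part of vxargs: a line starting with '#'
    is treated as a comment on the argument right before it.
    """
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith("#"):
            # Attach the comment to the most recent argument, if any.
            if entries:
                arg, _ = entries[-1]
                entries[-1] = (arg, line[1:].strip())
            continue
        entries.append((line, ""))
    return entries


SAMPLE = """\
216.165.109.79
#planetx.scs.cs.nyu.edu
158.130.6.254
#planetlab1.cis.upenn.edu
"""

if __name__ == "__main__":
    for arg, comment in parse_arg_list(SAMPLE):
        print(arg, comment)
```

An argument with no comment line simply gets an empty comment, which matches the statement that comments can be omitted.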
- Check the uptime of every node in iplist.txt using ssh in parallel.
vxargs -a iplist.txt -o /tmp/result ssh {} uptime
Note: {} is replaced by the dynamic argument (an IP address) of each
job. This is equivalent to running
ssh 216.165.109.79 uptime
ssh 158.130.6.254 uptime
ssh 158.130.6.253 uptime
ssh 128.232.103.203 uptime
but in parallel. If you are not sure, check out the -n option to make
sure you get what you want. Furthermore, the stdout and stderr of each
individual process are redirected to corresponding files in the
/tmp/result/ directory.
In addition, use
cat /tmp/result/abnormal_list
to examine the hosts on which the ssh command failed.
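The {} substitution and per-job redirection described above can be sketched in a few lines of Python 3. This is only an illustration, not the vxargs source: the names substitute, run_job and run_all are hypothetical, the ".err" suffix is my assumption (only "*.out" files are mentioned on this page), and the jobs here run one after another rather than in parallel.

```python
import os
import subprocess


def substitute(template, arg):
    """Replace every {} in each word of the command template with arg."""
    return [word.replace("{}", arg) for word in template]


def run_job(template, arg, outdir):
    """Run one job, saving its stdout and stderr to files named after arg.

    Assumption: arg is safe to use as a file name (no '/' in it).
    Returns the exit status of the command.
    """
    os.makedirs(outdir, exist_ok=True)
    out_path = os.path.join(outdir, arg + ".out")
    err_path = os.path.join(outdir, arg + ".err")  # suffix is an assumption
    with open(out_path, "wb") as out, open(err_path, "wb") as err:
        return subprocess.call(substitute(template, arg), stdout=out, stderr=err)


def run_all(template, args, outdir):
    """Run the jobs sequentially; return the arguments whose job failed."""
    return [arg for arg in args if run_job(template, arg, outdir) != 0]


if __name__ == "__main__":
    # Print what would be run for one host, without running it.
    print(substitute(["ssh", "{}", "uptime"], "216.165.109.79"))
```

The list returned by run_all plays the role of abnormal_list: it names the arguments whose command exited with a non-zero status.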
- Synchronize the local directory mirror with all current PlanetLab production nodes
curl --silent https://www.planet-lab.org/db/nodes/production_hosts.php | vxargs -P 2 -y rsync -az -e ssh --delete mirror $SLICE@{}:
Note: No "-a" argument file is specified, so vxargs reads the arguments
from stdin. Because no output directory is specified either, output is
redirected to /dev/null. -P 2 means run up to 2 processes at a time.
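The effect of a limit such as -P 2 can be pictured as a pool with two workers. The Python 3 sketch below uses concurrent.futures for that, which is my own choice of technique and certainly not how vxargs does it (vxargs runs on Python 2.2); run_parallel and max_procs are hypothetical names.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_parallel(template, args, max_procs=2):
    """Run the command template once per argument, at most max_procs at a time.

    Returns a dict mapping each argument to the exit status of its job.
    """
    args = list(args)  # allow any iterable, e.g. lines read from stdin

    def run_one(arg):
        cmd = [word.replace("{}", arg) for word in template]
        # Discard output, mirroring the "no output directory" case above.
        return subprocess.call(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )

    # Each worker thread blocks on one child process, so the pool size
    # is also the maximum number of child processes alive at once.
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        statuses = list(pool.map(run_one, args))
    return dict(zip(args, statuses))
```

With four hosts and max_procs=2, the third job starts only when one of the first two has finished.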
- Run startjob on every cluster node named from cluster001 to cluster128 (New!)
pattern cluster[001-128] | vxargs -P 2 -o cluster/result/`safepath` ssh {} startjob
Note: The pattern program comes from the Python Cookbook. safepath is a
tiny program that generates maildir-style folder names, so you don't
have to worry about overwriting previous results.
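If you don't have the pattern program at hand, the expansion it performs in this example is easy to approximate. The Python sketch below is my own guess at the behavior, not the Cookbook recipe: it assumes a bracketed numeric range whose zero padding is taken from the width of the first number, and the name expand is hypothetical.

```python
import re

# Matches a numeric range such as [001-128].
_RANGE = re.compile(r"\[(\d+)-(\d+)\]")


def expand(pattern):
    """Expand bracketed numeric ranges in pattern into a list of names."""
    match = _RANGE.search(pattern)
    if match is None:
        return [pattern]
    start, end = match.group(1), match.group(2)
    width = len(start)  # keep the zero padding of the first number
    names = []
    for i in range(int(start), int(end) + 1):
        expanded = pattern[:match.start()] + str(i).zfill(width) + pattern[match.end():]
        names.extend(expand(expanded))  # handle any further ranges
    return names


if __name__ == "__main__":
    # One name per line, ready to be piped into vxargs.
    print("\n".join(expand("cluster[001-128]")))
```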
- Download cotop information from every node in the list
vxargs.py -a iplist.txt --timeout=20 curl http://{}:3120/cotop?sort=9
Note: --timeout=20 forces every process to be terminated after 20 seconds.
- Run a CoDNS query against "www.google.com" on every node
vxargs.py -a iplist.txt -o /tmp/codns/ -y -t 20 bash -c 'echo www.google.com| nc {} 4119'
Note: use cat /tmp/codns/*.out to examine the results.
Note: the following bug is fixed in 0.3:
- When a command spawns multiple processes, e.g.
bash -c 'echo www.google.com| nc {} 4119', only the main process is
terminated or killed after the timeout (in the example, bash is killed
but nc may still be alive).
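One common way to avoid leaving such children behind is to start each command in its own process group and signal the whole group on timeout. The Python 3 sketch below shows that idea on POSIX systems; I am not claiming this is how vxargs 0.3 fixes the bug, and run_with_timeout is a hypothetical name.

```python
import os
import signal
import subprocess


def run_with_timeout(cmd, timeout):
    """Run cmd in its own session and kill the whole group on timeout.

    Returns the exit status, or None if the command timed out.
    POSIX only: relies on process groups and os.killpg.
    """
    # start_new_session=True makes the child a session and group leader,
    # so its process group id equals its pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        try:
            # Signal every process in the group, not just the leader.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group vanished between the timeout and the kill
        proc.wait()  # reap the leader so no zombie is left behind
        return None
```

In the bash and nc example above, both processes belong to the same group, so both are killed together.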
Send me email if you encounter problems, find bugs, or have any
random comments: maoy AT cis.upenn.edu
Last Modified: $Id: index.html,v 1.27 2005/07/27 21:04:02 maoy Exp $