GridSweeper User's Manual

Ed Baskerville
Version 1.0.2, last modified March 05, 2011, at 08:43 PM

What is GridSweeper?

GridSweeper is a tool for performing many runs of a computer model, across many different parameter values, in parallel on a grid computing system. Most of the time, scientists wanting to perform many runs of a model will construct an ad-hoc script to loop through parameter combinations, or build this capability right into their model program, thus limiting themselves to a single processor on a single computer. GridSweeper handles "parameter sweeping" for you, letting you define ranges of parameters to sweep over in a flexible XML format, then submitting all the requested parameter combinations to a grid system, which automatically ships runs to processors on the grid as they become available.

GridSweeper is not a tool for building models, or a framework for doing data-parallel computing: it simply makes it easier to perform independent runs of your model on many processors at once, saving you from writing custom parameter sweeping code and from being limited to a single machine.

GridSweeper is released under the GNU Affero General Public License, Version 3.

Getting Started with GridSweeper

To start, I'll assume that you are a user on a system that already has GridSweeper properly installed and configured. If you are a system administrator and want to install and configure GridSweeper for your system, you'll want to read the section on installation first. I will also assume that you're comfortable with the basics of the Unix command line.

Before starting to set up your experiment, you'll want to set up a directory to hold your output files. If your program outputs large amounts of data, check with your system administrator that the location you've chosen is appropriate. For example, on some systems, the sysadmins have "scratch" space specifically set up for large quantities of model output. The last thing you want to do is accidentally fill up all the shared disk space in your lab and ruin your simulations and everyone else's too.

When the grid executes your model, it will do so in a working directory set up by GridSweeper inside your output directory that indicates the name of your experiment, the date and time of the experiment run, and the parameter values for your particular run, for example,

/path/to/gs-results/huge-experiment/2009-07-29/02-41-43/g=0.1-n=0.05

Most simulation programs make use of random number generation, so you will probably want to perform multiple runs with different seed values for the random number generator. GridSweeper will automatically pipe standard output and error to numbered files for each run. Additionally, as discussed later, GridSweeper will provide your simulation with the run number so your code can label output files appropriately. GridSweeper will also create an XML file called case.xml containing parameter values and random seeds for all the runs in the directory. In all, then, an output directory will look something like this:

case.xml
data.01.txt
data.02.txt
...
stdout.01
stdout.02
...
stderr.01
stderr.02

Experiments are controlled by an XML file, written by you, describing the parameter sweeps you want to perform for your model. It's easiest to start with a full example:

<?xml version="1.0" encoding="UTF-8"?>
<experiment name="huge-experiment" numRuns="16">
  <setting key="ResultsDirectory" value="/path/to/scratch/gs-results"/>
  <setting key="EmailAddress" value="ed@example.com"/>
  <setting key="Model" value="/path/to/mymodel/mymodelprogram"/>

  <abbrev param="alpha" abbrev="a"/>
  <abbrev param="beta" abbrev="b"/>
  <abbrev param="gamma" abbrev="g"/>
  <abbrev param="nu" abbrev="n"/>

  <value param="alpha" value="0.5"/>
  <value param="beta" value="0.3"/>
  <value param="gamma" value="0.1"/>
  <value param="nu" value="0.04"/>

  <list param="nu">
    <item value="0.05"/>
    <item value="0.06"/>
    <item value="0.09"/>
  </list>
  <range param="gamma" start="0.1" end="0.5" increment="0.1" />
</experiment>

The first line is the XML declaration, which should be at the start of any XML file—you can simply copy and paste that line to your experiment file. The rest of the file, bounded by <experiment...> and </experiment>, constitutes the description of your experiment.

Inside the <experiment...> tag itself, you specify a name for your experiment (name="huge-experiment"), which is used to name the subdirectory where output for this experiment is stored, and how many times the simulation program will be run with different random seeds (numRuns="16"). The rest of the file typically is divided into four sections: settings, abbreviations, fixed parameter values, and parameter sweeps.

We'll look at each of these in turn. But first, if you haven't used XML before, a quick introduction.

XML Primer

XML, which almost stands for Extensible Markup Language, is a flexible format that nearly everyone turns to when they want to save all the time required to make their own format. In essence, XML consists of elements, which are delimited by opening and closing tags:

...
<element>
...
</element>

which can in turn contain other elements:

...
<element>
  <child>
  </child>
</element>

All elements can also contain "content", which is essentially free-form text (with some minor restrictions that shouldn't cause you trouble when using GridSweeper):

...
  <child>
    This is some content.
  </child>
...

Content-less elements also come in a special, highly useful, form, which collapses the start and end tags into one:

...
  <childwithoutcontent/>
...

Finally, XML elements can have attributes, which map an attribute name to some value (always just a string of characters):

...
  <childwithoutcontent attribute="value"/>
...

And that is most of what you need to know about XML to use GridSweeper. A few more important notes:

  • XML files contain just one outermost element, known as the "root element." Family-tree nomenclature is used for the rest of the nested elements, so elements can have children and grandchildren, and children have parents, and children of the same parent are siblings.
  • XML is case-sensitive, which means that <experiment numRuns="16"> is not the same thing as <experiment numruns="16">.
  • XML attribute values are delimited with quotation marks, so <experiment numRuns=16> is not OK, but <experiment numRuns="16"> is.
  • XML files should have a declaration, which declares the XML version (always 1.0) and the text encoding. I recommend UTF-8, because it is a superset of age-old ASCII—most text editors should have a place to make sure you're using it.

This, then, is an XML file that contains everything we've talked about:

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <childwithoutcontent attribute="value"/>
  <child>
    This is some content.
    <grandchild age="2"/>
  </child>
</root>
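
To see how a program reads this structure, here is a minimal sketch that parses the file above with Python's standard xml.etree.ElementTree module and prints each element with its attributes and content. The describe helper is a hypothetical name made up for this illustration; it is not part of GridSweeper.

```python
import xml.etree.ElementTree as ET

# The example file from the primer, embedded as a string for illustration.
PRIMER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<root>
  <childwithoutcontent attribute="value"/>
  <child>
    This is some content.
    <grandchild age="2"/>
  </child>
</root>
"""

def describe(element, depth=0):
    """Return one line per element: tag, attributes, and stripped content."""
    text = (element.text or "").strip()
    lines = ["{}{} attrs={} text={!r}".format(
        "  " * depth, element.tag, dict(element.attrib), text)]
    for child in element:  # iterating an element yields its direct children
        lines.extend(describe(child, depth + 1))
    return lines

# Encode to bytes because the text carries an encoding declaration.
root = ET.fromstring(PRIMER_XML.encode("utf-8"))
for line in describe(root):
    print(line)
```

Note that attribute values always come back as strings, so age="2" is read as the text "2" and must be converted if a number is needed.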

Now, let's look in detail at GridSweeper experiment XML files.

Settings

GridSweeper uses settings from experiment XML files to control different aspects of its operation. Two settings should be specified in every experiment file. ResultsDirectory gives the results directory in which models are executed (the one you should have created earlier), and EmailAddress gives the address that is notified when all simulations are complete:

  <setting key="ResultsDirectory" value="/users/username/scratch/gs-results"/>
  <setting key="EmailAddress" value="ed@example.com"/>

For custom command-line programs and NetLogo models, the Model setting specifies the location of the model executable/file:

  <setting key="Model" value="/users/username/scratch/mymodel/mymodelprogram"/>

Other settings vary from model type to model type, and are covered in the sections below on command-line programs, R scripts, Matlab scripts, and NetLogo models.

Abbreviations

Output directories are named using parameter values, so in order to prevent the directory names from being too long, you can specify abbreviations to use when creating directories. Abbreviations are set up like this:

<abbrev param="alpha" abbrev="a"/>

That way, instead of naming a directory with ...-alpha=0.5-..., GridSweeper will simply use ...-a=0.5-....

Fixed Parameter Values

In a typical experiment, some parameters have the same value across all runs. To set a single value for a parameter, add a line like this to the experiment file:

<value param="alpha" value="0.5"/>

Since they do not serve to distinguish between different model runs, fixed parameter values are not used when naming output directories. Additionally, such values will be overridden by the presence of the parameter in any parameter sweeps.

Parameter Sweeps

A typical experiment will additionally include a parameter sweep, where different parameters take on multiple values and all combinations of parameters are tried. Parameters can be assigned multiple values in several ways, the most basic of which are a "list sweep"—a simple enumerated list of values—and a "range sweep"—where the starting and ending point and step size are specified.

List sweeps are specified as a series of <item> elements embedded in a <list> element. For example, to step the parameter nu through the values 0.05, 0.06, and 0.09, you would write:

<list param="nu">
  <item value="0.05"/>
  <item value="0.06"/>
  <item value="0.09"/>
</list>

Range sweeps consist of a single element. For example, to step the parameter gamma from 0.1 to 0.5, inclusive, in increments of 0.1, you would write:

<range param="gamma" start="0.1" end="0.5" increment="0.1" />

Together, these two sweeps would produce 15 different combinations of parameters. Assuming abbreviations n and g for the two parameters and a time of 12:07:34 PM on August 8, 2009, the resulting set of output directories would look like this:

/path/to/gs-results/
  2009-08-08/
    12-07-34/
      n=0.05-g=0.1/
      n=0.05-g=0.2/
      n=0.05-g=0.3/
      n=0.05-g=0.4/
      n=0.05-g=0.5/
      n=0.06-g=0.1/
      n=0.06-g=0.2/
      n=0.06-g=0.3/
      n=0.06-g=0.4/
      n=0.06-g=0.5/
      n=0.09-g=0.1/
      n=0.09-g=0.2/
      n=0.09-g=0.3/
      n=0.09-g=0.4/
      n=0.09-g=0.5/
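
The count of 15 follows from crossing the 3 values of nu with the 5 values of gamma. The following is a minimal sketch of that enumeration, not GridSweeper's own implementation: range_values is a hypothetical helper, and the use of Decimal arithmetic is an assumption made here to keep the printed values clean (the manual does not say how GridSweeper steps through ranges internally).

```python
from decimal import Decimal
from itertools import product

def range_values(start, end, increment):
    """Expand an inclusive range sweep into a list of value strings.

    Decimal avoids floating-point drift such as 0.30000000000000004.
    """
    value, stop, step = Decimal(start), Decimal(end), Decimal(increment)
    values = []
    while value <= stop:
        values.append(str(value))
        value += step
    return values

nu_values = ["0.05", "0.06", "0.09"]              # from <list param="nu">
gamma_values = range_values("0.1", "0.5", "0.1")  # from <range param="gamma" .../>

# Every nu value is paired with every gamma value (a Cartesian product).
case_names = ["n={}-g={}".format(n, g)
              for n, g in product(nu_values, gamma_values)]

print(len(case_names))
for name in case_names:
    print(name)
```

Adding a third swept parameter with k values would multiply the number of cases by k, which is worth checking before submitting a large experiment.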

If, instead of all combinations of parameter values, you want to run through specific pairs (or triplets) of values for different parameters (e.g., alpha = 0.5 paired with beta = 0.1 and alpha = 1.0 paired with beta = 0.2, but not alpha = 0.5 paired with beta = 0.2), you can define a "parallel sweep" like so:

<parallel>
  <list param="alpha">
    <item value="0.5"/>
    <item value="1.0"/>
  </list>
  <list param="beta">
    <item value="0.1"/>
    <item value="0.2"/>
  </list>
</parallel>
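
The difference between a parallel sweep and the default all-combinations behavior can be sketched as the difference between zipping two lists and crossing them. This is only an illustration of the pairing described above. In particular, the manual does not say what happens when the lists inside <parallel> have different lengths, so the truncating behavior of Python's zip is an assumption of this sketch, not documented GridSweeper behavior.

```python
from itertools import product

alpha_values = ["0.5", "1.0"]  # from <list param="alpha">
beta_values = ["0.1", "0.2"]   # from <list param="beta">

# Parallel sweep: the i-th alpha is paired with the i-th beta.
parallel_cases = [{"alpha": a, "beta": b}
                  for a, b in zip(alpha_values, beta_values)]

# Ordinary sweep, for comparison: every alpha with every beta.
crossed_cases = [{"alpha": a, "beta": b}
                 for a, b in product(alpha_values, beta_values)]

print(len(parallel_cases), "parallel cases:", parallel_cases)
print(len(crossed_cases), "crossed cases:", crossed_cases)
```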

Finally, it is possible to combine discrete sets of parameter values with parameter ranges by embedding range sweeps inside lists. For example, this will produce a list of values nu = 0.05, 0.06, 0.09, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0:

<list param="nu">
  <item value="0.05"/>
  <item value="0.06"/>
  <item value="0.09"/>
  <range start="0.5" end="1.0" increment="0.1"/>
</list>
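
As a rough illustration of how such a mixed list expands, the sketch below reads the XML above and flattens the items and the embedded range into one sequence of values. The expand_list function is a hypothetical helper written for this example, and the Decimal stepping is an assumption; GridSweeper's actual expansion code may differ.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

LIST_XML = """<list param="nu">
  <item value="0.05"/>
  <item value="0.06"/>
  <item value="0.09"/>
  <range start="0.5" end="1.0" increment="0.1"/>
</list>"""

def expand_list(list_element):
    """Flatten <item> and embedded <range> children into value strings, in order."""
    values = []
    for child in list_element:
        if child.tag == "item":
            values.append(child.get("value"))
        elif child.tag == "range":
            value = Decimal(child.get("start"))
            end = Decimal(child.get("end"))
            step = Decimal(child.get("increment"))
            while value <= end:  # the end point is inclusive
                values.append(str(value))
                value += step
    return values

nu_values = expand_list(ET.fromstring(LIST_XML))
print(nu_values)
```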

Running an Experiment

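GridSweeper experiments are run using a different command depending on the model type: gsdrone for command-line programs, gsmatlab for Matlab or Octave scripts, gsr for R scripts, and gsnetlogo for NetLogo models.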

Once the experiment file is ready, a wise first step is not to run the experiment, but to perform a "dry run," where output directories are set up but no simulations are actually performed. This enables you to make sure that the parameter sweep is doing what you want it to do. Dry runs are activated with the -d switch, so for a gsdrone experiment, type

gsdrone -d /path/to/experiment.xml

at the command line (substituting gsmatlab, etc. as appropriate).

You will see a large quantity of output, like so:

Performing dry run for experiment "huge-experiment"...
Created experiment directory "/path/to/gs-results/huge-experiment/2009-08-08/16-35-56".
Submitting cases:
n=0.05-g=0.1
  Not submitting run 1 (dry run)
  Not submitting run 2 (dry run)
...
n=0.05-g=0.2
...
n=0.09-g=0.5
...
  Not submitting run 15 (dry run)
  Not submitting run 16 (dry run)
All cases submitted.
Dry run complete.
Sent notification email to ed@example.com.

The result will be an output directory that looks something like this:

/path/to/gs-results/huge-experiment/
  experiment.xml
  2009-08-08/
    16-35-56/
      n=0.05-g=0.1/
      n=0.05-g=0.2/
      ...
      n=0.09-g=0.4/
      n=0.09-g=0.5/

Each individual output directory will be empty except for a case.xml file, which contains the full parameter value specifications as well as a list of seeds for the random number generator, e.g.,

<?xml version="1.0" encoding="UTF-8"?>
<case name="huge-experiment - n=0.05-g=0.1 (2009-08-08, 16-35-56)">
  <value param="nu" value="0.05"/>
  <value param="gamma" value="0.1"/>
  <value param="alpha" value="0.5"/>
  <value param="beta" value="0.3"/>
  <run number="1" rngSeed="851871007"/>
  <run number="2" rngSeed="1309022849"/>
  ...
  <run number="15" rngSeed="459410224"/>
  <run number="16" rngSeed="1881943137"/>
</case>
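
Because case.xml records both the parameter values and the seed for every run, analysis scripts can read it back to label results. The following is a minimal sketch using Python's standard library. The read_case helper is hypothetical, the embedded sample contains only the first two runs from the listing above, and converting parameter values to floats is an assumption that suits this example.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the case.xml shown above (two runs only).
CASE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<case name="huge-experiment - n=0.05-g=0.1 (2009-08-08, 16-35-56)">
  <value param="nu" value="0.05"/>
  <value param="gamma" value="0.1"/>
  <value param="alpha" value="0.5"/>
  <value param="beta" value="0.3"/>
  <run number="1" rngSeed="851871007"/>
  <run number="2" rngSeed="1309022849"/>
</case>
"""

def read_case(xml_bytes):
    """Return (parameter dict, {run number: seed}) from case.xml content."""
    case = ET.fromstring(xml_bytes)
    params = {v.get("param"): float(v.get("value"))
              for v in case.findall("value")}
    seeds = {int(r.get("number")): int(r.get("rngSeed"))
             for r in case.findall("run")}
    return params, seeds

params, seeds = read_case(CASE_XML.encode("utf-8"))
print(params)
print(seeds)
```

To read a real file instead of the embedded string, ET.parse("case.xml").getroot() returns the same kind of element that ET.fromstring returns here.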

Once you have verified that everything looks right with a dry run, you can run the experiment for real like so:

gsdrone /path/to/experiment.xml

This time, the output will be slightly different:

Running experiment "huge-experiment"...
Created experiment directory "/path/to/gs-results/huge-experiment/2009-08-08/16-40-20".
Establishing grid session
Submitting cases:
n=0.05-g=0.1
  Submitted run 1 (DRMAA job ID 1224435.1)
  Submitted run 2 (DRMAA job ID 1224435.2)
...
n=0.05-g=0.2
...
n=0.09-g=0.5
...
  Submitted run 15 (DRMAA job ID 1224449.15)
  Submitted run 16 (DRMAA job ID 1224449.16)
All cases submitted.
Experiment submitted.
Detaching from console (monitoring process id: 27512)...
Status output will be written to:
  /path/to/gs-results/huge-experiment/2009-08-08/16-40-20/status.log
and an email will be sent to ed@example.com upon experiment completion.
You may now close this console or log out without disturbing the experiment.

If all goes well, GridSweeper will ship all your experiment runs to the grid, and in due time you will receive an email declaring that they are all complete. If you want to monitor progress as it goes, you can watch the status.log file, for example with

tail -f /path/to/gs-results/huge-experiment/2009-08-08/16-40-20/status.log

Using GridSweeper with Command-Line Programs (gsdrone)

The GridSweeper interface for command-line programs, gsdrone, is based on Ted Belding's drone program, which was used to perform parameter sweeps on a single machine. In short, the gsdrone tool takes your experiment XML file and passes all the resulting parameter combinations to your command-line model program on the grid.

Command-Line Program Requirements

The first step to using gsdrone is building a model program so that gsdrone can talk to it via command-line options. If you are working with existing model code, you can modify your executable to support gsdrone directly, or build an adapter script to pass command-line options to your model's native format. (If you have a whole class of models with a similar way of receiving parameter settings, it may make sense for you to build an adapter plugin so that GridSweeper can talk to them directly. See the section on custom adapters for more information.) If you are building a model from scratch, it probably will be easiest to build in gsdrone support from the start.

At the most basic, a gsdrone model must be able to accept three things at the command line:

  1. A seed for the random number generator, in the form [random-seed-option][random-seed], e.g. -S18482305.
  2. A run number, used as an output file suffix to distinguish different runs with the same parameter settings, in the form [run-number-option][run-number], e.g. -N17.
  3. Parameter values, in the form [parameter-value-option][parameter-name]=[parameter-value], e.g. -Pbeta=0.5. The [parameter-value-option] can be blank, in which case parameters will be passed simply as, e.g., beta=0.5.

gsdrone can also be configured to pass the location of an input file, for example to contain a list of default parameter values, via [input-file-option][input-file-path], and an arbitrary list of additional command-line options.
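
As one possible way to meet the three basic requirements, here is a minimal sketch of the argument handling for a model written in Python. It is an illustration under stated assumptions, not a reference implementation: parse_gsdrone_args is a hypothetical function name, the option prefixes default to the -S and -N forms used in the examples above, parameter values are assumed to be numeric, and input-file and miscellaneous options are not handled.

```python
import random

def parse_gsdrone_args(argv, seed_opt="-S", run_opt="-N", param_opt=""):
    """Split gsdrone-style arguments into (seed, run number, parameter dict)."""
    seed, run_number, params = None, None, {}
    for arg in argv:
        if arg.startswith(seed_opt):
            seed = int(arg[len(seed_opt):])        # e.g. -S18482305
        elif arg.startswith(run_opt):
            run_number = int(arg[len(run_opt):])   # e.g. -N17
        elif arg.startswith(param_opt) and "=" in arg:
            # e.g. beta=0.5, or -Pbeta=0.5 when param_opt is "-P"
            name, value = arg[len(param_opt):].split("=", 1)
            params[name] = float(value)
        else:
            raise ValueError("unrecognized argument: " + arg)
    return seed, run_number, params

# Demonstration with the example arguments from the list above.
seed, run_number, params = parse_gsdrone_args(["-S18482305", "-N17", "beta=0.5"])

rng = random.Random(seed)                           # seed the generator explicitly
output_name = "data.{:02d}.txt".format(run_number)  # label output by run number
print(seed, run_number, params, output_name)
```

In an actual model program, the list passed to parse_gsdrone_args would be sys.argv[1:] rather than a hard-coded example.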

Setting Up an Experiment File

Once your program is set up to accept input from gsdrone, you can write a control file for your experiment, as described in detail in Getting Started with GridSweeper. An experiment file that uses all the gsdrone options will look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<experiment name="huge-experiment" numRuns="16">
  <setting key="ResultsDirectory" value="/path/to/gs-results"/>
  <setting key="EmailAddress" value="ed@example.com"/>

  <!-- gsdrone settings-->
  <setting key="Model" value="/path/to/mymodel/mymodelprogram"/>
  <setting key="SetParamOption" value=""/>
  <setting key="RunNumOption" value="-N"/>
  <setting key="RNGSeedOption" value="-S"/>
  <setting key="UseInputFile" value="true"/>
  <setting key="InputFileOption" value="-I"/>
  <setting key="InputFilePath" value="/path/to/inputfile.txt"/>
  <setting key="MiscOptions">
    --verbose
    --debug
  </setting>

  <abbrev param="alpha" abbrev="a"/>
  <abbrev param="beta" abbrev="b"/>
  <abbrev param="gamma" abbrev="g"/>
  <abbrev param="nu" abbrev="n"/>

  <value param="alpha" value="0.5"/>
  <value param="beta" value="0.3"/>
  <value param="gamma" value="0.1"/>
  <value param="nu" value="0.04"/>

  <list param="nu">
    <item value="0.05"/>
    <item value="0.06"/>
    <item value="0.09"/>
  </list>
  <range param="gamma" start="0.1" end="0.5" increment="0.1" />
</experiment>

Settings specific to gsdrone experiment files are as follows:

Setting | Default Value | Description
Model | (none) | Specifies the path to the model program executable. Should be provided as a fully qualified path. (~ may not successfully map to the user's home directory.)
SetParamOption | "" | Specifies the command-line option used to prefix parameter values.
RunNumOption | "-N" | The command-line option used to prefix the run number.
RNGSeedOption | "-S" | The command-line option used to prefix the random number generator seed.
UseInputFile | "true" | Whether an input file will be used: true if "true"; false for any other value. If no actual input file is provided, this option is ignored.
InputFileOption | "-I" | The command-line option used to prefix the input file path.
InputFilePath | (none) | The path to the input file.
MiscOptions | (none) | Additional options to pass to the executable. If written as <setting key="MiscOptions" value="[option]"/>, the value will be passed to the executable as a single argument, whitespace included. If written as <setting key="MiscOptions">[options]</setting>, each line of the content will be passed along as a separate argument.
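
The two ways of writing MiscOptions lead to different argument lists, which the following sketch tries to make concrete. It is a guess at the described behavior, not GridSweeper's code: misc_option_args is a hypothetical helper, and trimming surrounding whitespace and skipping blank lines in the multi-line form are assumptions that the manual does not spell out.

```python
import xml.etree.ElementTree as ET

def misc_option_args(setting):
    """Turn a MiscOptions <setting> element into a list of arguments."""
    value = setting.get("value")
    if value is not None:
        # Attribute form: one single argument, whitespace included.
        return [value]
    # Content form: one argument per non-blank line (assumed trimmed).
    lines = (setting.text or "").splitlines()
    return [line.strip() for line in lines if line.strip()]

inline = ET.fromstring('<setting key="MiscOptions" value="--verbose --debug"/>')
block = ET.fromstring("""<setting key="MiscOptions">
    --verbose
    --debug
  </setting>""")

print(misc_option_args(inline))  # one argument containing a space
print(misc_option_args(block))   # two separate arguments
```

Most programs expect each option as its own argument, so the multi-line form is usually the one to use when passing more than one option.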

Running an Experiment with gsdrone

To run your experiment, use the gsdrone command:

gsdrone -d /path/to/experiment.xml

will perform a dry run, without actually executing your simulation program so you can verify that output directories match your expectations, and

gsdrone /path/to/experiment.xml

will perform an actual run.

Full details of running an experiment are described in Running An Experiment in the previous section.

Using GridSweeper with Matlab Scripts (gsmatlab)

GridSweeper supports running experiments based on scripts written in Matlab or Octave via the gsmatlab tool. Most Matlab users tend to perform parameter sweeps by writing nested loops to sweep over different parameter values. To produce a script that works with gsmatlab, you typically just need to remove your outer loops and wrap your script in a function following the right format.

Matlab Script Requirements

To be used with gsmatlab, a Matlab or Octave script needs to be written as a function file with a function that takes three parameters, in this order:

  1. A Matlab/Octave data structure (struct) that maps parameter names to parameter values.
  2. The run number, as an integer literal.
  3. The seed for the random number generator, also as an integer literal. Note that using gsmatlab will not automatically seed the Matlab/Octave random number generator for you: because there are many ways of generating random numbers, your script needs to do this itself using the provided seed.

The process is best illustrated with a simple example. The following is a gsmatlab-compliant script that uses parameters to generate a series of random numbers that are written into a file appropriately tagged with the run number:

function run_model(params, run_number, random_seed)

% Initialize standard random number generator
% (Matlab 7.7 or later syntax)
s = RandStream.create('mt19937ar','seed',random_seed);
RandStream.setDefaultStream(s);

% Create an output file identified by the run number
fid = fopen(sprintf('output.%d.txt', run_number), 'w');

% Print out parameter values
fprintf(fid, 'alpha=%f\n', params.alpha);
fprintf(fid, 'beta=%f\n', params.beta);
fprintf(fid, 'gamma=%f\n', params.gamma);
fprintf(fid, 'nu=%f\n', params.nu);

% Print 100 U(0,1) random numbers multiplied by the parameter values
for i=1:100
    fprintf(fid, '%e\n', rand * params.alpha ...
        * params.beta * params.gamma * params.nu);
end

% Be a good citizen and close the file
fclose(fid);

Setting Up an Experiment File

Once you have a script properly formatted to accept input from gsmatlab, you can write a control file for your experiment, as described in detail in Getting Started with GridSweeper. An experiment file that uses all the gsmatlab options will look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<experiment name="huge-experiment" numRuns="16">
  <setting key="ResultsDirectory" value="/path/to/gs-results"/>
  <setting key="EmailAddress" value="ed@example.com"/>

  <!-- gsmatlab settings-->
  <setting key="FunctionName" value="run_model"/>
  <setting key="MatlabExecutablePath" value="/usr/local/bin/matlab"/>
  <setting key="MatlabSearchPath" value="/path/to/modeldir"/>
  <setting key="MatlabOptions" value="-nodisplay"/>

  <abbrev param="alpha" abbrev="a"/>
  <abbrev param="beta" abbrev="b"/>
  <abbrev param="gamma" abbrev="g"/>
  <abbrev param="nu" abbrev="n"/>

  <value param="alpha" value="0.5"/>
  <value param="beta" value="0.3"/>
  <value param="gamma" value="0.1"/>
  <value param="nu" value="0.04"/>

  <list param="nu">
    <item value="0.05"/>
    <item value="0.06"/>
    <item value="0.09"/>
  </list>
  <range param="gamma" start="0.1" end="0.5" increment="0.1" />
</experiment>

Settings specific to gsmatlab experiment files are as follows:

Setting | Default Value | Description
FunctionName | (none) | The name of the gsmatlab-compliant function that performs a simulation run. Must take three parameters, in this order: (1) a data structure mapping parameter names to parameter values; (2) the run number, passed as an integer literal; and (3) the seed for the random number generator, passed as an integer literal. A return value, if present, will be ignored.
MatlabSearchPath | (none) | Search path that enables Matlab or Octave to call your function, i.e., the directory where your run_model.m file resides. Search paths are added using the built-in Matlab/Octave function addpath. You can provide multiple directories using the format <setting key="MatlabSearchPath">[search paths]</setting>, one search path per line.
MatlabExecutablePath | "/usr/local/bin/matlab" | The full path to the Matlab or Octave executable in your environment.
MatlabOptions | "-nodisplay" | Options to pass to the Matlab/Octave executable. Note that Octave does not understand "-nodisplay", so you will need to override it with "" (or something else Octave understands). You can provide multiple options using the format <setting key="MatlabOptions">[options]</setting>, one option per line.

Running an Experiment with gsmatlab

To run your experiment, follow the instructions described in Running An Experiment above, but be sure to use the gsmatlab command:

gsmatlab -d /path/to/experiment.xml

will perform a dry run, without actually executing your script so you can verify that output directories match your expectations, and

gsmatlab /path/to/experiment.xml

will perform an actual run. What gsmatlab will actually do is start running an instance of Matlab or Octave, e.g., by running

/usr/local/bin/matlab -nodisplay

and then sending it a series of commands that add the search path, construct a parameters data structure, and finally run your model, in that order. For our example, one of the runs might send these commands to Matlab:

addpath('/path/to/modeldir');
params.alpha = 0.5;
params.beta = 0.3;
params.gamma = 0.4;
params.nu = 0.09;
run_model(params, 15, 459410224);

Again, remember that because there are many possible ways to generate random numbers, gsmatlab will not initialize the random number generator for you: you must do this yourself in your script.

Using GridSweeper with R Scripts (gsr)

GridSweeper supports running experiments based on scripts written in R via the gsr tool. Most R users tend to perform parameter sweeps by writing nested loops to sweep over different parameter values. To produce a script that works with gsr, you typically just need to remove your outer loops and wrap your script in a function following the right format.

R Script Requirements

To be used with gsr, an R script needs to contain a function that takes three parameters, in this order:

  1. An R (named) list that maps parameter names to parameter values.
  2. The run number, as an integer literal.
  3. The seed for the random number generator, also as an integer literal. Note that using gsr will not automatically seed the R random number generator for you: because you may want to use a different generator than the default, you need to call set.seed yourself.

The process is best illustrated with a simple example. The following is a gsr-compliant script that uses parameters to generate a series of random numbers that are written into a file appropriately tagged with the run number:

run_model = function(params, run_number, random_seed)
{
  # Initialize standard random number generator
  set.seed(random_seed);

  # Redirect output to an output file
  sink(sprintf("output.%d.txt", run_number))

  # Print out parameter values
  cat(sprintf("alpha=%f\n", params$alpha))
  cat(sprintf("beta=%f\n", params$beta))
  cat(sprintf("gamma=%f\n", params$gamma))
  cat(sprintf("nu=%f\n", params$nu))

  # Print 100 U(0,1) random numbers multiplied by the parameter values
  for(i in 1:100)
  {
    cat(sprintf("%e\n", runif(1) * params$alpha
      * params$beta * params$gamma * params$nu));
  }

  # Restore output to console
  sink();
}

Setting Up an Experiment File

Once you have a script properly formatted to accept input from gsr, you can write a control file for your experiment, as described in detail in Getting Started with GridSweeper. An experiment file that uses all the gsr options will look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<experiment name="huge-experiment" numRuns="16">
  <setting key="ResultsDirectory" value="/path/to/gs-results"/>
  <setting key="EmailAddress" value="ed@example.com"/>

  <!-- gsr settings-->
  <setting key="FunctionName" value="run_model"/>
  <setting key="RExecutablePath" value="/usr/local/bin/R"/>
  <setting key="RInputFile" value="/path/to/inputfile.R"/>
  <setting key="ROptions" value="--vanilla"/>

  <abbrev param="alpha" abbrev="a"/>
  <abbrev param="beta" abbrev="b"/>
  <abbrev param="gamma" abbrev="g"/>
  <abbrev param="nu" abbrev="n"/>

  <value param="alpha" value="0.5"/>
  <value param="beta" value="0.3"/>
  <value param="gamma" value="0.1"/>
  <value param="nu" value="0.04"/>

  <list param="nu">
    <item value="0.05"/>
    <item value="0.06"/>
    <item value="0.09"/>
  </list>
  <range param="gamma" start="0.1" end="0.5" increment="0.1" />
</experiment>

Settings specific to gsr experiment files are as follows:

Setting | Default Value | Description
FunctionName | (none) | The name of the gsr-compliant function that performs a simulation run. Must take three parameters, in this order: (1) a named list mapping parameter names to parameter values; (2) the run number, passed as an integer literal; and (3) the seed for the random number generator, passed as an integer literal. A return value, if present, will be ignored.
RInputFile | (none) | File to load into R containing the definition of your function. Files are loaded using the standard source R command. You can provide multiple files to be loaded using the format <setting key="RInputFile">[input files]</setting>, one file per line.
RExecutablePath | "/usr/local/bin/R" | The full path to the R executable in your environment.
ROptions | "--vanilla" | Options to pass to the R executable. You can provide multiple options using the format <setting key="ROptions">[options]</setting>, one option per line.

Running an Experiment with gsr

To run your experiment, follow the instructions described in Running An Experiment above, but be sure to use the gsr command:

gsr -d /path/to/experiment.xml

will perform a dry run, without actually executing your script so you can verify that output directories match your expectations, and

gsr /path/to/experiment.xml

will perform an actual run. What gsr will actually do is start running an instance of R, e.g., by running

/usr/local/bin/R --vanilla

and then sending it a series of commands that load the input file, construct a list of parameter values, and finally run your model, in that order. For our example, one of the runs might send these commands to R:

source("/path/to/inputfile.R")
params = list()
params$alpha = 0.5
params$beta = 0.3
params$gamma = 0.4
params$nu = 0.09
run_model(params, 15, 459410224)

Again, remember that gsr will not initialize the random number generator for you: you must do this yourself in your script using, e.g., set.seed.

Using GridSweeper with NetLogo Models (gsnetlogo)

GridSweeper can run distributed batches of NetLogo models, typically with zero or very little modification of the model. Experiment XML files for NetLogo models are run using the gsnetlogo tool, which handles converting parameter settings and random seeds to NetLogo's native format.

NOTE: NetLogo has a built-in mechanism for performing parameter sweeps on a single machine, called BehaviorSpace. In fact, gsnetlogo controls NetLogo by generating BehaviorSpace-compatible XML for each run of the model. However, as of version 1.0, gsnetlogo does not parse user-created BehaviorSpace experiments from the .nlogo model file. This is planned for the future.

NetLogo Model Requirements

As long as your NetLogo model has parameters set up in the normal way, and can be controlled via standard setup and go commands, it should work unmodified with gsnetlogo. For example, consider one of the example models that comes with NetLogo 4.0.4.

In this model, all the slider parameters—altruistic-probability, selfish-probability, etc.—can be controlled by gsnetlogo automatically; they are modified in a GridSweeper experiment XML file normally. If you want to output data once per tick using the standard NetLogo output format used by BehaviorSpace, you need not make any modifications at all.

If you perform custom output to data files, you will need a way to receive the run number from GridSweeper. This is done by setting up an extra parameter—called whatever you like, say run-number—and then instructing GridSweeper to use that parameter to set the run number in the experiment XML file, as described in the next section.

Setting Up an Experiment File

Experiment control files for NetLogo models are largely as described in detail in Getting Started with GridSweeper, with several additional settings to tell GridSweeper how to talk to NetLogo.

An experiment file that uses all the gsnetlogo options will look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<experiment name="huge-experiment" numRuns="16">
  <setting key="ResultsDirectory" value="/path/to/gs-results"/>
  <setting key="EmailAddress" value="ed@example.com"/>

  <!-- gsnetlogo settings-->
  <setting key="Model" value="/path/to/model.nlogo"/>
  <setting key="NetLogoJarPath" value="/path/to/NetLogo.jar"/>
  <setting key="JavaPath" value="/usr/bin/java"/>
  <setting key="JavaOptions">
    -server
    -Xmx1024M
  </setting>
  <setting key="RunNumberVariable" value="run-number"/>
  <setting key="Metrics">
    %susceptible
    %infected
  </setting>
  <setting key="OutputFormat" value="Table"/>
  <setting key="RunMetricsEveryStep" value="true"/>
  <setting key="Setup" value="setup"/>
  <setting key="Go" value="go"/>
  <setting key="Final" value="final"/>
  <setting key="ExitCondition" value="%infected = 0"/>
  <setting key="TimeLimit" value="1000"/>

  <abbrev param="alpha" abbrev="a"/>
  <abbrev param="beta" abbrev="b"/>
  <abbrev param="gamma" abbrev="g"/>
  <abbrev param="nu" abbrev="n"/>

  <value param="alpha" value="0.5"/>
  <value param="beta" value="0.3"/>
  <value param="gamma" value="0.1"/>
  <value param="nu" value="0.04"/>

  <list param="nu">
    <item value="0.05"/>
    <item value="0.06"/>
    <item value="0.09"/>
  </list>
  <range param="gamma" start="0.1" end="0.5" increment="0.1" />
</experiment>

Settings specific to gsnetlogo experiment files are as follows:

Setting | Default Value | Description
Model | (none) | The full path to the .nlogo model file.
NetLogoJarPath | (none) | The full path to the NetLogo.jar file. If you are unsure where this is, ask your system administrator.
JavaPath | "/usr/bin/java" | The full path to the Java executable. If you are unsure where this is, ask your system administrator or type which java at the command line to see where the default installation resides on your system.
JavaOptions | "-server", "-Xmx1024M" | Command-line options to pass to the Java runtime. The default options are the ones recommended by the NetLogo team. To provide multiple options, use the format <setting key="JavaOptions">[options]</setting>, with one option on each line.
RunNumberVariable | (none) | If provided, a variable used by GridSweeper to set the run number so your model can properly name custom output files.
Metrics | (none) | A list of metrics (typically variables that your code updates each tick) to be automatically written to an output file. To provide multiple metrics, use the format <setting key="Metrics">[metrics]</setting>, with one metric on each line.
OutputFormat | "Table" | The NetLogo standard output format to write metrics and other run data to, either "None", "Table", "Spreadsheet", or "Both" (case-sensitive). Output data will be written to table.[run-number].csv and/or spreadsheet.[run-number].csv.
RunMetricsEveryStep | "true" | Whether to output metrics every timestep, or just at the end of the run. Must be "true" or "false".
Setup | "setup" | NetLogo commands to set up the run. Multiple commands can be provided in the format <setting key="Setup">[commands]</setting>, one command per line.
Go | "go" | NetLogo commands to cause the model to advance a tick. Multiple commands are allowed, as with Setup.
Final | (none) | NetLogo commands to perform when the model has finished. Multiple commands are allowed, as with Setup.
ExitCondition | (none) | A conditional statement that when true will cause the model to finish.
TimeLimit | (none) | The maximum number of ticks to run the model for.
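If you generate experiment files from a script rather than writing them by hand, the two setting forms described above (a value attribute, or one entry per line between the tags) can be produced with a standard XML library. The following Python sketch is only an illustration: the build_experiment function, the experiment name, and the convention that a Python list becomes multi-line content are assumptions made for this example and are not part of GridSweeper.

```python
import xml.etree.ElementTree as ET

def build_experiment(name, num_runs, settings, values):
    """Return an <experiment> element (hypothetical helper, not part of GridSweeper).

    Settings given as a list or tuple are written as multi-line content,
    one entry per line; all other settings use the value attribute.
    """
    experiment = ET.Element("experiment", {"name": name, "numRuns": str(num_runs)})
    for key, value in settings.items():
        setting = ET.SubElement(experiment, "setting", {"key": key})
        if isinstance(value, (list, tuple)):
            # One entry per line between the start and end tags
            setting.text = "\n" + "\n".join("    " + v for v in value) + "\n  "
        else:
            setting.set("value", str(value))
    for param, value in values.items():
        ET.SubElement(experiment, "value", {"param": param, "value": str(value)})
    return experiment

experiment = build_experiment(
    "netlogo-sketch", 16,
    {
        "Model": "/path/to/model.nlogo",
        "NetLogoJarPath": "/path/to/NetLogo.jar",
        "Metrics": ["%susceptible", "%infected"],
        "TimeLimit": 1000,
    },
    {"alpha": 0.5},
)
xml_text = ET.tostring(experiment, encoding="unicode")
print(xml_text)
```

The output is not pretty-printed, so you may want to reformat it before reading it by eye; the XML content is the same either way.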

Running an Experiment with gsnetlogo

To run your experiment, follow the instructions described in Running An Experiment above, but be sure to use the gsnetlogo command:

gsnetlogo -d /path/to/experiment.xml

will perform a dry run without actually executing your model, so you can verify that output directories match your expectations, and

gsnetlogo /path/to/experiment.xml

will perform an actual run.

When a run is executed, what actually happens is that GridSweeper generates a BehaviorSpace XML file, .gsnetlogo.[run-number].xml, in the working directory, and then runs a separate Java process for NetLogo, e.g., via:

/usr/bin/java -server -Xmx1024M -cp /path/to/NetLogo.jar \
  org.nlogo.headless.HeadlessWorkspace --model /path/to/model.nlogo \
  --setup-file .gsnetlogo.15.xml --table table.15.csv

NetLogo then runs the model in headless mode, generating output files as requested.
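To make the relationship between the experiment settings and this command line concrete, here is a rough Python sketch that assembles the same argument list from the JavaPath, JavaOptions, NetLogoJarPath, and Model settings plus a run number. The netlogo_command function is hypothetical and simply mirrors the example command shown above; it is not how gsnetlogo is actually implemented, and it covers only the Table output format.

```python
import shlex

def netlogo_command(java_path, java_options, jar_path, model_path, run_number):
    """Assemble a headless NetLogo command for one run (illustrative only)."""
    return (
        [java_path]
        + list(java_options)
        + [
            "-cp", jar_path,
            "org.nlogo.headless.HeadlessWorkspace",
            "--model", model_path,
            "--setup-file", ".gsnetlogo.%d.xml" % run_number,
            "--table", "table.%d.csv" % run_number,
        ]
    )

command = netlogo_command(
    "/usr/bin/java", ["-server", "-Xmx1024M"],
    "/path/to/NetLogo.jar", "/path/to/model.nlogo", 15,
)
# Print the command in a form that can be pasted into a shell
print(" ".join(shlex.quote(arg) for arg in command))
```

Keeping the command as a list of arguments, rather than one long string, avoids quoting problems if a path contains spaces.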

Modifying Sweeps at the Command Line

The GridSweeper tools (gsdrone, gsmatlab, etc.) support, in addition to running an experiment from the command line, flexible command-line options to create and modify parameter sweeps. To see full details about how to do this, you can view the manpage online or at the command line with

man gsdrone

Installing GridSweeper

GridSweeper installation is fairly straightforward: if you have the right system, download the package, unzip it in a convenient location, and perform a small bit of configuration.

System Requirements

GridSweeper is known to require the following:

  • Linux/Mac OS X/other Unix with python and mail
  • JDK 5 ("1.5") or later, 32-bit or 64-bit.
  • Grid system with Java DRMAA 1.0 support. Native DRMAA library must match bit mode (32-bit or 64-bit) of JDK.
  • CERN Colt (included in distribution).
  • A shared filesystem, so that the submission host and all execution hosts have write access to the same results directory. Execution hosts must have access to this directory when being run by the grid—if the grid runs jobs as a sandboxed user, GridSweeper will not work.

NOTE: GridSweeper has been tested primarily in a Linux environment using Sun Grid Engine 6.2u5. If you have success or issues on other platforms, please email gridsweeper@umich.edu with comments. Also see the Troubleshooting section in this manual.

Installation Steps

First, download the GridSweeper 1.0 installation from the GridSweeper main page, and unzip it, e.g.,

unzip gridsweeper-1.0

Next, you need to make sure GridSweeper can link against the DRMAA library (drmaa.jar). With Sun Grid Engine, drmaa.jar is located in the lib subdirectory of the SGE root.

The easiest way to do this is to add a symbolic link to drmaa.jar in the GridSweeper lib directory:

ln -s /path/to/sge/lib/drmaa.jar /path/to/gridsweeper/lib/drmaa.jar

If the JAVA_HOME environment variable is set, GridSweeper will use that version of Java. Otherwise, it will use the Java installation accessed via the command java at the shell prompt—that is, whatever comes up when you type which java. The same JAVA_HOME will be used on execution hosts. If you want to use a different Java configuration, set the JAVA_HOME environment variable. For more customization, you can modify gridsweeper/scripts/gsweep and gridsweeper/scripts/gsrunner via the JAVA and JAVA_OPTIONS variables at the top of these two files.

Finally, you will need to add the gridsweeper/scripts directory to your system search path, and add gridsweeper/man to your system manpage path, or instruct users to add them individually. E.g., a user might have to modify her .bash_profile like so:

export PATH=$PATH:/path/to/gridsweeper/scripts
export MANPATH=$MANPATH:/path/to/gridsweeper/man

That's it. Users should now be able to use the gsdrone, gsmatlab, gsr, and gsnetlogo scripts to submit experiments to the grid.

Implementing a Custom Adapter

In GridSweeper, a software object called an adapter translates the parameter sweeps set up by the user into a format the desired model can understand, and actually runs the model itself. Built-in adapter classes provide support for Drone-style command-line programs, Matlab and R scripts, and NetLogo models.

Additional adapters can be installed in the plugins directory inside the GridSweeper installation directory, either in JAR files or as raw Java classes in a package directory hierarchy rooted in plugins.

If you are familiar with Java programming, it is fairly straightforward to implement a custom adapter. First, configure your programming environment to link against the gridsweeper/lib/gridsweeper.jar, and then create a class that implements the gridsweeper.adapter.Adapter interface.

Custom adapter classes must implement the run method defined in the interface, as well as a single constructor that takes an instance of gridsweeper.util.Settings as a parameter:

import gridsweeper.adapter.*;
import gridsweeper.util.*;
import java.io.*;

public class SampleAdapter implements Adapter
{
  // Instance variables to store settings here

  public SampleAdapter(Settings settings) throws AdapterException
  {
    // Load settings from the Settings object
  }

  public RunResults run(ParameterMap parameterMap, int runNumber,
    int numRuns, int rngSeed, OutputStream stdout, OutputStream stderr)
    throws AdapterException
  {
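    // Run the model using the provided parameters, etc., passing
    // output and error to the appropriate output streams if desired,
    // then return a RunResults object describing the run
    return null; // placeholder so this skeleton compiles; replace with real results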
  }
}

The best way to get going is to look at the sample adapter provided in the GridSweeper source code (gridsweeper.example.SampleAdapter) as well as the source code for the built-in adapters in the gridsweeper.adapter package.

Once the adapter is completed, package it into a .jar file and place it in the gridsweeper/plugins folder. Users can now use your adapter by specifying it in experiment XML files:

<setting key="Adapter" value="my.adapter.ClassName"/>

and then running GridSweeper using the built-in gsweep tool:

gsweep /path/to/experiment.xml

To make things a bit easier, you can also write a script that sets the adapter, which is how the gsdrone, gsmatlab, etc. scripts work, so the user can just type

custom-script experiment.xml

without worrying about Java class names in their experiment file.

For example, if you add your script into /path/to/gridsweeper/scripts, it could look like this:

#!/bin/sh

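# Find this script's absolute path, then the GridSweeper root two levels up
ABSPATH=`python -c "import os; print(os.path.abspath(\"$0\"))"`
BINPATH=`dirname "$ABSPATH"`
GRIDSWEEPER_ROOT=`dirname "$BINPATH"`
ADAPTER="my.adapter.ClassName"
# Hand all command-line arguments to gsweep with the adapter preselected
"${GRIDSWEEPER_ROOT}/scripts/gsweep" -a "${ADAPTER}" "$@"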

Experiment XML File Format

The experiment file format described here is used for both input and output files. Experiment files are written in XML with a simple set of elements. At the top level is the <experiment> element:

<?xml version="1.0" encoding="UTF-8"?>
<experiment name="My Experiment" numRuns="10" firstSeedRow="0" seedCol="0" gridConfig="-q anotherqueue">
	<!-- ... -->
</experiment>

All attributes of the <experiment> element are optional. The name attribute is used to name experiment directories in the filesystem and in naming strings submitted to the grid. The numRuns attribute specifies how many runs with different random seeds should be completed for each case.

The attributes firstSeedRow and seedCol identify the starting seed location in a virtual table of random seeds that is used to extract the sequence of random seeds that will be assigned to runs. See Random Seed Generation for more information.

The gridConfig attribute passes along a configuration string to the underlying grid system. The format of this string is system-dependent. In Sun Grid Engine, for example, this string can contain the same options that would be passed to qsub, such as to specify a queue. Terminology note: gridConfig corresponds to the "native specification" in the DRMAA standard.

Elements that may appear within <experiment> are described below.

<setting>

The <experiment> element will typically contain one or more <setting> elements, which look like this:

<setting key="key" value="value"/>

Values may also be assigned as multi-line XML content between start and end tags:

<setting key="key">
  value
</setting>

See the earlier sections on using GridSweeper for a description of available settings.
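If you need to inspect experiment files with your own tools, both forms of <setting> can be read with a standard XML parser. The Python sketch below is a minimal illustration: the read_settings function is hypothetical, and representing multi-line content as a list with one stripped entry per non-empty line is an assumption about how such content is meant to be interpreted.

```python
import xml.etree.ElementTree as ET

EXPERIMENT_XML = """<experiment name="sketch" numRuns="2">
  <setting key="JavaPath" value="/usr/bin/java"/>
  <setting key="JavaOptions">
    -server
    -Xmx1024M
  </setting>
</experiment>"""

def read_settings(experiment_xml):
    """Map each setting key to a string (value attribute) or a list of lines."""
    root = ET.fromstring(experiment_xml)
    settings = {}
    for element in root.findall("setting"):
        if "value" in element.attrib:
            settings[element.get("key")] = element.get("value")
        else:
            # Multi-line content: keep one entry per non-empty line
            lines = (element.text or "").splitlines()
            settings[element.get("key")] = [l.strip() for l in lines if l.strip()]
    return settings

settings = read_settings(EXPERIMENT_XML)
print(settings)
```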

<value>

The <value> element is used to assign single values to parameters. It takes the form

<value param="param" value="value"/>

Parameter values can be any string; special XML entities must be properly escaped.

<list>

The <list> element is used to define a list sweep for a particular parameter. It can contain <item> elements and <range> elements to specify parameter values, as shown here:

<list param="param">
  <item value="0.1"/>
  <item value="0.3"/>
  <range start="0.5" end="1.0" increment="0.1"/>
  <!-- ... -->
</list>

<range>

The <range> element is used to define a range list sweep for a particular parameter. In addition to the parameter name, it supports and requires three attributes, for the start value, end value, and increment:

<range param="param" start="0.0" end="1.0" increment="0.1"/>

If provided as a child of a <list> element, the param attribute is optional, and a conflicting name will result in an error.
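To check which values a given range will produce, you can enumerate them yourself. The Python sketch below assumes that the end value is included whenever it is reached exactly, and it uses decimal arithmetic so that increments such as 0.1 do not accumulate binary floating-point error. The expand_range function is hypothetical, it only handles positive increments, and GridSweeper's own arithmetic may differ in detail.

```python
from decimal import Decimal

def expand_range(start, end, increment):
    """List the values from start to end (inclusive) in steps of increment.

    Arguments are strings, as they appear in the XML attributes.
    """
    first, last, step = Decimal(start), Decimal(end), Decimal(increment)
    if step <= 0:
        raise ValueError("increment must be positive in this sketch")
    values = []
    current = first
    while current <= last:
        values.append(str(current))
        current += step
    return values

print(expand_range("0.0", "1.0", "0.1"))
print(len(expand_range("0", "100", "5")))
```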

<multiplicative>

The <multiplicative> tag is used to define a multiplicative combination sweep. This tag is strictly a container:

<multiplicative>
  <range param="param1"
   start="0.0" end="1.0" increment="0.1"/>
  <range param="param2"
   start="0" end="100" increment="5"/>
</multiplicative>
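Assuming that a multiplicative combination means every value of each child paired with every value of the others (a Cartesian product, as the name suggests), the resulting cases can be sketched in Python as follows. The multiplicative function and the short value lists are made up for illustration.

```python
from itertools import product

def multiplicative(sweeps):
    """Combine per-parameter value lists into every possible combination.

    sweeps maps a parameter name to its list of values; the result is a
    list of cases, each a dict from parameter name to a single value.
    """
    names = list(sweeps)
    value_lists = [sweeps[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

cases = multiplicative({
    "param1": ["0.0", "0.5", "1.0"],
    "param2": ["0", "50"],
})
for case in cases:
    print(case)
```

Under the same assumption, and with inclusive end values, the XML example above would yield 11 values of param1 times 21 values of param2, or 231 cases, each of which is then run numRuns times.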

<parallel>

The <parallel> tag is used to define a parallel combination sweep. This is also just a container, whose children must all generate the exact same number of cases, six each in this example:

<parallel>
  <range param="param1"
   start="0.0" end="1.0" increment="0.2"/>
  <range param="param2"
   start="0" end="100" increment="20"/>
  <list param="param3">
    <item value="25"/>
    <item value="399"/>
    <item value="4096"/>
    <item value="33333"/>
    <item value="1677216"/>
    <item value="10000000"/>
  </list>
</parallel>
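Assuming that a parallel combination pairs the first value of every child together, then the second values, and so on, the cases can be sketched in Python as follows. The parallel function is hypothetical; the length check reflects the requirement above that all children generate the same number of cases.

```python
def parallel(sweeps):
    """Step through all per-parameter value lists in lockstep.

    sweeps maps a parameter name to its list of values; every list must
    have the same length.
    """
    names = list(sweeps)
    lengths = {len(sweeps[name]) for name in names}
    if len(lengths) > 1:
        raise ValueError("all children must generate the same number of cases")
    value_lists = [sweeps[name] for name in names]
    return [dict(zip(names, combo)) for combo in zip(*value_lists)]

cases = parallel({
    "param1": ["0.0", "0.2", "0.4", "0.6", "0.8", "1.0"],
    "param2": ["0", "20", "40", "60", "80", "100"],
    "param3": ["25", "399", "4096", "33333", "1677216", "10000000"],
})
for case in cases:
    print(case)
```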

Case XML File Format

Case files are also written out as XML. The format is very simple, consisting of a single <case> element that in turn contains a number of <value> and <run> elements.

The <case> element includes a single attribute, name, which is intended for human readability only and is constructed by GridSweeper from the experiment name, the parameter settings, and the date. For example:

<case name="echo - r=0.4-s=0.9 (2007-07-20, 16-29-39)">
	<!-- ... -->
</case>

The <value> elements are the same as in experiment XML files, and are the only type of parameter specification allowed in case XML files. They specify the parameter name with the param attribute, and the value with the value attribute, as in:

<value param="r" value="0.4"/>

Each <run> element includes two attributes, number and rngSeed:

<run number="1" rngSeed="1986201165"/>

Here is a complete example of a case file:

<?xml version="1.0" encoding="UTF-8"?>
<case name="echo - r=0.4-s=0.9 (2007-07-20, 16-29-39)">
  <value param="r" value="0.4"/>
  <value param="s" value="0.9"/>
  <run number="0" rngSeed="526374054"/>
  <run number="1" rngSeed="1986201165"/>
  <run number="2" rngSeed="1585196345"/>
  <run number="3" rngSeed="1619001183"/>
  <run number="4" rngSeed="2137463870"/>
  <run number="5" rngSeed="549727158"/>
  <run number="6" rngSeed="1322681018"/>
  <run number="7" rngSeed="296371489"/>
  <run number="8" rngSeed="1066118686"/>
  <run number="9" rngSeed="1141036221"/>
</case>
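Because the format is so simple, case files are easy to read from analysis scripts, for example to match output files to the parameters and seeds that produced them. The Python sketch below parses a shortened version of the example above; the read_case function is hypothetical, and converting run numbers and seeds to integers is a choice made for this illustration.

```python
import xml.etree.ElementTree as ET

CASE_XML = """<case name="echo - r=0.4-s=0.9 (2007-07-20, 16-29-39)">
  <value param="r" value="0.4"/>
  <value param="s" value="0.9"/>
  <run number="0" rngSeed="526374054"/>
  <run number="1" rngSeed="1986201165"/>
</case>"""

def read_case(case_xml):
    """Return (case name, parameter values, seed for each run number)."""
    root = ET.fromstring(case_xml)
    params = {v.get("param"): v.get("value") for v in root.findall("value")}
    seeds = {int(r.get("number")): int(r.get("rngSeed")) for r in root.findall("run")}
    return root.get("name"), params, seeds

name, params, seeds = read_case(CASE_XML)
print(name)
print(params)
print(seeds)
```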

Random Seed Generation

Random seeds are generated using the RandomSeedGenerator class from the CERN Colt scientific computing library, whose sole purpose is to decorrelate seeds from any uniform random number generator. Seeds are selected deterministically, in sequence from one of two columns, 0 or 1, in a virtual seed table. The range of rows is 0 to 2^32 - 1. The firstSeedRow and seedCol attributes in experiment XML can be used to specify a starting point in the table; if they are missing, they are chosen at random. Unless you are trying to reproduce a prior experiment, there is no reason to specify these attributes, but they will appear in the experiment file generated in the experiment results directory. You can read more in the RandomSeedGenerator API documentation.

Troubleshooting

GridSweeper has not been tested in many environments, so you may run into quirks specific to your configuration. If you are having trouble getting things set up, or if you had to perform a special workaround for your system, please send comments to gridsweeper@umich.edu.

Jobs aren't running

First, make sure your installation meets the listed system requirements. If you're sure it does, make sure that you can submit single jobs using the native submission system for your grid (e.g., qsub). Check the shared filesystem: is GridSweeper creating output directories before jobs run, and are those output directories visible to all execution agents at job runtime? Check individual output directories, and make sure that files called .gsweep_in.[run-number] are being created.

Jobs are returning errors

Make sure that your jobs are working properly when run manually one at a time. Check your debugging output in stderr.[run-number] and stdout.[run-number]. Also check GridSweeper's debugging output in .gsweep_err.[run-number]. If all jobs are returning the same error, try submitting a single run using your grid's native submission system to see if the fault lies with GridSweeper or the underlying system.

I can't specify a queue (or some other qsub option)

Actually, you can. As of version 1.0.1, you can use the gridConfig experiment attribute to pass along a string to the underlying grid system, like so:

<?xml version="1.0" encoding="UTF-8"?>
<experiment name="My Experiment" numRuns="10" gridConfig="-q anotherqueue">
	<!-- ... -->
</experiment>

In Sun Grid Engine, this string takes the same form as command-line options passed to qsub. Other grid systems are likely to work similarly, but what gridConfig actually does may vary.

Terminology note: in the DRMAA specification, this string is referred to as a "native specification."

Emails aren't being sent

Make sure that it is possible to send emails on your system using the standard Unix mail command (Sendmail). If you have a unique system configuration, it may be possible to modify gridsweeper/scripts/gsmail to work with your mail system.

Other problems

If you are having another issue, or if you have had to deal with one, please send an email to gridsweeper@umich.edu and let us know. We will add additional notes here as they arise.

Version History

Version 1.0.2, 5 March 2011

  • The JAVA_HOME environment variable is used to determine which Java to use. The same copy of Java is always used on submit hosts and execution hosts.
  • Empty stderr and stdout files generated by jobs are now removed if their size is zero.
  • Temporary files .gsweep_in, .gsweep_err, and .gsweep_out are deleted unless errors occur, or the -t or --keep-temp-files command-line option is specified.
  • If available, the hostname is listed in the run log and error messages in notification emails.
  • Notification emails are saved in a user-visible file (notification_email.txt in the output directory), and are thus accessible even if email delivery has not been configured properly.
  • GridSweeper has been updated to use the DRMAA Java bindings version 1.0, and thus requires a version of SGE or other grid system that supports them. (The documentation previously stated, incorrectly, that it already used 1.0; in fact, the version supported was 0.95.)

Version 1.0.1, 14 January 2010

Introduced gridConfig attribute to enable grid system-specific options (e.g., qsub command-line options).

Version 1.0, 9 October 2009

First generally available version.