Using Hyperparameter Optimization (HPO)

Steps for importing the crayai module, defining parameters to optimize, creating an evaluator, creating an optimizer, and then optimizing over the hyperparameters.

This procedure requires the following software to be installed on the system:
  • Python 3.6
  • numpy
To use the HPO framework, a user must perform the following steps:
  1. Import the required module.
  2. Define parameters to be optimized.
  3. Create an Evaluator.
  4. Create an Optimizer.
  5. Optimize over the parameters.
A description of each of these steps is provided in this procedure.
  1. Log on to a login node.
  2. Load the analytics module.
  3. Load the desired Python 3 environment.
    The site default python3 or a local python3 environment can be used as long as numpy is installed.
    login$ module load cray-python
  4. Import the hpo sub module of the crayai module in a Python script.
    from crayai import hpo
  5. Define parameters to be optimized.
    These are the parameters to optimize over. They are exposed to the training program through command-line flags. The crayai hpo tool searches within a specified range, starting from a specified default value.
    Hyperparameter Definition Format
    params = hpo.Params([[command_line_flag_1, default_val_1, (min_val, max_val)],
                         [command_line_flag_2, default_val_2, (min_val, max_val)],
                         ...
                        ])
    Hyperparameter Definition Example
    params = hpo.Params([["--learningRate", 0.01, (1e-6, 1.0)],
                         ["--neuronsPerLayer", 100, (50, 500)],
                         ["--dropoutRate", 0.5, (0.0, 0.7)]])
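    The training program itself must accept these flags. A minimal sketch of what source/train.py might look like (illustrative only; the stand-in loss calculation and the "FoM:" marker string are assumptions, not part of crayai):

    ```python
    # Hypothetical sketch of a training script whose hyperparameters are
    # exposed as command-line flags, matching the definitions above.
    import argparse

    def main(argv=None):
        parser = argparse.ArgumentParser()
        parser.add_argument("--learningRate", type=float, default=0.01)
        parser.add_argument("--neuronsPerLayer", type=int, default=100)
        parser.add_argument("--dropoutRate", type=float, default=0.5)
        args = parser.parse_args(argv)

        # ... model training would happen here; a stand-in loss is used ...
        loss = abs(args.learningRate - 0.001) + args.dropoutRate

        # Print the figure of merit next to a unique marker string so the
        # Evaluator (via its fom argument) can locate it in the output.
        print("FoM: %f" % loss)
        return loss

    if __name__ == "__main__":
        main()
    ```

    Each evaluation runs this script with a different combination of flag values and reads the resulting figure of merit from its output.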
  6. Create an Evaluator.
    The Evaluator class defines how to evaluate a set of hyperparameters by running the kernel program (model training script) with command-line arguments. This includes distributing individual evaluations via a workload manager (specified as wlm), the Urika-XCS launcher (specified as urika), or local mode (specified as none). The Evaluator can handle distributed training processes via the nodes_per_eval parameter, and calculates how many evaluations can run in parallel within the given allocation.
    Evaluator Definition Format
    evaluator = hpo.Evaluator(command,        # Command to run to evaluate the hyperparameters
                              run_path,       # Opt: Workspace directory for log files.
                              fom,            # Opt: Unique string identifying where the figure
                                              #      of merit value will be in evaluation output.
                              checkpoint,     # Opt: Path to checkpoint directory per workspace.
                              alloc_job_ID,   # Opt: Allocation id for existing allocation
                                              #      (wlm launcher only)
                              nodes,          # Opt: Total node count in the allocation
                              nodes_per_eval, # Opt: Nodes needed for each evaluation
                              launcher,       # Opt: How to distribute the evaluation. Choose
                                              #      from "urika", "wlm", or "none"
                              urika_args,     # Opt: Arguments to pass to run_training for
                                              #      the urika launcher
                              verbose)        # Opt: Enable verbose output
    Evaluator Example
    cmd = "python source/train.py --epochs 5"
    evaluator = hpo.Evaluator(cmd,
                              nodes=8,
                              nodes_per_eval=2,
                              launcher='urika',
                              urika_args="--no-node-list",
                              verbose=False)
    In the preceding example:
    • The training process defined in source/train.py will run with 5 full epochs every time it is executed.
    • The Evaluator will have access to 8 nodes in an allocation.
    • Each evaluation will run on 2 nodes, allowing 4 evaluations to occur in parallel.
    • The urika launcher will be used to run the command with run_training from the Urika-XCS package.
    • --no-node-list will be passed as an additional argument to run_training for each evaluation.
    • Verbose logging information will not be printed.
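    The two mechanics described above can be sketched in plain Python. The extract_fom helper is hypothetical (crayai's actual output parsing may differ); it only illustrates how a unique marker string such as "FoM:" locates the figure of merit in evaluation output, and how the parallel evaluation count follows from the node settings:

    ```python
    # Hypothetical sketch: locate the figure of merit in evaluation output
    # by searching for a unique marker string (not crayai's actual code).
    def extract_fom(output, marker="FoM:"):
        """Return the float that follows the marker string in the output."""
        for line in output.splitlines():
            if marker in line:
                return float(line.split(marker, 1)[1].strip())
        raise ValueError("figure of merit marker not found in output")

    # With 8 nodes in the allocation and 2 nodes per evaluation, the
    # Evaluator can run nodes // nodes_per_eval = 4 evaluations in parallel.
    nodes, nodes_per_eval = 8, 2
    parallel_evals = nodes // nodes_per_eval
    ```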
  7. Create an Optimizer.
    The Optimizer contains the core algorithms behind HPO, specifically genetic, random and grid searches. The Optimizer works in tandem with the Evaluator by ingesting the results from the Evaluator and returning a new set of hyperparameters to be evaluated.
    Optimizer Definition Format
    optimizer = hpo.GeneticOptimizer(evaluator, # Evaluator instance
                                  generations,        # Opt: Number of generations.
                                  num_demes,          # Opt: Number of distinct demes (populations)
                                  pop_size,           # Opt: Number of individuals per deme
                                  mutation_rate,      # Opt: Probability of mutation per
                                                      #      hyperparameter during creation of next
                                                      #      generation
                                  crossover_rate,     # Opt: Probability of crossover per
                                                      #      hyperparameter during creation of next
                                                      #      generation
                                  migration_interval, # Opt: Interval of migration between demes
                                  log_fn,             # Opt: Filename to record results of optimization
                                  verbose)            # Opt: Enable verbose output
    optimizer = hpo.RandomOptimizer(evaluator, # Evaluator instance
                                  numIters,  # Opt: Number of iterations to run
                                  seed,      # Opt: Seed for random number generator. Defaults to 0,
                                             #      i.e. random seed used.
                                  verbose)   # Opt: Enable verbose output
    optimizer = hpo.GridOptimizer(evaluator,  # Evaluator instance
                                  grid_size,  # Opt: Number of grid points to discretize for each
                                              #      hyperparameter
                                  chunk_size, # Opt: Number of grid points to evaluate per batch (chunk)
                                  verbose)    # Opt: Enable verbose output
    Optimizer Example
    optimizer = hpo.GeneticOptimizer(evaluator,
                                     pop_size=4,
                                     num_demes=2,
                                     generations=5,
                                     mutation_rate=0.10,
                                     crossover_rate=0.4,
                                     verbose=True)
  8. Optimize over the parameters.
    Pass the defined hyperparameters to the Optimizer to begin the search.
    optimizer.optimize(params)
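    Conceptually, the Optimizer/Evaluator loop proposes hyperparameter sets, evaluates each one, and keeps the best result. A minimal, self-contained random-search sketch of that loop (illustrative only; it does not use the crayai API, and random_search is a hypothetical stand-in for hpo.RandomOptimizer):

    ```python
    import random

    def random_search(evaluate, ranges, num_iters=20, seed=0):
        """Toy stand-in for a random-search optimizer: sample each
        hyperparameter uniformly from its (min, max) range and keep
        the candidate with the best (lowest) figure of merit."""
        rng = random.Random(seed)
        best_params, best_fom = None, float("inf")
        for _ in range(num_iters):
            candidate = {name: rng.uniform(lo, hi)
                         for name, (lo, hi) in ranges.items()}
            fom = evaluate(candidate)   # analogous to one Evaluator run
            if fom < best_fom:          # lower figure of merit is better here
                best_params, best_fom = candidate, fom
        return best_params, best_fom

    # Example: minimize a quadratic stand-in "loss" over the learning rate.
    ranges = {"--learningRate": (1e-6, 1.0)}
    best, fom = random_search(lambda p: (p["--learningRate"] - 0.1) ** 2, ranges)
    ```

    The genetic optimizer follows the same propose-evaluate-select loop, but generates new candidates by mutating and crossing over the best performers from the previous generation rather than sampling independently.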