Using Hyperparameter Optimization (HPO)
Steps for importing the crayai module, defining hyperparameters to optimize, creating an Evaluator, creating an Optimizer, and then optimizing over the hyperparameters.
This procedure requires the following software to be installed on the system:
- Python 3.6
- numpy
To use the HPO framework, a user must perform the following steps:
- Import the required module.
- Define parameters to be optimized.
- Create an Evaluator.
- Create an Optimizer.
- Optimize over the parameters.
- Log on to a login node.
- Load the analytics module.
- Load the desired Python 3 environment. The site default `python3` or a local `python3` environment can be used as long as `numpy` is installed.

  ```
  login$ module load cray-python
  ```
- Import the `hpo` submodule of the `crayai` module in a Python script.

  ```python
  from crayai import hpo
  ```
- Define the hyperparameters to be optimized. These are exposed to the training program through command-line flags. The crayai hpo tool searches within the specified range for each hyperparameter, starting at the specified default value.

  Hyperparameter Definition Format

  ```python
  params = hpo.Params([[command_line_flag_1, default_val_1, (min_val, max_val)],
                       [command_line_flag_2, default_val_2, (min_val, max_val)],
                       ...])
  ```

  Hyperparameter Definition Example

  ```python
  params = hpo.Params([["--learningRate",    0.01, (1e-6, 1.0)],
                       ["--neuronsPerLayer", 100,  (50, 500)],
                       ["--dropoutRate",     0.5,  (0.0, 0.7)]])
  ```

- Create an Evaluator. The Evaluator class defines how to evaluate a set of hyperparameters by running the kernel program (model training script) with command-line arguments. This includes distribution of individual evaluations via a workload manager (specified as `wlm`), the Urika-XCS launcher (specified as `urika`), or local mode (specified as `none`). The Evaluator can handle distributed training processes via the `nodes_per_eval` parameter, and can calculate the number of parallel evaluations that can run simultaneously within the given allocation.

  Evaluator Definition Format

  ```python
  evaluator = hpo.Evaluator(command,        # Command to run to evaluate the hyperparameters
                            run_path,       # Opt: Workspace directory for log files
                            fom,            # Opt: Unique string identifying where the figure-of-merit
                                            #      value appears in the evaluation output
                            checkpoint,     # Opt: Path to checkpoint directory per workspace
                            alloc_job_ID,   # Opt: Allocation ID for an existing allocation
                                            #      (wlm launcher only)
                            nodes,          # Opt: Total node count in the allocation
                            nodes_per_eval, # Opt: Nodes needed for each evaluation
                            launcher,       # Opt: How to distribute the evaluation. Choose
                                            #      from "urika", "wlm", or "none"
                            urika_args,     # Opt: Arguments to pass on to run_training for
                                            #      the urika launcher
                            verbose)        # Opt: Verbose print messages
  ```

  Evaluator Example

  ```python
  cmd = "python source/train.py --epochs 5"
  evaluator = hpo.Evaluator(cmd,
                            nodes=8,
                            nodes_per_eval=2,
                            launcher='urika',
                            urika_args="--no-node-list",
                            verbose=False)
  ```

  In the preceding example:

- The training process defined in source/train.py will run for 5 full epochs every time it is executed.
- The Evaluator will have access to 8 nodes in an allocation.
- Each evaluation will run on 2 nodes, allowing 4 evaluations to occur in parallel.
- The urika launcher will be used to run the command with run_training from the Urika-XCS package.
- --no-node-list will be passed as an additional argument to run_training for each evaluation.
- Verbose logging information will not be printed.
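For concreteness, the following is a minimal sketch of what a kernel script such as source/train.py might look like. It is an assumption for illustration, not part of the crayai package: it exposes the hyperparameters from the example above as command-line flags via `argparse`, and the `FoM:` output label is an invented convention standing in for whatever unique string you pass as the Evaluator's `fom` argument.

```python
import argparse

# Hypothetical source/train.py: expose each hyperparameter as a flag with the
# same default used in the hpo.Params example.
parser = argparse.ArgumentParser()
parser.add_argument("--epochs", type=int, default=5)
parser.add_argument("--learningRate", type=float, default=0.01)
parser.add_argument("--neuronsPerLayer", type=int, default=100)
parser.add_argument("--dropoutRate", type=float, default=0.5)

# The HPO framework appends flag/value pairs to the training command; parsing
# an explicit list here stands in for sys.argv in a real script.
args = parser.parse_args(["--learningRate", "0.02", "--neuronsPerLayer", "200"])

# Stand-in for real training: a dummy loss keeps the sketch runnable end to end.
loss = (args.learningRate - 0.001) ** 2 + args.dropoutRate / args.neuronsPerLayer

# Print the figure of merit with a unique, searchable label.
print("FoM: %f" % loss)
```

The only contract such a script must honor is that its output contains the figure of merit in a place the Evaluator's `fom` string can identify.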
- Create an Optimizer. The Optimizer contains the core algorithms behind HPO, specifically genetic, random, and grid searches. The Optimizer works in tandem with the Evaluator, ingesting the results from the Evaluator and returning a new set of hyperparameters to be evaluated.

  Optimizer Definition Format

  ```python
  optimizer = hpo.GeneticOptimizer(evaluator,          # Evaluator instance
                                   generations,        # Opt: Number of generations
                                   num_demes,          # Opt: Number of distinct demes (populations)
                                   pop_size,           # Opt: Number of individuals per deme
                                   mutation_rate,      # Opt: Probability of mutation per hyperparameter
                                                       #      during creation of the next generation
                                   crossover_rate,     # Opt: Probability of crossover per hyperparameter
                                                       #      during creation of the next generation
                                   migration_interval, # Opt: Interval of migration between demes
                                   log_fn,             # Opt: Filename to record results of optimization
                                   verbose)            # Opt: Enable verbose output

  optimizer = hpo.RandomOptimizer(evaluator, # Evaluator instance
                                  numIters,  # Opt: Number of iterations to run
                                  seed,      # Opt: Seed for random number generator. Defaults to 0,
                                             #      i.e. a random seed is used
                                  verbose)   # Opt: Enable verbose output

  optimizer = hpo.GridOptimizer(evaluator,  # Evaluator instance
                                grid_size,  # Opt: Number of grid points to discretize for each
                                            #      hyperparameter
                                chunk_size, # Opt: Number of grid points to evaluate per batch (chunk)
                                verbose)    # Opt: Enable verbose output
  ```

  Optimizer Example

  ```python
  optimizer = hpo.GeneticOptimizer(evaluator,
                                   pop_size=4,
                                   num_demes=2,
                                   generations=5,
                                   mutation_rate=0.10,
                                   crossover_rate=0.4,
                                   verbose=True)
  optimizer.optimize(params)
  ```
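The Optimizer/Evaluator interplay described above can be illustrated with a small plain-Python random search. This sketch does not use crayai: the `evaluate` function and its dummy scoring are invented stand-ins for launching the training command and parsing its figure of merit.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Search space mirroring the hpo.Params example: (flag, default, (min, max)).
params = [("--learningRate",    0.01, (1e-6, 1.0)),
          ("--neuronsPerLayer", 100,  (50, 500)),
          ("--dropoutRate",     0.5,  (0.0, 0.7))]

def evaluate(values):
    """Stand-in for the Evaluator: score one hyperparameter assignment.

    A real Evaluator would run the training command with these values passed
    as command-line flags and extract the figure of merit from its output;
    a dummy quadratic score keeps the sketch self-contained.
    """
    lr = values["--learningRate"]
    npl = values["--neuronsPerLayer"]
    dr = values["--dropoutRate"]
    return (lr - 0.001) ** 2 + (npl - 250) ** 2 / 1e6 + dr ** 2

# The loop below plays the Optimizer's role: propose hyperparameters, ask the
# "Evaluator" for their score, and keep the best assignment seen so far.
best_values, best_fom = None, float("inf")
for _ in range(20):
    candidate = {flag: random.uniform(lo, hi) for flag, _, (lo, hi) in params}
    fom = evaluate(candidate)
    if fom < best_fom:  # lower figure of merit is better in this sketch
        best_values, best_fom = candidate, fom

print("best FoM:", best_fom)
```

A genetic optimizer differs from this random search only in how new candidates are proposed (mutation and crossover of the best performers rather than uniform sampling); the evaluate-and-compare loop is the same.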