Main

The Main class wraps a RideModule to supply a fully functional command-line interface, which includes:

Training ("--train")
Evaluation on validation set ("--validate")
Evaluation on test set ("--test")
Logger integration ("--logging_backend")
Hyperparameter search ("--hparamsearch")
Hyperparameter file loading ("--from_hparams_file")
Profiling of model timing, flops, and params ("--profile_model")
Checkpointing
Checkpoint loading ("--resume_from_checkpoint")
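
To make the flag-driven flow concrete, here is a minimal sketch that uses only the standard-library argparse. It is not ride's implementation: `FLOW_FLAGS`, `build_parser`, and `planned_steps` are hypothetical names, and the execution order (the order of the list) is an assumption.

```python
import argparse

# Hypothetical stand-in for a wrapper such as Main; not ride's actual code.
# The list order doubles as the assumed execution order of the steps.
FLOW_FLAGS = ["hparamsearch", "train", "validate", "test", "profile_model"]


def build_parser():
    parser = argparse.ArgumentParser(description="Sketch of a flow-flag CLI")
    flow = parser.add_argument_group("Flow")
    for name in FLOW_FLAGS:
        # Each flow flag is a simple on/off switch.
        flow.add_argument(f"--{name}", action="store_true")
    general = parser.add_argument_group("General")
    general.add_argument("--id", type=str, default=None)
    general.add_argument("--max_epochs", type=int, default=None)
    return parser


def planned_steps(argv):
    """Return the requested flow steps in the assumed execution order."""
    args = build_parser().parse_args(argv)
    return [name for name in FLOW_FLAGS if getattr(args, name)]


steps = planned_steps(["--train", "--test", "--max_epochs", "1", "--id", "my_first_run"])
print(steps)  # ['train', 'test']
```

Because the steps are read off a fixed list, the order in which flags are typed on the command line does not affect the order in which the sketch would run them.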

Example

All it takes to get a working CLI is to add the final statement below to the bottom of the file that defines your module:

# my_ride_module.py

import numpy as np
import ride  # needed for `ride.Main` below
import torch  # needed for `torch.nn.Linear` and `torch.relu`
from ride import RideModule, TopKAccuracyMetric
from .examples.mnist_dataset import MnistDataset

class MyRideModule(RideModule, TopKAccuracyMetric(1,3), MnistDataset):
    def __init__(self, hparams):
        # `self.input_shape` and `self.output_shape` were injected via `MnistDataset`
        self.lin = torch.nn.Linear(np.prod(self.input_shape), self.output_shape)

    def forward(self, x):
        # Flatten each sample to a vector before the linear layer
        x = x.view(x.size(0), -1)
        x = torch.relu(self.lin(x))
        return x

if __name__ == "__main__":
    ride.Main(MyRideModule).argparse()  # <-- Add this

and executing from the command line:

>> python my_ride_module.py --train --test --max_epochs 1 --id my_first_run

lightning: Global seed set to 123
ride: Running on host d40049
ride: ⭐️ View project repository at https://github.com/username/ride/tree/hash
ride: Run data is saved locally at /Users/username/project_folder/logs/run_logs/my_first_run/version_0
ride: Logging using Tensorboard
ride: 🚀 Running training
ride: Checkpointing on val/loss with optimisation direction min
lightning: GPU available: False, used: False
lightning: TPU available: None, using: 0 TPU cores
lightning:
| Name | Type   | Params
--------------------------------
0 | l1   | Linear | 100 K
1 | l2   | Linear | 1.3 K
--------------------------------
101 K     Trainable params
0         Non-trainable params
101 K     Total params
0.407     Total estimated model params size (MB)
Epoch 0: 100%|███████████████| 3751/3751 [00:16<00:00, 225.44it/s, loss=0.762, v_num=0, step_train/loss=0.899]
lightning: Epoch 0, global step 3437: val/loss reached 0.90666 (best 0.90666), saving model to "/Users/username/project_folder/logs/run_logs/my_first_run/version_0/checkpoints/epoch=0-step=3437.ckpt" as top 1
Epoch 1: 100%|███████████████| 3751/3751 [00:17<00:00, 210.52it/s, loss=0.581, v_num=1, step_train/loss=0.0221]
lightning: Epoch 1, global step 3437: val/loss reached 0.61922 (best 0.61922), saving model to "/Users/username/project_folder/logs/run_logs/my_first_run/version_0/checkpoints/epoch=1-step=6875.ckpt" as top 1
lightning: Saving latest checkpoint...
ride: 🚀 Running evaluation on test set
Testing: 100%|███████████████| 625/625 [00:01<00:00, 432.69it/s]
--------------------------------------------------------------------------------
ride: Results:
test/epoch: 0.000000000
test/loss: 0.889312625
test/top1acc: 0.739199996
test/top3acc: 0.883000016

ride: Saving /Users/username/project_folder/ride/logs/my_first_run/version_0/evaluation/test_results.yaml
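
The printed results can be picked up by other scripts. The sketch below parses flat `key: value` lines such as those shown above into a dictionary. `parse_results` is a hypothetical helper, not part of ride, and it assumes the values are plain numbers; for the saved YAML file itself, a proper YAML parser would be the more robust choice.

```python
def parse_results(text):
    """Parse flat `key: value` lines into a dict of floats."""
    results = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no colon on this line
        try:
            results[key.strip()] = float(value)
        except ValueError:
            continue  # skip lines whose value is not numeric
    return results


# Values copied from the example run above.
sample = """\
test/epoch: 0.000000000
test/loss: 0.889312625
test/top1acc: 0.739199996
test/top3acc: 0.883000016
"""

results = parse_results(sample)
print(f"top-1 accuracy: {results['test/top1acc']:.1%}")  # top-1 accuracy: 73.9%
```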

Help

The best way to explore all the available options is to run the script with the "--help" flag:

>> python my_ride_module.py --help

...

Flow:
Commands that control the top-level flow of the programme.

--hparamsearch        Run hyperparameter search. The best hyperparameters
                        will be used for subsequent lifecycle methods
--train               Run model training
--validate            Run model evaluation on validation set
--test                Run model evaluation on test set
--profile_model       Profile the model

General:
Settings that apply to the programme in general.

--id ID               Identifier for the run. If not specified, the current
                        timestamp will be used (Default: 202101011337)
--seed SEED           Global random seed (Default: 123)
--logging_backend {tensorboard,wandb}
                        Type of experiment logger (Default: tensorboard)
...

Pytorch Lightning:
Settings inherited from the pytorch_lightning.Trainer
...
--gpus GPUS           number of gpus to train on (int) or which GPUs to
                        train on (list or str) applied per node
...

Hparamsearch:
Settings associated with hyperparameter optimisation
...

Module:
Settings associated with the Module
--loss {mse_loss,l1_loss,nll_loss,cross_entropy,binary_cross_entropy,...}
                        Loss function used during optimisation.
                        (Default: cross_entropy)
--batch_size BATCH_SIZE
                        Dataloader batch size. (Default: 64)
--num_workers NUM_WORKERS
                        Number of CPU workers to use for dataloading.
                        (Default: 10)
--learning_rate LEARNING_RATE
                        Learning rate. (Default: 0.1)
--weight_decay WEIGHT_DECAY
                        Weight decay. (Default: 1e-05)
--momentum MOMENTUM   Momentum. (Default: 0.9)
...
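
A grouped help listing with defaults, like the "Module" section above, can be produced from a table of hyperparameter specifications. The following is a minimal sketch under that assumption, using only argparse. `MODULE_HPARAMS` and `build_module_parser` are hypothetical names; the option names and defaults are copied from the help text above, and this is not how ride itself declares its hyperparameters.

```python
import argparse

# Hypothetical spec table: name -> (type, default, description).
# Names and defaults are copied from the help output above.
MODULE_HPARAMS = {
    "batch_size": (int, 64, "Dataloader batch size."),
    "learning_rate": (float, 0.1, "Learning rate."),
    "weight_decay": (float, 1e-05, "Weight decay."),
    "momentum": (float, 0.9, "Momentum."),
}


def build_module_parser():
    parser = argparse.ArgumentParser(prog="my_ride_module.py")
    group = parser.add_argument_group("Module", "Settings associated with the Module")
    for name, (kind, default, text) in MODULE_HPARAMS.items():
        group.add_argument(
            f"--{name}",
            type=kind,
            default=default,
            help=f"{text} (Default: {default})",
        )
    return parser


parser = build_module_parser()
args = parser.parse_args(["--learning_rate", "0.01"])
print(args.learning_rate, args.batch_size)  # 0.01 64
```

Options that are not passed on the command line keep the defaults from the table, which is why `batch_size` stays at 64 in the usage above.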