train
Given a path to an INPUT sequential dataset (.csv, .json, .txt), trains a model on that dataset and saves it as a .pt OUTPUT file.
In addition to the other training parameters, an optional set of data augmentation operations can be provided; these are applied in series to the input data during training.
Usage
hxmx train <INPUT> [OUTPUT] [-m <INTEGER>] [-l <INTEGER>] [-hs <INTEGER>] [-c <INTEGER>] [-e <INTEGER>] [-bs <INTEGER>] [--lr <FLOAT>] [-p <INTEGER>] [--dropout <FLOAT>] [--betas <FLOAT RANGE>] [--slope <FLOAT RANGE>] [-n <FLOAT RANGE>] [-s <INTEGER>] [-op <TEXT>] [-d <CHOICE>] [--debug] [--help]
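
As an illustration, the sketch below assembles a hxmx train invocation from Python using only flags that appear in the usage line above. The file names data.csv and my_model.pt and the option values are placeholders, and build_train_command is a hypothetical helper, not part of hxmx. The command is only printed, not executed.

```python
import shlex


def build_train_command(input_path, output_path="model.pt", options=None):
    """Build an argument list for `hxmx train` (hypothetical helper).

    `options` maps flags, written exactly as in the usage line,
    to their values, e.g. {"-m": 10, "--lr": 0.0025}.
    """
    command = ["hxmx", "train", input_path, output_path]
    for flag, value in (options or {}).items():
        command += [flag, str(value)]
    return command


# Placeholder file names and values, chosen only for illustration.
command = build_train_command(
    "data.csv",
    "my_model.pt",
    {"-m": 10, "-l": 2, "-e": 500, "--lr": 0.001, "-s": 42, "-d": "cpu"},
)

# Print a shell-ready version of the command instead of running it.
print(shlex.join(command))
```

To actually start training, the same list could be passed to subprocess.run(command, check=True), assuming hxmx is installed and available on the PATH.
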
Arguments
| Name | Type | Required | Default |
|---|---|---|---|
| INPUT | path (file, must exist) | ✓ | - |
| OUTPUT | path (file) | | model.pt |
Options
| Name | Type | Default | Description |
|---|---|---|---|
| --mixtures, -m | integer | 10 | Number of Gaussian mixture components. |
| --layers, -l | integer | 1 | Number of recurrent layers. |
| --hidden-size, -hs | integer | 120 | Number of dimensions to use for the hidden representation. |
| --context, -c | integer | 200 | Length of sequence segments to use during training. |
| --epochs, -e | integer | 1000 | Maximum number of epochs. |
| --batch-size, -bs | integer | 32 | Batch size. |
| --lr | float | 0.0025 | Learning rate. |
| --patience, -p | integer | 15 | Number of iterations without improvement allowed before training stops. |
| --dropout | float | 0.25 | Dropout rate. During training, randomly zero some of the elements of the input data. Useful to prevent overfitting. |
| --betas | float [0.1, 0.995] | [0.9, 0.99] | Coefficients used for computing running averages of the gradient and its square by the Adaptive Moment Estimation (Adam) optimizer. |
| --slope | float [0, +∞] | 1e-05 | Negative slope for Leaky ReLU activations. |
| --noise, -n | float [0, 1] | [0, 0] | Adaptive weight noise parameters, given as a pair: standard deviation, then decay factor. Adds Gaussian noise to the model weights during training to prevent overfitting. |
| --seed, -s | integer | 1 | Random seed. Use 0 for non-deterministic results. |
| --operations, -op | string (multiple) | [] | Data augmentation operation(s) to stochastically apply during training. See operations. |
| --device, -d | choice (auto\|cpu\| ...) | auto | Computing device. To list all available devices, run hxmx devices. |
| --debug | boolean | False | Debug mode. Includes traceback when an error is raised. |
| -h, --help | boolean | False | Open documentation in browser. |
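
Two of the options above, --slope and --patience, name general techniques that a short sketch can make concrete. The code below is not taken from hxmx: the function names are hypothetical, and it assumes that patience counts consecutive iterations in which the monitored loss fails to beat its best value so far. How hxmx measures improvement internally is not stated here.

```python
def leaky_relu(x, negative_slope=1e-05):
    """Leaky ReLU: keep non-negative inputs, scale negative ones by the slope."""
    return x if x >= 0 else negative_slope * x


def iterations_until_stop(losses, patience=15):
    """Return the number of iterations run before early stopping triggers.

    Assumption: training stops once the loss has failed to improve on its
    best value for `patience` iterations in a row. If that never happens,
    every iteration in `losses` is run.
    """
    best = float("inf")
    stale = 0  # iterations since the last improvement
    for step, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return step
    return len(losses)


# Illustrative loss values only, not real training output.
example_losses = [1.0, 0.9, 0.8, 0.85, 0.81, 0.82, 0.7]
stopped_at = iterations_until_stop(example_losses, patience=3)
```

With a patience of 3, the run above stops before reaching the final, better loss of 0.7. A larger patience tolerates longer plateaus at the cost of more training time.
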