A2ML CLI
The command line is a convenient way to start an A2ML project even if you plan to use the API.
Creating a New A2ML Project
Create a new A2ML project with the new command by supplying a project name. A2ML creates a directory containing a default set of configuration files, which you can then customize.
$ a2ml new test_app
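After running the command, it can be useful to confirm that the configuration files exist before editing them. The following is a minimal sketch, not part of A2ML: the helper name `missing_config_files` is hypothetical, the file names are the ones listed in the configuration section of this page, and it assumes the files sit at the top level of a directory named after the project.

```python
from pathlib import Path

# File names taken from the configuration section of this page; that
# `a2ml new` places all of them at the top level is an assumption.
EXPECTED_FILES = ("config.yaml", "auger.yaml", "azure.yaml", "google.yaml")


def missing_config_files(project_dir):
    """Return the expected config file names not present in project_dir."""
    root = Path(project_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # "test_app" matches the project name used in the command above.
    print(missing_config_files("test_app"))
```

An empty list means every expected file was found.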
Configuring Your A2ML Project
To use the Python API or the command line interface for a specific PREDIT pipeline, configure the project first.
This includes both general options that apply to all vendors and vendor-specific options, which are kept in separate YAML files.
All Vendors
config.yaml
name: the name of the project
provider: the AutoML provider: GC (for Google Cloud), AZ (for Microsoft Azure), or Auger
source: the CSV or Parquet file to train with. Can be a local file path (for Auger or Azure), a hosted file URL, or a Google Cloud Storage URL ("gs://...") for Google Cloud AutoML
source_format: csv (default) or parquet. Set it if the source is a URL or the file has no extension
exclude: features from the dataset to exclude from the model
target: the feature which is the target
model_type: regression or classification
budget: the time budget in milliseconds to train
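The options above lend themselves to a quick sanity check before a pipeline is started. The sketch below is illustrative only and is not A2ML's own validation: `check_config` is a hypothetical helper, the accepted values are copied from the list above (the real tool may accept other spellings), and the check that the target is not also excluded is an added assumption about what a sensible configuration looks like.

```python
# Accepted values copied from the option list above (assumed spellings).
ALLOWED = {
    "provider": {"GC", "AZ", "Auger"},
    "source_format": {"csv", "parquet"},
    "model_type": {"regression", "classification"},
}


def check_config(cfg):
    """Return a list of human-readable problems found in a config mapping."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key in cfg and cfg[key] not in allowed:
            problems.append(f"{key}: {cfg[key]!r} is not one of {sorted(allowed)}")

    # budget is documented as a time in milliseconds.
    budget = cfg.get("budget")
    if budget is not None:
        is_int = isinstance(budget, int) and not isinstance(budget, bool)
        if not is_int or budget <= 0:
            problems.append("budget: expected a positive integer (milliseconds)")

    # exclude is a comma-separated string in the example on this page.
    excluded = [c.strip() for c in str(cfg.get("exclude", "")).split(",")]
    target = cfg.get("target")
    if target and target in excluded:
        problems.append("target: also listed in exclude")
    return problems
```

The function takes a plain mapping, so it can be fed whatever a YAML parser returns for config.yaml.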
Vendor Specific
auger.yaml
azure.yaml
google.yaml
region: the region for the AutoML provider's compute clusters; each vendor has different names for its regions
metric: how to measure the accuracy of the model when searching over algorithms; each vendor has different names for its metrics
Examples
Here is an example with options that apply to all AutoML providers:
config.yaml
source: ./moneyball/train.csv
exclude: Team,League,Year
target: RS
model_type: regression
experiment:
  cross_validation_folds: 5
  max_total_time: 60
  max_eval_time: 5
  max_n_trials: 10
  use_ensemble: true
azure.yaml
experiment:
  metric: r2_score
cluster:
  region: eastus2
  name: cpucluster
  min_nodes: 0
  max_nodes: 4
  type: STANDARD_D2_V2
file_share:
  account_name:
  account_key:
dataset: train.csv
google.yaml
region: us-central1
metric: MINIMIZE_MAE
project: automl-test-237311
dataset_id: TBL1889796605356277760
operation_id: TBL2145477039279308800
operation_name: projects/291533092938/locations/us-central1/operations/TBL4473943599746121728
model_name: projects/291533092938/locations/us-central1/models/TBL1517370026795991040
auger.yaml
project: moneyball
dataset: train.csv
experiment:
  cross_validation_folds: 5
  max_total_time: 60
  max_eval_time: 1
  max_n_trials: 10
  use_ensemble: true
  metric: f1_macro
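The source_format rule (csv by default, set explicitly when the source is a URL or has no extension) can be illustrated with the source value from the config.yaml example. This is a sketch of one way to apply that rule, not A2ML's actual logic; `resolve_source_format` is a hypothetical name, and falling back to the file extension when no format is given is an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def resolve_source_format(source, source_format=None):
    """Pick the format for `source`, preferring an explicit source_format."""
    if source_format:
        return source_format
    # For URLs (including gs://...) only the path part can carry an extension.
    path = urlparse(source).path if "://" in source else source
    suffix = PurePosixPath(path).suffix.lower()
    if suffix == ".parquet":
        return "parquet"
    # csv is the documented default, so it also covers extensionless sources.
    return "csv"


if __name__ == "__main__":
    print(resolve_source_format("./moneyball/train.csv"))
```

An extensionless URL that actually serves Parquet would be misread as csv here, which is the case where setting source_format explicitly matters.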
A2ML CLI Commands
Below is the full set of commands provided by A2ML. Command line options are provided for each stage in the PREDIT Pipeline.
$ a2ml [OPTIONS] COMMAND [ARGS]...
Commands
new         Create new A2ML application.
import      Import data for training.
train       Train the model.
evaluate    Evaluate models after training.
deploy      Deploy trained model.
predict     Predict with deployed model.
review      Review specified model info.
project     Project(s) management.
dataset     Dataset(s) management.
experiment  Experiment(s) management.
model       Model(s) management.
To get detailed information on available options for each command, please run:
$ a2ml COMMAND --help
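When the CLI is driven from a script, building the argument list in one place helps catch misspelled command names early. The sketch below is a hypothetical wrapper, not part of A2ML: `a2ml_argv` is an invented helper, the command names are copied from the list above, and only arguments that appear on this page (`test_app`, `--help`) are used. It builds the command without running it.

```python
import shlex

# Command names copied from the list above.
COMMANDS = ("new", "import", "train", "evaluate", "deploy", "predict",
            "review", "project", "dataset", "experiment", "model")


def a2ml_argv(command, *args):
    """Build the argument list for `a2ml COMMAND [ARGS]...`."""
    if command not in COMMANDS:
        raise ValueError(f"unknown a2ml command: {command!r}")
    return ["a2ml", command, *args]


if __name__ == "__main__":
    for argv in (a2ml_argv("new", "test_app"), a2ml_argv("train", "--help")):
        # Print the shell form; pass argv to subprocess.run to execute it.
        print(shlex.join(argv))
```

Keeping the result as a list rather than a string avoids shell quoting issues if it is later passed to `subprocess.run`.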