A2ML API

a2ml.api package

a2ml module - A2ML PREDIT API

class a2ml.api.a2ml.A2ML(ctx, provider=None)

Facade to A2ML providers.

__init__(ctx, provider=None)

Initializes a new A2ML PREDIT instance.

Parameters

ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.

Returns

A2ML object

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')

import_data(source=None, name=None, description=None)

Imports data defined in context. Uploading the same file name will result in versions being appended to the file name.

Note

Your context points to a config file where source is defined.

# Local file name, remote url to the data source file or postgres url
source: './dataset.csv'

# Postgres url parameters: dbname, tablename, offset(OPTIONAL), limit(OPTIONAL)
source: jdbc:postgresql://user:pwd@ec2-54-204-21-226.compute-1.amazonaws.com:5432/dbname?tablename=table1&offset=0&limit=100
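
The postgres form of source can also be assembled in code. The following is a minimal sketch using only the standard library; the helper name postgres_source and the host db.example.com are hypothetical and not part of a2ml, and the function only reproduces the URL layout shown above.

```python
from urllib.parse import urlencode


def postgres_source(user, password, host, port, dbname, tablename,
                    offset=None, limit=None):
    """Build a source string in the jdbc:postgresql layout shown above.

    Hypothetical helper, not part of a2ml. offset and limit are optional.
    The user and password are inserted as given, without URL escaping.
    """
    params = {'tablename': tablename}
    if offset is not None:
        params['offset'] = offset
    if limit is not None:
        params['limit'] = limit
    return 'jdbc:postgresql://%s:%s@%s:%s/%s?%s' % (
        user, password, host, port, dbname, urlencode(params))


source = postgres_source('user', 'pwd', 'db.example.com', 5432, 'dbname',
                         'table1', offset=0, limit=100)
print(source)
```

The resulting string could then be placed in the config file as source, or passed as the source argument.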

Parameters

source (str, optional) -- Local file name, remote url to the data source file, Pandas DataFrame or postgres url
name (str, optional) -- Name of dataset, if none then file name used. If source is DataFrame then name should be specified.
description (str, optional) -- Description of dataset

Returns

Results for each provider.

{
    'auger': {'result': True, 'data': {'created': 'dataset.csv'}},

    'azure': {'result': True, 'data': {'created': 'dataset.csv'}}
}

Errors.

{
    'auger': {'result': False, 'data': 'Please specify data source file...'},

    'azure': {'result': False, 'data': 'Please specify data source file...'}
}
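
Because each provider reports its own result, callers usually need to check every entry. The sketch below separates successes from failures; split_results is a hypothetical helper name, and the sample value simply mirrors the documented return shape rather than a real response.

```python
def split_results(results):
    """Separate per-provider results into successes and failures."""
    succeeded = {}
    failed = {}
    for provider, outcome in results.items():
        if outcome.get('result') is True:
            succeeded[provider] = outcome.get('data')
        else:
            failed[provider] = outcome.get('data')
    return succeeded, failed


# Sample value shaped like the documented return of import_data().
sample = {
    'auger': {'result': True, 'data': {'created': 'dataset.csv'}},
    'azure': {'result': False, 'data': 'Please specify data source file...'},
}
ok, errors = split_results(sample)
print(ok)
print(errors)
```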

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
a2ml.import_data()

preprocess_data(data, preprocessors, locally=False)

Preprocess data

Parameters

data (str|pandas.DataFrame) -- Input data to preprocess. Can be a path to a file (local or S3) or a pandas DataFrame
preprocessors (array of dicts) --
List of preprocessors with parameters
```
[
    {'text': {'text_cols': []}}
]
```

Preprocessors:

text

text_cols(array): List of text columns to process
text_metrics ['mean_length', 'unique_count', 'separation_score'] : Metrics to calculate for text fields; separation_score is calculated after vectorization
tokenize (dict): Default - {'max_text_len': 30000, 'tokenizers': ['sent'], 'remove_chars': '○•'}
vectorize ('en_use_lg'|'hashing'|'en_use_md'|'en_use_cmlm_md'|'en_use_cmlm_lg'): See https://github.com/MartinoMensio/spacy-universal-sentence-encoder
dim_reduction(dict): Generate features based on vectors. See https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
```
{
    'alg_name': 'PCA'|'t-SNE',
    'args': {'n_components': 2} #Number of components to keep.
}
```
output_prefix (str): Prefix for generated columns. Format name: {prefix}_{colname}_{num}
calc_distance ['none', 'cosine', 'cityblock', 'euclidean', 'haversine', 'l1', 'l2', 'manhattan', 'nan_euclidean'] | 'cosine' : See https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.distance_metrics.html#sklearn.metrics.pairwise.distance_metrics

compare_pairs (array of dicts): When calc_distance is not none.

[
    {'compare_cols': [{'dataset_idx': 0, 'cols': ['col1']}, {'dataset_idx': 1, 'cols': ['col2']}],
        'output_name':'cosine_col1_col2', 'params': {}
    },
    {'compare_cols': [{'dataset_idx': 0, 'cols': ['col3']}, {'dataset_idx': 1, 'cols': ['col4']}],
        'output_name':'cosine_col3_col4', 'params': {}
    },
]

datasets: List of datasets to process. May be empty, in which case all fields are taken from the main dataset

[
    {'path': 'path', 'keys': ['main_key', 'local_key'], 'text_metrics': ['separation_score', 'mean_length', 'unique_count']},
    {'path': 'path1', 'keys': ['main_key1', 'local_key1']}
]
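
As an illustration of how these options might be combined, the sketch below assembles one preprocessors list. It assumes that the option names listed above are sibling keys under 'text', as in the short example at the top of this parameter; the helper name text_preprocessor, the column name review_text and the prefix txt are hypothetical.

```python
def text_preprocessor(text_cols, vectorize='hashing', n_components=2,
                      output_prefix='txt', calc_distance='none'):
    """Assemble one 'text' preprocessor entry from the documented keys.

    Hypothetical helper: the nesting of keys under 'text' is an assumption
    based on the documented example [{'text': {'text_cols': []}}].
    """
    allowed_vectorizers = ('en_use_lg', 'hashing', 'en_use_md',
                           'en_use_cmlm_md', 'en_use_cmlm_lg')
    if vectorize not in allowed_vectorizers:
        raise ValueError('unsupported vectorize value: %s' % vectorize)
    return {
        'text': {
            'text_cols': list(text_cols),
            'text_metrics': ['mean_length', 'unique_count'],
            'vectorize': vectorize,
            'dim_reduction': {
                'alg_name': 'PCA',
                'args': {'n_components': n_components},
            },
            'output_prefix': output_prefix,
            'calc_distance': calc_distance,
        }
    }


preprocessors = [text_preprocessor(['review_text'])]
print(preprocessors[0]['text']['dim_reduction'])
```

The list could then be passed as the preprocessors argument of preprocess_data.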

Returns

{
    'result': True,
    'data': 'data in input format'
}

train()

Starts training session based on context state.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'experiment_name': 'dataset.csv-4-experiment',
            'session_id': '9ccfe04eca67757a'
         }
    },
    'azure': {
        'result': True,
        'data': {
            'experiment_name': 'dataset.csv-4-experiment',
            'session_id': '9ccfe04eca67757a'
         }
    }
}

Errors.

{
    'auger': {'result': False, 'data': 'Please set target to build model.'},

    'azure': {'result': False, 'data': 'Please set target to build model.'}
}

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
a2ml.train()

evaluate(run_id=None)

Evaluate the results of training.

Parameters

run_id (str, optional) -- The run id for a training session. A unique run id is created for every train. Default is last experiment train.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'run_id': '9ccfe04eca67757a',
            'leaderboard': [
                {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
                {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
            ],
            'trials_count': 10,
            'status': 'started',
            'provider_status': 'provider specific'
        }
    },
    'azure': {
        'result': True,
        'data': {
            'run_id': '9ccfe04eca67757a',
            'leaderboard': [
                {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
                {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
            ],
            'trials_count': 10,
            'status': 'started',
            'provider_status': 'provider specific'
        }
    }
}

Status

preprocess - search is preprocessing data for training

started - search is in progress

completed - search is completed

interrupted - search was interrupted

error - search was finished with error
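
The status values above lend themselves to a polling loop that waits between calls and gives up after a bounded number of attempts. The sketch below is one possible shape, not part of a2ml: wait_for_provider is a hypothetical helper, and fake_evaluate stands in for a2ml.evaluate so that the block runs without a provider.

```python
import time

IN_PROGRESS = ('preprocess', 'started')


def wait_for_provider(evaluate, provider, poll_seconds=30, max_polls=120,
                      sleep=time.sleep):
    """Call evaluate() until the provider leaves the in-progress statuses.

    evaluate is any zero-argument callable returning the documented
    per-provider dict, for example the bound method a2ml.evaluate.
    """
    for _ in range(max_polls):
        res = evaluate()
        outcome = res[provider]
        if not outcome['result']:
            raise RuntimeError(outcome['data'])
        status = outcome['data']['status']
        if status not in IN_PROGRESS:
            return status
        sleep(poll_seconds)
    raise TimeoutError('%s still running after %d polls' % (provider, max_polls))


# Stand-in for a2ml.evaluate that reports 'started' twice, then 'completed'.
_replies = iter(['started', 'started', 'completed'])


def fake_evaluate():
    return {'auger': {'result': True, 'data': {'status': next(_replies)}}}


final = wait_for_provider(fake_evaluate, 'auger', sleep=lambda seconds: None)
print(final)  # completed
```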

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
while True:
    res = a2ml.evaluate()
    # The status is nested under the provider's 'data' entry (see Returns).
    if res['auger']['data']['status'] not in ['preprocess', 'started']:
        break

deploy(model_id, locally=False, review=False, provider=None, name=None, algorithm=None, score=None, data_path=None, metadata=None)

Deploy a model locally or to specified provider(s).

Note

See evaluate function to get model_id

This method supports only one provider.

Parameters

model_id (str) -- The model id from any experiment you will deploy. Ignored for 'external' provider
locally (bool) -- Deploys the model locally if True, on the Provider Cloud if False. The default is False.
review (bool) -- Whether the model should support review based on actual data. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
name (str) -- Friendly name for the model. Used as name for Review Endpoint
algorithm (str) -- Monitored model(external provider) algorithm name.
score (float) -- Monitored model(external provider) score.
data_path (str) -- Path to the data used to fit the model on deploy. Returns a new deployed model id.
metadata (dict) -- Additional parameters for the model. Used for the accuracy report (report parameter).

Returns

{
    'result': True,
    'data': {'model_id': 'A017AC8EAD094FD'}
}

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
a2ml.deploy(model_id='A017AC8EAD094FD', name='FirstExperiment')

ctx = Context()
a2ml = A2ML(ctx, 'external')
result = a2ml.deploy(model_id=None, name="My external model.", algorithm='RandomForest', score=0.75)
model_id = result['data']['model_id']

predict(model_id, filename=None, data=None, columns=None, predicted_at=None, threshold=None, score=False, score_true_data=None, output=None, no_features_in_result=None, locally=False, provider=None, predict_labels=None)

Predict results with new data against deployed model. Predictions are stored next to the file with data to be predicted on. The file name will be appended with suffix _predicted.

Note

Use deployed model_id

This method supports only one provider.

Parameters

model_id (str) -- The deployed model id you want to use.
filename (str) -- The file with data to request predictions for.
data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method
columns (list) -- list of column names if data is array of records
predicted_at -- Predict data date. Use for review of historical data.
threshold (float) -- For classification models only. This will return class probabilities with response.
score (bool) -- Calculate scores for predicted results.
score_true_data (str, pandas.DataFrame, dict) -- Data with true values to calculate scores. If omitted, the target from filename is used for true values.
output (str) -- Output csv file path.
no_features_in_result (bool) -- Do not return feature columns in prediction result. False by default
locally (bool, str) -- Predicts using a local model with auger.ai.predict if True, on the Provider Cloud if False. If set to "docker", then docker image used to run the model
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.
predict_labels (dict, bool) -- Run ActiveLearn to select data for labelling

Returns

if filename is not None.

{
    'result': True,
    'data': {'predicted': 'dataset_predicted.csv'}
}

if filename is None and data is not None and columns is None.

{
    'result': True,
    'data': {'predicted': [{col1: value1, col2: value2, target: predicted_value1}, {col1: value3, col2: value4, target: predicted_value2}]}
}

if filename is None and data is not None and columns is not None.

{
    'result': True,
    'data': {'predicted': {'columns': ['col1', 'col2', target], 'data': [['value1', 'value2', 1], ['value3', 'value4', 0]]}}
}
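
Since the shape of 'predicted' depends on which arguments were passed, a caller may want one uniform view. The sketch below converts the three documented shapes into either a file path or a list of row dicts; predicted_rows is a hypothetical helper, and it takes the single result dict shown above, not a per-provider mapping.

```python
def predicted_rows(result):
    """Return predictions as a list of dicts, or the output file path.

    Handles the three documented shapes of result['data']['predicted'].
    """
    if not result['result']:
        raise RuntimeError(result['data'])
    predicted = result['data']['predicted']
    if isinstance(predicted, str):
        # filename was given: predictions were written to this file.
        return predicted
    if isinstance(predicted, dict):
        # columns was given: pair the column names with each record.
        cols = predicted['columns']
        return [dict(zip(cols, row)) for row in predicted['data']]
    # data without columns: already a list of records.
    return list(predicted)


with_columns = {
    'result': True,
    'data': {'predicted': {'columns': ['col1', 'col2', 'target'],
                           'data': [['value1', 'value2', 1],
                                    ['value3', 'value4', 0]]}},
}
rows = predicted_rows(with_columns)
print(rows[0])
```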

Examples

ctx = Context()
rv = A2ML(ctx).predict(model_id, '../irises.csv')
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']

ctx = Context()
data = [{'col1': 'value1', 'col2': 'value2'}, {'col1': 'value3', 'col2': 'value4'}]
rv = A2ML(ctx).predict(model_id, data=data)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']

ctx = Context()
data = [['value1', 'value2'], ['value3', 'value4']]
columns = ['col1', 'col2']
rv = A2ML(ctx).predict(model_id, data=data, columns=columns)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']

# Predict locally without config files. The model will be downloaded automatically if it does not exist locally.
# To use local predict install a2ml[predict]
ctx = Context()
ctx.config.set('name', 'project name')
ctx.credentials = "Json string from a2ml ui settings"

rv = A2ML(ctx).predict(model_id, '../irises.csv',
    no_features_in_result = True, locally=True)
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']

actuals(model_id, filename=None, data=None, columns=None, actuals_at=None, actual_date_column=None, experiment_params=None, locally=False, provider=None)

Submits actual results (ground truths) for predictions of a deployed model. This is used to review and monitor active models.

Note

It is assumed you have predictions against this model first.

actuals.csv
predicted (or target): predicted value. If omitted, predict is called automatically	actual	baseline_target: predicted value for baseline model (OPTIONAL)
Iris-setosa	Iris-setosa	Iris-setosa
Iris-virginica	Iris-virginica	Iris-virginica

It may also contain training features, which are used to predict (if target is omitted), to retrain the model during Review, and for the distribution chart.

This method supports only one provider.

Parameters

model_id (str) -- The deployed model id you want to use.
filename (str) -- The file with data to request predictions for.
data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method
columns (list) -- list of column names if data is array of records
actuals_at -- Actuals date. Use for review of historical data.
actual_date_column (str) -- name of column in data which contains actual date

experiment_params (dict) --

parameters to calculate experiment metrics

start_date(date): experiment actuals start date
end_date(date):  experiment actuals end date
date_col(str): column name with date

locally (bool) -- Process actuals locally.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.

Returns

{
    'result': True,
    'data': True
}

Errors.

{
    'result': False,
    'data': 'Actual Prediction IDs not found in model predictions.'
}
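
When predictions and ground truths are held in separate lists, they have to be paired into the record layout described for data. The sketch below shows one way to do that; build_actuals is a hypothetical helper and the target column name species is an assumption chosen for illustration.

```python
def build_actuals(predicted, actual, target):
    """Pair predicted and actual values into records plus column names."""
    if len(predicted) != len(actual):
        raise ValueError('predicted and actual must have the same length')
    records = [[p, a] for p, a in zip(predicted, actual)]
    columns = [target, 'actual']
    return records, columns


records, columns = build_actuals(
    ['Iris-setosa', 'Iris-virginica'],
    ['Iris-setosa', 'Iris-versicolor'],
    target='species',  # hypothetical name of the model's target column
)
print(columns)
print(records)
```

The pair could then be passed to actuals as data=records, columns=columns.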

Examples

ctx = Context()
A2ML(ctx).actuals('D881079E1ED14FB', filename='<path_to_file>/actuals.csv')

ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
columns = [target, 'actual']  # target is the name of the model's target column

A2ML(ctx).actuals('D881079E1ED14FB', data=actual_records, columns=columns)

ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
columns = [target, 'actual']  # target is the name of the model's target column

A2ML(ctx, "external").actuals('external_model_id', data=actual_records, columns=columns)

delete_actuals(model_id, with_predictions=False, begin_date=None, end_date=None, locally=False, provider=None)

Delete files with actuals and predictions locally or from specified provider(s).

Parameters

model_id (str) -- Model ID to delete actuals and predictions.
with_predictions (bool) --
begin_date -- Date to begin delete operations
end_date -- Date to end delete operations
locally (bool) -- Delete files from local model if True, on the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': None
}

Examples

ctx = Context()
A2MLModel(ctx).delete_actuals(model_id='D881079E1ED14FB')

review(model_id, locally=False, provider=None)

Review information about deployed model.

Parameters

model_id (str) -- The deployed model id you want to use.
locally (bool) -- Process review locally.

Returns

status(str): May be: started, error, completed, retrain
error(str): Description of error if status='error'
accuracy(float): Average accuracy of model (based on used metric) for review sensitivity period (see config.yml)

{
    'result': True,
    'data': {'status': 'completed', 'error': '', 'accuracy': 0.76}
}

Examples

ctx = Context()
result = A2ML(ctx).review(model_id='D881079E1ED14FB')

a2ml_dataset module

class a2ml.api.a2ml_dataset.A2MLDataset(ctx, provider=None)

Contains the dataset CRUD operations that interact with provider.

__init__(ctx, provider=None)

Initializes a new a2ml dataset.

Parameters

ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.

Returns

A2MLDataset object

Examples

ctx = Context()
dataset = A2MLDataset(ctx, 'auger, azure')

list()

List all of the DataSets for the Project specified in the .yaml.

Note

You will need to use the iter function to access the dataset elements.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'datasets': <object>
        }
    }
}

Examples

ctx = Context()
dataset_list = A2MLDataset(ctx, 'auger, azure').list()
for provider in ['auger', 'azure']:
    if dataset_list[provider]['result'] is True:
        for dataset in iter(dataset_list[provider]['data']['datasets']):
            ctx.log(dataset.get('name'))
    else:
        ctx.log('error %s' % dataset_list[provider]['data'])

create(source=None, name=None, description=None)

Create a new DataSet for the Project specified in the .yaml.

Parameters

source (str, optional) -- Local file name, remote url to the data source file, Pandas DataFrame or postgres url
name (str, optional) -- Name of dataset, if none then file name used. If source is DataFrame then name should be specified.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'created': 'dataset.csv'
        }
    }
}

Examples

ctx = Context()
dataset = A2MLDataset(ctx, 'auger, azure').create('../dataset.csv')

upload(source, name=None)

Upload file to Auger and get Auger url.

Parameters

source (str) -- Local file name, remote url to the data source file, Pandas DataFrame or postgres url
name (str, optional) -- Name of dataset, if none then file name used. If source is DataFrame then name should be specified.

Returns

{
    'result': True,
    'data': 'url for the file on Auger Hub'
}

Examples

ctx = Context()
url = A2MLDataset(ctx).upload('../dataset.csv')

delete(name=None)

Deletes a DataSet for the Project specified in the .yaml.

Parameters

name (str) -- name of dataset.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'deleted': 'dataset.csv'
        }
    }
}

Examples

ctx = Context()
A2MLDataset(ctx, 'auger, azure').delete(dataset_name)
ctx.log('Deleted dataset %s' % dataset_name)

select(name=None)

Sets a DataSet name in the context.

Parameters

name (str) -- name of dataset.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'selected': 'fortunetest'
        }
    }
}

Examples

ctx = Context()
A2MLDataset(ctx, 'auger, azure').select(dataset_name)

download(name=None, path=None)

Download DataSet by name to the local file.

Parameters

name (str, optional) -- name of dataset. If skipped dataset from auger.yaml will be used
path (str, optional) -- local dir path to store file. If skipped current folder will be used.

Returns

{
    'result': True,
    'data': 'full local path to the file'
}

Examples

ctx = Context()
A2MLDataset(ctx).download(dataset_name)

a2ml_experiment module

class a2ml.api.a2ml_experiment.A2MLExperiment(ctx, provider=None)

Contains the experiment operations that interact with provider.

__init__(ctx, provider=None)

Initializes a new a2ml experiment.

Parameters

ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.

Returns

A2MLExperiment object

Examples

ctx = Context()
model = A2MLExperiment(ctx, 'auger, azure')

list()

List all of the experiments for the Project specified in the .yaml.

Note

You will need to use the iter function to access the experiment elements.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'experiments': <object>
        }
    }
}

Examples

ctx = Context()
experiment_list = A2MLExperiment(ctx, 'auger, azure').list()
for provider in ['auger', 'azure']:
    if experiment_list[provider]['result'] is True:
        for experiment in iter(experiment_list[provider]['data']['experiments']):
            ctx.log(experiment.get('name'))
    else:
        ctx.log('error %s' % experiment_list[provider]['data'])

start()

Starts experiment/s for selected dataset. If the name of experiment is not set in context config, new experiment will be created, otherwise an existing experiment will be run.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'experiment_name': <experiment_name>,
            'session_id': <session_id>
        }
    }
}

Examples

ctx = Context()
experiment = A2MLExperiment(ctx, providers).start()

stop(run_id=None)

Stops running experiment/s.

Parameters

run_id (str) -- The run id for a training session. A unique run id is created for every train. If set to None default is last experiment train.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'stopped': <experiment_name>
        }
    }
}

Examples

ctx = Context()
experiment = A2MLExperiment(ctx, providers).stop()

leaderboard(run_id)

The leaderboard of the currently running or previously completed experiment/s.

Parameters

run_id (str) -- The run id for a training session. A unique run id is created for every train. If set to None default is last experiment train.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'run_id': '9ccfe04eca67757a',
            'leaderboard': [
                {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
                {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
            ],
            'trials_count': 10,
            'status': 'started',
            'provider_status': 'provider specific'
        }
    },
    'azure': {
        'result': True,
        'data': {
            'run_id': '9ccfe04eca67757a',
            'leaderboard': [
                {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
                {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
            ],
            'trials_count': 10,
            'status': 'started',
            'provider_status': 'provider specific'
        }
    }
}

Status

preprocess - search is preprocessing data for training

started - search is in progress

completed - search is completed

interrupted - search was interrupted

error - search was finished with error
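
A common next step is to pick the best entry from a provider's leaderboard, for example to pass its model id to deploy. The sketch below assumes the metric column is rmse as in the output above and that lower is better; best_model is a hypothetical helper, and the score values in the sample are illustrative, not real results.

```python
def best_model(outcome, metric='rmse', lower_is_better=True):
    """Pick the leaderboard entry with the best value of metric."""
    entries = outcome['data']['leaderboard']
    if not entries:
        return None
    pick = min if lower_is_better else max
    # Scores appear as strings in the documented output, so convert first.
    return pick(entries, key=lambda entry: float(entry[metric]))


# One provider's entry, shaped like the documented return value.
sample = {
    'result': True,
    'data': {
        'leaderboard': [
            {'model id': 'A017AC8EAD094FD', 'rmse': '0.2500',
             'algorithm': 'LGBMRegressor'},
            {'model id': '4602AFCEEEAE413', 'rmse': '0.1000',
             'algorithm': 'ExtraTreesRegressor'},
        ],
        'status': 'completed',
    },
}
print(best_model(sample)['model id'])  # 4602AFCEEEAE413
```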

Examples

ctx = Context()
# run_id=None selects the last experiment train
leaderboard = A2MLExperiment(ctx, 'auger, azure').leaderboard(run_id=None)
for provider in ['auger', 'azure']:
    if leaderboard[provider]['result'] is True:
        for entry in iter(leaderboard[provider]['data']['leaderboard']):
            ctx.log(entry['model id'])
        ctx.log('status %s' % leaderboard[provider]['data']['status'])
    else:
        ctx.log('error %s' % leaderboard[provider]['data'])

history()

The history of the currently running or previously completed experiment/s.

Note

You will need to use the iter function to access the history elements.

Returns

Results for each provider.

 {
    'auger': {
        'result': True,
        'data': {
            'history': <object>
        }
    }
}
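
To get a quick overview of past runs, the history entries can be grouped by status. The sketch below assumes each run exposes get('status'), as the example for this method does; runs_by_status is a hypothetical helper and the sample runs are made up for illustration.

```python
from collections import Counter


def runs_by_status(outcome):
    """Count runs per status for one provider's history() result."""
    if not outcome['result']:
        raise RuntimeError(outcome['data'])
    return Counter(run.get('status') for run in outcome['data']['history'])


# One provider's entry; a plain list stands in for the history object.
sample = {
    'result': True,
    'data': {'history': [
        {'id': 'run-1', 'status': 'completed'},
        {'id': 'run-2', 'status': 'interrupted'},
        {'id': 'run-3', 'status': 'completed'},
    ]},
}
counts = runs_by_status(sample)
print(counts['completed'])  # 2
```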

Examples

ctx = Context()
history = A2MLExperiment(ctx, 'auger, azure').history()
for provider in ['auger', 'azure']:
    if history[provider]['result'] is True:
        for run in iter(history[provider]['data']['history']):
            ctx.log("run id: {}, status: {}".format(
                run.get('id'),
                run.get('status')))
    else:
        ctx.log('error %s' % history[provider]['data'])

a2ml_model module

class a2ml.api.a2ml_model.A2MLModel(ctx, provider=None)

Contains the model operations that interact with provider.

__init__(ctx, provider=None)

Initializes a new a2ml model.

Parameters

ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider from methods.

Returns

A2MLModel object

Examples

ctx = Context()
model = A2MLModel(ctx)

deploy(model_id, locally=False, review=False, provider=None, name=None, algorithm=None, score=None, data_path=None, metadata=None)

Deploy a model locally or to specified provider(s).

Parameters

model_id (str) -- Model ID from any experiment leaderboard.
locally (bool) -- Deploys using a local model if True, on the Provider Cloud if False.
review (bool) -- Whether the model should support review based on actual data. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
name (str) -- Friendly name for the model. Used as name for Review Endpoint
algorithm (str) -- Monitored model(external provider) algorithm name.
score (float) -- Monitored model(external provider) score.
data_path (str) -- Path to the data used to fit the model on deploy. Returns a new deployed model id.
metadata (dict) -- Additional parameters for the model. Used for the accuracy report (report parameter).

Returns

{
    'result': True,
    'data': {'model_id': 'A017AC8EAD094FD'}
}

Examples

ctx = Context()
model = A2MLModel(ctx).deploy(model_id='D881079E1ED14FB', name='FirstExperiment')

ctx = Context()
model = A2MLModel(ctx, 'external')
result = model.deploy(model_id=None, name="My external model.", algorithm='RandomForest', score=0.75)
model_id = result['data']['model_id']

predict(model_id, filename=None, data=None, columns=None, predicted_at=None, threshold=None, score=False, score_true_data=None, output=None, no_features_in_result=None, locally=False, provider=None, predict_labels=None)

Predict results with new data against deployed model. Predictions are stored next to the file with data to be predicted on. The file name will be appended with suffix _predicted.

Note

Use deployed model_id

This method supports only one provider.

Parameters

model_id (str) -- The deployed model id you want to use.
filename (str) -- The file with data to request predictions for.
data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method
columns (list) -- list of column names if data is array of records
predicted_at -- Predict data date. Use for review of historical data.
threshold (float) -- For classification models only. This will return class probabilities with response.
score (bool) -- Calculate scores for predicted results.
score_true_data (str, pandas.DataFrame, dict) -- Data with true values to calculate scores. If omitted, the target from filename is used for true values.
output (str) -- Output csv file path.
no_features_in_result (bool) -- Do not return feature columns in prediction result. False by default
locally (bool, str) -- Predicts using a local model with auger.ai.predict if True, on the Provider Cloud if False. If set to "docker", then docker image used to run the model
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.
predict_labels (dict, bool) -- Run ActiveLearn to select data for labelling

Returns

if filename is not None.

{
    'result': True,
    'data': {'predicted': 'dataset_predicted.csv'}
}

if filename is None and data is not None and columns is None.

{
    'result': True,
    'data': {'predicted': [{col1: value1, col2: value2, target: predicted_value1}, {col1: value3, col2: value4, target: predicted_value2}]}
}

if filename is None and data is not None and columns is not None.

{
    'result': True,
    'data': {'predicted': {'columns': ['col1', 'col2', target], 'data': [['value1', 'value2', 1], ['value3', 'value4', 0]]}}
}

Examples

ctx = Context()
rv = A2MLModel(ctx).predict(model_id, '../irises.csv')
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']

ctx = Context()
data = [{'col1': 'value1', 'col2': 'value2'}, {'col1': 'value3', 'col2': 'value4'}]
rv = A2MLModel(ctx).predict(model_id, data=data)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']

ctx = Context()
data = [['value1', 'value2'], ['value3', 'value4']]
columns = ['col1', 'col2']
rv = A2MLModel(ctx).predict(model_id, data=data, columns=columns)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']

# Predict locally without config files. The model will be downloaded automatically if it does not exist locally.
# To use local predict install a2ml[predict]
ctx = Context()
ctx.config.set('name', 'project name')
ctx.credentials = "Json string from a2ml ui settings"

rv = A2MLModel(ctx).predict(model_id, '../irises.csv',
    no_features_in_result = True, locally=True)
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']

actuals(model_id, filename=None, data=None, columns=None, actuals_at=None, actual_date_column=None, experiment_params=None, locally=False, provider=None)

Submits actual results (ground truths) for predictions of a deployed model. This is used to review and monitor active models.

Note

It is assumed you have predictions against this model first.

actuals.csv
predicted (or target): predicted value. If omitted, predict is called automatically	actual	baseline_target: predicted value for baseline model (OPTIONAL)
Iris-setosa	Iris-setosa	Iris-setosa
Iris-virginica	Iris-virginica	Iris-virginica

It may also contain training features, which are used to predict (if target is omitted), to retrain the model during Review, and for the distribution chart.

This method supports only one provider.

Parameters

model_id (str) -- The deployed model id you want to use.
filename (str) -- The file with data to request predictions for.
data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method
columns (list) -- list of column names if data is array of records
actuals_at -- Actuals date. Use for review of historical data.
actual_date_column (str) -- name of column in data which contains actual date

experiment_params (dict) --

parameters to calculate experiment metrics

start_date(date): experiment actuals start date
end_date(date):  experiment actuals end date
date_col(str): column name with date

locally (bool) -- Process actuals locally.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.

Returns

{
    'result': True,
    'data': True
}

Errors.

{
    'result': False,
    'data': 'Actual Prediction IDs not found in model predictions.'
}

Examples

ctx = Context()
A2MLModel(ctx).actuals('D881079E1ED14FB', filename='<path_to_file>/actuals.csv')

ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
columns = [target, 'actual']  # target is the name of the model's target column

A2MLModel(ctx).actuals('D881079E1ED14FB', data=actual_records, columns=columns)

ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
columns = [target, 'actual']  # target is the name of the model's target column

A2MLModel(ctx, "external").actuals('external_model_id', data=actual_records, columns=columns)

review_alert(model_id, parameters=None, locally=False, provider=None, name=None)

Update Review parameters.

Parameters

model_id (str) -- The deployed model id you want to use.
parameters (dict) --
If None, review section from config will be used.
- active (True/False): Activate/Deactivate Review Alert
- type (model_accuracy/feature_average_range/runtime_errors_burst)
  - model_accuracy: Decrease in Model Accuracy: the model accuracy threshold allowed before trigger is initiated. Default threshold: 0.7. Default sensitivity: 72
  - feature_average_range: Feature Average Out-Of-Range: Trigger an alert if average feature value during time period goes beyond the standard deviation range calculated during training period by the specified number of times or more. Default threshold: 1. Default sensitivity: 168
  - runtime_errors_burst: Burst Of Runtime Errors: Trigger an alert if runtime error count exceeds threshold. Default threshold: 5. Default sensitivity: 1
- threshold (float)
- sensitivity (int): The amount of time(in hours) this metric must be at or below the threshold to trigger the alert.
- threshold_policy (all_values/average_value/any_value)
  - all_values: Default value. Trigger an alert when all values in sensitivity below threshold
  - average_value: Trigger an alert when average of values in sensitivity below threshold
  - any_value: Trigger an alert when any value in sensitivity below threshold
- action (no/retrain/retrain_deploy)
  - no: no action should be executed
  - retrain: Use new predictions and actuals as test set to retrain the model.
  - retrain_deploy: Deploy retrained model and make it active model of this endpoint.
- notification (no/user/organization): Send message via selected notification channel.
locally (bool) -- Process review locally.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
name (str) -- Friendly name for the model. Used as name for Review Endpoint
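
For a model_accuracy alert, the three threshold_policy values can be read as different reductions over the metric values collected during the sensitivity period. The sketch below only illustrates that reading and is not a2ml's implementation: the function name should_trigger, the strict `<` comparison, and the sample values are all assumptions.

```python
from statistics import mean


def should_trigger(values, threshold, policy="all_values"):
    """Return True if an alert would fire under the given policy (illustrative only)."""
    if not values:
        return False  # assumption: an empty window never triggers an alert
    if policy == "all_values":
        return all(v < threshold for v in values)
    if policy == "average_value":
        return mean(values) < threshold
    if policy == "any_value":
        return any(v < threshold for v in values)
    raise ValueError(f"unknown threshold_policy: {policy!r}")


# Hypothetical accuracy values observed during the sensitivity period.
window = [0.72, 0.65, 0.68]

print(should_trigger(window, 0.7, "all_values"))     # False: 0.72 is not below 0.7
print(should_trigger(window, 0.7, "average_value"))  # True: the mean is about 0.683
print(should_trigger(window, 0.7, "any_value"))      # True: 0.65 is below 0.7
```

Note that runtime_errors_burst is described as firing when the error count exceeds the threshold, so the comparison direction in this sketch applies to model_accuracy only.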

Returns

{
    'result': True,
}

Examples

ctx = Context()
model = A2MLModel(ctx).review_alert(model_id='D881079E1ED14FB')
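
A parameters dict built from the keys documented above might look like the following. The threshold and sensitivity are the documented defaults for model_accuracy; the action and notification values are arbitrary choices. The check_review_parameters helper is hypothetical (it is not part of a2ml) and only checks the enumerated options listed in this section.

```python
# Allowed values, copied from the parameter list above.
ALLOWED = {
    "type": {"model_accuracy", "feature_average_range", "runtime_errors_burst"},
    "threshold_policy": {"all_values", "average_value", "any_value"},
    "action": {"no", "retrain", "retrain_deploy"},
    "notification": {"no", "user", "organization"},
}


def check_review_parameters(parameters):
    """Return a list of problems found in a review_alert parameters dict."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key in parameters and parameters[key] not in allowed:
            problems.append(
                f"{key}={parameters[key]!r} is not one of {sorted(allowed)}"
            )
    return problems


parameters = {
    "active": True,
    "type": "model_accuracy",
    "threshold": 0.7,      # documented default for model_accuracy
    "sensitivity": 72,     # hours; documented default for model_accuracy
    "threshold_policy": "all_values",
    "action": "retrain",
    "notification": "user",
}

print(check_review_parameters(parameters))  # []

# With a2ml installed and configured, the dict would then be passed as:
# A2MLModel(ctx).review_alert(model_id='D881079E1ED14FB', parameters=parameters)
```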

review(model_id, locally=False, provider=None)¶

Review information about deployed model.

Parameters

model_id (str) -- The deployed model id you want to use.
locally (bool) -- Process review locally.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

status (str) -- One of: started, error, completed, retrain.
error (str) -- Description of the error if status='error'.
accuracy (float) -- Average accuracy of the model (based on the metric used) for the review sensitivity period (see config.yml).

{
    'result': True,
    'data': {'status': 'completed', 'error': '', 'accuracy': 0.76}
}

Examples

ctx = Context()
result = A2MLModel(ctx).review(model_id='D881079E1ED14FB')
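
One way to act on the returned dict is to branch on the status field. This is a minimal sketch that assumes the result has exactly the shape shown under Returns, and that a failed call carries an error message in 'data'; the describe_review helper is hypothetical.

```python
def describe_review(result):
    """Turn a review() result dict into a one-line summary."""
    if not result.get("result"):
        # Assumption: on failure, 'data' holds an error message.
        return f"review call failed: {result.get('data')}"
    data = result["data"]
    if data["status"] == "error":
        return f"review error: {data['error']}"
    if data["status"] == "completed":
        return f"review completed, accuracy {data['accuracy']:.2f}"
    return f"review status: {data['status']}"


# Sample copied from the Returns section above.
sample = {
    "result": True,
    "data": {"status": "completed", "error": "", "accuracy": 0.76},
}
print(describe_review(sample))  # review completed, accuracy 0.76
```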

undeploy(model_id, locally=False, provider=None)¶

Undeploy a model locally or from specified provider(s).

Parameters

model_id (str) -- Model ID from any experiment leaderboard.
locally (bool) -- Undeploys the local model if True, from the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': {'model_id': 'A017AC8EAD094FD'}
}

Examples

ctx = Context()
model = A2MLModel(ctx).undeploy(model_id='D881079E1ED14FB', locally=True)

delete_actuals(model_id, with_predictions=False, begin_date=None, end_date=None, locally=False, provider=None)¶

Delete files with actuals and predictions locally or from specified provider(s).

Parameters

model_id (str) -- Model ID to delete actuals and predictions.
with_predictions (bool) -- Also delete predictions if True. The default is False.
begin_date -- Date to begin delete operations
end_date -- Date to end delete operations
locally (bool) -- Delete files from local model if True, on the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': None
}

Examples

ctx = Context()
A2MLModel(ctx).delete_actuals(model_id='D881079E1ED14FB')

get_info(model_id, locally=False, provider=None)¶

Get information about model

Parameters

model_id (str) -- Model ID to get information about.
locally (bool) -- Get information from local model if True, on the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': {}  # detailed model information
}

Examples

ctx = Context()
res = A2MLModel(ctx).get_info('D881079E1ED14FB')

update(model_id, metadata, locally=False, provider=None)¶

Update model metadata

Parameters

model_id (str) -- Model ID whose metadata should be updated.
metadata (dict) -- Model metadata to update
locally (bool) -- Update the local model if True, on the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': {}
}

Examples

ctx = Context()
res = A2MLModel(ctx).update('D881079E1ED14FB', {'report': {}})

list(endpoints=False)¶

List all of the models/endpoints for the specified providers.

Returns

Results for each provider.

{
    'result': True,
    'data': {
        'projects': <object>
    }
}

Examples

ctx = Context()
models_list = A2MLModel(ctx).list()

a2ml_project module¶

class a2ml.api.a2ml_project.A2MLProject(ctx, provider=None)¶

Contains the project CRUD operations that interact with provider.

__init__(ctx, provider=None)¶

Initializes a new a2ml project.

Parameters

ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.

Returns

A2MLProject object

Examples

ctx = Context()
project = A2MLProject(ctx, 'auger, azure')

list()¶

List all of the projects for the specified providers.

Note

You will need to use the iter function to access the project elements.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'projects': <object>
        }
    }
}

Examples

ctx = Context()
project_list = A2MLProject(ctx, 'auger, azure').list()
for provider in ['auger', 'azure']:
    # Results are documented as plain dicts, so index them by key.
    if project_list[provider]['result'] is True:
        for project in iter(project_list[provider]['data']['projects']):
            ctx.log(project.get('name'))
    else:
        ctx.log('error %s' % project_list[provider]['data'])
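
Because every multi-provider call returns one entry per provider, it can be convenient to separate successes from failures before processing them. The following is a minimal sketch over plain dicts shaped like the Returns block above; split_by_outcome is a hypothetical helper and the sample data, including the error message, is made up for illustration.

```python
def split_by_outcome(results):
    """Split a per-provider result dict into (succeeded, failed) data dicts."""
    succeeded = {}
    failed = {}
    for provider, outcome in results.items():
        target = succeeded if outcome["result"] is True else failed
        target[provider] = outcome["data"]
    return succeeded, failed


# Hypothetical sample in the documented shape.
sample = {
    "auger": {"result": True, "data": {"projects": [{"name": "demo"}]}},
    "azure": {"result": False, "data": "example error message"},
}

succeeded, failed = split_by_outcome(sample)
for provider, data in succeeded.items():
    for project in data["projects"]:
        print(provider, project.get("name"))  # auger demo
for provider, message in failed.items():
    print(provider, "error:", message)        # azure error: example error message
```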

create(name)¶

Creates a project for the specified providers.

Parameters

name (str) -- name of project. If None - use project name from config.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'created': 'project_name'
    }
}

Examples

ctx = Context()
project_list = A2MLProject(ctx, 'auger, azure').create('new_project_name')

delete(name)¶

Deletes a project for the specified providers.

Parameters

name (str) -- name of project. If None - use project name from config.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'deleted': 'existing_project_name'
    }
}

Examples

ctx = Context()
project_list = A2MLProject(ctx, 'auger, azure').delete('existing_project_name')

select(name)¶

Sets a Project name in the context.

Parameters

name (str) -- name of project. If None - use project name from config.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'selected': 'fortunetest'
        }
    }
}

Examples

ctx = Context()
A2MLProject(ctx, 'auger, azure').select('fortunetest')

get_cluster_config(name, local_config=True)¶

Get project cluster configuration for the specified providers.

Parameters

name (str) -- name of project. If None - use project name from config.
local_config (bool) -- If True, return cluster parameters from local config, otherwise from remote cluster

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'type': 'standard',
            'min_nodes': 2,
            'max_nodes': 2,
            'stack_version': 'stable'
        }
    },
    'azure': {
        'result': True,
        'data': {
            'region': 'eastus2',
            'min_nodes': 0,
            'max_nodes': 2,
            'type': 'STANDARD_D2_V2',
            'name': 'a2ml-azure',
            'idle_seconds_before_scaledown': 120
        }
    }
}

Examples

ctx = Context()
cluster_config = A2MLProject(ctx, 'auger, azure').get_cluster_config(name=None)
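
The returned structure can be reduced to the fields of interest per provider. The sketch below uses the sample values from the Returns block above (trimmed to the node counts) and extracts the node range for each provider that reported success; node_ranges is a hypothetical helper, not part of a2ml.

```python
def node_ranges(cluster_config):
    """Map each successful provider to its (min_nodes, max_nodes) pair."""
    return {
        provider: (outcome["data"]["min_nodes"], outcome["data"]["max_nodes"])
        for provider, outcome in cluster_config.items()
        if outcome["result"] is True
    }


# Sample values taken from the Returns section above.
cluster_config = {
    "auger": {"result": True, "data": {"type": "standard", "min_nodes": 2, "max_nodes": 2}},
    "azure": {"result": True, "data": {"type": "STANDARD_D2_V2", "min_nodes": 0, "max_nodes": 2}},
}

print(node_ranges(cluster_config))  # {'auger': (2, 2), 'azure': (0, 2)}
```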

update_cluster_config(name, params)¶

Update project cluster configuration for the specified providers.

Parameters

name (str) -- name of project. If None - use project name from config.
params (dict) -- cluster parameters to update.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': None
    },
    'azure': {
        'result': True,
        'data': None
    }
}

Examples

ctx = Context()
params = {'max_nodes': 4}
cluster_config = A2MLProject(ctx, 'auger, azure').update_cluster_config(None, params)

Submodules¶

context module¶

class a2ml.api.utils.context.Context(name='auger', path=None, debug=False)¶

The Context class provides an environment to run A2ML

__init__(name='auger', path=None, debug=False)¶

Initializes the Context instance

Parameters

name (str) -- The name of the config file. Default is 'auger'
path (str) -- The path to your config file. If the config file is in the root directory leave as None.
debug (bool) -- True | False. Default is False.

Returns

Context object

Return type

object

Example

ctx = Context()

get_providers(provider=None)¶

Returns the list of providers.

Parameters

provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure'. The default is None - use provider set in config.

Returns

['azure', 'auger']

Return type

list[str]

Examples

ctx = Context()
ctx.get_providers()
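
Provider arguments throughout this API are written as comma-separated strings such as 'auger, azure', while get_providers returns a list. As a rough illustration of that conversion (an assumption about the behaviour, not a2ml's actual code), a string could be normalized as follows; parse_providers is a hypothetical name.

```python
def parse_providers(provider):
    """Split a comma-separated provider string into a list of clean names."""
    if provider is None:
        return []  # assumption: the real API falls back to the config instead
    return [name.strip() for name in provider.split(",") if name.strip()]


print(parse_providers("auger, azure"))  # ['auger', 'azure']
print(parse_providers("auger"))         # ['auger']
```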