A2ML API

a2ml.api package

a2ml module - A2ML PREDIT API

class a2ml.api.a2ml.A2ML(ctx, provider=None)

Facade to A2ML providers.

__init__(ctx, provider=None)

Initializes a new A2ML PREDIT instance.

Parameters
  • ctx (object) -- An instance of the a2ml Context.

  • provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.

Returns

A2ML object

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
import_data(source=None, name=None, description=None)

Imports data defined in context. Uploading the same file name will result in versions being appended to the file name.

Note

Your context points to a config file where source is defined.

# Local file name, remote url to the data source file or postgres url
source: './dataset.csv'
# Postgres url parameters: dbname, tablename, offset(OPTIONAL), limit(OPTIONAL)
source: jdbc:postgresql://user:pwd@ec2-54-204-21-226.compute-1.amazonaws.com:5432/dbname?tablename=table1&offset=0&limit=100
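
As a rough illustration of the Postgres url layout above, the sketch below splits such a source string into its parts using only the standard library. The helper name describe_postgres_source is hypothetical and is not part of a2ml; how a2ml itself parses the url is not shown in this document.

```python
from urllib.parse import urlsplit, parse_qs


def describe_postgres_source(source):
    """Split a 'jdbc:postgresql://...' source string into its parts (hypothetical helper)."""
    # urlsplit does not understand the 'jdbc:' prefix, so drop it first.
    url = source[len("jdbc:"):] if source.startswith("jdbc:") else source
    parts = urlsplit(url)
    # parse_qs returns a list per key; keep only the first value of each parameter.
    params = {key: values[0] for key, values in parse_qs(parts.query).items()}
    return {
        "host": parts.hostname,
        "port": parts.port,
        "dbname": parts.path.lstrip("/"),
        "tablename": params.get("tablename"),
        # offset and limit are optional, so they may be absent.
        "offset": int(params["offset"]) if "offset" in params else None,
        "limit": int(params["limit"]) if "limit" in params else None,
    }


info = describe_postgres_source(
    "jdbc:postgresql://user:pwd@ec2-54-204-21-226.compute-1.amazonaws.com:5432/dbname"
    "?tablename=table1&offset=0&limit=100"
)
```

Checking the parts this way before calling import_data can catch a missing tablename early.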
Parameters
  • source (str, optional) -- Local file name, remote url to the data source file, Pandas DataFrame or postgres url

  • name (str, optional) -- Name of dataset, if none then file name used. If source is DataFrame then name should be specified.

  • description (str, optional) -- Description of dataset

Returns

Results for each provider.

{
    'auger': {'result': True, 'data': {'created': 'dataset.csv'}},

    'azure': {'result': True, 'data': {'created': 'dataset.csv'}}
}

Errors.

{
    'auger': {'result': False, 'data': 'Please specify data source file...'},

    'azure': {'result': False, 'data': 'Please specify data source file...'}
}
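
Because every multi-provider call returns one entry per provider, it can help to separate successes from failures before continuing. The following is a minimal sketch that assumes the results are plain dicts shaped like the examples above; split_provider_results is a hypothetical helper, not an a2ml function.

```python
def split_provider_results(results):
    """Return (succeeded, failed) dicts keyed by provider name (hypothetical helper)."""
    succeeded, failed = {}, {}
    for provider, outcome in results.items():
        # 'result' is True on success; 'data' then holds the payload,
        # otherwise it holds the error message.
        if outcome.get("result") is True:
            succeeded[provider] = outcome.get("data")
        else:
            failed[provider] = outcome.get("data")
    return succeeded, failed


# Sample input in the documented shape (values copied from the examples above).
results = {
    "auger": {"result": True, "data": {"created": "dataset.csv"}},
    "azure": {"result": False, "data": "Please specify data source file..."},
}
ok, errors = split_provider_results(results)
```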

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
a2ml.import_data()
preprocess_data(data, preprocessors, locally=False)

Preprocess data

Parameters
  • data (str|pandas.DataFrame) -- Input data to preprocess. Can be a path to a file (local or S3) or a Pandas DataFrame

  • preprocessors (array of dicts) --

    List of preprocessors with parameters

    [
        {'text': {'text_cols': []}}
    ]
    

Preprocessors:
text
  • text_cols(array): List of text columns to process

  • text_metrics ['mean_length', 'unique_count', 'separation_score'] : Metrics to calculate for text fields; separation_score is calculated after vectorization

  • tokenize (dict): Default - {'max_text_len': 30000, 'tokenizers': ['sent'], 'remove_chars': '○•'}

  • vectorize ('en_use_lg'|'hashing'|'en_use_md'|'en_use_cmlm_md'|'en_use_cmlm_lg'): See https://github.com/MartinoMensio/spacy-universal-sentence-encoder

  • dim_reduction(dict): Generate features based on vectors. See https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html

    {
        'alg_name': 'PCA'|'t-SNE',
        'args': {'n_components': 2} #Number of components to keep.
    }
    
  • output_prefix (str): Prefix for generated columns. Format name: {prefix}_{colname}_{num}

  • calc_distance ['none', 'cosine', 'cityblock', 'euclidean', 'haversine', 'l1', 'l2', 'manhattan', 'nan_euclidean'] | 'cosine' : See https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.distance_metrics.html#sklearn.metrics.pairwise.distance_metrics

  • compare_pairs (array of dicts): When calc_distance is not none.

    [
        {'compare_cols': [{'dataset_idx': 0, 'cols': ['col1']}, {'dataset_idx': 1, 'cols': ['col2']}],
            'output_name':'cosine_col1_col2', 'params': {}
        },
        {'compare_cols': [{'dataset_idx': 0, 'cols': ['col3']}, {'dataset_idx': 1, 'cols': ['col4']}],
            'output_name':'cosine_col3_col4', 'params': {}
        },
    ]
    
  • datasets: List of datasets to process. May be empty, in which case all fields are taken from the main dataset

    [
        {'path': 'path', 'keys': ['main_key', 'local_key'], 'text_metrics': ['separation_score', 'mean_length', 'unique_count']},
        {'path': 'path1', 'keys': ['main_key1', 'local_key1']}
    ]
    
Returns

{
    'result': True,
    'data': 'data in input format'
}
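
The preprocessors argument can get fairly nested, so a small builder may make it easier to assemble. This is a sketch only: build_text_preprocessor is a hypothetical helper, its default values ('hashing', 'PCA', 2 components) are arbitrary choices for illustration, and it assumes all listed options live inside the 'text' dict as in the [{'text': {'text_cols': []}}] example above.

```python
ALLOWED_VECTORIZERS = {"en_use_lg", "hashing", "en_use_md", "en_use_cmlm_md", "en_use_cmlm_lg"}
ALLOWED_DIM_REDUCTION = {"PCA", "t-SNE"}


def build_text_preprocessor(text_cols, vectorize="hashing", alg_name="PCA",
                            n_components=2, output_prefix=None):
    """Build a preprocessors list with one 'text' entry (hypothetical helper)."""
    if vectorize not in ALLOWED_VECTORIZERS:
        raise ValueError("unknown vectorize value: %r" % vectorize)
    if alg_name not in ALLOWED_DIM_REDUCTION:
        raise ValueError("unknown dim_reduction algorithm: %r" % alg_name)
    options = {
        "text_cols": list(text_cols),
        "vectorize": vectorize,
        "dim_reduction": {"alg_name": alg_name, "args": {"n_components": n_components}},
    }
    # output_prefix is only added when the caller provides one.
    if output_prefix is not None:
        options["output_prefix"] = output_prefix
    return [{"text": options}]


preprocessors = build_text_preprocessor(["review_text"], output_prefix="txt")
```

The resulting list would then be passed as the preprocessors argument of preprocess_data; 'review_text' is a made-up column name.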

train()

Starts training session based on context state.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'experiment_name': 'dataset.csv-4-experiment',
            'session_id': '9ccfe04eca67757a'
         }
    },
    'azure': {
        'result': True,
        'data': {
            'experiment_name': 'dataset.csv-4-experiment',
            'session_id': '9ccfe04eca67757a'
         }
    }
}

Errors.

{
    'auger': {'result': False, 'data': 'Please set target to build model.'},

    'azure': {'result': False, 'data': 'Please set target to build model.'}
}

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
a2ml.train()
evaluate(run_id=None)

Evaluate the results of training.

Parameters

run_id (str, optional) -- The run id for a training session. A unique run id is created for every train. Default is last experiment train.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'run_id': '9ccfe04eca67757a',
            'leaderboard': [
                {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
                {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
            ],
            'trials_count': 10,
            'status': 'started',
            'provider_status': 'provider specific'
        }
    },
    'azure': {
        'result': True,
        'data': {
            'run_id': '9ccfe04eca67757a',
            'leaderboard': [
                {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
                {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
            ],
            'trials_count': 10,
            'status': 'started',
            'provider_status': 'provider specific'
        }
    }
}

Status

  • preprocess - search is preprocessing data for training

  • started - search is in progress

  • completed - search is completed

  • interrupted - search was interrupted

  • error - search was finished with error
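
The status values above suggest a simple polling loop: keep evaluating while the status is preprocess or started, and stop on anything else. The sketch below assumes the result is a plain dict shaped like the Returns example; wait_for_search, its poll_seconds and max_polls parameters, and the fake evaluate function are all hypothetical and exist only so the loop can be run without a provider.

```python
import time

IN_PROGRESS = ("preprocess", "started")


def wait_for_search(evaluate, provider, poll_seconds=5.0, max_polls=100, sleep=time.sleep):
    """Call evaluate() until the provider's status leaves the in-progress states."""
    for _ in range(max_polls):
        outcome = evaluate()[provider]
        if not outcome["result"]:
            # On failure, 'data' carries the error message.
            raise RuntimeError(outcome["data"])
        status = outcome["data"]["status"]
        if status not in IN_PROGRESS:
            return status
        sleep(poll_seconds)
    raise TimeoutError("search still in progress after %d polls" % max_polls)


# Stand-in for a2ml.evaluate that walks through three statuses.
_fake_statuses = iter(["preprocess", "started", "completed"])


def fake_evaluate():
    return {"auger": {"result": True, "data": {"status": next(_fake_statuses)}}}


final_status = wait_for_search(fake_evaluate, "auger", sleep=lambda seconds: None)
```

With a real instance, the evaluate argument would be the bound method a2ml.evaluate.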

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
while True:
    res = a2ml.evaluate()
    if res['auger']['data']['status'] not in ['preprocess','started']:
        break
deploy(model_id, locally=False, review=False, provider=None, name=None, algorithm=None, score=None, data_path=None, metadata=None)

Deploy a model locally or to specified provider(s).

Note

See evaluate function to get model_id

This method supports only one provider

Parameters
  • model_id (str) -- The model id from any experiment you will deploy. Ignored for 'external' provider

  • locally (bool) -- Deploys the model locally if True, on the Provider Cloud if False. The default is False.

  • review (bool) -- Whether the model should support review based on actual data. The default is False.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

  • name (str) -- Friendly name for the model. Used as name for Review Endpoint

  • algorithm (str) -- Monitored model(external provider) algorithm name.

  • score (float) -- Monitored model(external provider) score.

  • data_path (str) -- Path to the data used to fit the model on deploy. Returns a new deployed model id

  • metadata (dict) -- Additional parameters for the model. Used for the accuracy report (report parameter)

Returns

{
    'result': True,
    'data': {'model_id': 'A017AC8EAD094FD'}
}
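
Single-provider methods such as deploy return one result dict rather than one per provider. A minimal sketch of unwrapping it, assuming a plain dict as shown above; unwrap is a hypothetical helper, not an a2ml function.

```python
def unwrap(result):
    """Return result['data'] on success, raise with the error text otherwise."""
    if result.get("result") is not True:
        # On failure, 'data' is expected to hold the error message.
        raise RuntimeError("a2ml call failed: %s" % result.get("data"))
    return result.get("data")


# Sample deploy result in the documented shape.
deploy_result = {"result": True, "data": {"model_id": "A017AC8EAD094FD"}}
model_id = unwrap(deploy_result)["model_id"]
```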

Examples

ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
a2ml.deploy(model_id='A017AC8EAD094FD', name='FirstExperiment')
ctx = Context()
a2ml = A2ML(ctx, 'external')
result = a2ml.deploy(model_id=None, name="My external model.", algorithm='RandomForest', score=0.75)
model_id = result['data']['model_id']
predict(model_id, filename=None, data=None, columns=None, predicted_at=None, threshold=None, score=False, score_true_data=None, output=None, no_features_in_result=None, locally=False, provider=None, predict_labels=None)

Predict results with new data against deployed model. Predictions are stored next to the file with data to be predicted on. The file name will be appended with suffix _predicted.

Note

Use deployed model_id

This method supports only one provider

Parameters
  • model_id (str) -- The deployed model id you want to use.

  • filename (str) -- The file with data to request predictions for.

  • data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method

  • columns (list) -- list of column names if data is array of records

  • predicted_at -- Predict data date. Use for review of historical data.

  • threshold (float) -- For classification models only. This will return class probabilities with response.

  • score (bool) -- Calculate scores for predicted results.

  • score_true_data (str, pandas.DataFrame, dict) -- Data with true values to calculate scores. If omitted, the target from filename is used for true values.

  • output (str) -- Output csv file path.

  • no_features_in_result (bool) -- Do not return feature columns in prediction result. False by default

  • locally (bool, str) -- Predicts using a local model with auger.ai.predict if True, on the Provider Cloud if False. If set to "docker", a docker image is used to run the model

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.

  • predict_labels (dict, bool) -- Run ActiveLearn to select data for labelling

Returns

if filename is not None.

{
    'result': True,
    'data': {'predicted': 'dataset_predicted.csv'}
}

if filename is None and data is not None and columns is None.

{
    'result': True,
    'data': {'predicted': [{col1: value1, col2: value2, target: predicted_value1}, {col1: value3, col2: value4, target: predicted_value2}]}
}

if filename is None and data is not None and columns is not None.

{
    'result': True,
    'data': {'predicted': {'columns': ['col1', 'col2', target], 'data': [['value1', 'value2', 1], ['value3', 'value4', 0]]}}
}
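
Since 'predicted' can be a file name, a list of records, or a columns/data pair, code that consumes predictions may want a single row format. The sketch below converts the two in-memory shapes into a list of dicts; predicted_rows is a hypothetical helper and assumes plain dict results as shown above.

```python
def predicted_rows(result):
    """Return predictions as a list of {column: value} dicts (hypothetical helper)."""
    if not result.get("result"):
        raise RuntimeError(result.get("data"))
    predicted = result["data"]["predicted"]
    if isinstance(predicted, str):
        # A string means the predictions were written to a file instead.
        raise ValueError("predictions were written to a file: %s" % predicted)
    if isinstance(predicted, dict):
        # columns/data shape: pair each row with the column names.
        columns = predicted["columns"]
        return [dict(zip(columns, row)) for row in predicted["data"]]
    # Already a list of records.
    return list(predicted)


columnar = {
    "result": True,
    "data": {"predicted": {
        "columns": ["col1", "col2", "target"],
        "data": [["value1", "value2", 1], ["value3", "value4", 0]],
    }},
}
rows = predicted_rows(columnar)
```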

Examples

ctx = Context()
rv = A2ML(ctx).predict(model_id, '../irises.csv')
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']
ctx = Context()
data = [{'col1': 'value1', 'col2': 'value2'}, {'col1': 'value3', 'col2': 'value4'}]
rv = A2ML(ctx).predict(model_id, data=data)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']
ctx = Context()
data = [['value1', 'value2'], ['value3', 'value4']]
columns = ['col1', 'col2']
rv = A2ML(ctx).predict(model_id, data=data, columns=columns)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']
# Predict locally without config files. The model will be downloaded automatically if it does not exist.
# To use local predict install a2ml[predict]
ctx = Context()
ctx.config.set('name', 'project name')
ctx.credentials = "Json string from a2ml ui settings"

rv = A2ML(ctx).predict(model_id, '../irises.csv',
    no_features_in_result = True, locally=True)
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']
actuals(model_id, filename=None, data=None, columns=None, actuals_at=None, actual_date_column=None, experiment_params=None, locally=False, provider=None)

Submits actual results (ground truths) for predictions of a deployed model. This is used to review and monitor active models.

Note

It is assumed you have predictions against this model first.

actuals.csv

predicted (or target)    actual          baseline_target
Iris-setosa              Iris-setosa     Iris-setosa
Iris-virginica           Iris-virginica  Iris-virginica

  • predicted (or target): predicted value. If missing, predict is called automatically

  • baseline_target: predicted value for baseline model (OPTIONAL)

The file may also contain training features. They are used to predict (if the target is missing), to retrain the model during Review, and for the distribution chart

This method supports only one provider

Parameters
  • model_id (str) -- The deployed model id you want to use.

  • filename (str) -- The file with data to request predictions for.

  • data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method

  • columns (list) -- list of column names if data is array of records

  • actuals_at -- Actuals date. Use for review of historical data.

  • actual_date_column (str) -- name of column in data which contains actual date

  • experiment_params (dict) --

    parameters to calculate experiment metrics

    start_date(date): experiment actuals start date
    end_date(date):  experiment actuals end date
    date_col(str): column name with date
    

  • locally (bool) -- Process actuals locally.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.

Returns

{
    'result': True,
    'data': True
}

Errors.

{
    'result': False,
    'data': 'Actual Prediction IDs not found in model predictions.'
}
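
To make the two input styles concrete, the sketch below builds the records/columns pair accepted through data and columns, and also renders the same rows as CSV text in the actuals.csv layout. Both helpers are hypothetical; the column name 'predicted' is one of the two header options mentioned in the note above, and the sample values are made up.

```python
import csv
import io


def build_actuals(pairs, predicted_column="predicted"):
    """Turn (predicted, actual) pairs into (records, columns) for the data/columns arguments."""
    columns = [predicted_column, "actual"]
    records = [[predicted, actual] for predicted, actual in pairs]
    return records, columns


def actuals_to_csv(records, columns):
    """Render the same records as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(records)
    return buffer.getvalue()


records, columns = build_actuals([
    ("Iris-setosa", "Iris-setosa"),
    ("Iris-virginica", "Iris-setosa"),
])
csv_text = actuals_to_csv(records, columns)
```

The records and columns values could then be passed as data=records, columns=columns, or csv_text could be saved to a file and passed as filename.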

Examples

ctx = Context()
A2ML(ctx).actuals('D881079E1ED14FB', filename='<path_to_file>/actuals.csv')
ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
columns = [target, 'actual']

A2ML(ctx).actuals('D881079E1ED14FB', data=actual_records,columns=columns)
ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
columns = [target, 'actual']

A2ML(ctx, "external").actuals('external_model_id', data=actual_records,columns=columns)
delete_actuals(model_id, with_predictions=False, begin_date=None, end_date=None, locally=False, provider=None)

Delete files with actuals and predictions locally or from specified provider(s).

Parameters
  • model_id (str) -- Model ID to delete actuals and predictions.

  • with_predictions (bool) -- Also delete predictions if True. The default is False.

  • begin_date -- Date to begin delete operations

  • end_date -- Date to end delete operations

  • locally (bool) -- Delete files from local model if True, on the Provider Cloud if False. The default is False.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': None
}

Examples

ctx = Context()
A2ML(ctx).delete_actuals(model_id='D881079E1ED14FB')
review(model_id, locally=False, provider=None)

Review information about deployed model.

Parameters
  • model_id (str) -- The deployed model id you want to use.

  • locally (bool) -- Process review locally.

Returns

  • status (str) -- May be: started, error, completed, retrain

  • error (str) -- Description of error if status='error'

  • accuracy (float) -- Average accuracy of model (based on used metric) for review sensitivity period (see config.yml)

{
    'result': True,
    'data': {'status': 'completed', 'error': '', 'accuracy': 0.76}
}

Examples

ctx = Context()
result = A2ML(ctx).review(model_id='D881079E1ED14FB')
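
A review result in the shape documented above can be turned into a one-line summary for logs. This is a minimal sketch assuming a plain dict result; summarize_review is a hypothetical helper, and whether 'accuracy' is present for every status is not stated here, so the code treats it as optional.

```python
def summarize_review(result):
    """Return a short human-readable summary of a review result (hypothetical helper)."""
    if not result.get("result"):
        return "review call failed: %s" % result.get("data")
    data = result["data"]
    if data["status"] == "error":
        return "review error: %s" % data.get("error")
    accuracy = data.get("accuracy")
    if accuracy is None:
        # No accuracy reported yet; fall back to the status alone.
        return data["status"]
    return "%s: accuracy %.2f" % (data["status"], accuracy)


summary = summarize_review(
    {"result": True, "data": {"status": "completed", "error": "", "accuracy": 0.76}}
)
```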

a2ml_dataset module

class a2ml.api.a2ml_dataset.A2MLDataset(ctx, provider=None)

Contains the dataset CRUD operations that interact with provider.

__init__(ctx, provider=None)

Initializes a new a2ml dataset.

Parameters
  • ctx (object) -- An instance of the a2ml Context.

  • provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.

Returns

A2MLDataset object

Examples

ctx = Context()
dataset = A2MLDataset(ctx, 'auger, azure')
list()

List all of the DataSets for the Project specified in the .yaml.

Note

You will need to use the iter function to access the dataset elements.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'datasets': <object>
        }
    }
}

Examples

ctx = Context()
dataset_list = A2MLDataset(ctx, 'auger, azure').list()
for provider in ['auger', 'azure']:
    if dataset_list[provider].result is True:
        for dataset in iter(dataset_list[provider].data.datasets):
            ctx.log(dataset.get('name'))
    else:
        ctx.log('error %s' % dataset_list[provider].data)
create(source=None, name=None, description=None)

Create a new DataSet for the Project specified in the .yaml.

Parameters
  • source (str, optional) -- Local file name, remote url to the data source file, Pandas DataFrame or postgres url

  • name (str, optional) -- Name of dataset, if none then file name used. If source is DataFrame then name should be specified.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'created': 'dataset.csv'
        }
    }
}

Examples

ctx = Context()
dataset = A2MLDataset(ctx, 'auger, azure').create('../dataset.csv')
upload(source, name=None)

Upload file to Auger and get Auger url.

Parameters
  • source (str) -- Local file name, remote url to the data source file, Pandas DataFrame or postgres url

  • name (str, optional) -- Name of dataset, if none then file name used. If source is DataFrame then name should be specified.

Returns

{
    'result': True,
    'data': 'url for the file on Auger Hub'
}

Examples


ctx = Context()
url = A2MLDataset(ctx).upload('../dataset.csv')

delete(name=None)

Deletes a DataSet for the Project specified in the .yaml.

Parameters

name (str) -- name of dataset.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'deleted': 'dataset.csv'
        }
    }
}

Examples

ctx = Context()
A2MLDataset(ctx, 'auger, azure').delete(dataset_name)
ctx.log('Deleted dataset %s' % dataset_name)
select(name=None)

Sets a DataSet name in the context.

Parameters

name (str) -- name of dataset.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'selected': 'fortunetest'
        }
    }
}

Examples

ctx = Context()
A2MLDataset(ctx, 'auger, azure').select(dataset_name)
download(name=None, path=None)

Download DataSet by name to the local file.

Parameters
  • name (str, optional) -- name of dataset. If omitted, the dataset from auger.yaml is used.

  • path (str, optional) -- local dir path to store the file. If omitted, the current folder is used.

Returns

{
    'result': True,
    'data': 'full local path to the file'
}

Examples

ctx = Context()
A2MLDataset(ctx).download(dataset_name)

a2ml_experiment module

class a2ml.api.a2ml_experiment.A2MLExperiment(ctx, provider=None)

Contains the experiment operations that interact with provider.

__init__(ctx, provider=None)

Initializes a new a2ml experiment.

Parameters
  • ctx (object) -- An instance of the a2ml Context.

  • provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.

Returns

A2MLExperiment object

Examples

ctx = Context()
model = A2MLExperiment(ctx, 'auger, azure')
list()

List all of the experiments for the Project specified in the .yaml.

Note

You will need to use the iter function to access the experiment elements.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'experiments': <object>
        }
    }
}

Examples

ctx = Context()
experiment_list = A2MLExperiment(ctx, 'auger, azure').list()
for provider in ['auger', 'azure']:
    if experiment_list[provider].result is True:
        for experiment in iter(experiment_list[provider].data.experiments):
            ctx.log(experiment.get('name'))
    else:
        ctx.log('error %s' % experiment_list[provider].data)
start()

Starts experiment(s) for the selected dataset. If the experiment name is not set in the context config, a new experiment is created; otherwise the existing experiment is run.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'experiment_name': <experiment_name>,
            'session_id': <session_id>
        }
    }
}

Examples

ctx = Context()
experiment = A2MLExperiment(ctx, providers).start()
stop(run_id=None)

Stops running experiment/s.

Parameters

run_id (str) -- The run id for a training session. A unique run id is created for every train. If set to None default is last experiment train.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'stopped': <experiment_name>
        }
    }
}

Examples

ctx = Context()
experiment = A2MLExperiment(ctx, providers).stop()
leaderboard(run_id)

The leaderboard of the currently running or previously completed experiment/s.

Parameters

run_id (str) -- The run id for a training session. A unique run id is created for every train. If set to None default is last experiment train.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'run_id': '9ccfe04eca67757a',
            'leaderboard': [
                {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
                {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
            ],
            'trials_count': 10,
            'status': 'started',
            'provider_status': 'provider specific'
        }
    },
    'azure': {
        'result': True,
        'data': {
            'run_id': '9ccfe04eca67757a',
            'leaderboard': [
                {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
                {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
            ],
            'trials_count': 10,
            'status': 'started',
            'provider_status': 'provider specific'
        }
    }
}

Status

  • preprocess - search is preprocessing data for training

  • started - search is in progress

  • completed - search is completed

  • interrupted - search was interrupted

  • error - search was finished with error
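
Once a leaderboard is available, a common next step is picking the entry to deploy. The sketch below selects the best entry by a metric, assuming the leaderboard is a list of dicts with string metric values as in the Returns example. best_model is a hypothetical helper, the rmse values in the sample are made up for illustration, and whether lower is better depends on the metric.

```python
def best_model(leaderboard, metric="rmse", lower_is_better=True):
    """Return the leaderboard entry with the best metric value, or None if empty."""
    if not leaderboard:
        return None
    pick = min if lower_is_better else max
    # Metric values are shown as strings in the leaderboard, so convert before comparing.
    return pick(leaderboard, key=lambda entry: float(entry[metric]))


# Sample entries in the documented shape; the rmse numbers are invented.
sample_leaderboard = [
    {"model id": "A017AC8EAD094FD", "rmse": "0.1250", "algorithm": "LGBMRegressor"},
    {"model id": "4602AFCEEEAE413", "rmse": "0.0830", "algorithm": "ExtraTreesRegressor"},
]
winner = best_model(sample_leaderboard)
```

The 'model id' of the winning entry is the value that deploy expects as model_id.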

Examples

ctx = Context()
leaderboard = A2MLExperiment(ctx, 'auger, azure').leaderboard(run_id=None)  # None: last experiment train
for provider in ['auger', 'azure']:
    if leaderboard[provider].result is True:
        for entry in iter(leaderboard[provider].data.leaderboard):
            ctx.log(entry['model id'])
            ctx.log('status %s' % leaderboard[provider].data.status)
    else:
        ctx.log('error %s' % leaderboard[provider].data)
history()

The history of the currently running or previously completed experiment/s.

Note

You will need to use the iter function to access the history elements.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'history': <object>
        }
    }
}

Examples

ctx = Context()
history = A2MLExperiment(ctx, 'auger, azure').history()
for provider in ['auger', 'azure']:
    if history[provider].result is True:
        for run in iter(history[provider].data.history):
            ctx.log("run id: {}, status: {}".format(
                run.get('id'),
                run.get('status')))
    else:
        ctx.log('error %s' % history[provider].data)

a2ml_model module

class a2ml.api.a2ml_model.A2MLModel(ctx, provider=None)

Contains the model operations that interact with provider.

__init__(ctx, provider=None)

Initializes a new a2ml model.

Parameters
  • ctx (object) -- An instance of the a2ml Context.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider from methods.

Returns

A2MLModel object

Examples

ctx = Context()
model = A2MLModel(ctx)
deploy(model_id, locally=False, review=False, provider=None, name=None, algorithm=None, score=None, data_path=None, metadata=None)

Deploy a model locally or to specified provider(s).

Parameters
  • model_id (str) -- Model ID from any experiment leaderboard.

  • locally (bool) -- Deploys using a local model if True, on the Provider Cloud if False.

  • review (bool) -- Whether the model should support review based on actual data. The default is False.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

  • name (str) -- Friendly name for the model. Used as name for Review Endpoint

  • algorithm (str) -- Monitored model(external provider) algorithm name.

  • score (float) -- Monitored model(external provider) score.

  • data_path (str) -- Path to the data used to fit the model on deploy. Returns a new deployed model id

  • metadata (dict) -- Additional parameters for the model. Used for the accuracy report (report parameter)

Returns

{
    'result': True,
    'data': {'model_id': 'A017AC8EAD094FD'}
}

Examples

ctx = Context()
model = A2MLModel(ctx).deploy(model_id='D881079E1ED14FB', name='FirstExperiment')
ctx = Context()
model = A2MLModel(ctx, 'external')
result = model.deploy(model_id=None, name="My external model.", algorithm='RandomForest', score=0.75)
model_id = result['data']['model_id']
predict(model_id, filename=None, data=None, columns=None, predicted_at=None, threshold=None, score=False, score_true_data=None, output=None, no_features_in_result=None, locally=False, provider=None, predict_labels=None)

Predict results with new data against deployed model. Predictions are stored next to the file with data to be predicted on. The file name will be appended with suffix _predicted.

Note

Use deployed model_id

This method supports only one provider

Parameters
  • model_id (str) -- The deployed model id you want to use.

  • filename (str) -- The file with data to request predictions for.

  • data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method

  • columns (list) -- list of column names if data is array of records

  • predicted_at -- Predict data date. Use for review of historical data.

  • threshold (float) -- For classification models only. This will return class probabilities with response.

  • score (bool) -- Calculate scores for predicted results.

  • score_true_data (str, pandas.DataFrame, dict) -- Data with true values to calculate scores. If omitted, the target from filename is used for true values.

  • output (str) -- Output csv file path.

  • no_features_in_result (bool) -- Do not return feature columns in prediction result. False by default

  • locally (bool, str) -- Predicts using a local model with auger.ai.predict if True, on the Provider Cloud if False. If set to "docker", a docker image is used to run the model

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.

  • predict_labels (dict, bool) -- Run ActiveLearn to select data for labelling

Returns

if filename is not None.

{
    'result': True,
    'data': {'predicted': 'dataset_predicted.csv'}
}

if filename is None and data is not None and columns is None.

{
    'result': True,
    'data': {'predicted': [{col1: value1, col2: value2, target: predicted_value1}, {col1: value3, col2: value4, target: predicted_value2}]}
}

if filename is None and data is not None and columns is not None.

{
    'result': True,
    'data': {'predicted': {'columns': ['col1', 'col2', target], 'data': [['value1', 'value2', 1], ['value3', 'value4', 0]]}}
}

Examples

ctx = Context()
rv = A2MLModel(ctx).predict(model_id, '../irises.csv')
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']
ctx = Context()
data = [{'col1': 'value1', 'col2': 'value2'}, {'col1': 'value3', 'col2': 'value4'}]
rv = A2MLModel(ctx).predict(model_id, data=data)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']
ctx = Context()
data = [['value1', 'value2'], ['value3', 'value4']]
columns = ['col1', 'col2']
rv = A2MLModel(ctx).predict(model_id, data=data, columns=columns)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']
# Predict locally without config files. The model will be downloaded automatically if it does not exist.
# To use local predict install a2ml[predict]
ctx = Context()
ctx.config.set('name', 'project name')
ctx.credentials = "Json string from a2ml ui settings"

rv = A2MLModel(ctx).predict(model_id, '../irises.csv',
    no_features_in_result = True, locally=True)
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']
actuals(model_id, filename=None, data=None, columns=None, actuals_at=None, actual_date_column=None, experiment_params=None, locally=False, provider=None)

Submits actual results (ground truths) for predictions of a deployed model. This is used to review and monitor active models.

Note

It is assumed you have predictions against this model first.

actuals.csv

predicted (or target)    actual          baseline_target
Iris-setosa              Iris-setosa     Iris-setosa
Iris-virginica           Iris-virginica  Iris-virginica

  • predicted (or target): predicted value. If missing, predict is called automatically

  • baseline_target: predicted value for baseline model (OPTIONAL)

The file may also contain training features. They are used to predict (if the target is missing), to retrain the model during Review, and for the distribution chart

This method supports only one provider

Parameters
  • model_id (str) -- The deployed model id you want to use.

  • filename (str) -- The file with data to request predictions for.

  • data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method

  • columns (list) -- list of column names if data is array of records

  • actuals_at -- Actuals date. Use for review of historical data.

  • actual_date_column (str) -- name of column in data which contains actual date

  • experiment_params (dict) --

    parameters to calculate experiment metrics

    start_date(date): experiment actuals start date
    end_date(date):  experiment actuals end date
    date_col(str): column name with date
    

  • locally (bool) -- Process actuals locally.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.

Returns

{
    'result': True,
    'data': True
}

Errors.

{
    'result': False,
    'data': 'Actual Prediction IDs not found in model predictions.'
}

Examples

ctx = Context()
A2MLModel(ctx).actuals('D881079E1ED14FB', filename='<path_to_file>/actuals.csv')
ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
columns = [target, 'actual']

A2MLModel(ctx).actuals('D881079E1ED14FB', data=actual_records,columns=columns)
ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
columns = [target, 'actual']

A2MLModel(ctx, "external").actuals('external_model_id', data=actual_records,columns=columns)
review_alert(model_id, parameters=None, locally=False, provider=None, name=None)

Update Review parameters.

Parameters
  • model_id (str) -- The deployed model id you want to use.

  • parameters (dict) --

    If None, review section from config will be used.

    • active (True/False): Activate/Deactivate Review Alert

    • type (model_accuracy/feature_average_range/runtime_errors_burst)

      • model_accuracy: Decrease in Model Accuracy: the model accuracy threshold allowed before trigger is initiated. Default threshold: 0.7. Default sensitivity: 72

      • feature_average_range: Feature Average Out-Of-Range: Trigger an alert if average feature value during time period goes beyond the standard deviation range calculated during training period by the specified number of times or more. Default threshold: 1. Default sensitivity: 168

      • runtime_errors_burst: Burst Of Runtime Errors: Trigger an alert if runtime error count exceeds threshold. Default threshold: 5. Default sensitivity: 1

    • threshold (float)

    • sensitivity (int): The amount of time (in hours) this metric must be at or below the threshold to trigger the alert.

    • threshold_policy (all_values/average_value/any_value)

      • all_values: Default value. Trigger an alert when all values in the sensitivity period are below the threshold.

      • average_value: Trigger an alert when the average of the values in the sensitivity period is below the threshold.

      • any_value: Trigger an alert when any value in the sensitivity period is below the threshold.

    • action (no/retrain/retrain_deploy)

      • no: no action should be executed

      • retrain: Use new predictions and actuals as test set to retrain the model.

      • retrain_deploy: Deploy retrained model and make it active model of this endpoint.

    • notification (no/user/organization): Send message via selected notification channel.

  • locally (bool) -- Process review locally.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

  • name (str) -- Friendly name for the model. Used as name for Review Endpoint
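The three threshold_policy values listed above can be read as three ways of comparing a window of metric values against the threshold. The sketch below is only an illustration of that reading, not the provider's implementation: the function name should_trigger is hypothetical, an empty window is assumed never to trigger, and a strict "below" comparison is assumed.

```python
from statistics import mean


def should_trigger(values, threshold, policy='all_values'):
    """Decide whether a window of metric values would trigger an alert.

    Hypothetical illustration of the documented policies; values are the
    metric readings collected over the sensitivity period.
    """
    if not values:
        # Assumption: no readings means nothing to alert on.
        return False
    if policy == 'all_values':
        return all(value < threshold for value in values)
    if policy == 'average_value':
        return mean(values) < threshold
    if policy == 'any_value':
        return any(value < threshold for value in values)
    raise ValueError('unknown threshold_policy: %r' % policy)


window = [0.65, 0.72, 0.68]
strict = should_trigger(window, 0.7, 'all_values')
average = should_trigger(window, 0.7, 'average_value')
loose = should_trigger(window, 0.7, 'any_value')
```

In this window one reading stays above 0.7, so all_values does not fire while average_value and any_value do, which shows that all_values is the most conservative of the three.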

Returns

{
    'result': True,
}

Examples

ctx = Context()
model = A2MLModel(ctx).review_alert(model_id='D881079E1ED14FB')
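When parameters are passed explicitly instead of being read from config, the documented per-type defaults can be collected in one place. The following is a sketch under stated assumptions: the helper build_review_parameters is hypothetical, 'active': True is an arbitrary choice for the example, and only the defaults stated in the parameter list are filled in.

```python
# Defaults copied from the parameter description of review_alert.
ALERT_DEFAULTS = {
    'model_accuracy': {'threshold': 0.7, 'sensitivity': 72},
    'feature_average_range': {'threshold': 1, 'sensitivity': 168},
    'runtime_errors_burst': {'threshold': 5, 'sensitivity': 1},
}

# Allowed values copied from the parameter description.
ALLOWED = {
    'threshold_policy': {'all_values', 'average_value', 'any_value'},
    'action': {'no', 'retrain', 'retrain_deploy'},
    'notification': {'no', 'user', 'organization'},
}


def build_review_parameters(alert_type, **overrides):
    """Build a parameters dict for review_alert (hypothetical helper)."""
    if alert_type not in ALERT_DEFAULTS:
        raise ValueError('unknown alert type: %r' % alert_type)
    for key, value in overrides.items():
        if key in ALLOWED and value not in ALLOWED[key]:
            raise ValueError('invalid %s: %r' % (key, value))
    # 'all_values' is the documented default policy.
    params = {'active': True, 'type': alert_type, 'threshold_policy': 'all_values'}
    params.update(ALERT_DEFAULTS[alert_type])
    params.update(overrides)
    return params


params = build_review_parameters('model_accuracy', action='retrain', notification='user')

# With a2ml installed, the dict could be passed on like this (not run here):
# A2MLModel(ctx).review_alert(model_id='D881079E1ED14FB', parameters=params)
```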
review(model_id, locally=False, provider=None)

Review information about deployed model.

Parameters
  • model_id (str) -- The deployed model id you want to use.

  • locally (bool) -- Process review locally.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

  • status (str) -- May be: started, error, completed, retrain.

  • error (str) -- Description of error if status='error'.

  • accuracy (float) -- Average accuracy of the model (based on the metric used) for the review sensitivity period (see config.yml).

{
    'result': True,
    'data': {'status': 'completed', 'error': '', 'accuracy': 0.76}
}

Examples

ctx = Context()
result = A2MLModel(ctx).review(model_id='D881079E1ED14FB')
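A caller usually needs to branch on the status field of the returned dictionary. A minimal sketch of that branching follows, assuming the result is a plain dict shaped like the Returns example; the function describe_review is hypothetical.

```python
def describe_review(result):
    """Summarize a review() result dict in one line (hypothetical helper)."""
    if not result.get('result'):
        # On failure, 'data' carries the error message.
        return 'request failed: %s' % result.get('data')
    data = result.get('data') or {}
    status = data.get('status')
    if status == 'error':
        return 'review failed: %s' % data.get('error')
    if status == 'completed':
        return 'review completed, accuracy %.2f' % data.get('accuracy')
    # Remaining documented statuses: started, retrain.
    return 'review status: %s' % status


sample = {
    'result': True,
    'data': {'status': 'completed', 'error': '', 'accuracy': 0.76},
}
summary = describe_review(sample)
```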
undeploy(model_id, locally=False, provider=None)

Undeploy a model locally or from specified provider(s).

Parameters
  • model_id (str) -- Model ID from any experiment leaderboard.

  • locally (bool) -- Undeploys the local model if True, the model on the Provider Cloud if False. The default is False.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': {'model_id': 'A017AC8EAD094FD'}
}

Examples

ctx = Context()
model = A2MLModel(ctx).undeploy(model_id='D881079E1ED14FB', locally=True)
delete_actuals(model_id, with_predictions=False, begin_date=None, end_date=None, locally=False, provider=None)

Delete files with actuals and predictions locally or from specified provider(s).

Parameters
  • model_id (str) -- Model ID to delete actuals and predictions.

  • with_predictions (bool) -- Also delete predictions if True. The default is False.

  • begin_date -- Date to begin delete operations

  • end_date -- Date to end delete operations

  • locally (bool) -- Delete files from local model if True, on the Provider Cloud if False. The default is False.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': None
}

Examples

ctx = Context()
A2MLModel(ctx).delete_actuals(model_id='D881079E1ED14FB')
get_info(model_id, locally=False, provider=None)

Get information about model

Parameters
  • model_id (str) -- Model ID to get information about.

  • locally (bool) -- Get information from local model if True, on the Provider Cloud if False. The default is False.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': {} #detailed model information
}

Examples

ctx = Context()
res = A2MLModel(ctx).get_info('D881079E1ED14FB')
update(model_id, metadata, locally=False, provider=None)

Update model metadata

Parameters
  • model_id (str) -- Model ID to update.

  • metadata (dict) -- Model metadata to update

  • locally (bool) -- Update the local model if True, the model on the Provider Cloud if False. The default is False.

  • provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.

Returns

{
    'result': True,
    'data': {}
}

Examples

ctx = Context()
res = A2MLModel(ctx).update('D881079E1ED14FB', {'report': {}})
list(endpoints=False)

List all of the models/endpoints for the specified providers.

Returns

Results for each provider.

{
    'result': True,
    'data': {
        'projects': <object>
    }
}

Examples

ctx = Context()
models_list = A2MLModel(ctx).list()

a2ml_project module

class a2ml.api.a2ml_project.A2MLProject(ctx, provider=None)

Contains the project CRUD operations that interact with provider.

__init__(ctx, provider=None)

Initializes a new a2ml project.

Parameters
  • ctx (object) -- An instance of the a2ml Context.

  • provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.

Returns

A2MLProject object

Examples

ctx = Context()
project = A2MLProject(ctx, 'auger, azure')
list()

List all of the projects for the specified providers.

Note

You will need to use the iter function to access the project elements.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'projects': <object>
        }
    }
}

Examples

ctx = Context()
project_list = A2MLProject(ctx, 'auger, azure').list()
for provider in ['auger', 'azure']:
    if project_list[provider].result is True:
        for project in iter(project_list[provider].data.projects):
            ctx.log(project.get('name'))
    else:
        ctx.log('error %s' % project_list[provider].data)
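Every multi-provider call in this module returns one entry per provider, each with a result flag and a data payload. The sketch below separates successes from failures in such a response; it assumes plain dicts shaped like the Returns examples, and the helper split_results is hypothetical.

```python
def split_results(results):
    """Split a per-provider response into successes and failures.

    Hypothetical helper: results maps provider name to a dict with
    'result' (bool) and 'data' (payload on success, message on failure).
    """
    succeeded, failed = {}, {}
    for provider, outcome in results.items():
        if outcome.get('result') is True:
            succeeded[provider] = outcome.get('data')
        else:
            failed[provider] = outcome.get('data')
    return succeeded, failed


response = {
    'auger': {'result': True, 'data': {'created': 'dataset.csv'}},
    'azure': {'result': False, 'data': 'Please specify data source file...'},
}
succeeded, failed = split_results(response)
```

Keeping failures in their own mapping makes it easy to log or retry a single provider without discarding the results that did succeed.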
create(name)

Creates a project for the specified providers.

Parameters

name (str) -- name of project. If None - use project name from config.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'created': 'project_name'
    }
}

Examples

ctx = Context()
project_list = A2MLProject(ctx, 'auger, azure').create('new_project_name')
delete(name)

Deletes a project for the specified providers.

Parameters

name (str) -- name of project. If None - use project name from config.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'deleted': 'existing_project_name'
    }
}

Examples

ctx = Context()
project_list = A2MLProject(ctx, 'auger, azure').delete('existing_project_name')
select(name)

Sets a Project name in the context.

Parameters

name (str) -- name of project. If None - use project name from config.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'selected': 'fortunetest'
        }
    }
}

Examples

ctx = Context()
A2MLProject(ctx, 'auger, azure').select('project_name')
get_cluster_config(name, local_config=True)

Get project cluster configuration for the specified providers.

Parameters
  • name (str) -- name of project. If None - use project name from config.

  • local_config (bool) -- If True, return cluster parameters from local config, otherwise from remote cluster

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': {
            'type': 'standard',
            'min_nodes': 2,
            'max_nodes': 2,
            'stack_version': 'stable'
        }
    },
    'azure': {
        'result': True,
        'data': {
            'region': 'eastus2',
            'min_nodes': 0,
            'max_nodes': 2,
            'type': 'STANDARD_D2_V2',
            'name': 'a2ml-azure',
            'idle_seconds_before_scaledown': 120
        }
    }
}

Examples

ctx = Context()
# name=None: use project name from config
cluster_config = A2MLProject(ctx, 'auger, azure').get_cluster_config(name=None)
update_cluster_config(name, params)

Update project cluster configuration for the specified providers.

Parameters
  • name (str) -- name of project. If None - use project name from config.

  • params (dict) -- cluster parameters to update.

Returns

Results for each provider.

{
    'auger': {
        'result': True,
        'data': None
    },
    'azure': {
        'result': True,
        'data': None
    }
}

Examples

ctx = Context()
params = {'max_nodes': 4}
# name=None: use project name from config
cluster_config = A2MLProject(ctx, 'auger, azure').update_cluster_config(None, params)
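Since update_cluster_config takes only the parameters to update, it can help to compute the difference between the current configuration and the desired one first. A minimal sketch, assuming both are plain dicts like the 'data' payload shown for get_cluster_config; the helper changed_params is hypothetical.

```python
def changed_params(current, desired):
    """Return only the desired settings that differ from the current ones.

    Hypothetical helper, not part of a2ml.
    """
    return {
        key: value
        for key, value in desired.items()
        if current.get(key) != value
    }


# Current values taken from the 'auger' example of get_cluster_config.
current = {
    'type': 'standard',
    'min_nodes': 2,
    'max_nodes': 2,
    'stack_version': 'stable',
}
desired = {'min_nodes': 2, 'max_nodes': 4}
params = changed_params(current, desired)
```

Here only max_nodes differs, so params matches the dictionary used in the example above.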

Submodules

context module

class a2ml.api.utils.context.Context(name='auger', path=None, debug=False)

The Context class provides an environment to run A2ML

__init__(name='auger', path=None, debug=False)

Initializes the Context instance

Parameters
  • name (str) -- The name of the config file. Default is 'auger'

  • path (str) -- The path to your config file. If the config file is in the root directory leave as None.

  • debug (bool) -- True | False. Default is False.

Returns

Context object

Return type

object

Example

ctx = Context()
get_providers(provider=None)

Gets the list of providers.

Parameters

provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.

Returns

['azure', 'auger']

Return type

list[str]

Examples

ctx = Context()
ctx.get_providers()
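Provider arguments throughout this API are written as comma-separated strings such as 'auger, azure', while get_providers returns a list. How the library converts one into the other is not stated here; the sketch below is only an assumption of what that normalization could look like, and parse_providers is a hypothetical name.

```python
def parse_providers(provider):
    """Normalize a provider argument into a list of names.

    Hypothetical illustration, not the library's implementation.
    Accepts None, a comma-separated string, or an iterable of names.
    """
    if provider is None:
        return []
    if isinstance(provider, str):
        names = provider.split(',')
    else:
        names = list(provider)
    # Strip surrounding whitespace and drop empty entries.
    return [name.strip() for name in names if name.strip()]


providers = parse_providers('auger, azure')
```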