A2ML API
a2ml.api package
a2ml module - A2ML PREDIT API
- class a2ml.api.a2ml.A2ML(ctx, provider=None)
Facade to A2ML providers.
- __init__(ctx, provider=None)
Initializes a new A2ML PREDIT instance.
- Parameters
ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.
- Returns
A2ML object
Examples
ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
- import_data(source=None, name=None, description=None)
Imports data defined in context. Uploading the same file name will result in versions being appended to the file name.
Note
Your context points to a config file where source is defined.
# Local file name, remote url to the data source file or postgres url
source: './dataset.csv'
# Postgres url parameters: dbname, tablename, offset(OPTIONAL), limit(OPTIONAL)
source: jdbc:postgresql://user:pwd@ec2-54-204-21-226.compute-1.amazonaws.com:5432/dbname?tablename=table1&offset=0&limit=100
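The Postgres source URL in the note above can be assembled programmatically. The following is a minimal sketch: `build_postgres_source` is a hypothetical helper (not part of a2ml), the host name is a placeholder, and the sketch assumes the URL layout is exactly the one shown in the note. It does not escape special characters in the user name or password.

```python
from urllib.parse import urlencode


def build_postgres_source(user, password, host, port, dbname, tablename,
                          offset=None, limit=None):
    """Assemble a source URL in the shape shown in the note (hypothetical helper)."""
    # tablename is always sent; offset and limit are optional and left out when None.
    params = {"tablename": tablename}
    if offset is not None:
        params["offset"] = offset
    if limit is not None:
        params["limit"] = limit
    return "jdbc:postgresql://{}:{}@{}:{}/{}?{}".format(
        user, password, host, port, dbname, urlencode(params))


# Placeholder connection details, for illustration only.
source = build_postgres_source("user", "pwd", "db.example.com", 5432,
                               "dbname", "table1", offset=0, limit=100)
print(source)
```

The resulting string could then be used as the source value in the config file or passed as the source argument.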
- Parameters
source (str, optional) -- Local file name, remote url to the data source file, Pandas DataFrame or postgres url
name (str, optional) -- Name of dataset, if none then file name used. If source is DataFrame then name should be specified.
description (str, optional) -- Description of dataset
- Returns
Results for each provider.
{ 'auger': {'result': True, 'data': {'created': 'dataset.csv'}}, 'azure': {'result': True, 'data': {'created': 'dataset.csv'}} }
Errors.
{ 'auger': {'result': False, 'data': 'Please specify data source file...'}, 'azure': {'result': False, 'data': 'Please specify data source file...'} }
Examples
ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
a2ml.import_data()
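Because multi-provider calls return one entry per provider, it can help to separate the providers that succeeded from those that failed. This sketch works on a plain dictionary shaped like the documented return values; `split_results` is a hypothetical helper and the sample data is illustrative, not real provider output.

```python
def split_results(results):
    """Separate a per-provider result dict into successes and failures."""
    succeeded, failed = {}, {}
    for provider, outcome in results.items():
        # Each outcome is expected to carry a boolean 'result' and a 'data' payload.
        if outcome.get("result") is True:
            succeeded[provider] = outcome.get("data")
        else:
            failed[provider] = outcome.get("data")
    return succeeded, failed


# Sample shaped like the documented return values.
sample = {
    "auger": {"result": True, "data": {"created": "dataset.csv"}},
    "azure": {"result": False, "data": "Please specify data source file..."},
}
succeeded, failed = split_results(sample)
print("ok:", succeeded)
print("failed:", failed)
```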
- preprocess_data(data, preprocessors, locally=False)
Preprocess data
- Parameters
data (str|pandas.DataFrame) -- Input data to preprocess. Can be a path to a file (local or s3) or a Pandas DataFrame
preprocessors (array of dicts) --
List of preprocessors with parameters
[ {'text': {'text_cols': []}} ]
- Preprocessors:
- text
text_cols(array): List of text columns to process
text_metrics ['mean_length', 'unique_count', 'separation_score'] : Calculate metrics for text fields and after vectorize(separation_score)
tokenize (dict): Default - {'max_text_len': 30000, 'tokenizers': ['sent'], 'remove_chars': '○•'}
vectorize ('en_use_lg'|'hashing'|'en_use_md'|'en_use_cmlm_md'|'en_use_cmlm_lg'): See https://github.com/MartinoMensio/spacy-universal-sentence-encoder
dim_reduction(dict): Generate features based on vectors. See https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.PCA.html
{
  'alg_name': 'PCA'|'t-SNE',
  'args': {'n_components': 2}  # Number of components to keep.
}
output_prefix (str): Prefix for generated columns. Format name: {prefix}_{colname}_{num}
calc_distance ['none', 'cosine', 'cityblock', 'euclidean', 'haversine', 'l1', 'l2', 'manhattan', 'nan_euclidean'] | 'cosine' : See https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.distance_metrics.html#sklearn.metrics.pairwise.distance_metrics
compare_pairs (array of dicts): When calc_distance is not none.
[
  {'compare_cols': [{'dataset_idx': 0, 'cols': ['col1']}, {'dataset_idx': 1, 'cols': ['col2']}], 'output_name': 'cosine_col1_col2', 'params': {}},
  {'compare_cols': [{'dataset_idx': 0, 'cols': ['col3']}, {'dataset_idx': 1, 'cols': ['col4']}], 'output_name': 'cosine_col3_col4', 'params': {}},
]
datasets: List of datasets to process. May be empty, in which case all fields are taken from the main dataset.
[
  {'path': 'path', 'keys': ['main_key', 'local_key'], 'text_metrics': ['separation_score', 'mean_length', 'unique_count']},
  {'path': 'path1', 'keys': ['main_key1', 'local_key1']}
]
- Returns
{ 'result': True, 'data': 'data in input format' }
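One way to avoid typos in the preprocessors argument is to build it with a small helper that checks option values against the lists documented above. This is only a sketch: `text_preprocessor` is hypothetical, the column name is made up, and it assumes the options (vectorize, dim_reduction, calc_distance) are sibling keys of text_cols inside the 'text' entry.

```python
# Values copied from the option lists documented for the 'text' preprocessor.
ALLOWED_VECTORIZERS = {"en_use_lg", "hashing", "en_use_md",
                       "en_use_cmlm_md", "en_use_cmlm_lg"}
ALLOWED_DISTANCES = {"none", "cosine", "cityblock", "euclidean", "haversine",
                     "l1", "l2", "manhattan", "nan_euclidean"}


def text_preprocessor(text_cols, vectorize, n_components=2,
                      calc_distance="cosine"):
    """Build one 'text' entry for the preprocessors list (hypothetical helper)."""
    if vectorize not in ALLOWED_VECTORIZERS:
        raise ValueError("unknown vectorize option: %r" % vectorize)
    if calc_distance not in ALLOWED_DISTANCES:
        raise ValueError("unknown calc_distance option: %r" % calc_distance)
    return {
        "text": {
            "text_cols": list(text_cols),
            "vectorize": vectorize,
            # Assumption: option names sit next to text_cols inside 'text'.
            "dim_reduction": {"alg_name": "PCA",
                              "args": {"n_components": n_components}},
            "calc_distance": calc_distance,
        }
    }


# 'review' is a made-up column name used for illustration.
preprocessors = [text_preprocessor(["review"], "hashing")]
print(preprocessors)
```

The resulting list has the same outer shape as the `[ {'text': {'text_cols': []}} ]` example and could be passed as the preprocessors argument.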
- train()
Starts training session based on context state.
- Returns
Results for each provider.
{
  'auger': {
    'result': True,
    'data': {'experiment_name': 'dataset.csv-4-experiment', 'session_id': '9ccfe04eca67757a'}
  },
  'azure': {
    'result': True,
    'data': {'experiment_name': 'dataset.csv-4-experiment', 'session_id': '9ccfe04eca67757a'}
  }
}
Errors.
{ 'auger': {'result': False, 'data': 'Please set target to build model.'}, 'azure': {'result': False, 'data': 'Please set target to build model.'} }
Examples
ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
a2ml.train()
- evaluate(run_id=None)
Evaluate the results of training.
- Parameters
run_id (str, optional) -- The run id for a training session. A unique run id is created for every training run. Defaults to the last training run of the experiment.
- Returns
Results for each provider.
{
  'auger': {
    'result': True,
    'data': {
      'run_id': '9ccfe04eca67757a',
      'leaderboard': [
        {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
        {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
      ],
      'trials_count': 10,
      'status': 'started',
      'provider_status': 'provider specific'
    }
  },
  'azure': {
    'result': True,
    'data': {
      'run_id': '9ccfe04eca67757a',
      'leaderboard': [
        {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
        {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
      ],
      'trials_count': 10,
      'status': 'started',
      'provider_status': 'provider specific'
    }
  }
}
Status
preprocess - search is preprocessing data for training
started - search is in progress
completed - search is completed
interrupted - search was interrupted
error - search was finished with error
Examples
ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
while True:
    res = a2ml.evaluate()
    if res['auger']['data']['status'] not in ['preprocess', 'started']:
        break
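A polling loop like the one above usually benefits from a pause between calls and an upper bound on attempts. The sketch below takes any zero-argument callable that returns the documented structure, so it runs here against a stand-in; `wait_for_search` and `fake_evaluate` are hypothetical names, and with the real API one would pass `a2ml.evaluate` instead.

```python
import time

# Statuses that mean the search is still running, per the status list above.
IN_PROGRESS = ("preprocess", "started")


def wait_for_search(evaluate, provider, poll_seconds=5.0, max_polls=100):
    """Call evaluate() until the provider's status leaves the in-progress states."""
    for _ in range(max_polls):
        res = evaluate()[provider]
        if res["result"] is not True:
            # On failure, 'data' holds an error message rather than a dict.
            raise RuntimeError(res["data"])
        status = res["data"]["status"]
        if status not in IN_PROGRESS:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("search still in progress after %d polls" % max_polls)


# Stand-in for a2ml.evaluate: replays a fixed sequence of statuses.
_statuses = iter(["preprocess", "started", "completed"])


def fake_evaluate():
    return {"auger": {"result": True, "data": {"status": next(_statuses)}}}


final_status = wait_for_search(fake_evaluate, "auger", poll_seconds=0)
print(final_status)
```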
- deploy(model_id, locally=False, review=False, provider=None, name=None, algorithm=None, score=None, data_path=None, metadata=None)
Deploy a model locally or to specified provider(s).
Note
See evaluate function to get model_id
This method supports only one provider
- Parameters
model_id (str) -- The model id from any experiment you will deploy. Ignored for 'external' provider
locally (bool) -- Deploys the model locally if True, on the Provider Cloud if False. The default is False.
review (bool) -- Whether the model should support review based on actual data. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
name (str) -- Friendly name for the model. Used as name for Review Endpoint
algorithm (str) -- Monitored model(external provider) algorithm name.
score (float) -- Monitored model(external provider) score.
data_path (str) -- Data path used to fit the model on deploy. Returns a new deployed model id.
metadata (dict) -- Additional parameters for the model. Used for the accuracy report (report parameter)
- Returns
{ 'result': True, 'data': {'model_id': 'A017AC8EAD094FD'} }
Examples
ctx = Context()
a2ml = A2ML(ctx, 'auger, azure')
a2ml.deploy(model_id='A017AC8EAD094FD', name='FirstExperiment')

ctx = Context()
a2ml = A2ML(ctx, 'external')
result = a2ml.deploy(model_id=None, name="My external model.", algorithm='RandomForest', score=0.75)
model_id = result['data']['model_id']
- predict(model_id, filename=None, data=None, columns=None, predicted_at=None, threshold=None, score=False, score_true_data=None, output=None, no_features_in_result=None, locally=False, provider=None, predict_labels=None)
Predict results with new data against deployed model. Predictions are stored next to the file with data to be predicted on. The file name will be appended with suffix _predicted.
Note
Use deployed model_id
This method supports only one provider
- Parameters
model_id (str) -- The deployed model id you want to use.
filename (str) -- The file with data to request predictions for.
data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method
columns (list) -- list of column names if data is array of records
predicted_at -- Predict data date. Use for review of historical data.
threshold (float) -- For classification models only. This will return class probabilities with response.
score (bool) -- Calculate scores for predicted results.
score_true_data (str, pandas.DataFrame, dict) -- Data with true values to calculate scores. If omitted, the target column from filename is used for true values.
output (str) -- Output csv file path.
no_features_in_result (bool) -- Do not return feature columns in prediction result. False by default
locally (bool, str) -- Predicts using a local model with auger.ai.predict if True, on the Provider Cloud if False. If set to "docker", a docker image is used to run the model
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.
predict_labels (dict, bool) -- Run ActiveLearn to select data for labelling
- Returns
if filename is not None.
{ 'result': True, 'data': {'predicted': 'dataset_predicted.csv'} }
if filename is None and data is not None and columns is None.
{ 'result': True, 'data': {'predicted': [{col1: value1, col2: value2, target: predicted_value1}, {col1: value3, col2: value4, target: predicted_value2}]} }
if filename is None and data is not None and columns is not None.
{ 'result': True, 'data': {'predicted': {'columns': ['col1', 'col2', target], 'data': [['value1', 'value2', 1], ['value3', 'value4', 0]]}} }
Examples
ctx = Context()
rv = A2ML(ctx).predict(model_id, '../irises.csv')
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']

ctx = Context()
data = [{'col1': 'value1', 'col2': 'value2'}, {'col1': 'value3', 'col2': 'value4'}]
rv = A2ML(ctx).predict(model_id, data=data)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']

ctx = Context()
data = [['value1', 'value2'], ['value3', 'value4']]
columns = ['col1', 'col2']
rv = A2ML(ctx).predict(model_id, data=data, columns=columns)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']

# Predict locally without config files. The model is downloaded automatically if it does not exist locally.
# To use local predict, install a2ml[predict]
ctx = Context()
ctx.config.set('name', 'project name')
ctx.credentials = "Json string from a2ml ui settings"
rv = A2ML(ctx).predict(model_id, '../irises.csv', no_features_in_result=True, locally=True)
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']
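When columns is passed, the predictions come back in a columns/data layout rather than as a list of records. If record dictionaries are more convenient downstream, the conversion is short. This is a sketch over a sample shaped like the documented return value; `to_records` is a hypothetical helper, and the sample uses the literal string 'target' where the real target column name would appear.

```python
def to_records(predicted):
    """Convert {'columns': [...], 'data': [[...], ...]} into a list of dicts."""
    columns = predicted["columns"]
    # Pair every row with the column names, position by position.
    return [dict(zip(columns, row)) for row in predicted["data"]]


# Sample shaped like the documented return value for data plus columns.
sample = {
    "result": True,
    "data": {"predicted": {
        "columns": ["col1", "col2", "target"],
        "data": [["value1", "value2", 1], ["value3", "value4", 0]],
    }},
}
records = to_records(sample["data"]["predicted"])
for record in records:
    print(record)
```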
- actuals(model_id, filename=None, data=None, columns=None, actuals_at=None, actual_date_column=None, experiment_params=None, locally=False, provider=None)
Submits actual results(ground truths) for predictions of a deployed model. This is used to review and monitor active models.
Note
It is assumed you have predictions against this model first.
| predicted (or target): predicted value. If omitted, predict is called automatically | actual | baseline_target: predicted value for the baseline model (OPTIONAL) |
| Iris-setosa | Iris-setosa | Iris-setosa |
| Iris-virginica | Iris-virginica | Iris-virginica |
The data may also contain training features, used to predict (if target is omitted), to retrain the model during Review, and for the distribution chart.
This method supports only one provider
- Parameters
model_id (str) -- The deployed model id you want to use.
filename (str) -- The file with data to request predictions for.
data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method
columns (list) -- list of column names if data is array of records
actuals_at -- Actuals date. Use for review of historical data.
actual_date_column (str) -- name of column in data which contains actual date
experiment_params (dict) --
parameters to calculate experiment metrics
start_date(date): experiment actuals start date end_date(date): experiment actuals end date date_col(str): column name with date
locally (bool) -- Process actuals locally.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.
- Returns
{ 'result': True, 'data': True }
Errors.
{ 'result': False, 'data': 'Actual Prediction IDs not found in model predictions.' }
Examples
ctx = Context()
A2ML(ctx).actuals('D881079E1ED14FB', filename='<path_to_file>/actuals.csv')

ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
# target holds the name of the target column
columns = [target, 'actual']
A2ML(ctx).actuals('D881079E1ED14FB', data=actual_records, columns=columns)

ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
# target holds the name of the target column
columns = [target, 'actual']
A2ML(ctx, "external").actuals('external_model_id', data=actual_records, columns=columns)
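When predictions and ground truths live in separate lists, they need to be paired row by row before being submitted as data and columns. The following sketch shows one way to do that; `build_actuals` is a hypothetical helper and 'species' is a made-up target column name.

```python
def build_actuals(predicted, actual, target="target"):
    """Pair predicted and actual values row by row (hypothetical helper)."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    # Column order matches the [target, 'actual'] layout used in the examples.
    columns = [target, "actual"]
    records = [[p, a] for p, a in zip(predicted, actual)]
    return records, columns


records, columns = build_actuals(
    ["Iris-setosa", "Iris-virginica"],
    ["Iris-setosa", "Iris-setosa"],
    target="species",  # hypothetical target column name
)
print(columns)
print(records)
```

The two return values correspond to the data and columns arguments of actuals.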
- delete_actuals(model_id, with_predictions=False, begin_date=None, end_date=None, locally=False, provider=None)
Delete files with actuals and predictions locally or from specified provider(s).
- Parameters
model_id (str) -- Model ID to delete actuals and predictions.
with_predictions (bool) --
begin_date -- Date to begin delete operations
end_date -- Date to end delete operations
locally (bool) -- Delete files from local model if True, on the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
- Returns
{ 'result': True, 'data': None }
Examples
ctx = Context()
A2MLModel(ctx).delete_actuals(model_id='D881079E1ED14FB')
- review(model_id, locally=False, provider=None)
Review information about deployed model.
- Parameters
model_id (str) -- The deployed model id you want to use.
locally (bool) -- Process review locally.
- Returns
status (str): May be started, error, completed, or retrain.
error (str): Description of the error if status='error'.
accuracy (float): Average accuracy of the model (based on the metric used) for the review sensitivity period (see config.yml).
{ 'result': True, 'data': {'status': 'completed', 'error': '', 'accuracy': 0.76} }
Examples
ctx = Context()
result = A2ML(ctx).review(model_id='D881079E1ED14FB')
a2ml_dataset module
- class a2ml.api.a2ml_dataset.A2MLDataset(ctx, provider=None)
Contains the dataset CRUD operations that interact with provider.
- __init__(ctx, provider=None)
Initializes a new a2ml dataset.
- Parameters
ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.
- Returns
A2MLDataset object
Examples
ctx = Context()
dataset = A2MLDataset(ctx, 'auger, azure')
- list()
List all of the DataSets for the Project specified in the .yaml.
Note
You will need to use the iter function to access the dataset elements.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'datasets': <object> } } }
Examples
ctx = Context()
dataset_list = A2MLDataset(ctx, 'auger, azure').list()
for provider in ['auger', 'azure']:
    if dataset_list[provider]['result'] is True:
        for dataset in iter(dataset_list[provider]['data']['datasets']):
            ctx.log(dataset.get('name'))
    else:
        ctx.log('error %s' % dataset_list[provider]['data'])
- create(source=None, name=None, description=None)
Create a new DataSet for the Project specified in the .yaml.
- Parameters
source (str, optional) -- Local file name, remote url to the data source file, Pandas DataFrame or postgres url
name (str, optional) -- Name of dataset, if none then file name used. If source is DataFrame then name should be specified.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'created': 'dataset.csv' } } }
Examples
ctx = Context()
dataset = A2MLDataset(ctx, 'auger, azure').create('../dataset.csv')
- upload(source, name=None)
Upload file to Auger and get Auger url.
- Parameters
source (str) -- Local file name, remote url to the data source file, Pandas DataFrame or postgres url
name (str, optional) -- Name of dataset, if none then file name used. If source is DataFrame then name should be specified.
- Returns
{ 'result': True, 'data': 'url for the file on Auger Hub' }
Examples
ctx = Context()
url = A2MLDataset(ctx).upload('../dataset.csv')
- delete(name=None)
Deletes a DataSet for the Project specified in the .yaml.
- Parameters
name (str) -- name of dataset.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'deleted': 'dataset.csv' } } }
Examples
ctx = Context()
A2MLDataset(ctx, 'auger, azure').delete(dataset_name)
ctx.log('Deleted dataset %s' % dataset_name)
- select(name=None)
Sets a DataSet name in the context.
- Parameters
name (str) -- name of dataset.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'selected': 'fortunetest' } } }
Examples
ctx = Context()
A2MLDataset(ctx, 'auger, azure').select(dataset_name)
- download(name=None, path=None)
Download DataSet by name to the local file.
- Parameters
name (str, optional) -- name of dataset. If omitted, the dataset from auger.yaml is used.
path (str, optional) -- local dir path to store file. If omitted, the current folder is used.
- Returns
{ 'result': True, 'data': 'full local path to the file' }
Examples
ctx = Context()
A2MLDataset(ctx).download(dataset_name)
a2ml_experiment module
- class a2ml.api.a2ml_experiment.A2MLExperiment(ctx, provider=None)
Contains the experiment operations that interact with provider.
- __init__(ctx, provider=None)
Initializes a new a2ml experiment.
- Parameters
ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.
- Returns
A2MLExperiment object
Examples
ctx = Context()
model = A2MLExperiment(ctx, 'auger, azure')
- list()
List all of the experiments for the Project specified in the .yaml.
Note
You will need to use the iter function to access the experiment elements.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'experiments': <object> } } }
Examples
ctx = Context()
experiment_list = A2MLExperiment(ctx, 'auger, azure').list()
for provider in ['auger', 'azure']:
    if experiment_list[provider]['result'] is True:
        for experiment in iter(experiment_list[provider]['data']['experiments']):
            ctx.log(experiment.get('name'))
    else:
        ctx.log('error %s' % experiment_list[provider]['data'])
- start()
Starts experiment/s for selected dataset. If the name of experiment is not set in context config, new experiment will be created, otherwise an existing experiment will be run.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'experiment_name': <experiment_name>, 'session_id': <session_id> } } }
Examples
ctx = Context()
experiment = A2MLExperiment(ctx, providers).start()
- stop(run_id=None)
Stops running experiment/s.
- Parameters
run_id (str) -- The run id for a training session. A unique run id is created for every training run. If None, the last training run of the experiment is used.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'stopped': <experiment_name> } } }
Examples
ctx = Context()
experiment = A2MLExperiment(ctx, providers).stop()
- leaderboard(run_id)
The leaderboard of the currently running or previously completed experiment/s.
- Parameters
run_id (str) -- The run id for a training session. A unique run id is created for every training run. If None, the last training run of the experiment is used.
- Returns
Results for each provider.
{
  'auger': {
    'result': True,
    'data': {
      'run_id': '9ccfe04eca67757a',
      'leaderboard': [
        {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
        {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
      ],
      'trials_count': 10,
      'status': 'started',
      'provider_status': 'provider specific'
    }
  },
  'azure': {
    'result': True,
    'data': {
      'run_id': '9ccfe04eca67757a',
      'leaderboard': [
        {'model id': 'A017AC8EAD094FD', 'rmse': '0.0000', 'algorithm': 'LGBMRegressor'},
        {'model id': '4602AFCEEEAE413', 'rmse': '0.0000', 'algorithm': 'ExtraTreesRegressor'}
      ],
      'trials_count': 10,
      'status': 'started',
      'provider_status': 'provider specific'
    }
  }
}
Status
preprocess - search is preprocessing data for training
started - search is in progress
completed - search is completed
interrupted - search was interrupted
error - search was finished with error
Examples
ctx = Context()
# run_id=None selects the last training run of the experiment
leaderboard = A2MLExperiment(ctx, 'auger, azure').leaderboard(run_id=None)
for provider in ['auger', 'azure']:
    if leaderboard[provider]['result'] is True:
        for entry in iter(leaderboard[provider]['data']['leaderboard']):
            ctx.log(entry['model id'])
        ctx.log('status %s' % leaderboard[provider]['data']['status'])
    else:
        ctx.log('error %s' % leaderboard[provider]['data'])
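A common next step after reading the leaderboard is to pick the model id to deploy. Note that the documented entries use the key 'model id' (with a space) and carry the score as a string, so it has to be converted before comparing. The sketch below is illustrative: `best_model_id` is a hypothetical helper, the rmse values are made up, and whether a lower score is better depends on the metric in use.

```python
def best_model_id(leaderboard, metric="rmse", lower_is_better=True):
    """Return the 'model id' of the best leaderboard entry for the given metric."""
    pick = min if lower_is_better else max
    # Scores are documented as strings, so convert before comparing.
    best = pick(leaderboard, key=lambda entry: float(entry[metric]))
    return best["model id"]


# Entries shaped like the documented leaderboard; rmse values are made up.
sample_leaderboard = [
    {"model id": "A017AC8EAD094FD", "rmse": "0.1250", "algorithm": "LGBMRegressor"},
    {"model id": "4602AFCEEEAE413", "rmse": "0.0980", "algorithm": "ExtraTreesRegressor"},
]
chosen = best_model_id(sample_leaderboard)
print(chosen)
```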
- history()
The history of the currently running or previously completed experiment/s.
Note
You will need to use the iter function to access the history elements.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'history': <object> } } }
Examples
ctx = Context()
history = A2MLExperiment(ctx, 'auger, azure').history()
for provider in ['auger', 'azure']:
    if history[provider]['result'] is True:
        for run in iter(history[provider]['data']['history']):
            ctx.log("run id: {}, status: {}".format(run.get('id'), run.get('status')))
    else:
        ctx.log('error %s' % history[provider]['data'])
a2ml_model module
- class a2ml.api.a2ml_model.A2MLModel(ctx, provider=None)
Contains the model operations that interact with provider.
- __init__(ctx, provider=None)
Initializes a new a2ml model.
- Parameters
ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider from methods.
- Returns
A2MLModel object
Examples
ctx = Context()
model = A2MLModel(ctx)
- deploy(model_id, locally=False, review=False, provider=None, name=None, algorithm=None, score=None, data_path=None, metadata=None)
Deploy a model locally or to specified provider(s).
- Parameters
model_id (str) -- Model ID from any experiment leaderboard.
locally (bool) -- Deploys using a local model if True, on the Provider Cloud if False.
review (bool) -- Whether the model should support review based on actual data. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
name (str) -- Friendly name for the model. Used as name for Review Endpoint
algorithm (str) -- Monitored model(external provider) algorithm name.
score (float) -- Monitored model(external provider) score.
data_path (str) -- Data path used to fit the model on deploy. Returns a new deployed model id.
metadata (dict) -- Additional parameters for the model. Used for the accuracy report (report parameter)
- Returns
{ 'result': True, 'data': {'model_id': 'A017AC8EAD094FD'} }
Examples
ctx = Context()
model = A2MLModel(ctx).deploy(model_id='D881079E1ED14FB', name='FirstExperiment')

ctx = Context()
model = A2MLModel(ctx, 'external')
result = model.deploy(model_id=None, name="My external model.", algorithm='RandomForest', score=0.75)
model_id = result['data']['model_id']
- predict(model_id, filename=None, data=None, columns=None, predicted_at=None, threshold=None, score=False, score_true_data=None, output=None, no_features_in_result=None, locally=False, provider=None, predict_labels=None)
Predict results with new data against deployed model. Predictions are stored next to the file with data to be predicted on. The file name will be appended with suffix _predicted.
Note
Use deployed model_id
This method supports only one provider
- Parameters
model_id (str) -- The deployed model id you want to use.
filename (str) -- The file with data to request predictions for.
data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method
columns (list) -- list of column names if data is array of records
predicted_at -- Predict data date. Use for review of historical data.
threshold (float) -- For classification models only. This will return class probabilities with response.
score (bool) -- Calculate scores for predicted results.
score_true_data (str, pandas.DataFrame, dict) -- Data with true values to calculate scores. If omitted, the target column from filename is used for true values.
output (str) -- Output csv file path.
no_features_in_result (bool) -- Do not return feature columns in prediction result. False by default
locally (bool, str) -- Predicts using a local model with auger.ai.predict if True, on the Provider Cloud if False. If set to "docker", a docker image is used to run the model
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.
predict_labels (dict, bool) -- Run ActiveLearn to select data for labelling
- Returns
if filename is not None.
{ 'result': True, 'data': {'predicted': 'dataset_predicted.csv'} }
if filename is None and data is not None and columns is None.
{ 'result': True, 'data': {'predicted': [{col1: value1, col2: value2, target: predicted_value1}, {col1: value3, col2: value4, target: predicted_value2}]} }
if filename is None and data is not None and columns is not None.
{ 'result': True, 'data': {'predicted': {'columns': ['col1', 'col2', target], 'data': [['value1', 'value2', 1], ['value3', 'value4', 0]]}} }
Examples
ctx = Context()
rv = A2MLModel(ctx).predict(model_id, '../irises.csv')
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']

ctx = Context()
data = [{'col1': 'value1', 'col2': 'value2'}, {'col1': 'value3', 'col2': 'value4'}]
rv = A2MLModel(ctx).predict(model_id, data=data)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']

ctx = Context()
data = [['value1', 'value2'], ['value3', 'value4']]
columns = ['col1', 'col2']
rv = A2MLModel(ctx).predict(model_id, data=data, columns=columns)
# if rv[provider].result is True
# predictions are returned as rv[provider]['data']['predicted']

# Predict locally without config files. The model is downloaded automatically if it does not exist locally.
# To use local predict, install a2ml[predict]
ctx = Context()
ctx.config.set('name', 'project name')
ctx.credentials = "Json string from a2ml ui settings"
rv = A2MLModel(ctx).predict(model_id, '../irises.csv', no_features_in_result=True, locally=True)
# if rv[provider].result is True
# predictions are stored in rv[provider]['data']['predicted']
- actuals(model_id, filename=None, data=None, columns=None, actuals_at=None, actual_date_column=None, experiment_params=None, locally=False, provider=None)
Submits actual results(ground truths) for predictions of a deployed model. This is used to review and monitor active models.
Note
It is assumed you have predictions against this model first.
| predicted (or target): predicted value. If omitted, predict is called automatically | actual | baseline_target: predicted value for the baseline model (OPTIONAL) |
| Iris-setosa | Iris-setosa | Iris-setosa |
| Iris-virginica | Iris-virginica | Iris-virginica |
The data may also contain training features, used to predict (if target is omitted), to retrain the model during Review, and for the distribution chart.
This method supports only one provider
- Parameters
model_id (str) -- The deployed model id you want to use.
filename (str) -- The file with data to request predictions for.
data -- array of records [[target, actual]] or Pandas DataFrame (target, actual) or dict created with Pandas DataFrame to_dict('list') method
columns (list) -- list of column names if data is array of records
actuals_at -- Actuals date. Use for review of historical data.
actual_date_column (str) -- name of column in data which contains actual date
experiment_params (dict) --
parameters to calculate experiment metrics
start_date(date): experiment actuals start date end_date(date): experiment actuals end date date_col(str): column name with date
locally (bool) -- Process actuals locally.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider set in constructor or config.
- Returns
{ 'result': True, 'data': True }
Errors.
{ 'result': False, 'data': 'Actual Prediction IDs not found in model predictions.' }
Examples
ctx = Context()
A2MLModel(ctx).actuals('D881079E1ED14FB', filename='<path_to_file>/actuals.csv')

ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
# target holds the name of the target column
columns = [target, 'actual']
A2MLModel(ctx).actuals('D881079E1ED14FB', data=actual_records, columns=columns)

ctx = Context()
actual_records = [['predicted_value_1', 'actual_value_1'], ['predicted_value_2', 'actual_value_2']]
# target holds the name of the target column
columns = [target, 'actual']
A2MLModel(ctx, "external").actuals('external_model_id', data=actual_records, columns=columns)
- review_alert(model_id, parameters=None, locally=False, provider=None, name=None)
Update Review parameters.
- Parameters
model_id (str) -- The deployed model id you want to use.
parameters (dict) --
If None, review section from config will be used.
active (True/False): Activate/Deactivate Review Alert
type (model_accuracy/feature_average_range/runtime_errors_burst)
model_accuracy: Decrease in Model Accuracy: the model accuracy threshold allowed before trigger is initiated. Default threshold: 0.7. Default sensitivity: 72
feature_average_range: Feature Average Out-Of-Range: Trigger an alert if average feature value during time period goes beyond the standard deviation range calculated during training period by the specified number of times or more. Default threshold: 1. Default sensitivity: 168
runtime_errors_burst: Burst Of Runtime Errors: Trigger an alert if runtime error count exceeds threshold. Default threshold: 5. Default sensitivity: 1
threshold (float)
sensitivity (int): The amount of time(in hours) this metric must be at or below the threshold to trigger the alert.
threshold_policy (all_values/average_value/any_value)
all_values: Default value. Trigger an alert when all values in sensitivity below threshold
average_value: Trigger an alert when average of values in sensitivity below threshold
any_value: Trigger an alert when any value in sensitivity below threshold
action (no/retrain/retrain_deploy)
no: no action should be executed
retrain: Use new predictions and actuals as test set to retrain the model.
retrain_deploy: Deploy retrained model and make it active model of this endpoint.
notification (no/user/organization): Send message via selected notification channel.
locally (bool) -- Process review locally.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
name (str) -- Friendly name for the model. Used as name for Review Endpoint
- Returns
{ 'result': True, }
Examples
ctx = Context()
model = A2MLModel(ctx).review_alert(model_id='D881079E1ED14FB')
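The parameters dictionary has several interdependent fields, and the default threshold and sensitivity differ per alert type. A small builder can keep those defaults in one place. This is a sketch only: `review_alert_parameters` is a hypothetical helper, the key names are taken from the parameter list above, and the 'no' defaults for action and notification are this sketch's choice rather than documented defaults.

```python
# Default threshold and sensitivity (hours) per alert type, as documented above.
ALERT_DEFAULTS = {
    "model_accuracy": {"threshold": 0.7, "sensitivity": 72},
    "feature_average_range": {"threshold": 1, "sensitivity": 168},
    "runtime_errors_burst": {"threshold": 5, "sensitivity": 1},
}


def review_alert_parameters(alert_type, action="no", notification="no",
                            threshold_policy="all_values", **overrides):
    """Build a parameters dict for review_alert (hypothetical helper)."""
    if alert_type not in ALERT_DEFAULTS:
        raise ValueError("unknown alert type: %r" % alert_type)
    params = {
        "active": True,
        "type": alert_type,
        "threshold_policy": threshold_policy,
        "action": action,
        "notification": notification,
    }
    # Start from the documented per-type defaults, then apply caller overrides.
    params.update(ALERT_DEFAULTS[alert_type])
    params.update(overrides)
    return params


parameters = review_alert_parameters("model_accuracy", action="retrain",
                                     threshold=0.8)
print(parameters)
```

The result could be passed as the parameters argument; when parameters is None, the review section from the config is used instead.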
- review(model_id, locally=False, provider=None)
Review information about deployed model.
- Parameters
model_id (str) -- The deployed model id you want to use.
locally (bool) -- Process review locally.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
- Returns
status (str): May be started, error, completed, or retrain.
error (str): Description of the error if status='error'.
accuracy (float): Average accuracy of the model (based on the metric used) for the review sensitivity period (see config.yml).
{ 'result': True, 'data': {'status': 'completed', 'error': '', 'accuracy': 0.76} }
Examples
ctx = Context()
result = A2MLModel(ctx).review(model_id='D881079E1ED14FB')
- undeploy(model_id, locally=False, provider=None)
Undeploy a model locally or from specified provider(s).
- Parameters
model_id (str) -- Model ID from any experiment leaderboard.
locally (bool) -- Undeploys the local model if True, from the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
- Returns
{ 'result': True, 'data': {'model_id': 'A017AC8EAD094FD'} }
Examples
ctx = Context()
model = A2MLModel(ctx).undeploy(model_id='D881079E1ED14FB', locally=True)
- delete_actuals(model_id, with_predictions=False, begin_date=None, end_date=None, locally=False, provider=None)¶
Delete files with actuals and predictions locally or from specified provider(s).
- Parameters
model_id (str) -- Model ID to delete actuals and predictions.
with_predictions (bool) --
begin_date -- Date to begin delete operations
end_date -- Date to end delete operations
locally (bool) -- Delete files from local model if True, on the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
- Returns
{ 'result': True, 'data': None }
Examples
ctx = Context()
A2MLModel(ctx).delete_actuals(model_id='D881079E1ED14FB')
- get_info(model_id, locally=False, provider=None)¶
Get information about model
- Parameters
model_id (str) -- Model ID to get information about.
locally (bool) -- Get information from local model if True, on the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
- Returns
{ 'result': True, 'data': {} #detailed model information }
Examples
ctx = Context()
res = A2MLModel(ctx).get_info('D881079E1ED14FB')
- update(model_id, metadata, locally=False, provider=None)¶
Update model metadata
- Parameters
model_id (str) -- Model ID to update.
metadata (dict) -- Model metadata to update
locally (bool) -- Update the local model if True, on the Provider Cloud if False. The default is False.
provider (str) -- The automl provider you wish to run. For example 'auger'. The default is None - use provider defined by model_id or set in constructor.
- Returns
{ 'result': True, 'data': {} }
Examples
ctx = Context()
res = A2MLModel(ctx).update('D881079E1ED14FB', {'report': {}})
- list(endpoints=False)¶
List all of the models/endpoints for the specified providers.
- Returns
Results for each provider.
{ 'result': True, 'data': { 'projects': <object> } }
Examples
ctx = Context()
models_list = A2MLModel(ctx).list()
a2ml_project module¶
- class a2ml.api.a2ml_project.A2MLProject(ctx, provider=None)¶
Contains the project CRUD operations that interact with provider.
- __init__(ctx, provider=None)¶
Initializes a new a2ml project.
- Parameters
ctx (object) -- An instance of the a2ml Context.
provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.
- Returns
A2MLProject object
Examples
ctx = Context()
project = A2MLProject(ctx, 'auger, azure')
- list()¶
List all of the projects for the specified providers.
Note
You will need to use the iter function to access the project elements.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'projects': <object> } } }
Examples
ctx = Context()
project_list = A2MLProject(ctx, 'auger, azure').list()
for provider in ['auger', 'azure']:
    if project_list[provider].result is True:
        for project in iter(project_list[provider].data.projects):
            ctx.log(project.get('name'))
    else:
        ctx.log('error %s' % project_list[provider].data)
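Every call in this module returns one entry per provider, each with a result flag and a data payload. The sketch below shows one way to separate successes from failures, assuming the response is a plain dict keyed by provider name as in the Returns blocks; `split_by_outcome` is a hypothetical helper and the sample payload is made up for illustration.

```python
def split_by_outcome(results):
    """Hypothetical helper: separate providers that succeeded from those that failed."""
    succeeded, failed = {}, {}
    for provider, response in results.items():
        if response.get("result") is True:
            succeeded[provider] = response.get("data")
        else:
            # On failure, 'data' holds the error description.
            failed[provider] = response.get("data")
    return succeeded, failed


# Made-up sample response with one success and one failure.
results = {
    "auger": {"result": True, "data": {"projects": [{"name": "demo"}]}},
    "azure": {"result": False, "data": "some error message"},
}
ok, errors = split_by_outcome(results)
for provider, data in ok.items():
    for project in data["projects"]:
        print(provider, project.get("name"))
for provider, message in errors.items():
    print("error %s: %s" % (provider, message))
```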
- create(name)¶
Creates a project for the specified providers.
- Parameters
name (str) -- name of project. If None - use project name from config.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'created': 'project_name' } }
Examples
ctx = Context()
project_list = A2MLProject(ctx, 'auger, azure').create('new_project_name')
- delete(name)¶
Deletes a project for the specified providers.
- Parameters
name (str) -- name of project. If None - use project name from config.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'deleted': 'existing_project_name' } }
Examples
ctx = Context()
project_list = A2MLProject(ctx, 'auger, azure').delete('existing_project_name')
- select(name)¶
Sets a Project name in the context.
- Parameters
name (str) -- name of project. If None - use project name from config.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': { 'selected': 'fortunetest' } } }
Examples
ctx = Context()
A2MLProject(ctx, 'auger, azure').select(project_name)
- get_cluster_config(name, local_config=True)¶
Get project cluster configuration for the specified providers.
- Parameters
name (str) -- name of project. If None - use project name from config.
local_config (bool) -- If True, return cluster parameters from local config, otherwise from remote cluster
- Returns
Results for each provider.
{
  'auger': {
    'result': True,
    'data': { 'type': 'standard', 'min_nodes': 2, 'max_nodes': 2, 'stack_version': 'stable' }
  },
  'azure': {
    'result': True,
    'data': { 'region': 'eastus2', 'min_nodes': 0, 'max_nodes': 2, 'type': 'STANDARD_D2_V2', 'name': 'a2ml-azure', 'idle_seconds_before_scaledown': 120 }
  }
}
Examples
ctx = Context()
cluster_config = A2MLProject(ctx, 'auger, azure').get_cluster_config(name=None)
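The cluster configuration comes back per provider, and the keys under data differ between providers, although both examples above expose min_nodes and max_nodes. A minimal sketch of reading those two fields, assuming the response is a plain dict as in the Returns example; `node_range` is a hypothetical helper, not part of a2ml.

```python
def node_range(cluster_configs):
    """Hypothetical helper: map each provider to its (min_nodes, max_nodes) pair."""
    ranges = {}
    for provider, response in cluster_configs.items():
        if response.get("result") is not True:
            continue  # skip providers that reported an error
        data = response["data"]
        ranges[provider] = (data["min_nodes"], data["max_nodes"])
    return ranges


# Sample response copied from the Returns example.
cluster_configs = {
    "auger": {
        "result": True,
        "data": {"type": "standard", "min_nodes": 2, "max_nodes": 2,
                 "stack_version": "stable"},
    },
    "azure": {
        "result": True,
        "data": {"region": "eastus2", "min_nodes": 0, "max_nodes": 2,
                 "type": "STANDARD_D2_V2", "name": "a2ml-azure",
                 "idle_seconds_before_scaledown": 120},
    },
}
print(node_range(cluster_configs))
```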
- update_cluster_config(name, params)¶
Update project cluster configuration for the specified providers.
- Parameters
name (str) -- name of project. If None - use project name from config.
params (dict) -- cluster parameters to update.
- Returns
Results for each provider.
{ 'auger': { 'result': True, 'data': None }, 'azure': { 'result': True, 'data': None } }
Examples
ctx = Context()
params = {'max_nodes': 4}
cluster_config = A2MLProject(ctx, 'auger, azure').update_cluster_config(name=None, params=params)
Submodules¶
context module¶
- class a2ml.api.utils.context.Context(name='auger', path=None, debug=False)¶
The Context class provides an environment to run A2ML
- __init__(name='auger', path=None, debug=False)¶
Initializes the Context instance
- Parameters
name (str) -- The name of the config file. Default is 'auger'
path (str) -- The path to your config file. If the config file is in the root directory leave as None.
debug (bool) -- True | False. Default is False.
- Returns
Context object
- Return type
object
Example
ctx = Context()
- get_providers(provider=None)¶
Gets the list of providers to run.
- Parameters
provider (str) -- The automl provider(s) you wish to run. For example 'auger,azure,google'. The default is None - use provider set in config.
- Returns
['azure', 'auger']
- Return type
list[str]
Examples
ctx = Context()
ctx.get_providers()
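Provider arguments throughout this API are comma-separated strings, written both as 'auger,azure,google' and as 'auger, azure'. The sketch below shows one way such a string could be normalized into a list. It is an illustration only: `parse_providers` is a hypothetical helper, and the fallback to the provider set in config when the argument is None is not modeled here.

```python
def parse_providers(provider):
    """Hypothetical helper: split 'auger, azure' into ['auger', 'azure']."""
    if not provider:
        return []  # the config fallback is outside the scope of this sketch
    # Split on commas and drop surrounding whitespace and empty entries.
    return [name.strip() for name in provider.split(",") if name.strip()]


print(parse_providers("auger, azure"))
print(parse_providers("auger,azure,google"))
```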