{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"# Building Azure ML Pipelines using the Azure Machine Learning SDK"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"The Azure Machine Learning SDK allows data scientists and AI developers to interact with the Azure Machine Learning services within any Python environment. This provides many benefits such as managing datasets, training models using cloud resources, and deploying trained models as web services.\n",
"\n",
"\n",
"In this notebook, you will follow the process of using the Azure ML SDK to build a pipeline for training and modeling.\n",
"\n",
"Note: To execute the code in each cell, click on the cell and press SHIFT + ENTER.\n",
"\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## Log in to Workspace"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"To login to the workspace with the Azure ML Python SDK, you will need to authenticate again with Azure. When you run this cell for the first time, you are prompted to authenticate with Azure by clicking on a link and inputting a security code into a web page.\n",
"\n",
"This block of code imports the azureml.core package which is used for interacting with Azure Machine Learning:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"gather": {
"logged": 1598448890874
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Ready to use Azure ML 1.48.0 to work with ml-lab-ebsdek7c3q3ja\n"
]
}
],
"source": [
"import azureml.core\n",
"from azureml.core import Workspace\n",
"\n",
"# Load the workspace from the saved config file\n",
"ws = Workspace.from_config()\n",
"print('Ready to use Azure ML {} to work with {}'.format(azureml.core.VERSION, ws.name))\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## Default Datastore\n",
"Datastores enable Azure ML users to connect data in almost any Azure Storage service to their Azure ML Workspace. The datastore becomes an abstraction layer for connecting to the various types of Azure storage.\n",
"This lab uses the default datastore attached to a storage account created by default when provisioning the Azure ML Workspace:\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"gather": {
"logged": 1598448893726
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The default datastore has been saved to a variable.\n"
]
}
],
"source": [
"# Set Default Datastore\n",
"datastore = ws.get_default_datastore()\n",
"\n",
"print('The default datastore has been saved to a variable.')"
]
},
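{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Only the default datastore is needed for this lab, but other Azure Storage accounts can be attached in the same way. The sketch below registers a blob container as an additional datastore; the container, account name, and key are hypothetical placeholders:\n",
"\n",
"```\n",
"from azureml.core import Datastore\n",
"\n",
"# Register an additional blob container as a named datastore (illustrative values only)\n",
"blob_datastore = Datastore.register_azure_blob_container(\n",
"    workspace=ws,\n",
"    datastore_name='iris_blob_datastore',\n",
"    container_name='my-container',\n",
"    account_name='mystorageaccount',\n",
"    account_key='<storage-account-key>')\n",
"```\n"
]
},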
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## Select Compute\n",
"A compute cluster has already been created at the beginning of this lab. This cluster will be used for processing the tasks during each pipeline step:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"gather": {
"logged": 1598448896596
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Found compute cluster!\n"
]
}
],
"source": [
"# Select the compute cluster target\n",
"from azureml.core.compute import ComputeTarget\n",
"\n",
"cpu_cluster = ComputeTarget(workspace=ws, name='automl-compute')\n",
"\n",
"print(\"Found compute cluster!\")\n"
]
},
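{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The cluster already exists in this lab, so the lookup above is all that is required. For reference, a cluster could be created on demand when the lookup fails; this sketch assumes a small CPU SKU and is not needed here:\n",
"\n",
"```\n",
"from azureml.core.compute import AmlCompute, ComputeTarget\n",
"from azureml.core.compute_target import ComputeTargetException\n",
"\n",
"cluster_name = 'automl-compute'\n",
"try:\n",
"    cpu_cluster = ComputeTarget(workspace=ws, name=cluster_name)\n",
"    print('Found existing compute cluster.')\n",
"except ComputeTargetException:\n",
"    # Illustrative VM size and node count; adjust to the quota in your subscription\n",
"    config = AmlCompute.provisioning_configuration(vm_size='STANDARD_DS2_V2', max_nodes=2)\n",
"    cpu_cluster = ComputeTarget.create(ws, cluster_name, config)\n",
"    cpu_cluster.wait_for_completion(show_output=True)\n",
"```\n"
]
},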
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## Building an Azure ML Pipeline\n",
"\n",
"Azure ML Pipelines split up machine learning workflows into different steps. This workflow allows multiple users to collaborate on a single machine learning workflow by making changes to just one step. It can also save costs by using cheaper computing resources for different steps.\n",
"\n",
"In this example, you will break up the traditional workflow for training a model and build a pipeline with the following steps for training an Iris classification model:\n",
"\n",
"- Ingest Iris data from a URL\n",
"- Preprocess Iris data and split into test and training samples\n",
"- Train the model using the preprocessed data\n",
"- Evaluate the model and determine the accuracy\n",
"- Deploy the model as a web service\n",
"\n",
"Splitting up the machine learning workflow into different pipeline steps allows the workflow to scale with more massive datasets during the model's lifecycle. It also allows for multiple members of a team to manage separate parts of the workflow.\n",
"\n",
"\n"
]
},
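{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Once the individual steps are defined, they are assembled into a single `Pipeline` object and submitted as an experiment run. The full version of this code appears at the end of the lab; the outline below is only a preview of how the pieces fit together:\n",
"\n",
"```\n",
"from azureml.pipeline.core import Pipeline\n",
"from azureml.core import Experiment\n",
"\n",
"# The step objects are created later in this notebook\n",
"pipeline = Pipeline(workspace=ws, steps=[ingestion_step, preprocess_step, train_step, evaluate_step, deploy_step])\n",
"pipeline_run = Experiment(ws, 'iris_pipeline').submit(pipeline)\n",
"```\n"
]
},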
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating the Source Directories\n",
"Each ML pipeline step will have it's own Python script to execute to perform the desired actions. The location for each script file and any files it depends on is called a source directory. It's best practice to use separate folders for each source directory because a snapshot is taken of the source directory for each step. Using a different source directory for each pipeline step reduces the size of each snapshot. Any changes made to the files in each step's source directory can trigger a re-upload of the snapshot, causing that step to be rerun. \n",
"\n",
"The source directory folder structure will look like this:\n",
"```\n",
"data_dependency_run_ingest\n",
" └ ingest.py\n",
"data_dependency_run_preprocess\n",
" └ preprocess.py\n",
"data_dependency_run_train\n",
" └ train.py\n",
"data_dependency_run_evaluate\n",
" └ evaluate.py\n",
"data_dependency_run_deploy\n",
" └ score.py\n",
" └ deploy.py\n",
"```\n",
"\n",
"Run the cell block below to create the directories:\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"gather": {
"logged": 1598448902096
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The source directories have been created.\n"
]
}
],
"source": [
"import os\n",
"\n",
"# Create the source directory for each pipeline step\n",
"source_directory_ingest = 'data_dependency_run_ingest'\n",
"source_directory_preprocess = 'data_dependency_run_preprocess'\n",
"source_directory_train = 'data_dependency_run_train'\n",
"source_directory_evaluate = 'data_dependency_run_evaluate'\n",
"source_directory_deploy = 'data_dependency_run_deploy'\n",
"\n",
"\n",
"\n",
"if not os.path.exists(source_directory_ingest):\n",
" os.makedirs(source_directory_ingest)\n",
"if not os.path.exists(source_directory_preprocess):\n",
" os.makedirs(source_directory_preprocess)\n",
"if not os.path.exists(source_directory_train ):\n",
" os.makedirs(source_directory_train)\n",
"if not os.path.exists(source_directory_evaluate):\n",
" os.makedirs(source_directory_evaluate)\n",
"if not os.path.exists(source_directory_deploy):\n",
" os.makedirs(source_directory_deploy)\n",
" \n",
"print('The source directories have been created.')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating Scripts in the Source Directories\n",
"\n",
"Each pipeline step's scripts will need to be created and placed in their respective source directory folder. Read the summary of each script and run the cell block to make each script.\n",
"\n",
"Each script contains arguments that are used to pass in the directory information between each step. "
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"#### Ingest Script\n",
"\n",
"The ingestion step takes input for a URL and a directory to store the data. It downloads the data from the URL and saves it to a folder on the datastore:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing data_dependency_run_ingest/ingest.py\n"
]
}
],
"source": [
"%%writefile $source_directory_ingest/ingest.py\n",
"\n",
"import os\n",
"import urllib.request\n",
"import argparse\n",
"\n",
"# Define arguments\n",
"parser = argparse.ArgumentParser(description='Iris Data Ingestion')\n",
"parser.add_argument('--iris_data_dir', type=str, help='Directory to store Iris Data')\n",
"parser.add_argument('--urls', type=str, help='Data URL to ingest')\n",
"args = parser.parse_args()\n",
"\n",
"\n",
"\n",
"# Get arguments from parser\n",
"iris_data_dir = args.iris_data_dir\n",
"urls = args.urls\n",
"\n",
"\n",
"if not os.path.exists(iris_data_dir):\n",
" os.makedirs(iris_data_dir)\n",
"\n",
"\n",
"# Download data from URL\n",
"print(\"Downloading data from URL Arguments\")\n",
"urllib.request.urlretrieve(urls, \"{}/iris.csv\".format(iris_data_dir))\n",
"\n",
"\n",
"\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"#### Preprocess Script\n",
"\n",
"After the data is ingested and stored on a directory on the datastore, the preprocess step takes the iris data from the previous step and splits the data into separate train and test sets. This is the typical pattern for training a machine learning model. The train and test sets are then stored in separate folders on the datastore:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing data_dependency_run_preprocess/preprocess.py\n"
]
}
],
"source": [
"%%writefile $source_directory_preprocess/preprocess.py\n",
"\n",
"import pandas as pd\n",
"from sklearn.model_selection import train_test_split\n",
"import glob\n",
"import os\n",
"import argparse\n",
"import pickle\n",
"\n",
"# Define arguments\n",
"parser = argparse.ArgumentParser(description='Preprocessing')\n",
"parser.add_argument('--train_dir', type=str, help='Directory to output the processed training data')\n",
"parser.add_argument('--iris_data_dir', type=str, help='Directory to store iris data')\n",
"parser.add_argument('--test_dir', type=str, help='Directory to output the processed test data')\n",
"\n",
"\n",
"args = parser.parse_args()\n",
"\n",
"# Get arguments from parser\n",
"iris_data_dir = args.iris_data_dir\n",
"train_dir = args.train_dir\n",
"test_dir = args.test_dir\n",
"\n",
"\n",
"\n",
"# Process data and split into train and test models\n",
"path = iris_data_dir\n",
"all_files = glob.glob(os.path.join(path, \"*.csv\"))\n",
"\n",
"names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']\n",
"dataset = pd.concat((pd.read_csv(f, names=names) for f in all_files))\n",
"\n",
"array = dataset.values\n",
"X = array[:,0:4]\n",
"y = array[:,4]\n",
"X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.20, random_state=1)\n",
"\n",
"# Make train and test directories if they don't exist\n",
"if not os.path.exists(train_dir):\n",
" os.makedirs(train_dir)\n",
"\n",
"if not os.path.exists(test_dir):\n",
" os.makedirs(test_dir)\n",
"\n",
"# Output processed data to their respective folders\n",
"with open(test_dir + '/X_test.sav', 'wb') as f:\n",
" pickle.dump(X_test, f)\n",
"with open(test_dir + '/Y_test.sav', 'wb') as f:\n",
" pickle.dump(Y_test, f)\n",
"with open(train_dir + '/X_train.sav', 'wb') as f:\n",
" pickle.dump(X_train, f)\n",
"with open(train_dir + '/Y_train.sav', 'wb') as f:\n",
" pickle.dump(Y_train, f)\n",
" \n",
" "
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"#### Train Script\n",
"\n",
"The training step takes the preprocessed data and trains the model to fit the dataset. The training script saves the model to a directory:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing data_dependency_run_train/train.py\n"
]
}
],
"source": [
"%%writefile $source_directory_train/train.py\n",
"\n",
"import os\n",
"from sklearn.svm import SVC\n",
"import pickle\n",
"import argparse\n",
"\n",
"\n",
"# Define arguments\n",
"parser = argparse.ArgumentParser(description='Train')\n",
"parser.add_argument('--train_dir', type=str, help='Directory to output the processed training data')\n",
"parser.add_argument('--output_dir', type=str, help='Directory to store output raw data')\n",
"\n",
"args = parser.parse_args()\n",
"\n",
"# Get arguments from parser\n",
"output_dir = args.output_dir\n",
"train_dir = args.train_dir\n",
"\n",
"if not os.path.exists(output_dir):\n",
" os.makedirs(output_dir)\n",
"\n",
"\n",
"# load the model from the training directory\n",
"loaded_X_train = pickle.load(open(train_dir + '/X_train.sav', 'rb'))\n",
"loaded_Y_train = pickle.load(open(train_dir + '/Y_train.sav', 'rb'))\n",
"\n",
"# Fit the model with training dataset\n",
"model = SVC(gamma='auto')\n",
"model.fit(loaded_X_train, loaded_Y_train)\n",
"\n",
"# Output model to directory\n",
"with open(output_dir + '/model.pt', 'wb') as f:\n",
" pickle.dump(model, f)\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"#### Evaluation Script\n",
"\n",
"The next step is to evaluate the model and determine accuracy. The evaluation step tests the model against the test data, and an accuracy score is determined. The script then outputs the accuracy to a file:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing data_dependency_run_evaluate/evaluate.py\n"
]
}
],
"source": [
"%%writefile $source_directory_evaluate/evaluate.py\n",
"\n",
"from sklearn.metrics import classification_report\n",
"from sklearn.metrics import confusion_matrix\n",
"from sklearn.metrics import accuracy_score\n",
"from sklearn.svm import SVC\n",
"import pickle\n",
"import argparse\n",
"\n",
"# Define arguments\n",
"parser = argparse.ArgumentParser(description='Evaluate')\n",
"parser.add_argument('--model_dir', type=str, help='Directory of the model')\n",
"parser.add_argument('--test_dir', type=str, help='Directory to output the processed test data')\n",
"parser.add_argument('--accuracy_dir', type=str, help='Directory to store output raw data')\n",
"\n",
"args = parser.parse_args()\n",
"\n",
"# Get arguments from parser\n",
"model_dir = args.model_dir\n",
"test_dir = args.test_dir\n",
"accuracy_dir = args.accuracy_dir\n",
"\n",
"# load the model and test datasets from their directories\n",
"loaded_model = pickle.load(open(model_dir + '/model.pt', 'rb'))\n",
"loaded_validx = pickle.load(open(test_dir + '/X_test.sav', 'rb'))\n",
"loaded_validy = pickle.load(open(test_dir + '/Y_test.sav', 'rb'))\n",
"\n",
"\n",
"# Evaluate predictions and output to file\n",
"predictions = loaded_model.predict(loaded_validx)\n",
"print(accuracy_score(loaded_validy, predictions))\n",
"accuracy = accuracy_score(loaded_validy, predictions)\n",
"\n",
"if not os.path.exists(accuracy_dir):\n",
" os.makedirs(accuracy_dir)\n",
"\n",
"with open(accuracy_dir + '/accuracy_file', 'wb') as f:\n",
" pickle.dump(accuracy, f)\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"#### Deploy Script\n",
"\n",
"The deploy step takes the newly trained model and deploys it as a web service endpoint:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing data_dependency_run_deploy/deploy.py\n"
]
}
],
"source": [
"%%writefile $source_directory_deploy/deploy.py\n",
"\n",
"import os\n",
"from sklearn.metrics import accuracy_score\n",
"\n",
"import pickle\n",
"from azureml.core.webservice import Webservice\n",
"from azureml.core.model import InferenceConfig\n",
"from azureml.core.environment import Environment\n",
"from azureml.core import Workspace\n",
"from azureml.core.model import Model\n",
"from azureml.core.run import Run\n",
"from azureml.core.conda_dependencies import CondaDependencies\n",
"from azureml.core.webservice import AciWebservice\n",
"from azureml.exceptions import WebserviceException\n",
"import argparse\n",
"\n",
"# Create function for registering the model\n",
"def register_model(output_dir, model_name, accuracy, test_dir, workspace):\n",
" '''\n",
" Registers a new model\n",
" '''\n",
" model = Model.register(\n",
" model_path = model_dir + '/model.pt',\n",
" model_name = 'iris-classification-pipeline',\n",
" tags = {\n",
" 'accuracy': accuracy, \n",
" 'test_data': test_dir\n",
" },\n",
" description='Object recognition classifier',\n",
" workspace=workspace)\n",
" return model\n",
"\n",
"# Define arguments\n",
"parser = argparse.ArgumentParser(description='Deploy arg parser')\n",
"parser.add_argument('--test_dir', type=str, help='Directory where testing data is stored')\n",
"parser.add_argument('--model_dir', type=str, help='File storing the evaluation accuracy')\n",
"parser.add_argument('--accuracy_dir', type=str, help='File storing the evaluation accuracy')\n",
"\n",
"args = parser.parse_args()\n",
"\n",
"# Get run context\n",
"run = Run.get_context()\n",
"workspace = run.experiment.workspace\n",
"\n",
"\n",
"# Get arguments from parser\n",
"test_dir = args.test_dir\n",
"accuracy_dir = args.accuracy_dir\n",
"model_dir = args.model_dir\n",
"\n",
"if not os.path.exists(model_dir):\n",
" os.makedirs(model_dir)\n",
"\n",
"\n",
"# Get environment install required packages\n",
"env = Environment('iris-env')\n",
"\n",
"# Register environment to re-use later\n",
"env.register(workspace = workspace)\n",
"\n",
"# Define model and service names\n",
"service_name = 'iris-classification-service'\n",
"model_name = 'iris-classification-pipeline'\n",
"\n",
"\n",
"\n",
"# Read Accuracy\n",
"accuracy = pickle.load(open(accuracy_dir + '/accuracy_file', 'rb'))\n",
"\n",
"# Set up Environment\n",
"myenv = Environment.get(workspace=workspace, name=\"iris-env\", version=\"1\")\n",
"cd = CondaDependencies.create(pip_packages=['azureml-dataprep[pandas,fuse]>=1.1.14', 'azureml-defaults==1.38.0', 'Jinja2<3.1'], conda_packages = ['scikit-learn==0.24.2'])\n",
"myenv.python.conda_dependencies = cd\n",
"\n",
"# Register model if accuracy is higher or if test dataset has changed\n",
"new_model = False\n",
"try:\n",
" model = Model(workspace, model_name)\n",
" prev_accuracy = model.tags['accuracy']\n",
" prev_test_dir = model.tags['test_data']\n",
" if prev_test_dir != test_dir or prev_accuracy >= accuracy:\n",
" model = register_model(model_dir, model_name, accuracy, test_dir, workspace)\n",
" new_model = True\n",
"except WebserviceException:\n",
" print('Model does not exist yet')\n",
" model = register_model(model_dir, model_name, accuracy, test_dir, workspace)\n",
" new_model = True\n",
"\n",
"# Deploy new webservice if new model was registered\n",
"if new_model:\n",
" # Create inference config\n",
" inference_config = InferenceConfig(entry_script=\"score.py\", environment=myenv)\n",
"\n",
" # Deploy model\n",
" aci_config = AciWebservice.deploy_configuration(\n",
" cpu_cores = 2, \n",
" memory_gb = 4, \n",
" tags = {'model': 'iris', 'method': 'sklearn'}, \n",
" description='Iris classifier')\n",
"\n",
" try:\n",
" service = Webservice(workspace, name=service_name)\n",
" if service:\n",
" service.delete()\n",
" except WebserviceException as e:\n",
" print()\n",
"\n",
" service = Model.deploy(workspace, service_name, [model], inference_config, aci_config)\n",
" service.wait_for_deployment(True)\n",
"else:\n",
" service = Webservice(workspace, name=service_name)\n",
"\n",
"# Output scoring url to file\n",
"print(service.scoring_uri)\n",
"with open(model_dir + '/scoring_uri.txt', 'w+') as f:\n",
" f.write(service.scoring_uri)\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"#### Score Script\n",
"\n",
"The scoring script runs when deploying the web service. The `init` method contains the logic for retrieving the registered model. The `run` method contains logic which gets invoked when calling the web service. The example takes the model and performs a prediction against the data that gets sent to the web service endpoint:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Writing data_dependency_run_deploy/score.py\n"
]
}
],
"source": [
"%%writefile $source_directory_deploy/score.py\n",
"import json\n",
"import numpy as np\n",
"import os\n",
"import pickle\n",
"from sklearn.svm import SVC\n",
"\n",
"\n",
"def init():\n",
" global model\n",
" # Get registered model\n",
" model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'model.pt')\n",
" model = pickle.load(open(model_path, 'rb'))\n",
"\n",
"def run(raw_data):\n",
" data = np.array(json.loads(raw_data)['data'])\n",
" # make prediction\n",
" prediction = model.predict([data])\n",
" # Output prediction\n",
" return prediction.tolist()\n"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"### Passing Data Between Pipeline Steps\n",
"\n",
"A pipeline can take input and output data. This data can already exist from a dataset or be output data from a previous pipeline step called a PipelineData object.\n",
"\n",
"The first step in the pipeline will be the ingestion step, which downloads the Iris CSV dataset and stores it in a directory on the default datastore. The Iris CSV directory location needs to be passed on to the preprocessing step so it can perform its tasks. \n",
"\n",
"The default datastore also needs to be referenced in the PipelineData object using a data reference. This reference is a pointer to the datastore path and is used during a run.\n",
"\n",
"Create a PipelineData object for the Iris data directory:\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"gather": {
"logged": 1598448929360
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The iris data PipelineObject has been created.\n"
]
}
],
"source": [
"\n",
"from azureml.pipeline.core import PipelineData\n",
"from azureml.data.data_reference import DataReference\n",
"\n",
"# Get datastore reference\n",
"datastore_reference = DataReference(datastore, mode='mount')\n",
"\n",
"# Create Pipeline Data\n",
"iris_data_dir = PipelineData(\n",
" name='iris_data_dir', \n",
" pipeline_output_name='iris_data_dir',\n",
" datastore=datastore_reference.datastore,\n",
" output_mode='mount',\n",
" is_directory=True)\n",
"\n",
"print('The iris data PipelineObject has been created.')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"Each pipeline step will need to pass data between them. Define the additional PipelineData objects for the rest of the pipeline workflow:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"gather": {
"logged": 1598448931456
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The remaining PipelineObjects have been created.\n"
]
}
],
"source": [
"# Create Pipeline Data for remaining steps\n",
"train_dir = PipelineData(\n",
" name='train_dir', \n",
" pipeline_output_name='train_dir',\n",
" datastore=datastore_reference.datastore,\n",
" output_mode='mount',\n",
" is_directory=True)\n",
"\n",
"output_dir = PipelineData(\n",
" name='output_dir', \n",
" pipeline_output_name='outputdir',\n",
" datastore=datastore_reference.datastore,\n",
" output_mode='mount',\n",
" is_directory=True)\n",
"\n",
"accuracy_dir = PipelineData(\n",
" name='accuracy_dir', \n",
" pipeline_output_name='accuracydir',\n",
" datastore=datastore_reference.datastore,\n",
" output_mode='mount',\n",
" is_directory=True)\n",
"\n",
"model_dir = PipelineData(\n",
" name='model_dir', \n",
" pipeline_output_name='modeldir',\n",
" datastore=datastore_reference.datastore,\n",
" output_mode='mount',\n",
" is_directory=True)\n",
"\n",
"test_dir = PipelineData(\n",
" name='test_dir', \n",
" pipeline_output_name='test_dir',\n",
" datastore=datastore_reference.datastore,\n",
" output_mode='mount',\n",
" is_directory=True)\n",
"\n",
"print('The remaining PipelineObjects have been created.')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"### Set up RunConfiguration\n",
"\n",
"The RunConfiguration object contains the information for submitting a training run in the experiment. For this run, the Conda dependencies require the Scikit-Learn package. This ML package will then be accessible during the experiment run:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"gather": {
"logged": 1598448934441
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Run configuration has been created.\n"
]
}
],
"source": [
"\n",
"from azureml.core.runconfig import RunConfiguration, DockerConfiguration\n",
"from azureml.core.environment import CondaDependencies\n",
"\n",
"\n",
"# Configure the conda dependancies for the Run\n",
"conda_dep = CondaDependencies()\n",
"conda_dep.add_conda_package(\"scikit-learn==0.24.2\")\n",
"conda_dep.add_conda_package(\"pandas==0.25.3\")\n",
"docker_configuration = DockerConfiguration(use_docker=False)\n",
"run_config = RunConfiguration(conda_dependencies=conda_dep)\n",
"run_config.docker = docker_configuration\n",
"\n",
"print('Run configuration has been created.')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Defining the Pipeline Steps\n",
"\n",
"After the PipelineDataObjects have been defined, the pipeline steps can be created. There are many built-in pipeline steps available in the Azure ML SDK. For a list of more steps, check out the [pipeline step documentation](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-steps/azureml.pipeline.steps?view=azure-ml-py). For now, the PythonScriptStep is used to execute our python scripts. \n",
"\n",
"The PythonScriptStep consists of the name of the script to run and any arguments to pass through. The source directory is also defined in the PythonScriptStep. This source directory is the local directory created earlier in the lab and is where the `ingest.py` file is located.\n",
"\n",
"A step in the pipeline can take input data and create output data. In this case, the ingestion step is taking input from the default datastore and creating output for the Iris data directory. Then the iris data directory is passed into the preprocessing step as an input. This linking of inputs and outputs creates an implicit dependency and automatically tells Azure ML which order to run the steps. You could use the [run_after](https://docs.microsoft.com/en-us/python/api/azureml-pipeline-core/azureml.pipeline.core.builder.pipelinestep?view=azure-ml-py#run-after-step-) construct to declare the order of the steps, but since there is already a data dependency between the steps, this is not necessary:\n"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"gather": {
"logged": 1598448938671
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The ingestion and preprocess pipelines have been created.\n"
]
}
],
"source": [
"import os\n",
"from azureml.pipeline.steps import PythonScriptStep\n",
"\n",
"# The URL for the Iris data that will be ingested in the first step of the pipeline\n",
"url = \"https://gist.githubusercontent.com/cristofima/b4deb0c8435d919d769f0a9a57f740a0/raw/37b1fa4c5ffb2dd4b23d9f958f35de96fc57e71c/iris.csv\"\n",
" \n",
"\n",
"# Pipeline Steps\n",
"ingestion_step = PythonScriptStep(\n",
" script_name='ingest.py',\n",
" arguments=['--iris_data_dir', iris_data_dir, '--urls', url],\n",
" inputs=[datastore_reference],\n",
" outputs=[iris_data_dir],\n",
" compute_target=cpu_cluster,\n",
" source_directory=source_directory_ingest,\n",
" runconfig=run_config,\n",
" allow_reuse=True\n",
")\n",
"\n",
"\n",
"preprocess_step = PythonScriptStep(\n",
" script_name='preprocess.py',\n",
" arguments=['--iris_data_dir', iris_data_dir, '--train_dir', train_dir,'--test_dir', test_dir],\n",
" inputs=[iris_data_dir],\n",
" outputs=[train_dir, test_dir],\n",
" compute_target=cpu_cluster,\n",
" source_directory=source_directory_preprocess,\n",
" runconfig=run_config,\n",
" allow_reuse=True\n",
")\n",
"\n",
"print('The ingestion and preprocess pipelines have been created.')"
]
},
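{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Because `iris_data_dir` is an output of the ingestion step and an input of the preprocess step, Azure ML already knows to run ingestion first. If there were no data dependency, the ordering could be declared explicitly with `run_after`; this is a sketch only and is not needed in this lab:\n",
"\n",
"```\n",
"# Explicit ordering between steps (illustrative; the data dependency above already implies it)\n",
"preprocess_step.run_after(ingestion_step)\n",
"```\n"
]
},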
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"Add another step for the training step of the pipeline:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"gather": {
"logged": 1598448947802
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The training pipeline has been created.\n"
]
}
],
"source": [
"\n",
"# Create training pipeline step\n",
"\n",
"train_step = PythonScriptStep(\n",
" script_name='train.py',\n",
" arguments=['--train_dir', train_dir, '--output_dir', model_dir],\n",
" inputs=[train_dir],\n",
" outputs=[model_dir],\n",
" compute_target=cpu_cluster,\n",
" source_directory=source_directory_train,\n",
" runconfig=run_config,\n",
" allow_reuse=False\n",
")\n",
"print('The training pipeline has been created.')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"Create the remaining pipeline steps for evaluating and deploying the model:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"gather": {
"logged": 1598448950375
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The evaluate and deploy pipelines have been created.\n"
]
}
],
"source": [
"# Create the evaluate and deploy pipeline steps\n",
"\n",
"evaluate_step = PythonScriptStep(\n",
" script_name='evaluate.py',\n",
" arguments=['--model_dir', model_dir,'--test_dir', test_dir, '--accuracy_dir', accuracy_dir],\n",
" inputs=[test_dir,model_dir],\n",
" outputs=[accuracy_dir],\n",
" compute_target=cpu_cluster,\n",
" source_directory=source_directory_evaluate,\n",
" runconfig=run_config,\n",
" allow_reuse=True\n",
")\n",
"\n",
"deploy_step = PythonScriptStep(\n",
" script_name='deploy.py',\n",
" arguments=['--model_dir', model_dir, '--accuracy_dir', accuracy_dir,'--test_dir', test_dir],\n",
" inputs=[test_dir,accuracy_dir,model_dir],\n",
" outputs=[output_dir],\n",
" compute_target=cpu_cluster,\n",
" source_directory=source_directory_deploy,\n",
" runconfig=run_config,\n",
" allow_reuse=True\n",
")\n",
"\n",
"print('The evaluate and deploy pipelines have been created.')"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### Run Pipeline\n",
"\n",
"Submit the pipeline to initiate the run; this may take up to 30 minutes for the pipeline to complete:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"gather": {
"logged": 1598448960909
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Submitting pipeline ...\n",
"Created step ingest.py [d7f43506][3e0d8b8a-22cd-4dfc-971f-b6fe3fd331f8], (This step will run and generate new outputs)\n",
"Created step preprocess.py [386388a2][1b26cd03-7093-4ff8-838b-d26e4695e69e], (This step will run and generate new outputs)\n",
"Created step train.py [56b8a801][50e42c88-cc93-47c1-844b-f23c927c7bd9], (This step will run and generate new outputs)\n",
"Created step evaluate.py [13df7a86][549ea38e-2eac-4e35-b563-81ce5f85bae4], (This step will run and generate new outputs)Created step deploy.py [dab736a3][50420c95-2e2d-47b4-a997-2c3648e08a69], (This step will run and generate new outputs)\n",
"\n",
"Created data reference workspaceblobstore for StepId [f2f3d2ec][a0118a6d-6410-4934-b13b-ef9955c482b3], (Consumers of this data will generate new runs.)\n",
"Submitted PipelineRun 8ea7880b-6a28-48da-a924-1161bc3c2d85\n",
"Link to Azure Machine Learning Portal: https://ml.azure.com/runs/8ea7880b-6a28-48da-a924-1161bc3c2d85?wsid=/subscriptions/c460cd3f-7c2a-48cc-9f5d-b62d6083ec23/resourcegroups/cal-3326-b84/workspaces/ml-lab-ebsdek7c3q3ja&tid=fd1fbf9f-991a-40b4-ae26-61dfc34421ef\n"
]
}
],
"source": [
"from azureml.pipeline.core import Pipeline\n",
"from azureml.core import Experiment\n",
"\n",
"\n",
"# Submit the pipeline\n",
"print('Submitting pipeline ...')\n",
"pipeline = Pipeline(workspace=ws, steps=[ingestion_step, preprocess_step, train_step, evaluate_step, deploy_step])\n",
"pipeline_run = Experiment(ws, 'iris_pipeline').submit(pipeline)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"To view the progress of the pipeline, either click the link to the Azure Machine Learning Portal generated by the code cell above or run the azureml widget to view the pipeline's status run through the Jupyter Notebook:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"gather": {
"logged": 1598448969342
}
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "94830899eeb34b9b888a6224a311e1c7",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"_PipelineWidget(child_runs=[{'run_id': 'faab47cd-1474-4198-85eb-c3134c1670cf', 'name': 'ingest.py', 'status': …"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/aml.mini.widget.v1": "{\"status\": \"Completed\", \"workbench_run_details_uri\": \"https://ml.azure.com/runs/8ea7880b-6a28-48da-a924-1161bc3c2d85?wsid=/subscriptions/c460cd3f-7c2a-48cc-9f5d-b62d6083ec23/resourcegroups/cal-3326-b84/workspaces/ml-lab-ebsdek7c3q3ja&tid=fd1fbf9f-991a-40b4-ae26-61dfc34421ef\", \"run_id\": \"8ea7880b-6a28-48da-a924-1161bc3c2d85\", \"run_properties\": {\"run_id\": \"8ea7880b-6a28-48da-a924-1161bc3c2d85\", \"created_utc\": \"2023-02-12T02:15:46.240659Z\", \"properties\": {\"azureml.runsource\": \"azureml.PipelineRun\", \"runSource\": \"SDK\", \"runType\": \"SDK\", \"azureml.parameters\": \"{}\", \"azureml.continue_on_step_failure\": \"False\", \"azureml.continue_on_failed_optional_input\": \"True\", \"azureml.pipelineComponent\": \"pipelinerun\", \"azureml.pipelines.stages\": \"{\\\"Initialization\\\":null,\\\"Execution\\\":{\\\"StartTime\\\":\\\"2023-02-12T02:15:47.5900367+00:00\\\",\\\"EndTime\\\":\\\"2023-02-12T02:42:56.5013822+00:00\\\",\\\"Status\\\":\\\"Finished\\\"}}\"}, \"tags\": {}, \"end_time_utc\": \"2023-02-12T02:42:56.595556Z\", \"status\": \"Completed\", \"log_files\": {\"logs/azureml/executionlogs.txt\": \"https://mllabebsdek7c3q3ja.blob.core.windows.net/azureml/ExperimentRun/dcid.8ea7880b-6a28-48da-a924-1161bc3c2d85/logs/azureml/executionlogs.txt?sv=2019-07-07&sr=b&sig=%2FthQn2eNzG55weLOhX8jw%2BEAXcAKa%2Fvt1xr%2FHJnNTI0%3D&skoid=7cba5741-f756-4347-a2b4-25cc14700d45&sktid=fd1fbf9f-991a-40b4-ae26-61dfc34421ef&skt=2023-02-12T02%3A05%3A47Z&ske=2023-02-13T10%3A15%3A47Z&sks=b&skv=2019-07-07&st=2023-02-12T02%3A32%3A07Z&se=2023-02-12T10%3A42%3A07Z&sp=r\", \"logs/azureml/stderrlogs.txt\": \"https://mllabebsdek7c3q3ja.blob.core.windows.net/azureml/ExperimentRun/dcid.8ea7880b-6a28-48da-a924-1161bc3c2d85/logs/azureml/stderrlogs.txt?sv=2019-07-07&sr=b&sig=HQhLppZgiwu%2FPZ0fFpP%2BUuMRV8GFpt%2Flb3L0oYbZV4w%3D&skoid=7cba5741-f756-4347-a2b4-25cc14700d45&sktid=fd1fbf9f-991a-40b4-ae26-61dfc34421ef&skt=2023-02-12T02%3A05%3A47Z&ske=2023-02-13T10%3A15%3A47Z&sks=b&skv=2019-07-07&st=2023-02-12T02%3A32%3A07Z&se=2023-02-12T10%3A42%3A07Z&sp=r\", \"logs/azureml/stdoutlogs.txt\": \"https://mllabebsdek7c3q3ja.blob.core.windows.net/azureml/ExperimentRun/dcid.8ea7880b-6a28-48da-a924-1161bc3c2d85/logs/azureml/stdoutlogs.txt?sv=2019-07-07&sr=b&sig=3oOnrwkgIB1suiOXZiXOfHta%2FnyBgJG3%2Fk5Y9HdBMFM%3D&skoid=7cba5741-f756-4347-a2b4-25cc14700d45&sktid=fd1fbf9f-991a-40b4-ae26-61dfc34421ef&skt=2023-02-12T02%3A05%3A47Z&ske=2023-02-13T10%3A15%3A47Z&sks=b&skv=2019-07-07&st=2023-02-12T02%3A32%3A07Z&se=2023-02-12T10%3A42%3A07Z&sp=r\"}, \"log_groups\": [[\"logs/azureml/executionlogs.txt\", \"logs/azureml/stderrlogs.txt\", \"logs/azureml/stdoutlogs.txt\"]], \"run_duration\": \"0:27:10\", \"run_number\": \"1676168146\", \"run_queued_details\": {\"status\": \"Finished\", \"details\": null}}, \"child_runs\": [{\"run_id\": \"faab47cd-1474-4198-85eb-c3134c1670cf\", \"name\": \"ingest.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:28:49.869814Z\", \"created_time\": \"2023-02-12T02:15:49.035147Z\", \"end_time\": \"2023-02-12T02:29:38.870936Z\", \"duration\": \"0:13:49\", \"run_number\": 1676168149, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:15:49.035147Z\", \"is_reused\": \"\"}, {\"run_id\": \"240a13fe-9999-4422-9b99-fe64ee0a44f8\", \"name\": \"preprocess.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:29:47.411241Z\", \"created_time\": \"2023-02-12T02:29:41.159257Z\", \"end_time\": 
\"2023-02-12T02:29:59.301424Z\", \"duration\": \"0:00:18\", \"run_number\": 1676168981, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:29:41.159257Z\", \"is_reused\": \"\"}, {\"run_id\": \"40187763-0a2f-4c56-aa36-a4ca8d2017fa\", \"name\": \"train.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:30:06.557178Z\", \"created_time\": \"2023-02-12T02:30:01.118239Z\", \"end_time\": \"2023-02-12T02:30:17.734579Z\", \"duration\": \"0:00:16\", \"run_number\": 1676169001, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:30:01.118239Z\", \"is_reused\": \"\"}, {\"run_id\": \"f0442241-7a3d-4044-a3a8-b83761f9fd56\", \"name\": \"evaluate.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:30:22.71625Z\", \"created_time\": \"2023-02-12T02:30:19.086704Z\", \"end_time\": \"2023-02-12T02:30:34.491374Z\", \"duration\": \"0:00:15\", \"run_number\": 1676169019, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:30:19.086704Z\", \"is_reused\": \"\"}, {\"run_id\": \"7201c4c7-dd8a-4110-863a-0629ae921692\", \"name\": \"deploy.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:30:42.895747Z\", \"created_time\": \"2023-02-12T02:30:36.446227Z\", \"end_time\": \"2023-02-12T02:42:55.458042Z\", \"duration\": \"0:12:19\", \"run_number\": 1676169036, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:30:36.446227Z\", \"is_reused\": \"\"}], \"children_metrics\": {\"categories\": null, \"series\": null, \"metricName\": null}, \"run_metrics\": [], \"run_logs\": \"[2023-02-12 02:15:47Z] Submitting 1 runs, first five are: d7f43506:faab47cd-1474-4198-85eb-c3134c1670cf\\n[2023-02-12 02:29:39Z] Completing processing run id faab47cd-1474-4198-85eb-c3134c1670cf.\\n[2023-02-12 02:29:40Z] Submitting 1 runs, first five are: 386388a2:240a13fe-9999-4422-9b99-fe64ee0a44f8\\n[2023-02-12 02:29:59Z] Completing processing run id 240a13fe-9999-4422-9b99-fe64ee0a44f8.\\n[2023-02-12 02:30:00Z] Submitting 1 runs, first five are: 56b8a801:40187763-0a2f-4c56-aa36-a4ca8d2017fa\\n[2023-02-12 02:30:18Z] Completing processing run id 40187763-0a2f-4c56-aa36-a4ca8d2017fa.\\n[2023-02-12 02:30:18Z] Submitting 1 runs, first five are: 13df7a86:f0442241-7a3d-4044-a3a8-b83761f9fd56\\n[2023-02-12 02:30:35Z] Completing processing run id f0442241-7a3d-4044-a3a8-b83761f9fd56.\\n[2023-02-12 02:30:35Z] Submitting 1 runs, first five are: dab736a3:7201c4c7-dd8a-4110-863a-0629ae921692\\n[2023-02-12 02:42:56Z] Completing processing run id 7201c4c7-dd8a-4110-863a-0629ae921692.\\n\\nRun is completed.\", \"graph\": {\"datasource_nodes\": {\"f2f3d2ec\": {\"node_id\": \"f2f3d2ec\", \"name\": \"workspaceblobstore\"}}, \"module_nodes\": {\"d7f43506\": {\"node_id\": \"d7f43506\", \"name\": \"ingest.py\", \"status\": \"Finished\", \"_is_reused\": false, \"run_id\": \"faab47cd-1474-4198-85eb-c3134c1670cf\"}, \"386388a2\": {\"node_id\": \"386388a2\", \"name\": \"preprocess.py\", \"status\": \"Finished\", \"_is_reused\": false, \"run_id\": \"240a13fe-9999-4422-9b99-fe64ee0a44f8\"}, \"56b8a801\": {\"node_id\": \"56b8a801\", \"name\": \"train.py\", \"status\": \"Finished\", \"_is_reused\": false, \"run_id\": \"40187763-0a2f-4c56-aa36-a4ca8d2017fa\"}, \"13df7a86\": {\"node_id\": \"13df7a86\", \"name\": \"evaluate.py\", \"status\": \"Finished\", \"_is_reused\": false, 
\"run_id\": \"f0442241-7a3d-4044-a3a8-b83761f9fd56\"}, \"dab736a3\": {\"node_id\": \"dab736a3\", \"name\": \"deploy.py\", \"status\": \"Finished\", \"_is_reused\": false, \"run_id\": \"7201c4c7-dd8a-4110-863a-0629ae921692\"}}, \"edges\": [{\"source_node_id\": \"f2f3d2ec\", \"source_node_name\": \"workspaceblobstore\", \"source_name\": \"data\", \"target_name\": \"workspaceblobstore\", \"dst_node_id\": \"d7f43506\", \"dst_node_name\": \"ingest.py\"}, {\"source_node_id\": \"d7f43506\", \"source_node_name\": \"ingest.py\", \"source_name\": \"iris_data_dir\", \"target_name\": \"iris_data_dir\", \"dst_node_id\": \"386388a2\", \"dst_node_name\": \"preprocess.py\"}, {\"source_node_id\": \"386388a2\", \"source_node_name\": \"preprocess.py\", \"source_name\": \"train_dir\", \"target_name\": \"train_dir\", \"dst_node_id\": \"56b8a801\", \"dst_node_name\": \"train.py\"}, {\"source_node_id\": \"386388a2\", \"source_node_name\": \"preprocess.py\", \"source_name\": \"train_dir\", \"target_name\": \"test_dir\", \"dst_node_id\": \"13df7a86\", \"dst_node_name\": \"evaluate.py\"}, {\"source_node_id\": \"56b8a801\", \"source_node_name\": \"train.py\", \"source_name\": \"model_dir\", \"target_name\": \"test_dir\", \"dst_node_id\": \"13df7a86\", \"dst_node_name\": \"evaluate.py\"}, {\"source_node_id\": \"386388a2\", \"source_node_name\": \"preprocess.py\", \"source_name\": \"train_dir\", \"target_name\": \"test_dir\", \"dst_node_id\": \"dab736a3\", \"dst_node_name\": \"deploy.py\"}, {\"source_node_id\": \"13df7a86\", \"source_node_name\": \"evaluate.py\", \"source_name\": \"accuracy_dir\", \"target_name\": \"test_dir\", \"dst_node_id\": \"dab736a3\", \"dst_node_name\": \"deploy.py\"}, {\"source_node_id\": \"56b8a801\", \"source_node_name\": \"train.py\", \"source_name\": \"model_dir\", \"target_name\": \"test_dir\", \"dst_node_id\": \"dab736a3\", \"dst_node_name\": \"deploy.py\"}], \"child_runs\": [{\"run_id\": \"faab47cd-1474-4198-85eb-c3134c1670cf\", \"name\": \"ingest.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:28:49.869814Z\", \"created_time\": \"2023-02-12T02:15:49.035147Z\", \"end_time\": \"2023-02-12T02:29:38.870936Z\", \"duration\": \"0:13:49\", \"run_number\": 1676168149, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:15:49.035147Z\", \"is_reused\": \"\"}, {\"run_id\": \"240a13fe-9999-4422-9b99-fe64ee0a44f8\", \"name\": \"preprocess.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:29:47.411241Z\", \"created_time\": \"2023-02-12T02:29:41.159257Z\", \"end_time\": \"2023-02-12T02:29:59.301424Z\", \"duration\": \"0:00:18\", \"run_number\": 1676168981, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:29:41.159257Z\", \"is_reused\": \"\"}, {\"run_id\": \"40187763-0a2f-4c56-aa36-a4ca8d2017fa\", \"name\": \"train.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:30:06.557178Z\", \"created_time\": \"2023-02-12T02:30:01.118239Z\", \"end_time\": \"2023-02-12T02:30:17.734579Z\", \"duration\": \"0:00:16\", \"run_number\": 1676169001, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:30:01.118239Z\", \"is_reused\": \"\"}, {\"run_id\": \"f0442241-7a3d-4044-a3a8-b83761f9fd56\", \"name\": \"evaluate.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:30:22.71625Z\", \"created_time\": \"2023-02-12T02:30:19.086704Z\", \"end_time\": 
\"2023-02-12T02:30:34.491374Z\", \"duration\": \"0:00:15\", \"run_number\": 1676169019, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:30:19.086704Z\", \"is_reused\": \"\"}, {\"run_id\": \"7201c4c7-dd8a-4110-863a-0629ae921692\", \"name\": \"deploy.py\", \"status\": \"Finished\", \"start_time\": \"2023-02-12T02:30:42.895747Z\", \"created_time\": \"2023-02-12T02:30:36.446227Z\", \"end_time\": \"2023-02-12T02:42:55.458042Z\", \"duration\": \"0:12:19\", \"run_number\": 1676169036, \"metric\": null, \"run_type\": \"azureml.StepRun\", \"training_percent\": null, \"created_time_dt\": \"2023-02-12T02:30:36.446227Z\", \"is_reused\": \"\"}]}, \"widget_settings\": {\"childWidgetDisplay\": \"popup\", \"send_telemetry\": false, \"log_level\": \"INFO\", \"sdk_version\": \"1.48.0\"}, \"loading\": false}"
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"# Show run details\n",
"from azureml.widgets import RunDetails\n",
"r = RunDetails(pipeline_run)\n",
"r.get_widget_data()\n",
"r.show()"
]
},
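{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"If you prefer the notebook to block until the run finishes instead of watching the widget, the run object can be waited on directly (optional):\n",
"\n",
"```\n",
"# Block until the pipeline run completes, streaming log output to the notebook\n",
"pipeline_run.wait_for_completion(show_output=True)\n",
"```\n"
]
},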
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"## Test the Endpoint\n",
"\n",
"Once the web service is deployed, it can be tested by sending a series of sepal and petal measurements to the URI. The web service will take the data, run a model prediction against it, and return the classification prediction:"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"gather": {
"logged": 1598387052885
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"b'[\"Iris-virginica\"]'\n"
]
}
],
"source": [
"import urllib.request, urllib.error # urllib.request and urllib.error for Python 3.X\n",
"import json\n",
"from azureml.core.webservice import Webservice\n",
"\n",
"\n",
"# Iris petal and sepal measurements\n",
"rawdata = {\"data\": [\n",
" 6.7, \n",
" 3.0, \n",
" 5.2, \n",
" 2.3\n",
" ]\n",
"}\n",
"\n",
"# Get the URL of the web service\n",
"service = Webservice(workspace=ws, name='iris-classification-service')\n",
"url = service.scoring_uri\n",
"\n",
"# Send data to web service\n",
"body = str.encode(json.dumps(rawdata))\n",
"\n",
"headers = {'Content-Type':'application/json', 'Authorization':('Bearer ')}\n",
"req = urllib.request.Request(url, body, headers)\n",
"\n",
"try:\n",
" response = urllib.request.urlopen(req)\n",
" result = response.read()\n",
" print(result)\n",
"\n",
"except urllib.error.HTTPError as error: \n",
" print(\"The request failed with status code: \" + str(error.code))\n",
" print(error.info())"
]
}
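,
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"The SDK can also invoke the service directly, without building the HTTP request by hand. This sketch assumes the `service` object retrieved in the previous cell and sends the same measurements:\n",
"\n",
"```\n",
"import json\n",
"\n",
"# Call the deployed web service through the SDK instead of raw HTTP\n",
"print(service.run(json.dumps({'data': [6.7, 3.0, 5.2, 2.3]})))\n",
"```\n"
]
}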
],
"metadata": {
"kernel_info": {
"name": "python38-azureml"
},
"kernelspec": {
"display_name": "Python 3.8 - AzureML",
"language": "python",
"name": "python38-azureml"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"nteract": {
"version": "nteract-front-end@1.0.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}