Created April 2, 2018 18:06
XGBoost Algorithm with AWS SageMaker
"cells": [
"cell_type": "markdown",
"metadata": {
"collapsed": true
"source": [
"## Author : Theophilus Siameh\n",
" Email :\n",
" \n",
"## Algorithm : Regression Model using XGBoost Algorithm with AWS SageMaker"
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"# Define IAM role\n",
"import boto3\n",
"import re\n",
"import sagemaker\n",
"from sagemaker import get_execution_role"
"cell_type": "markdown",
"metadata": {},
"source": [
"## Download files from my github \n",
" \n",
" \n",
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initialization"
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"# Region or your own availability zone\n",
"region = \"us-east-2\"\n",
"# bucket name -> create your own unique bucket name\n",
"bucket_name = \"bike-sagemaker\"\n",
"# bucket path -> \n",
"bucket_path = \"https://s3.{0}{1}\".format(region,bucket_name)\n",
"# training set -> make sure this file is pointing to us-east-2 in the s3 bucket or your own availability zone\n",
"training_file_key = 'bike_train.csv'\n",
"# validation set -> make sure this file is pointing to us-east-2 in the s3 bucket or your own availability zone\n",
"validation_file_key = 'bike_validation.csv'\n",
"# testing set -> make sure this file is pointing to us-east-2 in the s3 bucket or your own availability zone\n",
"test_file_key = 'bike_test.csv'\n"
"cell_type": "markdown",
"metadata": {},
"source": [
"## Upload Data to S3"
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"s3_model_output_location = bucket_path +'/' + 'bike/model'\n",
"s3_training_file_location = bucket_path + '/' + training_file_key\n",
"s3_validation_file_location = bucket_path +'/' + validation_file_key\n",
"s3_test_file_location = bucket_path + '/' + test_file_key"
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"source": [
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"# Upload files to s3 bucket with boto3 api or you upload manually to s3 bucket via the console.\n",
"def write_to_s3(filename, bucket, key):\n",
" with open(filename,'rb') as f: # Read in binary mode\n",
" print(\"Uploading files to s3 bucket ...\")\n",
" return boto3.Session().resource('s3').Bucket(bucket).Object(key).upload_fileobj(f)"
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"Uploading files to s3 bucket ...\n",
"Uploading files to s3 bucket ...\n",
"Uploading files to s3 bucket ...\n"
"source": [
"cell_type": "markdown",
"metadata": {},
"source": [
"## Training Algorithm Docker Image\n",
"### AWS Maintains a separate image for every region and algorithm"
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"# Registry Path for algorithms provided by SageMaker\n",
"containers = {\n",
" 'us-west-2': '',\n",
" 'us-east-1': '',\n",
" 'us-east-2': '',\n",
" 'eu-west-1': ''\n",
" }"
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"role = get_execution_role()"
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"source": [
"# This role contains the permissions needed to train, deploy models\n",
"# SageMaker Service is trusted to assume this role\n",
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build Model"
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"Region name : us-east-2\n",
"Docker Container Region :\n"
"source": [
"sess = sagemaker.Session()\n",
"print(\"Region name : {}\".format(sess.boto_region_name))\n",
"print(\"Docker Container Region : {}\".format(containers[boto3.Session().region_name]))"
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"# Access appropriate algorithm container image\n",
"# Specify how many instances to use for distributed training and what type of machine to use\n",
"# Finally, specify where the trained model artifacts needs to be stored\n",
"# Reference:\n",
"# Optionally, give a name to the training job using base_job_name\n",
"estimator = sagemaker.estimator.Estimator(containers[boto3.Session().region_name],\n",
" role, \n",
" train_instance_count = 1, \n",
" train_instance_type = 'ml.m4.xlarge',\n",
" output_path = s3_model_output_location,\n",
" sagemaker_session = sess,\n",
" base_job_name = 'xgboost-biketrain-v1')"
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"# Specify hyper parameters that appropriate for the training algorithm\n",
"# XGBoost Training Parameter Reference: \n",
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
"{'eta': 0.1,\n",
" 'max_depth': 6,\n",
" 'num_round': 200,\n",
" 'objective': 'reg:linear',\n",
" 'subsample': 0.7}"
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "markdown",
"metadata": {
"collapsed": true
"source": [
"### Specify Training Data Location and Optionally, Validation Data Location"
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"# content type can be libsvm or csv for XGBoost\n",
"training_input_config = sagemaker.session.\\\n",
" s3_input(s3_data=s3_training_file_location,content_type=\"csv\",distribution='FullyReplicated')\n",
"validation_input_config = sagemaker.session.\\\n",
" s3_input(s3_data=s3_validation_file_location,content_type=\"csv\",distribution='FullyReplicated')"
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"{'DataSource': {'S3DataSource': {'S3DataDistributionType': 'FullyReplicated', 'S3DataType': 'S3Prefix', 'S3Uri': ''}}, 'ContentType': 'csv'}\n"
"source": [
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
"name": "stdout",
"output_type": "stream",
"text": [
"{'DataSource': {'S3DataSource': {'S3DataDistributionType': 'FullyReplicated', 'S3DataType': 'S3Prefix', 'S3Uri': ''}}, 'ContentType': 'csv'}\n"
"source": [
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
"outputs": [],
"source": []
"cell_type": "markdown",
"metadata": {},
"source": [
"### Train the model - takes 5-6mins"
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:sagemaker:Creating training-job with name: xgboost-biketrain-v1-2018-04-02-17-48-12-139\n"
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[31mArguments: train\u001b[0m\n",
"\u001b[31m[2018-04-02:17:53:56:INFO] Running standalone xgboost training.\u001b[0m\n",
"\u001b[31m[2018-04-02:17:53:56:INFO] File size need to be processed in the node: 0.65mb. Available memory size in the node: 8673.54mb\u001b[0m\n",
"\u001b[31m/opt/amazon/lib/python2.7/site-packages/sage_xgboost/ ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support sep=None with delim_whitespace=False; you can avoid this warning by specifying engine='python'.\n",
" df = pd.read_csv(os.path.join(files_path, csv_file), sep=None, header=None)\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 68 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 84 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 84 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 86 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 102 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 86 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 92 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 106 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:56] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 106 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 108 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 106 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 98 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 94 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 102 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 108 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 90 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 100 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 92 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 72 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 80 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 90 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 100 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 98 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 106 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 100 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 102 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 108 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 100 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 108 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 76 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 98 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 108 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 98 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 104 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 108 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 90 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 68 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 106 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 104 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 96 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 104 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 104 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 98 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 106 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 108 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 106 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 108 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 102 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 92 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 104 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 104 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 120 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 122 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 114 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 98 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 82 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 118 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 124 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 116 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 126 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 110 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"\u001b[31m[17:53:57] src/tree/ tree pruning end, 1 roots, 112 extra nodes, 0 pruned nodes, max_depth=6\u001b[0m\n",
"name": "stdout",
"output_type": "stream",
"text": [
"===== Job Complete =====\n"
"source": [
"# XGBoost supports \"train\", \"validation\" channels\n",
"# Reference: Supported channels by algorithm\n",
"{'train':training_input_config, 'validation':validation_input_config})"
"cell_type": "markdown",
"metadata": {},
"source": [
"train-rmse: 0.163079 ,\n",
"validation-rmse: 0.281315"
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deploy Model"
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
"name": "stderr",
"output_type": "stream",
"text": [
"INFO:sagemaker:Creating model with name: xgboost-2018-04-02-17-56-16-881\n",
"INFO:sagemaker:Creating endpoint with name xgboost-biketrain-v1\n"
"name": "stdout",
"output_type": "stream",
"text": [
"source": [
"# Ref:\n",
"predictor = estimator.deploy(initial_instance_count = 1,\n",
" instance_type = 'ml.m4.xlarge',\n",
" endpoint_name = 'xgboost-biketrain-v1')"
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
"outputs": [],
"source": []
"cell_type": "markdown",
"metadata": {
"collapsed": true
"source": [
"## Run Predictions"
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": true
"outputs": [],
"source": [
"from sagemaker.predictor import csv_serializer, json_deserializer\n",
"predictor.content_type = 'text/csv'\n",
"predictor.serializer = csv_serializer\n",
"predictor.deserializer = None"
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
"data": {
"text/plain": [
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
"source": [
"cell_type": "markdown",
"metadata": {
"collapsed": true
"source": [
"## Summary"
"cell_type": "markdown",
"metadata": {},
"source": [
"1. Ensure Training, Test and Validation data are in S3 Bucket\n",
"2. Select Algorithm Container Registry Path - Path varies by region\n",
"3. Configure Estimator for training - Specify Algorithm container, instance count, instance type, model output location\n",
"4. Specify algorithm specific hyper parameters\n",
"5. Train model\n",
"6. Deploy model - Specify instance count, instance type and endpoint name\n",
"7. Run Predictions"
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
"outputs": [],
"source": []
"metadata": {
"kernelspec": {
"display_name": "conda_python3",
"language": "python",
"name": "conda_python3"
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.2"
"nbformat": 4,
"nbformat_minor": 2