@haje01
Last active May 11, 2019 11:47
Solving Roboschool Reacher with Ray on SageMaker RL
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Training a Roboschool agent with distributed RL across multiple nodes on Amazon SageMaker"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Note: permissions when running from a local notebook\n",
"- A SageMaker notebook instance uses an IAM role configured for SageMaker.\n",
"- A local notebook uses the IAM user configured on the local machine.\n",
"- To run this notebook as an IAM user, that user needs the following policies:\n",
" - AmazonSageMakerFullAccess\n",
" - AmazonEC2ContainerRegistryFullAccess\n",
" - IAMFullAccess"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Choose the Roboschool task"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"roboschool_problem = 'reacher' # easy\n",
"#roboschool_problem = 'hopper'\n",
"#roboschool_problem = 'humanoid'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Prerequisites\n",
"\n",
"### Import external modules"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"import sagemaker\n",
"import boto3\n",
"import sys\n",
"import os\n",
"import glob\n",
"import re\n",
"import subprocess\n",
"from IPython.display import HTML, Markdown\n",
"import time\n",
"from time import gmtime, strftime\n",
"sys.path.append(\"common\")\n",
"from misc import get_execution_role, wait_for_s3_object\n",
"from docker_utils import build_and_push_docker_image\n",
"from sagemaker.rl import RLEstimator, RLToolkit, RLFramework\n",
"from markdown_helper import generate_help_for_s3_endpoint_permissions, create_s3_endpoint_manually"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Set up the S3 bucket\n",
"\n",
"Set up the S3 bucket used for checkpoints and metadata."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
]
}
],
"source": [
"sage_session = sagemaker.session.Session()\n",
"s3_bucket = sage_session.default_bucket() \n",
"s3_output_path = 's3://{}/'.format(s3_bucket)\n",
"print(\"S3 bucket path: {}\".format(s3_output_path))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Define variables\n",
"\n",
"Define the name prefix for the training job and *the image path for the container (needed only for BYOC)*."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'rl-roboschool-distributed-reacher'"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Build a descriptive job name\n",
"job_name_prefix = 'rl-roboschool-distributed-' + roboschool_problem\n",
"aws_region = boto3.Session().region_name\n",
"job_name_prefix"
]
},
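{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Configure where training runs\n",
"\n",
"Training can be launched from a SageMaker notebook instance or from a local notebook. Training on the notebook machine itself is called **local mode**. In local mode, the SageMaker Python SDK runs the code in a local container. Local mode is well suited to early development and debugging. "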
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Whether to use local mode (train on the notebook instance)\n",
"local_mode = False\n",
"\n",
"if local_mode:\n",
" instance_type = 'local'\n",
"else:\n",
" # When training on SageMaker, choose the instance type \n",
" instance_type = \"ml.c5.2xlarge\"\n",
" \n",
"# Number of training instances \n",
"train_instance_count = 3"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Create an IAM role\n",
"\n",
"Get the execution role by calling `role = sagemaker.get_execution_role()` on a SageMaker notebook instance, or the utility function `role = get_execution_role()` from a local notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" role = sagemaker.get_execution_role()\n",
"except:\n",
" role = get_execution_role()\n",
"\n",
"print(\"Using IAM role arn: {}\".format(role))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Install Docker for local mode \n",
"\n",
"Local mode requires the following Docker tooling to be installed:\n",
"- docker\n",
"- docker-compose\n",
"- nvidia-docker (for machines with a local GPU)\n",
"\n",
"On a SageMaker notebook instance, running the script below installs them."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"# Run only on a SageMaker notebook instance\n",
"if local_mode:\n",
" !/bin/bash ./common/setup.sh"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Build the Docker container\n",
"\n",
"Build a Docker container with Roboschool installed, in three steps:\n",
"\n",
"1. Fetch the base container image \n",
"2. Install Roboschool and its dependencies \n",
"3. Push the new container image to ECR \n",
"\n",
"This takes about 3 to 4 minutes."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"WARNING! Using --password via the CLI is insecure. Use --password-stdin.\n",
"Login Succeeded\n",
"Logged into ECR\n",
"Building docker image sagemaker-roboschool-ray-cpu from Dockerfile\n",
"$ docker build -t sagemaker-roboschool-ray-cpu -f Dockerfile . --build-arg CPU_OR_GPU=cpu --build-arg AWS_REGION=ap-northeast-2\n",
"Sending build context to Docker daemon 464.4kB\n",
"Step 1/14 : ARG CPU_OR_GPU\n",
"Step 2/14 : ARG AWS_REGION\n",
"Step 3/14 : FROM 520713654638.dkr.ecr.${AWS_REGION}.amazonaws.com/sagemaker-rl-tensorflow:ray0.5.3-${CPU_OR_GPU}-py3\n",
" ---> b522ab9d6e52\n",
"Step 4/14 : WORKDIR /opt/ml\n",
" ---> Using cache\n",
" ---> 888e0f77a0c8\n",
"Step 5/14 : RUN apt-get update && apt-get install -y git cmake ffmpeg pkg-config qtbase5-dev libqt5opengl5-dev libassimp-dev libtinyxml-dev libgl1-mesa-dev && cd /opt && apt-get clean && rm -rf /var/cache/apt/archives/* /var/lib/apt/lists/*\n",
" ---> Using cache\n",
" ---> 6751e6cb26af\n",
"Step 6/14 : RUN apt-get update && apt-get install -y libboost-python-dev\n",
" ---> Using cache\n",
" ---> 7cdde49abdda\n",
"Step 7/14 : RUN apt-get update && apt-get install -y --no-install-recommends python3.6-dev && ln -s -f /usr/bin/python3.6 /usr/bin/python && apt-get clean && rm -rf /var/cache/apt/archives/* /var/lib/apt/lists/*\n",
" ---> Using cache\n",
" ---> 545c2184394b\n",
"Step 8/14 : RUN curl -fSsL -O https://bootstrap.pypa.io/get-pip.py && python get-pip.py && rm get-pip.py\n",
" ---> Using cache\n",
" ---> 72d3593543de\n",
"Step 9/14 : RUN pip install --upgrade pip setuptools\n",
" ---> Using cache\n",
" ---> 47deb5234f9d\n",
"Step 10/14 : RUN pip install sagemaker-containers --upgrade\n",
" ---> Using cache\n",
" ---> 0f48f773151d\n",
"Step 11/14 : RUN pip install roboschool==1.0.46\n",
" ---> Using cache\n",
" ---> b7dd4076a46c\n",
"Step 12/14 : ENV PYTHONUNBUFFERED 1\n",
" ---> Using cache\n",
" ---> 108f959cf33e\n",
"Step 13/14 : RUN python -c \"import gym;import sagemaker_containers.cli.train;import roboschool; import ray; from sagemaker_containers.cli.train import main\"\n",
" ---> Using cache\n",
" ---> 36f65ce4adf3\n",
"Step 14/14 : WORKDIR /opt/ml/code\n",
" ---> Using cache\n",
" ---> c2d2e27d72fc\n",
"Successfully built c2d2e27d72fc\n",
"Successfully tagged sagemaker-roboschool-ray-cpu:latest\n",
"Done building docker image sagemaker-roboschool-ray-cpu\n",
"ECR repository already exists: sagemaker-roboschool-ray-cpu\n",
"WARNING! Using --password via the CLI is insecure. Use --password-stdin.\n",
"Login Succeeded\n",
"Logged into ECR\n",
"$ docker tag sagemaker-roboschool-ray-cpu 415742736303.dkr.ecr.ap-northeast-2.amazonaws.com/sagemaker-roboschool-ray-cpu\n",
"Pushing docker image to ECR repository 415742736303.dkr.ecr.ap-northeast-2.amazonaws.com/sagemaker-roboschool-ray-cpu\n",
"\n",
"$ docker push 415742736303.dkr.ecr.ap-northeast-2.amazonaws.com/sagemaker-roboschool-ray-cpu\n",
"The push refers to repository [415742736303.dkr.ecr.ap-northeast-2.amazonaws.com/sagemaker-roboschool-ray-cpu]\n",
"c4e8cce86b64: Preparing\n",
"b438630f4a71: Preparing\n",
"83256b64f5df: Preparing\n",
"6cbdea9e59f6: Preparing\n",
"725ba5f36903: Preparing\n",
"65fed3034f32: Preparing\n",
"d1d3b9d7fb4e: Preparing\n",
"3ad59e27f6bf: Preparing\n",
"9702a85a1269: Preparing\n",
"79e4b43308b4: Preparing\n",
"60a4c6264060: Preparing\n",
"6ee8fe30b55b: Preparing\n",
"36cb14e2633d: Preparing\n",
"c9bbcdfaaa98: Preparing\n",
"09b011f6bf18: Preparing\n",
"296294040c48: Preparing\n",
"302af95b9e68: Preparing\n",
"0ecacb4d5bb1: Preparing\n",
"d6db7d05e3b2: Preparing\n",
"8d999119430c: Preparing\n",
"fe4ed9a0e78a: Preparing\n",
"3db5746c911a: Preparing\n",
"819a824caf70: Preparing\n",
"647265b9d8bc: Preparing\n",
"41c002c8a6fd: Preparing\n",
"65fed3034f32: Waiting\n",
"d1d3b9d7fb4e: Waiting\n",
"3ad59e27f6bf: Waiting\n",
"9702a85a1269: Waiting\n",
"79e4b43308b4: Waiting\n",
"60a4c6264060: Waiting\n",
"6ee8fe30b55b: Waiting\n",
"36cb14e2633d: Waiting\n",
"c9bbcdfaaa98: Waiting\n",
"fe4ed9a0e78a: Waiting\n",
"3db5746c911a: Waiting\n",
"819a824caf70: Waiting\n",
"09b011f6bf18: Waiting\n",
"296294040c48: Waiting\n",
"302af95b9e68: Waiting\n",
"0ecacb4d5bb1: Waiting\n",
"d6db7d05e3b2: Waiting\n",
"8d999119430c: Waiting\n",
"647265b9d8bc: Waiting\n",
"41c002c8a6fd: Waiting\n",
"b438630f4a71: Layer already exists\n",
"83256b64f5df: Layer already exists\n",
"725ba5f36903: Layer already exists\n",
"6cbdea9e59f6: Layer already exists\n",
"c4e8cce86b64: Layer already exists\n",
"65fed3034f32: Layer already exists\n",
"d1d3b9d7fb4e: Layer already exists\n",
"3ad59e27f6bf: Layer already exists\n",
"79e4b43308b4: Layer already exists\n",
"9702a85a1269: Layer already exists\n",
"60a4c6264060: Layer already exists\n",
"6ee8fe30b55b: Layer already exists\n",
"c9bbcdfaaa98: Layer already exists\n",
"09b011f6bf18: Layer already exists\n",
"36cb14e2633d: Layer already exists\n",
"296294040c48: Layer already exists\n",
"302af95b9e68: Layer already exists\n",
"0ecacb4d5bb1: Layer already exists\n",
"d6db7d05e3b2: Layer already exists\n",
"8d999119430c: Layer already exists\n",
"41c002c8a6fd: Layer already exists\n",
"819a824caf70: Layer already exists\n",
"fe4ed9a0e78a: Layer already exists\n",
"3db5746c911a: Layer already exists\n",
"647265b9d8bc: Layer already exists\n",
"latest: digest: sha256:f98f6ab259615f76be64336bd34b18ed9ec72cce2dae6eaf488780a9c44d9f6f size: 5573\n",
"Done pushing 415742736303.dkr.ecr.ap-northeast-2.amazonaws.com/sagemaker-roboschool-ray-cpu\n",
"Using ECR image 415742736303.dkr.ecr.ap-northeast-2.amazonaws.com/sagemaker-roboschool-ray-cpu\n",
"CPU times: user 147 ms, sys: 52 ms, total: 199 ms\n",
"Wall time: 3 s\n"
]
}
],
"source": [
"%%time\n",
"cpu_or_gpu = 'gpu' if instance_type.startswith('ml.p') else 'cpu'\n",
"repository_short_name = \"sagemaker-roboschool-ray-%s\" % cpu_or_gpu\n",
"docker_build_args = {\n",
" 'CPU_OR_GPU': cpu_or_gpu, \n",
" 'AWS_REGION': boto3.Session().region_name,\n",
"}\n",
"custom_image_name = build_and_push_docker_image(repository_short_name, build_args=docker_build_args)\n",
"print(\"Using ECR image %s\" % custom_image_name)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Write the training code\n",
"\n",
"The training code is stored in the `train-reacher.py` file in the `src/` directory. "
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\u001b[34mimport\u001b[39;49;00m \u001b[04m\u001b[36mjson\u001b[39;49;00m\n",
"\u001b[34mimport\u001b[39;49;00m \u001b[04m\u001b[36mos\u001b[39;49;00m\n",
"\n",
"\u001b[34mimport\u001b[39;49;00m \u001b[04m\u001b[36mgym\u001b[39;49;00m\n",
"\u001b[34mimport\u001b[39;49;00m \u001b[04m\u001b[36mray\u001b[39;49;00m\n",
"\u001b[34mfrom\u001b[39;49;00m \u001b[04m\u001b[36mray.tune\u001b[39;49;00m \u001b[34mimport\u001b[39;49;00m run_experiments\n",
"\u001b[34mfrom\u001b[39;49;00m \u001b[04m\u001b[36mray.tune.registry\u001b[39;49;00m \u001b[34mimport\u001b[39;49;00m register_env\n",
"\u001b[34mimport\u001b[39;49;00m \u001b[04m\u001b[36mroboschool\u001b[39;49;00m\n",
"\n",
"\u001b[34mfrom\u001b[39;49;00m \u001b[04m\u001b[36msagemaker_rl.ray_launcher\u001b[39;49;00m \u001b[34mimport\u001b[39;49;00m SageMakerRayLauncher\n",
"\n",
"\n",
"\u001b[34mdef\u001b[39;49;00m \u001b[32mcreate_environment\u001b[39;49;00m(env_config):\n",
" \u001b[37m# 워커 프로세스를 위해 이 import는 메소드 내에서 수행되어야 함.\u001b[39;49;00m\n",
" \u001b[34mimport\u001b[39;49;00m \u001b[04m\u001b[36mroboschool\u001b[39;49;00m\n",
" \u001b[34mreturn\u001b[39;49;00m gym.make(\u001b[33m'\u001b[39;49;00m\u001b[33mRoboschoolReacher-v1\u001b[39;49;00m\u001b[33m'\u001b[39;49;00m)\n",
"\n",
"\n",
"\u001b[37m# SageMakerRayLauncher는 Ray-RLLib을 이용한 SageMaker RL 어플리케이션을 위한 기본 클래스.\u001b[39;49;00m\n",
"\u001b[37m# 사용자는 이것을 상속받아 필요한 메소드를 구현하고, train_main()을 호출하여 훈련 프로세스를 시작.\u001b[39;49;00m\n",
"\u001b[34mclass\u001b[39;49;00m \u001b[04m\u001b[32mMyLauncher\u001b[39;49;00m(SageMakerRayLauncher):\n",
"\n",
" \u001b[34mdef\u001b[39;49;00m \u001b[32mregister_env_creator\u001b[39;49;00m(\u001b[36mself\u001b[39;49;00m):\n",
" register_env(\u001b[33m\"\u001b[39;49;00m\u001b[33mRoboschoolReacher-v1\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m, create_environment)\n",
"\n",
" \u001b[34mdef\u001b[39;49;00m \u001b[32mget_experiment_config\u001b[39;49;00m(\u001b[36mself\u001b[39;49;00m):\n",
" \u001b[34mreturn\u001b[39;49;00m {\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mtraining\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: {\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33menv\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[33m\"\u001b[39;49;00m\u001b[33mRoboschoolReacher-v1\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m,\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mrun\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[33m\"\u001b[39;49;00m\u001b[33mPPO\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m,\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mstop\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: {\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mepisode_reward_mean\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[34m18\u001b[39;49;00m,\n",
" },\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mconfig\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: {\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mgamma\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[34m0.995\u001b[39;49;00m,\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mkl_coeff\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[34m1.0\u001b[39;49;00m,\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mnum_sgd_iter\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[34m20\u001b[39;49;00m,\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mlr\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[34m0.0001\u001b[39;49;00m,\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33msgd_minibatch_size\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[34m1000\u001b[39;49;00m,\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mtrain_batch_size\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[34m25000\u001b[39;49;00m,\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mmonitor\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[36mTrue\u001b[39;49;00m, \u001b[37m# 비디오 저장\u001b[39;49;00m\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mmodel\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: {\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mfree_log_std\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[36mTrue\u001b[39;49;00m\n",
" },\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mnum_workers\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: (\u001b[36mself\u001b[39;49;00m.num_cpus-\u001b[34m1\u001b[39;49;00m),\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mnum_gpus\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[36mself\u001b[39;49;00m.num_gpus,\n",
" \u001b[33m\"\u001b[39;49;00m\u001b[33mbatch_mode\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m: \u001b[33m\"\u001b[39;49;00m\u001b[33mcomplete_episodes\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m\n",
" }\n",
" }\n",
" }\n",
"\n",
"\u001b[34mif\u001b[39;49;00m \u001b[31m__name__\u001b[39;49;00m == \u001b[33m\"\u001b[39;49;00m\u001b[33m__main__\u001b[39;49;00m\u001b[33m\"\u001b[39;49;00m:\n",
" MyLauncher().train_main()\n"
]
}
],
"source": [
"!pygmentize src/train-{roboschool_problem}.py"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Ray homogeneous scaling: set train_instance_count greater than 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Homogeneous scaling lets you use multiple instances of the same type. "
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Training job: rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580\n"
]
}
],
"source": [
"# Use Ray's training metrics \n",
"metric_definitions = RLEstimator.default_metric_definitions(RLToolkit.RAY)\n",
" \n",
"# Configure the training job \n",
"estimator = RLEstimator(entry_point=\"train-%s.py\" % roboschool_problem,\n",
" source_dir='src',\n",
" dependencies=[\"common/sagemaker_rl\"],\n",
" image_name=custom_image_name,\n",
" role=role,\n",
" train_instance_type=instance_type,\n",
" train_instance_count=train_instance_count,\n",
" output_path=s3_output_path,\n",
" base_job_name=job_name_prefix,\n",
" metric_definitions=metric_definitions,\n",
" hyperparameters={\n",
" # Arguments for the Ray algorithm can be set here.\n",
" \n",
" # Three ml.c5.2xlarge instances with 8 vCPUs each. One core is reserved for the Ray scheduler.\n",
" \"rl.training.config.num_workers\": (8 * train_instance_count) - 1 \n",
" \n",
" #\"rl.training.config.horizon\": 5000,\n",
" #\"rl.training.config.num_sgd_iter\": 10,\n",
" }\n",
" )\n",
"\n",
"# Start training (local mode: synchronous, remote mode: asynchronous)\n",
"estimator.fit(wait=local_mode)\n",
"job_name = estimator.latest_training_job.job_name\n",
"print(\"Training job: %s\" % job_name)"
]
},
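{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `num_workers` value above hard-codes 8 vCPUs per instance. As a minimal sketch, the same arithmetic can be derived from the instance type so that it stays consistent if `instance_type` changes. The `VCPUS_PER_INSTANCE` table and the `ray_num_workers` helper are hypothetical names introduced for illustration, not part of the SageMaker SDK, and the vCPU counts should be checked against the EC2 documentation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical helper (not part of the SageMaker SDK): total vCPUs minus the cores reserved for the Ray scheduler.\n",
"# Assumption: the vCPU counts per instance type are listed here only for illustration.\n",
"VCPUS_PER_INSTANCE = {'ml.c5.2xlarge': 8, 'ml.m4.2xlarge': 8}\n",
"\n",
"def ray_num_workers(instance_type, instance_count, reserved_cores=1):\n",
"    return VCPUS_PER_INSTANCE[instance_type] * instance_count - reserved_cores\n",
"\n",
"# Three 8-vCPU instances with one core reserved: 8 * 3 - 1 = 23\n",
"print(ray_num_workers('ml.c5.2xlarge', 3))"
]
},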
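{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Visualization\n",
"\n",
"RL training takes a long time, so it helps to track progress while it runs. Intermediate outputs are saved to S3 during training, so we fetch them from there."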
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Job name: rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580\n",
"S3 job path: s3://sagemaker-ap-northeast-2-415742736303/rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580\n",
"Output.tar.gz location: s3://sagemaker-ap-northeast-2-415742736303/rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/output.tar.gz\n",
"Intermediate folder path: s3://sagemaker-ap-northeast-2-415742736303/rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/\n",
"Create local folder /tmp/rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580\n"
]
}
],
"source": [
"print(\"Job name: {}\".format(job_name))\n",
"\n",
"s3_url = \"s3://{}/{}\".format(s3_bucket,job_name)\n",
"\n",
"if local_mode:\n",
" output_tar_key = \"{}/output.tar.gz\".format(job_name)\n",
"else:\n",
" output_tar_key = \"{}/output/output.tar.gz\".format(job_name)\n",
"\n",
"intermediate_folder_key = \"{}/output/intermediate/\".format(job_name)\n",
"output_url = \"s3://{}/{}\".format(s3_bucket, output_tar_key)\n",
"intermediate_url = \"s3://{}/{}\".format(s3_bucket, intermediate_folder_key)\n",
"\n",
"print(\"S3 job path: {}\".format(s3_url))\n",
"print(\"Output.tar.gz location: {}\".format(output_url))\n",
"print(\"Intermediate folder path: {}\".format(intermediate_url))\n",
" \n",
"tmp_dir = \"/tmp/{}\".format(job_name)\n",
"os.system(\"mkdir {}\".format(tmp_dir))\n",
"print(\"Create local folder {}\".format(tmp_dir))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Fetch training rollout videos\n",
"\n",
"Some rollout videos are saved to S3 during training. Here we fetch the 10 most recent videos and play the last one."
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Waiting for s3://sagemaker-ap-northeast-2-415742736303/rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/...\n",
"Only downloading 10 of 143 files\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000000.mp4\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000001.mp4\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000008.mp4\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000027.mp4\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000064.mp4\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000125.mp4\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000216.mp4\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000343.mp4\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000512.mp4\n",
"Downloading rl-roboschool-distributed-reacher-2019-05-08-10-20-27-580/output/intermediate/training/PPO_RoboschoolReacher-v1_0_2019-05-08_10-23-4188iwjt0k/openaigym.video.0.136.video000729.mp4\n"
]
}
],
"source": [
"recent_videos = wait_for_s3_object(s3_bucket, intermediate_folder_key, tmp_dir, \n",
" fetch_only=(lambda obj: obj.key.endswith(\".mp4\") and obj.size>0), limit=10)"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<video src=\"./src/tmp_render/last_video.mp4\" controls autoplay></video>"
],
"text/plain": [
"<IPython.core.display.HTML object>"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"last_video = sorted(recent_videos)[-1] # Pick which video to watch\n",
"os.system(\"mkdir -p ./src/tmp_render/ && cp {} ./src/tmp_render/last_video.mp4\".format(last_video))\n",
"HTML('<video src=\"./src/tmp_render/last_video.mp4\" controls autoplay></video>')"
]
},
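{
"cell_type": "markdown",
"metadata": {},
"source": [
"The download cell above keeps only non-empty `.mp4` objects under the intermediate prefix. As a rough sketch, the same filter can be applied with boto3 directly to count the videos written so far without downloading them. This assumes AWS credentials are configured and the job has started writing output; the names `bucket` and `video_keys` are introduced here for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: list the rollout videos under the intermediate prefix without downloading them.\n",
"# Assumes AWS credentials are configured and the job has written at least some output.\n",
"bucket = boto3.resource('s3').Bucket(s3_bucket)\n",
"video_keys = [obj.key for obj in bucket.objects.filter(Prefix=intermediate_folder_key)\n",
"              if obj.key.endswith('.mp4') and obj.size > 0]\n",
"print('{} videos available'.format(len(video_keys)))"
]
},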
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Plot the training job's metrics\n",
"\n",
"The algorithm metrics stored in CloudWatch let you view the reward of the running training job."
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"image/png": "\n",
"text/plain": [
"<Figure size 864x360 with 1 Axes>"
]
},
"metadata": {
"needs_background": "light"
},
"output_type": "display_data"
}
],
"source": [
"%matplotlib inline\n",
"from sagemaker.analytics import TrainingJobAnalytics\n",
"\n",
"df = TrainingJobAnalytics(job_name, ['episode_reward_mean']).dataframe()\n",
"num_metrics = len(df)\n",
"if num_metrics == 0:\n",
" print(\"No algorithm metrics found in CloudWatch\")\n",
"else:\n",
" # DataFrame.plot returns a matplotlib Axes, so name it ax rather than plt\n",
" ax = df.plot(x='timestamp', y='value', figsize=(12,5), legend=True, style='b-')\n",
" ax.set_ylabel('Mean reward per episode')\n",
" ax.set_xlabel('Training time (s)')"
]
},
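{
"cell_type": "markdown",
"metadata": {},
"source": [
"The raw reward curve can be noisy. As a sketch, a rolling mean over the `value` column smooths it. The window size of 5 and the names `smoothed` and `ax_smooth` are arbitrary choices made for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: smooth the reward curve with a rolling mean (window size chosen arbitrarily).\n",
"if num_metrics > 0:\n",
"    smoothed = df.assign(value_smooth=df['value'].rolling(window=5, min_periods=1).mean())\n",
"    ax_smooth = smoothed.plot(x='timestamp', y='value_smooth', figsize=(12,5), style='r-')\n",
"    ax_smooth.set_ylabel('Mean reward per episode (rolling mean)')\n",
"    ax_smooth.set_xlabel('Training time (s)')"
]
},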
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.2"
},
"notice": "Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. Licensed under the Apache License, Version 2.0 (the \"License\"). You may not use this file except in compliance with the License. A copy of the License is located at http://aws.amazon.com/apache2.0/ or in the \"license\" file accompanying this file. This file is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License."
},
"nbformat": 4,
"nbformat_minor": 2
}