tomislacker/elasticache_benchmarking.ipynb

## elasticache_benchmarking.ipynb
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ElastiCache Bandwidth Mapping\n",
    "## About\n",
    "There have been occasions where we've discovered that our\n",
    "[ElastiCache](https://aws.amazon.com/elasticache/) Redis instances\n",
    "become a bottleneck due to their respective instance type's bandwidth\n",
    "limits. In response, we determined that we needed our own\n",
    "benchmarking capabilities that were flexible enough to analyze\n",
    "not just bandwidth but also CPU load. With this data, we would then have\n",
    "reasonable thresholds against which alarms could be created for situations where\n",
    "we're at or near the point of resource exhaustion.\n",
    "\n",
    "### Objectives\n",
    "* Construct a framework for benchmarking ElastiCache Redis performance\n",
    "* Tune as needed to produce reasonably consistent results\n",
    "* Output the data in a form from which a\n",
    "[CloudFormation Mapping](https://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/mappings-section-structure.html)\n",
    "can be constructed and published\n",
    "* Adapt templates that use ElastiCache Redis to import the published mapping,\n",
    "match the intended instance type to a value, and set a reasonable alarm threshold\n",
    "\n",
    "## Results\n",
    "### Benchmarking Framework\n",
    "_See [widdix/ec2-network-benchmark#2](https://github.com/widdix/ec2-network-benchmark/pull/2)_\n",
    "\n",
    "### Data\n",
    "Once we were able to benchmark our instances, we ran the following query in\n",
    "[Amazon Athena](https://aws.amazon.com/athena/)\n",
    "and downloaded the results as a CSV.\n",
    "\n",
    "```sql\n",
    "SELECT\n",
    "    instancetype,\n",
    "    dataSize,\n",
    "    (avg(networkbytesout.p90)/60/1024/1024*8) AS mbps_p90,\n",
    "    (avg(networkbytesout.p70)/60/1024/1024*8) AS mbps_p70,\n",
    "    count(distinct benchmarkId) as test_passes,\n",
    "    avg(cpuutilization.p90) AS cpuutilization_90,\n",
    "    avg(enginecpuutilization.p90) AS enginecpuutilization_90,\n",
    "    avg(cpuutilization.p50) AS cpuutilization_50,\n",
    "    avg(enginecpuutilization.p50) AS enginecpuutilization_50,\n",
    "    avg(networkbytesout.p90) AS BytePerMinP90\n",
    "FROM cachenetworkbenchmark\n",
    "WHERE d >= from_iso8601_date('2018-05-01')\n",
    "GROUP BY region, instancetype, dataSize\n",
    "ORDER BY mbps_p90 DESC, region, instancetype, dataSize\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import pandas as pd\n",
    "\n",
    "CSV_PATH = '64bbbb18-12de-4e93-9c15-e5407c499f74.csv'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "results = pd.read_csv(CSV_PATH)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>instancetype</th>\n",
       "      <th>dataSize</th>\n",
       "      <th>mbps_p90</th>\n",
       "      <th>mbps_p70</th>\n",
       "      <th>test_passes</th>\n",
       "      <th>cpuutilization_90</th>\n",
       "      <th>enginecpuutilization_90</th>\n",
       "      <th>cpuutilization_50</th>\n",
       "      <th>enginecpuutilization_50</th>\n",
       "      <th>BytePerMinP90</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>cache.r4.16xlarge</td>\n",
       "      <td>100000</td>\n",
       "      <td>10570.404</td>\n",
       "      <td>10180.308</td>\n",
       "      <td>1</td>\n",
       "      <td>3.293243</td>\n",
       "      <td>62.481094</td>\n",
       "      <td>3.156840</td>\n",
       "      <td>60.163340</td>\n",
       "      <td>8.312904e+10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>cache.r4.4xlarge</td>\n",
       "      <td>100000</td>\n",
       "      <td>9178.739</td>\n",
       "      <td>9078.103</td>\n",
       "      <td>5</td>\n",
       "      <td>8.583490</td>\n",
       "      <td>48.732200</td>\n",
       "      <td>6.758875</td>\n",
       "      <td>42.735767</td>\n",
       "      <td>7.218454e+10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>cache.r4.8xlarge</td>\n",
       "      <td>100000</td>\n",
       "      <td>9171.347</td>\n",
       "      <td>9033.280</td>\n",
       "      <td>5</td>\n",
       "      <td>5.165210</td>\n",
       "      <td>49.556260</td>\n",
       "      <td>4.878917</td>\n",
       "      <td>43.237050</td>\n",
       "      <td>7.212641e+10</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>cache.m4.10xlarge</td>\n",
       "      <td>100000</td>\n",
       "      <td>8522.664</td>\n",
       "      <td>8439.150</td>\n",
       "      <td>5</td>\n",
       "      <td>2.197376</td>\n",
       "      <td>35.648620</td>\n",
       "      <td>2.033342</td>\n",
       "      <td>29.952028</td>\n",
       "      <td>6.702496e+10</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "        instancetype  dataSize   mbps_p90   mbps_p70  test_passes  \\\n",
       "0  cache.r4.16xlarge    100000  10570.404  10180.308            1   \n",
       "1   cache.r4.4xlarge    100000   9178.739   9078.103            5   \n",
       "2   cache.r4.8xlarge    100000   9171.347   9033.280            5   \n",
       "3  cache.m4.10xlarge    100000   8522.664   8439.150            5   \n",
       "\n",
       "   cpuutilization_90  enginecpuutilization_90  cpuutilization_50  \\\n",
       "0           3.293243                62.481094           3.156840   \n",
       "1           8.583490                48.732200           6.758875   \n",
       "2           5.165210                49.556260           4.878917   \n",
       "3           2.197376                35.648620           2.033342   \n",
       "\n",
       "   enginecpuutilization_50  BytePerMinP90  \n",
       "0                60.163340   8.312904e+10  \n",
       "1                42.735767   7.218454e+10  \n",
       "2                43.237050   7.212641e+10  \n",
       "3                29.952028   6.702496e+10  "
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "results.head(4)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### \"Data size\" Evaluations\n",
    "The `redis-benchmark` application provides an argument (`-d`) to specify\n",
    "the size, in bytes, of the data being used during the execution. Early on, we discovered\n",
    "that tuning this value was the single most influential factor in squeezing\n",
    "more bandwidth out of an instance.\n",
    "\n",
    "Larger values typically resulted in more bandwidth, and they often came with\n",
    "a reduction in CPU load as well, though with diminishing returns.\n",
    "\n",
    "The cells below show one such example of these observations."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>instancetype</th>\n",
       "      <th>dataSize</th>\n",
       "      <th>mbps_p90</th>\n",
       "      <th>mbps_p70</th>\n",
       "      <th>test_passes</th>\n",
       "      <th>cpuutilization_90</th>\n",
       "      <th>enginecpuutilization_90</th>\n",
       "      <th>cpuutilization_50</th>\n",
       "      <th>enginecpuutilization_50</th>\n",
       "      <th>BytePerMinP90</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>cache.m4.xlarge</td>\n",
       "      <td>100000</td>\n",
       "      <td>734.86810</td>\n",
       "      <td>713.35380</td>\n",
       "      <td>5</td>\n",
       "      <td>16.646667</td>\n",
       "      <td>28.711346</td>\n",
       "      <td>12.355090</td>\n",
       "      <td>4.539111</td>\n",
       "      <td>5.779238e+09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>cache.m4.xlarge</td>\n",
       "      <td>10000</td>\n",
       "      <td>714.60900</td>\n",
       "      <td>701.69025</td>\n",
       "      <td>1</td>\n",
       "      <td>16.310000</td>\n",
       "      <td>18.576769</td>\n",
       "      <td>16.310000</td>\n",
       "      <td>17.449402</td>\n",
       "      <td>5.619914e+09</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>cache.m4.xlarge</td>\n",
       "      <td>1024</td>\n",
       "      <td>626.63776</td>\n",
       "      <td>622.25073</td>\n",
       "      <td>3</td>\n",
       "      <td>25.761267</td>\n",
       "      <td>56.875660</td>\n",
       "      <td>21.817337</td>\n",
       "      <td>46.935776</td>\n",
       "      <td>4.928080e+09</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       instancetype  dataSize   mbps_p90   mbps_p70  test_passes  \\\n",
       "30  cache.m4.xlarge    100000  734.86810  713.35380            5   \n",
       "31  cache.m4.xlarge     10000  714.60900  701.69025            1   \n",
       "33  cache.m4.xlarge      1024  626.63776  622.25073            3   \n",
       "\n",
       "    cpuutilization_90  enginecpuutilization_90  cpuutilization_50  \\\n",
       "30          16.646667                28.711346          12.355090   \n",
       "31          16.310000                18.576769          16.310000   \n",
       "33          25.761267                56.875660          21.817337   \n",
       "\n",
       "    enginecpuutilization_50  BytePerMinP90  \n",
       "30                 4.539111   5.779238e+09  \n",
       "31                17.449402   5.619914e+09  \n",
       "33                46.935776   4.928080e+09  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "results[results[\"instancetype\"] == 'cache.m4.xlarge'].groupby('dataSize').head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>instancetype</th>\n",
       "      <th>dataSize</th>\n",
       "      <th>mbps_p90</th>\n",
       "      <th>cpuutilization_90</th>\n",
       "      <th>enginecpuutilization_90</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>30</th>\n",
       "      <td>cache.m4.xlarge</td>\n",
       "      <td>100000</td>\n",
       "      <td>734.86810</td>\n",
       "      <td>16.646667</td>\n",
       "      <td>28.711346</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>31</th>\n",
       "      <td>cache.m4.xlarge</td>\n",
       "      <td>10000</td>\n",
       "      <td>714.60900</td>\n",
       "      <td>16.310000</td>\n",
       "      <td>18.576769</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>33</th>\n",
       "      <td>cache.m4.xlarge</td>\n",
       "      <td>1024</td>\n",
       "      <td>626.63776</td>\n",
       "      <td>25.761267</td>\n",
       "      <td>56.875660</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       instancetype  dataSize   mbps_p90  cpuutilization_90  \\\n",
       "30  cache.m4.xlarge    100000  734.86810          16.646667   \n",
       "31  cache.m4.xlarge     10000  714.60900          16.310000   \n",
       "33  cache.m4.xlarge      1024  626.63776          25.761267   \n",
       "\n",
       "    enginecpuutilization_90  \n",
       "30                28.711346  \n",
       "31                18.576769  \n",
       "33                56.875660  "
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "results[results[\"instancetype\"] == 'cache.m4.xlarge'] \\\n",
    "[['instancetype','dataSize', 'mbps_p90', 'cpuutilization_90', 'enginecpuutilization_90']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Creating the Mapping\n",
    "Once we had gathered enough data, we were ready to\n",
    "construct our mapping transform. Two main considerations\n",
    "shaped it:\n",
    "\n",
    "1. Network traffic is typically measured in _bits_ but ElastiCache\n",
    "reports _bytes_, so we convert the query's `mbps_p90` back to bytes per second\n",
    "1. The 90th percentile data from CloudWatch is being used rather than\n",
    "the 100th percentile in an attempt to get a more stable, consistent,\n",
    "and conservative parameter\n",
    "\n",
    "We'll treat the 90th percentile as the maximum bandwidth we should\n",
    "ever **expect** an instance type to be able to communicate at. From\n",
    "there, we'll break down percentages of that value for consumption in\n",
    "the mapping. In our case, 100%, 90%, and 80%, all based on the\n",
    "90th percentile of bandwidth observed during the tests."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "mapping = {}\n",
    "for instance_type in results.instancetype.unique():\n",
    "    instance_results = results[results[\"instancetype\"] == instance_type]\n",
    "    # Convert the best observed p90 from Mbit/s back to bytes per second\n",
    "    max_bandwidth = instance_results.mbps_p90.max() / 8 * 1024 * 1024\n",
    "    # Thresholds at 100%, 90%, and 80% of that maximum\n",
    "    mapping[instance_type] = {\n",
    "        percent: int(max_bandwidth * percent / 100)\n",
    "        for percent in range(100, 70, -10)\n",
    "    }"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1454"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(json.dumps(mapping, sort_keys=True))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "{\n",
      "  \"cache.m3.2xlarge\": {\n",
      "    \"80\": 49835910,\n",
      "    \"90\": 56065399,\n",
      "    \"100\": 62294888\n",
      "  },\n",
      "  \"cache.m3.large\": {\n",
      "    \"80\": 48378954,\n",
      "    \"90\": 54426323,\n",
      "    \"100\": 60473692\n",
      "  },\n",
      "  \"cache.m3.medium\": {\n",
      "    \"80\": 29315378,\n",
      "    \"90\": 32979801,\n",
      "    \"100\": 36644223\n",
      "  },\n",
      "  \"cache.m3.xlarge\": {\n",
      "    \"80\": 54896319,\n",
      "    \"90\": 61758359,\n",
      "    \"100\": 68620399\n",
      "  },\n",
      "  \"cache.m4.10xlarge\": {\n",
      "    \"80\": 893666092,\n",
      "    \"90\": 1005374354,\n",
      "    \"100\": 1117082615\n",
      "  },\n",
      "  \"cache.m4.2xlarge\": {\n",
      "    \"80\": 102033530,\n",
      "    \"90\": 114787721,\n",
      "    \"100\": 127541912\n",
      "  },\n",
      "  \"cache.m4.4xlarge\": {\n",
      "    \"80\": 204153531,\n",
      "    \"90\": 229672723,\n",
      "    \"100\": 255191914\n",
      "  },\n",
      "  \"cache.m4.large\": {\n",
      "    \"80\": 47538989,\n",
      "    \"90\": 53481362,\n",
      "    \"100\": 59423736\n",
      "  },\n",
      "  \"cache.m4.xlarge\": {\n",
      "    \"80\": 77056505,\n",
      "    \"90\": 86688568,\n",
      "    \"100\": 96320631\n",
      "  },\n",
      "  \"cache.r3.2xlarge\": {\n",
      "    \"80\": 98557728,\n",
      "    \"90\": 110877444,\n",
      "    \"100\": 123197160\n",
      "  },\n",
      "  \"cache.r3.4xlarge\": {\n",
      "    \"80\": 105002848,\n",
      "    \"90\": 118128204,\n",
      "    \"100\": 131253560\n",
      "  },\n",
      "  \"cache.r3.8xlarge\": {\n",
      "    \"80\": 104620621,\n",
      "    \"90\": 117698199,\n",
      "    \"100\": 130775777\n",
      "  },\n",
      "  \"cache.r3.large\": {\n",
      "    \"80\": 49536345,\n",
      "    \"90\": 55728388,\n",
      "    \"100\": 61920431\n",
      "  },\n",
      "  \"cache.r3.xlarge\": {\n",
      "    \"80\": 68982989,\n",
      "    \"90\": 77605862,\n",
      "    \"100\": 86228736\n",
      "  },\n",
      "  \"cache.r4.16xlarge\": {\n",
      "    \"80\": 1108387194,\n",
      "    \"90\": 1246935593,\n",
      "    \"100\": 1385483993\n",
      "  },\n",
      "  \"cache.r4.2xlarge\": {\n",
      "    \"80\": 820773115,\n",
      "    \"90\": 923369755,\n",
      "    \"100\": 1025966394\n",
      "  },\n",
      "  \"cache.r4.4xlarge\": {\n",
      "    \"80\": 962460542,\n",
      "    \"90\": 1082768110,\n",
      "    \"100\": 1203075678\n",
      "  },\n",
      "  \"cache.r4.8xlarge\": {\n",
      "    \"80\": 961685435,\n",
      "    \"90\": 1081896114,\n",
      "    \"100\": 1202106793\n",
      "  },\n",
      "  \"cache.r4.large\": {\n",
      "    \"80\": 537689598,\n",
      "    \"90\": 604900797,\n",
      "    \"100\": 672111997\n",
      "  },\n",
      "  \"cache.r4.xlarge\": {\n",
      "    \"80\": 654315513,\n",
      "    \"90\": 736104952,\n",
      "    \"100\": 817894391\n",
      "  }\n",
      "}\n"
     ]
    }
   ],
   "source": [
    "print(json.dumps(mapping, indent=2, sort_keys=True))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}