Skip to content

Instantly share code, notes, and snippets.

@adikamath
Created January 19, 2018 14:34
Show Gist options
  • Save adikamath/fed4483af5c97eb7690c44cb520a91f7 to your computer and use it in GitHub Desktop.
Save adikamath/fed4483af5c97eb7690c44cb520a91f7 to your computer and use it in GitHub Desktop.
Display the source blob
Display the rendered blob
Raw
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h2>CryptoCompare API: An Introduction</h2>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"While there are several website's out there that provide public API access to get at cyrptocurrency data, I found CryptoCompare's API the easiest to work with for my needs with this project. You can use this API to download and keep up-to-date with all sorts of interesting data on thousands of cryptocurrencies and most of the popular Exchanges. I've included a link to their API documentation <a href=\"https://www.cryptocompare.com/api/\">here</a> and also at the end of this post. \n",
"\n",
" For this post however I will focus on getting Open-High-Low-Close (OHLC) Price data and then creating simple interactive plots with Plotly. Before we dive into the writing some functions, I first want to introduce you to the API itself and how we can send requests to it to get data. Start by importing the necessary libraries, I will explain their use throughout the post."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<script>requirejs.config({paths: { 'plotly': ['https://cdn.plot.ly/plotly-latest.min']},});if(!window.Plotly) {{require(['plotly'],function(plotly) {window.Plotly=plotly;});}}</script>"
],
"text/vnd.plotly.v1+html": [
"<script>requirejs.config({paths: { 'plotly': ['https://cdn.plot.ly/plotly-latest.min']},});if(!window.Plotly) {{require(['plotly'],function(plotly) {window.Plotly=plotly;});}}</script>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"'''import libraries to work with the data'''\n",
"import numpy as np\n",
"import pandas as pd\n",
"import os\n",
"import requests\n",
"import pickle\n",
"import datetime as datetime\n",
"\n",
"'''import plotly packages for visualization'''\n",
"from plotly import tools\n",
"import plotly.offline as py\n",
"import plotly.graph_objs as go\n",
"import plotly.figure_factory as ff\n",
"py.init_notebook_mode(connected=True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The <i>requests</i> library is going to help us send HTTP requests to the API in order to get the data. Check out the exmaple below for a GET request from the API. Of all the different methods provided by the API, we will be using the <i>HistoDay</i> method for historical OHLC data for a given coin pair which in our case here is BTC to USD."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<Response [200]>"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"myData = requests.get(\"https://min-api.cryptocompare.com/data/histoday?fsym=BTC&tsym=USD&limit=1&aggregate=1&e=CCCAGG\")\n",
"myData"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"{'Aggregated': False,\n",
" 'ConversionType': {'conversionSymbol': '', 'type': 'direct'},\n",
" 'Data': [{'close': 11175.52,\n",
" 'high': 12018.43,\n",
" 'low': 10642.33,\n",
" 'open': 11162.7,\n",
" 'time': 1516233600,\n",
" 'volumefrom': 204918.02,\n",
" 'volumeto': 2357251805.09},\n",
" {'close': 11793.87,\n",
" 'high': 11941.74,\n",
" 'low': 10867.18,\n",
" 'open': 11175.52,\n",
" 'time': 1516320000,\n",
" 'volumefrom': 68188.62,\n",
" 'volumeto': 778676449.36}],\n",
" 'FirstValueInArray': True,\n",
" 'Response': 'Success',\n",
" 'TimeFrom': 1516233600,\n",
" 'TimeTo': 1516320000,\n",
" 'Type': 100}"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"myData.json()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Okay, so let's make sense of what we did here. The url within the get request has some parameters that are passed to the API and it is crucial that these parameters are correctly enterd in. The <i>fysm</i> parameter which means from symbol is BTC (Bitcoin) while <i>tysm</i> meaning to symbol is the USD, <i>aggregate</i> is how many days you want the OHLC price data to be aggregated for which is 1 here and <i>limit</i> is how many data points you need. Here we're getting 1-day's worth of data only. The response from this request - <i>&lt;Response [200]&gt;</i> means that we successfully communicated with the API. But it is important to confirm this by looking at the json data that is returned. If the value of the 'Response' key is 'Success' then all is well and you can see the required data assigned to the 'Data'key."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h2>Define Helper Functions</h2>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first helper function <i>cryptocompare_data()</i> will download or update data depending on wether a cached version of the data is available or not. The parameters of the <i>HistoDay</i> API method are passed as parameters to this function which are then used to customize URL (and ultimately the data download) based on the User's wishes. Note that this function will only update the data if the there atleast 1 full day has passed since the last cache."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def cryptocompare_data(symbol, convert_symbol = 'USD', limit= 1, aggregate= 1, exchange= 'CCCAGG'):\n",
" \n",
" '''this helper function will download data if not already downloaded and will update it if an earlier outdated \n",
" download is already available'''\n",
" \n",
" cache_path = '{}.pkl'.format(symbol.upper()+'_'+convert_symbol.upper())\n",
" try:\n",
" f = open(cache_path, 'rb')\n",
" '''pickle module is used to serialize the data for caching locally'''\n",
" df = pickle.load(f)\n",
" print('Loaded {} from cache'.format(symbol.upper()+'_'+convert_symbol.upper()))\n",
" \n",
" '''the outer if block will check to see if the last day in the downloaded file is the same as now. If not, \n",
" it assigns the new limit as the number of days since the last update and allData is set to false.'''\n",
" \n",
" if (datetime.datetime.now()-df['timestamp'].iloc[-1]).days > 0:\n",
" limit = (datetime.datetime.now()-df['timestamp'].iloc[-1]).days\n",
" print('{} day(s) since last update'.format(limit))\n",
" print('Updating now...')\n",
" url = \"https://min-api.cryptocompare.com/data/histoday?fsym={}&tsym={}&limit={}&aggregate={}&e={}&allData=false\"\\\n",
" .format(symbol.upper(), convert_symbol.upper(), limit, aggregate, exchange)\n",
" json_dump2 = requests.get(url).json()\n",
" \n",
" '''this inner if block will run only if the update request was successfull in downloading data. Then it \n",
" downloads data only for those days since last update. We also drop the first row of the dataframe due to \n",
" an extra row being downloaded. We also concat the new DataFrame to the one with the last update(remember \n",
" to ignore the index so that a new index is formed). It then caches the data and updates it locally.'''\n",
" \n",
" if json_dump2['Response'] == 'Success':\n",
" df2 = pd.DataFrame(json_dump2['Data'])\n",
" df2.drop(df2.index[0], inplace = True)\n",
" df2['timestamp'] = [datetime.datetime.fromtimestamp(d) for d in df2['time']]\n",
" updated_df = pd.concat([df, df2], ignore_index = True)\n",
" updated_df.to_pickle(cache_path)\n",
" print('{} successfully updated on {}\\n'.format(symbol.upper()+'_'+convert_symbol.upper(),\\\n",
" datetime.datetime.now()))\n",
" \n",
" return updated_df\n",
" \n",
" else:\n",
" print('Error while updating {} data from CryptoCompare.\\n Message: {}'.format(symbol.upper(),\\\n",
" json_dump2['Message'])) \n",
" \n",
" else:\n",
" '''displays the timestamp for the last available OLHC price from the CryptoCompare API'''\n",
" print('Data is up-to-date as of {}\\n'.format(df['timestamp'].iloc[-1]))\n",
" return df\n",
" \n",
" except(OSError, IOError) as e:\n",
" print('Downloading {} data from {} exchange...'.format(symbol.upper(), exchange))\n",
" url = \"https://min-api.cryptocompare.com/data/histoday?fsym={}&tsym={}&limit={}&aggregate={}&e={}&allData=true\"\\\n",
" .format(symbol.upper(), convert_symbol.upper(), limit, aggregate, exchange)\n",
" \n",
" json_dump = requests.get(url).json()\n",
" if json_dump['Response'] == 'Success':\n",
" df = pd.DataFrame(json_dump['Data'])\n",
" df['timestamp'] = [pd.datetime.fromtimestamp(d) for d in df.time]\n",
" df.to_pickle(cache_path)\n",
" print('Cached {} data from {} exchange at {}\\n'\\\n",
" .format(symbol.upper(), exchange, cache_path))\n",
" return df\n",
" \n",
" else:\n",
" print('Error in downloading {} data from CryptoCompare.\\n Message: {}'.format(symbol.upper(),\\\n",
" json_dump['Message']))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The <i>downloader()</i> function uses the <i>cryptocompare_data()</i> function to download/update the data for a given list of coins that the User passes to it. The User can also specify what month and year the OHLC data should begin from. The function returns a dictionary of DataFrames for each coin."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"def downloader(symbol_list, month_onward= 'JAN', year_onward= '2015'):\n",
" \n",
" '''this helper function will call cryptocompare_data() to download data for coin symbols that you pass as a list. \n",
" Also note that by default it will return historical price data for the month-year JAN-2015 onward. You can change this by\n",
" passing different month and year string values''' \n",
" \n",
" coin_store = {}\n",
" '''date_start is the date from which you want to filter the data'''\n",
" date_start = datetime.datetime.strptime(\"01{}{}\".format(month_onward.upper(), year_onward), \"%d%b%Y\")\n",
" \n",
" '''to make it easier to visualize close prices downstream in our analysis, we rename the close price \n",
" column in for each DataFrame to the symbol of the coin'''\n",
" for symbol in symbol_list:\n",
" coin_store[symbol] = cryptocompare_data(symbol)[['timestamp', 'open', 'high', 'low', 'close', 'volumefrom']]\\\n",
" .rename(columns = {'close':symbol})\n",
" \n",
" for symbol in symbol_list:\n",
" coin_store[symbol] = coin_store[symbol][coin_store[symbol].timestamp >= date_start]\n",
" \n",
" return coin_store\n",
" "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The <i>line_plotter()</i> function abstracts away the lenghty code that is required to build an interactive multi-line plot of the OHLC prices. In a nutshell, we create a list of 'traces' for each coin's price data and then pass it to plotly's plotting functions to create the plot. You can read more about the multi-line plot <a href= \"https://plot.ly/python/line-charts/\">here</a> on Plotly's website. To account for the huge difference in the price of different cryptocurrencies, the plot here uses a logarithmic axis to represent the prices."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"def line_plotter(coin_store, title= 'Cryptocurrency Historical Daily Prices', x_label= 'Date', y_label= 'Close Price (USD)'):\n",
" \n",
" '''this helper function makes it easier to use the plotly library to plot a line graph to visualize\n",
" the Cryptocurrency data that we have stored'''\n",
" \n",
" data = []\n",
" '''the for loop below is used to create a list of traces for each coins pricing data'''\n",
" for symbol in list(coin_store.keys()):\n",
" trace= go.Scatter(\n",
" x= coin_store[symbol]['timestamp'],\n",
" y= coin_store[symbol][symbol],\n",
" mode= 'lines',\n",
" name= symbol)\n",
" data.append(trace)\n",
" \n",
" layout = dict(width= 900, height= 700, \n",
" title= title,\n",
" xaxis= dict(title= x_label,\n",
" rangeselector= dict(\n",
" buttons= list([\n",
" dict(count= 1,\n",
" label= '1m',\n",
" step= 'month',\n",
" stepmode= 'backward'),\n",
" dict(count= 6,\n",
" label= '6m',\n",
" step= 'month',\n",
" stepmode= 'backward'),\n",
" dict(count= 1,\n",
" label= '1y',\n",
" step= 'year',\n",
" stepmode= 'backward'),\n",
" dict(step= 'all')\n",
" ])\n",
" ),\n",
" rangeslider= dict(),\n",
" type= 'date'\n",
" ),\n",
" yaxis= dict(title= y_label, type= 'log', autorange= True),\n",
" )\n",
" fig = dict(data= data, layout= layout)\n",
" py.plot(fig, filename= 'close_prices.html')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The <i>ohlc_chart()</i> function is also used to abstract away some code that is required to create a OHLC Chart for a given cryptocurrency. Note that I've modified this to include a plot of the daily volume of the currency traded. This helps in giving us a better idea of what was going on in the market."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"def ohlc_chart(coin_store, symbol):\n",
" \n",
" '''this helper function is used for creating an OHLC chart for a given coin symbol'''\n",
" \n",
" trace_ohlc = go.Ohlc(x= store[symbol.upper()].timestamp,\n",
" open= store[symbol.upper()].open,\n",
" high= store[symbol.upper()].high,\n",
" low= store[symbol.upper()].low,\n",
" close= store[symbol.upper()][symbol.upper()])\n",
" \n",
" trace_line = go.Scatter( \n",
" x= coin_store[symbol.upper()]['timestamp'],\n",
" y= coin_store[symbol.upper()]['volumefrom'],\n",
" mode= 'line',\n",
" name= 'Volume'\n",
" )\n",
"\n",
" fig = tools.make_subplots(rows= 2, cols= 1, \n",
" specs= [[{}], [{}]],\n",
" shared_xaxes= False, \n",
" shared_yaxes= False,\n",
" subplot_titles= ('{} OHLC Chart'.format(symbol.upper()),\\\n",
" '{} Daily Trade Volume'.format(symbol.upper()))\n",
" )\n",
" \n",
" \n",
" fig.append_trace(trace_ohlc, 1, 1)\n",
" fig.append_trace(trace_line, 2, 1)\n",
"\n",
" fig['layout']['xaxis1'].update(title= 'Date')\n",
" fig['layout']['xaxis2'].update(title= 'Date')\n",
" fig['layout']['yaxis1'].update(title= 'Prices (USD)')\n",
" fig['layout']['yaxis2'].update(title= 'Volume')\n",
" \n",
" fig['layout'].update(width= 900, height= 650, title='')\n",
"\n",
" py.plot(fig, filename='olhc_line_subplot.html')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**----------------------------------------------------------------------------------------------------------------------**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h2>Data Caching And Plotting</h2>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now that we've written all our helper functions, let's download some data. Specefically, let's download some of the most popular cryptocurrencies that are being traded right now by passing their symbols in a list. Also note that I have filtered the data to only from June 2017 onward. Depending on wether you had previously cached/updated data, you will see that the <i>downloader()</i> function will display the appropriate messages."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Loaded BTC_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"BTC_USD successfully updated on 2018-01-19 09:21:50.532577\n",
"\n",
"Loaded ETH_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"ETH_USD successfully updated on 2018-01-19 09:21:51.299070\n",
"\n",
"Loaded XRP_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"XRP_USD successfully updated on 2018-01-19 09:21:52.021118\n",
"\n",
"Loaded BCH_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"BCH_USD successfully updated on 2018-01-19 09:21:52.784397\n",
"\n",
"Loaded ADA_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"ADA_USD successfully updated on 2018-01-19 09:21:53.576245\n",
"\n",
"Loaded LTC_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"LTC_USD successfully updated on 2018-01-19 09:21:54.838456\n",
"\n",
"Loaded NEO_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"NEO_USD successfully updated on 2018-01-19 09:21:55.721851\n",
"\n",
"Loaded XEM_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"XEM_USD successfully updated on 2018-01-19 09:21:56.445253\n",
"\n",
"Loaded DASH_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"DASH_USD successfully updated on 2018-01-19 09:21:57.173742\n",
"\n",
"Loaded XMR_USD from cache\n",
"1 day(s) since last update\n",
"Updating now...\n",
"XMR_USD successfully updated on 2018-01-19 09:21:57.992615\n",
"\n"
]
}
],
"source": [
"'''use the downloader function to download or update data'''\n",
"store = downloader(['BTC','ETH', 'XRP', 'BCH', 'ADA', 'LTC', 'NEO', 'XEM', 'DASH', 'XMR'], 'JUN','2017')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h2>Visualizing Close Prices</h2>"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"'''plot the close prices'''\n",
"line_plotter(store, 'Cryptocurrency Daily Price Data')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<h2>Creating An OHLC Chart</h2>"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"This is the format of your plot grid:\n",
"[ (1,1) x1,y1 ]\n",
"[ (2,1) x2,y2 ]\n",
"\n"
]
}
],
"source": [
"'''plot an OHLC chart for any coin'''\n",
"ohlc_chart(store, 'BTC')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment