@chicago-joe
Created November 20, 2019 17:05
C:\Users\jloss\venv\ITCH50parser\Scripts\python.exe C:\Users\jloss\AppData\Local\JetBrains\Toolbox\apps\PyCharm-P\ch-0\192.7142.56\helpers\pydev\pydevconsole.py --mode=client --port=10634
import sys; print('Python %s on %s' % (sys.version, sys.platform))
sys.path.extend(['C:\\Users\\jloss\\PyCharmProjects\\NASDAQ-ITCH-5.0-VWAP-PARSER', 'C:\\Users\\jloss\\PyCharmProjects\\NASDAQ-ITCH-5.0-VWAP-PARSER\\src', 'C:\\Users\\jloss\\PyCharmProjects\\NASDAQ-ITCH-5.0-VWAP-PARSER\\data', 'C:/Users/jloss/PyCharmProjects/NASDAQ-ITCH-5.0-VWAP-PARSER'])
PyDev console: starting.
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] on win32
>>> #!/usr/bin/env python
... # coding: utf-8
...
... # # Working with Order Book Data: NASDAQ ITCH
...
... # The primary source of market data is the order book, which is continuously updated in real-time throughout the day to reflect all trading activity. Exchanges typically offer this data as a real-time service and may provide some historical data for free.
... #
... # The trading activity is reflected in numerous messages about trade orders sent by market participants. These messages typically conform to the electronic Financial Information eXchange (FIX) communications protocol for real-time exchange of securities transactions and market data or a native exchange protocol.
...
... # ## Background
...
... # ### The FIX Protocol
...
... # Just like SWIFT is the message protocol for back-office messaging (e.g., trade settlement), the [FIX protocol](https://www.fixtrading.org/standards/) is the de facto messaging standard for communication before and during trade execution between exchanges, banks, brokers, clearing firms, and other market participants. Fidelity Investments and Salomon Brothers introduced FIX in 1992 to facilitate electronic communication between broker-dealers and institutional clients, who until then had exchanged information over the phone.
... #
... # It became popular in global equity markets before expanding into foreign exchange, fixed income and derivatives markets, and further into post-trade to support straight-through processing. Exchanges provide access to FIX messages as a real-time data feed that is parsed by algorithmic traders to track market activity and, for example, identify the footprint of market participants and anticipate their next move.
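On the wire, a FIX message is a sequence of tag=value pairs delimited by the SOH byte (`\x01`). As a toy illustration (the order details below are made up; tags 35, 55, 54, and 38 are the standard MsgType, Symbol, Side, and OrderQty fields):

```python
# FIX messages are ASCII tag=value pairs delimited by the SOH byte (\x01).
# Toy fragment of a new-order-single (35=D); all values are hypothetical.
raw = '8=FIX.4.2\x0135=D\x0155=TSLA\x0154=1\x0138=100\x01'
fields = dict(pair.split('=', 1) for pair in raw.strip('\x01').split('\x01'))
# Tag 55 = Symbol, 54 = Side (1 = Buy), 38 = OrderQty
print(fields['55'], fields['54'], fields['38'])  # -> TSLA 1 100
```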
...
... # ### Nasdaq TotalView-ITCH Order Book data
...
... # While FIX has a dominant market share, exchanges also offer native protocols. Nasdaq offers a [TotalView-ITCH direct data-feed protocol](http://www.nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/NQTVITCHspecification.pdf) that allows subscribers to track
... # individual orders for equity instruments from placement to execution or cancellation.
... #
... # As a result, it allows for the reconstruction of the order book, which keeps track of the active limit buy and sell orders for a specific security or financial instrument. The order book reveals the market depth throughout the day by listing the number of shares being bid or offered at each price point. It may also identify the market participant responsible for specific buy and sell orders unless orders are placed anonymously. Market depth is a key indicator of liquidity and of the potential price impact of sizable market orders.
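To make the notion of market depth concrete, here is a toy order book snapshot (all prices and sizes hypothetical):

```python
# Toy order book snapshot: bids sorted by descending price, asks ascending.
# Market depth is the shares available at each level; the inside spread is
# the gap between best bid and best ask. All numbers are hypothetical.
bids = [(175.01, 300), (175.00, 1200), (174.99, 800)]
asks = [(175.03, 500), (175.04, 900), (175.05, 400)]
best_bid, best_ask = bids[0][0], asks[0][0]
depth_bid = sum(shares for _, shares in bids)  # total shares bid: 2300
print('spread: {:.2f}, bid depth: {}'.format(best_ask - best_bid, depth_bid))
# -> spread: 0.02, bid depth: 2300
```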
...
... # The ITCH v5.0 specification declares over 20 message types related to system events, stock characteristics, the placement and modification of limit orders, and trade execution. It also contains information about the net order imbalance before the open and closing cross.
...
... # ## Imports
...
... # In[1]:
...
...
... import gzip
... import shutil
... from pathlib import Path
... from urllib.request import urlretrieve
... from urllib.parse import urljoin
... from clint.textui import progress
... from datetime import datetime
... import pandas as pd
... import numpy as np
... import matplotlib.pyplot as plt
... from matplotlib.ticker import FuncFormatter
... from struct import unpack
... from collections import namedtuple, Counter
... from datetime import timedelta
... from time import time
...
...
... # ## Get NASDAQ ITCH Data from FTP Server
...
... # The Nasdaq offers [samples](ftp://emi.nasdaq.com/ITCH/) of daily binary files for several months.
... #
... # We now illustrate how to parse a sample file of ITCH messages and reconstruct both the executed trades and the order book for any given tick.
...
... # The data is fairly large, and running the entire example can take considerable time and require substantial memory (16GB+). Also, the sample file used in this example may no longer be available because NASDAQ occasionally updates the sample files.
...
... # The following table shows the field layout of an Add Order with MPID Attribution ('F') message as defined in the specification:
...
... # | Name | Offset | Length | Value | Notes |
... # |-------------------------|---------|---------|------------|--------------------------------------------------------------------------------------|
... # | Message Type            | 0       | 1       | F          | Add Order with MPID Attribution message                                              |
... # | Stock Locate            | 1       | 2       | Integer    | Locate code identifying the security                                                 |
... # | Tracking Number | 3 | 2 | Integer | Nasdaq internal tracking number |
... # | Timestamp | 5 | 6 | Integer | Nanoseconds since midnight |
... # | Order Reference Number | 11 | 8 | Integer | The unique reference number assigned to the new order at the time of receipt. |
... # | Buy/Sell Indicator | 19 | 1 | Alpha | The type of order being added. B = Buy Order. S = Sell Order. |
... # | Shares | 20 | 4 | Integer | The total number of shares associated with the order being added to the book. |
... # | Stock | 24 | 8 | Alpha | Stock symbol, right padded with spaces |
... # | Price | 32 | 4 | Price (4) | The display price of the new order. Refer to Data Types for field processing notes. |
... # | Attribution | 36 | 4 | Alpha | Nasdaq Market participant identifier associated with the entered order |
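To see how the offsets and lengths in this table translate into a `struct` format string, here is a self-contained sketch that builds and unpacks a fake 40-byte message (all field values are made up; ITCH uses big-endian byte order, hence the `>` prefix):

```python
from struct import unpack, calcsize

# Format string matching the field layout in the table above (big-endian).
fmt = '>sHH6sQsI8sI4s'
assert calcsize(fmt) == 40  # the offsets/lengths in the table sum to 40 bytes

# Build a fake 40-byte message to demonstrate the round trip.
raw = (b'F'                                         # message type
       + (7181).to_bytes(2, 'big')                  # stock locate
       + (0).to_bytes(2, 'big')                     # tracking number
       + (34_200_000_000_000).to_bytes(6, 'big')    # ns since midnight (9:30)
       + (123456).to_bytes(8, 'big')                # order reference number
       + b'B'                                       # buy
       + (100).to_bytes(4, 'big')                   # shares
       + b'AAPL    '                                # stock, right-padded to 8
       + (1_750_000).to_bytes(4, 'big')             # price_4: 175.0000 * 10,000
       + b'NSDQ')                                   # attribution
fields = unpack(fmt, raw)
print(fields[7].strip(), fields[8] / 1e4)  # -> b'AAPL' 175.0
```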
...
... # ### Set Data paths
...
... # We will store the download in a `data` subdirectory and convert the result to HDF5 format (discussed in the last section of chapter 2).
...
... # In[80]:
...
...
... data_path = Path('C://Users//jloss//PyCharmProjects//NASDAQ-ITCH-5.0-VWAP-PARSER//data')
... itch_store = str(data_path / 'itch.h5')
... order_book_store = data_path / 'order_book.h5'
...
...
... # The FTP address, filename and corresponding date used in this example:
...
... # This is already updated from the 2018 example used in the book:
...
... # In[22]:
...
...
... FTP_URL = 'ftp://emi.nasdaq.com/ITCH/Nasdaq_ITCH/'
... SOURCE_FILE = '01302019.NASDAQ_ITCH50.gz'
...
...
... # ### Download & unzip
...
... # In[25]:
...
...
... def may_be_download(url):
...     """Download & unzip ITCH data if not yet available"""
...     filename = data_path / url.split('/')[-1]
...     if not data_path.exists():
...         print('Creating directory')
...         data_path.mkdir()
...     if not filename.exists():
...         print('Downloading...', url)
...         urlretrieve(url, filename)
...     unzipped = data_path / (filename.stem + '.bin')
...     if not unzipped.exists():
...         print('Unzipping to', unzipped)
...         with gzip.open(str(filename), 'rb') as f_in:
...             with open(unzipped, 'wb') as f_out:
...                 shutil.copyfileobj(f_in, f_out)
...     return unzipped
...
...
... # This will download 5.1GB data that unzips to 12.9GB.
...
... # In[26]:
...
...
... file_name = may_be_download(urljoin(FTP_URL, SOURCE_FILE))
... date = file_name.name.split('.')[0]
...
...
... # ## ITCH Format Settings
...
... # ### The `struct` module for binary data
...
... # The ITCH tick data comes in binary format. Python provides the `struct` module (see the [docs](https://docs.python.org/3/library/struct.html)) to parse binary data using format strings that identify the message elements by indicating the length and type of the various components of the byte string as laid out in the specification.
...
... # From the docs:
... #
... # > This module performs conversions between Python values and C structs represented as Python bytes objects. This can be used in handling binary data stored in files or from network connections, among other sources. It uses Format Strings as compact descriptions of the layout of the C structs and the intended conversion to/from Python values.
...
... # Let's walk through the critical steps to parse the trading messages and reconstruct the order book:
...
... # ### Defining format strings
...
... # The parser uses format strings according to the following formats dictionaries:
...
... # In[58]:
...
...
... event_codes = {'O': 'Start of Messages',
... 'S': 'Start of System Hours',
... 'Q': 'Start of Market Hours',
... 'M': 'End of Market Hours',
... 'E': 'End of System Hours',
... 'C': 'End of Messages'}
...
...
... # In[59]:
...
...
... encoding = {'primary_market_maker': {'Y': 1, 'N': 0},
... 'printable' : {'Y': 1, 'N': 0},
... 'buy_sell_indicator' : {'B': 1, 'S': -1},
... 'cross_type' : {'O': 0, 'C': 1, 'H': 2},
... 'imbalance_direction' : {'B': 0, 'S': 1, 'N': 0, 'O': -1}}
...
...
... # In[60]:
...
...
... formats = {
... ('integer', 2): 'H',
... ('integer', 4): 'I',
... ('integer', 6): '6s',
... ('integer', 8): 'Q',
... ('alpha', 1) : 's',
... ('alpha', 2) : '2s',
... ('alpha', 4) : '4s',
... ('alpha', 8) : '8s',
... ('price_4', 4): 'I',
... ('price_8', 8): 'Q',
... }
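For instance, looking up each field's `(value, length)` pair in this dictionary and concatenating the results yields the format string for a message. A small sketch (the field specs below are a hypothetical subset of a real message):

```python
# Sketch: resolve (value, length) field specs to struct codes via the
# formats dict above and join them into a big-endian format string.
formats = {('integer', 2): 'H', ('integer', 6): '6s',
           ('alpha', 8): '8s', ('price_4', 4): 'I'}
field_specs = [('integer', 2),   # stock locate
               ('integer', 2),   # tracking number
               ('integer', 6),   # timestamp
               ('alpha', 8),     # stock
               ('price_4', 4)]   # price
fstring = '>' + ''.join(formats[spec] for spec in field_specs)
print(fstring)  # -> >HH6s8sI
```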
...
...
... # ### Create message specs for binary data parser
...
... # The ITCH parser relies on message specifications that we create in the following steps.
...
... # #### Load Message Types
...
... # The file `message_types.xlsx` contains the message type specs as laid out in the [documentation](https://www.nasdaqtrader.com/content/technicalsupport/specifications/dataproducts/NQTVITCHSpecification.pdf).
...
... # In[61]:
...
...
... message_data = (pd.read_excel('C://Users//jloss//PyCharmProjects//NASDAQ-ITCH-5.0-VWAP-PARSER//src//message_types.xlsx',
... sheet_name='messages',
... encoding='latin1')
... .sort_values('id')
... .drop('id', axis=1))
...
...
... # #### Basic Cleaning
...
... # The function `clean_message_types()` just runs a few basic string cleaning steps.
...
... # In[62]:
...
...
... def clean_message_types(df):
...     df.columns = [c.lower().strip() for c in df.columns]
...     df.value = df.value.str.strip()
...     df.name = (df.name
...                .str.strip()  # remove whitespace
...                .str.lower()
...                .str.replace(' ', '_')
...                .str.replace('-', '_')
...                .str.replace('/', '_'))
...     df.notes = df.notes.str.strip()
...     df['message_type'] = df.loc[df.name == 'message_type', 'value']
...     return df
...
...
... # In[63]:
...
...
... message_types = clean_message_types(message_data)
...
...
... # #### Get Message Labels
...
... # We extract message type codes and names so we can later make the results more readable.
...
... # In[64]:
...
...
... message_labels = (message_types.loc[:, ['message_type', 'notes']]
... .dropna()
... .rename(columns={'notes': 'name'}))
... message_labels.name = (message_labels.name
... .str.lower()
... .str.replace('message', '')
... .str.replace('.', '')
... .str.strip().str.replace(' ', '_'))
... # message_labels.to_csv('message_labels.csv', index=False)
... message_labels.head()
...
...
... # ### Finalize specification details
...
... # Each message consists of several fields that are defined by offset, length and type of value. The `struct` module will use this format information to parse the binary source data.
...
... # In[65]:
...
...
... message_types.message_type = message_types.message_type.ffill()
... message_types = message_types[message_types.name != 'message_type']
... message_types.value = (message_types.value
... .str.lower()
... .str.replace(' ', '_')
... .str.replace('(', '')
... .str.replace(')', ''))
... message_types.info()
...
...
... # In[68]:
...
...
... message_types.head()
...
...
... # Optionally, persist/reload from file:
...
... # In[67]:
...
...
... message_types.to_csv('message_types.csv', index=False)
... message_types = pd.read_csv('message_types.csv')
...
...
... # The parser translates the message specs into format strings and namedtuples that capture the message content. First, we create `(type, length)` formatting tuples from ITCH specs:
...
... # In[72]:
...
...
... message_types.loc[:, 'formats'] = (message_types[['value', 'length']]
... .apply(tuple, axis=1).map(formats))
...
...
... # Then, we extract formatting details for alphanumerical fields
...
... # In[73]:
...
...
... alpha_fields = message_types[message_types.value == 'alpha'].set_index('name')
... alpha_msgs = alpha_fields.groupby('message_type')
... alpha_formats = {k: v.to_dict() for k, v in alpha_msgs.formats}
... alpha_length = {k: v.add(5).to_dict() for k, v in alpha_msgs.length}
...
...
... # We generate message classes as named tuples and format strings
...
... # In[74]:
...
...
... message_fields, fstring = {}, {}
... for t, message in message_types.groupby('message_type'):
...     message_fields[t] = namedtuple(typename=t, field_names=message.name.tolist())
...     fstring[t] = '>' + ''.join(message.formats.tolist())
...
...
... # Fields of `alpha` type (alphanumeric) require post-processing as defined in the `format_alpha` function:
...
... # In[75]:
...
...
... def format_alpha(mtype, data):
...     """Process byte strings of type alpha"""
...     for col in alpha_formats.get(mtype).keys():
...         if mtype != 'R' and col == 'stock':
...             data = data.drop(col, axis=1)
...             continue
...         data.loc[:, col] = data.loc[:, col].str.decode("utf-8").str.strip()
...         if encoding.get(col):
...             data.loc[:, col] = data.loc[:, col].map(encoding.get(col))
...     return data
...
...
... # ## Process Binary Message Data
...
... # The binary file for a single day contains over 350,000,000 messages amounting to more than 12 GB.
...
... # In[76]:
...
...
... def store_messages(m):
...     """Handle occasional storing of all messages"""
...     with pd.HDFStore(itch_store) as store:
...         for mtype, data in m.items():
...             # convert to DataFrame
...             data = pd.DataFrame(data)
...
...             # parse timestamp info
...             data.timestamp = data.timestamp.apply(int.from_bytes, byteorder='big')
...             data.timestamp = pd.to_timedelta(data.timestamp)
...
...             # apply alpha formatting
...             if mtype in alpha_formats.keys():
...                 data = format_alpha(mtype, data)
...
...             s = alpha_length.get(mtype)
...             if s:
...                 s = {c: s.get(c) for c in data.columns}
...             dc = ['stock_locate']
...             if mtype == 'R':
...                 dc.append('stock')
...             store.put(mtype,
...                       data,
...                       format='t',
...                       min_itemsize=s,
...                       data_columns=dc)
...
...
... # In[77]:
...
...
... messages = {}
... message_count = 0
... message_type_counter = Counter()
...
...
... # To stay within memory constraints, the script iteratively appends the parsed results to a file in the fast HDF5 format using the `store_messages()` function we just defined (see the last section in chapter 2 for more on this format).
...
... # The following (simplified) code processes the binary file and produces the parsed orders stored by message type:
...
... # In[78]:
...
...
... start = time()
... with file_name.open('rb') as data:
...     while True:
...
...         # determine message size in bytes
...         message_size = int.from_bytes(data.read(2), byteorder='big', signed=False)
...
...         # get message type by reading first byte
...         message_type = data.read(1).decode('ascii')
...
...         # create data structure to capture result
...         if not messages.get(message_type):
...             messages[message_type] = []
...
...         message_type_counter.update([message_type])
...
...         # read & store message
...         record = data.read(message_size - 1)
...         message = message_fields[message_type]._make(unpack(fstring[message_type], record))
...         messages[message_type].append(message)
...
...         # deal with system events
...         if message_type == 'S':
...             timestamp = int.from_bytes(message.timestamp, byteorder='big')
...             print('\n', event_codes.get(message.event_code.decode('ascii'), 'Error'))
...             print('\t{0}\t{1:,.0f}'.format(timedelta(seconds=timestamp * 1e-9),
...                                            message_count))
...             if message.event_code.decode('ascii') == 'C':
...                 store_messages(messages)
...                 break
...
...         message_count += 1
...         if message_count % 2.5e7 == 0:
...             timestamp = int.from_bytes(message.timestamp, byteorder='big')
...             print('\t{0}\t{1:,.0f}\t{2}'.format(timedelta(seconds=timestamp * 1e-9),
...                                                 message_count,
...                                                 timedelta(seconds=time() - start)))
...             store_messages(messages)
...             messages = {}
...
...
... print(timedelta(seconds=time() - start))
...
...
... # ## Summarize Trading Day
...
... # ### Trading Message Frequency
...
... # In[79]:
...
...
... counter = pd.Series(message_type_counter).to_frame('# Trades')
... counter['Message Type'] = counter.index.map(message_labels.set_index('message_type').name.to_dict())
... counter = counter[['Message Type', '# Trades']].sort_values('# Trades', ascending=False)
... print(counter)
...
...
... # In[81]:
...
...
... with pd.HDFStore(itch_store) as store:
...     store.put('summary', counter)
...
...
... # ### Top Equities by Traded Value
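The section title suggests ranking stocks by traded value, i.e., shares times price. A minimal sketch of that computation (using a toy DataFrame here; in the notebook the same columns would come from the stored 'P' trade messages, with `price_4` values carrying four implied decimals):

```python
import pandas as pd

# Rank stocks by traded value = shares * price. Toy data stands in for the
# trade ('P') messages stored above; prices are fixed-point with 4 decimals.
trades = pd.DataFrame({'stock': ['TSLA', 'AAPL', 'TSLA'],
                       'shares': [100, 200, 50],
                       'price': [2_950_000, 1_750_000, 2_951_000]})
trades['value'] = trades.shares * trades.price / 1e4
by_value = trades.groupby('stock')['value'].sum().sort_values(ascending=False)
print(by_value)
```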
...
C:\Users\jloss\venv\ITCH50parser\lib\site-packages\pandas\core\generic.py:5208: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
self[name] = value
<class 'pandas.core.frame.DataFrame'>
Int64Index: 152 entries, 1 to 172
Data columns (total 6 columns):
name 152 non-null object
offset 152 non-null int64
length 152 non-null int64
value 152 non-null object
notes 152 non-null object
message_type 152 non-null object
dtypes: int64(2), object(4)
memory usage: 8.3+ KB
Start of Messages
3:03:59.687761 0
Start of System Hours
4:00:00.000181 219,799
Start of Market Hours
9:30:00.000036 10,532,163
9:39:49.689879 25,000,000 0:01:19.884404
10:01:44.569840 50,000,000 0:04:43.833597
10:26:57.655610 75,000,000 0:07:58.918175
10:55:11.316923 100,000,000 0:11:11.258512
11:23:10.732310 125,000,000 0:14:24.130732
11:57:40.768604 150,000,000 0:17:37.293754
12:36:14.416343 175,000,000 0:20:49.210650
13:22:12.680450 200,000,000 0:24:02.729754
14:00:58.959369 225,000,000 0:27:20.667001
14:20:21.253174 250,000,000 0:30:37.013699
14:42:13.639272 275,000,000 0:33:49.794699
15:05:20.251304 300,000,000 0:37:03.289516
15:30:15.801362 325,000,000 0:40:24.247659
15:53:07.026351 350,000,000 0:43:42.851925
End of Market Hours
16:00:00.000113 365,323,584
End of System Hours
20:00:00.000021 368,335,086
End of Messages
20:05:00.000034 368,366,633
0:48:10.344354
Message Type # Trades
A add_order_no_mpid_attribution 162970455
D order_delete 158273361
U order_replace 27222746
E order_executed 8096995
X order_cancel 4669874
I noii 3684511
F add_order_mpid_attribution 1725898
P trade 1326184
L market_participant_position 193769
C order_executed_with_price 158886
Q cross_trade 17430
Y reg_sho_short_sale_price_test_restricted_indic... 8821
H stock_trading_action 8805
R stock_directory 8714
B broken_trade 116
J luld_auction_collar 62
S system_event 6
V market_wide_circuit_breaker_decline_level 1
>>> from main import *
...
...
... # build order book flow for the given day
... stock = 'TSLA'
... order_dict = {-1: 'sell', 1: 'buy'}
...
... # get all messages for the chosen stock
... def get_messages(date, stock=stock):
...     with pd.HDFStore(itch_store) as store:
...         stock_locate = store.select('R', where='stock = stock').stock_locate.iloc[0]
...         target = 'stock_locate = stock_locate'
...
...         data = {}
...         trading_msgs = ['A', 'F', 'E', 'C', 'X', 'D', 'U', 'P', 'Q']
...         for msg in trading_msgs:
...             data[msg] = store.select(msg, where=target).drop('stock_locate', axis=1).assign(type=msg)
...
...     # order attributes shared across message types
...     order_cols = ['order_reference_number', 'buy_sell_indicator', 'shares', 'price']
...
...     # 'A' and 'F' are Add Order messages (without and with MPID attribution)
...     orders = pd.concat([data['A'], data['F']], sort=False, ignore_index=True).loc[:, order_cols]
...
...     # merge order details into execution, cancel and delete messages
...     for msg in trading_msgs[2:-3]:  # ['E', 'C', 'X', 'D']
...         data[msg] = data[msg].merge(orders, how='left')
...
...     # 'U': an order on the book has been cancel-replaced
...     data['U'] = data['U'].merge(orders, how='left',
...                                 right_on='order_reference_number',
...                                 left_on='original_order_reference_number',
...                                 suffixes=['', '_replaced'])
...
...     # cross trade messages
...     data['Q'].rename(columns={'cross_price': 'price'}, inplace=True)
...
...     # order cancel messages
...     data['X']['shares'] = data['X']['cancelled_shares']
...     data['X'] = data['X'].dropna(subset=['price'])
...
...     data = pd.concat([data[msg] for msg in trading_msgs],
...                      ignore_index=True,
...                      sort=False)
...
...     data['date'] = pd.to_datetime(date, format='%m%d%Y')
...     data.timestamp = data['date'].add(data.timestamp)
...     data = data[data.printable != 0]
...
...     drop_cols = ['tracking_number', 'order_reference_number', 'original_order_reference_number',
...                  'cross_type', 'new_order_reference_number', 'attribution', 'match_number',
...                  'printable', 'date', 'cancelled_shares']
...     return data.drop(drop_cols, axis=1).sort_values('timestamp').reset_index(drop=True)
...
...
... messages = get_messages(date = date)
... messages.info(null_counts = True)
...
... with pd.HDFStore(order_book_store) as store:
...     key = '{}/messages'.format(stock)
...     store.put(key, messages)
...     print(store.info())
...
... # combine trade orders (reconstruct successful trades)
... def get_trades(msg):
...     """Combine C, E, P and Q messages into trading records"""
...     trade_dict = {'executed_shares': 'shares', 'execution_price': 'price'}
...     cols = ['timestamp', 'executed_shares']
...     trades = pd.concat([msg.loc[msg.type == 'E', cols + ['price']].rename(columns=trade_dict),
...                         msg.loc[msg.type == 'C', cols + ['execution_price']].rename(columns=trade_dict),
...                         msg.loc[msg.type == 'P', ['timestamp', 'price', 'shares']],
...                         msg.loc[msg.type == 'Q', ['timestamp', 'price', 'shares']].assign(cross=1),
...                         ], sort=False).dropna(subset=['price']).fillna(0)
...     return trades.set_index('timestamp').sort_index().astype(int)
...
...
... trades = get_trades(messages)
... print(trades.info())
...
... with pd.HDFStore(order_book_store) as store:
...     store.put('{}/trades'.format(stock), trades)
...
... # create orders - accumulate sell orders in ascending and buy orders in desc. order for given timestamps
... def add_orders(orders, buysell, nlevels):
...     new_order = []
...     items = sorted(orders.copy().items())
...     if buysell == 1:
...         items = reversed(items)
...     for i, (p, s) in enumerate(items, 1):
...         new_order.append((p, s))
...         if i == nlevels:
...             break
...     return orders, new_order
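A quick self-contained check of this logic on a toy buy-side book (prices in fixed-point, as elsewhere in this script; the function body repeats the one above so the snippet runs on its own):

```python
from collections import Counter

# Same logic as add_orders above: keep the best nlevels price levels,
# highest price first for buys (buysell == 1), lowest first for sells.
def add_orders(orders, buysell, nlevels):
    new_order = []
    items = sorted(orders.copy().items())
    if buysell == 1:
        items = reversed(items)
    for i, (p, s) in enumerate(items, 1):
        new_order.append((p, s))
        if i == nlevels:
            break
    return orders, new_order

buys = Counter({2950000: 100, 2949900: 250, 2950100: 50})
_, top = add_orders(buys, buysell=1, nlevels=2)
print(top)  # -> [(2950100, 50), (2950000, 100)]
```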
...
... # save orders
... def save_orders(orders, append=False):
...     cols = ['price', 'shares']
...     for buysell, book in orders.items():
...         df = (pd.concat([pd.DataFrame(data=data, columns=cols).assign(timestamp=t)
...                          for t, data in book.items()]))
...         key = '{}/{}'.format(stock, order_dict[buysell])
...         df.loc[:, ['price', 'shares']] = df.loc[:, ['price', 'shares']].astype(int)
...         with pd.HDFStore(order_book_store) as store:
...             if append:
...                 store.append(key, df.set_index('timestamp'), format='t')
...             else:
...                 store.put(key, df.set_index('timestamp'))
...
... ## iterate over all ITCH msgs to process orders/replacement orders as specified:
... order_book = {-1:{}, 1:{}}
... current_orders = {-1: Counter(), 1: Counter()}
... message_counter = Counter()
... nlevels = 100
...
... start = time()
... for msg in messages.itertuples():
...     i = msg[0]
...     if i % 1e5 == 0 and i > 0:
...         print('{:,.0f}\t\t{}'.format(i, timedelta(seconds=time() - start)))
...         save_orders(order_book, append=True)
...         order_book = {-1: {}, 1: {}}
...         start = time()
...     if np.isnan(msg.buy_sell_indicator):
...         continue
...     message_counter.update(msg.type)
...
...     buysell = msg.buy_sell_indicator
...     price, shares = None, None
...
...     if msg.type in ['A', 'F', 'U']:
...         price = int(msg.price)
...         shares = int(msg.shares)
...         current_orders[buysell].update({price: shares})
...         current_orders[buysell], new_order = add_orders(current_orders[buysell], buysell, nlevels)
...         order_book[buysell][msg.timestamp] = new_order
...
...     if msg.type in ['E', 'C', 'X', 'D', 'U']:
...         if msg.type == 'U':
...             if not np.isnan(msg.shares_replaced):
...                 price = int(msg.price_replaced)
...                 shares = -int(msg.shares_replaced)
...         else:
...             if not np.isnan(msg.price):
...                 price = int(msg.price)
...                 shares = -int(msg.shares)
...         if price is not None:
...             current_orders[buysell].update({price: shares})
...             if current_orders[buysell][price] <= 0:
...                 current_orders[buysell].pop(price)
...             current_orders[buysell], new_order = add_orders(current_orders[buysell], buysell, nlevels)
...             order_book[buysell][msg.timestamp] = new_order
...
...
... message_counter = pd.Series(message_counter)
... print(message_counter)
...
... with pd.HDFStore(order_book_store) as store:
...     print(store.info())
...
...
...
...
...
...
...
...
...
...
...
...
...
...
message_type name
0 S system_event
5 R stock_directory
23 H stock_trading_action
31 Y reg_sho_short_sale_price_test_restricted_indic...
37 L market_participant_position
<class 'pandas.core.frame.DataFrame'>
Int64Index: 152 entries, 1 to 172
Data columns (total 6 columns):
name 152 non-null object
offset 152 non-null int64
length 152 non-null int64
value 152 non-null object
notes 152 non-null object
message_type 152 non-null object
dtypes: int64(2), object(4)
memory usage: 8.3+ KB
('\n', None)
('\n', name offset length value notes message_type
1 stock_locate 1 2 integer Always 0 S)
('\n', 'Start of Messages')
3:03:59.687761 0
('\n', 'Start of System Hours')
4:00:00.000181 219,799
('\n', 'Start of Market Hours')
9:30:00.000036 10,532,163
9:39:49.689879 25,000,000 0:01:29.215000
10:01:44.569840 50,000,000 0:03:02.431030
10:26:57.655610 75,000,000 0:04:36.968581
10:55:11.316923 100,000,000 0:06:12.014789
11:23:10.732310 125,000,000 0:07:47.788707
11:57:40.768604 150,000,000 0:09:26.672726
12:36:14.416343 175,000,000 0:11:01.649949
13:22:12.680450 200,000,000 0:12:43.246565
14:00:58.959369 225,000,000 0:14:22.839424
14:20:21.253174 250,000,000 0:15:57.240454
14:42:13.639272 275,000,000 0:17:33.944669
15:05:20.251304 300,000,000 0:19:13.228699
15:30:15.801362 325,000,000 0:20:50.367699
15:53:07.026351 350,000,000 0:22:23.574699
('\n', 'End of Market Hours')
16:00:00.000113 365,323,584
('\n', 'End of System Hours')
20:00:00.000021 368,335,086
('\n', 'End of Messages')
20:05:00.000034 368,366,633
0:23:46.911519
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 88335 entries, 0 to 88334
Data columns (total 9 columns):
timestamp 88335 non-null datetime64[ns]
buy_sell_indicator 68612 non-null float64
shares 70208 non-null float64
price 70208 non-null float64
type 88335 non-null object
executed_shares 8420 non-null float64
execution_price 7 non-null float64
shares_replaced 37 non-null float64
price_replaced 37 non-null float64
dtypes: datetime64[ns](1), float64(7), object(1)
memory usage: 6.1+ MB
<class 'pandas.io.pytables.HDFStore'>
File path: C:\Users\jloss\PyCharmProjects\NASDAQ-ITCH-5.0-VWAP-PARSER\data\order_book.h5
/TSLA/messages frame (shape->[88335,9])
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 13172 entries, 2019-01-30 15:53:07.518469637 to 2019-01-30 19:59:58.535897223
Data columns (total 3 columns):
shares 13172 non-null int32
price 13172 non-null int32
cross 13172 non-null int32
dtypes: int32(3)
memory usage: 257.3 KB
None
A 30103
P 5375
E 7789
D 25142
X 130
F 29
U 37
C 7
dtype: int64
<class 'pandas.io.pytables.HDFStore'>
File path: C:\Users\jloss\PyCharmProjects\NASDAQ-ITCH-5.0-VWAP-PARSER\data\order_book.h5
/TSLA/messages frame (shape->[88335,9])
/TSLA/trades frame (shape->[13172,3])