#!/usr/bin/env python
# Building high-frequency trading signals in Python with Databento and sklearn
#
# This is a simple example that demonstrates how to build high-frequency trading signals in Python,
# using order book and market depth data from [Databento](https://databento.com) together with
# machine learning models from [sklearn](https://scikit-learn.org/).
import databento as db
import numpy as np
import databento as db
client = db.Historical()
data = client.timeseries.get_range(
    dataset='GLBX.MDP3',
    schema='definition',
    symbols='ALL_SYMBOLS',
    start='2023-12-27',
@databento-bot
databento-bot / get_cme_options_statistics.py
Last active January 24, 2024 16:41
Gets official open interest and volume from CME options data using Databento
#!/usr/bin/env python
# Gets official open interest and volume from CME options data using Databento
#
# CME has different parent symbols, such as E[1-4][A-D], EW, ES, LO[1-4][A-D], and LO.
# Databento provides these as E1A.OPT, E2A.OPT, ..., respectively.
#
# However, fetching these one at a time can be tedious. Another way to fetch
# all options of interest is to request `ALL_SYMBOLS` yourself and
# take advantage of the 'group' column in the instrument definitions.
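The filtering step described in the comments above might look roughly like the sketch below. It runs on a small hand-made DataFrame rather than a real `definition` response, so the rows, the `GROUPS_OF_INTEREST` set, and the `select_options` helper are all hypothetical. It also assumes that `instrument_class` marks calls and puts as 'C' and 'P', and that `group` holds the parent root; check both against the actual definitions you receive.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame you would get from a `definition`
# schema request; every row here is made up for illustration.
definitions = pd.DataFrame(
    {
        "raw_symbol": ["E1AF4 C4800", "EWF4 P4700", "LOG4 C7500", "ESH4"],
        "instrument_class": ["C", "P", "C", "F"],
        "group": ["E1A", "EW", "LO", "ES"],
    }
)

# Parent groups of interest (assumed values, not taken from the gist).
GROUPS_OF_INTEREST = {"E1A", "EW", "LO"}


def select_options(df, groups):
    """Keep rows that are calls or puts and whose 'group' is in `groups`."""
    is_option = df["instrument_class"].isin(["C", "P"])
    in_group = df["group"].isin(groups)
    return df[is_option & in_group]


options = select_options(definitions, GROUPS_OF_INTEREST)
print(options)
```

Filtering on the client side like this trades one larger request for many small ones, which is the convenience the comments point to.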
databento-bot / markouts.py
Last active February 27, 2024 10:43
Demonstrate adverse selection and market impact of aggressive/passive limit orders in US equities
import databento as db
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
DATE = pd.Timestamp(year=2023, month=6, day=22, tz='US/Eastern')
NUM_TIME_SAMPLES = 1000
SYMBOL = 'NVDA'
WINDOW_LIMITS_US = 120 * 1e6
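The preview stops before any markout is computed. As a rough illustration of the idea, the sketch below uses one common definition: the signed difference between the midprice some horizon after a trade and the trade price. The data, the `compute_markouts` helper, and the sign convention are assumptions for illustration and are not taken from the gist.

```python
import numpy as np
import pandas as pd

# Made-up midprice series, for illustration only.
mids = pd.Series(
    [100.00, 100.02, 100.04, 100.03],
    index=pd.to_datetime(
        [
            "2023-06-22 09:30:00.000",
            "2023-06-22 09:30:00.001",
            "2023-06-22 09:30:00.002",
            "2023-06-22 09:30:00.003",
        ]
    ),
)

# Made-up trades: side is +1 for a buy and -1 for a sell.
trades = pd.DataFrame(
    {
        "ts": pd.to_datetime(
            ["2023-06-22 09:30:00.000", "2023-06-22 09:30:00.001"]
        ),
        "price": [100.01, 100.01],
        "side": [1, -1],
    }
)


def compute_markouts(trades, mids, horizon):
    """Return side * (mid at ts + horizon - trade price) for each trade."""
    targets = (trades["ts"] + horizon).to_numpy()
    # Position of the last midprice observed at or before each target time.
    pos = mids.index.searchsorted(targets, side="right") - 1
    mid_values = mids.to_numpy()
    # Trades with no midprice at or before the target time get NaN.
    future_mid = np.where(pos >= 0, mid_values[np.clip(pos, 0, None)], np.nan)
    return trades["side"].to_numpy() * (future_mid - trades["price"].to_numpy())


markouts = compute_markouts(trades, mids, pd.Timedelta(milliseconds=1))
print(markouts)
```

Under this convention a positive value means the price moved in favor of the side that traded, and a negative value means it moved against it.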
databento-bot / premarket.py
Created February 21, 2024 11:05
Example of getting largest premarket moves in US stocks with Databento
import databento as db
client = db.Historical()
data = client.timeseries.get_range(
    dataset='XNAS.ITCH',
    schema='ohlcv-1m',
    symbols='ALL_SYMBOLS',
    start='20230606T00:00',
    end='20230606T14:30',
)
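Once the one-minute bars are in a DataFrame, ranking the largest moves could be done along the lines of the sketch below. It uses a small made-up table instead of the real response, assumes `symbol`, `open`, and `close` columns with a timestamp index, and defines the move as last close over first open; the `premarket_moves` helper is hypothetical.

```python
import pandas as pd

# Made-up one-minute bars standing in for the real response.
bars = pd.DataFrame(
    {
        "symbol": ["AAA", "AAA", "BBB", "BBB", "CCC"],
        "open": [10.0, 10.5, 20.0, 19.0, 5.0],
        "close": [10.5, 11.0, 19.0, 17.0, 5.0],
    },
    index=pd.to_datetime(
        [
            "2023-06-06 12:00",
            "2023-06-06 12:01",
            "2023-06-06 12:00",
            "2023-06-06 12:01",
            "2023-06-06 12:00",
        ],
        utc=True,
    ),
)


def premarket_moves(df):
    """Return per-symbol fractional moves, largest absolute move first."""
    df = df.sort_index()
    grouped = df.groupby("symbol")
    first_open = grouped["open"].first()
    last_close = grouped["close"].last()
    move = (last_close / first_open - 1.0).rename("move")
    # Order symbols by the size of the move, regardless of direction.
    order = move.abs().sort_values(ascending=False).index
    return move.reindex(order)


moves = premarket_moves(bars)
print(moves)
```

Sorting by absolute value keeps large drops and large rallies in the same ranking; sort by the signed value instead if only one direction matters.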
databento-bot / energies.py
Created February 21, 2024 12:06
Example of fetching CME and ICE energy futures data on Databento
import databento as db
client = db.Historical()
# Heating oil, gasoline and crude futures on CME and ICE
# .FUT parent symbology notation fetches all expirations across
# all outrights and multi-legged spreads etc.
CME_SYMBOLS = ['HO.FUT', 'RB.FUT', 'CL.FUT']
ICE_SYMBOLS = ['UHO.FUT', 'UHU.FUT', 'BRN.FUT']
databento-bot / extended_hours.py
Created February 28, 2024 09:20
Example of getting largest moves in US stocks during extended hours with Databento
import datetime
import databento as db
import pandas as pd
def get_df_move(date: str = '2023-06-06'):
    end_dt = datetime.datetime.strptime(date + 'T16:00', '%Y-%m-%dT%H:%M')
# Reddit comment 2024-04-03
> See https://www.reddit.com/r/algotrading/comments/1bu59ql/comment/kxuil9a
There will always be some differences in the infrastructure a vendor uses to process real-time vs. historical data.
It takes a bit of effort to make these as identical as possible. Non-exhaustive list:
The most common issue I’ve seen is that the vendor will clean and patch their historical data ex post in ways
that are not replicable in real-time. (The most obvious tell is if you report a data error and they tell you it’s patched
within the same day.) This is one area where Bloomberg is quite good despite doing it the “wrong” way - they have a
databento-bot / databento-extract-cme.py
Created April 15, 2024 04:39
Fetch data of options on futures through Databento, using COMEX Gold (GC) and CBOT US 5-Year T-Note (ZF) and including weeklies, as an example
#!/usr/bin/env python
#
# databento-extract-cme.py
#
# Fetch data of options on futures through Databento,
# using COMEX Gold (GC) and CBOT US 5-Year T-Note (ZF) and
# including weeklies, as an example
import databento as db
import itertools
databento-bot / replay_snapshot.py
Last active April 30, 2024 23:24
Use intraday replay to synthetically generate a snapshot
#!/usr/bin/env python
#
# replay_snapshot.py
#
# As of Databento's Python client v0.33.0, intraday snapshots are not available via
# the historical (HTTP) API, so you can use intraday replay to generate those snapshots
# on client side. This is the recommended method until these two features are released:
#
# If no bar for a symbol is received in the first second, there will be no entry in the
# DataFrame for that symbol.