Skip to content

Instantly share code, notes, and snippets.

@jingningzhang1
Last active June 24, 2022 01:05
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save jingningzhang1/e2f51d0e74fb9114347804e7058eaab4 to your computer and use it in GitHub Desktop.
Save jingningzhang1/e2f51d0e74fb9114347804e7058eaab4 to your computer and use it in GitHub Desktop.
from sqlalchemy.engine import create_engine
## Set up SQL Alchemy engine
engine = create_engine(
f"databricks+connector://token:{token}@{host_name}:443/{database}",
connect_args={
"http_path": http_path,
},
)
## Get a full table with SQL Alchemy
sensor_table = Table("gold_all_sensors", MetaData(bind=engine), autoload=True)
## Get some distinct values on app load for filters
stmt = select([sensor_table.columns.SensorMeasurement]).distinct()
results = engine.execute(stmt).fetchall()
measurement_options = [{'label': re.sub("[(),']", "", str(i)), 'value':re.sub("[(),']", "", str(i))} for i in results]
## Use in a function
## ORM-based SQL Query with dynamic filters in the callback
stmt = select([
sensor_table.columns.MeasurementDateTime,
sensor_table.columns.LongMovingAverage,
sensor_table.columns.SensorLocation
]).where(and_(
sensor_table.columns.SensorMeasurement == input_value
))
## Read data via pandas or just raw Dict/array
## TIPS: Always try to push the filtering/complex logic down to the system where the most data is filtered
## minimize data brought to client
df = pd.read_sql_query(stmt, engine).sort_values(by=['MeasurementDateTime'])
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment