
@slopp
slopp / README.md
Created January 2, 2025 22:37
LangGraph Exploration

A modification of the LangChain SQL Q&A tutorial https://python.langchain.com/docs/tutorials/sql_qa/.

The changes are:

  • uses Pydantic to type the state and inputs/outputs
  • uses DuckDB on the Palmer Penguins dataset
  • uses an Nvidia NIM for the LLM
  • instead of the sequence write_query -> run_query -> gen_answer, this graph adds an LLM node that checks the write_query output for validity and for whether it actually answers the question, leading to a more dynamic graph that looks like this:

(graph diagram omitted)
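A minimal sketch of how such a graph might be wired in LangGraph, assuming a Pydantic state model with a query_ok flag set by the checking node (node bodies, prompts, and the actual LLM and DuckDB calls are omitted; the names are assumptions):

from pydantic import BaseModel
from langgraph.graph import StateGraph, START, END

class State(BaseModel):
    question: str
    query: str = ""
    query_ok: bool = False
    result: str = ""
    answer: str = ""

def write_query(state: State) -> dict:
    # LLM call that drafts SQL against the DuckDB penguins table (omitted)
    return {"query": "SELECT ..."}

def check_query(state: State) -> dict:
    # LLM call that validates the SQL and checks it answers the question (omitted)
    return {"query_ok": True}

def run_query(state: State) -> dict:
    # execute the SQL against DuckDB (omitted)
    return {"result": "..."}

def gen_answer(state: State) -> dict:
    # LLM call that turns the query result into a natural-language answer (omitted)
    return {"answer": "..."}

def route_after_check(state: State) -> str:
    # loop back to write_query when the checker rejects the draft SQL
    return "run_query" if state.query_ok else "write_query"

builder = StateGraph(State)
builder.add_node("write_query", write_query)
builder.add_node("check_query", check_query)
builder.add_node("run_query", run_query)
builder.add_node("gen_answer", gen_answer)
builder.add_edge(START, "write_query")
builder.add_edge("write_query", "check_query")
builder.add_conditional_edges("check_query", route_after_check, ["write_query", "run_query"])
builder.add_edge("run_query", "gen_answer")
builder.add_edge("gen_answer", END)
graph = builder.compile()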

@slopp
slopp / README.md
Last active January 2, 2025 22:32
Simple LLM agent to recommend fake coffee shops via tool calling
@slopp
slopp / README.md
Last active January 2, 2025 22:32
Code for creating a RAG chatbot based on theradavist
@slopp
slopp / README.md
Created December 31, 2024 00:10
AI Coffee Shop Streamlit App

This simple Streamlit app uses the Google Maps and Places APIs, along with a hosted Nvidia NIM wrapper of the Llama model, to help you find coffee shops near an address.

(App screenshot omitted)

To run:

  1. Install dependencies
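The rest of the instructions are cut off in this preview. For context, a rough sketch of the geocode-then-search flow the description implies, using the googlemaps client inside Streamlit (the secret name, search radius, and output formatting are assumptions, and the NIM/Llama summarization step is omitted):

import googlemaps
import streamlit as st

gmaps = googlemaps.Client(key=st.secrets["GOOGLE_MAPS_API_KEY"])  # assumed secret name

address = st.text_input("Enter an address")
if address:
    # geocode the address, then search nearby places for coffee shops
    location = gmaps.geocode(address)[0]["geometry"]["location"]
    nearby = gmaps.places_nearby(location=location, radius=1500, keyword="coffee")
    for place in nearby.get("results", []):
        st.write(f"{place['name']} - {place.get('vicinity', '')}")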
@slopp
slopp / ReadMe.md
Created December 23, 2024 20:17
Torch Experiment

Torch Experiment

Goal

  • First attempt at using Torch for some type of "deep" learning
  • Take advantage of Modal to access serverless Python compute, including GPUs (see the sketch below)
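
A minimal sketch of what a Torch computation dispatched to Modal's serverless GPUs could look like (the app name, GPU type, and the toy computation are assumptions, not the experiment's actual code):

import modal

app = modal.App("torch-experiment")
image = modal.Image.debian_slim().pip_install("torch")

@app.function(gpu="T4", image=image)
def train_step() -> float:
    import torch
    # toy "training" step: one forward/backward pass on random data
    x = torch.randn(1024, 1024, device="cuda")
    w = torch.randn(1024, 1024, device="cuda", requires_grad=True)
    loss = (x @ w).pow(2).mean()
    loss.backward()
    return loss.item()

@app.local_entrypoint()
def main():
    print(train_step.remote())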

Approach

@slopp
slopp / README.md
Created February 27, 2024 16:14
SQL Server to GCS to BQ Dagster Pipeline Example

This example shows a skeleton for how to build a Dagster project that extracts tables from SQL Server, stores the extract as a CSV in GCS, and then uploads the GCS extract to BigQuery.

The actual extract and load logic is omitted; the purpose of this project is to show how such a pipeline can be represented as Dagster assets.

First, a single pipeline for one table is created. This is demonstrated in the file dagster_mock_one_table.py. To run this example:

  1. Create a Python virtual environment and then run:
pip install dagster dagster-webserver
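The file itself isn't shown in this preview, but a minimal sketch of how the three hops for a single table could be expressed as Dagster assets looks like this (asset names and paths are hypothetical; the real extract and load calls stay omitted, as in the project):

from dagster import Definitions, asset

@asset
def sqlserver_extract() -> str:
    # query the table from SQL Server and write it to a local CSV (omitted)
    return "/tmp/my_table.csv"

@asset
def gcs_csv(sqlserver_extract: str) -> str:
    # upload the local CSV to a GCS bucket (omitted)
    return "gs://my-bucket/my_table.csv"

@asset
def bq_table(gcs_csv: str) -> None:
    # load the GCS CSV into a BigQuery table (omitted)
    ...

defs = Definitions(assets=[sqlserver_extract, gcs_csv, bq_table])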
@slopp
slopp / penguins.csv
Created March 31, 2021 20:07
Palmer Penguins Dataset as CSV
rowid,species,island,bill_length_mm,bill_depth_mm,flipper_length_mm,body_mass_g,sex,year
1,Adelie,Torgersen,39.1,18.7,181,3750,male,2007
2,Adelie,Torgersen,39.5,17.4,186,3800,female,2007
3,Adelie,Torgersen,40.3,18,195,3250,female,2007
4,Adelie,Torgersen,NA,NA,NA,NA,NA,2007
5,Adelie,Torgersen,36.7,19.3,193,3450,female,2007
6,Adelie,Torgersen,39.3,20.6,190,3650,male,2007
7,Adelie,Torgersen,38.9,17.8,181,3625,female,2007
8,Adelie,Torgersen,39.2,19.6,195,4675,male,2007
9,Adelie,Torgersen,34.1,18.1,193,3475,NA,2007
@slopp
slopp / example.py
Created September 24, 2024 21:42
IO Manager that Depends on Resource
from typing import Any
from dagster import ConfigurableResource, ConfigurableIOManager, InputContext, OutputContext, asset, Definitions, ResourceDependency, EnvVar
from pydantic import Field
# https://docs.dagster.io/concepts/resources#resources-that-depend-on-other-resources
class myResource(ConfigurableResource):
    username: str = Field(description="the username")
    password: str = Field(description="the password")
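
The snippet is truncated here; one plausible continuation, assuming the IO manager consumes the resource through a ResourceDependency field (class, field, and environment variable names are assumptions):

class MyIOManager(ConfigurableIOManager):
    # the IO manager depends on the resource above
    creds: ResourceDependency[myResource]

    def handle_output(self, context: OutputContext, obj: Any) -> None:
        context.log.info(f"storing output with credentials for {self.creds.username}")

    def load_input(self, context: InputContext) -> Any:
        return None

@asset
def my_asset() -> str:
    return "hello"

defs = Definitions(
    assets=[my_asset],
    resources={
        "io_manager": MyIOManager(
            creds=myResource(username=EnvVar("MY_USERNAME"), password=EnvVar("MY_PASSWORD"))
        )
    },
)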
@slopp
slopp / job.py
Created September 23, 2024 19:31
Canary Ping Dagster
import os
from dagster import define_asset_job, load_assets_from_package_module, repository, with_resources, op, job, ScheduleDefinition
from my_dagster_project import assets
from datadog_api_client import ApiClient, Configuration
from datadog_api_client.v2.api.metrics_api import MetricsApi
from datadog_api_client.v2.model.metric_intake_type import MetricIntakeType
from datadog_api_client.v2.model.metric_payload import MetricPayload
from datadog_api_client.v2.model.metric_point import MetricPoint
from datetime import datetime
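
The preview cuts off at the imports; a hedged sketch of how they might fit together is an op that submits a heartbeat metric to Datadog, wrapped in a job with a schedule (the metric name, gauge value, and cron string are assumptions):

from datadog_api_client.v2.model.metric_series import MetricSeries

@op
def ping_canary() -> None:
    # Configuration() picks up DD_API_KEY / DD_SITE from the environment
    configuration = Configuration()
    with ApiClient(configuration) as api_client:
        payload = MetricPayload(
            series=[
                MetricSeries(
                    metric="dagster.canary.ping",
                    type=MetricIntakeType.GAUGE,
                    points=[MetricPoint(timestamp=int(datetime.now().timestamp()), value=1.0)],
                )
            ]
        )
        MetricsApi(api_client).submit_metrics(body=payload)

@job
def canary_job():
    ping_canary()

canary_schedule = ScheduleDefinition(job=canary_job, cron_schedule="*/15 * * * *")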
@slopp
slopp / README.md
Last active September 21, 2024 10:25
Dynamic pipeline that invokes k8s ops

Dynamic Pipeline

This example shows the pseudo-code for a Dagster pipeline that:

  1. Accepts the path to a raw dataset as a string
  2. Runs a step to break the raw dataset into partitions
  3. For each partition, runs a series of two processing steps. Each step calls out to a Docker container, supplying the partition key as an input argument. The partitions run in parallel before being collected in a final processing step that operates on all the partitions (see the sketch after this list).
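
A sketch of how this shape can be expressed with Dagster dynamic outputs (op names, the partitioning logic, and the per-partition work are placeholders; the actual Docker/k8s invocation is omitted):

from dagster import DynamicOut, DynamicOutput, job, op

@op(config_schema={"raw_path": str})
def get_raw_path(context) -> str:
    return context.op_config["raw_path"]

@op(out=DynamicOut(str))
def split_into_partitions(raw_path: str):
    # placeholder partitioning: yield one dynamic output per partition key
    for key in ["part_a", "part_b"]:
        yield DynamicOutput(key, mapping_key=key)

@op
def process_step_one(partition_key: str) -> str:
    # call out to a Docker container with the partition key as an argument (omitted)
    return partition_key

@op
def process_step_two(partition_key: str) -> str:
    # second containerized processing step (omitted)
    return partition_key

@op
def combine(results: list) -> None:
    # final step that operates on all processed partitions
    ...

@job
def dynamic_pipeline():
    partitions = split_into_partitions(get_raw_path())
    processed = partitions.map(lambda p: process_step_two(process_step_one(p)))
    combine(processed.collect())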

To run the pipeline: