voberoi / README.md
Last active March 19, 2024 20:20
Prompts used for chapter extraction in citymeetings.nyc -- from my talk at NYC School of Data 2024

These are the prompts I use to extract chapters in citymeetings.nyc as of March 23rd, 2024 -- the date of my NYC School of Data talk.

To simplify things, I've removed all the code that stitches these prompts together and consolidated the common items from each step of my chapter extraction pipeline.

See the slides & talk for a description of how these work in concert and how I review and fix issues.

NOTE: these work reasonably well and save tons of time, but I haven't systematically evaluated or improved them yet in the same way I have my speaker identification prompt.

voberoi / README.md
Last active March 19, 2024 20:21
The prompt I use for speaker identification in citymeetings.nyc -- from my talk at NYC School of Data 2024
voberoi / gist:9fb7affa5e3d2ae3aaef7104aad8d37d
Last active August 10, 2022 17:20
Setting bit 128,000,000 in a Redis bitmap allocates ~16MB of RAM
# 1. Check the RSS of the running Redis instance.
# 2. Set the 128 millionth bit in a new bitmap keyed "test0"
# 3. Check the RSS again to see how much it's increased.
#
# Note that Redis will allocate as much memory as it needs to set
# the given bit. Even though we don't set any bit prior to the 128
# millionth, it will allocate ~16MB RAM to set that bit.
$ ps -axm -o rss,comm | grep redis-server
6800 redis-server *:6379
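
To round out steps 2 and 3, here is a minimal sketch using redis-py rather than the shell session above. The client library, the local host/port, and the INFO-based memory check are my assumptions, not part of the gist; the key "test0" and the ~16MB figure come from it (128,000,000 bits / 8 bits per byte = 16,000,000 bytes).

# Sketch of steps 2 and 3 with redis-py. Assumes a local Redis on the
# default port and that "test0" does not exist yet.
import redis

r = redis.Redis(host="localhost", port=6379)

before = r.info("memory")["used_memory"]

# Step 2: set bit 128,000,000 (a 0-indexed offset). Bitmaps are plain
# Redis strings, so Redis allocates every byte up to that offset:
# 128,000,000 bits / 8 bits per byte = 16,000,000 bytes, i.e. ~16MB.
r.setbit("test0", 128_000_000, 1)

# Step 3: see how much memory that single SETBIT cost.
after = r.info("memory")["used_memory"]
print(f"bitmap size: {r.strlen('test0'):,} bytes")             # ~16,000,001
print(f"used_memory grew by {(after - before) / 1e6:.1f} MB")  # ~16 MB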
 count   type
  3730     13
     1     17
  2121     19
 18910     20
   377     21
  3061     22
 31762     23
693040     24

Keybase proof

I hereby claim:

  • I am voberoi on github.
  • I am voberoi (https://keybase.io/voberoi) on keybase.
  • I have a public key ASDiZ6Ku8OXPKh4BCEo4ymwGtEZT7qlaMB-GTHAHxgidbQo

To claim this, I am signing this object:

voberoi / foo_dag.py
Created June 18, 2018 20:47
Cloud Composer Dependency Issue (June 2018)
import airflow
from airflow.operators.bash_operator import BashOperator
from airflow.models import DAG

from foo_dep import print_something  # local module -- the dependency in question

args = {"owner": "airflow", "start_date": airflow.utils.dates.days_ago(2)}
dag = DAG(dag_id="foo", default_args=args, schedule_interval="0 0 * * *")  # daily at midnight

a_bash_cmd = BashOperator(task_id="a_bash_cmd", bash_command="echo 1", dag=dag)
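
foo_dag.py imports print_something from a sibling module, foo_dep, which isn't shown here. A minimal sketch of what that module could contain -- only the module and function names come from the import above; the body is assumed:

# foo_dep.py -- hypothetical contents; only the names foo_dep and
# print_something are taken from the import in foo_dag.py.
def print_something():
    print("something")

Presumably the full DAG also wires print_something into a task, which is where importing a local module on Cloud Composer becomes the dependency issue the title refers to.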
## <project root>/ops/deploy.sh
PROJECT_ROOT="$(git rev-parse --show-toplevel)"
VERSION_FILE="$PROJECT_ROOT/atari_archive/VERSION"

# <rest of deploy script>

# Create version file based on tag or branch being deployed.
VERSION_STRING="$TAG_OR_BRANCH"
echo ">>>>> Creating VERSION file containing '$VERSION_STRING'..."