@tkunstek
tkunstek / readme.txt
Created October 21, 2024 10:37
Free AI note taker
You can use whisper.cpp (https://github.com/ggerganov/whisper.cpp?tab=readme-ov-file) to take a generic recording, extract the text, and feed it to an LLM using RAG, all for free and in private.
Start by getting a meeting recording. I use Just Press Record on my Mac to record meetings.
The recording now needs to be converted to the format whisper.cpp expects: 16 kHz, mono, 16-bit PCM WAV. I used the command:
ffmpeg -i /Users/tkunstek/Library/Mobile\ Documents/iCloud\~com\~openplanetsoftware\~just-press-record/Documents/2024-05-28/18-12-07.m4a -ar 16000 -ac 1 -c:a pcm_s16le output.wav
Now use whisper.cpp to extract the text:
./main -otxt -m models/ggml-base.en.bin -f output.wav > transcript.txt
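The two steps above can be wrapped in a small helper so that any recording is handled the same way. This is only a sketch: the function names `wav_name` and `transcribe` are hypothetical, the `-y` flag (overwrite the output file) is an addition that is not in the original command, and the script assumes it is run from the whisper.cpp directory where `./main` and `models/ggml-base.en.bin` live.

```shell
# Hypothetical helpers; the names are illustrative and not part of whisper.cpp or ffmpeg.

# Derive the .wav file name from the path of a recording.
wav_name() {
  local base
  base="$(basename "$1")"
  printf '%s.wav\n' "${base%.*}"
}

# Convert a recording to 16 kHz mono 16-bit PCM, then transcribe it.
# $1 = input recording, $2 = model path (optional, defaults to the base English model).
transcribe() {
  local input="$1"
  local model="${2:-models/ggml-base.en.bin}"
  local wav
  wav="$(wav_name "$input")"
  # Same conversion settings as the command above; -y overwrites an existing file.
  ffmpeg -y -i "$input" -ar 16000 -ac 1 -c:a pcm_s16le "$wav" || return 1
  # Write the transcript next to the .wav file.
  ./main -otxt -m "$model" -f "$wav" > "${wav%.wav}.txt"
}
```

Calling `transcribe ~/recordings/18-12-07.m4a` would then be expected to leave `18-12-07.wav` and `18-12-07.txt` in the current directory.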
tkunstek / agents.yaml
Created October 21, 2024 10:31
A crewai project to act as a detective and build a profile on a suspect
detective:
  role: >
    Cold case homicide detective
  goal: >
    Review cold case files to find answers to questions about the suspect {suspect} as it relates to the victim {victim}.
    Sometimes the information is not in the cold case files, so simply report unknown and move on.
    Always cite your notes so the other detectives can see where you found your information.
  backstory: >
    You're a seasoned detective with a knack for identifying facts in homicide investigations.
    You are known for your ability to find the most relevant
tkunstek / Readme.txt
Created October 21, 2024 10:24
A private team LLM for research
I created the environment in the docker-compose.yaml. The sandbox was only accessible via WireGuard, using team members' individual keys. Since everything had to exist in the sandbox, you will see that I included a container to run Firefox. Using that container, I downloaded all of the case files.
The case files included PDFs that were image scans of computer printouts and handwritten notes. There was no native text in any of the files. I attempted using OCR software (Tika, Tesseract, etc.) with poor results. I settled on using AWS Textract in a private account with a private VPC.
Before sending the data to Textract, some cleanup was needed. First, I had to fix the file names; for this I used the detox Linux command. Next, each multi-page PDF had to be split into separate single-page files. See split.sh for a wrapper script I wrote to automate the job.
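For readers without access to split.sh, the page-splitting step might look roughly like the sketch below. This is not the author's script. It assumes `pdfseparate` from poppler-utils is installed, and the function names `page_pattern` and `split_all` and the `-page-%d` naming scheme are hypothetical.

```shell
# Hypothetical sketch of a page splitter; NOT the author's split.sh.
# Assumes pdfseparate (poppler-utils) is available on the PATH.

# Build the per-page output pattern for one PDF, e.g. out/report-page-%d.pdf.
# pdfseparate replaces %d with the page number.
page_pattern() {
  local pdf="$1"
  local outdir="$2"
  local base
  base="$(basename "$pdf" .pdf)"
  printf '%s/%s-page-%%d.pdf\n' "$outdir" "$base"
}

# Split every PDF in a directory into single-page files.
# $1 = input directory, $2 = output directory.
split_all() {
  local indir="$1"
  local outdir="$2"
  local pdf
  mkdir -p "$outdir" || return 1
  for pdf in "$indir"/*.pdf; do
    [ -e "$pdf" ] || continue   # skip when the glob matches nothing
    pdfseparate "$pdf" "$(page_pattern "$pdf" "$outdir")" || return 1
  done
}
```

Running `split_all cases pages` would be expected to write one file per page into `pages/`, named after the source PDF.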
The resulting individual pages were then uploaded with the AWS CLI to a secure S3 bucket. I configured a retention policy on the bucket to delete all files
tkunstek / diagnostic.txt
Created November 3, 2019 18:48
Teslausb log
reading config from /root/teslausb_setup_variables.conf
====== summary ======
hardware: Raspberry Pi Zero W Rev 1.1
OS: Raspbian GNU/Linux 10 (buster)
headless setup config in /root
archive method: rsync
lun0 connected, from file /backingfiles/music_disk.bin
lun1 connected, from file /backingfiles/cam_disk.bin
1 snapshots mounted
tkunstek / SwiftCodeCram
Created September 23, 2014 22:53
Swift CodeCram
// Playground - noun: a place where people can play
import UIKit
var str = "Hello, playground"
let five = 5
let six = 6
var eleven = five + six