Vicki Boykis veekaybee

veekaybee / searchrecs.md
Last active January 22, 2024 13:53
Understanding search and recommendations

How are search and recommendations the same, and how are they different?

TL;DR:

  • Both search and recommendations are designed to find and filter information
  • A recommendation is a "search with a null query"
  • Search is "I want this"; recommendations are "you might like this"
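
As a rough illustration of the shared find-and-filter design, here is a minimal sketch in which one ranking function serves both cases. Everything in it is an assumption made for illustration: the toy `CATALOG`, the tag-overlap scoring, and the `rank` function are hypothetical and do not come from the gist. With query terms it ranks by overlap with the query; with an empty query it falls back to tags drawn from the user's history.

```python
# Hypothetical toy catalog: item title -> set of descriptive tags.
CATALOG = {
    "Intro to Information Retrieval": {"search", "ranking", "text"},
    "Collaborative Filtering Basics": {"recommendations", "ranking", "users"},
    "Sourdough for Beginners": {"baking", "bread"},
    "Learning to Rank": {"search", "ranking", "ml"},
}


def rank(query_terms, history, k=2):
    """Score every candidate item by tag overlap and return the top k titles.

    With query terms, the signal is the query itself ("I want this").
    With an empty query, the signal is the tags of items in the user's
    history ("you might like this"), and already-seen items are skipped.
    """
    if query_terms:
        signal = set(query_terms)
        candidates = CATALOG
    else:
        signal = set()
        for title in history:
            signal |= CATALOG.get(title, set())
        candidates = {t: tags for t, tags in CATALOG.items() if t not in history}

    # Keep only items that share at least one tag with the signal.
    scored = [(len(tags & signal), title) for title, tags in candidates.items()]
    scored = [(score, title) for score, title in scored if score > 0]
    # Highest overlap first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:k]]


search_results = rank(["search"], history=[])
recommendations = rank([], history=["Learning to Rank"])
```

The only thing that changes between the two calls is where the ranking signal comes from; the filtering and ordering steps are identical.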
@veekaybee
veekaybee / chatgpt.md
Last active April 12, 2024 20:16
Everything I understand about chatgpt

ChatGPT Resources

Context

ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?

I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.

Model Architecture

@veekaybee
veekaybee / pyscript.html
Created November 26, 2022 16:21
Testing out Pyscript
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Some plotting</title>
<link rel="stylesheet" href="https://pyscript.net/alpha/pyscript.css" />
<script defer src="https://pyscript.net/alpha/pyscript.js"></script>
<py-env>

To run: dot -Tpng trie.dot -o trie.png

import com.twitter.scalding._

class WordCountJob(args: Args) extends Job(args) {
  val lines = TypedPipe.from(TextLine("posts.txt"))
  lines.flatMap { line => tokenize(line) }
    .groupBy { word => word }
    .size
    .groupAll
"com.lihaoyi" %% "os-lib" % "0.7.8"
// Clone my static site repo, loop through posts and get all files as a single file
val wd = os.pwd / "_posts"
val sd = os.Path("/Users/vicki/IdeaProjects/scalding/scalding-repl")
// Concatenates all the files
os.write.over(
wd / "posts.md",
@veekaybee
veekaybee / distance.md
Last active December 30, 2021 15:41
Different Distance Measures

Jaccard Similarity

import numpy
import typing
 
a = [1,2,3,4,5,11,12]
b = [2,3,4,5,6,8,9]
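
A minimal sketch of how Jaccard similarity could be computed for the two lists above, using the standard definition (size of the intersection divided by the size of the union). The function name `jaccard_similarity` and the handling of two empty inputs are assumptions for illustration, not necessarily what the original gist does.

```python
def jaccard_similarity(x, y):
    """Return |X ∩ Y| / |X ∪ Y| for two iterables, treated as sets."""
    set_x, set_y = set(x), set(y)
    union = set_x | set_y
    if not union:
        # Assumed convention: two empty sets count as identical.
        return 1.0
    return len(set_x & set_y) / len(union)


a = [1, 2, 3, 4, 5, 11, 12]
b = [2, 3, 4, 5, 6, 8, 9]

# The lists share {2, 3, 4, 5} (4 items) out of 10 distinct items overall.
print(jaccard_similarity(a, b))  # 0.4
```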

cats = ["calico", "tabby", "tom"]

Keybase proof

I hereby claim:

  • I am veekaybee on github.
  • I am veekaybee (https://keybase.io/veekaybee) on keybase.
  • I have a public key ASC1BmRUMCaXHMnJ2DzEnxIyypbZqJmYGJIbCxhhrrSZKgo

To claim this, I am signing this object:

@veekaybee
veekaybee / wholesome-data-science.md
Last active August 16, 2019 06:40
Wholesome data science.

Wholesome Data Science

Data science has had a really bad reputation recently. Between Facebook's privacy violations, facial scanning at kiosks in restaurants, and racism in algorithms, there are a lot of cases where surveillance, invasion of privacy, and unethical algorithms are dominating the news.

These cases are really important to make public, study, and prevent. But it's just as important to collect examples of good use cases of data science (ones that are not hyperbole or PR fluff) so that we, as an industry, can focus on them and learn what makes them work.

Have some? Make some? Feel free to leave a comment or edit.

Examples