Skip to content

Instantly share code, notes, and snippets.

@veekaybee
veekaybee / normcore-llm.md
Last active June 7, 2024 05:10
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts

Screenshot 2023-12-18 at 10 40 27 PM

Pre-Transformer Models

@erikbern
erikbern / kaplan_meier_for_revenue.py
Last active October 14, 2023 19:04
Kaplan-Meier for multiple revenue events
from matplotlib import pyplot
import random
import time
pyplot.style.use("ggplot")
now = time.time()
def generate_user(censor=now):
# Pick some point in time the user was created
t_created = t = now - random.random() * 1e7
@d0choa
d0choa / 2021_approvals.R
Last active July 12, 2022 01:48
Supporting evidence on 2021 FDA approvals
library("tidyverse")
library("sparklyr")
library("sparklyr.nested")
library("cowplot")
library("ggsci")
#Spark config
config <- spark_config()
# Allowing to GCP datasets access
@hadley
hadley / ds-training.md
Created March 13, 2015 18:49
My advise on what you need to do to become a data scientist...

If you were to give recommendations to your "little brother/sister" on things that they need to do to become a data scientist, what would those things be?

I think the "Data Science Venn Diagram" (http://drewconway.com/zia/2013/3/26/the-data-science-venn-diagram) is a great place to start. You need three things to be a good data scientist:

  • Statistical knowledge
  • Programming/hacking skills
  • Domain expertise

Statistical knowledge

@samuell
samuell / luigi_time_tasks_example.py
Last active June 14, 2022 19:32
How to output the execution time of tasks in the luigi workflow system, as discussed [here](https://groups.google.com/d/msg/luigi-user/uivbf-luX9w/z0GCKKsIefoJ)
import luigi
import time
class TimeTaskMixin(object):
'''
A mixin that when added to a luigi task, will print out
the tasks execution time to standard out, when the task is
finished
'''
@luigi.Task.event_handler(luigi.Event.PROCESSING_TIME)
@admackin
admackin / .bashrc
Last active February 10, 2022 22:06
Sane SSH_AUTH_SOCK handling for Screen and Tmux, so that new SSH agents created by subsequent logons are still usable.
_ssh_auth_save() {
ln -sf "$SSH_AUTH_SOCK" "$HOME/.ssh/ssh-auth-sock.$HOSTNAME"
}
alias screen='_ssh_auth_save ; export HOSTNAME=$(hostname) ; screen'
alias tmux='_ssh_auth_save ; export HOSTNAME=$(hostname) ; tmux'
@UniIsland
UniIsland / README
Created May 7, 2012 04:38
python cross process data sharing with mmap
# Intro
extremely simple and unsophisticated cross process data sharing
supports one read-write master process and an arbitrary number of read-only processes
please consider using pickle/cPickle/ctype to store complex data
# References