Cedric Chee cedrickchee

⚒️
⚡ 🦀 🐿️ 🐘 🐳 ⬡ ⚛️ 🚢 🚀 🦄 🍵
View GitHub Profile
@cedrickchee
cedrickchee / ai_engineering_handbook.md
Last active May 17, 2024 13:22
AI Engineering Handbook (Draft)

What is AI Engineering?

AI engineers operate at a higher level of abstraction than machine learning (ML) engineers or large language model (LLM) engineers, and don't necessarily need to know how to build an LLM or an ML model.

AI engineering builds upon ML systems[^1], but with a focus on large-scale, ready-made models (a.k.a. base models).

The distinct skills that AI engineers need to know include prompt engineering,
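Much of prompt engineering comes down to composing structured prompts from labeled parts. A minimal sketch in Python (the section names and the function here are illustrative, not from the handbook):

```python
def build_prompt(role: str, task: str, context: str, output_format: str) -> str:
    # Assemble a structured prompt from labeled sections -- a common
    # prompt-engineering pattern: state a role, a task, the context,
    # and the expected output format.
    return "\n\n".join([
        f"Role: {role}",
        f"Task: {task}",
        f"Context:\n{context}",
        f"Output format: {output_format}",
    ])

prompt = build_prompt(
    role="You are a support engineer.",
    task="Summarize the ticket below in one sentence.",
    context="Customer reports login failures since the 2.3 release.",
    output_format="plain text, one sentence",
)
```

Keeping the sections explicit makes prompts easy to version, diff, and A/B test as the underlying model changes.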

@cedrickchee
cedrickchee / ml_sys_design_book_review.md
Last active May 16, 2024 16:46
Designing Machine Learning Systems Book Review
@cedrickchee
cedrickchee / the-bitter-lesson-richsutton.md
Created May 16, 2024 05:44
The Bitter Lesson by Rich Sutton

From the 2019 essay "The Bitter Lesson" by Rich Sutton:

Summary: AI research shows that leveraging computation through general methods like search and learning is far more effective than incorporating human knowledge. As computational power grows, these methods outperform human-centric approaches in fields like chess, Go, speech recognition, and vision. The key takeaway: focus on scalable computational methods, not mimicking human thought.

I wish every graduate student in AI would read "The Bitter Lesson".[^1]

Arguments

@cedrickchee
cedrickchee / fine-tuning-transformer-tools.md
Created May 15, 2024 09:06
Fine-Tuning Transformer Tools

I tested Gemini as my Pixel Assistant today. I migrated from Google Assistant to Gemini on Android.

I'm fine-tuning large language models and need to compare tools for fine-tuning Transformers, so I asked Gemini. The link to the entire chat is below:

Prompt:
You are a large language model expert. You have trained deep learning Transformer models. I want to fine-tune a base model with my domain-specific dataset.
@cedrickchee
cedrickchee / llm_assembly_lang_period.md
Created May 15, 2024 06:52
The Assembly Language Period of LLMs and Generative AI

You are facing these challenges:

  • LLMs are too expensive and/or slow
  • Overwhelmed by tools or frameworks
  • Have little visibility into how good or bad your LLMs are
  • Don't know how to improve your LLM-based apps or products
  • Lack of quality learning resources on AI engineering designed for the LLM era
@cedrickchee
cedrickchee / flash_attention_cuda_programming.md
Created May 14, 2024 12:23
GPGPU Programming Flash Attention 2 with CUDA

GPUs Go Brrr by Hazy Research, Stanford, May 2024.

> we’re going to talk about what we’ve learned about making GPUs go brr -- and release an embedded DSL, ThunderKittens, that we’ve built to help us write some particularly speedy kernels (which we are also releasing).

> small library (DSL?) that we called ThunderKittens that we hope lets us write simple-to-understand clean code that indeed makes gpus go brrr.
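The tiled-kernel style ThunderKittens promotes is hard to show briefly, but the core algorithmic trick in Flash Attention (computing softmax attention over K/V blocks with a running max and denominator, so the full score matrix is never materialized) can be sketched in plain Python. This illustrates the math only, not the actual CUDA kernels:

```python
import math

def naive_attention(q, K, V):
    # Reference: standard attention for one query, softmax(q . K^T) @ V.
    scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in K]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    denom = sum(exps)
    d = len(V[0])
    return [sum(e / denom * v[j] for e, v in zip(exps, V)) for j in range(d)]

def flash_attention(q, K, V, block=2):
    # Online-softmax attention: process K/V in blocks, keeping a running
    # max (m), running denominator (l), and unnormalized output (o),
    # rescaling earlier partial results whenever the max increases.
    d = len(V[0])
    m, l, o = float("-inf"), 0.0, [0.0] * d
    for start in range(0, len(K), block):
        Kb, Vb = K[start:start + block], V[start:start + block]
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in Kb]
        m_new = max(m, max(scores))
        scale = math.exp(m - m_new) if m != float("-inf") else 0.0
        l *= scale
        o = [x * scale for x in o]
        for s, v in zip(scores, Vb):
            p = math.exp(s - m_new)
            l += p
            o = [x + p * vj for x, vj in zip(o, v)]
        m = m_new
    return [x / l for x in o]
```

Rescaling previous partials by exp(m_old - m_new) whenever a new running max appears is what keeps the single-pass, memory-light formulation numerically stable.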

@cedrickchee
cedrickchee / deconstructing_gpt-4o.md
Last active May 14, 2024 11:04
Deconstructing GPT-4o

I know your timeline is flooded now with word salads of "insane, HER, 10 features you missed, we're so back". Sit down. Chill. Take a deep breath like Mark does in the demo. Let's think step by step:

Techniques and Data

  • Technique-wise, OpenAI has figured out a way to map audio to audio directly as a first-class modality, and to stream video to a transformer[^1] in real time. This requires some new research on tokenization and architecture, but overall it's a data and system optimization problem (as most things are).
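On the tokenization point: one common way to make audio a first-class token modality (the approach used by neural codecs; the details here are an assumption, not OpenAI's disclosed method) is to quantize each audio frame against a learned codebook and feed the resulting discrete IDs to the transformer. A toy sketch:

```python
def tokenize_audio(frames, codebook):
    # Map each audio frame (a small feature vector) to the index of its
    # nearest codebook vector, turning continuous audio into a sequence
    # of discrete tokens a transformer can consume.
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return [
        min(range(len(codebook)), key=lambda i: sq_dist(frame, codebook[i]))
        for frame in frames
    ]

# Tiny 3-entry codebook over 2-dimensional frame features (illustrative).
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
tokens = tokenize_audio([[0.1, 0.0], [0.9, 1.1], [2.1, -0.1]], codebook)
```

In real systems the codebook is learned end to end and frames pass through an encoder first, but the output is the same kind of object: a discrete token stream.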

High-quality data can come from at least two sources:

@cedrickchee
cedrickchee / voice_ai_research.md
Last active May 18, 2024 15:29
Voice AI Research

Goal: make computers talk like humans.

My notes from audio (voice + speech) AI research, started in 2023.

Emerging Research

Audio-to-audio Models

@cedrickchee
cedrickchee / HER_moment.md
Last active May 14, 2024 11:26
GPT-4o: HER Went From Science Fiction to Reality

State of GPT presentation by Andrej Karpathy for the Microsoft Build 2023 event

GPT-4o ("o" for "omni") is having its "HER" moment. To put it another way, audio[^1] AI models are having their Stable Diffusion moment too.

HER is a 2013 sci-fi movie in which Samantha, an AI virtual assistant, is personified through a female voice. GPT-4o shows us we are ridiculously close to this movie becoming reality.

GPT-4o is OpenAI's new model which can reason across text, audio, and video in real time. It is described as smart, fast, natively multimodal, and a step towards more natural human-computer interaction. It is extremely versatile and fun to play with.