@cedrickchee
cedrickchee / tradeoffs-long-context-llm-rag.md
Created May 10, 2024 07:08
Tradeoffs Between Long-Context LLM and RAG


From Claude 100K to Gemini 10M, we are in the era of long-context large language models (LLMs).

Retrieval-Augmented Generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources. [^1] [^2]

The hype surrounding RAG is largely driven by its potential to address some of the limitations of language models by enabling more accurate, contextually grounded, and creative language generation systems. However, RAG is still a relatively new and evolving technique, and several challenges and limitations need to be addressed before it can be widely adopted. For example, RAG requires large amounts of high-quality data to retrieve from, and it can be challenging to integrate the retrieval and generation components in a way that produces coherent and natural-sounding language.
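To make the retrieval-plus-generation split concrete, here is a minimal sketch of the pattern (my own illustration, not code from any particular framework); the tiny document store, the word-overlap retriever, and the `call_llm` placeholder are all assumptions for illustration.

```python
# Minimal retrieve-then-generate sketch (illustrative placeholders only).
from collections import Counter

DOCS = [
    "Llama 3 was released by Meta in April 2024.",
    "RoPE encodes positions by rotating query/key feature pairs.",
    "xz-utils is a lossless compression suite used for release tarballs.",
]

def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    """Rank documents by naive word overlap with the query (toy retriever)."""
    q_words = Counter(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: sum(q_words[w] for w in d.lower().split()),
                    reverse=True)
    return scored[:k]

def call_llm(prompt: str) -> str:
    """Placeholder for a real generation call (e.g. a hosted LLM API)."""
    return f"[model answer grounded in a prompt of {len(prompt)} chars]"

question = "When was Llama 3 released?"
context = "\n".join(retrieve(question, DOCS))
answer = call_llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer)
```

The tradeoff discussed here sits exactly at the `retrieve` step: a long-context model can skip it and read everything, while RAG narrows the prompt to the top-k passages.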

LLMs such as Claude 100K and Gemini 10M support long contexts spanning tens

@cedrickchee
cedrickchee / eval_llama3_coding.md
Created April 26, 2024 17:23
Evaluating Llama 3 on Code Tasks


To test Meta Llama 3's performance against existing models, we used the coding benchmark HumanEval, which tests a model's ability to complete code based on docstrings.

The benchmark tests 137 publicly available large language models (LLMs) on code tasks.
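As a rough illustration of what a HumanEval-style task looks like (a simplified, made-up example, not an actual problem from the benchmark): the model sees a function signature plus docstring, and its completion only counts as correct if it passes the task's unit tests.

```python
# Illustrative HumanEval-style task: prompt + model completion + hidden tests.
PROMPT = '''
def add(a: int, b: int) -> int:
    """Return the sum of a and b."""
'''

COMPLETION = "    return a + b\n"   # what a model might generate

TEST = """
assert add(2, 3) == 5
assert add(-1, 1) == 0
"""

namespace: dict = {}
exec(PROMPT + COMPLETION, namespace)   # define the completed function
exec(TEST, namespace)                  # run the benchmark's checks
print("passed")
```

A model's score is then simply the fraction of problems whose generated completion passes its tests.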

Model Accuracy[^1]
@cedrickchee
cedrickchee / rope_embeddings.md
Created April 26, 2024 12:56
The Intuition behind Rotary Positional Embedding (RoPE)


RoPE is a position embedding method proposed by Jianlin Su et al. in 2021 (paper).

I aim to make the subject matter accessible to a broader audience. You won't find any math-heavy equations or theoretical proofs here.


Let me try to explain RoPE in a way that a non-technical person can understand.
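For readers who do want a small peek at the mechanics, here is a minimal sketch (my own illustration, assuming nothing beyond NumPy) of the core trick: each 2-D pair of query/key features is rotated by an angle proportional to its position, so the attention score between a query and a key ends up depending only on how far apart they are.

```python
import numpy as np

def rotate(vec2d: np.ndarray, position: int, theta: float = 0.1) -> np.ndarray:
    """Rotate a 2-D feature pair by position * theta radians."""
    angle = position * theta
    rot = np.array([[np.cos(angle), -np.sin(angle)],
                    [np.sin(angle),  np.cos(angle)]])
    return rot @ vec2d

q = np.array([1.0, 0.5])  # toy query features
k = np.array([0.3, 0.8])  # toy key features

# Same relative offset (2 positions apart) at different absolute positions:
score_a = rotate(q, 5) @ rotate(k, 3)
score_b = rotate(q, 12) @ rotate(k, 10)
print(score_a, score_b)  # the two scores match (up to floating-point error)
```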

@cedrickchee
cedrickchee / llama3.md
Last active April 26, 2024 05:54
Meta releases Llama 3 Large Language Models (LLMs) 🦙


Llama 3 8B and 70B pretrained and instruction-tuned models are available today. Based on benchmarks, the 8B and 70B models are not quite GPT-4 class, but the 400B+ model (still in development) will reach GPT-4 level soon. Llama 3 sets a new standard for state-of-the-art performance and efficiency among openly available LLMs.

Key highlights:

  • 8k context length
  • New capabilities: enhanced reasoning and coding
  • Big change: new tokenizer that expands the vocab size to 128K (from 32K tokens in v2) for better multilingual performance (see the sketch after this list)
  • Trained on 15 trillion tokens (7x+ more data than Llama 2) across two 24K-GPU clusters
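To make the tokenizer change concrete, here is a small sketch (my own illustration, not code from the announcement) that counts tokens with both generations' tokenizers. It assumes the `transformers` library and access to the gated `meta-llama/Llama-2-7b-hf` and `meta-llama/Meta-Llama-3-8B` checkpoints on Hugging Face.

```python
# Sketch only: compare token counts between the Llama 2 (32K vocab) and
# Llama 3 (128K vocab) tokenizers. Assumes `pip install transformers` and
# that you have been granted access to the gated meta-llama repositories.
from transformers import AutoTokenizer

text = "Meta releases Llama 3 large language models."

for model_id in ["meta-llama/Llama-2-7b-hf", "meta-llama/Meta-Llama-3-8B"]:
    tok = AutoTokenizer.from_pretrained(model_id)
    ids = tok(text)["input_ids"]
    # A larger vocabulary generally packs the same text into fewer tokens.
    print(f"{model_id}: vocab={tok.vocab_size}, tokens={len(ids)}")
```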
@cedrickchee
cedrickchee / rebutting_devin_ai_claim.md
Last active April 13, 2024 13:32
Rebutting Devin: "First AI Software Engineer" Claim is Not True


A human software engineer, Carl (a.k.a. "InternetOfBugs"), looked closer and exposed Cognition Labs's Devin "First AI Software Engineer" Upwork lie. InternetOfBugs is an AI enthusiast who uses coding AI himself; he is not anti-AI, but anti-hype.

Debunking Devin: "First AI Software Engineer" Upwork lie exposed!

The company lied and said that their video showed Devin completing and getting paid for freelance jobs on Upwork, but it didn't show that at all. On the whole that's not surprising given the current state of Generative AI, and I wouldn't be bothering to debunk it, except:

  1. The company lied about what Devin could do in the video description, and
  2. a lot of people uncritically parroted the lie all over the Internet, and
@cedrickchee
cedrickchee / ai_agents_llm_perf.md
Last active April 26, 2024 02:47
AI Agents and LLM Performance


In response to Dr. Andrew Ng's letter: https://www.deeplearning.ai/the-batch/how-agents-can-improve-llm-performance/

When I read Andrew's letter, I imagined him as Steve Ballmer shouting "Agentic, agentic, agentic workflows!" Haha, we can hear you. No need for that.

Competition among AI agents is heating up: MetaGPT -> AgentCoder -> Devin/OpenDevin/Devika -> SWE-Agent -> AutoCodeRover
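For a concrete picture of what an agentic workflow means in practice, here is a minimal sketch of the reflection pattern the letter describes (draft, self-critique, revise); `call_llm` is a stand-in placeholder, not any specific provider's API.

```python
# Minimal "reflection" agent loop: draft, critique, revise, repeat.
def call_llm(prompt: str) -> str:
    """Stand-in for a real chat-completion call."""
    return f"[response to: {prompt[:40]}...]"

def agentic_answer(task: str, rounds: int = 2) -> str:
    draft = call_llm(f"Solve this task:\n{task}")
    for _ in range(rounds):
        critique = call_llm(f"Task:\n{task}\n\nDraft:\n{draft}\n\nList concrete flaws.")
        draft = call_llm(
            f"Task:\n{task}\n\nDraft:\n{draft}\n\nCritique:\n{critique}\n\n"
            "Rewrite the draft, fixing every flaw."
        )
    return draft

print(agentic_answer("Write a function that reverses a linked list."))
```

The letter's point is that this kind of iterative loop can lift a weaker model's output quality well beyond what a single zero-shot pass produces.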

@cedrickchee
cedrickchee / 3b1b_math.md
Created April 7, 2024 14:20
❤️ 3blue1brown


I love 3blue1brown videos. I have been watching them for a few years now.

I've found that the "visual" intuition they build for a concept or topic is really great. The videos present these concepts in a simple, structured, and clear way.

3blue1brown videos really transformed my view on math.

Recently, 3blue1brown dropped two new videos in the Deep Learning series:

What Makes a True AI Coding Assistant?

What is an AI Coding Assistant?

If the coding assistant can't run ITERATIVE CRUD on ALL of your code, it's not a True AI Coding Assistant (TACA).

Standards for True AI Coding Assistants

  1. Must work on existing codebases
  2. Must have a file context mechanism
@cedrickchee
cedrickchee / xz-backdoor.md
Created March 31, 2024 16:33 — forked from thesamesam/xz-backdoor.md
xz-utils backdoor situation

FAQ on the xz-utils backdoor

Background

On March 29th, 2024, a backdoor was discovered in xz-utils, a suite of software that gives developers lossless compression. This package is commonly used for compressing release tarballs, software packages, kernel images, and initramfs images. It is very widely distributed; statistically, your average Linux or macOS system will have it installed for

@cedrickchee
cedrickchee / programming-vs-software-engineering.md
Created September 9, 2023 13:15
Distinction Between Programming and Software Engineering


"Programming" differs from "software engineering" in dimensionality: programming is about producing code. Software engineering extends that to include the maintenance of that code for its useful life span.

Programming is certainly a significant part of software engineering.

With this distinction, we might need to delineate between programming tasks (development) and software engineering tasks (development, modification, maintenance). Time adds an important new dimension to programming: software engineering isn't just programming.