Cedric Chee cedrickchee

@cedrickchee
cedrickchee / tradeoffs-long-context-llm-rag.md
Created May 10, 2024 07:08
Tradeoffs Between Long-Context LLM and RAG

From Claude 100K to Gemini 10M, we are in the era of long-context large language models (LLMs).

Retrieval-Augmented Generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources. [^1] [^2]
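
As a rough illustration of the retrieve-then-generate idea described above, here is a minimal sketch. Everything in it is an assumption made for illustration: the word-overlap scoring stands in for a real embedding-based retriever, the function names `retrieve` and `build_prompt` are hypothetical, and the prompt would normally be sent to an LLM rather than returned as a string.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase and split into alphanumeric words.
    return re.findall(r"[a-z0-9]+", text.lower())


def score(query: str, doc: str) -> int:
    # Toy relevance score: number of overlapping word occurrences.
    # A real system would typically compare embedding vectors instead.
    q, d = Counter(tokenize(query)), Counter(tokenize(doc))
    return sum(min(q[t], d[t]) for t in q)


def retrieve(query: str, docs: list[str], k: int = 1) -> list[str]:
    # Return the k documents with the highest overlap score.
    return sorted(docs, key=lambda doc: score(query, doc), reverse=True)[:k]


def build_prompt(query: str, docs: list[str], k: int = 1) -> str:
    # Put the retrieved facts in front of the question so the model
    # can ground its answer in them.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


docs = [
    "The Eiffel Tower is in Paris.",
    "Rust is a systems programming language.",
    "Llamas live in South America.",
]
prompt = build_prompt("Where is the Eiffel Tower?", docs)
```

The point of the sketch is only the shape of the pipeline: fetch external facts first, then hand them to the generator alongside the question.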

The hype surrounding RAG is largely driven by its potential to address some of the limitations of language models by enabling the development of more accurate, contextually grounded, and creative language generation systems. However, it's essential to note that RAG is still a relatively new and evolving field, and there are several challenges and limitations that need to be addressed before it can be widely adopted. For example, RAG requires large amounts of high-quality training data, and it can be challenging to integrate the retrieval and generation components in a way that produces coherent and natural-sounding language.

LLMs such as Claude 100K and Gemini 10M support long contexts spanning tens

@cedrickchee
cedrickchee / startup_tools.md
Created May 8, 2019 11:14
Curated directory of the best startup tools

Here are the best startup tools of 2019 that will help you build out your startup business as quickly, cheaply, and efficiently as possible.

This is a curated list of tools for everything from productivity to web hosting to development tools to design. Most of these tools are either free or have a limited free option that is enough for startups. We love all the free services out there, but it would be good to keep the list on topic. The line is a bit grey at times, so this list is somewhat opinionated; feel free to suggest additions and contribute to it.

Source Code Repos

  • GitHub — Unlimited public repositories and unlimited private repositories (up to 3 collaborators).
  • GitLab — Unlimited public and private Git repos with unlimited collaborators.
  • BitBucket — Unlimited public and private repos (Git and Mercurial) for up to 5 users with Pipelines for CI/CD.
  • Visual Studio — Unlimited private repos (Git a
@cedrickchee
cedrickchee / rust_resources.md
Last active May 3, 2024 09:46
Awesome Rust — a collection of resources for learning Rust

Awesome Rust

I learn Rust by reading The Rust Programming Language (a.k.a. TRPL) book.

This is my mind map and collection of resources for learning Rust in early 2019.

I plan to keep updating this list as time allows. I will move it into its own GitHub repo or something more permanent when it grows.


@cedrickchee
cedrickchee / llama-7b-m1.md
Last active May 2, 2024 12:47
4 Steps in Running LLaMA-7B on an M1 MacBook with `llama.cpp`

4 Steps in Running LLaMA-7B on an M1 MacBook

The usability of large language models

The problem with large language models is that you can’t run these locally on your laptop. Thanks to Georgi Gerganov and his llama.cpp project, it is now possible to run Meta’s LLaMA on a single computer without a dedicated GPU.

Running LLaMA

There are multiple steps involved in running LLaMA locally on an M1 Mac after downloading the model weights.

@cedrickchee
cedrickchee / google_colab_t4_gpu.md
Last active April 30, 2024 18:22
NVIDIA Tesla T4 GPU available in Google Colab

nvidia-smi and CPU check

nvidia-smi and CPU check

Show info about the deep learning software stack

fastai lib show_install

@cedrickchee
cedrickchee / eval_llama3_coding.md
Created April 26, 2024 17:23
Evaluating Llama 3 on Code Tasks

To test Meta Llama 3's performance against existing models, we used the HumanEval coding benchmark. HumanEval tests a model's ability to complete code based on docstrings.
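
To make the docstring-completion setup concrete, here is a minimal sketch of how a single HumanEval-style task can be checked. The `add` task and the two completions are made up for illustration and are not taken from the benchmark, and the helper name `passes` is hypothetical. A real harness also runs the untrusted code in a sandbox with timeouts, which this sketch does not do.

```python
# The model sees a function signature plus docstring (the prompt)
# and must produce the function body (the completion).
PROMPT = '''def add(a, b):
    """Return the sum of a and b."""
'''

GOOD_COMPLETION = "    return a + b\n"
BAD_COMPLETION = "    return a - b\n"

# Each task ships hidden unit tests wrapped in a check() function.
TEST = '''def check(candidate):
    assert candidate(1, 2) == 3
    assert candidate(-1, 1) == 0
'''


def passes(prompt: str, completion: str, test: str, entry_point: str) -> bool:
    # Execute prompt + completion, then run the task's tests against it.
    # Any exception (syntax error, failed assert, ...) counts as a failure.
    namespace: dict = {}
    try:
        exec(prompt + completion, namespace)
        exec(test, namespace)
        namespace["check"](namespace[entry_point])
        return True
    except Exception:
        return False


good_result = passes(PROMPT, GOOD_COMPLETION, TEST, "add")
bad_result = passes(PROMPT, BAD_COMPLETION, TEST, "add")
```

A model's accuracy on the benchmark is then, roughly, the fraction of tasks for which a generated completion passes the tests.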

The benchmark tests 137 publicly available large language models (LLMs) on code tasks.

Model Accuracy[^1]
@cedrickchee
cedrickchee / rope_embeddings.md
Created April 26, 2024 12:56
The Intuition behind Rotary Positional Embedding (RoPE)

RoPE is a position embedding method proposed by Jianlin Su et al. in 2021 (paper).

I aim to make the subject matter accessible to a broader audience. You won't find any math-heavy equations or theoretical proofs here.


Let me try to explain RoPE in a way that a non-technical person can understand.

@cedrickchee
cedrickchee / llama3.md
Last active April 26, 2024 05:54
Meta releases Llama 3 Large Language Models (LLMs) 🦙

Llama 3 8B and 70B pretrained and instruction-tuned models are available today. Based on benchmarks, the 8B and 70B models are not quite GPT-4 class, but the 400B+ model (still in development) is expected to reach GPT-4 level soon. Llama 3 sets a new standard for state-of-the-art performance and efficiency among openly available LLMs.

Key highlights:

  • 8k context length
  • New capabilities: enhanced reasoning and coding
  • Big change: new tokenizer that expands the vocab size to 128K (from 32K tokens in v2) for better multilingual performance
  • Trained on 15 trillion tokens (7x+ more data than Llama 2) on two 24K-GPU clusters
@cedrickchee
cedrickchee / ai_agents_llm_perf.md
Last active April 26, 2024 02:47
AI Agents and LLM Performance

In response to Dr. Andrew Ng's letter: https://www.deeplearning.ai/the-batch/how-agents-can-improve-llm-performance/

When I read Andrew's letter, I imagined him as Steve Ballmer, shouting "Agentic, agentic, agentic workflows!". Haha, we can hear you. No need for that.

Competition among AI agents is rising: MetaGPT -> AgentCoder -> Devin/OpenDevin/Devika -> SWE-Agent -> AutoCodeRover

@cedrickchee
cedrickchee / rebutting_devin_ai_claim.md
Last active April 13, 2024 13:32
Rebutting Devin: "First AI Software Engineer" Claim is Not True

A human software engineer, Carl (a.k.a. "InternetOfBugs"), looked closer and exposed Cognition Labs's Devin "First AI Software Engineer" Upwork lie. InternetOfBugs is an AI enthusiast and uses coding AI himself. He is not anti-AI, but anti-hype.

Debunking Devin: "First AI Software Engineer" Upwork lie exposed!

The company lied and said that their video showed Devin completing and getting paid for freelance jobs on Upwork, but it didn't show that at all. On the whole that's not surprising given the current state of Generative AI, and I wouldn't be bothering to debunk it, except:

  1. The company lied about what Devin could do in the video description, and
  2. a lot of people uncritically parroted the lie all over the Internet, and