Skip to content

Instantly share code, notes, and snippets.

View JoaoLages's full-sized avatar
🐛
Probably fixing bugs

João Lages JoaoLages

🐛
Probably fixing bugs
View GitHub Profile
@troyfontaine
troyfontaine / 1-setup.md
Last active July 23, 2024 23:25
Signing your Git Commits on MacOS

Methods of Signing Git Commits on MacOS

Last updated March 13, 2024

This Gist explains how to sign commits using gpg in a step-by-step fashion. Previously, krypt.co was heavily mentioned, but I've only recently learned they were acquired by Akamai and no longer update their previous free products. Those mentions have been removed.

Additionally, 1Password now supports signing Git commits with SSH keys and makes it pretty easy-plus you can easily configure Git Tower to use it for both signing and ssh.

For using a GUI-based GIT tool such as Tower or Github Desktop, follow the steps here for signing your commits with GPG.

@JoaoLages
JoaoLages / RLHF.md
Last active July 18, 2024 22:10
Reinforcement Learning from Human Feedback (RLHF) - a simplified explanation

Maybe you've heard about this technique but you haven't completely understood it, especially the PPO part. This explanation might help.

We will focus on text-to-text language models 📝, such as GPT-3, BLOOM, and T5. Models like BERT, which are encoder-only, are not addressed.

Reinforcement Learning from Human Feedback (RLHF) has been successfully applied in ChatGPT, hence its major increase in popularity. 📈

RLHF is especially useful in two scenarios 🌟:

  • You can’t create a good loss function
    • Example: how do you calculate a metric to measure if the model’s output was funny?
  • You want to train with production data, but you can’t easily label your production data