@JoaoLages
JoaoLages / RLHF.md
Last active July 18, 2024 22:10
Reinforcement Learning from Human Feedback (RLHF) - a simplified explanation

Maybe you've heard about this technique but you haven't completely understood it, especially the PPO part. This explanation might help.

We will focus on text-to-text language models 📝, such as GPT-3, BLOOM, and T5. Models like BERT, which are encoder-only, are not addressed.

Reinforcement Learning from Human Feedback (RLHF) has been successfully applied in ChatGPT, hence its major increase in popularity. 📈

RLHF is especially useful in two scenarios 🌟:

  • You can’t create a good loss function
    • Example: how do you calculate a metric to measure if the model’s output was funny? (A reward-model sketch follows this list.)
  • You want to train with production data, but you can’t easily label your production data
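As a rough illustration of the first scenario (a minimal sketch, PyTorch assumed, not code from this gist): instead of hand-crafting a "funniness" metric, RLHF collects human comparisons between two candidate outputs and trains a reward model to score the preferred one higher; that learned reward later drives the PPO step.

```python
# Minimal sketch of reward-model training from human preference pairs.
# Assumptions: PyTorch, precomputed fixed-size embeddings of each output.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RewardModel(nn.Module):
    """Toy reward model: maps a text embedding to a scalar score."""
    def __init__(self, hidden_size=768):
        super().__init__()
        self.score = nn.Linear(hidden_size, 1)

    def forward(self, embedding):
        return self.score(embedding).squeeze(-1)

reward_model = RewardModel()
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

# Hypothetical batch: embeddings of the human-preferred and rejected outputs.
chosen = torch.randn(8, 768)
rejected = torch.randn(8, 768)

# Pairwise preference loss: push r(chosen) above r(rejected).
loss = -F.logsigmoid(reward_model(chosen) - reward_model(rejected)).mean()
loss.backward()
optimizer.step()
```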
@srirambaskaran
srirambaskaran / distributed-shortest-path-note.md
Last active July 12, 2021 11:31
A note on implementing community detection using Apache Spark + GraphX

Girvan Newman Algorithm

This is one of the earliest methods of community detection. It is simple to understand and can easily be distributed across a cluster for faster processing. The key assumption is that the graph is undirected and unweighted, but it is not hard to extend the method to directed graphs and weighted edges.

The algorithm is fairly straightforward. It defines a new measure called edge betweenness centrality, based on which a divisive hierarchical clustering algorithm is designed to find communities. The stopping criterion uses a popular metric called modularity, which quantifies how cohesive the communities are during the clustering process.

Side note: a bit of searching revealed no distributed implementation of this algorithm (mainly because it is slow and better algorithms are available?). This note therefore paves the way to using this naive algorithm in spite of its high time complexity.
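For reference, a minimal single-machine sketch of the same idea (networkx assumed, not the Spark + GraphX version discussed here): repeatedly remove the edge with the highest betweenness and keep the split with the best modularity.

```python
# Girvan-Newman on a small example graph, stopping at the best-modularity split.
# Single-machine sketch using networkx, not a distributed implementation.
import networkx as nx
from networkx.algorithms.community import girvan_newman, modularity

G = nx.karate_club_graph()  # small undirected, unweighted example graph

best_partition, best_q = None, float("-inf")
for communities in girvan_newman(G):
    partition = [set(c) for c in communities]
    q = modularity(G, partition)
    if q > best_q:
        best_partition, best_q = partition, q

print(f"best modularity: {best_q:.3f}")
print(f"number of communities: {len(best_partition)}")
```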

@sjudeng
sjudeng / jg_tp33_cdh512.md
Last active November 6, 2022 12:54
Testing OLAP using JanusGraph with TinkerPop 3.3.0 and Spark 2.2 on Yarn (Cloudera)
@aparrish
aparrish / spacy_intro.ipynb
Last active August 9, 2023 01:41
NLP Concepts with spaCy. Code examples released under CC0 https://creativecommons.org/choose/zero/, other text released under CC BY 4.0 https://creativecommons.org/licenses/by/4.0/
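The notebook itself is not rendered in this export; as a stand-in, here is a minimal sketch of the kind of spaCy basics the title points at (tokenization, part-of-speech tags, named entities), assuming the small English model is installed (`python -m spacy download en_core_web_sm`).

```python
# Minimal spaCy sketch: tokenize a sentence, inspect POS tags and entities.
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")

for token in doc:
    print(token.text, token.pos_, token.dep_)   # token, POS tag, dependency label

for ent in doc.ents:
    print(ent.text, ent.label_)                 # named entities
```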
@siemanko
siemanko / tf_lstm.py
Last active July 26, 2023 06:57
Simple implementation of LSTM in Tensorflow in 50 lines (+ 130 lines of data generation and comments)
"""Short and sweet LSTM implementation in Tensorflow.
Motivation:
When Tensorflow was released, adding RNNs was a bit of a hack - it required
building separate graphs for every number of timesteps and was a bit obscure
to use. Since then TF devs added things like `dynamic_rnn`, `scan` and `map_fn`.
Currently the APIs are decent, but none of the tutorials that I am aware of
make the best use of the new APIs.
Advantages of this implementation:
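For context, a minimal sketch of the `dynamic_rnn`-style usage the docstring refers to (TensorFlow 1.x API assumed; this is not the gist's own 50-line implementation): one graph handles any number of timesteps instead of a separate graph per sequence length.

```python
import tensorflow as tf  # TF 1.x API assumed; in TF 2.x use tf.compat.v1

INPUT_SIZE, HIDDEN_SIZE = 8, 64

# Batch and time dimensions are left dynamic; dynamic_rnn unrolls at run time.
inputs = tf.placeholder(tf.float32, [None, None, INPUT_SIZE])
cell = tf.nn.rnn_cell.BasicLSTMCell(HIDDEN_SIZE)
outputs, final_state = tf.nn.dynamic_rnn(cell, inputs, dtype=tf.float32)
```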
@entron
entron / imdb_cnn_kim_small_embedding.py
Last active September 16, 2023 16:23
Keras implementation of Kim's paper "Convolutional Neural Networks for Sentence Classification" with a very small embedding size. The test accuracy is 0.853.
'''This script implements Kim's paper "Convolutional Neural Networks for Sentence Classification"
with a much smaller embedding size (20) than the commonly used values (100 - 300), as it gives
better results with far fewer parameters.
Run on GPU: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_cnn.py
Gets to 0.853 test accuracy after 5 epochs. 13s/epoch on an Nvidia GTX 980 GPU.
'''
from __future__ import print_function
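The preview stops at the imports; as a rough sketch of the architecture described above (modern Keras API assumed, hypothetical filter and layer sizes, not the gist's original Theano-era code): embed tokens with a small embedding, apply a 1-D convolution, max-pool over time, and classify.

```python
# Kim-style CNN for binary sentence classification, sketched in modern Keras.
# Only the embedding size (20) comes from the gist; other hyperparameters are
# placeholders for illustration.
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dense, Dropout

VOCAB_SIZE, MAX_LEN, EMBED_DIM = 5000, 400, 20

model = Sequential([
    Input(shape=(MAX_LEN,)),
    Embedding(VOCAB_SIZE, EMBED_DIM),
    Dropout(0.25),
    Conv1D(filters=64, kernel_size=3, activation="relu"),  # hypothetical filter settings
    GlobalMaxPooling1D(),
    Dense(64, activation="relu"),
    Dropout(0.25),
    Dense(1, activation="sigmoid"),  # binary IMDB sentiment
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()
```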