Tested with Cloudera 5.12.0 Quickstart VM (https://www.cloudera.com/downloads/quickstart_vms/5-12.html)

Library | Version |
---|---|
JanusGraph | 0.3.0-SNAPSHOT |
TinkerPop | 3.3.0 |
Spark | 2.2.0 |
HBase | 1.2.0 |
Cassandra | 2.2.11 |
Java | 1.8.0_151 |
'''This script implements Kim's paper "Convolutional Neural Networks for Sentence Classification"
with a much smaller embedding size (20) than the commonly used values (100 - 300), as it gives better
results with far fewer parameters.
Run on GPU: THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python imdb_cnn.py
Gets to 0.853 test accuracy after 5 epochs. 13s/epoch on Nvidia GTX980 GPU.
'''
from __future__ import print_function
"""Short and sweet LSTM implementation in TensorFlow.
Motivation:
When TensorFlow was released, adding RNNs was a bit of a hack: it required
building separate graphs for every number of timesteps and was a bit obscure
to use. Since then, the TF devs have added things like `dynamic_rnn`, `scan` and `map_fn`.
Currently the APIs are decent, but none of the tutorials that I am aware of
make the best use of the new APIs.
Advantages of this implementation:
This is one of the earliest methods of community detection. It is simple to understand and can easily be distributed across a cluster for faster processing. The key assumption is that the graph is undirected and unweighted, though it is not hard to extend the method to directed graphs and weighted edges.
The algorithm is fairly straightforward. It defines a measure called edge betweenness centrality, on which a divisive hierarchical clustering algorithm is built to find communities. The stopping criterion uses a popular metric called modularity, which quantifies how cohesive the communities are during the clustering process.
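To make the three ingredients concrete, here is a minimal single-machine sketch in plain Python. It is not the distributed version this note is about, and every name in it (`edge_betweenness`, `components`, `modularity`, `girvan_newman`, `graph`) is hypothetical. It assumes an undirected, unweighted graph without self-loops, stored as a dict that maps each node to the set of its neighbours. Edge betweenness is computed as the number of shortest paths that pass through an edge (a Brandes-style BFS). Rather than stopping early, the sketch removes edges until none are left and keeps the partition with the highest modularity, which is one simple way to apply the stopping criterion.

```python
from collections import defaultdict, deque


def edge_betweenness(adj):
    """Shortest-path edge betweenness for an undirected, unweighted graph."""
    score = defaultdict(float)
    for s in adj:
        # BFS from s: count shortest paths (sigma) and record predecessors.
        dist = {s: 0}
        sigma = defaultdict(float)
        sigma[s] = 1.0
        preds = defaultdict(list)
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, pushing path counts onto edges.
        delta = defaultdict(float)
        for w in reversed(order):
            for v in preds[w]:
                share = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += share
                delta[v] += share
    # Every path was counted once from each of its two endpoints.
    return {edge: total / 2.0 for edge, total in score.items()}


def components(adj):
    """Connected components, returned as a list of node sets."""
    seen, parts = set(), []
    for s in adj:
        if s in seen:
            continue
        part, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w not in part:
                    part.add(w)
                    stack.append(w)
        seen |= part
        parts.append(part)
    return parts


def modularity(orig_adj, parts):
    """Q = sum over communities of (inside edges / m) - (degree sum / 2m)^2."""
    m = sum(len(nbrs) for nbrs in orig_adj.values()) / 2.0
    q = 0.0
    for part in parts:
        inside = sum(1 for v in part for w in orig_adj[v] if w in part) / 2.0
        degree = sum(len(orig_adj[v]) for v in part)
        q += inside / m - (degree / (2.0 * m)) ** 2
    return q


def girvan_newman(adj):
    """Remove the highest-betweenness edge repeatedly; keep the best split."""
    work = {v: set(nbrs) for v, nbrs in adj.items()}  # working copy
    best_parts = components(work)
    best_q = modularity(adj, best_parts)
    while any(work.values()):  # loop while at least one edge remains
        scores = edge_betweenness(work)
        v, w = max(scores, key=scores.get)
        work[v].discard(w)
        work[w].discard(v)
        parts = components(work)
        q = modularity(adj, parts)  # always scored against the original graph
        if q > best_q:
            best_q, best_parts = q, parts
    return best_parts, best_q


# Toy input: two triangles joined by the single bridge edge 2-3.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
parts, q = girvan_newman(graph)
print(parts, round(q, 3))
```

On this toy graph the bridge has the highest betweenness (9 shortest paths cross it), so it is removed first and the two triangles come out as the communities, with modularity 5/14 ≈ 0.357. Recomputing betweenness after every removal is what makes the method slow, and it is the step a distributed version would parallelise across source nodes.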
Side note: A quick search revealed no distributed implementation of this algorithm (perhaps because it is slow and better algorithms are available?). So this note paves the way for using this naive algorithm in spite of its high time complexity.
Maybe you've heard about this technique but you haven't completely understood it, especially the PPO part. This explanation might help.
We will focus on text-to-text language models 📝, such as GPT-3, BLOOM, and T5. Models like BERT, which are encoder-only, are not addressed.
Reinforcement Learning from Human Feedback (RLHF) has been successfully applied in ChatGPT, hence its major increase in popularity. 📈
RLHF is especially useful in two scenarios 🌟: