Awesome Git Repositories: Deep Learning, NLP, Computer Vision, Model & Paper, Chatbot, Tensorflow, Julia Lang, Software Library, Reinforcement Learning

FRAMEWORKS, TOOLKITS, MODELS, APPLICATIONS

https://github.com/tensorflow/tensorflow

https://github.com/keras-team/keras

https://github.com/deepmind/sonnet - TensorFlow-based neural network library.

https://github.com/tensorflow/mesh - Mesh TensorFlow: Model Parallelism Made Easier

https://github.com/keras-team/keras-applications/ - It provides model definitions and pre-trained weights for a number of popular architectures, such as VGG16, ResNet50, Xception, MobileNet, and more.

https://github.com/tensorflow/tensor2tensor - library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.

https://github.com/tflearn/tflearn - Deep learning library featuring a higher-level API for TensorFlow.

https://github.com/fchollet/keras-resources - Directory of tutorials and open-source code repositories for working with Keras, the Python deep learning library

https://github.com/tensorlayer/tensorlayer

TensorLayer is a novel TensorFlow-based deep learning and reinforcement learning library designed for researchers and engineers. It provides an extensive collection of customizable neural layers for building advanced AI models quickly; building on this, the community has open-sourced a large number of tutorials and applications. TensorLayer was awarded the 2017 Best Open Source Software award by the ACM Multimedia Society. This project can also be found at iHub and Gitee.

https://github.com/pytorch

https://github.com/fastai/fastai

fastai is a deep learning library which provides practitioners with high-level components that can quickly and easily provide state-of-the-art results in standard deep learning domains, and provides researchers with low-level components that can be mixed and matched to build new approaches. fastai includes:

  • A new type dispatch system for Python along with a semantic type hierarchy for tensors
  • A GPU-optimized computer vision library which can be extended in pure Python
  • An optimizer which refactors out the common functionality of modern optimizers into two basic pieces, allowing optimization algorithms to be implemented in 4-5 lines of code
  • A novel 2-way callback system that can access any part of the data, model, or optimizer and change it at any point during training
  • A new data block API

https://github.com/PyTorchLightning/pytorch-lightning - The lightweight PyTorch wrapper for high-performance AI research. Scale your models, not the boilerplate.

https://github.com/pytorch/examples - A set of examples around pytorch in Vision, Text, Reinforcement Learning, etc.

https://github.com/jettify/pytorch-optimizer - torch-optimizer -- collection of optimizers for Pytorch.

https://github.com/silvandeleemput/memcnn - PyTorch Framework for Developing Memory Efficient Deep Invertible Networks.

https://github.com/davda54/ada-hessian - Easy-to-use AdaHessian optimizer (PyTorch).

https://github.com/huggingface/pytorch_block_sparse - Fast Block Sparse Matrices for Pytorch.

https://github.com/fkodom/fft-conv-pytorch - Implementation of 1D, 2D, and 3D FFT convolutions in PyTorch. Much faster than direct convolutions for large kernel sizes.

https://github.com/facebookresearch/functorch - functorch is a prototype of JAX-like composable function transforms for PyTorch.

https://github.com/apache/incubator-mxnet

https://github.com/deepmind/deepmind-research - This repository contains implementations and illustrative code to accompany DeepMind publications.

https://github.com/facebookresearch/faiss - Faiss is a library for efficient similarity search and clustering of dense vectors.

Faiss contains several methods for similarity search. It assumes that the instances are represented as vectors and are identified by an integer, and that the vectors can be compared with L2 (Euclidean) distances or dot products. Vectors that are similar to a query vector are those that have the lowest L2 distance or the highest dot product with the query vector. It also supports cosine similarity, since this is a dot product on normalized vectors.
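
As an illustration of that workflow, here is a minimal exact-search sketch using the `faiss-cpu` package (random data; the dimensionality and k are arbitrary):

```python
import numpy as np
import faiss  # pip install faiss-cpu

d = 64                                              # vector dimensionality
xb = np.random.random((1000, d)).astype('float32')  # database vectors
xq = np.random.random((5, d)).astype('float32')     # query vectors

index = faiss.IndexFlatL2(d)   # exact search with L2 (Euclidean) distances
index.add(xb)                  # vectors receive sequential integer ids
D, I = index.search(xq, 4)     # per query: 4 smallest distances and their ids
```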

https://github.com/torch/torch7 - Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first.

https://github.com/NVIDIA/TensorRT - TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators. https://developer.nvidia.com/tensorrt

https://github.com/NVIDIA/DALI - A library containing both highly optimized building blocks and an execution engine for data pre-processing in deep learning applications https://docs.nvidia.com/deeplearning/…

https://github.com/pluskid/Mocha.jl - Deep Learning framework for Julia

https://github.com/FluxML/Flux.jl - Relax! Flux is the ML library that doesn't make you tensor https://fluxml.ai/

https://github.com/microsoft/DeepSpeed - DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective. www.deepspeed.ai .

https://github.com/pjreddie/darknet - Darknet is an open source neural network framework written in C and CUDA. It is fast, easy to install, and supports CPU and GPU computation.

https://github.com/josephmisiti/awesome-machine-learning - A curated list of awesome Machine Learning frameworks, libraries and software.

https://github.com/EthicalML/awesome-production-machine-learning - A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning

https://github.com/pjreddie/TopDeepLearning - A list of popular github projects related to deep learning

https://github.com/rasbt/deeplearning-models - A collection of various deep learning architectures, models, and tips

https://github.com/tensorflow/models - Models and examples built with TensorFlow.

https://github.com/lucidrains/g-mlp-pytorch - Implementation of gMLP, an all-MLP replacement for Transformers, in Pytorch

Pay Attention to MLPs, by Quoc Le et al., https://arxiv.org/abs/2105.08050: "Here we propose a simple network architecture, gMLP, based on MLPs with gating, and show that it can perform as well as Transformers in key language and vision applications."

https://github.com/martinarjovsky/WassersteinGAN - Code accompanying the paper "Wasserstein GAN", https://arxiv.org/abs/1701.07875

https://github.com/facebookresearch/dlrm - An implementation of a deep learning recommendation model (DLRM)

https://github.com/domluna/memn2n - Implementation of "End-To-End Memory Networks" with sklearn-like interface using Tensorflow. Tasks are from the bAbI dataset.

https://github.com/nitishsrivastava/deepnet - Implementation of some deep learning algorithms.

https://github.com/lucidrains/glom-pytorch - An attempt at the implementation of Glom, Geoffrey Hinton's new idea that integrates concepts from neural fields, top-down-bottom-up processing, and attention (consensus between columns), for emergent part-whole hierarchies from data.

https://github.com/RedditSota/state-of-the-art-result-for-machine-learning-problems

This repository provides state-of-the-art (SoTA) results for all machine learning problems. We do our best to keep this repository up to date. If you find that a problem's SoTA result is out of date, please raise it as an issue.

https://github.com/hyperopt/hyperopt - Distributed Asynchronous Hyperparameter Optimization in Python http://hyperopt.github.io/hyperopt

https://github.com/lutzroeder/netron - Visualizer for neural network, deep learning and machine learning models https://www.lutzroeder.com/ai

https://github.com/mlflow/mlflow - MLflow is a platform to streamline machine learning development, including tracking experiments, packaging code into reproducible runs, and sharing and deploying models.

MLflow offers a set of lightweight APIs that can be used with any existing machine learning application or library (TensorFlow, PyTorch, XGBoost, etc).
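
For instance, a minimal tracking run looks roughly like this (the parameter and metric names are made up for illustration):

```python
import mlflow

with mlflow.start_run():
    mlflow.log_param("learning_rate", 0.01)       # record a hyperparameter
    for epoch in range(3):
        loss = 1.0 / (epoch + 1)                  # stand-in for a real training loss
        mlflow.log_metric("loss", loss, step=epoch)
```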

https://github.com/optuna/optuna - A hyperparameter optimization framework.
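
A toy study minimizing a quadratic sketches Optuna's define-by-run API (the objective here is a placeholder for a real training run):

```python
import optuna

def objective(trial):
    x = trial.suggest_float("x", -10.0, 10.0)  # sample a hyperparameter
    return (x - 2.0) ** 2                      # value to minimize

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)  # should be close to {"x": 2.0}
```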

https://github.com/clab/dynet

DyNet is a neural network library developed by Carnegie Mellon University and many others. It is written in C++ (with bindings in Python) and is designed to be efficient when run on either CPU or GPU, and to work well with networks that have dynamic structures that change for every training instance. For example, these kinds of networks are particularly important in natural language processing tasks, and DyNet has been used to build state-of-the-art systems for syntactic parsing, machine translation, morphological inflection, and many other application areas.

https://github.com/RJT1990/pyflux

PyFlux is an open source time series library for Python. The library has a good array of modern time series models, as well as a flexible array of inference options (frequentist and Bayesian) that can be applied to these models. By combining breadth of models with breadth of inference, PyFlux allows for a probabilistic approach to time series modelling.

https://github.com/flashlight/flashlight - Flashlight is a fast, flexible machine learning library written entirely in C++ from the Facebook AI Research Speech team and the creators of Torch and Deep Speech. Its core features include:

  • Just-in-time kernel compilation with modern C++, using the ArrayFire tensor library.
  • CUDA and CPU backends for GPU and CPU training.
  • An emphasis on efficiency and scale.

https://github.com/horovod/horovod - Distributed training framework for TensorFlow, Keras, PyTorch, and Apache MXNet.

The primary motivation for this project is to make it easy to take a single-GPU training script and successfully scale it to train across many GPUs in parallel.
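
The typical PyTorch recipe is only a few extra lines; a sketch (omitting data sharding and the training loop, with a placeholder model):

```python
import torch
import horovod.torch as hvd

hvd.init()                              # one process per GPU
torch.cuda.set_device(hvd.local_rank())

model = torch.nn.Linear(10, 1).cuda()   # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# wrap the optimizer so gradients are averaged across workers,
# then start all workers from identical weights
optimizer = hvd.DistributedOptimizer(optimizer, named_parameters=model.named_parameters())
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
```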

https://github.com/thangbui/geepee - A collection of Gaussian process models.

https://github.com/deepmind/learning-to-learn - Learning to Learn in TensorFlow, https://arxiv.org/abs/1606.04474

https://github.com/Spijkervet/BYOL - Bootstrap Your Own Latent: A New Approach to Self-Supervised Learning, https://arxiv.org/abs/2006.07733.

https://github.com/lucidrains/byol-pytorch - Usable Implementation of "Bootstrap Your Own Latent" self-supervised learning, from Deepmind, in Pytorch.

https://github.com/hibayesian/awesome-automl-papers - A curated list of automated machine learning papers, articles, tutorials, slides and projects.

https://github.com/ray-project/ray - An open source framework that provides a simple, universal API for building distributed applications. Ray is packaged with RLlib, a scalable reinforcement learning library, and Tune, a scalable hyperparameter tuning library.

https://github.com/google/automl - Google Brain AutoML

https://github.com/maxpumperla/hyperas - Keras + Hyperopt: A very simple wrapper for convenient hyperparameter optimization http://maxpumperla.com/hyperas/

https://github.com/keras-team/keras-tuner - Hyperparameter tuning for humans

https://github.com/keras-team/autokeras - An AutoML system based on Keras http://autokeras.com/

https://github.com/tensorflow/adanet - Fast and flexible AutoML with learning guarantees.

https://github.com/microsoft/nni - An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.

COMPUTATION

https://github.com/HiPerCoRe/KTT

KTT is an autotuning framework for OpenCL and CUDA kernels and GLSL compute shaders. Version 1.3, which introduces a public searcher API and user-provided compute queues and buffers, is now available.

https://github.com/google/tangent - Source-to-Source Debuggable Derivatives in Pure Python.

https://github.com/numba/numba - A Just-In-Time Compiler for Numerical Functions in Python

https://github.com/google/jax - JAX is Autograd and XLA, brought together for high-performance machine learning research.

With its updated version of Autograd, JAX can automatically differentiate native Python and NumPy functions. It can differentiate through loops, branches, recursion, and closures, and it can take derivatives of derivatives of derivatives.
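
For example, `jax.grad` composes with itself and with ordinary Python control flow:

```python
import jax
import jax.numpy as jnp

def f(x):
    # an ordinary Python function with a loop
    for _ in range(3):
        x = jnp.sin(x) * x
    return x

df = jax.grad(f)                       # first derivative
d3f = jax.grad(jax.grad(jax.grad(f)))  # derivative of derivative of derivative
print(df(1.0), d3f(1.0))
```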

https://github.com/onnx/onnx - ONNX provides an open source format for AI models, both deep learning and traditional ML. It defines an extensible computation graph model, as well as definitions of built-in operators and standard data types.

https://github.com/cupy/cupy - CuPy is an implementation of NumPy-compatible multi-dimensional array on CUDA. CuPy consists of the core multi-dimensional array class, cupy.ndarray, and many functions on it. It supports a subset of numpy.ndarray interface.
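
In practice this means NumPy code often ports by swapping the module; a tiny sketch:

```python
import cupy as cp

x = cp.arange(6, dtype=cp.float32).reshape(2, 3)  # array allocated on the GPU
y = (x * 2.0 + 1.0).sum(axis=0)                   # kernels run on the device
print(cp.asnumpy(y))                              # copy back to a numpy.ndarray
```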

https://github.com/cornellius-gp/gpytorch - A highly efficient and modular implementation of Gaussian Processes in PyTorch.

https://github.com/titu1994/tfdiffeq - Tensorflow implementation of Ordinary Differential Equation Solvers with full GPU support.

https://github.com/rtqichen/torchdiffeq - Differentiable ODE solvers with full GPU support and O(1)-memory backpropagation.

https://arxiv.org/abs/1806.07366 https://arxiv.org/abs/2011.03902

https://github.com/pyro-ppl/pyro - Deep universal probabilistic programming with Python and PyTorch.

https://github.com/lmcinnes/umap - Uniform Manifold Approximation and Projection, https://arxiv.org/abs/1802.03426, https://www.biorxiv.org/content/10.1101/2020.05.12.077776v1

Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction. The algorithm is founded on three assumptions about the data:

  1. The data is uniformly distributed on a Riemannian manifold;
  2. The Riemannian metric is locally constant (or can be approximated as such);
  3. The manifold is locally connected.
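
Usage mirrors scikit-learn's fit/transform pattern; a minimal sketch on random data (the hyperparameter values are arbitrary):

```python
import numpy as np
import umap  # pip install umap-learn

data = np.random.rand(500, 30)  # 500 points in 30 dimensions
reducer = umap.UMAP(n_neighbors=15, min_dist=0.1, n_components=2)
embedding = reducer.fit_transform(data)
print(embedding.shape)          # (500, 2)
```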

https://github.com/fmfn/BayesianOptimization - A Python implementation of global optimization with gaussian processes.

https://github.com/mackelab/autohmm - Hidden Markov Models (HMMs) with tied states and autoregressive observations.

https://github.com/jmschrei/pomegranate - Fast, flexible and easy to use probabilistic modelling in Python. Models:

  • Probability Distributions
  • General Mixture Models
  • Hidden Markov Models
  • Naive Bayes and Bayes Classifiers
  • Markov Chains
  • Discrete Bayesian Networks
  • Discrete Markov Networks

https://github.com/jolibrain/deepdetect - Deep Learning API and server in C++14, with support for Caffe, Caffe2, PyTorch, TensorRT, Dlib, NCNN, Tensorflow, XGBoost and TSNE.

https://github.com/arogozhnikov/einops - Flexible and powerful tensor operations for readable and reliable code. Supports numpy, pytorch, tensorflow, jax, and others.

https://github.com/dask/dask - Dask provides advanced parallelism for analytics, enabling performance at scale for the tools you love

https://github.com/TheAlgorithms/Python - All Algorithms implemented in Python

https://github.com/waylonflinn/weblas - GPU accelerated Javascript. Numerical computing in your browser with performance comparable to native.

https://github.com/google-research/fast-soft-sort - Fast Differentiable Sorting and Ranking, https://arxiv.org/abs/2002.08871

The sorting operation is one of the most commonly used building blocks in computer programming. In machine learning, it is often used for robust statistics. However, seen as a function, it is piecewise linear and as a result includes many kinks where it is non-differentiable. More problematic is the related ranking operator, often used for order statistics and ranking metrics. It is a piecewise constant function, meaning that its derivatives are null or undefined. While numerous works have proposed differentiable proxies to sorting and ranking, they do not achieve the O(n log n) time complexity one would expect from sorting and ranking operations.
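
A sketch of how the repository's PyTorch operators are meant to be used; gradients flow through the soft rank, unlike the hard ranking operator (module and function names as given in the repo's README; treat the exact signatures as approximate):

```python
import torch
from fast_soft_sort.pytorch_ops import soft_rank  # per the repo's README

values = torch.tensor([[5.0, 1.0, 2.0]], requires_grad=True)
ranks = soft_rank(values, regularization_strength=1.0)  # differentiable ranks
ranks[0, 0].backward()       # gradients are well-defined, unlike hard ranking
print(ranks, values.grad)
```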

AWESOME KNOWLEDGE, TUTORIALS

https://github.com/kjw0612/awesome-rnn - A curated list of resources dedicated to recurrent neural networks (closely related to deep learning).

https://github.com/ChristosChristofidis/awesome-deep-learning - A curated list of awesome Deep Learning tutorials, projects and communities.

https://github.com/dennybritz/deeplearning-papernotes - Summaries and notes on Deep Learning research papers

https://github.com/terryum/awesome-deep-learning-papers - The most cited deep learning papers

https://github.com/oxford-cs-deepnlp-2017/lectures - Oxford Deep NLP 2017 course

https://github.com/lexfridman/mit-deep-learning - Tutorials, assignments, and competitions for MIT Deep Learning related courses. https://deeplearning.mit.edu

https://github.com/yunjey/pytorch-tutorial - PyTorch Tutorial for Deep Learning Researchers.

https://github.com/floodsung/Deep-Learning-Papers-Reading-Roadmap - Deep Learning papers reading roadmap for anyone who is eager to learn this amazing tech!

https://github.com/vdumoulin/conv_arithmetic - A technical report on convolution arithmetic in the context of deep learning.

https://github.com/FrancescoSaverioZuppichini/PyTorch-Deep-Learning-Template - This template aims to make it easier for you to start a new deep learning computer vision project with PyTorch.

https://github.com/mnielsen/neural-networks-and-deep-learning - Code samples for my book "Neural Networks and Deep Learning"

https://github.com/Hvass-Labs/TensorFlow-Tutorials - TensorFlow Tutorials with YouTube Videos

https://github.com/fastai/course-v3 - The 3rd edition of course.fast.ai. See the nbs folder for the notebooks.

https://github.com/NVIDIA/DeepLearningExamples - This repository provides the latest deep learning example networks for training. These examples focus on achieving the best performance and convergence from NVIDIA Volta Tensor Cores.

https://github.com/ageron/handson-ml - A series of Jupyter notebooks that walk you through the fundamentals of Machine Learning and Deep Learning in python using Scikit-Learn and TensorFlow.

https://github.com/fastai/fastbook - The fastai book, published as Jupyter Notebooks.

https://github.com/graykode/distribution-is-all-you-need - The basic distribution probability Tutorial for Deep Learning Researchers.

https://github.com/huggingface/neuralcoref - Fast Coreference Resolution in spaCy with Neural Networks, https://medium.com/huggingface/state-of-the-art-neural-coreference-resolution-for-chatbots-3302365dcf30

https://github.com/vi3k6i5/flashtext - This module can be used to replace keywords in sentences or extract keywords from sentences. It is based on the FlashText algorithm, https://arxiv.org/abs/1711.00046

https://github.com/eriklindernoren/ML-From-Scratch - Machine Learning From Scratch. Bare bones NumPy implementations of machine learning models and algorithms with a focus on accessibility. Aims to cover everything from linear regression to deep learning.

https://github.com/Chillee/CS344_2021 - Udacity CS344 Introduction to Parallel Programming (https://classroom.udacity.com/courses/cs344), with assignments/materials updated to build in 2021.

NLP

https://github.com/graykode/nlp-tutorial - Natural Language Processing Tutorial for Deep Learning Researchers

https://github.com/nlptown/nlp-notebooks - A collection of notebooks for Natural Language Processing from NLP Town http://www.nlp.town

https://github.com/ines/spacy-course - Advanced NLP with spaCy: A free online course.

https://github.com/bentrevett/pytorch-seq2seq - Tutorials on implementing a few sequence-to-sequence (seq2seq) models with PyTorch and TorchText.

https://github.com/RaRe-Technologies/gensim - Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. Target audience is the natural language processing (NLP) and information retrieval (IR) community

https://github.com/allenai/allennlp - An open-source NLP research library, built on PyTorch. http://www.allennlp.org/

https://github.com/flairNLP/flair - Flair is:

  • A powerful NLP library. Flair allows you to apply our state-of-the-art natural language processing (NLP) models to your text, such as named entity recognition (NER), part-of-speech tagging (PoS), sense disambiguation and classification.

  • Multilingual. Thanks to the Flair community, we support a rapidly growing number of languages. We also now include 'one model, many languages' taggers, i.e. single models that predict PoS or NER tags for input text in various languages.

  • A text embedding library. Flair has simple interfaces that allow you to use and combine different word and document embeddings, including our proposed Flair embeddings, BERT embeddings and ELMo embeddings.

  • A PyTorch NLP framework. Our framework builds directly on PyTorch, making it easy to train your own models and experiment with new approaches using Flair embeddings and classes.
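
A minimal NER example in the style of the Flair docs (the model name "ner" downloads a pre-trained English tagger):

```python
from flair.data import Sentence
from flair.models import SequenceTagger

tagger = SequenceTagger.load("ner")  # pre-trained English NER model
sentence = Sentence("George Washington went to Washington.")
tagger.predict(sentence)
for entity in sentence.get_spans("ner"):
    print(entity)                    # e.g. spans tagged PER and LOC
```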

https://github.com/explosion/spaCy - Industrial-strength Natural Language Processing (NLP) in Python

spaCy is a library for advanced Natural Language Processing in Python and Cython. It's built on the very latest research, and was designed from day one to be used in real products.
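
A quick tour of the pipeline output (the small English model `en_core_web_sm` is installed separately):

```python
import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Apple is looking at buying a U.K. startup for $1 billion.")
for token in doc:
    print(token.text, token.pos_, token.dep_)  # tokens, POS tags, dependencies
for ent in doc.ents:
    print(ent.text, ent.label_)                # named entities, e.g. Apple/ORG
```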

https://github.com/sebastianruder/NLP-progress - Tracking Progress in Natural Language Processing

https://github.com/IsaacChanghau/DL-NLP-Readings - My Reading Lists of Deep Learning and Natural Language Processing

https://github.com/keon/awesome-nlp - A curated list of resources dedicated to Natural Language Processing (NLP)

https://github.com/mihail911/nlp-library - Curated collection of papers for the nlp practitioner.

https://github.com/msgi/nlp-journey - Documents, papers and code related to NLP, including Topic Model, Word Embedding, Named Entity Recognition, Text Classification, Text Generation, Text Similarity, Machine Translation, etc. All code is implemented in TensorFlow 2.0.

https://github.com/graykode/nlp-roadmap - ROADMAP (Mind Map) and KEYWORD for students who are interested in learning NLP.

https://github.com/NervanaSystems/nlp-architect - A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks

https://github.com/microsoft/nlp-recipes - Natural Language Processing Best Practices & Examples

https://github.com/microsoft/unilm

Pre-trained (foundation) models across tasks (understanding, generation and translation), languages (100+ languages), and modalities (language, image, audio, vision + language, audio + language, etc.).

https://github.com/roomylee/awesome-relation-extraction - A curated list of awesome resources dedicated to Relation Extraction, one of the most important tasks in Natural Language Processing (NLP).

https://github.com/Separius/awesome-sentence-embedding - A curated list of pretrained sentence and word embedding models

https://github.com/philipperemy/keras-attention-mechanism - Attention mechanism Implementation for Keras.

https://github.com/huggingface/tokenizers - Fast State-of-the-Art Tokenizers optimized for Research and Production.

https://github.com/facebookresearch/XLM - PyTorch original implementation of Cross-lingual Language Model Pretraining, https://arxiv.org/abs/1901.07291

https://github.com/facebookresearch/fastText - Library for fast text representation and classification.

https://github.com/sloria/TextBlob

TextBlob is a Python (2 and 3) library for processing textual data. It provides a simple API for diving into common natural language processing (NLP) tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. TextBlob stands on the giant shoulders of NLTK and Pattern, and plays nicely with both.
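
A few of those tasks in three lines each:

```python
from textblob import TextBlob

blob = TextBlob("TextBlob makes common NLP tasks remarkably simple.")
print(blob.tags)          # part-of-speech tags
print(blob.noun_phrases)  # noun phrase extraction
print(blob.sentiment)     # Sentiment(polarity=..., subjectivity=...)
```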

https://github.com/dmlc/gluon-nlp - NLP made easy http://gluon-nlp.mxnet.io

https://github.com/erickrf/nlpnet - A neural network architecture for NLP tasks, using cython for fast performance. Currently, it can perform POS tagging, SRL and dependency parsing. http://nilc.icmc.usp.br/nlpnet/

https://github.com/attardi/deepnl - deepnl is a Python library for Natural Language Processing tasks based on a Deep Learning neural network architecture. The library currently provides tools for performing part-of-speech tagging, Named Entity tagging and Semantic Role Labeling.

https://github.com/Hironsan/anago - anaGo is a Python library for sequence labeling (NER, PoS tagging, ...), implemented in Keras.

https://github.com/BrikerMan/Kashgari - Kashgari is a production-level NLP Transfer learning framework built on top of tf.keras for text-labeling and text-classification, includes Word2Vec, BERT, and GPT2 Language Embedding.

https://github.com/HIT-SCIR/ELMoForManyLangs - Pre-trained ELMo Representations for Many Languages

https://github.com/facebookresearch/StarSpace - Learning embeddings for classification, retrieval and ranking.

https://github.com/brightmart/text_classification - The purpose of this repository is to explore text classification methods in NLP with deep learning.

https://github.com/PAIR-code/lit - The Language Interpretability Tool: Interactively analyze NLP models for model understanding in an extensible and framework agnostic interface.

https://github.com/BlinkDL/RWKV-LM - The RWKV Language Model with Token-shift. Better and Faster than usual transformer / GPT.

https://github.com/princeton-nlp/SimCSE - EMNLP'2021: SimCSE: Simple Contrastive Learning of Sentence Embeddings.

https://github.com/lucidrains/mlm-pytorch - An implementation of masked language modeling for Pytorch, made as concise and simple as possible.

https://github.com/lucidrains/coco-lm-pytorch - Implementation of COCO-LM, Correcting and Contrasting Text Sequences for Language Model Pretraining, in Pytorch.

https://github.com/lucidrains/marge-pytorch - Implementation of Marge, Pre-training via Paraphrasing, in Pytorch.

https://github.com/lucidrains/mixture-of-experts - A Pytorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models.

NLG, GPT

https://github.com/elyase/awesome-gpt3 - Awesome GPT-3 is a collection of demos and articles about the OpenAI GPT-3 API.

https://github.com/openai/gpt-2 - Code for the paper "Language Models are Unsupervised Multitask Learners" GPT-2

https://github.com/microsoft/DialoGPT - A State-of-the-Art Large-scale Pretrained Response Generation Model (DialoGPT)

https://github.com/karpathy/minGPT - A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training.

https://github.com/graykode/gpt-2-Pytorch - Simple Text-Generator with OpenAI gpt-2 Pytorch Implementation

https://github.com/minimaxir/gpt-2-simple - Python package to easily retrain OpenAI's GPT-2 text-generating model on new texts

https://github.com/lucidrains/token-shift-gpt - Implementation of Token Shift GPT - An autoregressive model that solely relies on shifting the sequence space for mixing.

https://github.com/lucidrains/g-mlp-gpt - GPT, but made only out of MLPs.

https://github.com/tokenmill/awesome-nlg - There is a wide spectrum of different technologies addressing parts or the whole of the NLG process. This list aims to represent this diversity of NLG applications and techniques by providing links to various projects, tools, research papers, and learning materials.

https://github.com/google/sentencepiece - Unsupervised text tokenizer for Neural Network-based text generation.

https://github.com/guxd/DialogWAE - Source Code for DialogWAE: Multimodal Response Generation with Conditional Wasserstein Autoencoder (https://arxiv.org/abs/1805.12352).

https://github.com/hassyGo/NLG-RL - Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction, of paper https://arxiv.org/abs/1809.01694

https://github.com/google/seq2seq - A general-purpose encoder-decoder framework for Tensorflow that can be used for Machine Translation, Text Summarization, Conversational Modeling, Image Captioning, and more.

https://github.com/bytedance/lightseq - LightSeq: A High Performance Library for Sequence Processing and Generation

https://github.com/karpathy/char-rnn - Multi-layer Recurrent Neural Networks (LSTM, GRU, RNN) for character-level language models in Torch.

https://github.com/sherjilozair/char-rnn-tensorflow - Multi-layer Recurrent Neural Networks (LSTM, RNN) for character-level language models in Python using Tensorflow.

https://github.com/jcjohnson/torch-rnn - Efficient, reusable RNNs and LSTMs for torch.

https://github.com/hunkim/word-rnn-tensorflow - Multi-layer Recurrent Neural Networks (LSTM, RNN) for word-level language models in Python using TensorFlow.

https://github.com/jsuarez5341/Recurrent-Highway-Hypernetworks-NIPS - Implementation for paper: Character-Level Language Modeling with Recurrent Highway Hypernetworks

https://github.com/pytorch/fairseq - Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

https://github.com/minimaxir/textgenrnn - Easily train your own text-generating neural network of any size and complexity on any text dataset with a few lines of code.

https://github.com/UFAL-DSG/tgen - A statistical natural language generator for spoken dialogue systems.

https://github.com/bjascob/pyInflect - A python module for word inflections designed for use with spaCy.

https://github.com/rsennrich/subword-nmt - Unsupervised Word Segmentation for Neural Machine Translation and Text Generation.

https://github.com/jiweil/Neural-Dialogue-Generation

This project contains the code, or part of the code, for the dialogue generation part in the following papers:

  1. J. Li, M. Galley, C. Brockett, J. Gao and B. Dolan. "A Diversity-Promoting Objective Function for Neural Conversation Models". NAACL 2016.
  2. J. Li, M. Galley, C. Brockett, J. Gao and B. Dolan. "A Persona-Based Neural Conversation Model". ACL 2016.
  3. J. Li, W. Monroe, T. Shi, A. Ritter, D. Jurafsky. "Adversarial Learning for Neural Dialogue Generation". arXiv.
  4. J. Li, W. Monroe, D. Jurafsky. "Learning to Decode for Future Success". arXiv.
  5. J. Li, W. Monroe, D. Jurafsky. "A Simple, Fast Diverse Decoding Algorithm for Neural Generation". arXiv.
  6. J. Li, W. Monroe, D. Jurafsky. "Data Distillation for Controlling Specificity in Dialogue Generation" (to appear on arXiv).

https://github.com/tsenghungchen/dialog-generation-paper - A list of recent papers regarding dialogue generation.

https://github.com/wiseodd/generative-models - Collection of generative models, e.g. GAN, VAE in Pytorch and Tensorflow.

Generative Adversarial Nets (GAN): Vanilla GAN, Conditional GAN, InfoGAN, Wasserstein GAN, Mode Regularized GAN, Coupled GAN, Auxiliary Classifier GAN, Least Squares GAN, Boundary Seeking GAN, Energy Based GAN, f-GAN, Generative Adversarial Parallelization, DiscoGAN, Adversarial Feature Learning & Adversarially Learned Inference, Boundary Equilibrium GAN, Improved Training for Wasserstein GAN, DualGAN, MAGAN: Margin Adaptation for GAN, Softmax GAN, GibbsNet

Variational Autoencoder (VAE): Vanilla VAE, Conditional VAE, Denoising VAE, Adversarial Autoencoder, Adversarial Variational Bayes. Restricted Boltzmann Machine (RBM): Binary RBM with Contrastive Divergence, Binary RBM with Persistent Contrastive Divergence. Helmholtz Machine: Binary Helmholtz Machine with Wake-Sleep Algorithm.

https://github.com/mikelkl/TF2-QA - 21st place (top 2%) solution for the Kaggle TensorFlow 2.0 Question Answering competition.

https://github.com/OceanskySun/GraftNet - This is the implementation of GraftNet described in EMNLP 2018 paper Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text, https://arxiv.org/abs/1809.00782

https://github.com/ryankiros/skip-thoughts - Sent2Vec encoder and training code from the paper "Skip-Thought Vectors", https://arxiv.org/abs/1506.06726

https://github.com/bjascob/pySimpleNLG - A python port of SimpleNLG.

https://github.com/asyml/texar - Toolkit for Machine Learning, Natural Language Processing, and Text Generation, in TensorFlow. This is part of the CASL project: http://casl-project.ai/

https://github.com/martiansideofthemoon/hurdles-longform-qa - Official repository with code and data accompanying the NAACL 2021 paper "Hurdles to Progress in Long-form Question Answering" (https://arxiv.org/abs/2103.06332).

BERT, TRANSFORMERS

https://github.com/google-research/bert - TensorFlow code and pre-trained models for BERT

https://github.com/hanxiao/bert-as-service - Using BERT model as a sentence encoding service, i.e. mapping a variable-length sentence to a fixed-length vector

https://github.com/Separius/BERT-keras - Keras implementation of BERT with pre-trained weights

https://github.com/codertimo/BERT-pytorch - Google AI 2018 BERT pytorch implementation.

https://github.com/jeongukjae/pytorch-bert - An implementation of BERT using PyTorch's TransformerEncoder.

https://github.com/CyberZHG/keras-bert - Implementation of BERT that could load official pre-trained models for feature extraction and prediction

https://github.com/bojone/bert4keras - A light reimplementation of BERT for Keras.

https://github.com/jayparks/transformer - A Pytorch Implementation of "Attention is All You Need" and "Weighted Transformer Network for Machine Translation"

https://github.com/jadore801120/attention-is-all-you-need-pytorch - A PyTorch implementation of the Transformer model in "Attention is All You Need".

https://github.com/huggingface/transformers - Transformers: State-of-the-art Natural Language Processing for TensorFlow 2.0 and PyTorch. https://huggingface.co/transformers

https://github.com/huggingface/accelerate - A simple way to train and use PyTorch models with multi-GPU, TPU, mixed-precision.

https://github.com/zihangdai/xlnet - XLNet: Generalized Autoregressive Pretraining for Language Understanding

https://github.com/graykode/xlnet-Pytorch - Simple XLNet implementation with Pytorch Wrapper, https://arxiv.org/abs/1906.08237

https://github.com/graykode/ALBERT-Pytorch - Pytorch Implementation of ALBERT (A Lite BERT for Self-supervised Learning of Language Representations), https://arxiv.org/abs/1909.11942

https://github.com/UKPLab/sentence-transformers - Sentence Transformers: Multilingual Sentence Embeddings using BERT / RoBERTa / XLM-RoBERTa & Co. with PyTorch

https://github.com/kimiyoung/transformer-xl - This repository contains the code in both PyTorch and TensorFlow for our paper, https://arxiv.org/abs/1901.02860

https://github.com/openai/finetune-transformer-lm - Code and model for the paper "Improving Language Understanding by Generative Pre-Training".

https://github.com/openai/sparse_attention - Examples of using sparse attention, as in "Generating Long Sequences with Sparse Transformers"

https://github.com/allenai/longformer - Longformer: The Long-Document Transformer, https://arxiv.org/abs/2004.05150

https://github.com/salesforce/ctrl - Conditional Transformer Language Model for Controllable Generation, https://arxiv.org/abs/1909.05858

https://github.com/samsucik/knowledge-distil-bert - Master's thesis project in collaboration with Rasa, focusing on knowledge distillation from BERT into different very small networks and analysis of the students' NLP capabilities.

https://github.com/lucidrains/vit-pytorch - Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in Pytorch

https://github.com/lucidrains/memformer - Implementation of Memformer, a Memory-augmented Transformer, in Pytorch.

https://github.com/lucidrains/reformer-pytorch

This is a Pytorch implementation of Reformer (https://openreview.net/pdf?id=rkgNKkHtvB). It includes LSH attention, reversible network, and chunking. It has been validated with an auto-regressive task (enwik8).
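
Instantiation follows the repo's README closely; a sketch (the hyperparameter values are arbitrary):

```python
import torch
from reformer_pytorch import ReformerLM  # names per the repo README

model = ReformerLM(
    num_tokens=20000,
    dim=512,
    depth=6,
    heads=8,
    max_seq_len=8192,
    causal=True            # auto-regressive, as in the enwik8 validation
)
x = torch.randint(0, 20000, (1, 8192))
logits = model(x)          # (1, 8192, 20000)
```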

https://github.com/lucidrains/performer-pytorch - An implementation of Performer, a linear attention-based transformer, in Pytorch.

https://github.com/lucidrains/x-transformers - A simple but complete full-attention transformer with a set of promising experimental features from various papers.

  • Augmenting Self-attention with Persistent Memory, https://arxiv.org/abs/1907.01470
  • Memory Transformers, https://arxiv.org/abs/2006.11527
  • Transformers Without Tears, https://arxiv.org/abs/1910.05895
  • GLU Variants Improve Transformer, https://arxiv.org/abs/2002.05202
  • ReLU², https://arxiv.org/abs/2109.08668
  • ReZero Is All You Need, https://arxiv.org/abs/2003.04887
  • Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection, https://arxiv.org/abs/1912.11637
  • Talking-Heads Attention, https://arxiv.org/abs/2003.02436
  • Collaborative Attention, https://arxiv.org/abs/2006.16362
  • Attention on Attention for Image Captioning, https://arxiv.org/abs/1908.06954
  • Intra-attention Gating on Values
  • Improving Transformer Models by Reordering their Sublayers, https://arxiv.org/abs/1911.03864
  • ...
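
The entry point wraps token embeddings around a configurable attention stack; a sketch per the README (sizes arbitrary):

```python
import torch
from x_transformers import TransformerWrapper, Decoder  # names per the repo README

model = TransformerWrapper(
    num_tokens=20000,
    max_seq_len=1024,
    attn_layers=Decoder(dim=512, depth=6, heads=8)
)
x = torch.randint(0, 20000, (1, 1024))
logits = model(x)  # (1, 1024, 20000)
```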

https://github.com/lucidrains/triton-transformer - Implementation of a Transformer, but completely in Triton.

https://github.com/ofirpress/attention_with_linear_biases - Code for our ALiBi method for transformer language models.

https://github.com/jessevig/bertviz - Tool for visualizing attention in the Transformer model (BERT, GPT-2, Albert, XLNet, RoBERTa, CTRL, etc.).

https://github.com/EleutherAI/gpt-neo - An implementation of model parallel GPT-2 and GPT-3-style models using the mesh-tensorflow library.

https://github.com/EleutherAI/gpt-neox/ - An implementation of model parallel autoregressive transformers on GPUs, based on the DeepSpeed library.

https://github.com/Separius/awesome-fast-attention - list of efficient attention modules.

https://github.com/lucidrains/fast-transformer-pytorch - Implementation of Fast Transformer in Pytorch, https://arxiv.org/abs/2108.09084.

https://github.com/lucidrains/routing-transformer - Fully featured implementation of Routing Transformer, https://arxiv.org/abs/2003.05997

https://github.com/lucidrains/uformer-pytorch - Implementation of Uformer, Attention-based Unet, in Pytorch, https://arxiv.org/abs/2106.03106

https://github.com/lucidrains/segformer-pytorch - Implementation of Segformer, Attention + MLP neural network for segmentation, in Pytorch, https://arxiv.org/abs/2105.15203

https://github.com/lucidrains/rotary-embedding-torch - Implementation of Rotary Embeddings, from the Roformer paper, in Pytorch

https://github.com/lucidrains/local-attention - An implementation of local windowed attention for language modeling.

https://github.com/lucidrains/charformer-pytorch - Implementation of the GBST block from the Charformer paper, in Pytorch.

https://github.com/lucidrains/nystrom-attention - Implementation of Nyström Self-attention, from the paper Nyströmformer, https://arxiv.org/abs/2102.03902.

https://github.com/lucidrains/memory-transformer-xl - A variant of Transformer-XL where the memory is updated not with a queue, but with attention.

https://github.com/lucidrains/feedback-transformer-pytorch - Implementation of Feedback Transformer in Pytorch.

https://github.com/lucidrains/memory-compressed-attention - Implementation of Memory-Compressed Attention, from the paper "Generating Wikipedia By Summarizing Long Sequences".

https://github.com/lucidrains/conformer - Implementation of the convolutional module from the Conformer paper, for use in Transformers.

CHATBOT

https://github.com/RasaHQ/rasa - Open source machine learning framework to automate text- and voice-based conversations: NLU, dialogue management, connect to Slack, Facebook, and more - Create chatbots and voice assistants https://rasa.com/docs/

https://github.com/RasaHQ/nlu-hyperopt - Find the best hyperparameters for your Rasa NLU model.

https://github.com/RasaHQ/awesome-rasa - list of Rasa resources curated by Rasa and the community

https://github.com/invocable/awesome-bots - The most awesome list about bots

https://github.com/fendouai/Awesome-Chatbot - Awesome Chatbot Projects, Corpus, Papers, Tutorials.

https://github.com/facebookresearch/ParlAI - A framework for training and evaluating AI models on a variety of openly available dialogue datasets.

https://github.com/dennybritz/chatbot-retrieval - Dual LSTM Encoder for Dialog Response Generation

https://github.com/gunthercox/ChatterBot - ChatterBot is a machine learning, conversational dialog engine for creating chat bots

https://github.com/howdyai/botkit - Botkit is an open source developer tool for building chat bots, apps and custom integrations for major messaging platforms.

https://github.com/lukalabs/cakechat - CakeChat: Emotional Generative Dialog System

https://github.com/NVIDIA/NeMo - NeMo: a toolkit for conversational AI https://nvidia.github.io/NeMo/

https://github.com/alfredfrancis/ai-chatbot-framework - A python chatbot framework with Natural Language Understanding and Artificial Intelligence

https://github.com/ricsinaruto/Seq2seqChatbots - A wrapper around tensor2tensor to flexibly train, interact, and generate data for neural chatbots.

https://github.com/jakesgordon/javascript-state-machine - A javascript finite state machine library.

https://github.com/macabdul9/CASA-Dialogue-Act-Classifier

PyTorch implementation of the paper "Dialogue Act Classification with Context-Aware Self-Attention" for dialogue act classification with a generic dataset class and PyTorch-Lightning trainer.

https://github.com/ColingPaper2018/DialogueAct-Tagger - A resource to create a multi domain Dialog Act Tagger for conversational agents using publicly available data.

https://github.com/gigasquid/speech-acts-classifier - Speech act classifier for text based on Stanford CoreNLP and Weka.

CHATBOT CLIENT, CHAT SERVER

https://github.com/botfront/rasa-webchat - A feature-rich chat widget for Rasa and Botfront.

https://github.com/tinode/chat - Instant messaging platform. Backend in Go. Clients: Swift iOS, Java Android, JS webapp, scriptable command line; chatbots.

https://github.com/tdlib/td - TDLib (Telegram Database library) is a cross-platform library for building Telegram clients. It can be easily used from almost any programming language.

https://github.com/FaridSafi/react-native-gifted-chat - The most complete chat UI for React Native https://gifted.chat

https://github.com/hubotio/hubot - Hubot is a framework to build chat bots, modeled after GitHub's Campfire bot of the same name, hubot.

https://github.com/python-telegram-bot/python-telegram-bot - We have made you a wrapper you can't refuse.

https://github.com/LonamiWebs/Telethon - Telethon is an asyncio Python 3 MTProto library to interact with Telegram's API as a user or through a bot account (bot API alternative).

https://github.com/xmppjs/xmpp.js

XMPP is an open technology for real-time communication, which powers a wide range of applications including instant messaging, presence, multi-party chat, voice and video calls, collaboration, lightweight middleware, content syndication, and generalized routing of XML data.

https://github.com/bluszcz/awesome-xmpp

https://github.com/processone/ejabberd - ejabberd is a distributed, fault-tolerant technology that allows the creation of large-scale instant messaging applications. The server can reliably support thousands of simultaneous users on a single node and has been designed to provide exceptional standards of fault tolerance. As an open source technology, based on industry standards, ejabberd can be used to build bespoke solutions very cost effectively.

https://www.kurento.org/whats-kurento - Kurento is a WebRTC media server and a set of client APIs that simplify the development of advanced video applications for web and smartphone platforms.

https://github.com/chatwoot/chatwoot - Open-source customer engagement suite, an alternative to Intercom, Zendesk, Salesforce Service Cloud etc.

How to echobot with XMPP, BOSH, and Strophe - https://gist.github.com/sysang/ed255cbba4d5ca92ba8a8886d58c5b08

https://github.com/RocketChat/Rocket.Chat - The communications platform that puts data protection first.

https://github.com/jitsi/jitsi-meet - Jitsi Meet - Secure, Simple and Scalable Video Conferences that you use as a standalone app or embed in your web application.

https://github.com/feross/simple-peer - Simple WebRTC video, voice, and data channels.

TEXT-TO-SPEECH, SPEECH-TO-TEXT

https://github.com/syoyo/tacotron-tts-cpp - Text-to-speech written (partially) in C++, using the Tacotron model + Tensorflow.

https://github.com/CorentinJ/Real-Time-Voice-Cloning

This repository is an implementation of Transfer Learning from Speaker Verification to Multispeaker Text-To-Speech Synthesis (SV2TTS) with a vocoder that works in real-time.

https://github.com/mozilla/TTS - Deep learning for Text to Speech (Discussion forum: https://discourse.mozilla.org/c/tts).

https://github.com/mozilla/DeepSpeech - DeepSpeech is an open source embedded (offline, on-device) speech-to-text engine which can run in real time on devices ranging from a Raspberry Pi 4 to high power GPU servers.

https://github.com/buriburisuri/speech-to-text-wavenet - A tensorflow implementation of speech recognition based on DeepMind's WaveNet: A Generative Model for Raw Audio. (Hereafter the Paper)

https://github.com/NVIDIA/OpenSeq2Seq - Toolkit for efficient experimentation with Speech Recognition, Text2Speech and NLP https://nvidia.github.io/OpenSeq2Seq

https://github.com/kaldi-asr/kaldi - Kaldi Speech Recognition Toolkit.

https://github.com/mravanelli/pytorch-kaldi - pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

https://github.com/TensorSpeech/TensorFlowTTS - TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supported languages include English, French, Korean, Chinese and German; easy to adapt for other languages).

VIETNAMESE

https://github.com/vncorenlp/VnCoreNLP - A Vietnamese natural language processing toolkit (NAACL 2018)

https://github.com/phuonglh/vn.vitk - A Vietnamese Text Processing Toolkit

COMPUTER VISION

https://github.com/kjw0612/awesome-deep-vision - A curated list of deep learning resources for computer vision.

https://github.com/davidsandberg/facenet - Face recognition using Tensorflow.

https://github.com/matterport/Mask_RCNN - Mask R-CNN for object detection and instance segmentation on Keras and TensorFlow.

https://github.com/karpathy/neuraltalk2 - Efficient Image Captioning code in Torch, runs on GPU.

https://github.com/facebookresearch/vissl - VISSL is FAIR's library of extensible, modular and scalable components for SOTA Self-Supervised Learning with images.

https://github.com/haltakov/natural-language-image-search - Search photos on Unsplash using natural language.

https://github.com/KaimingHe/deep-residual-networks - Deep Residual Learning for Image Recognition, https://arxiv.org/abs/1512.03385

https://github.com/qubvel/efficientnet - Implementation of EfficientNet model. Keras and TensorFlow Keras.

https://github.com/facebookresearch/maskrcnn-benchmark - Fast, modular reference implementation of Instance Segmentation and Object Detection algorithms in PyTorch.

https://github.com/openai/CLIP - CLIP can be instructed in natural language to predict the most relevant text snippet, given an image, without directly optimizing for the task, similarly to the zero-shot capabilities of GPT-2 and 3.

https://github.com/mlfoundations/open_clip - An open source implementation of CLIP.

https://github.com/lucidrains/DALLE-pytorch - Implementation / replication of DALL-E (paper), OpenAI's Text to Image Transformer, in Pytorch. It will also contain CLIP for ranking the generations.

https://github.com/CompVis/taming-transformers - Taming Transformers for High-Resolution Image Synthesis.

https://github.com/facebookresearch/detectron2 - Detectron2 is FAIR's next-generation platform for object detection, segmentation and other visual recognition tasks.

https://github.com/google-research/simclr - SimCLRv2 - Big Self-Supervised Models are Strong Semi-Supervised Learners, arxiv.org/abs/2006.10029.

https://github.com/OlafenwaMoses/ImageAI - A python library built to empower developers to build applications and systems with self-contained Computer Vision capabilities.

https://github.com/Spijkervet/SimCLR - PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations by T. Chen et al.

https://github.com/sthalles/SimCLR - PyTorch implementation of SimCLR: A Simple Framework for Contrastive Learning of Visual Representations.

https://github.com/zzd1992/Image-Local-Attention - A better PyTorch implementation of image local attention which reduces the GPU memory by an order of magnitude.

GEOMETRIC DEEP LEARNING, GRAPH ALGORITHMS

https://github.com/shaoxiongji/knowledge-graphs - A collection of research on knowledge graphs.

https://github.com/dmlc/dgl - Python package built to ease deep learning on graph, on top of existing DL frameworks.

https://github.com/geomstats/geomstats - Computations and statistics on manifolds with geometric structures.

https://github.com/DeepGraphLearning/LiteratureDL4Graph - A comprehensive collection of recent papers on graph deep learning.

https://github.com/deepmind/graph_nets - Build Graph Nets in Tensorflow.

https://github.com/HazyResearch/hyperbolics

Hyperbolic embedding implementations of Representation Tradeoffs for Hyperbolic Embeddings + product embedding implementations of Learning Mixed-Curvature Representations in Product Spaces https://dawn.cs.stanford.edu/2019/10/10/noneuclidean/

https://github.com/jcyk/gtos - Code for AAAI2020 paper "Graph Transformer for Graph-to-Sequence Learning", https://arxiv.org/abs/1911.07470v2

https://github.com/facebookresearch/poincare-embeddings - PyTorch implementation of the NIPS-17 paper "Poincaré Embeddings for Learning Hierarchical Representations", https://papers.nips.cc/paper/7213-poincare-embeddings-for-learning-hierarchical-representations

https://github.com/lateral/poincare-embeddings - A multi-threaded C++ implementation of Nickel & Kiela's "Poincare Embeddings" paper from NIPS 2017, following the implementation of the authors.

https://github.com/williamleif/GraphSAGE - Representation learning on large graphs using stochastic graph convolutions.

https://github.com/lucidrains/adjacent-attention-network - Graph neural network message passing reframed as a Transformer with local attention.

https://github.com/pyg-team/pytorch_geometric - Graph Neural Network Library for PyTorch

https://github.com/graph4ai/graph4nlp - Graph4nlp is the library for the easy use of Graph Neural Networks for NLP. Welcome to visit our DLG4NLP website (https://dlg4nlp.github.io/index.html) for various learning resources!

https://github.com/benedekrozemberczki/pytorch_geometric_temporal - PyTorch Geometric Temporal: Spatiotemporal Signal Processing with Neural Machine Learning Models (CIKM 2021)

https://github.com/HazyResearch/hgcn - Hyperbolic Graph Convolutional Networks in PyTorch.

https://github.com/dalab/hyperbolic_nn - Source code for the paper "Hyperbolic Neural Networks", https://arxiv.org/abs/1805.09112

https://github.com/daiquocnguyen/Graph-Transformer - Transformer for Graph Classification (Pytorch and Tensorflow)

https://github.com/weihua916/powerful-gnns - How Powerful are Graph Neural Networks?

https://github.com/vuptran/graph-representation-learning - Autoencoders for Link Prediction and Semi-Supervised Node Classification (DSAA 2018)

https://github.com/THUDM/GCC - GCC: Graph Contrastive Coding for Graph Neural Network Pre-Training

https://github.com/bluer555/KernelGCN - Codes for NIPS 2019 Paper: Rethinking Kernel Methods for Node Representation Learning on Graphs

https://github.com/phanein/deepwalk - DeepWalk - Deep Learning for Graphs.

https://github.com/shenweichen/GraphEmbedding - Implementation and experiments of graph embedding algorithms.

https://github.com/benedekrozemberczki/karateclub - Karate Club: An API Oriented Open-source Python Framework for Unsupervised Learning on Graphs (CIKM 2020).

https://github.com/gordicaleksa/pytorch-GAT - This repo contains a PyTorch implementation of the original GAT paper (Graph Attention Network by Veličković et al.).

It's aimed at making it easy to start playing and learning about GAT and GNNs in general.

EVOLUTION ALGORITHMS

https://github.com/uber-research/deep-neuroevolution - This repo contains distributed implementations of the algorithms described in:

https://arxiv.org/abs/1712.06567 https://arxiv.org/abs/1712.06560 https://eng.uber.com/deep-neuroevolution/

REINFORCEMENT LEARNING

https://github.com/YuhangSong/Arena-Baselines

Arena is a general evaluation platform and building toolkit for single/multi-agent intelligence. As part of the Arena project, this repository provides implementations of state-of-the-art deep single/multi-agent reinforcement learning baselines.

https://github.com/keras-rl/keras-rl - keras-rl implements some state-of-the-art deep reinforcement learning algorithms in Python and seamlessly integrates with the deep learning library Keras.

https://github.com/junhyukoh/deep-reinforcement-learning-papers - A list of recent papers regarding deep reinforcement learning

https://github.com/dennybritz/reinforcement-learning - Implementation of Reinforcement Learning Algorithms. Python, OpenAI Gym, Tensorflow. Exercises and Solutions to accompany Sutton's Book and David Silver's course.

https://github.com/openai/gym - A toolkit for developing and comparing reinforcement learning algorithms.

https://github.com/openai/universe - Universe: a software platform for measuring and training an AI's general intelligence across the world's supply of games, websites and other applications.

https://github.com/openai/spinningup - An educational resource to help anyone learn deep reinforcement learning. https://spinningup.openai.com/

https://github.com/tensorforce/tensorforce - Tensorforce: a TensorFlow library for applied reinforcement learning

https://github.com/openai/baselines - OpenAI Baselines: high-quality implementations of reinforcement learning algorithms.

https://github.com/BY571/Munchausen-RL - PyTorch implementation of the Munchausen Reinforcement Learning Algorithms M-DQN and M-IQN

DATASETS

https://github.com/huggingface/datasets - The largest hub of ready-to-use datasets for ML models with fast, easy-to-use and efficient data manipulation tools.

https://pile.eleuther.ai/ - The Pile is an 825 GiB diverse, open source language modelling data set that consists of 22 smaller, high-quality datasets combined together.

https://github.com/awesomedata/awesome-public-datasets - A topic-centric list of HQ open datasets.

https://github.com/niderhoff/nlp-datasets - Alphabetical list of free/public domain datasets with text data for use in Natural Language Processing (NLP).

https://github.com/ad-freiburg/large-qa-datasets - A collection of large datasets containing questions and their answers for use in Natural Language Processing tasks like question answering (QA). Datasets are sorted by year of publication.

https://github.com/makcedward/nlpaug - Data augmentation for NLP

  • Generate synthetic data for improving model performance without manual effort
  • Simple, easy-to-use and lightweight library; augment data in 3 lines of code
  • Plug and play with any machine learning / neural network framework (e.g. scikit-learn, PyTorch, TensorFlow)
  • Support for textual and audio input
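
For example, a word-level synonym augmenter (module path and class name as in the project's docs; treat as approximate):

```python
import nlpaug.augmenter.word as naw  # per the project's docs

aug = naw.SynonymAug(aug_src='wordnet')  # replace words with WordNet synonyms
print(aug.augment("The quick brown fox jumps over the lazy dog"))
```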

https://github.com/scrapy/scrapy - Scrapy, a fast high-level web crawling & scraping framework for Python.

https://github.com/chiphuyen/lazynlp - Library to scrape and clean web pages to create massive datasets.

https://github.com/rodrigopivi/Chatito - Generate datasets for AI chatbots, NLP tasks, named entity recognition or text classification models using a simple DSL!

https://github.com/snorkel-team/snorkel - A system for quickly generating training data with weak supervision.

https://github.com/clips/pattern - Web mining module for Python, with tools for scraping, natural language processing, machine learning, network analysis and visualization.

https://github.com/lorien/awesome-web-scraping - List of libraries, tools and APIs for web scraping and data processing.

https://github.com/attardi/wikiextractor - A tool for extracting plain text from Wikipedia dumps.

https://github.com/coqui-ai/open-speech-corpora - A list of accessible speech corpora for ASR, TTS, and other Speech Technologies

https://github.com/first20hours/google-10000-english - This repo contains a list of the 10,000 most common English words in order of frequency, as determined by n-gram frequency analysis of Google's Trillion Word Corpus.

https://github.com/robvanvolt/DALLE-datasets - This is a summary of easily available datasets for generalized DALLE-pytorch training.

https://github.com/facebookresearch/AugLy - A data augmentations library for audio, image, text, and video.

OTHERS

https://github.com/spencermountain/compromise - compromise is a javascript library that interprets and pre-parses text.

https://github.com/mindsdb/mindsdb/ - Predictive AI layer for existing databases.

  • AI Tables. Move your models instantly to production, reduce resources, and overhead costs with AI Tables that deliver the results as database tables.
  • Explainable AI. Use MindsDB Studio to interpret predictions made by the model. Identify potential data biases, evaluate and visualize model accuracy using the Explainable AI.

https://github.com/jina-ai/jina/

Jina is an open-source deep-learning powered search framework, empowering developers to create cross-modal or multi-modal search systems for text, images, video, and audio. Jina is a cloud-native project, and provides long-term support from a full-time, venture-backed team.

https://github.com/EpistasisLab/tpot - A Python Automated Machine Learning tool that optimizes machine learning pipelines using genetic programming.

https://github.com/bqplot/bqplot - Plotting library for IPython/Jupyter notebooks.

https://github.com/rushter/MLAlgorithms - Minimal and clean examples of machine learning algorithms implementations.

https://github.com/hundredblocks/semantic-search - This repository contains a barebones implementation of a semantic search engine. The implementation is based on leveraging pre-trained embeddings from VGG16 (trained on Imagenet), and GloVe (trained on Wikipedia). It allows you to:

  • Find similar images to an input image
  • Find similar words to an input word
  • Search through images using any word
  • Generate tags for any image

https://github.com/hora-search/hora - efficient approximate nearest neighbor search algorithm collections library written in Rust.

https://github.com/rom1504/awesome-semantic-search - Semantic search with embeddings: index anything.

https://github.com/Agrover112/awesome-semantic-search - A curated list of awesome resources related to Semantic Search and Semantic Similarity tasks.

https://github.com/jina-ai/examples

HASKELL

https://github.com/mikeizbicki/HLearn - Homomorphic machine learning

https://github.com/brunjlar/neural - Neural Nets in native Haskell

https://github.com/creswick/chatter - A library of Natural Language Processing algorithms for Haskell.

https://github.com/roelvandijk/numerals - Convert numbers to number words

https://github.com/mrkkrp/megaparsec - Industrial-strength monadic parser combinator library.

PROLOG

https://github.com/terminusdb/terminusdb - TerminusDB is an open source knowledge graph and document store. Use it to build versioned data products.

Remark: it's implemented in Prolog, so it's a great case for learning how to write Prolog code.

https://github.com/aarroyoc/postgresql-prolog - A Prolog library to connect to PostgreSQL databases

NEW PHILOSOPHY (DL)

https://github.com/Chen-Cai-OSU/awesome-equivariant-network - Paper list for equivariant neural network

https://github.com/lucidrains/lie-transformer-pytorch - Implementation of Lie Transformer, Equivariant Self-Attention, in Pytorch.

https://github.com/Guillemdb/FractalAI - Cellular automaton-based calculus for the masses

https://github.com/guillemdb/fragile - Framework for building algorithms based on FractalAI theory.

https://github.com/lolemacs/pytorch-eventprop - A PyTorch implementation of EventProp [https://arxiv.org/abs/2009.08378], a method to train Spiking Neural Networks.

https://github.com/opencog - AGI - Machine Learning, Language, Reasoning, Robotics

https://github.com/openai/triton - This is the development repository of Triton, a language and compiler for writing highly efficient custom Deep-Learning primitives.

The aim of Triton is to provide an open-source environment to write fast code at higher productivity than CUDA, but also with higher flexibility than other existing DSLs. The foundations of this project are described in the following MAPL2019 publication, http://www.eecs.harvard.edu/~htk/publication/2019-mapl-tillet-kung-cox.pdf
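
The flavor of the language, following the official vector-addition tutorial (the block size is arbitrary):

```python
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                         # one program per block
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                         # guard the ragged tail
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

x = torch.randn(4096, device='cuda')
y = torch.randn(4096, device='cuda')
out = torch.empty_like(x)
grid = (triton.cdiv(4096, 1024),)
add_kernel[grid](x, y, out, 4096, BLOCK_SIZE=1024)
```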

https://github.com/JonathanSalwan/Triton

Triton is a Dynamic Binary Analysis (DBA) framework. It provides internal components like a Dynamic Symbolic Execution (DSE) engine, a dynamic taint engine, AST representations of the x86, x86-64, ARM32 and AArch64 Instruction Set Architectures (ISA), SMT simplification passes, an SMT solver interface and, last but not least, Python bindings.
