@sergicastellasape · Last active February 29, 2024 08:53
| Paper | Citations | Affiliations |
| --- | --- | --- |
| An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale | 12042 | Google |
| A Simple Framework for Contrastive Learning of Visual Representations | 8476 | Google |
| Language Models are Few-Shot Learners | 7903 | OpenAI |
| YOLOv4: Optimal Speed and Accuracy of Object Detection | 7860 | Academia Sinica |
| Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer | 6362 | Google |
| Momentum Contrast for Unsupervised Visual Representation Learning | 6060 | Meta |
| End-to-End Object Detection with Transformers | 4998 | Meta, Paris Dauphine University |
| Analyzing and Improving the Image Quality of StyleGAN | 3101 | Aalto University, NVIDIA |
| EfficientDet: Scalable and Efficient Object Detection | 3081 | Google |
| Advances and Open Problems in Federated Learning | 2921 | Australian National University, Carnegie Mellon University, Cornell University, Emory University, École Polytechnique Fédérale de Lausanne, Georgia Institute of Technology, Google, Hong Kong University of Science and Technology, INRIA, IT University of Copenhagen, MIT, Nanyang Technological University, Princeton University, Rutgers University, Stanford University, UC Berkeley, UC San Diego, University of Illinois Urbana-Champaign, University of Oulu, University of Pittsburgh, University of Southern California, University of Virginia, University of Warwick, University of Washington, University of Wisconsin-Madison |
| Unsupervised Cross-lingual Representation Learning at Scale | 2857 | Meta |
| Bootstrap your own latent: A new approach to self-supervised Learning | 2827 | DeepMind, Imperial College London |
| Training data-efficient image transformers & distillation through attention | 2558 | Meta, Sorbonne University |
| Random Erasing Data Augmentation | 2453 | University of Technology Sydney, Xiamen University |
| nuScenes: A Multimodal Dataset for Autonomous Driving | 2366 | nuTonomy |
| NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis | 2283 | Google, UC Berkeley, UC San Diego |
| ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators | 2142 | CIFAR, Google, Stanford University |
| Improved protein structure prediction using potentials from deep learning | 2121 | DeepMind, Francis Crick Institute, University College London |
| Transformers: State-of-the-Art Natural Language Processing | 2071 | Hugging Face |
| Unsupervised Learning of Visual Features by Contrasting Cluster Assignments | 1847 | INRIA, Meta |
| Supervised Contrastive Learning | 1835 | Boston University, Google, MIT, Snap Inc. |
| Improved Baselines with Momentum Contrastive Learning | 1782 | Meta |
| wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations | 1777 | Meta |
| Exploring Simple Siamese Representation Learning | 1767 | Meta |
| ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks | 1713 | Dalian University of Technology, Harbin Institute of Technology, Tianjin University |
| RandAugment: Practical Automated Data Augmentation with a Reduced Search Space | 1684 | Google |
| Self-Training With Noisy Student Improves ImageNet Classification | 1674 | Carnegie Mellon University, Google |
| Longformer: The Long-Document Transformer | 1650 | Allen Institute for Artificial Intelligence |
| FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence | 1603 | Google |
| Face2Face: Real-time Face Capture and Reenactment of RGB Videos | 1532 | Max Planck Institute for Informatics, Stanford University, University of Erlangen-Nuremberg |
| Image Segmentation Using Deep Learning: A Survey | 1450 | Qualcomm, Snapchat, UC Los Angeles, University of Extremadura, University of Texas at Dallas, University of Waterloo |
| Unsupervised Data Augmentation for Consistency Training | 1345 | Carnegie Mellon University, Google |
| Big Self-Supervised Models are Strong Semi-Supervised Learners | 1314 | Google |
| SpanBERT: Improving Pre-training by Representing and Predicting Spans | 1307 | Allen Institute for Artificial Intelligence, Meta, Princeton University, University of Washington |
| Conformer: Convolution-augmented Transformer for Speech Recognition | 1214 | Google |
| Scalability in Perception for Autonomous Driving: Waymo Open Dataset | 1209 | Google, Waymo |
| Denoising Diffusion Probabilistic Models | 1206 | UC Berkeley |
| Don't Stop Pretraining: Adapt Language Models to Domains and Tasks | 1150 | Allen Institute for Artificial Intelligence, University of Washington |
| LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation | 1141 | Hefei University of Technology, Kuaishou Technology, National University of Singapore, University of Science and Technology of China |
| Open Graph Benchmark: Datasets for Machine Learning on Graphs | 1112 | Harvard University, Microsoft, Stanford University, Technical University Dortmund |
| Dense Passage Retrieval for Open-Domain Question Answering | 1106 | Meta, Princeton University, University of Washington |
| SCAFFOLD: Stochastic Controlled Averaging for Federated Learning | 1052 | École Polytechnique Fédérale de Lausanne, Google, New York University |
| Data-Efficient Image Recognition with Contrastive Predictive Coding | 1031 | DeepMind, UC Berkeley |
| Stanza: A Python Natural Language Processing Toolkit for Many Human Languages | 1016 | Stanford University |
| ResNeSt: Split-Attention Networks | 994 | Amazon, ByteDance, Meta, SenseTime, Snap Inc., UC Davis |
| Self-Supervised Learning of Pretext-Invariant Representations | 993 | Meta |
| Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks | 971 | Microsoft, University of Washington |
| Implicit Neural Representations with Periodic Activation Functions | 963 | Stanford University |
| TinyBERT: Distilling BERT for Natural Language Understanding | 954 | Huawei, Huazhong University of Science and Technology |
| Big Bird: Transformers for Longer Sequences | 929 | Google |
| BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning | 924 | Cornell University, Element, UC Berkeley, UC San Diego |
| StarGAN v2: Diverse Image Synthesis for Multiple Domains | 916 | NAVER |
| PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection | 899 | Chinese Academy of Sciences, Chinese University of Hong Kong, National Laboratory of Pattern Recognition, SenseTime |
| A Primer in BERTology: What we know about how BERT works | 892 | University of Copenhagen, University of Massachusetts Lowell |
| RAFT: Recurrent All-Pairs Field Transforms for Optical Flow | 860 | Princeton University |
| Multilingual Denoising Pre-training for Neural Machine Translation | 859 | Meta |
| Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection | 850 | Beijing University of Posts and Telecommunications, Chinese Academy of Sciences, National Laboratory of Pattern Recognition, University of Chinese Academy of Sciences, Westlake University |
| Knowledge Distillation: A Survey | 850 | University of London, University of Sydney |
| RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds | 848 | National University of Defense Technology, Sun Yat-sen University, University of Oxford |
| SuperGlue: Learning Feature Matching With Graph Neural Networks | 839 | ETH Zurich, Magic Leap |
| Generative Pretraining From Pixels | 838 | OpenAI |
| Pre-trained models for natural language processing: A survey | 832 | Fudan University |
| Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks | 811 | University of Tübingen |
| Deep Learning for Person Re-identification: A Survey and Outlook | 806 | Beijing Institute of Technology, Inception Institute of AI, Salesforce, Singapore Management University, University of Surrey, Wuhan University |
| Object-Contextual Representations for Semantic Segmentation | 800 | Chinese Academy of Sciences, Microsoft, University of Chinese Academy of Sciences |
| Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains | 774 | Google, UC Berkeley, UC San Diego |
| Scaled-YOLOv4: Scaling Cross Stage Partial Network | 758 | Academia Sinica, Intel, Providence University |
| Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere | 755 | MIT |
| Fast is better than free: Revisiting adversarial training | 744 | Bosch, Carnegie Mellon University |
| Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting | 741 | Beihang University, Rutgers University, Sharjah Economic Development Department, UC Berkeley |
| CodeBERT: A Pre-Trained Model for Programming and Natural Languages | 736 | Harbin Institute of Technology, Microsoft, Sun Yat-sen University |
| Big Transfer (BiT): General Visual Representation Learning | 704 | Google |
| Pre-Trained Image Processing Transformer | 703 | Huawei, Peking University, Peng Cheng Laboratory, University of Sydney |
| Rethinking Attention with Performers | 697 | Alan Turing Institute, DeepMind, Google, University of Cambridge |
| What Makes for Good Views for Contrastive Learning? | 686 | Brown University, Google, MIT |
| Score-Based Generative Modeling through Stochastic Differential Equations | 684 | Google, Stanford University |
| Graph Contrastive Learning with Augmentations | 682 | Google, Texas A&M University, University of Science and Technology of China, University of Texas at Austin |
| Interpreting the Latent Space of GANs for Semantic Face Editing | 675 | Chinese University of Hong Kong |
| Linformer: Self-Attention with Linear Complexity | 668 | Meta |
| MaskGAN: Towards Diverse and Interactive Facial Image Manipulation | 653 | Chinese University of Hong Kong, SenseTime, University of Hong Kong |
| Simple and Deep Graph Convolutional Networks | 653 | Alibaba Group, Fudan University, Renmin University of China |
| REALM: Retrieval-Augmented Language Model Pre-Training | 640 | Google |
| Single Path One-Shot Neural Architecture Search with Uniform Sampling | 627 | Hong Kong University of Science and Technology, Megvii, Tsinghua University |
| Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing | 602 | Microsoft |
| Tracking Objects as Points | 589 | Intel, University of Texas at Austin |
| Adaptive Federated Optimization | 588 | Google |
| Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference | 581 | Ludwig Maximilian University of Munich, Sulzer GmbH |
| Making Pre-trained Language Models Better Few-shot Learners | 576 | MIT, Princeton University |
| Recipes for building an open-domain chatbot | 575 | Meta |
| Meshed-Memory Transformer for Image Captioning | 572 | University of Modena and Reggio Emilia |
| Unicoder-VL: A Universal Encoder for Vision and Language by Cross-Modal Pre-Training | 568 | Microsoft, Peking University |
| Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention | 562 | École Polytechnique Fédérale de Lausanne, Idiap Research Institute, University of Geneva, University of Washington |
| Learning to Simulate Complex Physics with Graph Networks | 547 | DeepMind, Stanford University |
| Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment | 546 | Agency for Science, Technology and Research, MIT, University of Hong Kong |
| XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization | 536 | Carnegie Mellon University, DeepMind, Google |
| On Adaptive Attacks to Adversarial Example Defenses | 535 | Google, MIT, Stanford University, University of Tübingen |
| Circle Loss: A Unified Perspective of Pair Similarity Optimization | 534 | Australian National University, Beihang University, Megvii, Tsinghua University |
| Explaining machine learning classifiers through diverse counterfactual explanations | 527 | Microsoft, University of Colorado Boulder |
| Efficient Transformers: A Survey | 524 | Google |
| Exploring Self-attention for Image Recognition | 523 | Chinese University of Hong Kong, Intel |