sergicastellasape/top-cited-2020-papers.tsv

## top-cited-2020-papers.tsv

          
Title	Tweets	Citations	Organization	Country	Org Type
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale	142	12042	Google	USA	industry
A Simple Framework for Contrastive Learning of Visual Representations	16	8476	Google	USA	industry
Language Models are Few-Shot Learners	331	7903	OpenAI	USA	industry
YOLOv4: Optimal Speed and Accuracy of Object Detection	20	7860	Academia Sinica	Taiwan	industry
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.	53	6362	Google	USA	industry
Momentum Contrast for Unsupervised Visual Representation Learning	8	6060	Meta	USA	industry
End-to-End Object Detection with Transformers	43	4998	Meta, Paris Dauphine University	France, USA	industry
Analyzing and Improving the Image Quality of StyleGAN	44	3101	Aalto University, NVIDIA	Finland, USA	industry
EfficientDet: Scalable and Efficient Object Detection	7	3081	Google	USA	industry
Advances and Open Problems in Federated Learning	5	2921	Australian National University, Carnegie Mellon University, Cornell University, Emory University, École Polytechnique Fédérale de Lausanne, Georgia Institute of Technology, Google, Hong Kong University of Science and Technology, INRIA, IT University of Copenhagen, MIT, Nanyang Technological University, Princeton University, Rutgers University, Stanford University, UC Berkeley, UC San Diego, University of Illinois Urbana-Champaign, University of Oulu, University of Pittsburgh, University of Southern California, University of Virginia, University of Warwick, University of Washington, University of Wisconsin-Madison	Australia, China, Denmark, Finland, France, Singapore, Switzerland, UK, USA	industry
Unsupervised Cross-lingual Representation Learning at Scale	6	2857	Meta	USA	industry
Bootstrap your own latent: A new approach to self-supervised Learning	13	2827	DeepMind, Imperial College London	UK	industry
Training data-efficient image transformers & distillation through attention	45	2558	Meta, Sorbonne University	France, USA	industry
Random Erasing Data Augmentation.	2	2453	University of Technology Sydney, Xiamen University	Australia, China	academia
nuScenes: A Multimodal Dataset for Autonomous Driving	3	2366	nuTonomy	USA	industry
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis	188	2283	Google, UC Berkeley, UC San Diego	USA	academia
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators		2142	CIFAR, Google, Stanford University	Canada, USA	academia
Improved protein structure prediction using potentials from deep learning		2121	DeepMind, Francis Crick Institute, University College London	UK	industry
Transformers: State-of-the-Art Natural Language Processing		2071	Hugging Face	USA	industry
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments	12	1847	INRIA, Meta	France, USA	industry
Supervised Contrastive Learning	29	1835	Boston University, Google, MIT, Snap Inc.	USA	industry
Improved Baselines with Momentum Contrastive Learning	1	1782	Meta	USA	industry
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations	34	1777	Meta	USA	industry
Exploring Simple Siamese Representation Learning	73	1767	Meta	USA	industry
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks	5	1713	Dalian University of Technology, Harbin Institute of Technology, Tianjin University	China	academia
RandAugment: Practical Automated Data Augmentation with a Reduced Search Space	3	1684	Google	USA	industry
Self-Training With Noisy Student Improves ImageNet Classification	19	1674	Carnegie Mellon University, Google	USA	industry
Longformer: The Long-Document Transformer	55	1650	Allen Institute for Artificial Intelligence	USA	industry
FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence		1603	Google	USA	industry
Face2Face: Real-time Face Capture and Reenactment of RGB Videos	1	1532	Max Planck Institute for Informatics, Stanford University, University of Erlangen-Nuremberg	Germany, USA	academia
Image Segmentation Using Deep Learning: A Survey	13	1450	Qualcomm, Snapchat, UC Los Angeles, University of Extremadura, University of Texas at Dallas, University of Waterloo	Canada, Spain, USA	academia
Unsupervised Data Augmentation for Consistency Training	11	1345	Carnegie Mellon University, Google	USA	industry
Big Self-Supervised Models are Strong Semi-Supervised Learners	16	1314	Google	USA	industry
SpanBERT: Improving Pre-training by Representing and Predicting Spans	2	1307	Allen Institute for Artificial Intelligence, Meta, Princeton University, University of Washington	USA	industry
Conformer: Convolution-augmented Transformer for Speech Recognition	51	1214	Google	USA	industry
Scalability in Perception for Autonomous Driving: Waymo Open Dataset		1209	Google, Waymo	USA	industry
Denoising Diffusion Probabilistic Models	108	1206	UC Berkeley	USA	academia
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks	10	1150	Allen Institute for Artificial Intelligence, University of Washington	USA	academia, industry
LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation		1141	Hefei University of Technology, Kuaishou Technology, National University of Singapore, University of Science and Technology of China	China, Singapore	academia
Open Graph Benchmark: Datasets for Machine Learning on Graphs	22	1112	Harvard University, Microsoft, Stanford University, Technical University Dortmund	Germany, USA	academia
Dense Passage Retrieval for Open-Domain Question Answering	23	1106	Meta, Princeton University, University of Washington	USA	industry
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning	1	1052	École Polytechnique Fédérale de Lausanne, Google, New York University	Switzerland, USA	industry
Data-Efficient Image Recognition with Contrastive Predictive Coding		1031	DeepMind, UC Berkeley	UK, USA	industry
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages		1016	Stanford University	USA	academia
ResNeSt: Split-Attention Networks	5	994	Amazon, ByteDance, Meta, SenseTime, Snap Inc., UC Davis	China, USA	industry
Self-Supervised Learning of Pretext-Invariant Representations	2	993	Meta	USA	industry
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks	2	971	Microsoft, University of Washington	USA	industry
Implicit Neural Representations with Periodic Activation Functions	38	963	Stanford University	USA	academia
TinyBERT: Distilling BERT for Natural Language Understanding		954	Huawei, Huazhong University of Science and Technology	China	industry
Big Bird: Transformers for Longer Sequences	10	929	Google	USA	industry
BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning	2	924	Cornell University, Element, UC Berkeley, UC San Diego	Canada, USA	academia
StarGAN v2: Diverse Image Synthesis for Multiple Domains		916	NAVER	South Korea	industry
PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection		899	Chinese Academy of Sciences, Chinese University of Hong Kong, National Laboratory of Pattern Recognition, SenseTime	China	academia
A Primer in BERTology: What we know about how BERT works	20	892	University of Copenhagen, University of Massachusetts Lowell	Denmark, USA	academia
RAFT: Recurrent All-Pairs Field Transforms for Optical Flow	10	860	Princeton University	USA	academia
Multilingual Denoising Pre-training for Neural Machine Translation	1	859	Meta	USA	industry
Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection		850	Beijing University of Posts and Telecommunications, Chinese Academy of Sciences, National Laboratory of Pattern Recognition, University of Chinese Academy of Sciences, Westlake University	China	academia
Knowledge Distillation: A Survey	19	850	University of London, University of Sydney	Australia, UK	academia
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds	3	848	National University of Defense Technology, Sun Yat-sen University, University of Oxford	China, UK	academia
SuperGlue: Learning Feature Matching With Graph Neural Networks	10	839	ETH Zurich, Magic Leap	Switzerland, USA	industry
Generative Pretraining From Pixels		838	OpenAI	USA	industry
Pre-trained models for natural language processing: A survey	10	832	Fudan University	China	academia
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks		811	University of Tubingen	Germany	academia
Deep Learning for Person Re-identification: A Survey and Outlook		806	Beijing Institute of Technology, Inception Institute of AI, Salesforce, Singapore Management University, University of Surrey, Wuhan University	China, Singapore, UAE, UK, USA	academia, industry
Object-Contextual Representations for Semantic Segmentation		800	Chinese Academy of Sciences, Microsoft, University of Chinese Academy of Sciences	China, USA	industry
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains	14	774	Google, UC Berkeley, UC San Diego	USA	academia
Scaled-YOLOv4: Scaling Cross Stage Partial Network		758	Academia Sinica, Intel, Providence University	Taiwan, USA	academia, industry
Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere	3	755	MIT	USA	academia
Fast is better than free: Revisiting adversarial training		744	Bosch, Carnegie Mellon University	Germany, USA	academia, industry
Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting	35	741	Beihang University, Rutgers University, Sharjah Economic Development Department, UC Berkeley	China, UAE, USA	industry
CodeBERT: A Pre-Trained Model for Programming and Natural Languages	30	736	Harbin Institute of Technology, Microsoft, Sun Yat-sen University	China, USA	industry
Big Transfer (BiT): General Visual Representation Learning	51	704	Google	USA	industry
Pre-Trained Image Processing Transformer	3	703	Huawei, Peking University, Peng Cheng Laboratory, University of Sydney	Australia, China	academia
Rethinking Attention with Performers	38	697	Alan Turing Institute, DeepMind, Google, University of Cambridge	UK, USA	academia, industry
What makes for good views for contrastive learning	8	686	Brown University, Google, MIT	USA	academia
Score-Based Generative Modeling through Stochastic Differential Equations	54	684	Google, Stanford University	USA	industry
Graph Contrastive Learning with Augmentations	2	682	Google, Texas A&M University, University of Science and Technology of China, University of Texas at Austin	China, USA	academia
Interpreting the Latent Space of GANs for Semantic Face Editing	14	675	Chinese University of Hong Kong	China	academia
Linformer: Self-Attention with Linear Complexity	8	668	Meta	USA	industry
MaskGAN: Towards Diverse and Interactive Facial Image Manipulation		653	Chinese University of Hong Kong, SenseTime, University of Hong Kong	China	academia
Simple and Deep Graph Convolutional Networks	1	653	Alibaba Group, Fudan University, Renmin University of China	China	industry
REALM: Retrieval-Augmented Language Model Pre-Training	11	640	Google	USA	industry
Single Path One-Shot Neural Architecture Search with Uniform Sampling	1	627	Hong Kong University of Science and Technology, Megvii, Tsinghua University	China	industry
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing	18	602	Microsoft	USA	industry
Tracking Objects as Points	3	589	Intel, University of Texas at Austin	USA	academia
Adaptive Federated Optimization	2	588	Google	USA	industry
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference	7	581	Ludwig Maximilian University of Munich, Sulzer GmbH	Germany	academia
Making Pre-trained Language Models Better Few-shot Learners	13	576	MIT, Princeton University	USA	academia
Recipes for building an open-domain chatbot	12	575	Meta	USA	industry
Meshed-Memory Transformer for Image Captioning		572	University of Modena and Reggio Emilia	Italy	academia
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-Modal Pre-Training.		568	Microsoft, Peking University	China, USA	industry
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention	17	562	École Polytechnique Fédérale de Lausanne, Idiap Research Institute, University of Geneva, University of Washington	Switzerland, USA	academia
Learning to Simulate Complex Physics with Graph Networks	46	547	DeepMind, Stanford University	UK, USA	industry
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment.		546	Agency for Science, Technology and Research, MIT, University of Hong Kong	China, Singapore, USA	academia
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization	1	536	Carnegie Mellon University, DeepMind, Google	UK, USA	industry
On Adaptive Attacks to Adversarial Example Defenses	3	535	Google, MIT, Stanford University, University of Tubingen	Germany, USA	academia
Circle Loss: A Unified Perspective of Pair Similarity Optimization		534	Australian National University, Beihang University, Megvii, Tsinghua University	Australia, China	industry
Explaining machine learning classifiers through diverse counterfactual explanations.		527	Microsoft, University of Colorado Boulder	USA	academia
Efficient Transformers: A Survey	355	524	Google	USA	industry
Exploring Self-attention for Image Recognition		523	Chinese University of Hong Kong, Intel	China, USA	industry
Title	Tweets	Citations	Organization	Country	Org Type
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale	142	12042	Google	USA	industry
A Simple Framework for Contrastive Learning of Visual Representations	16	8476	Google	USA	industry
Language Models are Few-Shot Learners	331	7903	OpenAI	USA	industry
YOLOv4: Optimal Speed and Accuracy of Object Detection	20	7860	Academia Sinica	Taiwan	industry
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer.	53	6362	Google	USA	industry
Momentum Contrast for Unsupervised Visual Representation Learning	8	6060	Meta	USA	industry
End-to-End Object Detection with Transformers	43	4998	Meta, Paris Dauphine University	France, USA	industry
Analyzing and Improving the Image Quality of StyleGAN	44	3101	Aalto University, NVIDIA	Finland, USA	industry
EfficientDet: Scalable and Efficient Object Detection	7	3081	Google	USA	industry
Advances and Open Problems in Federated Learning	5	2921	Australian National University, Carnegie Mellon University, Cornell University, Emory University, École Polytechnique Fédérale de Lausanne, Georgia Institute of Technology, Google, Hong Kong University of Science and Technology, INRIA, IT University of Copenhagen, MIT, Nanyang Technological University, Princeton University, Rutgers University, Stanford University, UC Berkeley, UC San Diego, University of Illinois Urbana-Champaign, University of Oulu, University of Pittsburgh, University of Southern California, University of Virginia, University of Warwick, University of Washington, University of Wisconsin-Madison	Australia, China, Denmark, Finland, France, Singapore, Switzerland, UK, USA	industry
Unsupervised Cross-lingual Representation Learning at Scale	6	2857	Meta	USA	industry
Bootstrap your own latent: A new approach to self-supervised Learning	13	2827	DeepMind, Imperial College London	UK	industry
Training data-efficient image transformers & distillation through attention	45	2558	Meta, Sorbonne University	France, USA	industry
Random Erasing Data Augmentation.	2	2453	University of Technology Sydney, Xiamen University	Australia, China	academia
nuScenes: A Multimodal Dataset for Autonomous Driving	3	2366	nuTonomy	USA	industry
NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis	188	2283	Google, UC Berkeley, UC San Diego	USA	academia
ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators		2142	CIFAR, Google, Stanford University	Canada, USA	academia
Improved protein structure prediction using potentials from deep learning		2121	DeepMind, Francis Crick Institute, University College London	UK	industry
Transformers: State-of-the-Art Natural Language Processing		2071	Hugging Face	USA	industry
Unsupervised Learning of Visual Features by Contrasting Cluster Assignments	12	1847	INRIA, Meta	France, USA	industry
Supervised Contrastive Learning	29	1835	Boston University, Google, MIT, Snap Inc.	USA	industry
Improved Baselines with Momentum Contrastive Learning	1	1782	Meta	USA	industry
wav2vec 2.0: A Framework for Self-Supervised Learning of Speech Representations	34	1777	Meta	USA	industry
Exploring Simple Siamese Representation Learning	73	1767	Meta	USA	industry
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks	5	1713	Dalian University of Technology, Harbin Institute of Technology, Tianjin University	China	academia
RandAugment: Practical Automated Data Augmentation with a Reduced Search Space	3	1684	Google	USA	industry
Self-Training With Noisy Student Improves ImageNet Classification	19	1674	Carnegie Mellon University, Google	USA	industry
Longformer: The Long-Document Transformer	55	1650	Allen Institute for Artificial Intelligence	USA	industry
FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence		1603	Google	USA	industry
Face2Face: Real-time Face Capture and Reenactment of RGB Videos	1	1532	Max Planck Institute for Informatics, Stanford University, University of Erlangen-Nuremberg	Germany, USA	academia
Image Segmentation Using Deep Learning: A Survey	13	1450	Qualcomm, Snapchat, UC Los Angeles, University of Extremadura, University of Texas at Dallas, University of Waterloo	Canada, Spain, USA	academia
Unsupervised Data Augmentation for Consistency Training	11	1345	Carnegie Mellon University, Google	USA	industry
Big Self-Supervised Models are Strong Semi-Supervised Learners	16	1314	Google	USA	industry
SpanBERT: Improving Pre-training by Representing and Predicting Spans	2	1307	Allen Institute for Artificial Intelligence, Meta, Princeton University, University of Washington	USA	industry
Conformer: Convolution-augmented Transformer for Speech Recognition	51	1214	Google	USA	industry
Scalability in Perception for Autonomous Driving: Waymo Open Dataset		1209	Google, Waymo	USA	industry
Denoising Diffusion Probabilistic Models	108	1206	UC Berkeley	USA	academia
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks	10	1150	Allen Institute for Artificial Intelligence, University of Washington	USA	academia, industry
LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation		1141	Hefei University of Technology, Kuaishou Technology, National University of Singapore, University of Science and Technology of China	China, Singapore	academia
Open Graph Benchmark: Datasets for Machine Learning on Graphs	22	1112	Harvard University, Microsoft, Stanford University, Technical University Dortmund	Germany, USA	academia
Dense Passage Retrieval for Open-Domain Question Answering	23	1106	Meta, Princeton University, University of Washington	USA	industry
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning	1	1052	École Polytechnique Fédérale de Lausanne, Google, New York University	Switzerland, USA	industry
Data-Efficient Image Recognition with Contrastive Predictive Coding		1031	DeepMind, UC Berkeley	UK, USA	industry
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages		1016	Stanford University	USA	academia
ResNeSt: Split-Attention Networks	5	994	Amazon, ByteDance, Meta, SenseTime, Snap Inc., UC Davis	China, USA	industry
Self-Supervised Learning of Pretext-Invariant Representations	2	993	Meta	USA	industry
Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks	2	971	Microsoft, University of Washington	USA	industry
Implicit Neural Representations with Periodic Activation Functions	38	963	Stanford University	USA	academia
TinyBERT: Distilling BERT for Natural Language Understanding		954	Huawei, Huazhong University of Science and Technology	China	industry
Big Bird: Transformers for Longer Sequences	10	929	Google	USA	industry
BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning	2	924	Cornell University, Element, UC Berkeley, UC San Diego	Canada, USA	academia
StarGAN v2: Diverse Image Synthesis for Multiple Domains		916	NAVER	South Korea	industry
PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection		899	Chinese Academy of Sciences, Chinese University of Hong Kong, National Laboratory of Pattern Recognition, SenseTime	China	academia
A Primer in BERTology: What we know about how BERT works	20	892	University of Copenhagen, University of Massachusetts Lowell	Denmark, USA	academia
RAFT: Recurrent All-Pairs Field Transforms for Optical Flow	10	860	Princeton University	USA	academia
Multilingual Denoising Pre-training for Neural Machine Translation	1	859	Meta	USA	industry
Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection		850	Beijing University of Posts and Telecommunications, Chinese Academy of Sciences, National Laboratory of Pattern Recognition, University of Chinese Academy of Sciences, Westlake University	China	academia
Knowledge Distillation: A Survey	19	850	University of London, University of Sydney	Australia, UK	academia
RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds	3	848	National University of Defense Technology, Sun Yat-sen University, University of Oxford	China, UK	academia
SuperGlue: Learning Feature Matching With Graph Neural Networks	10	839	ETH Zurich, Magic Leap	Switzerland, USA	industry
Generative Pretraining From Pixels		838	OpenAI	USA	industry
Pre-trained models for natural language processing: A survey	10	832	Fudan University	China	academia
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks		811	University of Tubingen	Germany	academia
Deep Learning for Person Re-identification: A Survey and Outlook		806	Beijing Institute of Technology, Inception Institute of AI, Salesforce, Singapore Management University, University of Surrey, Wuhan University	China, Singapore, UAE, UK, USA	academia, industry
Object-Contextual Representations for Semantic Segmentation		800	Chinese Academy of Sciences, Microsoft, University of Chinese Academy of Sciences	China, USA	industry
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains	14	774	Google, UC Berkeley, UC San Diego	USA	academia
Scaled-YOLOv4: Scaling Cross Stage Partial Network		758	Academia Sinica, Intel, Providence University	Taiwan, USA	academia, industry
Understanding Contrastive Representation Learning through Alignment and Uniformity on the Hypersphere	3	755	MIT	USA	academia
Fast is better than free: Revisiting adversarial training		744	Bosch, Carnegie Mellon University	Germany, USA	academia, industry
Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting	35	741	Beihang University, Rutgers University, Sharjah Economic Development Department, UC Berkeley	China, UAE, USA	industry
CodeBERT: A Pre-Trained Model for Programming and Natural Languages	30	736	Harbin Institute of Technology, Microsoft, Sun Yat-sen University	China, USA	industry
Big Transfer (BiT): General Visual Representation Learning	51	704	Google	USA	industry
Pre-Trained Image Processing Transformer	3	703	Huawei, Peking University, Peng Cheng Laboratory, University of Sydney	Australia, China	academia
Rethinking Attention with Performers	38	697	Alan Turing Institute, DeepMind, Google, University of Cambridge	UK, USA	academia, industry
What makes for good views for contrastive learning?	8	686	Brown University, Google, MIT	USA	academia
Score-Based Generative Modeling through Stochastic Differential Equations	54	684	Google, Stanford University	USA	industry
Graph Contrastive Learning with Augmentations	2	682	Google, Texas A&M University, University of Science and Technology of China, University of Texas at Austin	China, USA	academia
Interpreting the Latent Space of GANs for Semantic Face Editing	14	675	Chinese University of Hong Kong	China	academia
Linformer: Self-Attention with Linear Complexity	8	668	Meta	USA	industry
MaskGAN: Towards Diverse and Interactive Facial Image Manipulation		653	Chinese University of Hong Kong, SenseTime, University of Hong Kong	China	academia
Simple and Deep Graph Convolutional Networks	1	653	Alibaba Group, Fudan University, Renmin University of China	China	industry
REALM: Retrieval-Augmented Language Model Pre-Training	11	640	Google	USA	industry
Single Path One-Shot Neural Architecture Search with Uniform Sampling	1	627	Hong Kong University of Science and Technology, Megvii, Tsinghua University	China	industry
Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing	18	602	Microsoft	USA	industry
Tracking Objects as Points	3	589	Intel, University of Texas at Austin	USA	academia
Adaptive Federated Optimization	2	588	Google	USA	industry
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference	7	581	Ludwig Maximilian University of Munich, Sulzer GmbH	Germany	academia
Making Pre-trained Language Models Better Few-shot Learners	13	576	MIT, Princeton University	USA	academia
Recipes for building an open-domain chatbot	12	575	Meta	USA	industry
Meshed-Memory Transformer for Image Captioning		572	University of Modena and Reggio Emilia	Italy	academia
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-Modal Pre-Training		568	Microsoft, Peking University	China, USA	industry
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention	17	562	École Polytechnique Fédérale de Lausanne, Idiap Research Institute, University of Geneva, University of Washington	Switzerland, USA	academia
Learning to Simulate Complex Physics with Graph Networks	46	547	DeepMind, Stanford University	UK, USA	industry
Is BERT Really Robust? A Strong Baseline for Natural Language Attack on Text Classification and Entailment		546	Agency for Science, Technology and Research, MIT, University of Hong Kong	China, Singapore, USA	academia
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization	1	536	Carnegie Mellon University, DeepMind, Google	UK, USA	industry
On Adaptive Attacks to Adversarial Example Defenses	3	535	Google, MIT, Stanford University, University of Tubingen	Germany, USA	academia
Circle Loss: A Unified Perspective of Pair Similarity Optimization		534	Australian National University, Beihang University, Megvii, Tsinghua University	Australia, China	industry
Explaining machine learning classifiers through diverse counterfactual explanations		527	Microsoft, University of Colorado Boulder	USA	academia
Efficient Transformers: A Survey	355	524	Google	USA	industry
Exploring Self-attention for Image Recognition		523	Chinese University of Hong Kong, Intel	China, USA	industry