Title Tweets Citations Organization Country Org Type
AlphaFold Protein Structure Database: massively expanding the structural coverage of protein-sequence space with high-accuracy models 1331 DeepMind, European Molecular Biology Laboratory UK academia
ColabFold: making protein folding accessible to all 1138 Harvard University, Max Planck Institute for Multidisciplinary Sciences, Michigan State University, Seoul National University, University of Tokyo Germany, Japan, South Korea, USA academia
A ConvNet for the 2020s 857 835 Meta, UC Berkeley USA industry
Hierarchical Text-Conditional Image Generation with CLIP Latents 105 718 OpenAI USA industry
PaLM: Scaling Language Modeling with Pathways 445 426 Google USA industry
Photorealistic Text-to-Image Diffusion Models with Deep Language Understanding 2462 390 Google USA industry
Instant Neural Graphics Primitives with a Multiresolution Hash Encoding 11 342 NVIDIA USA industry
SignalP 6.0 predicts all five types of signal peptides using protein language models 274 Copenhagen University Hospital, ETH Zurich, Stanford University, Stockholm University, Technical University of Denmark, University of Copenhagen, Wellcome Genome Campus Denmark, Sweden, Switzerland, UK, USA academia
Swin Transformer V2: Scaling Up Capacity and Resolution 87 266 Huazhong University of Science and Technology, Microsoft, Tsinghua University, University of Science and Technology of China, Xi’an Jiaotong University China, USA industry
Training language models to follow instructions with human feedback 448 254 OpenAI USA industry
Chain of Thought Prompting Elicits Reasoning in Large Language Models 378 224 Google USA industry
Flamingo: a Visual Language Model for Few-Shot Learning 71 218 DeepMind UK industry
Classifier-Free Diffusion Guidance 53 194 Google Netherlands, USA industry
Magnetic control of tokamak plasmas through deep reinforcement learning 194 DeepMind, École Polytechnique Fédérale de Lausanne, Meta Switzerland, UK, USA industry
data2vec: A General Framework for Self-supervised Learning in Speech, Vision and Language 191 Meta USA industry
OPT: Open Pre-trained Transformer Language Models 812 187 Meta USA industry
BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation 79 184 Salesforce USA industry
A Generalist Agent 231 180 DeepMind UK industry
LaMDA: Language Models for Dialog Applications 473 180 Google USA industry
CMT: Convolutional Neural Networks Meet Vision Transformers 172 Huawei, University of Sydney Australia, China academia
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model 271 158 Microsoft, NVIDIA USA industry
What Makes Good In-Context Examples for GPT-3? 157 Duke University, Meta, Microsoft USA industry
Ensemble unsupervised autoencoders and Gaussian mixture model for cyberattack detection 145 Ningbo University of Technology, Tongji University China academia
Training Compute-Optimal Large Language Models 144 DeepMind UK industry
Learning robust perceptive locomotion for quadrupedal robots in the wild 3 141 ETH Zurich, Intel, Korea Advanced Institute of Science and Technology South Korea, Switzerland, USA academia
Do As I Can, Not As I Say: Grounding Language in Robotic Affordances 82 135 Everyday Robots, Google USA industry
How Do Vision Transformers Work? 193 129 NAVER, Yonsei University South Korea academia
Scaling Up Your Kernels to 31x31: Revisiting Large Kernel Design in CNNs 30 127 Aberystwyth University, Megvii, Tsinghua University China, UK industry
Large Language Models are Zero-Shot Reasoners 862 124 Google, University of Tokyo Japan, USA academia
Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time 122 Columbia University, Google, Meta, Tel Aviv University, University of Washington Israel, USA academia
Patches Are All You Need? 117 116 Bosch, Carnegie Mellon University USA academia, industry
Competition-Level Code Generation with AlphaCode 113 DeepMind UK industry
TensoRF: Tensorial Radiance Fields 73 110 Adobe, ShanghaiTech University, UC San Diego, University of Tubingen China, Germany, USA academia
Video Diffusion Models 103 Google Netherlands, USA industry
Data Analytics for the Identification of Fake Reviews Using Supervised Learning 102 Albaha University, Dr. Babasaheb Ambedkar Marathwada University, King Faisal University, Nahrain University India, Iraq, Saudi Arabia academia
Visual Prompt Tuning 26 102 Cornell University, Meta, University of Copenhagen Denmark, USA industry
DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection 15 100 Hong Kong University of Science and Technology, International Digital Economy Academy, Tsinghua University China academia
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training 66 100 Nanjing University, Shanghai AI Lab, Tencent China academia
Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? 199 99 Allen Institute for Artificial Intelligence, Meta, University of Washington USA academia, industry
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers 11 96 Nanjing University, Shanghai AI Lab, University of Hong Kong China academia
Conditional Prompt Learning for Vision-Language Models 51 93 Nanyang Technological University Singapore academia
Fine-Tuning can Distort Pretrained Features and Underperform Out-of-Distribution 151 93 Stanford University USA academia
Measuring and Improving the Use of Graph Information in Graph Neural Networks 1 93 Chinese University of Hong Kong, National University of Singapore China, Singapore academia
Exploring Plain Vision Transformer Backbones for Object Detection 205 91 Meta USA industry
GeoDiff: a Geometric Diffusion Model for Molecular Conformation Generation 26 90 CIFAR, HEC Montreal, Mila, Stanford University, University of Montreal Canada, USA academia
OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework 91 88 Alibaba Group China industry
Block-NeRF: Scalable Large Scene Neural View Synthesis 641 86 Google, UC Berkeley, Waymo USA industry
Language Models as Zero-Shot Planners: Extracting Actionable Knowledge for Embodied Agents 24 86 Carnegie Mellon University, Google, UC Berkeley USA industry
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models 881 81 AI Objectives Institute, Allen Institute for Artificial Intelligence, Amazon, Amelia, Amirkabir University of Technology, Anthropic, Apergo, Arizona State University, Bauhaus-Universität Weimar, Bluevine, Carnegie Mellon University, Carnegie Robotics, Charles River Analytics, Columbia University, Complutense University of Madrid, Conjecture, Cornell University, De Anza College, DeepMind, Duke Kunshan University, Duke University, ETH Zurich, EleutherAI, Ford Motor Company, Fraunhofer Institute for Integrated Circuits, Georgia Institute of Technology, Google, Hacettepe University, Harker School, Harvard University, Heidelberg Institute for Theoretical Studies, Hong Kong University of Science and Technology, IBM, Illinois Mathematics and Science Academy, Imperial College London, Indian Institute of Technology Madras, Juelich Research Center, KU Leuven, Karlsruhe Institute of Technology, King’s College London, Koç University, Lawrence Berkeley National Laboratory, Leipzig University, Ludwig Maximilian University of Munich, MIT, ML Collective, Martin-Luther-University Halle-Wittenberg, Max Planck Institute for Intelligent Systems, Max Planck Institute for Mathematics in the Sciences, McGill University, McMaster University, Meta, Microsoft, Mila, MosaicML, NAVER, NUST School of Electrical Engineering and Computer Science, National Public School, HSR, National Research Council Canada, National University of Singapore, NeuralSpace, Neurospin, New York University, NoOverfitting Lab, OpenAI, Ought, Peking University, Penn State University, Princeton University, Queen’s University, Research Institutes of Sweden, Rice University, Rutgers University, Saarland University, Salesforce, Sapienza University of Rome, Sharif University of Technology, Stanford University, Strathmore University, Synthego Corporation, Technion, Tel Aviv University, Thapar Institute of Engineering and Technology, Thomson Reuters Special Services, TomTom, Toyota Technological Institute at Chicago, Tufts University, UC Berkeley, UC Irvine, UC Los Angeles, UC San Diego, Umeå University, UnifyID labs, University of Amsterdam, University of Bristol, University of Cambridge, University of Edinburgh, University of Hamburg, University of Heidelberg, University of Hong Kong, University of Illinois Urbana-Champaign, University of Memphis, University of Michigan, University of Milano-Bicocca, University of North Carolina at Chapel Hill, University of Notre Dame, University of Oxford, University of Pennsylvania, University of Potsdam, University of Southern California, University of Tehran, University of Texas at Austin, University of Toronto, University of Tsukuba, University of Tubingen, University of Utah, University of Virginia, University of Washington, University of Wisconsin-Madison, Valencia Polytechnic University, Wrocław University of Science and Technology, Yale University Belgium, Canada, China, France, Germany, India, Iran, Israel, Italy, Japan, Kenya, Netherlands, Pakistan, Poland, Singapore, South Korea, Spain, Sweden, Switzerland, Turkey, UK, USA academia
Outracing champion Gran Turismo drivers with deep reinforcement learning 80 Sony Japan industry
BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning 10 77 Everyday Robots, Google, Stanford University, UC Berkeley USA academia
DN-DETR: Accelerate DETR Training by Introducing Query DeNoising 74 Hong Kong University of Science and Technology, International Digital Economy Academy, Tsinghua University China academia
Emergent Abilities of Large Language Models 442 74 DeepMind, Google, Stanford University, University of North Carolina at Chapel Hill UK, USA academia, industry
Equivariant Diffusion for Molecule Generation in 3D 131 73 École Polytechnique Fédérale de Lausanne, University of Amsterdam Netherlands, Switzerland academia
Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images 6 73 NVIDIA, Vanderbilt University USA industry
GPT-NeoX-20B: An Open-Source Autoregressive Language Model 50 72 EleutherAI industry
Online reinforcement learning multiplayer non-zero sum games of continuous-time Markov jump linear systems 72 Anhui University, Chengdu University, Murdoch University, University of Kragujevac Australia, China, Serbia academia
Self-consistency improves chain of thought reasoning in language models 290 71 Google USA industry
Detecting Twenty-thousand Classes using Image-level Supervision 35 70 Meta, University of Texas at Austin USA industry
Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network 68 Wuhan University China academia
LAION-5B: An open large-scale dataset for training next generation image-text models 53 66 Juelich Research Center, LAION, Stability AI, Technical University of Darmstadt, Technical University of Munich, UC Berkeley, University of Washington Germany, USA industry
Denoising Diffusion Restoration Models 65 NVIDIA, Stanford University, Technion Israel, USA industry
VQGAN-CLIP: Open Domain Image Generation and Editing with Natural Language Guidance 175 64 AiDock, Booz Allen Hamilton, EleutherAI, Georgia Institute of Technology Israel, USA industry
CLIP-NeRF: Text-and-Image Driven Manipulation of Neural Radiance Fields 33 63 City University of Hong Kong, Microsoft, Snap Inc., University of Southern California China, USA academia
Solving Quantitative Reasoning Problems with Language Models 139 63 Google USA industry
Masked Autoencoders As Spatiotemporal Learners 120 61 Meta USA industry
Why do tree-based models still outperform deep learning on tabular data? 646 60 CNRS, INRIA, Sorbonne University France academia
Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language 499 59 Google USA industry
ViTAEv2: Vision Transformer Advanced by Exploring Inductive Bias for Image Recognition and Beyond 2 59 JD Explore Academy, University of Sydney Australia, China academia, industry
Image as a Foreign Language: BEiT Pretraining for All Vision and Vision-Language Tasks 178 58 Microsoft USA industry
Language-driven Semantic Segmentation 95 57 Apple, Cornell University, Intel, University of Copenhagen Denmark, USA industry
Vision-Language Pre-Training with Triple Contrastive Learning 34 56 Amazon, University of Texas at Arlington USA academia
Deep Reinforcement Learning-Based Path Control and Optimization for Unmanned Ships 55 Sipivt, Tongji University China industry
EquiBind: Geometric Deep Learning for Drug Binding Structure Prediction 208 54 MIT USA academia
Omnivore: A Single Model for Many Visual Modalities 89 54 Meta USA industry
Quantifying Memorization Across Neural Language Models 106 54 Cornell University, Google, University of Pennsylvania USA industry
DeepFusion: Lidar-Camera Deep Fusion for Multi-Modal 3D Object Detection 36 53 Google, Johns Hopkins University USA industry
Genetic Algorithm-Based Trajectory Optimization for Digital Twin Robots 53 Wuhan University of Science and Technology China academia
Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors 280 53 Meta USA industry
Discovering faster matrix multiplication algorithms with reinforcement learning 52 DeepMind UK industry
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation 221 52 Boston University, Google USA industry
PETR: Position Embedding Transformation for Multi-View 3D Object Detection 4 52 Megvii China industry
Protein structure predictions to atomic accuracy with AlphaFold 51 DeepMind UK industry
ABAW: Valence-Arousal Estimation, Expression Recognition, Action Unit Detection & Multi-Task Learning Challenges 2 50 Queen Mary University of London UK academia
HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video 72 50 Google, University of Washington USA academia, industry
UnifiedSKG: Unifying and Multi-Tasking Structured Knowledge Grounding with Text-to-Text Language Models 38 49 Allen Institute for Artificial Intelligence, Carnegie Mellon University, George Mason University, Google, Meta, Penn State University, Salesforce, ServiceNow Research, Shanghai AI Lab, Stanford University, UC Berkeley, University of Edinburgh, University of Hong Kong, University of Illinois Urbana-Champaign, University of Washington, University of Waterloo, Yale University Canada, China, UK, USA academia
A Systematic Evaluation of Large Language Models of Code 61 48 Carnegie Mellon University USA academia
Robust Speech Recognition via Large-Scale Weak Supervision 40 48 OpenAI USA industry
Diffusion Models: A Comprehensive Survey of Methods and Applications 274 47 Beijing University of Posts and Telecommunications, Carnegie Mellon University, HEC Montreal, Mila, OpenAI, Peking University, UC Los Angeles, UC Merced Canada, China, USA academia
Can language models learn from explanations in context? 113 46 DeepMind UK industry
NELA-GT-2021: A Large Multi-Labelled News Dataset for The Study of Misinformation in News Articles 9 46 Rensselaer Polytechnic Institute, University of Tennessee Knoxville USA academia
ActionFormer: Localizing Moments of Actions with Transformers 44 4Paradigm Inc., Nanjing University, University of Wisconsin-Madison China, USA academia
DeiT III: Revenge of the ViT 115 44 Meta, Sorbonne University France, USA academia, industry
Least-to-Most Prompting Enables Complex Reasoning in Large Language Models 44 Google USA industry
Diffusion-LM Improves Controllable Text Generation 253 43 Stanford University USA academia
Overview of The Shared Task on Homophobia and Transphobia Detection in Social Media Comments 41 Indian Institute of Information Technology and Management, Madurai Kamaraj University, National University of Ireland Galway, SSN College of Engineering India, Ireland academia
Text and Code Embeddings by Contrastive Pre-Training 23 40 OpenAI USA industry
Winoground: Probing Vision and Language Models for Visio-Linguistic Compositionality 125 40 Hugging Face, Meta, University College London, University of Waterloo Canada, UK, USA industry
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model 325 39 BigScience Team France industry
Red Teaming Language Models with Language Models 40 39 DeepMind, New York University UK, USA industry
Transformer Memory as a Differentiable Search Index 372 39 Google USA industry
Torsional Diffusion for Molecular Conformer Generation 109 38 Harvard University, MIT USA academia
Unified Contrastive Learning in Image-Text-Label Space 66 37 Microsoft USA industry
Benchmarking Generalization via In-Context Instructions on 1,600+ Language Tasks 149 36 Allen Institute for Artificial Intelligence, Amirkabir University of Technology, Arizona State University, Columbia University, Factored AI, Government Polytechnic Rajkot, Indian Institute of Technology Kharagpur, Indian Institute of Technology Madras, Johns Hopkins University, Microsoft, National Institute of Technology Karnataka, National University of Singapore, PSG College Of Technology, Sharif University of Technology, Stanford University, Tata Consultancy Services, UC Berkeley, University of Amsterdam, University of Massachusetts Amherst, University of Washington, Zycus Infotech India, Iran, Netherlands, Singapore, USA academia
yuweihao commented Mar 6, 2023

Hi @sergicastellasape,

Many thanks for this helpful list! I would also like to recommend our paper MetaFormer (CVPR 2022), which is not listed but has 152 citations according to Google Scholar.

MetaFormer Is Actually What You Need for Vision 168 152 National University of Singapore, Sea AI Lab Singapore academia, industry

Thank you very much for this helpful list.

