Created March 8, 2023 08:50
Title | Tweets | Citations | Organization | Country | Org Type | |
---|---|---|---|---|---|---|
Highly accurate protein structure prediction with AlphaFold | | 8783 | DeepMind, Seoul National University | South Korea, UK | industry | |
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows | 383 | 5389 | Microsoft | USA | industry | |
Learning Transferable Visual Models From Natural Language Supervision | 178 | 3658 | OpenAI | USA | industry | |
Accurate prediction of protein structures and interactions using a three-track neural network | | 1659 | Harvard University, Lawrence Berkeley National Laboratory, North-West University, Stanford University, UC Berkeley, University of British Columbia, University of Cambridge, University of Graz, University of Texas Southwestern Medical Center, University of Victoria, University of Washington, University of the Free State | Austria, Canada, South Africa, UK, USA | | |
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions | 69 | 1306 | Inception Institute of AI, Nanjing University, Nanjing University of Science and Technology, SenseTime, University of Hong Kong | China, UAE | academia | |
Rethinking Semantic Segmentation From a Sequence-to-Sequence Perspective With Transformers | 20 | 1280 | Fudan University, Meta, Tencent, University of Oxford, University of Surrey | China, UK, USA | academia | |
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? | | 1241 | Black in AI, University of Washington | USA | academia | |
Masked Autoencoders Are Scalable Vision Learners | 843 | 1234 | Meta | USA | industry | |
Emerging Properties in Self-Supervised Vision Transformers | 269 | 1219 | INRIA, Meta, Sorbonne University | France, USA | industry | |
Review of deep learning: concepts, CNN architectures, challenges, applications, future directions | | 1210 | Queensland University of Technology | Australia | academia | |
nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation | | 1178 | DeepMind, German Cancer Research Center, Heidelberg University Hospital, University of Heidelberg | Germany, UK | academia | |
Zero-Shot Text-to-Image Generation | 155 | 1177 | OpenAI | USA | industry | |
TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation | 46 | 998 | East China Normal University, Johns Hopkins University, PAII Inc., Stanford University, University of Electronic Science and Technology of China | China, USA | academia | |
Barlow Twins: Self-Supervised Learning via Redundancy Reduction | 1076 | 951 | Meta, New York University | USA | industry | |
Tokens-to-Token ViT: Training Vision Transformers from Scratch on ImageNet | 13 | 912 | National University of Singapore, YITU Technology | China, Singapore | academia | |
MLP-Mixer: An all-MLP Architecture for Vision | 671 | 896 | | USA | industry | |
SimCSE: Simple Contrastive Learning of Sentence Embeddings | 85 | 866 | Princeton University, Tsinghua University | China, USA | academia | |
Coordinate Attention for Efficient Mobile Network Design | 49 | 860 | National University of Singapore | Singapore | academia | |
SegFormer: Simple and Efficient Design for Semantic Segmentation with Transformers | 100 | 831 | California Institute of Technology, NVIDIA, Nanjing University, University of Hong Kong | China, USA | academia | |
BEiT: BERT Pre-Training of Image Transformers | 143 | 785 | Harbin Institute of Technology, Microsoft | China, USA | industry | |
CvT: Introducing Convolutions to Vision Transformers | | 761 | McGill University, Microsoft | Canada, USA | industry | |
Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision | 41 | 759 | | USA | industry | |
Transformers in Vision: A Survey | 158 | 757 | Inception Institute of AI, Mohamed bin Zayed University of Artificial Intelligence, Monash University, University of Central Florida | Australia, UAE, USA | academia | |
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing | 201 | 737 | Carnegie Mellon University, National University of Singapore | Singapore, USA | academia | |
EfficientNetV2: Smaller Models and Faster Training | 666 | 730 | | USA | industry | |
Is Space-Time Attention All You Need for Video Understanding? | 84 | 729 | Dartmouth College, Meta | USA | academia, industry | |
ViViT: A Video Vision Transformer | 66 | 713 | | USA | industry | |
Diffusion Models Beat GANs on Image Synthesis | 566 | 694 | OpenAI | USA | industry | |
An Empirical Study of Training Self-Supervised Vision Transformers | 76 | 601 | Meta | USA | industry | |
The Power of Scale for Parameter-Efficient Prompt Tuning | 227 | 594 | | USA | industry | |
SwinIR: Image Restoration Using Swin Transformer | 34 | 578 | ETH Zurich, KU Leuven | Belgium, Switzerland | academia | |
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity | 120 | 576 | | USA | industry | |
Protein complex prediction with AlphaFold-Multimer | | 561 | DeepMind | UK | industry | |
Bottleneck Transformers for Visual Recognition | 46 | 542 | Google, UC Berkeley | USA | industry | |
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units | 43 | 534 | | | | |
Alias-Free Generative Adversarial Networks | 77 | 520 | Aalto University, NVIDIA | Finland, USA | industry | |
Towards Causal Representation Learning | 117 | 504 | CIFAR, ETH Zurich, Google, Max Planck Institute for Intelligent Systems, Mila, University of Montreal | Canada, Germany, Switzerland, USA | academia | |
Vision Transformers for Dense Prediction | 360 | 486 | Intel | USA | industry | |
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges | | 480 | DeepMind, Imperial College London, New York University, Qualcomm, Twitter | UK, USA | industry | |
High-Resolution Image Synthesis with Latent Diffusion Models | 210 | 480 | Ludwig Maximilian University of Munich, Runway, University of Heidelberg | Germany, USA | academia | |
Segmenter: Transformer for Semantic Segmentation | 59 | 468 | INRIA | France | academia | |
RepVGG: Making VGG-style ConvNets Great Again | | 467 | Aberystwyth University, Hong Kong University of Science and Technology, Megvii, Tsinghua University | China, UK | industry | |
Multiscale Vision Transformers | 99 | 452 | Meta, UC Berkeley | USA | industry | |
CoAtNet: Marrying Convolution and Attention for All Data Sizes | | 442 | | USA | industry | |
CrossViT: Cross-Attention Multi-Scale Vision Transformer for Image Classification | 34 | 435 | IBM, MIT | USA | academia, industry | |
ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision | | 435 | Kakao Brain, Kakao Enterprise, NAVER | South Korea | industry | |
Video Swin Transformer | 32 | 415 | Huazhong University of Science and Technology, Microsoft, Tsinghua University, University of Science and Technology of China | China, USA | industry | |
End-to-End Video Instance Segmentation With Transformers | | 411 | Meituan, University of Adelaide | Australia, China | industry | |
StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery | 602 | 401 | Adobe, Hebrew University of Jerusalem, Tel Aviv University | Israel, USA | academia | |
Evaluating Large Language Models Trained on Code | 934 | 400 | Anthropic, OpenAI, Zipline | USA | industry | |
Improved Denoising Diffusion Probabilistic Models | 50 | 397 | OpenAI | USA | industry | |
VinVL: Revisiting Visual Representations in Vision-Language Models | 3 | 373 | Microsoft, University of Washington | USA | industry | |
ABCDM: An Attention-based Bidirectional CNN-RNN Deep Model for sentiment analysis | | 361 | Deakin University, Nanyang Technological University, Ngee Ann Polytechnic, University of Shahrekord | Australia, Iran, Singapore | academia | |
Out-of-Distribution Generalization via Risk Extrapolation (REx) | | 354 | McGill University, Meta, Mila, University of Montreal, University of Toronto, Vector | Canada, USA | academia | |
UNETR: Transformers for 3D Medical Image Segmentation | 55 | 351 | NVIDIA, Vanderbilt University | USA | industry | |
ConViT: Improving Vision Transformers with Soft Convolutional Inductive Biases | | 344 | Meta, École normale supérieure | France, USA | industry | |
Align before Fuse: Vision and Language Representation Learning with Momentum Distillation | 30 | 337 | Salesforce | USA | industry | |
GLIDE: Towards Photorealistic Image Generation and Editing with Text-Guided Diffusion Models | 1614 | 333 | OpenAI | USA | industry | |
Perceiver: General Perception with Iterative Attention | | 329 | OpenAI | USA | industry | |
Scaling Vision Transformers | 237 | 324 | | USA | industry | |
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning | 241 | 314 | INRIA, Meta, New York University | France, USA | academia, industry | |
Machine learning accelerated computational fluid dynamics | 19 | 312 | Google, Harvard University | USA | industry | |
“Everyone wants to do the model work, not the data work”: Data Cascades in High-Stakes AI | | 310 | | USA | industry | |
Per-Pixel Classification is Not All You Need for Semantic Segmentation | 69 | 309 | Meta, University of Illinois Urbana-Champaign | USA | industry | |
Finetuned Language Models Are Zero-Shot Learners | 402 | 307 | | USA | industry | |
Multitask Prompted Training Enables Zero-Shot Task Generalization | 640 | 300 | ASUS, BigScience Team, Birla Institute of Technology and Science, Pilani, Booz Allen Hamilton, Brown University, Charles River Analytics, EleutherAI, Hugging Face, Hyperscience, IBM, IMATAG, INRIA, IRISA, Institute for Infocomm Research, King Fahd University of Petroleum and Minerals, NAVER, Nanyang Technological University, New York University, Parity, SAP, SambaNova Systems, Snorkel AI, Stanford University, UC Berkeley, UC San Diego, University of Rome, University of Virginia, VU Amsterdam, Walmart, ZEALS | France, Germany, India, Italy, Japan, Netherlands, Saudi Arabia, Singapore, South Korea, Taiwan, UK, USA | industry | |
TransGAN: Two Pure Transformers Can Make One Strong GAN, and That Can Scale Up | 77 | 297 | IBM, MIT, UC Santa Barbara, University of Texas at Austin | USA | academia | |
Scene Text Detection and Recognition: The Deep Learning Era. | | 294 | ByteDance, Carnegie Mellon University, Megvii | China, USA | industry | |
PlenOctrees for Real-time Rendering of Neural Radiance Fields | 71 | 278 | UC Berkeley, University of Southern California | USA | academia | |
High-Performance Large-Scale Image Recognition Without Normalization | 179 | 275 | DeepMind | UK | industry | |
Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields | 33 | 268 | Google, UC Berkeley | USA | industry | |
GPT Understands, Too | 87 | 264 | Beijing Academy of Artificial Intelligence, MIT, Recurrent AI, Tsinghua University | China, USA | academia | |
Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling | | 260 | Microsoft, University of North Carolina at Chapel Hill | USA | industry | |
SimMIM: A Simple Framework for Masked Image Modeling | 76 | 257 | Microsoft, Tsinghua University, Xi’an Jiaotong University | China, USA | industry | |
VATT: Transformers for Multimodal Self-Supervised Learning from Raw Video, Audio and Text | 285 | 255 | Columbia University, Cornell University, Google | USA | industry | |
Restormer: Efficient Transformer for High-Resolution Image Restoration | 52 | 247 | Google, Inception Institute of AI, Mohamed bin Zayed University of Artificial Intelligence, Monash University, UC Merced, Yonsei University | Australia, South Korea, UAE, USA | academia, industry | |
Understanding adversarial attacks on deep learning based medical image analysis systems. | 1 | 246 | Beihang University, Chinese Academy of Sciences, National Institute of Informatics, Shanghai Jiao Tong University, University of Melbourne | Australia, China, Japan | academia | |
FairNAS: Rethinking Evaluation Fairness of Weight Sharing Neural Architecture Search | 1 | 245 | Xiaomi | China | industry | |
Calibrate Before Use: Improving Few-Shot Performance of Language Models | 90 | 243 | UC Berkeley, UC Irvine, University of Maryland | USA | academia | |
Semi-Supervised Semantic Segmentation with Cross Pseudo Supervision | 1 | 242 | Microsoft, Peking University | China, USA | industry | |
IBRNet: Learning Multi-View Image-Based Rendering | 8 | 241 | Cornell University, Google, Princeton University | USA | academia, industry | |
E(n) Equivariant Graph Neural Networks | 60 | 238 | Bosch, University of Amsterdam | Germany, Netherlands | academia, industry | |
LoFTR: Detector-Free Local Feature Matching with Transformers | 95 | 238 | SenseTime, Zhejiang University | China | academia | |
Plant leaf disease classification using EfficientNet deep learning model | | 237 | Iskenderun Technical University, Karabuk University, Kastamonu University | Turkey | academia | |
How Attentive are Graph Attention Networks? | 56 | 234 | Carnegie Mellon University, Technion | Israel, USA | academia | |
Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking | 15 | 234 | Hefei Comprehensive National Science Center, University of Science and Technology of China | China | academia | |
MDETR - Modulated Detection for End-to-End Multi-Modal Understanding | | 233 | Meta, New York University | USA | academia | |
Learning to Prompt for Vision-Language Models | 61 | 231 | Nanyang Technological University | Singapore | academia | |
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision | | 231 | Carnegie Mellon University, Google, University of Washington | USA | industry | |
Scaling Language Models: Methods, Analysis & Insights from Training Gopher | | 229 | DeepMind | UK | industry | |
How to Train Your Robot with Deep Reinforcement Learning; Lessons We've Learned | 42 | 220 | Google, UC Berkeley, X, The Moonshot Factory | USA | academia, industry | |
Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts | 8 | 217 | | USA | industry | |
Model-Contrastive Federated Learning | 1 | 209 | National University of Singapore, UC Berkeley | Singapore, USA | academia | |
SpeechBrain: A General-Purpose Speech Toolkit | 117 | 208 | Aalto University, Academia Sinica, Avignon Université, HEC Montreal, Indian Institute of Technology Madras, Marche Polytechnic University, McGill University, Mila, NVIDIA, Ohio State University, Samsung, Toulouse Institute of Computer Science Research, Toyota Technological Institute at Chicago, University of Cambridge, University of Edinburgh, University of Montreal, University of Sherbrooke | Canada, Finland, France, India, Italy, South Korea, Taiwan, UK, USA | academia | |
MagFace: A Universal Representation for Face Recognition and Quality Assessment | 10 | 206 | Aibee | China | industry | |
Offline Reinforcement Learning as One Big Sequence Modeling Problem | 110 | 200 | UC Berkeley | USA | academia | |
Unified Pre-training for Program Understanding and Generation | 16 | 200 | Columbia University, UC Los Angeles | USA | academia | |
Image Super-Resolution via Iterative Refinement | 401 | 198 | | USA | industry | |
FastNeRF: High-Fidelity Neural Rendering at 200FPS | 164 | 194 | Microsoft | USA | industry | |
BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models | 95 | 194 | Technical University of Darmstadt | Germany | academia | |
Measurement and Fairness. | 26 | 191 | Microsoft, University of Michigan | USA | industry | |