amitness/ Secret

Created Jan 17, 2021
title url
(Locally) Differentially Private Combinatorial Semi-Bandits
(Re)construing Meaning in NLP
2kenize: Tying Subword Sequences for Chinese Script Conversion
3D-LaneNet+: Anchor Free Lane Detection using a Semi-Local Representation
A Batch Normalized Inference Network Keeps the KL Vanishing Away
A Benchmark of Medical Out of Distribution Detection
A Bilingual Generative Transformer for Semantic Sentence Embedding
A Boolean Task Algebra for Reinforcement Learning
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
A Call for More Rigor in Unsupervised Cross-lingual Learning
A Characterization of Mean Squared Error for Estimator with Bagging
A Closer Look at Accuracy vs. Robustness
A Closer Look at Small-loss Bounds for Bandits with Graph Feedback
A Co-Matching Model for Multi-choice Reading Comprehension
A Computational Approach to Understanding Empathy Expressed in Text-Based Mental Health Support
A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking
A Continuous-time Perspective for Modeling Acceleration in Riemannian Optimization
A Convolutional Encoder Model for Neural Machine Translation
A Corpus for Large-Scale Phonetic Typology
A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature
A Cross-Task Analysis of Text Span Representations
A Crowdsourced Frame Disambiguation Corpus with Ambiguity
A Data and Compute Efficient Design for Limited-Resources Deep Learning
A Data-driven Approach for Noise Reduction in Distantly Supervised Biomedical Relation Extraction
A Decomposable Attention Model for Natural Language Inference
A Deep Generative Model for Fragment-Based Molecule Generation
A Deep Generative Model of Vowel Formant Typology
A Deep Learning Approach for Determining Effects of Tuta Absoluta in Tomato Plants
A Deep Learning System for Sentiment Analysis of Service Calls
A Deep Neural Network Sentence Level Classification Method with Context Information
A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards
A Diagnostic Study of Explainability Techniques for Text Classification
A Differentiable Newton Euler Algorithm for Multi-body Model Learning
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
A Distributional Framework for Data Valuation
A Distributional View on Multi-Objective Policy Optimization
A Double Residual Compression Algorithm for Efficient Distributed Learning
A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates
A Formal Hierarchy of RNN Architectures
A Fourier State Space Model for Bayesian ODE Filters
A Framework and Dataset for Abstract Art Generation via CalligraphyGAN
A Framework for Sample Efficient Interval Estimation with Control Variates
A Free-Energy Principle for Representation Learning
A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing
A General Framework for Information Extraction using Dynamic Span Graphs
A Generative Approach to Titling and Clustering Wikipedia Sections
A Generative Model for Joint Natural Language Understanding and Generation
A Generative Model for Molecular Distance Geometry
A Generative Parser with a Discriminative Recognition Algorithm
A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton
A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples
A Girl Has A Name: Detecting Authorship Obfuscation
A Graph to Graphs Framework for Retrosynthesis Prediction
A Hierarchical Latent Structure for Variational Conversation Modeling
A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities
A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer
A Hierarchical Transformer for Unsupervised Parsing
A Hybrid Convolutional Variational Autoencoder for Text Generation
A Hybrid Stochastic Policy Gradient Algorithm for Reinforcement Learning
A Joint Named-Entity Recognizer for Heterogeneous Tag-sets Using a Tag Hierarchy
A Just and Comprehensive Strategy for Using NLP to Address Online Abuse
A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation
A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors
A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal
A Locally Adaptive Bayesian Cubature Method
A Meaning-based Statistical English Math Word Problem Solver
A Mention-Ranking Model for Abstract Anaphora Resolution
A Meta-Learning Approach for Graph Representation Learning in Multi-Task Settings
A Methodology for Creating Question Answering Corpora Using Inverse Data Annotation
A Minimal Span-Based Neural Constituency Parser
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
A Multi-Axis Annotation Scheme for Event Temporal Relations
A Multi-Perspective Architecture for Semantic Code Search
A Multi-Task Incremental Learning Framework with Category Name Embedding for Aspect-Category Sentiment Analysis
A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews
A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification
A Multiclass Classification Approach to Label Ranking
A Multilingual Neural Machine Translation Model for Biomedical Data
A Multitask Learning Approach for Diacritic Restoration
A Narration-based Reward Shaping Approach using Grounded Natural Language Commands
A Nested Attention Neural Hybrid Model for Grammatical Error Correction
A Neural Attention Model for Abstractive Sentence Summarization
A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings
A Neural Model for User Geolocation and Lexical Dialectology
A Neural Model of Adaptation in Reading
A Neural Network for Coordination Boundary Prediction
A Neuro-AI Interface for Evaluating Generative Adversarial Networks
A New Neural Network Architecture Invariant to the Action of Symmetry Subgroups
A Nonparametric Off-Policy Policy Gradient
A Note on Data Biases in Generative Models
A Note on Over-Smoothing for Graph Neural Networks
A Novel Cascade Binary Tagging Framework for Relational Triple Extraction
A Novel Confidence-Based Algorithm for Structured Bandits
A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation
A Pairwise Fair and Community-preserving Approach to k-Center Clustering
A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players
A Principled Approach to Learning Stochastic Representations for Privacy in Deep Neural Inference
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning
A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing
A Probabilistic Generative Model of Linguistic Typology
A Probabilistic Model with Commonsense Constraints for Pattern-based Temporal Fact Extraction
A Re-evaluation of Knowledge Graph Completion Methods
A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks
A Reduction from Reinforcement Learning to No-Regret Online Learning
A Reinforced Generation of Adversarial Examples for Neural Machine Translation
A Relational Memory-based Embedding Model for Triple Classification and Search Personalization
A Relaxed Matching Procedure for Unsupervised BLI
A Report on the 2020 Sarcasm Detection Shared Task
A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity
A Rigorous Study on Named Entity Recognition: Can Fine-tuning Pretrained Model Lead to the Promised Land?
A Sample Complexity Separation between Non-Convex and Convex Meta-Learning
A Scalable Neural Shortlisting-Reranking Approach for Large-Scale Domain Classification in Natural Language Understanding
A Self-Training Method for Machine Reading Comprehension with Soft Evidence Extraction
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition
A Simple Approach to Learning Unsupervised Multilingual Embeddings
A Simple Joint Model for Improved Contextual Neural Lemmatization
A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings
A Simple Theoretical Model of Importance for Summarization
A Simple Yet Strong Pipeline for HotpotQA
A Simple and Effective Model for Answering Multi-span Questions
A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation
A Span-based Linearization for Constituent Trees
A Stein Goodness-of-fit Test for Directional Distributions
A Stochastic Decoder for Neural Machine Translation
A Streaming Approach For Efficient Batched Beam Search
A Study of Deep Learning Colon Cancer Detection in Limited Data Access Scenarios
A Study of Reinforcement Learning for Neural Machine Translation
A Study on Encodings for Neural Architecture Search
A Stylometric Inquiry into Hyperpartisan and Fake News
A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT
A Survey on Recognizing Textual Entailment as an NLP Evaluation
A Syntactic Neural Model for General-Purpose Code Generation
A System for Worldwide COVID-19 Information Aggregation
A Systematic Assessment of Syntactic Generalization in Neural Language Models
A Tale of a Probe and a Parser
A Theoretical Case Study of Structured Variational Inference for Community Detection
A Top-Down Neural Architecture towards Text-Level Parsing of Discourse Rhetorical Structure
A Topology Layer for Machine Learning
A Trainable Optimal Transport Embedding for Feature Aggregation
A Transformer-based Approach for Source Code Summarization
A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis
A Transition-Based Directed Acyclic Graph Parser for UCCA
A Two-Stage Masked LM Method for Term Set Expansion
A Unified Linear-Time Framework for Sentence-Level Discourse Parsing
A Unified MRC Framework for Named Entity Recognition
A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss
A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments
A Unified Theory of Decentralized SGD with Changing Topology and Local Updates
A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent
A Unified View of Label Shift Estimation
A Visual Attention Grounding Neural Model for Multimodal Machine Translation
A Wasserstein Minimum Velocity Approach to Learning Unnormalized Models
A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification
A greedy anytime algorithm for sparse PCA
A large annotated corpus for learning natural language inference
A negative case analysis of visual grounding methods for VQA
A neurally plausible model learns successor representations in partially observable environments
A new regret analysis for Adam-type algorithms
A nonasymptotic law of iterated logarithm for general M-estimators
A principled approach for generating adversarial images under non-smooth dissimilarity metrics
A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings
A single image deep learning approach to restoration of corrupted remote sensing products
A strong baseline for question relevancy ranking
AD3: Attentive Deep Document Dater
ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents
AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network
ALICE: Active Learning with Contrastive Natural Language Explanations
AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC
AMR Dependency Parsing with a Typed Semantic Algebra
AMR Parsing as Sequence-to-Graph Transduction
AMR Parsing via Graph-Sequence Iterative Inference
AMR-to-text Generation with Synchronous Node Replacement Grammar
AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning
AR-DAE: Towards Unbiased Neural Entropy Gradient Estimation
ASAP: Architecture Search, Anneal and Prune
ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations
Abstract Syntax Networks for Code Generation and Semantic Parsing
Abstraction Mechanisms Predict Generalization in Deep Neural Networks
Abstractive Multi-Document Summarization via Phrase Selection and Merging
Abusive Language Detection with Graph Convolutional Networks
Accelerated Message Passing for Entropy-Regularized MAP Inference
Accelerated Primal-Dual Algorithms for Distributed Smooth Convex Optimization over Networks
Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction
Accelerated Stochastic Gradient-free and Projection-free Methods
Accelerating Large-Scale Inference with Anisotropic Vector Quantization
Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment
Accelerating Natural Language Understanding in Task-Oriented Dialog
Accelerating Online Reinforcement Learning with Offline Datasets
Accelerating Reinforcement Learning with Learned Skill Priors
Accurate Word Alignment Induction from Neural Machine Translation
Acrostic Poem Generation
Action and Perception as Divergence Minimization
Active Community Detection with Maximal Expected Model Change
Active Imitation Learning with Noisy Guidance
Active Learning for Coreference Resolution using Discrete Annotation
Active Learning for Identification of Linear Dynamical Systems
Active Learning from Crowd in Document Screening
Active World Model Learning with Progress Curiosity
AdaScale SGD: A User-Friendly Algorithm for Distributed Training
Adapting End-to-End Speech Recognition for Readable Subtitles
Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations
Adaptive Attention Span in Transformers
Adaptive Attentional Network for Few-Shot Knowledge Graph Completion
Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE
Adaptive Document Retrieval for Deep Question Answering
Adaptive Estimator Selection for Off-Policy Evaluation
Adaptive Exploration in Linear Contextual Bandit
Adaptive Gradient Descent without Descent
Adaptive Prediction Timing for Electronic Health Records
Adaptive Region-Based Active Learning
Adaptive Reward-Poisoning Attacks against Reinforcement Learning
Adaptive Risk Minimization: A Meta-Learning Approach for Tackling Group Shift
Adaptive Scaling for Sparse Detection in Information Extraction
Adaptive Transformers for Learning Multimodal Representations
Adding Seemingly Uninformative Labels Helps in Low Data Regimes
Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization
Addressing Ancestry Disparities in Genomic Medicine: A Geographic-aware Algorithm
Addressing Exposure Bias With Document Minimum Risk Training: Cambridge at the WMT20 Biomedical Translation Task
Addressing reward bias in Adversarial Imitation Learning with neutral reward functions
Addressing the Rare Word Problem in Neural Machine Translation
AdvAug: Robust Adversarial Augmentation for Neural Machine Translation
Advancing Renewable Electricity Consumption With Reinforcement Learning
Adversarial Alignment of Multilingual Models for Extracting Temporal Expressions from Text
Adversarial Attack and Defense of Structured Prediction Models
Adversarial Attacks on Probabilistic Autoregressive Forecasting Models
Adversarial Connective-exploiting Networks for Implicit Discourse Relation Classification
Adversarial Contrastive Estimation
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
Adversarial Examples for Evaluating Reading Comprehension Systems
Adversarial Filters of Dataset Biases
Adversarial Learning of Privacy-Preserving Text Representations for De-Identification of Medical Records
Adversarial Multi-Criteria Learning for Chinese Word Segmentation
Adversarial Multi-task Learning for Text Classification
Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling
Adversarial Mutual Information for Text Generation
Adversarial NLI: A New Benchmark for Natural Language Understanding
Adversarial Neural Pruning with Latent Vulnerability Suppression
Adversarial Removal of Demographic Attributes from Text Data
Adversarial Risk via Optimal Transport and Optimal Couplings
Adversarial Robustness Guarantees for Classification with Gaussian Processes
Adversarial Robustness for Code
Adversarial Robustness of Flow-Based Generative Models
Adversarial Self-Supervised Data-Free Distillation for Text Classification
Adversarial Semantic Collisions
Adversarial Training for Commonsense Inference
Adversarial Training for Satire Detection: Controlling for Confounding Variables
Adversarial attacks on Copyright Detection Systems
Adversarial representation learning for private speech generation
Adversarial training for multi-context joint entity and relation extraction
Affect-LM: A Neural Language Model for Customizable Affective Text Generation
Afro-MNIST: Synthetic generation of MNIST-style datasets for low-resource languages
Agent57: Outperforming the Atari Human Benchmark
Aggregation of Multiple Knockoffs
Algorithmic Recourse: from Counterfactual Explanations to Interventions
Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks
Aligned Cross Entropy for Non-Autoregressive Machine Translation
Alignment-based compositional semantics for instruction following
All Fingers are not Equal: Intensity of References in Scientific Articles
All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference
Alleviating Privacy Attacks via Causal Learning
Almost Tune-Free Variance Reduction
Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference
AmbigQA: Answering Ambiguous Open-domain Questions
Amharic Abstractive Text Summarization
Amodal 3D Reconstruction for Robotic Manipulation via Stability and Connectivity
Amortised Learning by Wake-Sleep
Amortized Inference of Variational Bounds for Learning Noisy-OR
Amortized Population Gibbs Samplers with Neural Sufficient Statistics
Amortized learning of neural causal representations
An AMR Aligner Tuned by Transition-based Parser
An Accelerated DFO Algorithm for Finite-sum Convex Functions
An Analysis of Action Recognition Datasets for Language and Vision Tasks
An Analysis of the Utility of Explicit Negative Examples to Improve the Syntactic Abilities of Neural Language Models
An EM Approach to Non-autoregressive Conditional Sequence Generation
An Effective Approach to Unsupervised Machine Translation
An Effective Transition-based Model for Discontinuous NER
An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results
An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models
An Empirical Investigation Towards Efficient Multi-Domain Language Model Pre-training
An Empirical Investigation of Contextualized Number Prediction
An Empirical Investigation of Global and Local Normalization for Recurrent Neural Sequence Models Using a Continuous Relaxation to Beam Search
An Empirical Study of Generation Order for Machine Translation
An Empirical Study of Pre-trained Transformers for Arabic Information Extraction
An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels
An Empirical Study on Model-agnostic Debiasing Strategies for Robust Natural Language Inference
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models
An Experiment on Leveraging SHAP Values to Investigate Racial Bias
An Explicitly Relational Neural Network Architecture
An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks
An Exploratory Study of Argumentative Writing by Young Students: A Transformer-based Approach
An Imitation Game for Learning Semantic Parsers from User Interaction
An Imitation Learning Approach for Cache Replacement
An Imitation Learning Approach to Unsupervised Parsing
An Interpretable Knowledge Transfer Model for Knowledge Base Completion
An Inverse-free Truncated Rayleigh-Ritz Method for Sparse Generalized Eigenvalue Problem
An Investigation of Why Overparameterization Exacerbates Spurious Correlations
An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays
An Unsupervised Joint System for Text Generation from Knowledge Graphs and Semantic Parsing
An Unsupervised Method for Uncovering Morphological Chains
An Unsupervised Probability Model for Speech-to-Translation Alignment of Low-Resource Languages
An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm
An end-to-end approach for the verification problem: learning the right distance
An information theoretic view on selecting linguistic probes
Analogies minus analogy test: measuring regularities in word embeddings
Analogous Process Structure Induction for Sub-event Sequence Prediction
Analogs of Linguistic Structure in Deep Representations
Analysing Lexical Semantic Change with Contextualised Word Representations
Analysis of Automatic Annotation Suggestions for Hard Discourse-Level Tasks in Expert Domains
Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks
Analyzing Individual Neurons in Pre-trained Language Models
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Analyzing Neural Discourse Coherence Models
Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings
Analyzing Political Parody in Social Media
Analyzing Redundancy in Pretrained Transformer Models
Analyzing analytical methods: The case of phonology in neural models of spoken language
Analyzing autoencoder-based acoustic word embeddings
Analyzing the Limitations of Cross-lingual Word Embedding Mappings
Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics
Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge
Anchoring and Agreement in Syntactic Annotations
Anderson Acceleration of Proximal Gradient Methods
Angular Visual Hardness
Answer-based Adversarial Training for Generating Clarification Questions
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task
Approximate Cross-Validation in High Dimensions with Guarantees
Approximate Cross-validation: Guarantees for Model Assessment and Selection
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions
Approximate is Good Enough: Probabilistic Variants of Dimensional and Margin Complexity
Approximating Stacked and Bidirectional Recurrent Architectures with the Delayed Recurrent Neural Network
Approximation Capabilities of Neural ODEs and Invertible Residual Networks
Approximation Guarantees of Local Search Algorithms via Localizability of Set Functions
Approximation Schemes for ReLU Regression
Approximation-Aware Dependency Parsing by Belief Propagation
AraDIC: Arabic Document Classification using Image-Based Character Embeddings and Class-Balanced Loss
Arc-swift: A Novel Transition System for Dependency Parsing
Architecture Agnostic Neural Networks
Are All Good Word Vector Spaces Isomorphic?
Are All Languages Created Equal in Multilingual BERT?
Are BLEU and Meaning Representation in Opposition?
Are Hyperbolic Representations in Graphs Created Equal?
Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition
Are Pretrained Language Models Symbolic Reasoners Over Knowledge?
Are Some Words Worth More than Others?
Are You Convinced? Choosing the More Convincing Evidence with a Siamese Network
Argument Generation with Retrieval, Planning, and Realization
Argument Invention from First Principles
Argument Mining for Understanding Peer Reviews
Argument Mining with Structured SVMs and RNNs
Artemis: A Novel Annotation Methodology for Indicative Single Document Summarization
Artificial Intelligence for Global Health: Learning From a Decade of Digital Transformation in Health Care
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries
Asking without Telling: Exploring Latent Ontologies in Contextual Representations
Aspect Level Sentiment Classification with Deep Memory Network
Assessing Human Translations from French to Bambara for Machine Learning: a Pilot Study
Assessing Phrasal Representation and Composition in Transformers
Assessing Robustness to Noise: Low-Cost Head CT Triage
Assessing racial inequality in COVID-19 testing with Bayesian threshold tests
Assessing the Ability of Self-Attention Networks to Learn Word Order
Assessing the Helpfulness of Learning Materials with Inference-Based Learner-Like Agent
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders
Asymmetric Private Set Intersection with Applications to Contact Tracing and Private Vertical Federated Machine Learning
Asymmetric self-play for automatic goal discovery in robotic manipulation
Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
Asynchronous Gibbs Sampling
Attacking Neural Text Detectors
Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization
Attending the Emotions to Detect Online Abusive Language
Attention Guided Graph Convolutional Networks for Relation Extraction
Attention Is All You Need for Chinese Word Segmentation
Attention Strategies for Multi-Source Sequence-to-Sequence Learning
Attention is Not Only a Weight: Analyzing Transformers with Vector Norms
Attention is not Explanation
Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation
Attention-over-Attention Neural Networks for Reading Comprehension
Attentive Group Equivariant Convolutional Networks
Audio-Visual Understanding of Passenger Intents for In-Cabin Conversational Agents
Augmented Natural Language for Generative Sequence Labeling
Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context
Augmenting Neural Networks with First-order Logic
Augmenting word2vec with latent Dirichlet allocation within a clinical application
Author Commitment and Social Power: Automatic Belief Tagging to Infer the Social Context of Interactions
Auto-Rotating Perceptrons
Auto-Sizing Neural Networks: With Applications to n-gram Language Models
AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes
AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data
AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization
Autoencoding Pixies: Amortised Variational Inference with Graph Convolutions for Functional Distributional Semantics
Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models
Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring
Automatic Detection of Generated Text is Easiest when Humans are Fooled
Automatic Differentiation of Some First-Order Methods in Parametric Optimization
Automatic Estimation of Simultaneous Interpreter Performance
Automatic Event Salience Identification
Automatic Extraction of Rules Governing Morphological Agreement
Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation
Automatic Metric Validation for Grammatical Error Correction
Automatic Reference-Based Evaluation of Pronoun Translation Misses the Point
Automatic Shortcut Removal for Self-Supervised Representation Learning
Automatic semantic segmentation for prediction of tuberculosis using lens-free microscopy images
Automatically Identifying Complaints in Social Media
Automatically Ranked Russian Paraphrase Corpus for Text Generation
Autoregressive Knowledge Distillation through Imitation Learning
Average-case Acceleration Through Spectral Density Estimation
Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA
Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training
AxCell: Automatic Extraction of Results from Machine Learning Papers
BAE: BERT-based Adversarial Examples for Text Classification
BAM! Born-Again Multi-Task Networks for Natural Language Understanding
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
BERT Fine-tuning For Arabic Text Summarization
BERT Knows Punta Cana is not just beautiful, it's gorgeous: Ranking Scalar Adjectives with Contextualised Representations
BERT-ATTACK: Adversarial Attack Against BERT Using BERT
BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance
BERT-XML: Large Scale Automated ICD Coding Using BERT Pretraining
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance
BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding
BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance
BINOCULARS for Efficient, Nonmyopic Sequential Experimental Design
BLEU Neighbors: A Reference-less Approach to Automatic Evaluation
BLEU might be Guilty but References are not Innocent
BLEURT: Learning Robust Metrics for Text Generation
BPE-Dropout: Simple and Effective Subword Regularization
BabyAI++: Towards Grounded-Language Learning beyond Memorization
BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps
Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning
Backpropagating through Structured Argmax using a SPIGOT
Balanced off-policy evaluation in general action spaces
Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning
Balancing Cost and Benefit with Tied-Multi Transformers
Balancing Gaussian vectors in high dimension
Balancing Objectives in Counseling Conversations: Advancing Forwards or Looking Backwards
Balancing Training for Multilingual Neural Machine Translation
Bandit Convex Optimization in Non-stationary Environments
Bandit optimisation of functions in the Matérn kernel RKHS
BanditSum: Extractive Summarization as a Contextual Bandit
Bandits for BMO Functions
Bandits with adversarial scaling
Barking up the right tree: an approach to search over molecule synthesis DAGs
BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders
Batch Stationary Distribution Estimation
Batch-Constrained Distributional Reinforcement Learning for Session-based Recommendation
Batched Multi-armed Bandits Problem
Bayesian Differential Privacy for Machine Learning
Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation
Bayesian Graph Neural Networks with Adaptive Connection Sampling
Bayesian Hierarchical Words Representation Learning
Bayesian Image Classification with Deep Convolutional Gaussian Processes
Bayesian Learning from Sequential Data using Gaussian Processes with Signature Covariances
Bayesian Optimisation over Multiple Continuous and Categorical Inputs
Bayesian Optimization for Iterative Learning
Bayesian Optimization of Text Representations
Bayesian Reinforcement Learning via Deep, Sparse Sampling
Bayesian aggregation improves traditional single image crop classification approaches
Bayesian experimental design using regularized determinantal point processes
Be More with Less: Hypergraph Attention Networks for Inductive Text Classification
BeBold: Exploration Beyond the Boundary of Explored Regions
Before Name-calling: Dynamics and Triggers of Ad Hominem Fallacies in Web Argumentation
Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness
Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
Benchmarking Graph Neural Networks
Benchmarking Multimodal Regex Synthesis with Complex Structures
Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting
Best-First Beam Search
Best-item Learning in Random Utility Models with Subset Choices
Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs
Better Depth-Width Trade-offs for Neural Networks through the lens of Dynamical Systems
Better Document-Level Machine Translation with Bayes' Rule
Better Highlighting: Creating Sub-Sentence Summary Highlights
Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer
Beyond Accuracy: Behavioral Testing of NLP models with CheckList
Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter
Beyond Exponentially Discounted Sum: Automatic Learning of Return Function
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube
Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels
Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation
Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness
Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat
Bi-Level Graph Neural Networks for Drug-Drug Interaction Prediction
Bi-directional Attention with Agreement for Dependency Parsing
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
Bidirectional Attentive Memory Networks for Question Answering over Knowledge Bases
Bidirectional Model-based Policy Optimization
Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences
Bilingual Lexicon Induction through Unsupervised Machine Translation
Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces
Bio-Inspired Hashing for Unsupervised Similarity Search
BioMegatron: Larger Biomedical Domain Language Model
Biomedical Entity Representations with Synonym Marginalization
Biomedical Information Extraction for Disease Gene Prioritization
Bipartite Flat-Graph Network for Nested Named Entity Recognition
Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models
Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection
Black Box Submodular Maximization: Discrete and Continuous Settings
Black Loans Matter: Distributionally Robust Fairness for Fighting Subgroup Discrimination
Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings
Black-box Certification and Learning under Adversarial Perturbations
Black-box Methods for Restoring Monotonicity
Blank Language Models
Bleaching Text: Abstract Features for Cross-lingual Gender Prediction
BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Boosting Entity Linking Performance by Leveraging Unlabeled Documents
Boosting Frank-Wolfe by Chasing Gradients
Boosting for Control of Dynamical Systems
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning
Bootstrapped Q-learning with Context Relevant Observation Pruning to Generalize in Text-based Games
Bootstrapping Generators from Noisy Data
Bootstrapping Named Entity Recognition in E-Commerce with Positive Unlabeled Learning
Bootstrapping Techniques for Polysynthetic Morphological Analysis
Born-Again Tree Ensembles
Bounding, Concentrating, and Truncating: Unifying Privacy Loss Composition for Data Analytics
Bounds in Query Learning
Break It Down: A Question Understanding Benchmark
Breaking NLI Systems with Sentences that Require Simple Lexical Inferences
Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search
Breast Cancer Detection Using Convolutional Neural Networks
Bridging Anaphora Resolution as Question Answering
Bridging Information-Seeking Human Gaze and Machine Reading Comprehension
Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations
Bridging the Gap between Training and Inference for Neural Machine Translation
Bringing Stories Alive: Generating Interactive Fiction Worlds
Budget Learning via Bracketing
Budget-Constrained Bandits over General Cost and Reward Distributions
C-Learning: Horizon-Aware Cumulative Accessibility Estimation
C-Learning: Learning to Achieve Goals via Recursive Classification
CAT-Gen: Improving Robustness in NLP Models via Controlled Adversarial Text Generation
CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods
CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management
CDL: Curriculum Dual Learning for Emotion-Controllable Response Generation
CITE: A Corpus of Image-Text Discourse Relations
CLEVR Parser: A Graph Parser Library for Geometric Learning on Language Grounded Image Scenes
CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog
CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information
CNM: An Interpretable Complex-valued Network for Matching
CNN-based Approach for Cervical Cancer Classification in Whole-Slide Histopathology Images
COD3S: Diverse Generation with Discrete Semantic Signatures
COMET: A Neural Framework for MT Evaluation
COMETA: A Corpus for Medical Entity Linking in the Social Media
COVID-19 Literature Topic-Based Search via Hierarchical NMF
CUNI Systems for the Unsupervised and Very Low Resource Translation Task in WMT20
CURL: Contrastive Unsupervised Representations for Reinforcement Learning
Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data
Calibrated Surrogate Losses for Adversarially Robust Classification
Calibrated Surrogate Maximization of Linear-fractional Utility in Binary Classification
Calibrated Top-1 Uncertainty estimates for classification by score based models
Calibrating Structured Output Predictors for Natural Language Processing
Calibration of Pre-trained Transformers
Calibration, Entropy Rates, and Memory in Language Models
CamemBERT: a Tasty French Language Model
Can Automatic Post-Editing Improve NMT?
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?
Can Neural Machine Translation be Improved with User Feedback?
Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
Carbontracker: Tracking and Predicting the Carbon Footprint of Training Deep Learning Models
Cascaded Mutual Modulation for Visual Reasoning
Catch Me if I Can: Detecting Strategic Behaviour in Peer Assessment
Categorical Metadata Representation for Customized Text Classification
Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech
Causal Bayesian Optimization
Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning
Causal Effect Estimation and Optimal Dose Suggestions in Mobile Health
Causal Feature Discovery through Strategic Modification
Causal Inference of Script Knowledge
Causal Inference using Gaussian Processes with Structured Latent Confounders
Causal Learning by a Robot with Semantic-Episodic Memory in an Aesop's Fable Experiment
Causal Modeling for Fairness in Dynamical Systems
Causal Structure Discovery from Distributions Arising from Mixtures of DAGs
Causal inference in degenerate systems: An impossibility result
Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings
Censored Quantile Regression Forest
Certified Data Removal from Machine Learning Models
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing
Challenges in Emotion Style Transfer: An Exploration with a Lexical Substitution Pipeline
Channel Equilibrium Networks for Learning Deep Representation
Chapter Captor: Text Segmentation in Novels
CharManteau: Character Embedding Models For Portmanteau Creation
Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT
Characterization of Overlap in Observational Studies
Characterizing Distribution Equivalence and Structure Learning for Cyclic and Acyclic Directed Graphs
Characterizing Private Clipped Gradient Descent on Convex Generalized Linear Problems
Characterizing the Latent Space of Molecular Deep Generative Models with Persistent Homology Metrics
CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT
Choice Set Optimization Under Discrete Choice Models of Group Decisions
ChrEn: Cherokee-English Machine Translation for Endangered Language Revitalization
Circuit-Based Intrinsic Methods to Detect Overfitting
ClarQ: A large-scale and diverse dataset for Clarification Question Generation
Classical Structured Prediction Losses for Sequence to Sequence Learning
Classification with Strategically Withheld Data
Classifying Syntactic Errors in Learner Language
Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset
Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning
Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain
Closing the convergence gap of SGD without replacement
Closure Properties for Private Classification and Online Prediction
Clue: Cross-modal Coherence Modeling for Caption Generation
CoDEx: A Comprehensive Knowledge Graph Completion Benchmark
Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling
Coarse-to-Fine Decoding for Neural Semantic Parsing
Code and Named Entity Recognition in StackOverflow
Code-switching patterns can be an effective route to improve performance of downstream NLP applications: A case study of humour, sarcasm and hate speech detection
Cognitive Graph for Multi-Hop Reading Comprehension at Scale
CognitiveCNN: Mimicking Human Cognitive Models to resolve Texture-Shape Bias
Cold-start Active Learning through Self-supervised Language Modeling
Collaborative Machine Learning with Incentive-Aware Model Rewards
Collapsed Amortized Variational Inference for Switching Nonlinear Dynamical Systems
Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation
Colorless green recurrent networks dream hierarchically
Colors in Context: A Pragmatic Neural Model for Grounded Language Understanding
Combating False Negatives in Adversarial Imitation Learning
Combining Pretrained High-Resource Embeddings and Subword Representations for Low-Resource Languages
Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection
Combining Sentiment Lexica with a Multi-View Variational Autoencoder
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Commonsense for Generative Multi-Hop Question Answering Tasks
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks
CompRes: A Dataset for Narrative Structure in News
Compact Personalized Models for Neural Machine Translation
Comparative Analysis of Text Classification Approaches in Electronic Health Records
Comparatives, Quantifiers, Proportions: A Multi-Task Model for the Learning of Quantities from Vision
Comparing recurrent and convolutional neural networks for predicting wave propagation
Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models
Competence-based Curriculum Learning for Neural Machine Translation
Competing Bandits in Matching Markets
Competitive Mirror Descent
Complete Multilingual Neural Machine Translation
Complexity Guarantees for Polyak Steps with Momentum
Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification
Compositional Demographic Word Embeddings
Compositional Questions Do Not Necessitate Multi-hop Reasoning
Compositional Semantic Parsing on Semi-Structured Tables
Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA
Compositionality and Generalization in Emergent Languages
Comprehensive Supersense Disambiguation of English Prepositions and Possessives
Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
Compressive Summarization with Plausibility and Salience Modeling
Computing Tight Differential Privacy Guarantees Using FFT
ConQUR: Mitigating Delusional Bias in Deep Q-learning
ConStance: Modeling Annotation Contexts to Improve Stance Classification
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions
Concept Bottleneck Models
Concise Explanations of Neural Networks using Adversarial Training
Concluding remarks
Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation
Conditional Flow Variational Autoencoders for Structured Sequence Prediction
Conditional Generation and Snapshot Learning in Neural Dialogue Systems
Conditional Importance Sampling for Off-Policy Learning
Conditional Normalizing Flows for Low-Dose Computed Tomography Image Reconstruction
Conditional Set Generation with Transformers
Conditional gradient methods for stochastically constrained convex minimization
Conditioning of Reinforcement Learning Agents and its Policy Regularization Application
Confidence Intervals for Policy Evaluation in Adaptive Experiments
Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting
Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks
ConjNLI: Natural Language Inference Over Conjunctive Sentences
Connecting Embeddings for Knowledge Graph Entity Typing
Conservative Exploration in Reinforcement Learning
Conservative Safety Critics for Exploration
Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling
Consistency by Agreement in Zero-shot Neural Machine Translation
Consistency of a Recurrent Language Model With Respect to Incomplete Decoding
Consistent Estimators for Learning to Defer to an Expert
Consistent Structured Prediction with Max-Min Margin Markov Networks
Consistent Transcription and Translation of Speech
Consistent recovery threshold of hidden nearest neighbor graphs
Constant Curvature Graph Convolutional Networks
Constituent Parsing as Sequence Labeling
Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue
Constrained Markov Decision Processes via Backward Value Functions
Constrained Neural Ordinary Differential Equations with Stability Guarantees
Constructing a provably adversarially-robust classifier from a high accuracy one
Constructive Universal High-Dimensional Distribution Generation through Deep ReLU Networks
Content Planning for Neural Story Generation with Aristotelian Rescoring
Content Selection in Deep Learning Models of Summarization
Context Gates for Neural Machine Translation
Context Mover's Distance & Barycenters: Optimal Transport of Contexts for Building Representations
Context-Aware Answer Extraction in Question Answering
Context-Aware Local Differential Privacy
Context-Aware Neural Machine Translation Learns Anaphora Resolution
Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning
Contextual Constrained Learning for Dose-Finding Clinical Trials
Contextual Embeddings: When Are They Worth It?
Contextual Memory Trees
Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns
Contextual Online False Discovery Rate Control
Contextualization of Morphological Inflection
Contextualized Sparse Representations for Real-Time Open-Domain Question Answering
Contextualizing Hate Speech Classifiers with Post-hoc Explanation
Continual Learning from the Perspective of Compression
Continual Model-Based Reinforcement Learning with Hypernetworks
Continual adaptation for efficient machine communication
Continual and Multi-Task Architecture Search
Continual learning with direction-constrained optimization
Continuous Graph Flow
Continuous Graph Neural Networks
Continuous Online Learning and New Insights to Online Imitation Learning
Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks
Continuous-time Lower Bounds for Gradient-based Algorithms
Continuously Indexed Domain Adaptation
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
Contrastive Graph Neural Network Explanation
Contrastive Multi-View Representation Learning on Graphs
Contrastive Self-Supervised Learning for Commonsense Reasoning
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning
Controlled Crowdsourcing for High-Quality QA-SRL Annotation
Controlling Output Length in Neural Encoder-Decoders
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems
Convergence Analysis of Block Coordinate Algorithms with Determinantal Sampling
Convergence Rates of Smooth Message Passing with Rounding in Entropy-Regularized MAP Inference
Convergence Rates of Variational Inference in Sparse Deep Learning
Conversation Modeling on Reddit using a Graph-Structured LSTM
Conversational Document Prediction to Assist Customer Care Agents
Conversational Semantic Parsing
Conversational Semantic Parsing for Dialog State Tracking
Conversational Word Embedding for Retrieval-Based Dialog System
Conversations Gone Awry: Detecting Early Signs of Conversational Failure
Convex Calibrated Surrogates for the Multi-Label F-Measure
Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space
Convolutional Kernel Networks for Graph-Structured Data
Convolutional Neural Networks with Recurrent Neural Filters
Convolutional dictionary learning based auto-encoders for natural exponential-family distributions
Cooperative Learning of Disjoint Syntax and Semantics
Cooperative Multi-Agent Bandits with Heavy Tails
Coordination without communication: optimal regret in two players multi-armed bandits
Coreferential Reasoning Learning for Language Representation
Coresets for Clustering in Graphs of Bounded Treewidth
Coresets for Data-efficient Training of Machine Learning Models
Correlating neural and symbolic representations of language
Corruption-Tolerant Gaussian Process Bandit Optimization
Counterfactual Cross-Validation: Stable Model Selection Procedure for Causal Inference Models
Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology
Counterfactual Data Augmentation using Locally Factored Dynamics
Countering Language Drift with Seeded Iterated Learning
Countering hate on social media: Large scale classification of hate and counter speech
Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation
Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing
Course Concept Expansion in MOOCs with External Knowledge and Interactive Game
Creating Causal Embeddings for Question Answering with Minimal Supervision
Cross Copy Network for Dialogue Generation
Cross-Domain Generalization of Neural Constituency Parsers
Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing
Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus
Cross-Lingual Syntactic Transfer with Limited Resources
Cross-Lingual Training for Automatic Question Generation
Cross-Linguistic Syntactic Evaluation of Word Prediction Models
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings
Cross-Modal Data Programming Enables Rapid Medical Machine Learning
Cross-Modality Relevance for Reasoning on Language and Vision
Cross-Sentence N-ary Relation Extraction with Graph LSTMs
Cross-Target Stance Classification with Self-Attention Networks
Cross-Thought for Sentence Encoder Pre-training
Cross-lingual Abstract Meaning Representation Parsing
Cross-lingual Spoken Language Understanding with Regularized Representation Alignment
Cross-lingual Visual Verb Sense Disambiguation
Cross-media Structured Common Space for Multimedia Event Extraction
Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage
Cross-topic distributional semantic representations via unsupervised mappings
CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
Crossing Variational Autoencoders for Answer Retrieval
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation
Cumulo: A Dataset for Learning Cloud Classes
Curriculum Pre-training for End-to-End Speech Translation
Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness
Cycles in Causal Learning
D2RL: Deep Dense Architectures in Reinforcement Learning
DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning
DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks
DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate
DERAIL: Diagnostic Environments for Reward And Imitation Learning
DGST: a Dual-Generator Network for Text Style Transfer
DLGNet: A Transformer-based Model for Dialogue Response Generation
DOC: Deep Open Classification of Text Documents
DORB: Dynamically Optimizing Multiple Rewards with Bandits
DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference
DRS at MRP 2020: Dressing up Discourse Representation Structures as Graphs
DRTS Parsing with Structure-Aware Encoding and Decoding
DRWR: A Differentiable Renderer without Rendering for Unsupervised 3D Structure Learning from Silhouette Images
DTCA: Decision Tree-based Co-Attention Networks for Explainable Claim Verification
DYSAN: Dynamically sanitizing motion sensor data against sensitive inferences through adversarial networks
DagoBERT: Generating Derivational Morphology with a Pretrained Language Model
Data Amplification: Instance-Optimal Property Estimation
Data Appraisal Without Data Sharing
Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors
Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation
Data Generation for Neural Programming by Example
Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight
Data Rejuvenation: Exploiting Inactive Training Examples for Neural Machine Translation
Data Valuation using Reinforcement Learning
Data Weighted Training Strategies for Grammatical Error Correction
Data and Representation for Turkish Natural Language Inference
Data preprocessing to mitigate bias: A maximum entropy based approach
Data-Dependent Differentially Private Parameter Learning for Directed Graphical Models
Data-Efficient Image Recognition with Contrastive Predictive Coding
Data-driven confidence bands for distributed nonparametric regression
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
DeBayes: a Bayesian method for debiasing network embeddings
DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning
DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering
DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking
Debiased Sinkhorn barycenters
Debiasing Evaluations That are Biased by Evaluations
Decentralised Learning with Random Features and Distributed Gradient Descent
Decentralized Multi-player Multi-armed Bandits with No Collision Information
Decentralized gradient methods: does topology matter?
Decision Trees for Decision-Making under the Predict-then-Optimize Framework
Decomposable Neural Paraphrase Generation
Deconstructing word embedding algorithms
Decoupled Greedy Learning of CNNs
Decoupling Strategy and Generation in Negotiation Dialogues
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
Deep Active Learning: Unified and Principled Method for Query and Training
Deep Bayesian Quadrature Policy Optimization
Deep Claim: Payer Response Prediction from Claims Data with Deep Learning
Deep Context-Aware Novelty Detection
Deep Contextualized Self-training for Low Resource Dependency Parsing
Deep Coordination Graphs
Deep Divergence Learning
Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning
Deep Gaussian Markov Random Fields
Deep Generative Model for Joint Alignment and Word Representation
Deep Graph Contrastive Representation Learning
Deep Hierarchical Classification for Category Prediction in E-commerce System
Deep Isometric Learning for Visual Recognition
Deep Keyphrase Generation
Deep Molecular Programming: A Natural Implementation of Binary-Weight ReLU Neural Networks
Deep Networks and the Multiple Manifold Problem
Deep Neural Machine Translation with Linear Associative Unit
Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation
Deep Reinforcement Learning amidst Lifelong Non-Stationarity
Deep Reinforcement Learning for Dialogue Generation
Deep Reinforcement Learning for Mention-Ranking Coreference Models
Deep Relevance Ranking Using Enhanced Document-Query Interactions
Deep Ritz revisited
Deep Structured Mixtures of Gaussian Processes
Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time
Deep contextualized word representations
Deep k-NN for Noisy Labels
Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme
DeepCoDA: personalized interpretability for compositional health data
DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training
DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning
DeepSeqSLAM: A Trainable CNN+RNN for Joint Global Description and Sequence-based Place Recognition
Defense Through Diverse Directions
Defining Benchmarks for Continual Few-Shot Learning
Defining and Evaluating Fair Natural Language Generation
Defoiling Foiled Image Captions
Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer
DeltaGrad: Rapid retraining of machine learning models
Demand-Weighted Completeness Prediction for a Knowledge Base
Demographic Dialectal Variation in Social Media: A Case Study of African-American English
Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting
Demoting Racial Bias in Hate Speech Detection
Denoising Relation Extraction from Document-level Distant Supervision
Dense Passage Retrieval for Open-Domain Question Answering
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA
Densely Connected Graph Convolutional Networks for Graph-to-Sequence Learning
Density Deconvolution with Normalizing Flows
Density Matching for Bilingual Word Embedding
Deontological Ethics By Monotonicity Shape Constraints
Dependency-based Hybrid Trees for Semantic Parsing
Dependent randomized rounding for clustering and partition systems with knapsack constraints
Depth Completion via Deep Basis Fitting
Depth Uncertainty in Neural Networks
DepthNet Nano: A Highly Compact Self-Normalizing Neural Network for Monocular Depth Estimation
Deriving Machine Attention from Human Rationales
Description Based Text Classification with Reinforcement Learning
Design Challenges in Low-resource Cross-lingual Entity Linking
Designing Differentially Private Estimators in High Dimensions
Designing Precise and Robust Dialogue Response Evaluators
Detecting Attackable Sentences in Arguments
Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News
Detecting East Asian Prejudice on Social Media
Detecting Egregious Conversations between Customers and Virtual Agents
Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank
Detecting Gang-Involved Escalation on Social Media Using Context
Detecting Perceived Emotions in Hurricane Disasters
Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks
Detecting dementia in Mandarin Chinese using transfer learning from a parallel corpus
Determining Semantic Textual Similarity using Natural Deduction Proofs
Deterministic Decoding for Discrete Data in Variational Autoencoders
Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement
Dexterous Robotic Grasping with Object-Centric Visual Affordances
Dialogue Coherence Assessment Without Explicit Dialogue Act Labels
Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired Data
Dialogue Response Ranking Training with Large-Scale Human Feedback Data
Diameter-based Interactive Structure Discovery
Dice Loss for Data-imbalanced NLP Tasks
Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text-to-SQL
Did the Model Understand the Question?
Differentiable Causal Backdoor Discovery
Differentiable Graph Module (DGM) for Graph Convolutional Networks
Differentiable Likelihoods for Fast Inversion of 'Likelihood-Free' Dynamical Systems
Differentiable Sampling with Flexible Reference Word Order for Neural Machine Translation
Differentiable Window for Dynamic Local Attention
Differential Evolution for Neural Architecture Search
Differentially Private Language Models Benefit from Public Pre-training
Differentially Private Set Union
Differentially Private Stochastic Coordinate Descent
Differentially private cross-silo federated learning
Differentiating through the Fréchet Mean
Digital Voicing of Silent Speech
Dilated Convolutional Attention Network for Medical Code Assignment from Clinical Text
Diptychs of human and machine perceptions
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
Discern: Discourse-Aware Entailment Reasoning Network for Conversational Machine Reading
DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion
Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle
Discontinuous Constituent Parsing as Sequence Labeling
Discount Factor as a Regularizer in Reinforcement Learning
Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference
Discourse structure interacts with reference but not syntax in neural language models
Discourse-Aware Neural Extractive Text Summarization
Discovering and interpreting transcriptomic drivers of imaging traits using neural networks
Discrete Action On-Policy Learning with Action-Value Critic
Discrete Latent Variable Representations for Low-Resource Text Classification
Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction
Discriminative Adversarial Search for Abstractive Summarization
Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions
Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference
Discriminative Neural Sentence Modeling by Tree-Based Convolution
Discriminatively-Tuned Generative Classifiers for Robust Natural Language Inference
Disentangle-based Continual Graph Representation Learning
Disentangled Planning and Control in Vision Based Robotics via Reward Machines
Disentangling Language and Knowledge in Task-Oriented Dialogs
Disentangling Trainability and Generalization in Deep Neural Networks
Dispersed Exponential Family Mixture VAEs for Interpretable Text Generation
Dissecting Lottery Ticket Transformers: Structural and Behavioral Study of Sparse Neural Machine Translation
Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation
Dissecting Span Identification Tasks with Performance Prediction
Dissipative SymODEN: Encoding Hamiltonian Dynamics with Dissipation and Control into Deep Learning
Distant Supervision and Noisy Label Learning for Low Resource Named Entity Recognition: A Study on Hausa and Yorùbá
Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging
Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation
Distilling Knowledge Learned in BERT for Text Generation
Distilling Knowledge for Search-based Structured Prediction
Distilling Neural Networks for Greener and Faster Dependency Parsing
Distinguish Confusing Law Articles for Legal Judgment Prediction
Distributed Differentially Private Averaging with Improved Utility and Robustness to Malicious Parties
Distributed Learning: Sequential Decision Making in Resource-Constrained Environments
Distributed, partially collapsed MCMC for Bayesian Nonparametrics
Distributionally Robust Bayesian Optimization
Distributionally Robust Bayesian Quadrature Optimization
Distributionally Robust Formulation and Model Selection for the Graphical Lasso
Diverse Exploration via InfoMax Options
Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation
Diversifying Dialogue Generation with Non-Conversational Text
Diversifying Reply Suggestions using a Matching-Conditional Variational Autoencoder
Diversity driven Attention Model for Query-based Abstractive Summarization
Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support
Diving Deep into Context-Aware Neural Machine Translation
Do Explicit Alignments Robustly Improve Multilingual Encoders?
Do Multi-Sense Embeddings Improve Natural Language Understanding?
Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study
Do Neural Language Models Show Preferences for Syntactic Formalisms?
Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language?
Do Neural Network Cross-Modal Mappings Really Bridge Modalities?
Do RNN and LSTM have Long Memory?
Do We Need Zero Training Loss After Achieving Zero Training Error?
Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation
Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities
Do latent tree learning models identify meaningful structure in sentences?
Do sequence-to-sequence VAEs learn global features of sentences?
Document Context Neural Machine Translation with Memory Networks
Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension
Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding
Document-aligned Japanese-English Conversation Parallel Corpus
Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation
Does label smoothing mitigate label noise?
Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
Does the Objective Matter? Comparing Training Objectives for Pronoun Resolution
Domain Adaptation with Adversarial Training and Graph Embeddings
Domain Adaptive Dialog Generation via Meta Learning
Domain Adaptive Imitation Learning
Domain Adaptive Inference for Neural Machine Translation
Domain Aggregation Networks for Multi-Source Domain Adaptation
Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction
Domain Knowledge Integration By Gradient Matching For Sample-Efficient Reinforcement Learning
Domain-Liftability of Relational Marginal Polytopes
Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents
Don't Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation
Don't Read Too Much into It: Adaptive Computation for Open-Domain Question Answering
Don't Use English Dev: On the Zero-Shot Cross-Lingual Evaluation of Contextual Embeddings
Double Graph Based Reasoning for Document-level Relation Extraction
Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation
Double-Loop Unadjusted Langevin Algorithm
Doubly Sparse Variational Gaussian Processes
Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables
Doubly robust off-policy evaluation with shrinkage
Dream and Search to Control: Latent Space Planning for Continuous Control
Driving Behavior Explanation with Multi-level Fusion
Dual Mirror Descent for Online Allocation Problems
DualTKB: A Dual Learning Bridge between Text and Knowledge Base
DyERNIE: Dynamic Evolution of Riemannian Manifold Embeddings for Temporal Knowledge Graph Completion
Dyna-AIL: Adversarial Imitation Learning by Planning
Dynamic Anticipation and Completion for Multi-Hop Reasoning over Sparse Knowledge Graph
Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning
Dynamic Data Selection and Weighting for Iterative Back-Translation
Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
Dynamic Memory Induction Networks for Few-Shot Text Classification
Dynamic Oracles for Top-Down and In-Order Shift-Reduce Constituent Parsing
Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
Dynamic Regions Graph Neural Networks for Spatio-Temporal Reasoning
Dynamical systems theory for causal inference with application to synthetic control methods
Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change
ELI5: Long Form Question Answering
ELITR Non-Native Speech Translation at IWSLT 2020
EM Converges for a Mixture of Many Linear Regressions
ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation
ENT-DESC: Entity Description Generation by Exploring Knowledge Graph
ERASER: A Benchmark to Evaluate Rationalized NLP Models
ESPRIT: Explaining Solutions to Physical Reasoning Tasks
ESPnet-ST: All-in-One Speech Translation Toolkit
ETC: Encoding Long and Structured Inputs in Transformers
EXP4-DFDC: A Non-Stochastic Multi-Armed Bandit for Cache Replacement
Early Disease Diagnosis for Rice Crop
Easy-First Dependency Parsing with Hierarchical Tree LSTMs
Ecological Semantics: Programming Environments for Situated Language Understanding
EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing
Educating Text Autoencoders: Latent Representation Guidance via Denoising
Effective Approaches to Attention-based Neural Machine Translation
Effective Estimation of Deep Generative Language Models
Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models
Effectiveness of MPC-friendly Softmax Replacement
Efficient Competitive Self-Play Policy Optimization
Efficient Constituency Parsing by Pointing
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Efficient Continuous Pareto Exploration in Multi-Task Learning
Efficient Deployment of Conversational Natural Language Interfaces over Databases
Efficient Dialogue State Tracking by Selectively Overwriting Memory
Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy
Efficient Domain Generalization via Common-Specific Low-Rank Decomposition
Efficient EUD Parsing
Efficient Estimation of Influence of a Training Instance
Efficient Inference For Neural Machine Translation
Efficient Intent Detection with Dual Sentence Encoders
Efficient Intervention Design for Causal Discovery with Latents
Efficient Low-rank Multimodal Fusion with Modality-Specific Factors
Efficient Meta Lifelong-Learning with Limited Memory
Efficient One-Pass End-to-End Entity Linking for Questions
Efficient Online Scalar Annotation with Bounded Support
Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation
Efficient Parameter Estimation of Truncated Boolean Product Distributions
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
Efficient Privacy-Preserving Stochastic Nonconvex Optimization
Efficient Proximal Mapping of the 1-path-norm of Shallow Networks
Efficient Reservoir Management through Deep Reinforcement Learning
Efficient Robustness Certificates for Discrete Data: Sparsity-Aware Randomized Smoothing for Graphs, Images and More
Efficient Second-Order TreeCRF for Neural Dependency Parsing
Efficient allocation of law enforcement resources using predictive police patrolling
Efficient and Robust Algorithms for Adversarial Linear Contextual Bandits
Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors
Efficient improper learning for online logistic regression
Efficient non-conjugate Gaussian process factor models for spike count data using polynomial approximations
Efficient strategies for hierarchical text classification: External knowledge and auxiliary tasks
Efficient, Noise-Tolerant, and Private Learning via Boosting
Efficiently Learning Adversarially Robust Halfspaces with Noise
Efficiently Sampling Functions from Gaussian Process Posteriors
Efficiently Solving MDPs with Stochastic Mirror Descent
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models
Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits
Embarrassingly Simple Unsupervised Aspect Extraction
Embedding Multimodal Relational Data for Knowledge Base Completion
Embedding Words in Non-Vector Space with Unsupervised Graph Learning
Embedding time expressions for deep temporal ordering models
Embedding-based Scientific Literature Discovery in a Text Editor Application
Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition
Emergence of Syntax Needs Minimal Supervision
Emergent Road Rules In Multi-Agent Driving Environments
Emerging Cross-lingual Structure in Pretrained Language Models
Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models
Empower Entity Set Expansion via Language Model Probing
Empowering Active Learning to Jointly Optimize System and User Demands
Enabling Language Models to Fill in the Blanks
Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction
Encoder-decoder neural network for solving the nonlinear Fokker-Planck-Landau collision operator in XGC
Encoding Musical Style with Transformer Autoencoders
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
Encoding Source Language with Convolutional Neural Network for Machine Translation
Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages
End to End Binarized Neural Networks for Text Classification
End-to-End Bias Mitigation by Modelling Biases in Corpora
End-to-End Neural Word Alignment Outperforms GIZA++
End-to-End Slot Alignment and Recognition for Cross-Lingual NLU
End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
End-to-end Graph-based TAG Parsing with Neural Networks
End-to-end Neural Coreference Resolution
Energy and Policy Considerations for Deep Learning in NLP
Energy-Based Continuous Inverse Optimal Control
Energy-Based Processes for Exchangeable Data
Energy-based Surprise Minimization for Multi-Agent Value Factorization
Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions
Enhanced Universal Dependency Parsing with Second-Order Inference and Mixture of Training Data
Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension
Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information
Enhancing Machine Translation with Dependency-Aware Self-Attention
Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention
Enhancing Simple Models by Exploiting What They Already Know
Enhancing Stratospheric Weather Analyses and Forecasts by Deploying Sensors from a Weather Balloon
Enhancing Word Embeddings with Knowledge Extracted from Lexical Resources
Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing
Enriching Word Embeddings with Temporal and Spatial Information
Enriching Word Vectors with Subword Information
Entities as Experts: Sparse Memory Access with Entity Supervision
Entity Commonsense Representation for Neural Abstractive Summarization
Entity Linking for Queries by Searching Wikipedia Sentences
Entity Linking in 100 Languages
Entity Recognition at First Sight: Improving NER with Eye Movement Information
Entity-Enriched Neural Models for Clinical Question Answering
Entropy Minimization In Emergent Languages
Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data
Equalized odds postprocessing under imperfect group information
Equivariant Hamiltonian Flows
Equivariant Neural Rendering
Error Estimation for Sketched SVD via the Bootstrap
Error bounds in estimating the out-of-sample prediction error using leave-one-out cross validation in high-dimensions
Error-Bounded Correction of Noisy Labels
Estimating Grape Yield on the Vine from Multiple Images
Estimating Principal Components under Adversarial Perturbations
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
Estimating localized complexity of white-matter wiring with GANs
Estimating predictive uncertainty for rumour verification models
Estimating the number and effect sizes of non-null hypotheses
Estimation and Inference with Trees and Forests in High Dimensions
Estimation of Bounds on Potential Outcomes For Decision Making
Evaluating Agents without Rewards
Evaluating Amharic Machine Translation
Evaluating Attribution Methods using White-Box LSTMs
Evaluating Dialogue Generation Systems via Response Selection
Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior?
Evaluating Explanation Methods for Neural Machine Translation
Evaluating Gender Bias in Machine Translation
Evaluating Logical Generalization in Graph Neural Networks
Evaluating Lossy Compression Rates of Deep Generative Models
Evaluating Neural Morphological Taggers for Sanskrit
Evaluating Robustness to Input Perturbations for Neural Machine Translation
Evaluating Theory of Mind in Question Answering
Evaluating and Characterizing Human Rationales
Evaluating the Calibration of Knowledge Graph Embeddings for Trustworthy Link Prediction
Evaluating the Factual Consistency of Abstractive Text Summarization
Evaluating the Utility of Hand-crafted Features in Sequence Labelling
Evaluation of Model Selection for Kernel Fragment Recognition in Corn Silage
Event Extraction by Answering (Almost) Natural Questions
Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks
Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder
Evolution-based Fine-tuning of CNNs for Prostate Cancer Detection
EvolveGraph: Multi-Agent Trajectory Prediction with Dynamic Relational Reasoning
Evolving Reinforcement Learning Algorithms
Examination and Extension of Strategies for Improving Personalized Language Modeling via Interpolation
Examining Citations of Natural Language Processing Literature
Examining the State-of-the-Art in News Timeline Summarization
Exclusive Hierarchical Decoding for Deep Keyphrase Generation
ExpBERT: Representation Engineering with Natural Language Explanations
Experience Grounds Language
Experimental Evaluation and Development of a Silver-Standard for the MIMIC-III Clinical Coding Dataset
Expertise Style Transfer: A New Task Towards Better Communication between Experts and Laymen
Explainable Automated Fact-Checking for Public Health Claims
Explainable and Discourse Topic-aware Neural Language Understanding
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions
Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?
Explaining Groups of Points in Low-Dimensional Representations
Explaining the Explainer: A First Theoretical Analysis of LIME
Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning
Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation
Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading
Exploiting Categorical Structure Using Tree-Based Methods
Exploiting Cross-Sentence Context for Neural Machine Translation
Exploiting Deep Representations for Neural Machine Translation
Exploiting Domain Knowledge via Grouped Weight Sharing with Application to Text Categorization
Exploiting Explicit Paths for Multi-hop Reading Comprehension
Exploiting Rich Syntactic Information for Semantic Parsing with Graph-to-Sequence Model
Exploiting Sentence Order in Document Alignment
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning
Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach
Exploration by Optimisation in Partial Monitoring
Exploratory Analysis of COVID-19 Related Tweets in North America to Inform Public Health Institutes
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
Explore, Propose, and Assemble: An Interpretable Model for Multi-Hop Reading Comprehension
Exploring Author Context for Detecting Intended vs Perceived Sarcasm
Exploring Content Selection in Summarization of Novel Chapters
Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer
Exploring Contextualized Neural Language Models for Temporal Dependency Parsing
Exploring Exploration: Comparing Children with RL Agents in Unified Environments
Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation
Exploring Recombination for Efficient Decoding of Neural Machine Translation
Exploring Semantic Capacity of Terms
Exploring Weaknesses of VQA Models through Attribution Driven Insights
Exploring and Predicting Transferability across NLP Tasks
Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types
Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation
Exploring the Role of Argument Structure in Online Debate Persuasion
Exploring the Role of Prior Beliefs for Argument Persuasion
Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data
Expressing Visual Relationships via Language
Expressive Interviewing: A Conversational System for Coping with COVID-19
Expressiveness and Learning of Hidden Quantum Markov Models
Extending Implicit Discourse Relation Recognition to the PDTB-3
Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples
Extensively Matching for Few-shot Learning Event Detection
Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation
Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches
Extracting Implicitly Asserted Propositions in Argumentation
Extracting Symptoms and their Status from Clinical Conversations
Extractive Summarization as Text Matching
Extragradient with player sampling for faster Nash equilibrium finding
Extrapolating the profile of a finite population
Extreme Multi-label Classification from Aggregated Labels
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization
FFR V1.0: Fon-French Neural Machine Translation
FFR v1.1: Fon-French Neural Machine Translation
FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms
FLAT: Chinese NER Using Flat-Lattice Transformer
F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax
Facebook AI's WMT20 News Translation Task Submission
Facet-Aware Evaluation for Extractive Summarization
Facilitating the Communication of Politeness through Fine-Grained Paraphrasing
Fact or Fiction: Verifying Scientific Claims
Fact-based Text Editing
Factorising AMR generation through syntax
Factual Error Correction for Abstractive Summarization Models
Fair Bayesian Optimization
Fair Correlation Clustering
Fair Decisions Despite Imperfect Predictions
Fair Embedding Engine: A Library for Analyzing and Mitigating Gender Bias in Word Embeddings
Fair Generative Modeling via Weak Supervision
Fair Learning with Private Demographic Data
Fairness in the Eyes of the Data: Certifying Machine-Learning Models
Fairwashing Explanations with Off-Manifold Detergent
Familywise Error Rate Control by Interactive Unmasking
Fast Adaptation via Policy-Dynamics Value Functions
Fast Algorithms for Computational Optimal Transport and Wasserstein Barycenter
Fast Differentiable Sorting and Ranking
Fast Interleaved Bidirectional Sequence Generation
Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case
Fast Linear Convergence of Randomized BFGS
Fast Markov Chain Monte Carlo Algorithms via Lie Groups
Fast OSCAR and OWL Regression via Safe Screening Rules
Fast Physical Activity Suggestions: Efficient Hyperparameter Learning in Mobile Health
Fast Rates for Online Prediction with Abstention
Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning
Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
Fast and Scalable Expansion of Natural Language Understanding Functionality for Intelligent Agents
Fast semantic parsing with well-typedness guarantees
Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set
Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees
FastBERT: a Self-distilling BERT with Adaptive Inference Time
FastFormers: Highly Efficient Transformer Models for Natural Language Understanding
Faster Graph Embeddings via Coarsening
Faster Projection-free Online Learning
Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training
Feature Noise Induces Loss Discrepancy Across Groups
Feature Quantization Improves GAN Training
Feature Selection using Stochastic Gates
Feature relevance quantification in explainable AI: A causal problem
Feature-map-level Online Adversarial Knowledge Distillation
FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization
Federated Heavy Hitters Discovery with Differential Privacy
Federated Learning with Only Positive Labels
Fenchel Lifted Networks: A Lagrange Relaxation of Neural Network Training
FetchSGD: Communication-Efficient Federated Learning with Sketching
Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning
Few-Shot Learning for Opinion Summarization
Few-Shot NLG with Pre-Trained Language Model
Few-shot Domain Adaptation by Causal Mechanism Transfer
Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs
Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network
Few-shot link prediction via graph neural networks for Covid-19 drug-repurposing
FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation
Fiduciary Bandits
Fiedler Regularization: Learning Neural Networks with Graph Sparsity
Field-Level Crop Type Classification with k Nearest Neighbors: A Baseline for a New Kenya Smallholder Dataset
Fill in the BLANC: Human-free quality estimation of document summaries
Filling Missing Paths: Modeling Co-occurrences of Word Pairs and Dependency Paths for Recognizing Lexical Semantic Relations
Filtering Noisy Dialogue Corpora by Connectivity and Content Relatedness
FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance
Finding Convincing Arguments Using Scalable Bayesian Preference Learning
Finding Syntax in Human Encephalography with Beam Search
Finding Universal Grammatical Relations in Multilingual BERT
Finding Your Voice: The Linguistic Development of Mental Health Counselors
Finding trainable sparse networks through Neural Tangent Transfer
Fine Grained Citation Span for References in Wikipedia
Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences
Fine-Grained Prediction of Syntactic Typology: Discovering Latent Structure with Supervised Learning
Fine-Grained Temporal Relation Extraction
Fine-grained Fact Verification with Kernel Graph Attention Network
Fine-grained linguistic evaluation for state-of-the-art Machine Translation
Finite Regret and Cycles with Fixed Step-Size via Alternating Gradient Descent-Ascent
Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise
Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation
Finite-Time Analysis of Asynchronous Stochastic Approximation and Q-Learning
Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games
Fixed-Confidence Guarantees for Bayesian Best-Arm Identification
Flexible and Efficient Long-Range Planning Through Curious Exploration
Flexible retrieval with NMSLIB and FlexNeuART
Flow Models for Arbitrary Conditional Likelihoods
Fluent Response Generation for Conversational Question Answering
Forecasting Sequential Data using Consistent Koopman Autoencoders
Formal Limitations on the Measurement of Mutual Information
Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages
Fortifying Toxic Speech Detectors Against Veiled Toxicity
Fractal Gaussian Networks: A sparse random graph model based on Gaussian Multiplicative Chaos
Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise
Free Energy Wells and Overlap Gap Property in Sparse PCA
Frequency Bias in Neural Networks for Input of Non-Uniform Density
Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions
Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts
From Arguments to Key Points: Towards Automatic Argument Summarization
From Data to Decisions: Distributionally Robust Optimization is Optimal
From Dataset Recycling to Multi-Property Extraction and Beyond
From English to Code-Switching: Transfer Learning with Strong Morphological Clues
From ImageNet to Image Classification: Contextualizing Progress on Benchmarks
From Importance Sampling to Doubly Robust Policy Gradient
From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood
From Machine Reading Comprehension to Dialogue State Tracking: Bridging the Gap
From Nesterov's Estimate Sequence to Riemannian Acceleration
From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model
From Paraphrase Database to Compositional Paraphrase Model and Back
From Predictions to Decisions: Using Lookahead Regularization
From Speech-to-Speech Translation to Automatic Dubbing
From tree matching to sparse graph alignment
Frowning Frodo, Wincing Leia, and a Seriously Great Friendship: Learning to Classify Emotional Relationships of Fictional Characters
Frustratingly Hard Evidence Retrieval for QA Over Books
Frustratingly Simple Few-Shot Object Detection
Fully Character-Level Neural Machine Translation without Explicit Segmentation
Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs
Fully Parallel Hyperparameter Search: Reshaped Space-Filling
Fully reversible neural networks for large-scale surface and sub-surface characterization via remote sensing
Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations
GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media
GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification
GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation
GP-VAE: Deep Probabilistic Time Series Imputation
GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems
Gaining Insight into SARS-CoV-2 Infection and COVID-19 Severity Using Self-supervised Edge Features and Graph Neural Networks
Games for Fairness and Interpretability
Gamification of Pure Exploration for Linear Bandits
Gated Convolutional Bidirectional Attention-based Model for Off-topic Spoken Response Detection
Gaussian Mixture Latent Vector Grammars
Gaussian Sketching yields a J-L Lemma in RKHS
Gaussianization Flows
GenAug: Data Augmentation for Finetuning Text Generators
Gender Bias in Contextualized Word Embeddings
Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer
Gender Coreference and Bias Evaluation at WMT 2020
Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations
Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus
Gender-preserving Debiasing for Pre-trained Word Embeddings
General Identification of Dynamic Treatment Regimes Under Interference
Generalisation error in learning with random features and the hidden manifold model
Generalization Error of Generalized Linear Models in High Dimensions
Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features
Generalization and Representational Limits of Graph Neural Networks
Generalization to New Actions in Reinforcement Learning
Generalized Data Augmentation for Low-Resource Translation
Generalized and Scalable Optimal Sparse Decision Trees
Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data
Generalizing Natural Language Analysis through Span-relation Representations
Generalizing Word Embeddings using Bag of Subwords
Generalizing and Hybridizing Count-based and Neural Language Models
Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation
Generating Automatic Curricula via Self-Supervised Active Domain Randomization
Generating Counter Narratives against Online Hate Speech: Data and Strategies
Generating Dialogue Responses from a Semantic Latent Space
Generating Diverse Translation from Model Distribution with Dropout
Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs
Generating Fact Checking Briefs
Generating Fact Checking Explanations
Generating Fine-Grained Open Vocabulary Entity Type Descriptions
Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection
Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze
Generating Label Cohesive and Well-Formed Adversarial Claims
Generating Logical Forms from Graph Representations of Text and Entities
Generating Narrative Text in a Switching Dynamical System
Generating Negative Commonsense Knowledge
Generating Novel Glyph without Human Data by Learning to Communicate
Generating Question Relevant Captions to Aid Visual Question Answering
Generating Radiology Reports via Memory-driven Transformer
Generating Sentences by Editing Prototypes
Generating Summaries with Topic Templates and Structured Convolutional Decoders
Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution
Generative Adversarial Imitation from Observation
Generative Adversarial User Privacy in Lossy Single-Server Information Retrieval
Generative Flows with Matrix Exponential
Generative ODE Modeling with Known Unknowns
Generative Semantic Hashing Enhanced via Boltzmann Machines
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data
Geometric Dataset Distances via Optimal Transport
Geometry-aware Domain Adaptation for Unsupervised Alignment of Word Embeddings
Geoopt: Riemannian Optimization in PyTorch
Getting a CLUE: A Method for Explaining Uncertainty Estimates
Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis?
Giving Attention to the Unexpected: Using Prosody Innovations in Disfluency Detection
Global Neural CCG Parsing with Optimality Guarantees
Global-to-Local Neural Networks for Document-Level Relation Extraction
Globally Normalized Reader
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks
Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection
Good-Enough Compositional Data Augmentation
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
Gradient Based Memory Editing for Task-Free Continual Learning
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
Gradient Temporal-Difference Learning with Regularized Corrections
Gradient descent algorithms for Bures-Wasserstein barycenters
Gradient descent follows the regularization path for general losses
Gradient-free Online Learning in Games with Delayed Rewards
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses
Graph Clustering with Graph Neural Networks
Graph Coarsening with Preserved Spectral Properties
Graph Convolutional Gaussian Processes For Link Prediction
Graph Convolutional Network for Recommendation with Low-pass Collaborative Filters
Graph Convolutions over Constituent Trees for Syntax-Aware Semantic Role Labeling
Graph DNA: Deep Neighborhood Aware Graph Encoding for Collaborative Filtering
Graph Filtration Learning
Graph Homomorphism Convolution
Graph Learning for Inverse Landscape Genetics
Graph Neural Networks for Massive MIMO Detection
Graph Neural Networks for the Prediction of Substrate-Specific Organic Reaction Conditions
Graph Neural Networks in TensorFlow and Keras with Spektral
Graph Optimal Transport for Cross-Domain Alignment
Graph Pattern Entity Ranking Model for Knowledge Graph Completion
Graph Random Neural Features for Distance-Preserving Graph Representations
Graph Structure of Neural Networks
Graph based Neural Networks for Event Factuality Prediction using Syntactic and Semantic Structures
Graph neural induction of value iteration
Graph-based Nearest Neighbor Search: From Practice to Theory
Graph-based, Self-Supervised Program Repair from Diagnostic Feedback
GraphDialog: Integrating Graph Knowledge into End-to-End Task-Oriented Dialogue Systems
GraphOpt: Learning Optimization Models of Graph Formation
Graphs, Entities, and Step Mixture
Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection
Greedy Search with Probabilistic N-gram Matching for Neural Machine Translation
Gromov-Wasserstein Alignment of Word Embedding Spaces
Grounded Adaptation for Zero-shot Executable Semantic Parsing
Grounded Compositional Outputs for Adaptive Language Modeling
Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs
Grounding Conversations with Improvised Dialogues
Group Equivariant Deep Reinforcement Learning
Growing Action Spaces
Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation
Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis
Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization
Guiding Attention for Self-Supervised Learning with Transformers
Guiding Variational Response Generator to Exploit Persona
HABERTOR: An Efficient and Effective Deep Hatespeech Detector
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
HEAD-QA: A Healthcare Dataset for Complex Reasoning
HENIN: Learning Heterogeneous Neural Interaction Networks for Explainable Cyberbullying Detection on Social Media
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization
HNHN: Hypergraph Networks with Hyperedge Neurons
Haar Graph Pooling
Haar Wavelet based Block Autoregressive Flows for Trajectories
Hallucinative Topological Memory for Zero-Shot Visual Planning
Halpern Iteration for Near-Optimal and Parameter-Free Monotone Inclusion and Strong Solutions to Variational Inequalities
Hamiltonian Graph Networks with ODE Integrators
Hamiltonian Monte Carlo Swindles
Handling Divergent Reference Texts when Evaluating Table-to-Text Generation
Handling Noisy Labels for Robustly Learning from Self-Training Data for Low-Resource Sequence Labeling
Handling the Positive-Definite Constraint in the Bayesian Learning Rule
Hard-Coded Gaussian Attention for Neural Machine Translation
Hardness of Identity Testing for Restricted Boltzmann Machines and Potts models
Harmonic Decompositions of Convolutional Networks
Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity
Harnessing the linguistic signal to predict scalar inferences
Harry Potter and the Action Prediction Challenge from Natural Language
Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia
Harvesting and Refining Question-Answer Pairs for Unsupervised QA
Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation
Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora
Helping Reduce Environmental Impact of Aviation with Machine Learning
Hermitian matrices for clustering directed graphs: insights and applications
Heterogeneous Graph Neural Networks for Extractive Document Summarization
Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach
Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization
Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling
Hierarchical Clustering: a 0.585 Revenue Approximation
Hierarchical Entity Typing via Multi-level Learning to Rank
Hierarchical Evidence Set Modeling for Automated Fact Extraction and Verification
Hierarchical Generation of Molecular Graphs using Structural Motifs
Hierarchical Graph Network for Multi-hop Question Answering
Hierarchical Inter-Message Passing for Learning on Molecular Graphs
Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking
Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts
Hierarchical Neural Story Generation
Hierarchical Protein Function Prediction with Tail-GNNs
Hierarchical Quantized Representations for Script Generation
Hierarchical Structured Model for Fine-to-coarse Manifesto Text Analysis
Hierarchical Transformers for Multi-Document Summarization
Hierarchical Verification for Adversarial Robustness
Hierarchically Decoupled Imitation for Morphological Transfer
High Dimensional Robust Sparse Regression
High Resolution Medical Image Analysis with Spatial Partitioning
High-Dimensional Robust Mean Estimation via Gradient Descent
HighRES: Highlight-based Reference-less Evaluation of Summarization
Higher-order Coreference Resolution with Coarse-to-fine Inference
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
History for Visual Dialog: Do we really need it?
History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms
Hooks in the Headline: Learning to Generate Headlines with Controlled Styles
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence
How Does Selective Mechanism Improve Self-Attention Networks?
How Furiously Can Colourless Green Ideas Sleep? Sentence Acceptability in Context
How Good is the Bayes Posterior in Deep Neural Networks Really?
How Large a Vocabulary Does Text Classification Need? A Variational Approach to Vocabulary Selection
How Much Knowledge Can You Pack Into the Parameters of a Language Model?
How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks
How To Backdoor Federated Learning
How agents see things: On visual representations in an emergent language game
How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking
How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?
How multilingual is Multilingual BERT?
How recurrent networks implement contextual processing in sentiment analysis
How to Grow a (Product) Tree: Personalized Category Suggestions for eCommerce Type-Ahead
How to Make Deep RL Work in Practice
How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation
How to trap a gradient flow
How well does surprisal explain N400 amplitude under different experimental conditions?
Howl: A Deployed, Open-Source Wake Word Detection System
Human computation requires and enables a new approach to ethical review
Human-Like Active Learning: Machines Simulating the Human Learning Process
Human-Paraphrased References Improve Neural Machine Translation
Human-centric Dialog Training via Offline Reinforcement Learning
Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning
Hybrid Session-based News Recommendation using Recurrent Neural Networks
Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization
HydroNets: Leveraging River Structure for Hydrologic Modeling
Hyper-spectral NIR and MIR data and optimal wavebands for detection of apple tree diseases
Hyperbolic Manifold Regression
Hypernetwork approach to generating point clouds
Hyperparameter Auto-tuning in Self-Supervised Robotic Learning
Hypothesis Testing Interpretations and Renyi Differential Privacy
ID3 Learns Juntas for Smoothed Product Distributions
IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation
IIRC: A Dataset of Incomplete Information Reading Comprehension Questions
IMHO Fine-Tuning Improves Claim Detection
IMoJIE: Iterative Memory-Based Joint Open Information Extraction
INFOTABS: Inference on Tables as Semi-structured Data
INSET: Sentence Infilling with INter-SEntential Transformer
INSPIRED: Toward Sociable Recommendation Dialog Systems
IROF: a low resource evaluation metric for explanation methods
IV-Posterior: Inverse Value Estimation for Interpretable Policy Certificates
Identifying Semantic Divergences in Parallel Text without Annotations
Identifying and Correcting Label Bias in Machine Learning
Identifying and Reducing Gender Bias in Word-Level Language Models
Identifying civilians killed by police with distantly supervised entity-event extraction
If MaxEnt RL is the Answer, What is the Question?
If beam search is the answer, what was the question?
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels
Image Generation With Neural Cellular Automatas
Image Pivoting for Learning Multilingual Multimodal Representations
Image-based phenotyping of diverse Rice (Oryza Sativa L.) Genotypes
Imitation Attacks and Defenses for Black-box Machine Translation Systems
Imitation Learning Approach for AI Driving Olympics Trained on Real-world and Simulation Data Simultaneously
Imitation Learning for Neural Morphological String Transduction
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation
Implicit Generative Modeling for Efficient Exploration
Implicit Geometric Regularization for Learning Shapes
Implicit Regularization of Random Feature Models
Implicit competitive regularization in GANs
Implicit regularization and solution uniqueness in over-parameterized matrix sensing
Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process
Improper Learning for Non-Stochastic Control
Improved Natural Language Generation via Loss Truncation
Improved Neural Relation Detection for Knowledge Base Question Answering
Improved Optimistic Algorithms for Logistic Bandits
Improved Regret Bounds for Projection-free Bandit Convex Optimization
Improved Relation Extraction with Feature-Rich Compositional Embedding Models
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
Improved Semantic-Aware Network Embedding with Fine-Grained Word Alignment
Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text
Improved Speech Representations with Multi-Target Autoregressive Predictive Coding
Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs
Improving AMR Parsing with Sequence-to-Sequence Pre-training
Improving Abstraction in Text Summarization
Improving Adversarial Text Generation by Modeling the Distant Future
Improving Candidate Generation for Low-resource Cross-lingual Entity Linking
Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation
Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining
Improving Dialogue State Tracking by Discerning the Relevant Context
Improving Disentangled Text Representation Learning with Information-Theoretic Guidance
Improving Disfluency Detection by Self-Training a Self-Attentive Model
Improving Domain Adaptation Translation with Domain Invariant and Specific Information
Improving Generalization by Controlling Label-Noise Information in Neural Network Weights
Improving Generative Imagination in Object-Centric World Models
Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting
Improving Human Text Comprehension through Semi-Markov CRF-based Neural Section Title Generation
Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
Improving Knowledge Graph Embedding Using Simple Constraints
Improving Lemmatization of Non-Standard Languages with Joint Learning
Improving Lexical Choice in Neural Machine Translation
Improving Machine Reading Comprehension with General Reading Strategies
Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation
Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation
Improving Molecular Design by Stochastic Iterative Target Augmentation
Improving Multi-turn Dialogue Modelling with Utterance ReWriter
Improving Multilingual Models with Language-Clustered Vocabularies
Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping
Improving Neural Conversational Models with Entropy-Based Data Filtering
Improving Neural Parsing by Disentangling Model Combination and Reranking Effects
Improving Non-autoregressive Neural Machine Translation with Monolingual Data
Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader
Improving Robustness of Deep-Learning-Based Image Reconstruction
Improving Segmentation for Technical Support Problems
Improving Slot Filling by Utilizing Contextual Information
Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance
Improving Text Generation with Student-Forcing Optimal Transport
Improving Topic Models with Latent Feature Word Representations
Improving Transformer Models by Reordering their Sublayers
Improving Truthfulness of Headline Generation
Improving Unsupervised Word-by-Word Translation with Language Model and Denoising Autoencoder
Improving Yorùbá Diacritic Restoration
Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback
Improving fairness in machine learning systems: What do industry practitioners need?
Improving robustness against common corruptions by covariate shift adaptation
Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction
Improving the Gating Mechanism of Recurrent Neural Networks
Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization
Imputation estimators for unnormalized models with missing data
Imputer: Sequence Modelling via Imputation and Dynamic Programming
In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology
In-domain representation learning for remote sensing
Incentive-Compatible Forecasting Competitions
Incentives for Federated Learning: a Hypothesis Elicitation Approach
Incidence Networks for Geometric Deep Learning
Incomplete Utterance Rewriting as Semantic Segmentation
Incorporate Semantic Structures into Machine Translation Evaluation via UCCA
Incorporating Behavioral Hypotheses for Query Generation
Incorporating External Knowledge through Pre-training for Natural Language to Code Generation
Incorporating Subword Information into Matrix Factorization Word Embeddings
Incorporating Terminology Constraints in Automatic Post-Editing
Incorporating Uncertain Segmentation Information into Chinese NER for Social Media Text
Incorporating a Local Translation Mechanism into Non-autoregressive Translation
Increasing performance of electric vehicles in ride-hailing services using deep reinforcement learning
Incremental Neural Coreference Resolution in Constant Memory
Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU
Incremental Sampling Without Replacement for Sequence Models
Incremental Transformer with Deliberation Decoder for Document Grounded Conversations
Independent Subspace Analysis for Unsupervised Learning of Disentangled Representations
Individual Calibration with Randomized Forecasting
Induced Inflection-Set Keyword Search in Speech
Inductive Relation Prediction by Subgraph Reasoning
Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization
Inexact Tensor Methods with Dynamic Accuracies
Inference Strategies for Machine Translation with Conditional Masking
Inference of Dynamic Graph Changes for Functional Connectome
Inferring Which Medical Treatments Work from Reports of Clinical Trials
Inferring astrophysical X-ray polarization with deep learning
Infinite attention: NNGP and NTK for deep attention networks
Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models
Information Aggregation for Multi-Head Attention with Routing-by-Agreement
Information Directed Sampling for Linear Partial Monitoring
Information Extraction from Swedish Medical Prescriptions with Sig-Transformer Encoder
Information Seeking in the Spirit of Learning: a Dataset for Conversational Curiosity
Information Theoretic Optimal Learning of Gaussian Graphical Models
Information-Theoretic Local Minima Characterization and Regularization
Information-Theoretic Probing for Linguistic Structure
Information-Theoretic Probing with Minimum Description Length
Informative Dropout for Robust Representation Learning: A Shape-bias Perspective
Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition
Injecting Numerical Reasoning Skills into Language Models
Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets
Input-Sparsity Low Rank Approximation in Schatten Norm
Inquisitive Question Generation for High Level Text Comprehension
Insights into Fairness through Trust: Multi-scale Trust Quantification for Financial Deep Learning
InstaHide: Instance-hiding Schemes for Private Distributed Learning
Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition
Instance-wise Depth and Motion Learning from Monocular Videos
Integrals over Gaussians under Linear Domain Constraints
Integrating Multimodal Information in Large Pretrained Transformers
Integrating Semantic Knowledge to Tackle Zero-shot Text Classification
Integrating Semantic and Structural Information with Graph Convolutional Network for Controversy Detection
Integrating Transformer and Paraphrase Rules for Sentence Simplification
Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation
Inter-Level Cooperation in Hierarchical Reinforcement Learning
Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network
Interactive Classification by Asking Informative Questions
Interactive Extractive Search over Biomedical Corpora
Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning
Interactive Machine Comprehension with Information Seeking Agents
Interactive Refinement of Cross-Lingual Word Embeddings
Interactive Text Ranking with Bayesian Optimisation: A Case Study on Community QA and Summarisation
Interactive Visualization for Debugging RL
Interconnected Question Generation with Coreference Alignment and Conversation Flow Modeling
Interference and Generalization in Temporal Difference Learning
Interpolation between Residual and Non-Residual Networks
Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions
Interpretable Companions for Black-Box Models
Interpretable Multi-dataset Evaluation for Named Entity Recognition
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions
Interpretable Question Answering on Knowledge Bases and Text
Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder
Interpretable deep Gaussian processes with moments
Interpretation of NLP models through input marginalization
Interpretations are useful: penalizing explanations to align neural networks with prior knowledge
Interpreting Attention Models with Human Visual Attention in Machine Reading Comprehension
Intrinsic Probing through Dimension Selection
Intrinsic Reward Driven Imitation Learning via Generative Model
Introducing Syntactic Structures into Target Opinion Word Extraction with Deep Learning
Invariant Causal Prediction for Block MDPs
Invariant Risk Minimization Games
Inverse Active Sensing: Modeling and Understanding Timely Decision-Making
Invertible Generative Modeling using Linear Rational Splines
Invertible generative models for inverse problems: mitigating representation error and dataset bias
Investigating African-American Vernacular English in Transformer-Based Text Generation
Investigating Capsule Networks with Dynamic Routing for Text Classification
Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension
Investigating representations of verb bias in neural language models
Investigating the Effect of Sensor Modalities in Multi-Sensor Detection-Prediction Models
Involutive MCMC: a Unifying Framework
Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?
Is Graph Structure Necessary for Multi-hop Question Answering?
Is Local SGD Better than Minibatch SGD?
Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing
Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds
Is the Best Better? Bayesian Statistical Model Comparison for Natural Language Processing
It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information
It's Not What Machines Can Learn, It's What We Cannot Teach
Iterative Edit-Based Unsupervised Sentence Simplification
Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation
Ivy: Instrumental Variable Synthesis for Causal Inference
Job Recommendation through Progression of Job Selection
Joint Bootstrapping Machines for High Confidence Relation Extraction
Joint Constrained Learning for Event-Event Relation Extraction
Joint Detection and Location of English Puns
Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging
Joint Effects of Context and User History for Predicting Online Conversation Re-entries
Joint Entity Extraction and Assertion Detection for Clinical Text
Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme
Joint Learning of Pre-Trained and Random Units for Domain Adaptation in Part-of-Speech Tagging
Joint Modeling of Content and Discourse Relations in Dialogues
Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora
Joint Modelling of Emotion and Abusive Language Detection
Joint Multilingual Supervision for Cross-lingual Entity Linking
Joint Multitask Learning for Community Question Answering Using Task-Specific Embeddings
Joint Reasoning for Temporal and Causal Relations
Joint Semantic Synthesis and Morphological Analysis of the Derived Word
Joint translation and unit conversion for end-to-end localization
Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation
Jointly Optimizing Diversity and Relevance in Neural Response Generation
Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling
KLEJ: Comprehensive Benchmark for Polish Language Understanding
KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation
Keep CALM and Explore: Language Models for Action Generation in Text-based Games
Keeping Up Appearances: Computational Modeling of Face Acts in Persuasion Oriented Discussions
Kernel Conditional Density Operators
Kernel and Rich Regimes in Overparametrized Models
Kernel interpolation with continuous volume sampling
Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization
Key-Value Memory Networks for Directly Reading Documents
Keyphrase Generation: A Text Summarization Struggle
KinGDOM: Knowledge-Guided DOMain adaptation for sentiment analysis
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
Knowing The What But Not The Where in Bayesian Optimization
Knowledge Association with Hyperbolic Knowledge Graph Embeddings
Knowledge Completion for Generics using Guided Tensor Factorization
Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation
Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward
Knowledge-Grounded Dialogue Generation with Pre-trained Language Models
Knowledge-aware Pronoun Coreference Resolution
Knowledge-guided Open Attribute Value Extraction with Reinforcement Learning
Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge
KutralNet: A Portable Deep Learning Model for Fire Recognition
Køpsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding
LAReQA: Language-agnostic answer retrieval from a multilingual pool
LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation
LEEP: A New Measure to Evaluate Transferability of Learned Representations
LIBRE: Learning Interpretable Boolean Rule Ensembles
LINSPECTOR: Multilingual Probing Tasks for Word Representations
LOGAN: Local Group Bias Detection by Clustering
LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction
LRTA: A Transparent Neural-Symbolic Reasoning Framework with Modular Supervision for Visual Question Answering
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition
Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks
Langevin Monte Carlo without smoothness
Language (Re)modelling: Towards Embodied Language Understanding
Language (Technology) is Power: A Critical Survey of "Bias" in NLP
Language Generation with Multi-Hop Reasoning on Commonsense Knowledge Graph
Language Model Prior for Low-Resource Neural Machine Translation
Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation
Language Models as Fact Checkers?
Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese
Language Models not just for Pre-training: Fast Online Neural Noisy Channel Modeling
Language Understanding for Text-based Games Using Deep Reinforcement Learning
Language as a Latent Variable: Discrete Generative Models for Sentence Compression
Large Margin Neural Language Model
Large Product Key Memory for Pretrained Language Models
Large Scale Multi-Actor Generative Dialog Modeling
Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing
Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health
Large-scale Cloze Test Dataset Created by Teachers
Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems
Latent Alignment of Procedural Concepts in Multimodal Recipes
Latent Space Factorisation and Manipulation via Matrix Subspace Projection
Latent Space Oddity: Exploring Latent Spaces to Design Guitar Timbres
Latent Variable Modelling with Hyperbolic Normalizing Flows
Latent-CF: A Simple Baseline for Reverse Counterfactual Explanations
Layered Sampling for Robust Optimization Problems
LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments
LdSM: Logarithm-depth Streaming Multi-label Decision Trees
Learnable Bernoulli Dropout for Bayesian Deep Learning
Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning
Learning Adaptive Language Interfaces through Decomposition
Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization
Learning Algebraic Multigrid Using Graph Neural Networks
Learning Architectures from an Extended Search Space for Language Modeling
Learning Autoencoders with Relational Regularization
Learning Canonical Transformations
Learning Collaborative Agents with Rule Guidance for Knowledge Graph Reasoning
Learning Compressed Sentence Representations for On-Device Text Processing
Learning Constraints for Structured Prediction Using Rectifier Networks
Learning Context-Free Languages with Nondeterministic Stack RNNs
Learning Context-Sensitive Convolutional Filters for Text Processing
Learning Contextualized Knowledge Structures for Commonsense Reasoning
Learning Cross-lingual Distributed Logical Representations for Semantic Parsing
Learning Crosslingual Word Embeddings without Bilingual Corpora
Learning De-biased Representations with Biased Representations
Learning Deep Transformer Models for Machine Translation
Learning Dialog Policies from Weak Demonstrations
Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders
Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information
Learning Dynamic Feature Selection for Fast Sequential Prediction
Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes
Learning Efficient Multi-agent Communication: An Information Bottleneck Approach
Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use
Learning Entangled Single-Sample Gaussians in the Subset-of-Signals Model
Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards
Learning Fair Representations for Kernel Models
Learning Flat Latent Manifolds with VAEs
Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning
Learning Gaussian Graphical Models via Multiplicative Weights
Learning Generic Sentence Representations Using Convolutional Neural Networks
Learning Geometric Word Meta-Embeddings
Learning Graph Models for Template-Free Retrosynthesis
Learning Graph Structure With A Finite-State Automaton Layer
Learning Group Structure and Disentangled Representations of Dynamical Environments
Learning Halfspaces with Massart Noise Under Structured Distributions
Learning Hierarchical Interactions at Scale: A Convex Optimization Approach
Learning High-dimensional Gaussian Graphical Models under Total Positivity without Adjustment of Tuning Parameters
Learning Human Objectives by Evaluating Hypothetical Behavior
Learning Hyperbolic Representations for Unsupervised 3D Segmentation
Learning Implicit Text Generation via Feature Matching
Learning Implicitly with Noisy Data in Linear Arithmetic
Learning Informative Representations of Biomedical Relations with Latent Variable Models
Learning Intrinsic Symbolic Rewards in Reinforcement Learning
Learning Invariant Representations for Reinforcement Learning without Reconstruction
Learning Joint Semantic Parsers from Disjoint Data
Learning Lexico-Functional Patterns for First-Person Affect
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks
Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots
Learning Mixtures of Graphs from Epidemic Cascades
Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach
Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models
Learning Near Optimal Policies with Low Inherent Bellman Error
Learning Neural Sequence-to-Sequence Models from Weak Feedback with Bipolar Ramp Loss
Learning Neural Templates for Text Generation
Learning Object-Centric Video Models by Contrasting Sets
Learning Optimal Tree Models Under Beam Search
Learning Outside the Box: Discourse-level Features Improve Metaphor Identification
Learning Overlapping Representations for the Estimation of Individualized Treatment Effects
Learning Portable Representations for High-Level Planning
Learning Probabilistic Sentence Representations from Paraphrases
Learning Quadratic Games on Networks
Learning Reasoning Strategies in End-to-End Differentiable Proving
Learning Representations that Support Extrapolation
Learning Robot Skills with Temporal Variational Inference
Learning Robust Models for e-Commerce Product Search
Learning Sequence Encoders for Temporal Knowledge Graph Completion
Learning Similarity Metrics for Numerical Simulations
Learning Source Phrase Representations for Neural Machine Translation
Learning Sparse Nonparametric DAGs
Learning Spoken Language Representations with Neural Lattice Language Modeling
Learning Structural Kernels for Natural Language Processing
Learning Structured Representations of Entity Names using Active Learning and Weak Supervision
Learning Symbolic Physics with Graph Networks
Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning
Learning To Solve Differential Equations Across Initial Conditions
Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers
Learning Visually Grounded Sentence Representations
Learning What to Defer for Maximum Independent Sets
Learning Word-Like Units from Joint Audio-Visual Analysis
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
Learning a Cost-Effective Annotation Policy for Question Answering
Learning a Multi-Domain Curriculum for Neural Machine Translation
Learning a Neural Semantic Parser from User Feedback
Learning a Policy for Opportunistic Active Learning
Learning a Simple and Effective Model for Multi-turn Response Generation with Auxiliary Tasks
Learning a Single Neuron with Gradient Methods
Learning an Unreferenced Metric for Online Dialogue Evaluation
Learning and Evaluating Contextual Embedding of Source Code
Learning and Evaluating Emotion Lexicons for 91 Languages
Learning and Sampling of Atomic Interventions from Observations
Learning beyond datasets: Knowledge Graph Augmented Neural Networks for Natural language Processing
Learning distributed representations of graphs with Geo2DR
Learning for Dose Allocation in Adaptive Clinical Trials with Safety Constraints
Learning from Context or Names? An Empirical Study on Neural Relation Extraction
Learning from Irregularly-Sampled Time Series: A Missing Data Perspective
Learning from Task Descriptions
Learning how to Active Learn: A Deep Reinforcement Learning Approach
Learning in Gated Neural Networks
Learning piecewise Lipschitz functions in changing environments
Learning robust visual representations using data augmentation invariance
Learning spectrograms with convolutional spectral kernels
Learning the piece-wise constant graph structure of a varying Ising model
Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information
Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders
Learning to Ask Unanswerable Questions for Machine Reading Comprehension
Learning to Branch for Multi-Task Learning
Learning to Classify Intents and Slot Labels Given a Handful of Examples
Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules
Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling
Learning to Continually Learn
Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks
Learning to Deceive with Attention-Based Explanations
Learning to Decipher Hate Symbols
Learning to Encode Position for Transformer with Continuous Dynamical Model
Learning to Evaluate Translation Beyond English: BLEURT Submissions to the WMT Metrics 2020 Shared Task
Learning to Faithfully Rationalize by Construction
Learning to Fuse Sentences with Transformers for Summarization
Learning to Generate Compositional Color Descriptions
Learning to Generate Multiple Style Transfer Outputs for an Input Sentence
Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks
Learning to Learn Kernels with Variational Random Features
Learning to Map Context-Dependent Sentences to Executable Formal Queries
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
Learning to Parse and Translate Improves Neural Machine Translation
Learning to Prune Deep Neural Networks via Reinforcement Learning
Learning to Rank Learning Curves
Learning to Reach Goals via Iterated Supervised Learning
Learning to Recognize Discontiguous Entities
Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation
Learning to Represent Action Values as a Hypergraph on the Action Vertices
Learning to Sample with Local and Global Contexts in Experience Replay Buffer
Learning to Score Behaviors for Guided Policy Optimization
Learning to Segment Actions from Observation and Narration
Learning to Simulate Complex Physics with Graph Networks
Learning to Stop While Learning to Predict
Learning to Understand Child-directed and Adult-directed Speech
Learning to Update Natural Language Comments Based on Code Changes
Learning to simulate and design for structural engineering
Learning with Bounded Instance- and Label-dependent Label Noise
Learning with Good Feature Representations in Bandits and in RL with a Generative Model
Learning with Multiple Complementary Labels
Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data
Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning
Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation
Let Me Choose: From Verbal Context to Font Selection
Let's Agree to Agree: Neural Networks Share Classification Order on Real Datasets
Levels of Analysis for Machine Learning
Leveraging Declarative Knowledge in Text and First-Order Logic for Fine-Grained Propaganda Detection
Leveraging Frequency Analysis for Deep Fake Image Recognition
Leveraging Graph to Improve Abstractive Multi-Document Summarization
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation
Leveraging Multimodal Behavioral Analytics for Automated Job Interview Performance Assessment and Feedback
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
Leveraging Procedural Generation to Benchmark Reinforcement Learning
Leveraging Sentence Similarity in Natural Language Generation: Improving Beam Search using Range Voting
Lexical Features in Coreference Resolution: To be Used With Caution
Lexically Constrained Neural Machine Translation with Levenshtein Transformer
Lexicosyntactic Inference in Neural Models
Lifelong Language Knowledge Distillation
Lifelong Learning CRF for Supervised Aspect Extraction
Lifted Disjoint Paths with Application in Multiple Object Tracking
Lifted Rule Injection for Relation Embeddings
Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation
Like a Baby: Visually Situated Neural Language Acquisition
Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions
Likelihood Regret: An Out-of-Distribution Detection Score For Variational Auto-encoder
Linear Bandits with Stochastic Delayed Feedback
Linear Convergence of Adaptive Stochastic Gradient Descent
Linear Convergence of Randomized Primal-Dual Coordinate Method for Large-scale Linear Constrained Convex Programming
Linear Dynamics: Clustering without identification
Linear Lower Bounds and Conditioning of Differentiable Games
Linear Mode Connectivity and the Lottery Ticket Hypothesis
Linear-Time Constituency Parsing with RNNs and Dynamic Programming
Linearly Convergent Frank-Wolfe with Backtracking Line-Search
Linguistic Features for Readability Assessment
Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game
Linguistic Knowledge and Transferability of Contextual Representations
Lipschitz Constrained Parameter Initialization for Deep Transformers
Lipschitz and Comparator-Norm Adaptivity in Online Learning
Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them
List Decodable Subspace Recovery
Lite Training Strategies for Portuguese-English and English-Portuguese Translation
Local Differentially Private Regret Minimization in Reinforcement Learning
Localizing Moments in Video with Temporal Language
Locally Accelerated Conditional Gradients
Locally Private Hypothesis Selection
Location Attention for Extrapolation to Longer Sequences
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
Logarithmic Regret for Online Control
Logic-Guided Data Augmentation and Regularization for Consistent Question Answering
Logical Inferences with Comparatives and Generalized Quantifiers
Logical Natural Language Generation from Open-Domain Tables
LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network
Logistic Regression for Massive Data with Rare Events
Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum
Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors
Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks
Look It Up: Bilingual and Monolingual Dictionaries Improve Neural Machine Translation
Look at the First Sentence: Position Bias in Question Answering
Lookahead-Bounded Q-Learning
Loss Function Search for Face Recognition
Lossless Compression of Deep Neural Networks
Low Rank Fusion based Transformers for Multimodal Sequences
Low Resource Neural Machine Translation: A Benchmark for Five African Languages
Low Shot Learning with Untrained Neural Networks for Imaging Inverse Problems
Low-Dimensional Hyperbolic Knowledge Graph Embeddings
Low-Rank Bottleneck in Multi-head Attention Models
Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing
Low-Variance and Zero-Variance Baselines for Extensive-Form Games
Low-loss connection of weight vectors: distribution-based approaches
Low-resource Deep Entity Resolution with Transfer and Active Learning
LowFER: Low-rank Bilinear Pooling for Link Prediction
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
MAST: Multimodal Abstractive Summarization with Trimodal Hierarchical Attention
MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization
MAVEN: A Massive General Domain Event Detection Dataset
MCMH: Learning Multi-Chain Multi-Hop Rules for Knowledge Graph Reasoning
MEGA RST Discourse Treebanks with Structure and Nuclearity from Scalable Distant Sentiment Supervision
MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models
MGHRL: Meta Goal-generation for Hierarchical Reinforcement Learning
MIME: MIMicking Emotions for Empathetic Response Generation
MLSUM: The Multilingual Summarization Corpus
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics
MOPO: Model-based Offline Policy Optimization
MORSE: Semantic-ally Drive-n MORpheme SEgment-er
MPC-guided Imitation Learning of Neural Network Policies for the Artificial Pancreas
MTL2L: A Context Aware Neural Optimiser
MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics
Machine Learning in Population and Public Health
Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
Mapping Natural Language Instructions to Mobile UI Action Sequences
Mapping Natural-language Problems to Formal-language Solutions Using Structured Neural Representations
Mapping to Declarative Knowledge for Word Problem Solving
Marrying up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding
Masked Language Model Scoring
Masking as an Efficient Alternative to Finetuning for Pretrained Language Models
Massively Multilingual Adversarial Speech Recognition
Massively Multilingual Transfer for NER
Matching the Blanks: Distributional Similarity for Relation Learning
Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning
Maximum Likelihood with Bias-Corrected Calibration is Hard-To-Beat at Label Shift Adaptation
Maximum Mutation Reinforcement Learning for Scalable Control
Maximum Reward Formulation In Reinforcement Learning
MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining
Meaning to Form: Measuring Systematicity as Information
Measuring Emotions in the COVID-19 Real World Worry Dataset
Measuring Forecasting Skill from Text
Measuring Impact of Climate Change on Tree Species: analysis of JSDM on FIA data
Measuring Information Propagation in Literary Social Networks
Measuring Non-Expert Comprehension of Machine Learning Fairness Metrics
Measuring Thematic Fit with Distributional Feature Overlap
Measuring Visual Generalization in Continuous Control from Pixels
Median Matrix Completion: from Embarrassment to Optimality
Memory-enhanced Decoder for Neural Machine Translation
Mention Extraction and Linking for SQL Query Generation
Merge and Label: A novel neural network architecture for nested NER
Message Passing Query Embedding
Message Passing for Hyper-Relational Knowledge Graphs
Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining
Meta Learning Deep Visual Words for Fast Video Object Segmentation
Meta-Learning for Few-Shot NMT Adaptation
Meta-Learning with Shared Amortized Variational Inference
Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling
Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks
Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient
Meta-Transfer Learning for Code-Switched Speech Recognition
Meta-learning with Stochastic Linear Bandits
MetaFun: Meta-Learning with Iterative Functional Updates
Microblog Hashtag Generation via Encoding Conversation Contexts
Mimicking Word Embeddings using Subword RNNs
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance
Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack
Minimax Pareto Fairness: A Multi Objective Perspective
Minimax Testing of Identity to a Reference Ergodic Markov Chain
Minimax Weight and Q-Function Learning for Off-Policy Evaluation
Minimizing Dynamic Regret and Adaptive Regret Simultaneously
Minimizing Interference and Selection Bias in Network Experiment Design
Mining Discourse Markers for Unsupervised Sentence Representation Learning
Mining Documentation to Extract Hyperparameter Schemas
Mirror Descent Policy Optimization
Missing Data Imputation using Optimal Transport
Mitigating Gender Bias Amplification in Distribution by Posterior Regularization
Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning
Mitigating Gender Bias in Machine Translation with Target Gender Annotations
Mitigating Gender Bias in Natural Language Processing: Literature Review
Mitigating Leakage in Federated Learning with Trusted Hardware
Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments
Mitigating Overfitting in Supervised Classification from Two Unlabeled Datasets: A Consistent Risk Correction Approach
Mitigating Uncertainty in Document Classification
MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification
Mixed Strategies for Robust Optimization of Unknown Objectives
MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform
MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time
Mobile-Based Deep Learning Models for Banana Diseases Detection
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
Model Fusion with Kullback-Leibler Divergence
Model selection for contextual bandits
Model-Agnostic Counterfactual Explanations for Consequential Decisions
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal
Model-Based Visual Planning with Self-Supervised Functional Distances
Modeling Cloud Reflectance Fields using Conditional Generative Adversarial Networks
Modeling Continuous Stochastic Processes with Dynamic Normalizing Flows
Modeling Discourse Structure for Document-level Neural Machine Translation
Modeling Empathy and Distress in Reaction to News Stories
Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs
Modeling Label Semantics for Predicting Emotional Reactions
Modeling Long Context for Task-Oriented Dialogue State Generation
Modeling Naive Psychology of Characters in Simple Commonsense Stories
Modeling Protagonist Emotions for Emotion-Aware Storytelling
Modeling Recurrence for Transformer
Modeling Semantic Compositionality with Sememe Knowledge
Modeling Semantic Expectation: Using Script Knowledge for Referent Prediction
Modeling Semantic Plausibility by Injecting World Knowledge
Modeling Source Syntax for Neural Machine Translation
Modeling Subjective Assessments of Guilt in Newspaper Crime Narratives
Modeling the Music Genre Perception across Language-Bound Cultures
Modeling, Visualization, and Analysis of African Innovation Performance
Modelling Lexical Ambiguity with Density Matrices
Modelling Suspense in Short Stories as Uncertainty Reduction over Neural Representation
Modular Block-diagonal Curvature Approximations for Feedforward Architectures
Modularized Transfomer-based Ranking Framework
Modulated Fusion using Transformer for Linguistic-Acoustic Emotion Recognition
Modulating Surrogates for Bayesian Optimization
MojiTalk: Generating Emotional Responses at Scale
Molecule Edit Graph Attention Network: Modeling Chemical Reactions as Sequences of Graph Edits
Momentum Improves Normalized SGD
Momentum in Reinforcement Learning
Moniqua: Modulo Quantized Communication in Decentralized SGD
Monitoring and explainability of models in production
More Data Can Expand the Generalization Gap Between Adversarially Robust and Standard Models
More Information Supervised Probabilistic Deep Face Embedding Learning
More Powerful Selective Kernel Tests for Feature Selection
Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules
Morphological Irregularity Correlates with Frequency
Morphological Segmentation Inside-Out
MuTual: A Dataset for Multi-Turn Dialogue Reasoning
Multi-Agent Determinantal Q-Learning
Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition
Multi-Attribute Bayesian Optimization With Interactive Preference Learning
Multi-Dimensional Gender Bias Classification
Multi-Domain Dialogue Acts and Response Co-Generation
Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing
Multi-Fact Correction in Abstractive Text Summarization
Multi-Hop Knowledge Graph Reasoning with Reward Shaping
Multi-Instance Multi-Label Learning Networks for Aspect-Category Sentiment Analysis
Multi-Level Matching and Aggregation Network for Few-Shot Relation Classification
Multi-Modal Generative Adversarial Network for Short Product Title Generation in Mobile E-Commerce
Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model
Multi-Objective Molecule Generation using Interpretable Substructures
Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification
Multi-Principal Assistance Games
Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation
Multi-Relational Question Answering from Narratives: Machine Reading and Reasoning in Simulated Worlds
Multi-Sentence Argument Linking
Multi-Source Unsupervised Hyperparameter Optimization
Multi-Step Inference for Reasoning Over Paragraphs
Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction
Multi-Task Learning in Histo-pathology for Widely Generalizable Model
Multi-Task Networks With Universe, Group, and Task Feature Learning
Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media
Multi-Task Reinforcement Learning with Soft Modularization
Multi-Task Video Captioning with Video and Entailment Generation
Multi-Unit Transformers for Neural Machine Translation
Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization
Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles
Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning
Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning
Multi-hop Inference for Question-driven Summarization
Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs
Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs
Multi-lingual neural title generation for e-Commerce browse pages
Multi-objective Bayesian Optimization using Pareto-frontier Entropy
Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction
Multi-step Greedy Reinforcement Learning Algorithms
Multi-task Learning for Multilingual Neural Machine Translation
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Multi-task Reinforcement Learning with a Planning Quasi-Metric
Multi-turn Response Selection using Dialogue Dependency Relations
Multi-view Story Characterization from Movie Plot Synopses and Reviews
MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale
MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension
MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech
MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines
Multidimensional Persistence Module Classification via Lattice-Theoretic Convolutions
Multidirectional Associative Optimization of Function-Specific Word Representations
Multigrid Neural Memory
Multilevel Text Alignment with Cross-Document Attention
Multilinear Latent Conditioning for Generating Unseen Attribute Combinations
Multilingual AMR-to-Text Generation
Multilingual Constituency Parsing with Self-Attention and Pre-Training
Multilingual Denoising Pre-training for Neural Machine Translation
Multilingual Factor Analysis
Multilingual Jointly Trained Acoustic and Written Word Embeddings
Multilingual Offensive Language Identification with Cross-lingual Embeddings
Multilingual Universal Sentence Encoder for Semantic Retrieval
Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment
Multimodal Emoji Prediction
Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product
Multimodal Language Analysis with Recurrent Multistage Fusion
Multimodal Machine Translation with Embedding Prediction
Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis
Multimodal Self-Supervised Learning for Medical Image Analysis
Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems
Multimodal and Multi-view Models for Emotion Recognition
Multinomial Logit Bandit with Low Switching Cost
Multiple Instance Learning Networks for Fine-Grained Sentiment Analysis
Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis
Multiscale Collaborative Deep Models for Neural Machine Translation
Musical Word Embedding: Bridging the Gap between Listening Contexts and Music
Mutual Information Maximization for Simple and Accurate Part-Of-Speech Induction
My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits
NADS: Neural Architecture Distribution Search for Uncertainty Awareness
NARMADA: Need and Available Resource Managing Assistant for Disasters and Adversities
NASH: Toward End-to-End Neural Architecture for Generative Semantic Hashing
NAT: Noise-Aware Training for Robust Neural Sequence Labeling
NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation
NGBoost: Natural Gradient Boosting for Probabilistic Prediction
NILE: Natural Language Inference with Faithful Natural Language Explanations
NLP Scholar: An Interactive Visual Explorer for Natural Language Processing Literature
NSTM: Real-Time Query-Driven News Overview Composition at Bloomberg
Naive Exploration is Optimal for Online LQR
Naive Feature Selection: Sparsity in Naive Bayes
Nakdan: Professional Hebrew Diacritizer
Named Entity Recognition Only from Word Embeddings
Named Entity Recognition as Dependency Parsing
Named Entity Recognition for Social Media Texts with Semantic Augmentation
Named Entity Recognition without Labelled Data: A Weak Supervision Approach
Native Language Cognate Effects on Second Language Lexical Choice
Natural Language Comprehension with the EpiReader
Natural Language Processing with Small Feed-Forward Networks
Natural language processing for achieving sustainable development: the case of neural labelling to enhance community profiling
Naturalizing a Programming Language via Interactive Learning
Navigating the Dynamics of Financial Embeddings over Time
Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation
Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling
Near-Optimal Algorithms for Minimax Optimization
Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond
Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding
Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification
Near-optimal Regret Bounds for Stochastic Shortest Path
Nearly Linear Row Sampling Algorithm for Quantile Regression
Necessary and Sufficient Geometries for Gradient Methods
Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly
Negative Training for Neural Dialogue Response Generation
Negative sampling in semi-supervised learning
Neighborhood Growth Determines Geometric Priors for Relational Representation Learning
Neighborhood Matching Network for Entity Alignment
Nested Named Entity Recognition via Second-best Sequence Learning and Decoding
Nested Reasoning About Autonomous Agents Using Probabilistic Programs
Nested Subspace Arrangement for Representation of Relational Data
Neural AMR: Sequence-to-Sequence Models for Parsing and Generation
Neural Abstract Reasoner
Neural Argument Generation Augmented with Externally Retrieved Evidence
Neural Bipartite Matching
Neural CRF Model for Sentence Alignment in Text Simplification
Neural CRF Parsing
Neural Clustering Processes
Neural Contextual Bandits with UCB-based Exploration
Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking
Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence
Neural Decomposition: Functional ANOVA with Variational Autoencoders
Neural Deepfake Detection with Factual Structure of Text
Neural Differential Equations for Single Image Super-resolution
Neural Discourse Structure for Text Categorization
Neural Dynamic Policies for End-to-End Sensorimotor Learning
Neural End-to-End Learning for Computational Argumentation Mining
Neural Fine-Grained Entity Type Classification with Hierarchy-Aware Loss
Neural Generation of Dialogue Response Timings
Neural Grammatical Error Correction with Finite State Transducers
Neural Kernels Without Tangents
Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State
Neural Latent Relational Analysis to Capture Lexical Semantic Relations in a Vector Space
Neural Legal Judgment Prediction in English
Neural Machine Translation of Text from Non-Native Speakers
Neural Machine Translation via Binary Code Prediction
Neural Machine Translation with Source-Side Latent Graph Parsing
Neural Manifold Ordinary Differential Equations
Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation
Neural Metaphor Detection in Context
Neural Models for Documents with Metadata
Neural Open Information Extraction
Neural Operator: Graph Kernel Network for Partial Differential Equations
Neural Ordinary Differential Equations on Manifolds
Neural Proof Nets
Neural Related Work Summarization with a Joint Context-driven Attention Mechanism
Neural Responding Machine for Short-Text Conversation
Neural Segmental Hypergraphs for Overlapping Mention Recognition
Neural Simultaneous Speech Translation Using Alignment-Based Chunking
Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision
Neural Syntactic Preordering for Controlled Paraphrase Generation
Neural Temporal Opinion Modelling for Opinion Prediction on Twitter
Neural Text Generation from Structured Data with Application to the Biography Domain
Neural Topic Modeling by Incorporating Document Relationship Graph
Neural Topic Modeling with Bidirectional Adversarial Training
Neural Topic Modeling with Continual Lifelong Learning
Neural Topic Modeling with Cycle-Consistent Adversarial Training
Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting
Neural Word Segmentation with Rich Pretraining
Neural models of factuality
Neural reparameterization improves structural optimization
Neural versus Phrase-Based Machine Translation Quality: a Case Study
NeuralREG: An end-to-end approach to referring expression generation
Neurals Networks for Projecting Named Entities from English to Ewondo
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"
New Oracle-Efficient Algorithms for Private Synthetic Data Release
New Potential-Based Bounds for Prediction with Expert Advice
New Protocols and Negative Results for Textual Entailment Data Collection
Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling
No Permanent Friends or Enemies: Tracking Relationships between Nations from News
No-Regret Prediction in Marginally Stable Systems
Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency
Noise-tolerant, Reliable Active Classification with Comparison Queries
Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization
Non-Autoregressive Machine Translation with Latent Alignments
Non-Parametric Calibration for Classification
Non-Projective Dependency Parsing with Non-Local Transitions
Non-convex Learning via Replica Exchange Stochastic Gradient MCMC
Non-exchangeable feature allocation models with sublinear growth of the feature sizes
Non-linear interlinkages and key objectives amongst the Paris Agreement and the Sustainable Development Goals
Nonmyopic Gaussian Process Optimization with Macro-Actions
Nonparametric Estimation in the Dynamic Bradley-Terry Model
Nonparametric Score Estimators
Norm-Based Curriculum Learning for Neural Machine Translation
Normalized Flat Minima: Exploring Scale Invariant Definition of Flat Minima for Neural Networks using PAC-Bayesian Analysis
Normalized Loss Functions for Deep Learning with Noisy Labels
Normalizing Flows Across Dimensions
Normalizing Flows on Tori and Spheres
Normalizing Flows with Multi-Scale Autoregressive Priors
Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation
Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection
Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers
Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings
NwQM: A neural quality assessment framework for Wikipedia
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts
ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits
Obfuscation for Privacy-preserving Syntactic Parsing
Obfuscation via Information Density Estimation
Object Ordering with Bidirectional Matchings for Visual Reasoning
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
Obtaining Adjustable Regularization for Free via Iterate Averaging
Obtaining Faithful Interpretations from Compositional Neural Networks
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers
Off-Policy Actor-Critic with Shared Experience Replay
Offline Meta-Reinforcement Learning with Advantage Weighting
Old Dog Learns New Tricks: Randomized UCB for Bandit Problems
On Contrastive Learning for Likelihood-free Inference
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
On Coresets For Regularized Regression
On Cross-Dataset Generalization in Automatic Detection of Online Abuse
On Detecting Data Pollution Attacks On Recommender Systems Using Sequential GANs
On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data
On Dimensional Linguistic Properties of the Word Embedding Space
On Effective Parallelization of Monte Carlo Tree Search
On Efficient Constructions of Checkpoints
On Efficient Low Distortion Ultrametric Embedding
On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models
On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation
On Extractive and Abstractive Neural Document Summarization with Transformer Language Models
On Faithfulness and Factuality in Abstractive Summarization
On Generalization Bounds of a Family of Recurrent Neural Networks
On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems
On Graph Classification Networks, Datasets and Baselines
On Incorporating Structural Information to improve Dialogue Response Generation
On Iterative Neural Network Pruning, Reinitialization, and the Similarity of Masks
On Layer Normalization in the Transformer Architecture
On Learning Language-Invariant Representations for Universal Machine Translation
On Learning Sets of Symmetric Elements
On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration
On Losses for Modern Language Models
On Maximization of Weakly Modular Functions: Guarantees of Multi-stage Algorithms, Tractability, and Hardness
On Measuring Social Biases in Sentence Encoders
On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment
On Optimal Transformer Depth for Low-Resource Language Translation
On Polynomial Approximations for Privacy-Preserving and Verifiable ReLU Networks
On Primes, Log-Loss Scores and (No) Privacy
On Random Subsampling of Gaussian Process Regression: A Graphon-Based Analysis
On Second-Order Group Influence Functions for Black-Box Predictions
On Suboptimality of Least Squares with Application to Estimation of Convex Bodies
On The Evaluation of Machine Translation Systems Trained With Back-Translation
On Thompson Sampling for Smoother-than-Lipschitz Bandits
On Unbalanced Optimal Transport: An Analysis of Sinkhorn Algorithm
On Using Very Large Target Vocabulary for Neural Machine Translation
On Variational Learning of Controllable Representations for Text without Supervision
On conditional versus marginal bias in multi-armed bandits
On the Benefits of Models with Perceptually-Aligned Gradients
On the Choice of Auxiliary Languages for Improved Sequence Tagging
On the Complementary Nature of Knowledge Graph Embedding, Fine Grain Entity Types, and Language Modeling
On the Computational Power of Transformers and its Implications in Sequence Modeling
On the Consistency of Top-k Surrogate Losses
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
On the Convergence of Continuous Constrained Optimization for Structure Learning
On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
On the Convergence of SARAH and Beyond
On the Convergence of Stochastic Gradient Descent with Low-Rank Projections for Convex Low-Rank Matrix Problems
On the Cross-lingual Transferability of Monolingual Representations
On the Encoder-Decoder Incompatibility in Variational Text Modeling and Beyond
On the Expressivity of Neural Networks for Deep Reinforcement Learning
On the Frailty of Universal POS Tags for Neural UD Parsers
On the Generalization Benefit of Noise in Stochastic Gradient Descent
On the Global Convergence Rates of Softmax Policy Gradient Methods
On the Idiosyncrasies of the Mandarin Chinese Classifier System
On the Inference Calibration of Neural Machine Translation
On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation
On the Limitations of Unsupervised Bilingual Dictionary Induction
On the Linguistic Representational Power of Neural Machine Translation Models
On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels
On the Noisy Gradient Descent that Generalizes as SGD
On the Number of Linear Regions of Convolutional Neural Networks
On the Practical Computational Power of Finite Precision RNNs for Language Recognition
On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation
On the Robustness of Language Encoders against Grammatical Errors
On the Role of Supervision in Unsupervised Constituency Parsing
On the Sample Complexity of Adversarial Multi-Source PAC Learning
On the Sample Complexity of Learning Sum-Product Networks
On the Sentence Embeddings from Pre-trained Language Models
On the Sparsity of Neural Machine Translation Models
On the Spontaneous Emergence of Discrete and Compositional Signals
On the Theoretical Properties of the Network Jackknife
On the Unreasonable Effectiveness of the Greedy Algorithm: Greedy Adapts to Sharpness
On the diminishing return of labeling clinical reports
On the importance of pre-training data volume for compact language models
On the interplay between noise and curvature and its effect on optimization and generalization
On the optimality of kernels for high-dimensional clustering
On the space-time expressivity of ResNets
On-The-Fly Information Retrieval Augmentation for Language Models
One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control
One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases
One Size Fits All: Can We Train One Denoiser for All Noise Levels?
One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL
Online Continuous DR-Submodular Maximization with Long-Term Budget Constraints
Online Conversation Disentanglement with Pointer Networks
Online Dense Subgraph Discovery via Blurred-Graph Feedback
Online Forecasting of Total-Variation-bounded Sequences
Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies
Online Learning Using Only Peer Prediction
Online Learning for Active Cache Synchronization
Online Learning with Continuous Variations: Dynamic Regret and Reductions
Online Learning with Imperfect Hints
Online Pricing with Offline Data: Phase Transition and Inverse Square Law
Online Safety Assurance for Deep Reinforcement Learning
Online Segment to Segment Neural Transduction
Online metric algorithms with untrusted predictions
Online mirror descent and dual averaging: keeping pace in the dynamic case
Open Domain Event Extraction Using Neural Latent Variable Models
Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text
Open Korean Corpora: A Practical Report
Operation-Aware Soft Channel Pruning using Differentiable Masks
OpinionDigest: A Simple Framework for Opinion Summarization
Opportunistic Decoding with Timely Correction for Simultaneous Translation
Optimal Client Sampling for Federated Learning
Optimal Continual Learning has Perfect Memory and is NP-hard
Optimal Randomized First-Order Methods for Least-Squares Problems
Optimal Robust Learning of Discrete Distributions from Batches
Optimal Transport-based Alignment of Learned Character Representations for String Similarity
Optimal approximation for unconstrained non-submodular minimization
Optimal group testing
Optimal transport mapping via input convex neural networks
Optimistic Policy Optimization with Bandit Feedback
Optimistic bounds for multi-output prediction
Optimization Theory for ReLU Neural Networks Trained with Normalization Layers
Optimization from Structured Samples for Coverage Functions
Optimization of Graph Total Variation via Active-Set-based Combinatorial Reconditioning
Optimized Score Transformation for Fair Classification
Optimizer Benchmarking Needs to Account for Hyperparameter Tuning
Optimizing Black-box Metrics with Adaptive Surrogates
Optimizing Data Usage via Differentiable Rewards
Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning
Optimizing Millions of Hyperparameters by Implicit Differentiation
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Option Discovery in the Absence of Rewards with Manifold Analysis
Oracle Efficient Private Non-Convex Optimization
Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization
Ordinal Non-negative Matrix Factorization for Recommendation
Orthogonal Gradient Descent for Continual Learning
Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding
Orthogonalized SGD and Nested Architectures for Anytime Neural Networks
Out of the Echo Chamber: Detecting Countering Debate Speeches
Overcoming Language Variation in Sentiment Analysis with Social Attention
Overfitting in adversarially robust deep learning
P-SIF: Document Embeddings Using Partition Averaging
PAC Bounds for Imitation and Model-based Batch Learning of Contextual Markov Decision Processes
PAC learning with stable and private predictions
PACRR: A Position-Aware Neural IR Model for Relevance Matching
PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization
PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation
PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation
PAN: Path Integral Based Convolution for Deep Graph Neural Networks
PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge
PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
PENNI: Pruned Kernel Sharing for Efficient CNN Inference
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models
PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation
PLAS: Latent Action Space for Offline Reinforcement Learning
PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable
POPCORN: Partially Observed Prediction COnstrained ReiNforcement Learning
POSEIDON: Privacy-Preserving Federated Neural Network Learning
PRover: Proof Generation for Interpretable Reasoning over Rules
PackIt: A Virtual Environment for Geometric Planning
Pan-Private Uniformity Testing
Parallel Algorithm for Non-Monotone DR-Submodular Maximization
Parallel Corpus Filtering via Pre-trained Language Models
Parallel Data Augmentation for Formality Style Transfer
Parallel Interactive Networks for Multi-Domain Dialogue State Generation
Parallels Between Phase Transitions and Circuit Complexity?
Parameters Estimation from the 21 cm signal using Variational Inference
Parametric Gaussian Process Regressors
Paraphrase Augmented Task-Oriented Dialog Generation
Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity
Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations
PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks
Pareto Probing: Trading Off Accuracy for Complexity
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning
Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information
Parsing as Reduction
Partial Trace Regression and Low-Rank Kraus Decomposition
Partially-Aligned Data-to-Text Generation with Distant Supervision
Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages
Pathologies of Neural Models Make Interpretations Difficult
Patient-Specific Effects of Medication Using Latent Force Models with Gaussian Processes
PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D
PeTra: A Sparsely Supervised Memory Model for People Tracking
Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates
Perceptual Generative Autoencoders
Performative Prediction
Permutation Invariant Graph Generation via Score-Based Generative Modeling
Permutation invariant networks to learn Wasserstein metrics
PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures
Personality Trait Detection Using Bagged SVM over BERT Word Embedding Ensembles
Personalized Language Model for Query Auto-Completion
Personalized Neural Embeddings for Collaborative Filtering with Text
Personalized neural language models for real-world query auto completion
Personalizing Dialogue Agents: I have a dog, do you have pets too?
Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good
Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT
Pessimism About Unknown Unknowns Inspires Conservatism
Phone Features Improve Speech Translation
Phonetic and Visual Priors for Decipherment of Informal Romanization
Phonotactic Complexity and its Trade-offs
Phrase-Based & Neural Unsupervised Machine Translation
Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension
Pieces of Eight: 8-bit Neural Machine Translation
Piecewise Linear Regression via a Difference of Convex Functions
Planning from Pixels using Inverse Dynamics Models
Planning to Explore via Self-Supervised World Models
Playing 20 Question Game with Policy-Based Reinforcement Learning
Playing Text-Adventure Games with Graph-Based Deep Reinforcement Learning
Please Mind the Root: Decoding Arborescences for Dependency Parsing
PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking
Plug and Play Autoencoders for Conditional Text Generation
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination
Pointer Graph Networks
Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions
Pointwise Paraphrase Appraisal is Potentially Problematic
Poisson Learning: Graph Based Semi-Supervised Learning At Very Low Label Rates
Policy Gradient as a Proxy for Dynamic Oracles in Constituency Parsing
Policy Learning Using Weak Supervision
Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations
Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning
Politeness Transfer: A Tag and Generate Approach
Political Advertising Dataset: the use case of the Polish 2020 Presidential Elections
PolyGen: An Autoregressive Generative Model of 3D Meshes
Polyglot Semantic Parsing in APIs
Polyglot Semantic Role Labeling
Population Mapping in Informal Settlements with High-Resolution Satellite Imagery and Equitable Ground-Truth
Population-Based Black-Box Optimization for Biological Sequence Design
Position-Aware Tagging for Aspect Sentiment Triplet Extraction
Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information
Posterior Calibrated Training on Sentence Classification Tasks
Posterior Control of Blackbox Generation
PowerNorm: Rethinking Batch Normalization in Transformers
PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction
Pragmatically Informative Image Captioning with Character-Level Inference
Pragmatically Informative Text Generation
Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following
Pre-Training Transformers as Energy-Based Cloze Models
Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders
Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning
Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information
Pre-training for Abstractive Document Summarization by Reinstating Source Text
Pre-training on high-resource speech recognition improves low-resource speech-to-text translation
PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution
Precise Task Formalization Matters in Winograd Schema Evaluations
Precise Tradeoffs in Adversarial Training for Linear Regression
Predicting Choice with Set-Dependent Aggregation
Predicting Clinical Trial Results by Implicit Evidence Integration
Predicting Declension Class from Form and Meaning
Predicting In-game Actions from Interviews of NBA Players
Predicting Native Language from Gaze
Predicting Performance for Natural Language Processing Tasks
Predicting Semantic Relations using Global Graph Properties
Predicting Unplanned Readmissions with Highly Unstructured Data
Predicting and Analyzing Law-Making in Kenya
Prediction Focused Topic Models via Feature Selection
Prediction of Bayesian Intervals for Tropical Storms
Prediction of neonatal mortality in Sub-Saharan African countries using data-level linkage of multiple surveys
Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview
Predictive Coding for Locally-Linear Control
Predictive Multiplicity in Classification
Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning
Predictive Sampling with Forecasting Autoregressive Models
Pretrained Language Model Embryology: The Birth of ALBERT
Pretrained Transformers Improve Out-of-Distribution Robustness
Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models
Principal Neighbourhood Aggregation for Graph Nets
Principled learning method for Wasserstein distributionally robust optimization with local perturbations
Privacy Amplification by Decentralization
Privacy-Preserving XGBoost Inference
Privacy-preserving Neural Representations of Text
Privacy-preserving collaborative machine learning on genomic data using TensorFlow
Private Outsourced Bayesian Optimization
Private Query Release Assisted by Public Data
Private Reinforcement Learning with PAC and Regret Guarantees
Private Stochastic Convex Optimization: Optimal Rates in Linear Time
Privately Learning Markov Random Fields
Privately Learning Thresholds: Closing the Exponential Gap
Privately detecting changes in unknown distributions
Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering
Probabilistic FastText for Multi-Sense Word Embeddings
Probabilistic Frame Induction
Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling
Probabilistic Typology: Deep Generative Models of Vowel Inventories
Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order
Probing Emergent Semantics in Predictive Agents via Question Answering
Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction
Probing Linguistic Systematicity
Probing Neural Dialog Models for Conversational Understanding
Probing Pretrained Language Models for Lexical Semantics
Probing Task-Oriented Dialogue Representation from Language Models
Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings
Probing the Need for Visual Context in Multimodal Machine Translation
Problems with Shapley-value-based explanations as feature importance measures
Profile Consistency Identification for Open-domain Dialogue Agents
Program Enhanced Fact Verification with Verbalization and Graph Attention Network
Progressive Graph Learning for Open-Set Domain Adaptation
Progressive Growing of Neural ODEs
Progressive Identification of True Labels for Partial-Label Learning
Progressive growing of self-organized hierarchical representations for exploration
Projective Preferential Bayesian Optimization
Pronoun-Targeted Fine-tuning for NMT with Hybrid Losses
Proper Learning, Helly Number, and an Optimal SVM Bound
Proper Network Interpretability Helps Adversarial Robustness in Classification
Prophets, Secretaries, and Maximizing the Probability of Choosing the Best
ProtoQA: A Question Answering Dataset for Prototypical Common-Sense Reasoning
Provable Representation Learning for Imitation Learning via Bi-level Optimization
Provable Self-Play Algorithms for Competitive Reinforcement Learning
Provable Smoothness Guarantees for Black-Box Variational Inference
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation
Provably Efficient Exploration in Policy Optimization
Provably Efficient Model-based Policy Adaptation
Provably Efficient Reinforcement Learning with Linear Function Approximation
Proving the Lottery Ticket Hypothesis: Pruning is All You Need
Prta: A System to Support the Analysis of Propaganda Techniques in the News
Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering
Pun Generation with Surprise
Putting An End to End-to-End: Gradient-Isolated Learning of Representations
PuzzLing Machines: A Challenge on Learning From Small Data
Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup
PyHessian: Neural Networks Through the Lens of the Hessian
PyMT5: multi-mode translation of natural language and Python code with transformers
PySBD: Pragmatic Sentence Boundary Disambiguation
Pyramid Convolutional RNN for MRI Reconstruction
Q-learning with Language Model for Edit-based Unsupervised Summarization
Q-value Path Decomposition for Deep Multiagent Reinforcement Learning
QA2Explanation: Generating and Evaluating Explanations for Question Answering Systems over Knowledge Graph
QuASE: Question-Answer Driven Sentence Encoding
Quantifying Attention Flow in Transformers
Quantifying Differences in Reward Functions
Quantifying Intimacy in Language
Quantifying Privacy Leakage in Graph Embedding
Quantifying Similarity between Relations with Fact Distribution
Quantifying the Effects of COVID-19 on Mental Health Support Forums
Quantitative Argument Summarization and Beyond: Cross-Domain Key Point Analysis
Quantitative stability of optimal transport maps and linearization of the 2-Wasserstein space
Quantized Decentralized Stochastic Learning over Directed Graphs
Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free
Quantum Boosting
Quantum Expectation-Maximization for Gaussian Mixture Models
Quaternion Graph Neural Networks
Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation
Question Directed Graph Attention Network for Numerical Reasoning over Text
R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games
R4C: A Benchmark for Evaluating RC Systems to Get the Right Answer for the Right Reason
RAMP-CNN: A Novel Neural Network for Enhanced Automotive Radar Object Recognition
RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers
RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization
RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information
RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr
RL agents Implicitly Learning Human Preferences
RNNs can generate bounded hierarchical languages with optimal memory
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles
RPD: A Distance Function Between Word Embeddings
Radial Bayesian Neural Networks: Beyond Discrete Support In Large-Scale Bayesian Deep Learning
Radioactive data: tracing through training
Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization
Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures
Random Search and Reproducibility for Neural Architecture Search
Random extrapolation for primal-dual coordinate descent
Randomized Block-Diagonal Preconditioning for Parallel Learning
Randomized Exploration in Generalized Linear Bandits
Randomized Smoothing of All Shapes and Sizes
Randomly Projected Additive Gaussian Processes for Regression
Rank and run-time aware compression of NLP Applications
Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering
Ranking and Selecting Multi-Hop Knowledge Paths to Better Predict Human Needs
Rapid Adaptation of Neural Machine Translation to New Languages
Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned
Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space
Rational Recurrences
Rationalizing Medical Relation Prediction from Corpus-level Statistics
Rationalizing Neural Predictions
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport
Re-evaluating Evaluation in Text Summarization
Re-translation versus Streaming for Simultaneous Translation
ReLU Code Space: A Basis for Rating Network Quality Besides Accuracy
Reactive Supervision: A New Method for Collecting Sarcasm Data
Reading Between the Lines: Exploring Infilling in Visual Narratives
Ready Policy One: World Building Through Active Learning
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
Real-Time Optimisation for Online Learning in Auctions
Real-time Classification from Short Event-Camera Streams using Input-filtering Neural ODEs
Reasoning About Generalization via Conditional Mutual Information
Reasoning About Pragmatics with Neural Listeners and Speakers
Reasoning Over History: Context Aware Visual Dialog
Reasoning Over Semantic-Level Graph for Fact Checking
Reasoning about Actions and State Changes by Injecting Commonsense Knowledge
Reasoning about Goals, Steps, and Temporal Ordering with WikiHow
Reasoning with Latent Structure Refinement for Document-Level Relation Extraction
Reasoning with Sarcasm by Reading In-between
Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes
Recipes for building an open-domain chatbot
Recognizing Implicit Discourse Relations via Repeated Reading: Neural Networks with Multi-Level Attention
Recovery of Sparse Signals from a Mixture of Linear Samples
Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension
Recurrent Event Network: Autoregressive Structure Inference over Temporal Knowledge Graphs
Recurrent Hierarchical Topic-Guided RNN for Language Generation
Recurrent Interaction Network for Jointly Extracting Entities and Classifying Relations
Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment
Recurrent Neural Networks as Weighted Language Recognizers
Recurrent Neural Networks in Linguistic Theory: Revisiting Pinker and Prince (1988) and the Past Tense Debate
Recurrent babbling: evaluating the acquisition of grammar from limited input data
Recursive Subtree Composition in LSTM-Based Dependency Parsing
Reducibility and Statistical-Computational Gaps from Secret Leakage
Reducing Gender Bias in Abusive Language Detection
Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem
Reducing Sampling Error in Batch Temporal Difference Learning
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts
Refined bounds for algorithm configuration: The knife-edge of dual class approximability
Reflection-based Word Attribute Transfer
Reformulating Unsupervised Style Transfer as Paraphrase Generation
Regression Networks for Meta-Learning Few-Shot Classification
Regularity as Regularization: Smooth and Strongly Convex Brenier Potentials in Optimal Transport
Regularization via Structural Label Smoothing
Regularized Autoencoders via Relaxed Injective Probability Flow
Regularized Context Gates on Transformer for Machine Translation
Regularized Inverse Reinforcement Learning
Regularized Optimal Transport is Ground Cost Adversarial
Regularizing Dialogue Generation by Imitating Implicit Scenarios
Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus
Reinforcement Learning Generalization with Surprise Minimization
Reinforcement Learning based Curriculum Optimization for Neural Machine Translation
Reinforcement Learning for Integer Programming: Learning to Cut
Reinforcement Learning for Molecular Design Guided by Quantum Mechanics
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
Reinforcement Learning through Active Inference
Reinforcement Learning with Chromatic Networks for Compact Architecture Search
Reinforcement Learning with Latent Flow
Relabel the Noise: Joint Extraction of Entities and Relations via Cooperative Multiagents
Relating Simple Sentence Representations in Deep Neural Networks and the Brain
Relation Embedding with Dihedral Group in Knowledge Graph
Relation Extraction with Explanation
Relational Graph Attention Network for Aspect-based Sentiment Analysis
Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment
Relative gradient optimization of the Jacobian term in unsupervised deep learning
Relaxing Bijectivity Constraints with Continuously Indexed Normalising Flows
Relevance of Rotationally Equivariant Convolutions for Predicting Molecular Properties
Reliable Fidelity and Diversity Metrics for Generative Models
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks
Rep the Set: Neural Networks for Learning Set Representations
Replicability Analysis for Natural Language Processing: Testing Significance with Multiple Datasets
Representation Learning for Discovering Phonemic Tone Contours
Representation Learning for Grounded Spatial Reasoning
Representations of language in a model of visually grounded speech signal
Representing Unordered Data Using Complex-Weighted Multiset Automata
Representing and Denoising Wearable ECG Recordings
Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference
Repurposing Entailment for Multi-Hop Question Answering Tasks
Reserve Pricing in Repeated Second-Price Auctions with Strategic Bidders
Reset-Free Lifelong Learning with Skill-Space Planning
Resolution Dependent GAN Interpolation for Controllable Image Synthesis Between Domains
Resolving Spurious Correlations in Causal Models of Environments via Interventions
Response Selection for Multi-Party Conversations with Dynamic Topic Tracking
Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation
RethinkCWS: Is Chinese Word Segmentation a Solved Task?
Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models
Rethinking Dialogue State Tracking with Reasoning
Retrieval-Based Neural Code Generation
Retrofitting Structure-aware Transformer Language Model for End Tasks
Reusability and Transferability of Macro Actions for Reinforcement Learning
Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT
Reverse Engineering Configurations of Neural Text Generation Models
Reverse-Engineering Deep ReLU Networks
Review-based Question Generation with Adaptive Instance Transfer and Augmentation
Revisiting Character-Based Neural Machine Translation with Capacity and Compression
Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy
Revisiting Fundamentals of Experience Replay
Revisiting Joint Modeling of Cross-document Entity and Event Coreference Resolution
Revisiting Low-Resource Neural Machine Translation: A Case Study
Revisiting Modularized Multilingual NMT to Meet Industrial Demands
Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research
Revisiting Stochastic Extragradient
Revisiting Unsupervised Relation Extraction
Revisiting the Context Window for Cross-lingual Word Embeddings
Revisiting the Importance of Encoding Logic Rules in Sentiment Classification
RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling
Rigging the Lottery: Making All Tickets Winners
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
Rigid Formats Controlled Text Generation
RikiNet: Reading Wikipedia Pages for Natural Question Answering
Risk Assessment for Machine Learning Models
Risk Bounds for Learning Multiple Components with Permutation-Invariant Losses
Rk-means: Fast Clustering for Relational Data
Robust Bayesian Classification Using an Optimistic Score Ratio
Robust Cross-lingual Hypernymy Detection using Dependency Context
Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning
Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation
Robust Encodings: A Framework for Combating Adversarial Typos
Robust Learning from Discriminative Feature Feedback
Robust Optimisation Monte Carlo
Robust Outlier Arm Identification
Robust Prediction of Punctuation and Truecasing for Medical ASR
Robust Reinforcement Learning using Adversarial Populations
Robust Variational Autoencoders for Outlier Detection and Repair of Mixed-Type Data
Robust Visual Domain Randomization for Reinforcement Learning
Robust and Private Learning of Halfspaces
Robust and Stable Black Box Explanations
Robust model training and generalisation with Studentising flows
Robust posterior inference when statistically emulating forward simulations
Robustifying Sequential Neural Processes
Robustness for Non-Parametric Classification: A Generic Attack and Defense
Robustness to Programmable String Transformations via Augmented Abstract Training
Robustness to Spurious Correlations via Human Annotations
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark
S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking
S2ORC: The Semantic Scholar Open Research Corpus
S2RMs: Spatially Structured Recurrent Modules
SAFENet: Self-Supervised Monocular Depth Estimation with Semantic-Aware Feature Extraction
SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations
SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations
SDE-Net: Equipping Deep Neural Networks with Uncertainty Estimates
SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks
SECTOR: A Neural Model for Coherent Topic Segmentation and Classification
SGD Learns One-Layer Networks in WGANs
SHAPED: Shared-Private Encoder-Decoder for Text Style Adaptation
SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection
SIGN: Scalable Inception Graph Neural Networks
SIGTYP 2020 Shared Task: Prediction of Typological Features
SIGUA: Forgetting May Make Learning with Noisy Labels More Robust
SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task
SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis
SLEDGE-Z: A Zero-Shot Baseline for COVID-19 Literature Search
SLM: Learning a Discourse Language Representation with Sentence Unshuffling
SLURP: A Spoken Language Understanding Resource Package
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
SMArtCast: Predicting soil moisture interpolations into the future using Earth observation data in a deep learning framework
SOTERIA: In Search of Efficient Neural Networks for Private Inference
SOrT-ing VQA Models : Contrastive Gradient Learning for Improved Consistency
SQuAD: 100,000+ Questions for Machine Comprehension of Text
SRLGRN: Semantic Role Labeling Graph Reasoning Network
SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning
SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness
STARC: Structured Annotations for Reading Comprehension
STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization
SUPP.AI: Finding Evidence for Supplement-Drug Interactions
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference
SacreROUGE: An Open-Source Library for Using and Developing Summarization Evaluation Metrics
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences
Safe Reinforcement Learning in Constrained Markov Decision Processes
Safe Reinforcement Learning with Natural Language Constraints
SafeCity: Understanding Diverse Forms of Sexual Harassment Personal Stories
Saliency Learning: Teaching the Model Where to Pay Attention
SalsaNext: Fast, Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving
Sample Amplification: Increasing Dataset Size even when Learning is Impossible
Sample Complexity Bounds for 1-bit Compressive Sensing and Binary Stable Embeddings with Generative Priors
Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles
Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication
Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning
Sample-efficient proper PAC learning with approximate differential privacy
Sarcasm Detection in Tweets with BERT and GloVe Embeddings
Sarcasm Detection using Context Separators in Online Discourse
Satellite-based Prediction of Forage Conditions for Livestock in Northern Kenya
Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling
Scalable Deep Generative Modeling for Sparse Graphs
Scalable Differentiable Physics for Learning and Control
Scalable Differential Privacy with Certified Robustness in Adversarial Learning
Scalable Exact Inference in Multi-Output Gaussian Processes
Scalable Gaussian Process Regression for Kernels with a Non-Stationary Phase
Scalable Gradients for Stochastic Differential Equations
Scalable Identification of Partially Observed Systems with Certainty-Equivalent EM
Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering
Scalable Nearest Neighbor Search for Optimal Transport
Scalable Syntax-Aware Language Models Using Knowledge Distillation
Scalable Zero-shot Entity Linking with Dense Entity Retrieval
Scalable and Efficient Comparison-based Search without Features
Scaling Hidden Markov Language Models
Scaling up Hybrid Probabilistic Inference with Logical and Arithmetic Constraints via Message Passing
Scattering GCN: Overcoming Oversmoothness in Graph Convolutional Networks
Scene Graph Parsing as Dependency Parsing
Scene Graph Reasoning for Visual Question Answering
Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension
SciDTB: Discourse Dependency TreeBank for Scientific Abstracts
SciREX: A Challenge Dataset for Document-Level Information Extraction
Score Combination for Improved Parallel Corpus Filtering for Low Resource Conditions
Scoring Lexical Entailment with a Supervised Directional Similarity Network
Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions
Screenplay Quality Assessment: Can We Predict Who Gets Nominated?
Screenplay Summarization Using Latent Narrative Structure
ScriptWriter: Narrative-Guided Script Generation
Secure Medical Image Analysis with CrypTFlow
Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation
Selecting Machine-Translated Data for Quick Bootstrapping of a Natural Language Understanding System
Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets
Selective Attention for Context-aware Neural Machine Translation
Selective Dyna-style Planning Under Limited Model Capacity
Selective Encoding for Abstractive Sentence Summarization
Selective Question Answering under Domain Shift
Self-Attentive Associative Memory
Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation
Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training
Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks
Self-Supervised Policy Adaptation during Deployment
Self-Training for Unsupervised Parsing with PRPN
Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering
Self-supervised Label Augmentation via Input Transformations
SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction
Selfish Robustness and Equilibria in Multi-Player Bandits
Semantic Annotation for Microblog Topics Using Wikipedia Temporal Information
Semantic Drift in Multilingual Representations
Semantic Enrichment of Nigerian Pidgin English for Contextual Sentiment Classification
Semantic Graphs for Generating Deep Questions
Semantic Label Smoothing for Sequence to Sequence Problems
Semantic Parsing for Task Oriented Dialog using Hierarchical Representations
Semantic Parsing to Probabilistic Programs for Situated Question Answering
Semantic Parsing with Dual Learning
Semantic Parsing with Semi-Supervised Sequential Autoencoders
Semantic Role Labeling Guided Multi-turn Dialogue ReWriter
Semantic Role Labeling as Syntactic Dependency Parsing
Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data
Semantic Scaffolds for Pseudocode-to-Code Generation
Semantic Structural Evaluation for Text Simplification
Semantic expressive capacity with bounded memory
Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems
Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems
Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems
Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components
Semi-Supervised Bilingual Lexicon Induction with Two-way Interaction
Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation
Semi-Supervised Learning with Normalizing Flows
Semi-Supervised QA with Generative Domain-Adaptive Nets
Semi-Supervised StyleGAN for Disentanglement Learning
Semi-supervised User Geolocation via Graph Convolutional Networks
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees
SenseBERT: Driving Some Sense into BERT
Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity
Sentence Simplification with Deep Reinforcement Learning
Sentences with Gapping: Parsing and Reconstructing Elided Predicates
SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics
SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge
Seq2Edits: Sequence Transduction Using Span-level Edit Operations
SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup
Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation
Sequence-Level Knowledge Distillation
Sequence-Level Mixed Sample Data Augmentation
Sequence-to-Action: End-to-End Semantic Graph Generation for Semantic Parsing
Sequential Cooperative Bayesian Inference
Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-based Chatbots
Sequential Transfer in Reinforcement Learning with a Generative Model
Serverless inferencing on Kubernetes
Set Functions for Time Series
Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events
Shape of synth to come: Why we should use synthetic data for English surface realization
Shaping Visual Representations with Language for Few-shot Classification
Shared-Private Bilingual Word Embeddings for Neural Machine Translation
Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models
Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion
Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU
Sharper bounds for uniformly stable algorithms
Sheaf Neural Networks
SherLIiC: A Typed Event-Focused Lexical Inference Benchmark for Evaluating Natural Language Inference
Short-Term Meaning Shift: A Distributional Exploration
Should All Cross-Lingual Embeddings Speak English?
Showing Your Work Doesn't Always Work
SimGANs: Simulator-Based Generative Adversarial Networks for ECG Synthesis to Improve Deep ECG Classification
SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity
Similarity Analysis of Contextual Word Representation Models
Simple Unsupervised Summarization by Contextual Matching
Simple and Deep Graph Convolutional Networks
Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives
Simple and Effective Multi-Paragraph Reading Comprehension
Simple and Effective Text Simplification Using Semantic and Neural Methods
Simple and sharp analysis of k-means
SimpleQuestions Nearly Solved: A New Upperbound and Baseline Approach
Simpler but More Accurate Semantic Dependency Parsing
Simplify the Usage of Lexicon in Chinese NER
Simplifying Neural Machine Translation with Addition-Subtraction Twin-Gated Recurrent Networks
Simulator Calibration under Covariate Shift with Kernels
Simultaneous Inference for Massive Data: Distributed Bootstrap
Simultaneous Machine Translation with Visual Context
Simultaneous Translation Policies: From Fixed to Adaptive
Simultaneous Translation with Flexible Policy via Restricted Imitation Learning
Simultaneous paraphrasing and translation by fine-tuning Transformer models
Single Model Ensemble using Pseudo-Tags and Distinct Vectors
Single Point Transductive Prediction
Single Shot Multitask Pedestrian Detection and Behavior Prediction
Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language
Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation
Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory
Sketch-Driven Regular Expression Generation from Natural Language and Examples
Sketching Transformed Matrices with Applications to Natural Language Processing
Skill Transfer via Partially Amortized Hierarchical Planning
SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling
Small Data, Big Decisions: Model Selection in the Small-Data Regime
Small-GAN: Speeding Up GAN Training Using Core-sets
Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes
Social Bias Frames: Reasoning about Social and Power Implications of Language
Social Biases in NLP Models as Barriers for Persons with Disabilities
Social Chemistry 101: Learning to Reason about Social and Moral Norms
Social Media Attributions in the Context of Water Crisis
Soft Gazetteers for Low-Resource Named Entity Recognition
Soft Threshold Weight Reparameterization for Learnable Sparsity
SoftSort: A Continuous Relaxation for the argsort Operator
Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs
Solving Constrained CASH Problems with ADMM
Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity
Solving General Arithmetic Word Problems
Solving Physics Puzzles by Reasoning about Paths
Source Separation with Deep Generative Priors
Sources of Transfer in Multilingual Named Entity Recognition
Span Selection Pre-training for Question Answering
Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations
Span-based Localizing Network for Natural Language Video Localization
Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3)
SpanBERT: Improving Pre-training by Representing and Predicting Spans
Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
Sparse Gaussian Processes with Spherical Harmonic Features
Sparse Orthogonal Variational Inference for Gaussian Processes
Sparse Overcomplete Word Vector Representations
Sparse Parallel Training of Hierarchical Dirichlet Process Topic Models
Sparse Sinkhorn Attention
Sparse Text Generation
Sparse and Constrained Attention for Neural Machine Translation
Sparse and Low-rank Tensor Estimation via Cubic Sketchings
Sparsified Linear Programming for Zero-Sum Equilibrium Finding
SpatialSim: Recognizing Spatial Configurations of Objects with Graph Neural Networks
Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback
Speaker Sensitive Response Evaluation Model
Speakers Fill Lexical Semantic Gaps with Context
Specialising Word Vectors for Lexical Entailment
Spectral Clustering with Graph Neural Networks for Graph Pooling
Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence
Spectral Subsampling MCMC for Stationary Time Series
Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
Speech Translation and the End-to-End Promise: Taking Stock of Where We Are
Speeding Up Neural Machine Translation Decoding by Cube Pruning
SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check
Spelling Error Correction with Soft-Masked BERT
Split and Rephrase
Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems
Spying on your neighbors: Fine-grained probing of contextual embeddings for information about surrounding words
SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
Stabilizing Bi-Level Hyperparameter Optimization using Moreau-Yosida Regularization
Stabilizing Differentiable Architecture Search via Perturbation-based Regularization
Stabilizing Transformers for Reinforcement Learning
Stack-Pointer Networks for Dependency Parsing
Stance Prediction and Claim Verification: An Arabic Perspective
Stance Prediction for Contemporary Issues: Data and Experiments
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
State Space Expectation Propagation: Efficient Inference Schemes for Temporal Gaussian Processes
Statistical Machine Translation Features with Multitask Tensor Networks
Statistically Efficient Off-Policy Policy Gradients
Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation
Staying True to Your Word: (How) Can Attention Become Explanation?
Stepwise Extractive Summarization and Planning with Structured Transformers
Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning
Stereo Endoscopic Image Super-Resolution Using Disparity-Constrained Parallel Attention
Stimulating Creativity with FunLines: A Case Study of Humor Generation in Headlines
Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization
Stochastic Differential Equations with Variational Wishart Diffusions
Stochastic Flows and Geometric Optimization on the Orthogonal Group
Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization
Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization
Stochastic Gradient and Langevin Processes
Stochastic Hamiltonian Gradient Methods for Smooth Games
Stochastic Latent Residual Video Prediction
Stochastic Linear Contextual Bandits with Diverse Contexts
Stochastic Neural Network with Kronecker Flow
Stochastic Normalizing Flows
Stochastic Optimization for Regularized Wasserstein Estimators
Stochastic Particle-Optimization Sampling and the Non-Asymptotic Convergence Theory
Stochastic Recursive Variance-Reduced Cubic Regularization Methods
Stochastic Regret Minimization in Extensive-Form Games
Stochastic Subspace Cubic Newton Method
Stochastic Top-k ListNet
Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation
Stochastic bandits with arm-dependent delays
Stochastic-YOLO: Efficient Probabilistic Object Detection under Dataset Shifts
Stochastically Dominant Distributional Reinforcement Learning
Stochasticity in Neural ODEs: An Empirical Study
Stolen Probability: A Structural Weakness of Neural Language Models
Stopping criterion for active learning based on deterministic generalization bounds
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance
Strategic Classification is Causal Modeling in Disguise
Strategies for Structuring Story Generation
Strategizing against No-regret Learners
Streamlining Tensor and Network Pruning in PyTorch
Strength from Weakness: Fast Learning Using Weak Supervision
Stretching the Effectiveness of MLE from Accuracy to Bias for Pairwise Comparisons
Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling
Strong Baselines for Neural Semi-supervised Learning under Domain Shift
Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks
Strong and Simple Baselines for Multimodal Utterance Embeddings
Stronger and Faster Wasserstein Adversarial Attacks
StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing
Structural Language Models of Code
Structural Neural Encoders for AMR-to-text Generation
Structural Scaffolds for Citation Intent Classification in Scientific Publications
Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models
Structure Adaptive Algorithms for Stochastic Bandits
Structure Aware Negative Sampling in Knowledge Graphs
Structure Mapping for Transferability of Causal Models
Structure-Level Knowledge Distillation For Multilingual Sequence Labeling
Structured Attention for Unsupervised Dialogue Structure Induction
Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis
Structured Minimally Supervised Learning for Neural Relation Extraction
Structured Multi-Label Biomedical Text Tagging via Attentive Neural Tree Decoding
Structured Policy Iteration for Linear Quadratic Regulator
Structured Prediction with Partial Labelling through the Infimum Loss
Structured Pruning of Large Language Models
Structured Training for Neural Network Transition-Based Parsing
Structured Tuning for Semantic Role Labeling
Student-Teacher Curriculum Learning via Reinforcement Learning: Predicting Hospital Inpatient Admission Location
Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
Style Transfer Through Back-Translation
Sub-Instruction Aware Vision-and-Language Navigation
Subgoal Discovery for Hierarchical Dialogue Policy Learning
SubjQA: A Dataset for Subjectivity and Review Comprehension
Sublinear Optimal Policy Value Estimation in Contextual Bandits
Substance over Style: Document-Level Targeted Content Transfer
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
Subword-Level Language Identification for Intra-Word Code-Switching
Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture
Summarizing Opinions: Aspect Extraction Meets Sentiment Prediction and They Are Both Weakly Supervised
Summarizing Text on Any Aspects: A Knowledge-Informed Weakly-Supervised Approach
Super-efficiency of automatic differentiation for functions defined as a minimum
Supermasks in Superposition
Supertagging Combinatory Categorial Grammar with Attentive Graph Convolutional Networks
Supervised Attentions for Neural Machine Translation
Supervised Domain Enablement Attention for Personalized Domain Classification
Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
Supervised Learning: No Loss No Cry
Supervised Seeded Iterated Learning for Interactive Language Learning
Support recovery and sup-norm convergence rates for sparse pivotal estimation
Surrogate sea ice model enables efficient tuning
SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation
Symbolic Network: Generalized Neural Policies for Relational MDPs
Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation
SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and Synonym Discovery
Synchronous Bidirectional Neural Machine Translation
Syntactic Data Augmentation Increases Robustness to Inference Heuristics
Syntactic Scaffolds for Semantic Structures
Syntactic Search by Example
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
Syntax-Enhanced Neural Machine Translation with Syntax-Aware Word Representations
Syntax-guided Controlled Generation of Paraphrases
T-Basis: a Compact Representation for Neural Networks
T-GD: Transferable GAN-generated Images Detection Framework
T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack
TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task
TAG : Type Auxiliary Guiding for Code Comment Generation
TAPAS: Weakly Supervised Table Parsing via Pre-training
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue
TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions
TUDataset: A collection of benchmark datasets for learning with graphs
TUNIZI: a Tunisian Arabizi sentiment analysis Dataset
TVQA+: Spatio-Temporal Grounding for Video Question Answering
TVQA: Localized, Compositional Video Question Answering
TWEETQA: A Social Media Focused Question Answering Dataset
TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories
TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data
Tabula nearly rasa: Probing the Linguistic Knowledge of Character-Level Neural Language Models Trained on Unsegmented Text
Tackling the Low-resource Challenge for Canonical Segmentation
Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time
Tails of Lipschitz Triangular Flows
Taking a hint: How to leverage loss predictors in contextual bandits?
Talk to Papers: Bringing Neural Question Answering to Academic Search
Talking to the crowd: What do people react to in online discussions?
Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics
Target Conditioned Sampling: Optimizing Data Selection for Multilingual Neural Machine Translation
Target-Guided Open-Domain Conversation
Targeted Syntactic Evaluation of Language Models
Task-Oriented Dialogue as Dataflow Synthesis
Task-Oriented Query Reformulation with Reinforcement Learning
TaskNorm: Rethinking Batch Normalization for Meta-Learning
Tasty Burgers, Soggy Fries: Probing Aspect Robustness in Aspect-Based Sentiment Analysis
TaxiNLI: Taking a Ride up the NLU Hill
Taxonomy of Dual Block-Coordinate Ascent Methods for Discrete Energy Minimization
Taylor Expansion Policy Optimization
TeMP: Temporal Message Passing for Temporal Knowledge Graph Completion
TeaForN: Teacher-Forcing with N-grams
Teacher-Student Domain Adaptation for Biosensor Models
Teacher-Student chain for efficient semi-supervised histology image classification
Technology Readiness Levels for Machine Learning Systems
Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions
Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering
Temporal Common Sense Acquisition with Minimal Supervision
Temporal Information Extraction by Predicting Relative Time-lines
Temporal Mental Health Dynamics on Social Media
Temporal Phenotyping using Deep Predictive Clustering of Disease Progression
Temporally-Continuous Probabilistic Prediction using Polynomial Trajectory Parameterization
TenIPS: Inverse Propensity Sampling for Tensor Completion
Tensor Fusion Network for Multimodal Sentiment Analysis
Tensor denoising and completion based on ordinal observations
Tensors over Semirings for Latent-Variable Weighted Logic Programs
TernaryBERT: Distillation-aware Ultra-low Bit BERT
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference
Text Classification Using Label Names Only: A Language Model Self-Training Approach
Text Classification with Few Examples using Controlled Generalization
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems
Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates
Text to 3D Scene Generation with Rich Lexical Grounding
Text-Based Ideal Points
TextAttack: Lessons learned in designing Python frameworks for NLP
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing
TextHide: Tackling Data Privacy in Language Understanding Tasks
That is a Known Lie: Detecting Previously Fact-Checked Claims
The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers
The ADAPT Enhanced Dependency Parser at the IWPT 2020 Shared Task
The Area of the Convex Hull of Sampled Curves: a Robust Functional Statistical Depth Measure
The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants
The Boomerang Sampler
The Cascade Transformer: an Application for Efficient Answer Sentence Selection
The Complexity of Finding Stationary Points with Stochastic Gradient Descent
The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
The EOS Decision and Length Extrapolation
The Effect of Natural Distribution Shift on Question Answering Models
The Expressive Power of a Class of Normalizing Flow Models
The FAST Algorithm for Submodular Maximization
The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions
The Galactic Dependencies Treebanks: Getting More Data by Synthesizing New Languages
The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits
The Grammar of Emergent Languages
The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent
The Implicit Regularization of Ordinary Least Squares Ensembles
The Implicit Regularization of Stochastic Gradient Flow for Least Squares
The Implicit and Explicit Regularization Effects of Dropout
The Importance of Being Recurrent for Modeling Hierarchical Structure
The Importance of Category Labels in Grammar Induction with Child-directed Utterances
The Influence of Shape Constraints on the Thresholding Bandit Problem
The Interplay between Lexical Resources and Natural Language Processing
The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation
The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task
The Language of Legal and Illegal Activity on the Darknet
The Lipschitz Constant of Self-Attention
The Lower The Simpler: Simplifying Hierarchical Recurrent Models
The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
The Multilingual Amazon Reviews Corpus
The NarrativeQA Reading Comprehension Challenge
The NetHack Learning Environment
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization
The Non-IID Data Quagmire of Decentralized Machine Learning
The Paradigm Discovery Problem
The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue
The Power Spherical distribution
The Power of Batching in Multiple Hypothesis Testing
The Referential Reader: A Recurrent Entity Network for Anaphora Resolution
The Return of Lexical Dependencies: Neural Lexicalized PCFGs
The Right Tool for the Job: Matching Model and Instance Complexities
The SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion
The SOFC-Exp Corpus and Neural Approaches to Information Extraction in the Materials Science Domain
The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations
The State and Fate of Linguistic Diversity and Inclusion in the NLP World
The Sylvester Graphical Lasso (SyGlasso)
The TechQA Dataset
The Tree Ensemble Layer: Differentiability meets Conditional Computation
The True Sample Complexity of Identifying Good Arms
The Unreasonable Volatility of Neural Machine Translation Models
The Unstoppable Rise of Computational Linguistics in Deep Learning
The Usual Suspects? Reassessing Blame for VAE Posterior Collapse
The Volctrans Machine Translation System for WMT20
The Web as a Knowledge-base for Answering Complex Questions
The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection
The continuous categorical: a novel simplex-valued exponential family
The cost-free nature of optimally tuning Tikhonov regularizers and other ordered smoothers
The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
The emergence of number and syntax units in LSTM language models
The equivalence between Stein variational gradient descent and black-box variational inference
The importance of fillers for text representations of speech transcripts
The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks
The many Shapley values for model explanation
The perceptual boost of visual attention is task-dependent in naturalistic settings
The role of context in neural pitch accent detection in English
The role of regularization in classification of high-dimensional noisy Gaussian mixture
The unreasonable effectiveness of Batch-Norm statistics in addressing catastrophic forgetting across medical institutions
Theoretical Limitations of Self-Attention in Neural Sequence Models
Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning
Thermodynamic Consistent Neural Networks for Learning Material Interfacial Mechanics
Thompson Sampling Algorithms for Mean-Variance Bandits
Thompson Sampling for Linearly Constrained Bandits
Thompson Sampling via Local Uncertainty
Thresholding Bandit Problem with Both Duels and Pulls
Thresholding Graph Bandits with GrAPL
Tied Multitask Learning for Neural Speech Translation
Tight Differential Privacy for Discrete-Valued Mechanisms and for the Subsampled Gaussian Mechanism Using FFT
Tight Lower Bounds for Combinatorial Multi-Armed Bandits
Tightening Exploration in Upper Confidence Reinforcement Learning
Tigrinya Neural Machine Translation with Transfer Learning for Humanitarian Response
Tilde at WMT 2020: News Task Systems
Time Adaptive Reinforcement Learning
Time Dependence in Non-Autonomous Neural ODEs
Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders
Time Series Source Separation with Slow Flows
Time-aware Large Kernel Convolutions
Tiny Video Networks
To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging
To Schedule or not to Schedule: Extracting Task Specific Temporal Entities and Associated Negation Constraints
To Test Machine Comprehension, Start by Defining Comprehension
ToTTo: A Controlled Table-To-Text Generation Dataset
Token-level and sequence-level loss smoothing for RNN language models
Top-Rank-Focused Adaptive Vote Collection for the Evaluation of Domain-Specific Semantic Models
Topic Memory Networks for Short Text Classification
Topic Modeling in Embedding Spaces
Topic Modeling via Full Dependence Mixtures
Topic Sensitive Attention on Generic Corpora Corrects Sense Bias in Pretrained Embeddings
Topically Driven Neural Language Model
Topological Autoencoders
Topological Sort for Sentence Ordering
Topologically Densified Distributions
Torch-Struct: Deep Structured Prediction Library
Toward A Neuro-inspired Creative Decoder
Toward Better Storylines with Sentence-Level Language Models
Toward Fast and Accurate Neural Discourse Segmentation
Toward Gender-Inclusive Coreference Resolution
Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments
Towards A Sign Language Gloss Representation Of Modern Standard Arabic
Towards Accurate and Reliable Energy Measurement of NLP Models
Towards Content Transfer through Grounded Text Generation
Towards Conversational Recommendation over Multi-Type Dialogs
Towards Debiasing NLU Models from Unknown Biases
Towards Debiasing Sentence Representations
Towards Dynamic Computation Graphs via Sparse Latent Structure
Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning
Towards End-to-End In-Image Neural Machine Translation
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access
Towards Explainable Graph Representations in Digital Pathology
Towards Exploiting Background Knowledge for Building Conversation Systems
Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints
Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
Towards Induction of Structured Phoneme Inventories
Towards Interpretable Reasoning over Paragraph Effects in Situation
Towards Interpreting BERT for Reading Comprehension Based QA
Towards Map-Based Validation of Semantic Segmentation Masks
Towards Multimodal Simultaneous Neural Machine Translation
Towards Near-imperceptible Steganographic Text
Towards Neural Machine Translation for Edoid Languages
Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation
Towards Persona-Based Empathetic Conversational Models
Towards Physics-informed Deep Learning for Turbulent Flow Prediction
Towards Reasonably-Sized Character-Level Transformer NMT by Finetuning Subword Systems
Towards Robustifying NLI Models Against Lexical Dataset Biases
Towards String-to-Tree Neural Machine Translation
Towards Supervised and Unsupervised Neural Machine Translation Baselines for Nigerian Pidgin
Towards Transparent and Explainable Attention Models
Towards Understanding Gender Bias in Relation Extraction
Towards Understanding the Dynamics of the First-Order Adversaries
Towards Understanding the Regularization of Adversarial Robustness on Neural Networks
Towards Universal Dialogue State Tracking
Towards Unsupervised Language Understanding and Generation by Joint Dual Learning
Towards a General Theory of Infinite-Width Limits of Neural Classifiers
Towards a predictive spatio-temporal representation of brain data
Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses
Towards classification parity across cohorts
Towards intervention-centric causal reasoning in learning agents
Toxicity Detection: Does Context Really Matter?
Train No Evil: Selective Masking for Task-Guided Pre-Training
Trainable Greedy Decoding for Neural Machine Translation
Training Binary Neural Networks through Learning with Noisy Supervision
Training Binary Neural Networks using the Bayesian Learning Rule
Training Classifiers with Natural Language Explanations
Training Deep Energy-Based Models with f-Divergence Minimization
Training Linear Neural Networks: Non-Local Convergence and Complexity Results
Training Millions of Personalized Dialogue Agents
Training Neural Networks for and by Interpolation
Training Production Language Models without Memorizing User Data
Training Question Answering Models From Synthetic Data
Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration
TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics
TransQuest at WMT2020: Sentence-Level Direct Assessment
Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages
Transfer Learning of Photometric Phenotypes in Agriculture Using Metadata
Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources
Transfer NAS: Knowledge Transfer between Search Spaces with Transformer Agents
Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems
Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya
Transform the Set: Memory Attentive Generation of Guided and Unguided Image Collages
Transformation Importance with Applications to Cosmology
Transformation Networks for Target-Oriented Sentiment Classification
Transformer Based Multi-Source Domain Adaptation
Transformer Hawkes Process
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Transformer-based Context-aware Sarcasm Detection in Conversation Threads from Social Media
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering
Transformers without Tears: Improving the Normalization of Self-Attention
Transforming Complex Sentences into a Semantic Hierarchy
Transition-Based Dependency Parsing with Stack Long Short-Term Memory
Transition-based Semantic Dependency Parsing with Pointer Networks
Translating Natural Language Instructions for Behavioral Robot Navigation with a Multi-Head Attention Mechanism
Translating Neuralese
Translating Similar Languages: Role of Mutual Intelligibility in Multilingual Transformers
Translation Artifacts in Cross-lingual Transfer Learning
Translationese as a Language in "Multilingual" NMT
Traversing Knowledge Graphs in Vector Space
Tree-Projected Gradient Descent for Estimating Gradient-Sparse Parameters on Graphs
Treebank Embedding Vectors for Out-of-domain Dependency Parsing
Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time
Triangular Architecture for Rare Language Translation
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition
Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning
Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems
Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data
Two Routes to Scalable Credit Assignment without Weight Symmetry
Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders
Two-sample Testing Using Deep Learning
TwoWingOS: A Two-Wing Optimization Strategy for Evidential Claim Verification
TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages
Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias
UDapter: Language Adaptation for Truly Universal Dependency Parsing
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation
USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation
Ultra-Fine Entity Typing
Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels
Uncertain Natural Language Inference
Uncertainty Estimation Using a Single Deep Deterministic Neural Network
Uncertainty Estimation in Cancer Survival Prediction
Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery
Uncertainty Quantification for Sparse Deep Learning
Uncertainty in Neural Networks: Approximately Bayesian Ensembling
Uncertainty in Neural Relational Inference Trajectory Reconstruction
Uncertainty quantification using martingales for misspecified Gaussian processes
Uncertainty-Aware Label Refinement for Sequence Labeling
Uncertainty-Aware Semantic Augmentation for Neural Machine Translation
Uncertainty-Aware Vehicle Orientation Estimation for Joint Detection-Prediction Models
Uncovering the Folding Landscape of RNA Secondary Structure with Deep Graph Embeddings
Understanding Climate Impacts on Vegetation with Gaussian Processes in Granger Causality
Understanding Dataset Design Choices for Multi-hop Reasoning
Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study
Understanding Generalization in Deep Learning via Tensor Methods
Understanding Learned Reward Functions
Understanding Neural Abstractive Summarization Models via Uncertainty
Understanding Points of Correspondence between Sentences for Abstractive Summarization
Understanding Self-Attention of Self-Supervised Audio Transformers
Understanding Self-Training for Gradual Domain Adaptation
Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection
Understanding Undesirable Word Embedding Associations
Understanding Unintended Memorization in Federated Learning
Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
Understanding and Mitigating the Tradeoff Between Robustness and Accuracy
Understanding language-elicited EEG data by predicting it from a fine-tuned language model
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
Understanding the Difficulty of Training Transformers
Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle
Understanding the Intrinsic Robustness of Image Distributions using Conditional Generative Models
Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning
Understanding the robustness of deep neural network classifiers for breast cancer screening
Undirected Graphical Models as Approximate Posteriors
Unfolding and Shrinking Neural Machine Translation Ensembles
UniConv: A Unified Conversational Neural Architecture for Multi-domain Task-oriented Dialogues
Unified Pragmatic Models for Generating and Following Instructions
Unifying Human and Statistical Evaluation for Natural Language Generation
Universal Approximation Property of Neural Ordinary Differential Equations
Universal Approximation with Deep Narrow Networks
Universal Average-Case Optimality of Polyak Momentum
Universal Decompositional Semantic Parsing
Universal Equivariant Multilayer Perceptrons
Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start
Universal Neural Machine Translation for Extremely Low Resource Languages
Universal Semantic Parsing
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift
Unlocking the Potential of Deep Counterfactual Value Networks
Unnatural Language Processing: Bridging the Gap Between Synthetic and Natural Language Data
Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach
Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks
Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering
Unsupervised Commonsense Question Answering with Self-Talk
Unsupervised Cross-lingual Transfer of Word Embedding Spaces
Unsupervised Discovery of Implicit Gender Bias
Unsupervised Discovery of Interpretable Directions in the GAN Latent Space
Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation
Unsupervised Domain Adaptation for Visual Navigation
Unsupervised Domain Clusters in Pretrained Language Models
Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing
Unsupervised Grammar Induction with Depth-bounded PCFG
Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces
Unsupervised Identification of Translationese
Unsupervised Induction of Semantic Roles within a Reconstruction-Error Minimization Framework
Unsupervised Learning of Morphological Forests
Unsupervised Learning of Syntactic Structure with Invertible Neural Projections
Unsupervised Morphological Paradigm Completion
Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting
Unsupervised Natural Language Inference via Decoupled Multimodal Contrastive Learning
Unsupervised Neural Machine Translation with Weight Sharing
Unsupervised Online Grounding of Natural Language during Human-Robot Interactions
Unsupervised Opinion Summarization as Copycat-Review Generation
Unsupervised Opinion Summarization with Noising and Denoising
Unsupervised Paraphrasing by Simulated Annealing
Unsupervised Parsing via Constituency Tests
Unsupervised Pidgin Text Generation By Pivoting English Data and Self-Training
Unsupervised Pivot Translation for Distant Languages
Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction
Unsupervised Quality Estimation for Neural Machine Translation
Unsupervised Question Answering by Cloze Translation
Unsupervised Question Decomposition for Question Answering
Unsupervised Recurrent Neural Network Grammars
Unsupervised Reference-Free Summary Quality Evaluation via Contrastive Learning
Unsupervised Speech Decomposition via Triple Information Bottleneck
Unsupervised Statistical Machine Translation
Unsupervised Text Style Transfer with Padded Masked Language Models
Unsupervised Transfer Learning for Spatiotemporal Predictive Networks
Unsupervised deep clustering for predictive texture pattern discovery in medical images
Up or Down? Adaptive Rounding for Post-Training Quantization
Urban Driving with Conditional Imitation Learning
Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection
Using Context in Neural Machine Translation Training Objectives
Using Convolutional Variational Autoencoders to Predict Post-Trauma Health Outcomes from Actigraphy Data
Using Large Pretrained Language Models for Answering User Queries from Product Specifications
Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers
Using Natural Language Relations between Answer Choices for Machine Comprehension
Using Punkt for Sentence Segmentation in non-Latin Scripts: Experiments on Kurdish (Sorani) Texts
Using Type Information to Improve Entity Coreference Resolution
Using competency questions to select optimal clustering structures for residential energy consumption patterns
Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm
Utility is in the Eye of the User: A Critique of NLP Leaderboards
Utility/Privacy Trade-off through the lens of Optimal Transport
VCDM: Leveraging Variational Bi-encoding and Deep Contextualized Word Representations for Improved Definition Modeling
VD-BERT: A Unified Vision and Dialog Transformer with BERT
VFlow: More Expressive Generative Flows with Variational Data Augmentation
Validated Variational Inference via Practical Posterior Error Bounds
Validation of Approximate Likelihood and Emulator Models for Computationally Intensive Simulations
Variable Skipping for Autoregressive Range Density Estimation
Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems
Variance Reduction for Matrix Games
Variance Reduction in Stochastic Particle-Optimization Sampling
Variational Autoencoders and Nonlinear ICA: A Unifying Framework
Variational Autoencoders for Sparse and Overdispersed Discrete Data
Variational Autoencoders with Riemannian Brownian Motion Priors
Variational Bayesian Quantization
Variational Depth Search in ResNets
Variational Inference for Learning Representations of Natural Language Edits
Variational Inference with Continuously-Indexed Normalizing Flows
Variational Knowledge Graph Reasoning
Variational Neural Machine Translation with Normalizing Flows
Variational Optimization on Lie Groups, with Examples of Leading (Generalized) Eigenvalue Problems
Variational Pretraining for Semi-supervised Text Classification
Variational Sequential Labelers for Semi-Supervised Learning
Vector-Vector-Matrix Architecture: A Novel Hardware-Aware Framework for Low-Latency Inference in NLP Applications
Vehicle Trajectory Prediction by Transfer Learning of Semi-Supervised Models
Verb Physics: Relative Physical Knowledge of Actions and Objects
Video Prediction via Example Guidance
Video-Grounded Dialogues with Pretrained Generation Language Models
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
Visual Grounding of Learned Physical Models
Visually Grounded Continual Learning of Compositional Phrases
Visually Grounded Neural Syntax Acquisition
Voice Separation with an Unknown Number of Multiple Speakers
Volctrans Parallel Corpus Filtering System for WMT 2020
Wandering Within a World: Online Contextualized Few-Shot Learning
Wasserstein Control of Mirror Langevin Monte Carlo
Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains
Wasserstein Smoothing: Certified Robustness against Wasserstein Adversarial Attacks
Wasserstein Style Transfer
WaveFlow: A Compact Flow-based Model for Raw Audio
WaveNODE: A Continuous Normalizing Flow for Speech Synthesis
We Can Detect Your Bias: Predicting the Political Ideology of News Articles
WeChat Neural Machine Translation Systems for WMT20
Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging
Weakly Supervised Learning of Nuanced Frames for Analyzing Polarization in News Media
Weakly Supervised Medication Regimen Extraction from Medical Conversations
Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding
Weakly-Supervised Disentanglement Without Compromises
Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video
WeatherBench: A benchmark dataset for data-driven weather forecasting
Weight Poisoning Attacks on Pre-trained Models
Weird AI Yankovic: Generating Parody Lyrics
Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings
What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models
What Can Learned Intrinsic Rewards Capture?
What Can We Learn from Collective Human Opinions on Natural Language Inference Data?
What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes
What Do Position Embeddings Learn? An Empirical Study of Pre-Trained Language Model Positional Encoding
What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge
What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets
What Happens To BERT Embeddings During Fine-tuning?
What Have We Achieved on Text Summarization?
What Kind of Language Is Hard to Language-Model?
What Makes Reading Comprehension Questions Easier?
What Question Answering can Learn from Trivia Nerds
What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context
What You Say and How You Say it: Joint Modeling of Topics and Discourse in Microblog Conversations
What Your Username Says About You
What are the Goals of Distributional Semantics?
What are the Statistical Limits of Offline RL with Linear Function Approximation?
What do Models Learn from Question Answering Datasets?
What do Neural Machine Translation Models Learn about Morphology?
What is Learned in Visually Grounded Neural Syntax Acquisition
What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
What makes a good conversation? How controllable attributes affect human judgments
What's in a Name? Are BERT Named Entity Representations just as Good for any other Name?
When Are Tree Structures Necessary for Deep Learning of Representations?
When BERT Plays the Lottery, All Tickets Are Winning
When Does Self-Supervision Help Graph Convolutional Networks?
When Does Unsupervised Machine Translation Work?
When Explanations Lie: Why Many Modified BP Attributions Fail
When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models
When and Why is Unsupervised Neural Machine Translation Useless?
When deep denoising meets iterative phase retrieval
When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People?
Where Are You? Localization from Embodied Dialog
Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News
Where's the Question? A Multi-channel Deep Convolutional Neural Network for Question Identification in Textual Data
Which Tasks Should Be Learned Together in Multi-task Learning?
Who did What: A Large-Scale Person-Centered Cloze Dataset
Whodunnit? Crime Drama as a Case for Natural Language Understanding
Why Non-myopic Bayesian Optimization is Promising and How Far Should We Look-ahead? A Study via Rollout
Why Normalizing Flows Fail to Detect Out-of-Distribution Data
Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries
Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures
Why Skip If You Can Combine: A Simple Knowledge Distillation Technique for Intermediate Layers
Why bigger is not always better: on finite and infinite neural networks
Why is unsupervised alignment of English embeddings from different algorithms so hard?
Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements
Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks
WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community
Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness
Will-They-Won't-They: A Very Large Dataset for Stance Detection on Twitter
Winning on the Merits: The Joint Effects of Content and Style on Debate Outcomes
WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge
With Little Power Comes Great Responsibility
Woodbury Transformations for Deep Generative Flows
Word Embeddings for Chemical Patent Natural Language Processing
Word Frequency Does Not Predict Grammatical Knowledge in Language Models
Word Ordering Without Syntax
Word Rotator's Distance
Word class flexibility: A deep contextualized approach
Word-level Speech Recognition with a Letter to Word Encoder
Word-level Textual Adversarial Attacking as Combinatorial Optimization
Word-order biases in deep-agent emergent communication
Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions
Working Memory Networks: Augmenting Memory Networks with a Relational Reasoning Module
World Model as a Graph: Learning Latent Landmarks for Planning
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation
X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers
XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation
XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization
XLNet: Generalized Autoregressive Pretraining for Language Understanding
XLVIN: eXecuted Latent Value Iteration Nets
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization
Xiaomingbot: A Multilingual Robot News Reporter
XtarNet: Learning to Extract Task-Adaptive Representation for Incremental Few-Shot Learning
XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design
You Impress Me: Dialogue Generation via Mutual Persona Perception
Zeno++: Robust Fully Asynchronous SGD
Zero-Resource Translation with Multi-Lingual Neural Machine Translation
Zero-Shot Cross-Lingual Opinion Target Extraction
Zero-Shot Stance Detection: A Dataset and Model using Generalized Topic Representations
Zero-Shot Transfer Learning for Event Extraction
Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking
Zero-Shot Translation Quality Estimation with Explicit Cross-Lingual Patterns
Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens
Zero-shot User Intent Detection via Capsule Neural Networks
ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages
doc2dial: A Goal-Oriented Document-Grounded Dialogue Dataset
emrQA: A Large Corpus for Question Answering on Electronic Medical Records
giotto-tda: A Topological Data Analysis Toolkit for Machine Learning and Data Exploration
i-RIM applied to the fastMRI challenge
iNLTK: Natural Language Toolkit for Indic Languages
iSarcasm: A Dataset of Intended Sarcasm
jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models
k-simplex2vec: a simplicial extension of node2vec
pyBART: Evidence-based Syntactic Transformations for IE
scGNN: scRNA-seq Dropout Imputation via Induced Hierarchical Cell Similarity Graph
schuBERT: Optimizing Elements of BERT
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions