amitness/papers.md

## papers.md

      
title
url


(Locally) Differentially Private Combinatorial Semi-Bandits
http://arxiv.org/abs/2006.00706v2


(Re)construing Meaning in NLP
http://arxiv.org/abs/2005.09099v1


2kenize: Tying Subword Sequences for Chinese Script Conversion
http://arxiv.org/abs/2005.03375v1


3D-LaneNet+: Anchor Free Lane Detection using a Semi-Local Representation
http://arxiv.org/abs/2011.01535v2


A Batch Normalized Inference Network Keeps the KL Vanishing Away
http://arxiv.org/abs/2004.12585v2


A Benchmark of Medical Out of Distribution Detection
http://arxiv.org/abs/2007.04250v2


A Bilingual Generative Transformer for Semantic Sentence Embedding
http://arxiv.org/abs/1911.03895v2


A Boolean Task Algebra for Reinforcement Learning
http://arxiv.org/abs/2001.01394v2


A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference
http://arxiv.org/abs/1704.05426v4


A Call for More Rigor in Unsupervised Cross-lingual Learning
http://arxiv.org/abs/2004.14958v1


A Characterization of Mean Squared Error for Estimator with Bagging
http://arxiv.org/abs/1908.02718v1


A Closer Look at Accuracy vs. Robustness
http://arxiv.org/abs/2003.02460v3


A Closer Look at Small-loss Bounds for Bandits with Graph Feedback
http://arxiv.org/abs/2002.00315v2


A Co-Matching Model for Multi-choice Reading Comprehension
http://arxiv.org/abs/1806.04068v1


A Computational Approach to Understanding Empathy Expressed in Text-Based Mental Health Support
http://arxiv.org/abs/2009.08441v1


A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking
http://arxiv.org/abs/2006.01554v2


A Continuous-time Perspective for Modeling Acceleration in Riemannian Optimization
http://arxiv.org/abs/1910.10782v3


A Convolutional Encoder Model for Neural Machine Translation
http://arxiv.org/abs/1611.02344v3


A Corpus for Large-Scale Phonetic Typology
http://arxiv.org/abs/2005.13962v1


A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature
http://arxiv.org/abs/1806.04185v1


A Cross-Task Analysis of Text Span Representations
http://arxiv.org/abs/2006.03866v1


A Crowdsourced Frame Disambiguation Corpus with Ambiguity
http://arxiv.org/abs/1904.06101v1


A Data and Compute Efficient Design for Limited-Resources Deep Learning
http://arxiv.org/abs/2004.09691v2


A Data-driven Approach for Noise Reduction in Distantly Supervised Biomedical Relation Extraction
http://arxiv.org/abs/2005.12565v1


A Decomposable Attention Model for Natural Language Inference
http://arxiv.org/abs/1606.01933v2


A Deep Generative Model for Fragment-Based Molecule Generation
http://arxiv.org/abs/2002.12826v1


A Deep Generative Model of Vowel Formant Typology
http://arxiv.org/abs/1807.02745v1


A Deep Learning Approach for Determining Effects of Tuta Absoluta in Tomato Plants
http://arxiv.org/abs/2004.04023v1


A Deep Learning System for Sentiment Analysis of Service Calls
http://arxiv.org/abs/2004.10320v1


A Deep Neural Network Sentence Level Classification Method with Context Information
http://arxiv.org/abs/1809.00934v1


A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards
http://arxiv.org/abs/2006.15454v1


A Diagnostic Study of Explainability Techniques for Text Classification
http://arxiv.org/abs/2009.13295v1


A Differentiable Newton Euler Algorithm for Multi-body Model Learning
http://arxiv.org/abs/2010.09802v1


A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms
http://arxiv.org/abs/2003.12239v1


A Distributional Framework for Data Valuation
http://arxiv.org/abs/2002.12334v1


A Distributional View on Multi-Objective Policy Optimization
http://arxiv.org/abs/2005.07513v1


A Double Residual Compression Algorithm for Efficient Distributed Learning
http://arxiv.org/abs/1910.07561v1


A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates
http://arxiv.org/abs/1908.04468v2


A Formal Hierarchy of RNN Architectures
http://arxiv.org/abs/2004.08500v4


A Fourier State Space Model for Bayesian ODE Filters
http://arxiv.org/abs/2007.09118v2


A Framework and Dataset for Abstract Art Generation via CalligraphyGAN
http://arxiv.org/abs/2012.00744v1


A Framework for Sample Efficient Interval Estimation with Control Variates
http://arxiv.org/abs/2006.10287v1


A Free-Energy Principle for Representation Learning
http://arxiv.org/abs/2002.12406v1


A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing
http://arxiv.org/abs/1706.03367v1


A General Framework for Information Extraction using Dynamic Span Graphs
http://arxiv.org/abs/1904.03296v1


A Generative Approach to Titling and Clustering Wikipedia Sections
http://arxiv.org/abs/2005.11216v1


A Generative Model for Joint Natural Language Understanding and Generation
http://arxiv.org/abs/2006.07499v1


A Generative Model for Molecular Distance Geometry
http://arxiv.org/abs/1909.11459v4


A Generative Parser with a Discriminative Recognition Algorithm
http://arxiv.org/abs/1708.00415v2


A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton
http://arxiv.org/abs/2006.04045v2


A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples
http://arxiv.org/abs/2010.01345v1


A Girl Has A Name: Detecting Authorship Obfuscation
http://arxiv.org/abs/2005.00702v1


A Graph to Graphs Framework for Retrosynthesis Prediction
http://arxiv.org/abs/2003.12725v1


A Hierarchical Latent Structure for Variational Conversation Modeling
http://arxiv.org/abs/1804.03424v2


A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities
http://arxiv.org/abs/1905.13077v1


A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer
http://arxiv.org/abs/1906.01833v1


A Hierarchical Transformer for Unsupervised Parsing
http://arxiv.org/abs/2003.13841v1


A Hybrid Convolutional Variational Autoencoder for Text Generation
http://arxiv.org/abs/1702.02390v1


A Hybrid Stochastic Policy Gradient Algorithm for Reinforcement Learning
http://arxiv.org/abs/2003.00430v2


A Joint Named-Entity Recognizer for Heterogeneous Tag-sets Using a Tag Hierarchy
http://arxiv.org/abs/1905.09135v2


A Just and Comprehensive Strategy for Using NLP to Address Online Abuse
http://arxiv.org/abs/1906.01738v2


A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation
http://arxiv.org/abs/2001.05139v1


A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors
http://arxiv.org/abs/1805.05388v1


A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal
http://arxiv.org/abs/2005.10070v1


A Locally Adaptive Bayesian Cubature Method
http://arxiv.org/abs/1910.02995v1


A Meaning-based Statistical English Math Word Problem Solver
http://arxiv.org/abs/1803.06064v2


A Mention-Ranking Model for Abstract Anaphora Resolution
http://arxiv.org/abs/1706.02256v2


A Meta-Learning Approach for Graph Representation Learning in Multi-Task Settings
http://arxiv.org/abs/2012.06755v1


A Methodology for Creating Question Answering Corpora Using Inverse Data Annotation
http://arxiv.org/abs/2004.07633v2


A Minimal Span-Based Neural Constituency Parser
http://arxiv.org/abs/1705.03919v1


A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages
http://arxiv.org/abs/2006.06202v2


A Multi-Axis Annotation Scheme for Event Temporal Relations
http://arxiv.org/abs/1804.07828v2


A Multi-Perspective Architecture for Semantic Code Search
http://arxiv.org/abs/2005.06980v1


A Multi-Task Incremental Learning Framework with Category Name Embedding for Aspect-Category Sentiment Analysis
http://arxiv.org/abs/2010.02784v1


A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews
http://arxiv.org/abs/2005.13362v2


A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification
http://arxiv.org/abs/1807.04990v1


A Multiclass Classification Approach to Label Ranking
http://arxiv.org/abs/2002.09420v1


A Multilingual Neural Machine Translation Model for Biomedical Data
http://arxiv.org/abs/2008.02878v1


A Multitask Learning Approach for Diacritic Restoration
http://arxiv.org/abs/2006.04016v1


A Narration-based Reward Shaping Approach using Grounded Natural Language Commands
http://arxiv.org/abs/1911.00497v1


A Nested Attention Neural Hybrid Model for Grammatical Error Correction
http://arxiv.org/abs/1707.02026v2


A Neural Attention Model for Abstractive Sentence Summarization
http://arxiv.org/abs/1509.00685v2


A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings
http://arxiv.org/abs/2008.04702v1


A Neural Model for User Geolocation and Lexical Dialectology
http://arxiv.org/abs/1704.04008v3


A Neural Model of Adaptation in Reading
http://arxiv.org/abs/1808.09930v2


A Neural Network for Coordination Boundary Prediction
http://arxiv.org/abs/1610.03946v1


A Neuro-AI Interface for Evaluating Generative Adversarial Networks
http://arxiv.org/abs/2003.03193v2


A New Neural Network Architecture Invariant to the Action of Symmetry Subgroups
http://arxiv.org/abs/2012.06452v1


A Nonparametric Off-Policy Policy Gradient
http://arxiv.org/abs/2001.02435v3


A Note on Data Biases in Generative Models
http://arxiv.org/abs/2012.02516v1


A Note on Over-Smoothing for Graph Neural Networks
http://arxiv.org/abs/2006.13318v1


A Novel Cascade Binary Tagging Framework for Relational Triple Extraction
http://arxiv.org/abs/1909.03227v4


A Novel Confidence-Based Algorithm for Structured Bandits
http://arxiv.org/abs/2005.11593v1


A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation
http://arxiv.org/abs/2007.08742v1


A Pairwise Fair and Community-preserving Approach to k-Center Clustering
http://arxiv.org/abs/2007.07384v1


A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players
http://arxiv.org/abs/1902.01239v4


A Principled Approach to Learning Stochastic Representations for Privacy in Deep Neural Inference
http://arxiv.org/abs/2003.12154v1


A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning
http://arxiv.org/abs/2009.08115v3


A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing
http://arxiv.org/abs/2005.01646v1


A Probabilistic Generative Model of Linguistic Typology
http://arxiv.org/abs/1903.10950v3


A Probabilistic Model with Commonsense Constraints for Pattern-based Temporal Fact Extraction
http://arxiv.org/abs/2006.06436v1


A Re-evaluation of Knowledge Graph Completion Methods
http://arxiv.org/abs/1911.03903v3


A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks
http://arxiv.org/abs/2005.09606v1


A Reduction from Reinforcement Learning to No-Regret Online Learning
http://arxiv.org/abs/1911.05873v2


A Reinforced Generation of Adversarial Examples for Neural Machine Translation
http://arxiv.org/abs/1911.03677v2


A Relational Memory-based Embedding Model for Triple Classification and Search Personalization
http://arxiv.org/abs/1907.06080v2


A Relaxed Matching Procedure for Unsupervised BLI
http://arxiv.org/abs/2010.07095v1


A Report on the 2020 Sarcasm Detection Shared Task
http://arxiv.org/abs/2005.05814v2


A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity
http://arxiv.org/abs/1906.01926v1


A Rigorous Study on Named Entity Recognition: Can Fine-tuning Pretrained Model Lead to the Promised Land?
http://arxiv.org/abs/2004.12126v2


A Sample Complexity Separation between Non-Convex and Convex Meta-Learning
http://arxiv.org/abs/2002.11172v1


A Scalable Neural Shortlisting-Reranking Approach for Large-Scale Domain Classification in Natural Language Understanding
http://arxiv.org/abs/1804.08064v1


A Self-Training Method for Machine Reading Comprehension with Soft Evidence Extraction
http://arxiv.org/abs/2005.05189v2


A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition
http://arxiv.org/abs/2007.00144v1


A Simple Approach to Learning Unsupervised Multilingual Embeddings
http://arxiv.org/abs/2004.05991v2


A Simple Joint Model for Improved Contextual Neural Lemmatization
http://arxiv.org/abs/1904.02306v4


A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings
http://arxiv.org/abs/1902.00184v1


A Simple Theoretical Model of Importance for Summarization
http://arxiv.org/abs/1801.08991v2


A Simple Yet Strong Pipeline for HotpotQA
http://arxiv.org/abs/2004.06753v1


A Simple and Effective Model for Answering Multi-span Questions
http://arxiv.org/abs/1909.13375v4


A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation
http://arxiv.org/abs/1808.06945v2


A Span-based Linearization for Constituent Trees
http://arxiv.org/abs/2004.14704v2


A Stein Goodness-of-fit Test for Directional Distributions
http://arxiv.org/abs/2002.06843v1


A Stochastic Decoder for Neural Machine Translation
http://arxiv.org/abs/1805.10844v1


A Streaming Approach For Efficient Batched Beam Search
http://arxiv.org/abs/2010.02164v2


A Study of Deep Learning Colon Cancer Detection in Limited Data Access Scenarios
http://arxiv.org/abs/2005.10326v2


A Study of Reinforcement Learning for Neural Machine Translation
http://arxiv.org/abs/1808.08866v1


A Study on Encodings for Neural Architecture Search
http://arxiv.org/abs/2007.04965v1


A Stylometric Inquiry into Hyperpartisan and Fake News
http://arxiv.org/abs/1702.05638v1


A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT
http://arxiv.org/abs/2004.14516v1


A Survey on Recognizing Textual Entailment as an NLP Evaluation
http://arxiv.org/abs/2010.03061v1


A Syntactic Neural Model for General-Purpose Code Generation
http://arxiv.org/abs/1704.01696v1


A System for Worldwide COVID-19 Information Aggregation
http://arxiv.org/abs/2008.01523v2


A Systematic Assessment of Syntactic Generalization in Neural Language Models
http://arxiv.org/abs/2005.03692v2


A Tale of a Probe and a Parser
http://arxiv.org/abs/2005.01641v2


A Theoretical Case Study of Structured Variational Inference for Community Detection
http://arxiv.org/abs/1907.12203v5


A Top-Down Neural Architecture towards Text-Level Parsing of Discourse Rhetorical Structure
http://arxiv.org/abs/2005.02680v3


A Topology Layer for Machine Learning
http://arxiv.org/abs/1905.12200v2


A Trainable Optimal Transport Embedding for Feature Aggregation
http://arxiv.org/abs/2006.12065v3


A Transformer-based Approach for Source Code Summarization
http://arxiv.org/abs/2005.00653v1


A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis
http://arxiv.org/abs/2006.15955v1


A Transition-Based Directed Acyclic Graph Parser for UCCA
http://arxiv.org/abs/1704.00552v2


A Two-Stage Masked LM Method for Term Set Expansion
http://arxiv.org/abs/2005.01063v1


A Unified Linear-Time Framework for Sentence-Level Discourse Parsing
http://arxiv.org/abs/1905.05682v2


A Unified MRC Framework for Named Entity Recognition
http://arxiv.org/abs/1910.11476v6


A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss
http://arxiv.org/abs/1805.06266v2


A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments
http://arxiv.org/abs/1911.00294v2


A Unified Theory of Decentralized SGD with Changing Topology and Local Updates
http://arxiv.org/abs/2003.10422v2


A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent
http://arxiv.org/abs/1905.11261v1


A Unified View of Label Shift Estimation
http://arxiv.org/abs/2003.07554v3


A Visual Attention Grounding Neural Model for Multimodal Machine Translation
http://arxiv.org/abs/1808.08266v2


A Wasserstein Minimum Velocity Approach to Learning Unnormalized Models
http://arxiv.org/abs/2002.07501v1


A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification
http://arxiv.org/abs/1810.05754v1


A greedy anytime algorithm for sparse PCA
http://arxiv.org/abs/1910.06846v5


A large annotated corpus for learning natural language inference
http://arxiv.org/abs/1508.05326v1


A negative case analysis of visual grounding methods for VQA
http://arxiv.org/abs/2004.05704v2


A neurally plausible model learns successor representations in partially observable environments
http://arxiv.org/abs/1906.09480v1


A new regret analysis for Adam-type algorithms
http://arxiv.org/abs/2003.09729v1


A nonasymptotic law of iterated logarithm for general M-estimators
http://arxiv.org/abs/1903.06576v2


A principled approach for generating adversarial images under non-smooth dissimilarity metrics
http://arxiv.org/abs/1908.01667v2


A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings
http://arxiv.org/abs/1805.06297v2


A single image deep learning approach to restoration of corrupted remote sensing products
http://arxiv.org/abs/2004.04209v1


A strong baseline for question relevancy ranking
http://arxiv.org/abs/1808.08836v1


AD3: Attentive Deep Document Dater
http://arxiv.org/abs/1902.02161v1


ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents
http://arxiv.org/abs/2005.01777v1


AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network
http://arxiv.org/abs/2009.08229v2


ALICE: Active Learning with Contrastive Natural Language Explanations
http://arxiv.org/abs/2009.10259v1


AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC
http://arxiv.org/abs/2003.00193v1


AMR Dependency Parsing with a Typed Semantic Algebra
http://arxiv.org/abs/1805.11465v1


AMR Parsing as Sequence-to-Graph Transduction
http://arxiv.org/abs/1905.08704v2


AMR Parsing via Graph-Sequence Iterative Inference
http://arxiv.org/abs/2004.05572v2


AMR-to-text Generation with Synchronous Node Replacement Grammar
http://arxiv.org/abs/1702.00500v4


AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning
http://arxiv.org/abs/1912.00965v2


AR-DAE: Towards Unbiased Neural Entropy Gradient Estimation
http://arxiv.org/abs/2006.05164v1


ASAP: Architecture Search, Anneal and Prune
http://arxiv.org/abs/1904.04123v2


ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations
http://arxiv.org/abs/2005.00481v1


Abstract Syntax Networks for Code Generation and Semantic Parsing
http://arxiv.org/abs/1704.07535v1


Abstraction Mechanisms Predict Generalization in Deep Neural Networks
http://arxiv.org/abs/1905.11515v2


Abstractive Multi-Document Summarization via Phrase Selection and Merging
http://arxiv.org/abs/1506.01597v2


Abusive Language Detection with Graph Convolutional Networks
http://arxiv.org/abs/1904.04073v1


Accelerated Message Passing for Entropy-Regularized MAP Inference
http://arxiv.org/abs/2007.00699v1


Accelerated Primal-Dual Algorithms for Distributed Smooth Convex Optimization over Networks
http://arxiv.org/abs/1910.10666v2


Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction
http://arxiv.org/abs/1809.01694v2


Accelerated Stochastic Gradient-free and Projection-free Methods
http://arxiv.org/abs/2007.12625v2


Accelerating Large-Scale Inference with Anisotropic Vector Quantization
http://arxiv.org/abs/1908.10396v5


Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment
http://arxiv.org/abs/1804.11324v1


Accelerating Natural Language Understanding in Task-Oriented Dialog
http://arxiv.org/abs/2006.03701v1


Accelerating Online Reinforcement Learning with Offline Datasets
http://arxiv.org/abs/2006.09359v3


Accelerating Reinforcement Learning with Learned Skill Priors
http://arxiv.org/abs/2010.11944v1


Accurate Word Alignment Induction from Neural Machine Translation
http://arxiv.org/abs/2004.14837v2


Acrostic Poem Generation
http://arxiv.org/abs/2010.02239v1


Action and Perception as Divergence Minimization
http://arxiv.org/abs/2009.01791v2


Active Community Detection with Maximal Expected Model Change
http://arxiv.org/abs/1801.05856v2


Active Imitation Learning with Noisy Guidance
http://arxiv.org/abs/2005.12801v1


Active Learning for Coreference Resolution using Discrete Annotation
http://arxiv.org/abs/2004.13671v3


Active Learning for Identification of Linear Dynamical Systems
http://arxiv.org/abs/2002.00495v2


Active Learning from Crowd in Document Screening
http://arxiv.org/abs/2012.02297v1


Active World Model Learning with Progress Curiosity
http://arxiv.org/abs/2007.07853v1


AdaScale SGD: A User-Friendly Algorithm for Distributed Training
http://arxiv.org/abs/2007.05105v1


Adapting End-to-End Speech Recognition for Readable Subtitles
http://arxiv.org/abs/2005.12143v1


Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations
http://arxiv.org/abs/1808.09500v1


Adaptive Attention Span in Transformers
http://arxiv.org/abs/1905.07799v2


Adaptive Attentional Network for Few-Shot Knowledge Graph Completion
http://arxiv.org/abs/2010.09638v1


Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE
http://arxiv.org/abs/2006.02493v1


Adaptive Document Retrieval for Deep Question Answering
http://arxiv.org/abs/1808.06528v1


Adaptive Estimator Selection for Off-Policy Evaluation
http://arxiv.org/abs/2002.07729v2


Adaptive Exploration in Linear Contextual Bandit
http://arxiv.org/abs/1910.06996v2


Adaptive Gradient Descent without Descent
http://arxiv.org/abs/1910.09529v2


Adaptive Prediction Timing for Electronic Health Records
http://arxiv.org/abs/2003.02554v1


Adaptive Region-Based Active Learning
http://arxiv.org/abs/2002.07348v1


Adaptive Reward-Poisoning Attacks against Reinforcement Learning
http://arxiv.org/abs/2003.12613v2


Adaptive Risk Minimization: A Meta-Learning Approach for Tackling Group Shift
http://arxiv.org/abs/2007.02931v2


Adaptive Scaling for Sparse Detection in Information Extraction
http://arxiv.org/abs/1805.00250v2


Adaptive Transformers for Learning Multimodal Representations
http://arxiv.org/abs/2005.07486v3


Adding Seemingly Uninformative Labels Helps in Low Data Regimes
http://arxiv.org/abs/2008.00807v2


Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization
http://arxiv.org/abs/2006.11771v1


Addressing Ancestry Disparities in Genomic Medicine: A Geographic-aware Algorithm
http://arxiv.org/abs/2004.12053v1


Addressing Exposure Bias With Document Minimum Risk Training: Cambridge at the WMT20 Biomedical Translation Task
http://arxiv.org/abs/2010.05333v1


Addressing reward bias in Adversarial Imitation Learning with neutral reward functions
http://arxiv.org/abs/2009.09467v1


Addressing the Rare Word Problem in Neural Machine Translation
http://arxiv.org/abs/1410.8206v4


AdvAug: Robust Adversarial Augmentation for Neural Machine Translation
http://arxiv.org/abs/2006.11834v3


Advancing Renewable Electricity Consumption With Reinforcement Learning
http://arxiv.org/abs/2003.04310v1


Adversarial Alignment of Multilingual Models for Extracting Temporal Expressions from Text
http://arxiv.org/abs/2005.09392v1


Adversarial Attack and Defense of Structured Prediction Models
http://arxiv.org/abs/2010.01610v2


Adversarial Attacks on Probabilistic Autoregressive Forecasting Models
http://arxiv.org/abs/2003.03778v1


Adversarial Connective-exploiting Networks for Implicit Discourse Relation Classification
http://arxiv.org/abs/1704.00217v1


Adversarial Contrastive Estimation
http://arxiv.org/abs/1805.03642v3


Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification
http://arxiv.org/abs/1606.01614v5


Adversarial Example Generation with Syntactically Controlled Paraphrase Networks
http://arxiv.org/abs/1804.06059v1


Adversarial Examples for Evaluating Reading Comprehension Systems
http://arxiv.org/abs/1707.07328v1


Adversarial Filters of Dataset Biases
http://arxiv.org/abs/2002.04108v3


Adversarial Learning of Privacy-Preserving Text Representations for De-Identification of Medical Records
http://arxiv.org/abs/1906.05000v1


Adversarial Multi-Criteria Learning for Chinese Word Segmentation
http://arxiv.org/abs/1704.07556v1


Adversarial Multi-task Learning for Text Classification
http://arxiv.org/abs/1704.05742v1


Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling
http://arxiv.org/abs/1910.12702v1


Adversarial Mutual Information for Text Generation
http://arxiv.org/abs/2007.00067v1


Adversarial NLI: A New Benchmark for Natural Language Understanding
http://arxiv.org/abs/1910.14599v2


Adversarial Neural Pruning with Latent Vulnerability Suppression
http://arxiv.org/abs/1908.04355v4


Adversarial Removal of Demographic Attributes from Text Data
http://arxiv.org/abs/1808.06640v2


Adversarial Risk via Optimal Transport and Optimal Couplings
http://arxiv.org/abs/1912.02794v2


Adversarial Robustness Guarantees for Classification with Gaussian Processes
http://arxiv.org/abs/1905.11876v3


Adversarial Robustness for Code
http://arxiv.org/abs/2002.04694v2


Adversarial Robustness of Flow-Based Generative Models
http://arxiv.org/abs/1911.08654v1


Adversarial Self-Supervised Data-Free Distillation for Text Classification
http://arxiv.org/abs/2010.04883v1


Adversarial Semantic Collisions
http://arxiv.org/abs/2011.04743v1


Adversarial Training for Commonsense Inference
http://arxiv.org/abs/2005.08156v1


Adversarial Training for Satire Detection: Controlling for Confounding Variables
http://arxiv.org/abs/1902.11145v2


Adversarial attacks on Copyright Detection Systems
http://arxiv.org/abs/1906.07153v2


Adversarial representation learning for private speech generation
http://arxiv.org/abs/2006.09114v2


Adversarial training for multi-context joint entity and relation extraction
http://arxiv.org/abs/1808.06876v3


Affect-LM: A Neural Language Model for Customizable Affective Text Generation
http://arxiv.org/abs/1704.06851v1


Afro-MNIST: Synthetic generation of MNIST-style datasets for low-resource languages
http://arxiv.org/abs/2009.13509v1


Agent57: Outperforming the Atari Human Benchmark
http://arxiv.org/abs/2003.13350v1


Aggregation of Multiple Knockoffs
http://arxiv.org/abs/2002.09269v2


Algorithmic Recourse: from Counterfactual Explanations to Interventions
http://arxiv.org/abs/2002.06278v4


Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks
http://arxiv.org/abs/2006.12476v1


Aligned Cross Entropy for Non-Autoregressive Machine Translation
http://arxiv.org/abs/2004.01655v1


Alignment-based compositional semantics for instruction following
http://arxiv.org/abs/1508.06491v2


All Fingers are not Equal: Intensity of References in Scientific Articles
http://arxiv.org/abs/1609.00081v1


All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference
http://arxiv.org/abs/2007.00642v1


Alleviating Privacy Attacks via Causal Learning
http://arxiv.org/abs/1909.12732v4


Almost Tune-Free Variance Reduction
http://arxiv.org/abs/1908.09345v2


Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference
http://arxiv.org/abs/2003.00964v1


AmbigQA: Answering Ambiguous Open-domain Questions
http://arxiv.org/abs/2004.10645v2


Amharic Abstractive Text Summarization
http://arxiv.org/abs/2003.13721v1


Amodal 3D Reconstruction for Robotic Manipulation via Stability and Connectivity
http://arxiv.org/abs/2009.13146v1


Amortised Learning by Wake-Sleep
http://arxiv.org/abs/2002.09737v2


Amortized Inference of Variational Bounds for Learning Noisy-OR
http://arxiv.org/abs/1906.02428v2


Amortized Population Gibbs Samplers with Neural Sufficient Statistics
http://arxiv.org/abs/1911.01382v3


Amortized learning of neural causal representations
http://arxiv.org/abs/2008.09301v1


An AMR Aligner Tuned by Transition-based Parser
http://arxiv.org/abs/1810.03541v1


An Accelerated DFO Algorithm for Finite-sum Convex Functions
http://arxiv.org/abs/2007.03311v2


An Analysis of Action Recognition Datasets for Language and Vision Tasks
http://arxiv.org/abs/1704.07129v1


An Analysis of the Utility of Explicit Negative Examples to Improve the Syntactic Abilities of Neural Language Models
http://arxiv.org/abs/2004.02451v3


An EM Approach to Non-autoregressive Conditional Sequence Generation
http://arxiv.org/abs/2006.16378v1


An Effective Approach to Unsupervised Machine Translation
http://arxiv.org/abs/1902.01313v2


An Effective Transition-based Model for Discontinuous NER
http://arxiv.org/abs/2004.13454v1


An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results
http://arxiv.org/abs/2006.01245v1


An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models
http://arxiv.org/abs/1902.10547v3


An Empirical Investigation Towards Efficient Multi-Domain Language Model Pre-training
http://arxiv.org/abs/2010.00784v1


An Empirical Investigation of Contextualized Number Prediction
http://arxiv.org/abs/2011.07961v1


An Empirical Investigation of Global and Local Normalization for Recurrent Neural Sequence Models Using a Continuous Relaxation to Beam Search
http://arxiv.org/abs/1904.06834v1


An Empirical Study of Generation Order for Machine Translation
http://arxiv.org/abs/1910.13437v1


An Empirical Study of Pre-trained Transformers for Arabic Information Extraction
http://arxiv.org/abs/2004.14519v5


An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels
http://arxiv.org/abs/2010.01653v1


An Empirical Study on Model-agnostic Debiasing Strategies for Robust Natural Language Inference
http://arxiv.org/abs/2010.03777v2


An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models
http://arxiv.org/abs/2007.06778v3


An Experiment on Leveraging SHAP Values to Investigate Racial Bias
http://arxiv.org/abs/2011.09865v1


An Explicitly Relational Neural Network Architecture
http://arxiv.org/abs/1905.10307v4


An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks
http://arxiv.org/abs/2010.02789v1


An Exploratory Study of Argumentative Writing by Young Students: A Transformer-based Approach
http://arxiv.org/abs/2006.09873v1


An Imitation Game for Learning Semantic Parsers from User Interaction
http://arxiv.org/abs/2005.00689v3


An Imitation Learning Approach for Cache Replacement
http://arxiv.org/abs/2006.16239v2


An Imitation Learning Approach to Unsupervised Parsing
http://arxiv.org/abs/1906.02276v1


An Interpretable Knowledge Transfer Model for Knowledge Base Completion
http://arxiv.org/abs/1704.05908v2


An Inverse-free Truncated Rayleigh-Ritz Method for Sparse Generalized Eigenvalue Problem
http://arxiv.org/abs/2003.10897v1


An Investigation of Why Overparameterization Exacerbates Spurious Correlations
http://arxiv.org/abs/2005.04345v3


An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays
http://arxiv.org/abs/1910.06054v2


An Unsupervised Joint System for Text Generation from Knowledge Graphs and Semantic Parsing
http://arxiv.org/abs/1904.09447v4


An Unsupervised Method for Uncovering Morphological Chains
http://arxiv.org/abs/1503.02335v1


An Unsupervised Probability Model for Speech-to-Translation Alignment of Low-Resource Languages
http://arxiv.org/abs/1609.08139v1


An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm
http://arxiv.org/abs/1805.10341v3


An end-to-end approach for the verification problem: learning the right distance
http://arxiv.org/abs/2002.09469v4


An information theoretic view on selecting linguistic probes
http://arxiv.org/abs/2009.07364v2


Analogies minus analogy test: measuring regularities in word embeddings
http://arxiv.org/abs/2010.03446v1


Analogous Process Structure Induction for Sub-event Sequence Prediction
http://arxiv.org/abs/2010.08525v1


Analogs of Linguistic Structure in Deep Representations
http://arxiv.org/abs/1707.08139v1


Analysing Lexical Semantic Change with Contextualised Word Representations
http://arxiv.org/abs/2004.14118v1


Analysis of Automatic Annotation Suggestions for Hard Discourse-Level Tasks in Expert Domains
http://arxiv.org/abs/1906.02564v1


Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks
http://arxiv.org/abs/2002.06597v1


Analyzing Individual Neurons in Pre-trained Language Models
http://arxiv.org/abs/2010.02695v1


Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
http://arxiv.org/abs/1905.09418v2


Analyzing Neural Discourse Coherence Models
http://arxiv.org/abs/2011.06306v1


Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings
http://arxiv.org/abs/1904.01596v2


Analyzing Political Parody in Social Media
http://arxiv.org/abs/2004.13878v2


Analyzing Redundancy in Pretrained Transformer Models
http://arxiv.org/abs/2004.04010v2


Analyzing analytical methods: The case of phonology in neural models of spoken language
http://arxiv.org/abs/2004.07070v2


Analyzing autoencoder-based acoustic word embeddings
http://arxiv.org/abs/2004.01647v1


Analyzing the Limitations of Cross-lingual Word Embedding Mappings
http://arxiv.org/abs/1906.05407v1


Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics
http://arxiv.org/abs/2007.07400v1


Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge
http://arxiv.org/abs/1611.10277v4


Anchoring and Agreement in Syntactic Annotations
http://arxiv.org/abs/1605.04481v3


Anderson Acceleration of Proximal Gradient Methods
http://arxiv.org/abs/1910.08590v2


Angular Visual Hardness
http://arxiv.org/abs/1912.02279v4


Answer-based Adversarial Training for Generating Clarification Questions
http://arxiv.org/abs/1904.02281v1


Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task
http://arxiv.org/abs/1804.05940v1


Approximate Cross-Validation in High Dimensions with Guarantees
http://arxiv.org/abs/1905.13657v4


Approximate Cross-validation: Guarantees for Model Assessment and Selection
http://arxiv.org/abs/2003.00617v2


Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions
http://arxiv.org/abs/1910.06862v1


Approximate is Good Enough: Probabilistic Variants of Dimensional and Margin Complexity
http://arxiv.org/abs/2003.04180v1


Approximating Stacked and Bidirectional Recurrent Architectures with the Delayed Recurrent Neural Network
http://arxiv.org/abs/1909.00021v2


Approximation Capabilities of Neural ODEs and Invertible Residual Networks
http://arxiv.org/abs/1907.12998v2


Approximation Guarantees of Local Search Algorithms via Localizability of Set Functions
http://arxiv.org/abs/2006.01400v1


Approximation Schemes for ReLU Regression
http://arxiv.org/abs/2005.12844v2


Approximation-Aware Dependency Parsing by Belief Propagation
http://arxiv.org/abs/1508.02375v1


AraDIC: Arabic Document Classification using Image-Based Character Embeddings and Class-Balanced Loss
http://arxiv.org/abs/2006.11586v1


Arc-swift: A Novel Transition System for Dependency Parsing
http://arxiv.org/abs/1705.04434v1


Architecture Agnostic Neural Networks
http://arxiv.org/abs/2011.02712v2


Are All Good Word Vector Spaces Isomorphic?
http://arxiv.org/abs/2004.04070v2


Are All Languages Created Equal in Multilingual BERT?
http://arxiv.org/abs/2005.09093v2


Are BLEU and Meaning Representation in Opposition?
http://arxiv.org/abs/1805.06536v1


Are Hyperbolic Representations in Graphs Created Equal?
http://arxiv.org/abs/2007.07698v1


Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition
http://arxiv.org/abs/2004.03066v2


Are Pretrained Language Models Symbolic Reasoners Over Knowledge?
http://arxiv.org/abs/2006.10413v2


Are Some Words Worth More than Others?
http://arxiv.org/abs/2010.06069v2


Are You Convinced? Choosing the More Convincing Evidence with a Siamese Network
http://arxiv.org/abs/1907.08971v2


Argument Generation with Retrieval, Planning, and Realization
http://arxiv.org/abs/1906.03717v1


Argument Invention from First Principles
http://arxiv.org/abs/1908.08336v1


Argument Mining for Understanding Peer Reviews
http://arxiv.org/abs/1903.10104v1


Argument Mining with Structured SVMs and RNNs
http://arxiv.org/abs/1704.06869v1


Artemis: A Novel Annotation Methodology for Indicative Single Document Summarization
http://arxiv.org/abs/2005.02146v2


Artificial Intelligence for Global Health: Learning From a Decade of Digital Transformation in Health Care
http://arxiv.org/abs/2005.12378v2


Asking and Answering Questions to Evaluate the Factual Consistency of Summaries
http://arxiv.org/abs/2004.04228v1


Asking without Telling: Exploring Latent Ontologies in Contextual Representations
http://arxiv.org/abs/2004.14513v2


Aspect Level Sentiment Classification with Deep Memory Network
http://arxiv.org/abs/1605.08900v2


Assessing Human Translations from French to Bambara for Machine Learning: a Pilot Study
http://arxiv.org/abs/2004.00068v1


Assessing Phrasal Representation and Composition in Transformers
http://arxiv.org/abs/2010.03763v2


Assessing Robustness to Noise: Low-Cost Head CT Triage
http://arxiv.org/abs/2003.07977v2


Assessing racial inequality in COVID-19 testing with Bayesian threshold tests
http://arxiv.org/abs/2011.01179v1


Assessing the Ability of Self-Attention Networks to Learn Word Order
http://arxiv.org/abs/1906.00592v1


Assessing the Helpfulness of Learning Materials with Inference-Based Learner-Like Agent
http://arxiv.org/abs/2010.02179v1


Associative Memory in Iterated Overparameterized Sigmoid Autoencoders
http://arxiv.org/abs/2006.16540v2


Asymmetric Private Set Intersection with Applications to Contact Tracing and Private Vertical Federated Machine Learning
http://arxiv.org/abs/2011.09350v1


Asymmetric self-play for automatic goal discovery in robotic manipulation
http://arxiv.org/abs/2101.04882v1


Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms
http://arxiv.org/abs/2002.10526v1


Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning
http://arxiv.org/abs/2001.10742v1


Asynchronous Gibbs Sampling
http://arxiv.org/abs/1509.08999v7


Attacking Neural Text Detectors
http://arxiv.org/abs/2002.11768v3


Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization
http://arxiv.org/abs/2005.00163v1


Attending the Emotions to Detect Online Abusive Language
http://arxiv.org/abs/1909.03100v1


Attention Guided Graph Convolutional Networks for Relation Extraction
http://arxiv.org/abs/1906.07510v8


Attention Is All You Need for Chinese Word Segmentation
http://arxiv.org/abs/1910.14537v3


Attention Strategies for Multi-Source Sequence-to-Sequence Learning
http://arxiv.org/abs/1704.06567v1


Attention is Not Only a Weight: Analyzing Transformers with Vector Norms
http://arxiv.org/abs/2004.10102v2


Attention is not Explanation
http://arxiv.org/abs/1902.10186v3


Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation
http://arxiv.org/abs/1904.07209v1


Attention-over-Attention Neural Networks for Reading Comprehension
http://arxiv.org/abs/1607.04423v4


Attentive Group Equivariant Convolutional Networks
http://arxiv.org/abs/2002.03830v3


Audio-Visual Understanding of Passenger Intents for In-Cabin Conversational Agents
http://arxiv.org/abs/2007.03876v1


Augmented Natural Language for Generative Sequence Labeling
http://arxiv.org/abs/2009.13272v1


Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context
http://arxiv.org/abs/2006.06259v1


Augmenting Neural Networks with First-order Logic
http://arxiv.org/abs/1906.06298v3


Augmenting word2vec with latent Dirichlet allocation within a clinical application
http://arxiv.org/abs/1808.03967v1


Author Commitment and Social Power: Automatic Belief Tagging to Infer the Social Context of Interactions
http://arxiv.org/abs/1805.06016v1


Auto-Rotating Perceptrons
http://arxiv.org/abs/1910.02483v2


Auto-Sizing Neural Networks: With Applications to n-gram Language Models
http://arxiv.org/abs/1508.05051v1


AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes
http://arxiv.org/abs/1507.01127v1


AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data
http://arxiv.org/abs/2003.06505v1


AutoML-Zero: Evolving Machine Learning Algorithms From Scratch
http://arxiv.org/abs/2003.03384v2


Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization
http://arxiv.org/abs/1805.04869v1


Autoencoding Pixies: Amortised Variational Inference with Graph Convolutions for Functional Distributional Semantics
http://arxiv.org/abs/2005.02991v2


Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models
http://arxiv.org/abs/2002.11451v1


Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring
http://arxiv.org/abs/2008.01809v1


Automatic Detection of Generated Text is Easiest when Humans are Fooled
http://arxiv.org/abs/1911.00650v2


Automatic Differentiation of Some First-Order Methods in Parametric Optimization
http://arxiv.org/abs/1910.05696v1


Automatic Estimation of Simultaneous Interpreter Performance
http://arxiv.org/abs/1805.04016v2


Automatic Event Salience Identification
http://arxiv.org/abs/1809.00647v1


Automatic Extraction of Rules Governing Morphological Agreement
http://arxiv.org/abs/2010.01160v2


Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation
http://arxiv.org/abs/1906.01834v1


Automatic Metric Validation for Grammatical Error Correction
http://arxiv.org/abs/1804.11225v2


Automatic Reference-Based Evaluation of Pronoun Translation Misses the Point
http://arxiv.org/abs/1808.04164v1


Automatic Shortcut Removal for Self-Supervised Representation Learning
http://arxiv.org/abs/2002.08822v3


Automatic semantic segmentation for prediction of tuberculosis using lens-free microscopy images
http://arxiv.org/abs/2007.02482v1


Automatically Identifying Complaints in Social Media
http://arxiv.org/abs/1906.03890v1


Automatically Ranked Russian Paraphrase Corpus for Text Generation
http://arxiv.org/abs/2006.09719v1


Autoregressive Knowledge Distillation through Imitation Learning
http://arxiv.org/abs/2009.07253v2


Average-case Acceleration Through Spectral Density Estimation
http://arxiv.org/abs/2002.04756v5


Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA
http://arxiv.org/abs/1906.07132v1


Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training
http://arxiv.org/abs/2004.07790v4


AxCell: Automatic Extraction of Results from Machine Learning Papers
http://arxiv.org/abs/2004.14356v1


BAE: BERT-based Adversarial Examples for Text Classification
http://arxiv.org/abs/2004.01970v3


BAM! Born-Again Multi-Task Networks for Natural Language Understanding
http://arxiv.org/abs/1907.04829v1


BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension
http://arxiv.org/abs/1910.13461v1


BERT Fine-tuning For Arabic Text Summarization
http://arxiv.org/abs/2004.14135v1


BERT Knows Punta Cana is not just beautiful, it's gorgeous: Ranking Scalar Adjectives with Contextualised Representations
http://arxiv.org/abs/2010.02686v1


BERT-ATTACK: Adversarial Attack Against BERT Using BERT
http://arxiv.org/abs/2004.09984v3


BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance
http://arxiv.org/abs/2010.06133v1


BERT-XML: Large Scale Automated ICD Coding Using BERT Pretraining
http://arxiv.org/abs/2006.03685v1


BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
http://arxiv.org/abs/2002.02925v4


BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
http://arxiv.org/abs/1810.04805v2


BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance
http://arxiv.org/abs/1910.07181v3


BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding
http://arxiv.org/abs/1909.04948v2


BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance
http://arxiv.org/abs/1911.02969v2


BINOCULARS for Efficient, Nonmyopic Sequential Experimental Design
http://arxiv.org/abs/1909.04568v3


BLEU Neighbors: A Reference-less Approach to Automatic Evaluation
http://arxiv.org/abs/2004.12726v3


BLEU might be Guilty but References are not Innocent
http://arxiv.org/abs/2004.06063v2


BLEURT: Learning Robust Metrics for Text Generation
http://arxiv.org/abs/2004.04696v5


BPE-Dropout: Simple and Effective Subword Regularization
http://arxiv.org/abs/1910.13267v2


BabyAI++: Towards Grounded-Language Learning beyond Memorization
http://arxiv.org/abs/2004.07200v1


BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps
http://arxiv.org/abs/2005.04625v2


Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning
http://arxiv.org/abs/2010.05906v3


Backpropagating through Structured Argmax using a SPIGOT
http://arxiv.org/abs/1805.04658v1


Balanced off-policy evaluation in general action spaces
http://arxiv.org/abs/1906.03694v4


Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning
http://arxiv.org/abs/2003.06740v4


Balancing Cost and Benefit with Tied-Multi Transformers
http://arxiv.org/abs/2002.08614v1


Balancing Gaussian vectors in high dimension
http://arxiv.org/abs/1910.13972v2


Balancing Objectives in Counseling Conversations: Advancing Forwards or Looking Backwards
http://arxiv.org/abs/2005.04245v1


Balancing Training for Multilingual Neural Machine Translation
http://arxiv.org/abs/2004.06748v4


Bandit Convex Optimization in Non-stationary Environments
http://arxiv.org/abs/1907.12340v2


Bandit optimisation of functions in the Matérn kernel RKHS
http://arxiv.org/abs/2001.10396v2


BanditSum: Extractive Summarization as a Contextual Bandit
http://arxiv.org/abs/1809.09672v3


Bandits for BMO Functions
http://arxiv.org/abs/2007.08703v1


Bandits with adversarial scaling
http://arxiv.org/abs/2003.02287v2


Barking up the right tree: an approach to search over molecule synthesis DAGs
http://arxiv.org/abs/2012.11522v1


BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders
http://arxiv.org/abs/2003.03462v1


Batch Stationary Distribution Estimation
http://arxiv.org/abs/2003.00722v1


Batch-Constrained Distributional Reinforcement Learning for Session-based Recommendation
http://arxiv.org/abs/2012.08984v1


Batched Multi-armed Bandits Problem
http://arxiv.org/abs/1904.01763v3


Bayesian Differential Privacy for Machine Learning
http://arxiv.org/abs/1901.09697v5


Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation
http://arxiv.org/abs/2002.08129v3


Bayesian Graph Neural Networks with Adaptive Connection Sampling
http://arxiv.org/abs/2006.04064v3


Bayesian Hierarchical Words Representation Learning
http://arxiv.org/abs/2004.07126v1


Bayesian Image Classification with Deep Convolutional Gaussian Processes
http://arxiv.org/abs/1902.05888v2


Bayesian Learning from Sequential Data using Gaussian Processes with Signature Covariances
http://arxiv.org/abs/1906.08215v2


Bayesian Optimisation over Multiple Continuous and Categorical Inputs
http://arxiv.org/abs/1906.08878v2


Bayesian Optimization for Iterative Learning
http://arxiv.org/abs/1909.09593v4


Bayesian Optimization of Text Representations
http://arxiv.org/abs/1503.00693v1


Bayesian Reinforcement Learning via Deep, Sparse Sampling
http://arxiv.org/abs/1902.02661v4


Bayesian aggregation improves traditional single image crop classification approaches
http://arxiv.org/abs/2004.03468v1


Bayesian experimental design using regularized determinantal point processes
http://arxiv.org/abs/1906.04133v1


Be More with Less: Hypergraph Attention Networks for Inductive Text Classification
http://arxiv.org/abs/2011.00387v1


BeBold: Exploration Beyond the Boundary of Explored Regions
http://arxiv.org/abs/2012.08621v1


Before Name-calling: Dynamics and Triggers of Ad Hominem Fallacies in Web Argumentation
http://arxiv.org/abs/1802.06613v2


Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness
http://arxiv.org/abs/1805.04212v1


Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks
http://arxiv.org/abs/2002.10118v2


Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets
http://arxiv.org/abs/1704.07121v2


Benchmarking Graph Neural Networks
http://arxiv.org/abs/2003.00982v3


Benchmarking Multimodal Regex Synthesis with Complex Structures
http://arxiv.org/abs/2005.00663v1


Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting
http://arxiv.org/abs/2001.08655v3


Best-First Beam Search
http://arxiv.org/abs/2007.03909v2


Best-item Learning in Random Utility Models with Subset Choices
http://arxiv.org/abs/2002.07994v1


Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs
http://arxiv.org/abs/2010.11465v1


Better Depth-Width Trade-offs for Neural Networks through the lens of Dynamical Systems
http://arxiv.org/abs/2003.00777v2


Better Document-Level Machine Translation with Bayes' Rule
http://arxiv.org/abs/1910.00553v2


Better Highlighting: Creating Sub-Sentence Summary Highlights
http://arxiv.org/abs/2010.10566v1


Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer
http://arxiv.org/abs/1905.11978v2


Beyond Accuracy: Behavioral Testing of NLP models with CheckList
http://arxiv.org/abs/2005.04118v1


Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter
http://arxiv.org/abs/1809.00120v2


Beyond Exponentially Discounted Sum: Automatic Learning of Return Function
http://arxiv.org/abs/1905.11591v2


Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube
http://arxiv.org/abs/2004.14338v2


Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels
http://arxiv.org/abs/1911.09781v3


Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles
http://arxiv.org/abs/2002.04926v2


Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation
http://arxiv.org/abs/2005.10716v2


Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness
http://arxiv.org/abs/1906.08482v3


Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat
http://arxiv.org/abs/1809.03408v2


Bi-Level Graph Neural Networks for Drug-Drug Interaction Prediction
http://arxiv.org/abs/2006.14002v1


Bi-directional Attention with Agreement for Dependency Parsing
http://arxiv.org/abs/1608.02076v2


BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues
http://arxiv.org/abs/2010.10095v1


Bidirectional Attentive Memory Networks for Question Answering over Knowledge Bases
http://arxiv.org/abs/1903.02188v3


Bidirectional Model-based Policy Optimization
http://arxiv.org/abs/2007.01995v2


Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences
http://arxiv.org/abs/2007.02671v1


Bilingual Lexicon Induction through Unsupervised Machine Translation
http://arxiv.org/abs/1907.10761v1


Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces
http://arxiv.org/abs/1908.06625v1


Bio-Inspired Hashing for Unsupervised Similarity Search
http://arxiv.org/abs/2001.04907v2


BioMegatron: Larger Biomedical Domain Language Model
http://arxiv.org/abs/2010.06060v2


Biomedical Entity Representations with Synonym Marginalization
http://arxiv.org/abs/2005.00239v1


Biomedical Information Extraction for Disease Gene Prioritization
http://arxiv.org/abs/2011.05188v2


Bipartite Flat-Graph Network for Nested Named Entity Recognition
http://arxiv.org/abs/2005.00436v1


Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models
http://arxiv.org/abs/2005.00683v2


Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection
http://arxiv.org/abs/1912.06983v1


Black Box Submodular Maximization: Discrete and Continuous Settings
http://arxiv.org/abs/1901.09515v2


Black Loans Matter: Distributionally Robust Fairness for Fighting Subgroup Discrimination
http://arxiv.org/abs/2012.01193v1


Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings
http://arxiv.org/abs/1904.04047v3


Black-box Certification and Learning under Adversarial Perturbations
http://arxiv.org/abs/2006.16520v1


Black-box Methods for Restoring Monotonicity
http://arxiv.org/abs/2003.09554v1


Blank Language Models
http://arxiv.org/abs/2002.03079v2


Bleaching Text: Abstract Features for Cross-lingual Gender Prediction
http://arxiv.org/abs/1805.03122v1


BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates
http://arxiv.org/abs/2006.14218v2


BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
http://arxiv.org/abs/1905.10044v1


Boosting Entity Linking Performance by Leveraging Unlabeled Documents
http://arxiv.org/abs/1906.01250v1


Boosting Frank-Wolfe by Chasing Gradients
http://arxiv.org/abs/2003.06369v2


Boosting for Control of Dynamical Systems
http://arxiv.org/abs/1906.08720v2


Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning
http://arxiv.org/abs/2004.14646v1


Bootstrapped Q-learning with Context Relevant Observation Pruning to Generalize in Text-based Games
http://arxiv.org/abs/2009.11896v1


Bootstrapping Generators from Noisy Data
http://arxiv.org/abs/1804.06385v4


Bootstrapping Named Entity Recognition in E-Commerce with Positive Unlabeled Learning
http://arxiv.org/abs/2005.11075v1


Bootstrapping Techniques for Polysynthetic Morphological Analysis
http://arxiv.org/abs/2005.00956v1


Born-Again Tree Ensembles
http://arxiv.org/abs/2003.11132v3


Bounding, Concentrating, and Truncating: Unifying Privacy Loss Composition for Data Analytics
http://arxiv.org/abs/2004.07223v3


Bounds in Query Learning
http://arxiv.org/abs/1904.10122v1


Break It Down: A Question Understanding Benchmark
http://arxiv.org/abs/2001.11770v1


Breaking NLI Systems with Sentences that Require Simple Lexical Inferences
http://arxiv.org/abs/1805.02266v1


Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning
http://arxiv.org/abs/2006.11917v1


Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search
http://arxiv.org/abs/2007.07197v2


Breast Cancer Detection Using Convolutional Neural Networks
http://arxiv.org/abs/2003.07911v3


Bridging Anaphora Resolution as Question Answering
http://arxiv.org/abs/2004.07898v3


Bridging Information-Seeking Human Gaze and Machine Reading Comprehension
http://arxiv.org/abs/2009.14780v2


Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations
http://arxiv.org/abs/2004.14923v2


Bridging the Gap between Training and Inference for Neural Machine Translation
http://arxiv.org/abs/1906.02448v2


Bringing Stories Alive: Generating Interactive Fiction Worlds
http://arxiv.org/abs/2001.10161v1


Budget Learning via Bracketing
http://arxiv.org/abs/2004.06298v1


Budget-Constrained Bandits over General Cost and Reward Distributions
http://arxiv.org/abs/2003.00365v1


C-Learning: Horizon-Aware Cumulative Accessibility Estimation
http://arxiv.org/abs/2011.12363v2


C-Learning: Learning to Achieve Goals via Recursive Classification
http://arxiv.org/abs/2011.08909v1


CAT-Gen: Improving Robustness in NLP Models via Controlled Adversarial Text Generation
http://arxiv.org/abs/2010.02338v1


CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods
http://arxiv.org/abs/2002.07906v1


CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management
http://arxiv.org/abs/2005.03975v3


CDL: Curriculum Dual Learning for Emotion-Controllable Response Generation
http://arxiv.org/abs/2005.00329v5


CITE: A Corpus of Image-Text Discourse Relations
http://arxiv.org/abs/1904.06286v2


CLEVR Parser: A Graph Parser Library for Geometric Learning on Language Grounded Image Scenes
http://arxiv.org/abs/2009.09154v2


CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog
http://arxiv.org/abs/1903.03166v2


CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information
http://arxiv.org/abs/2006.12013v6


CNM: An Interpretable Complex-valued Network for Matching
http://arxiv.org/abs/1904.05298v1


CNN-based Approach for Cervical Cancer Classification in Whole-Slide Histopathology Images
http://arxiv.org/abs/2005.13924v1


COD3S: Diverse Generation with Discrete Semantic Signatures
http://arxiv.org/abs/2010.02882v1


COMET: A Neural Framework for MT Evaluation
http://arxiv.org/abs/2009.09025v2


COMETA: A Corpus for Medical Entity Linking in the Social Media
http://arxiv.org/abs/2010.03295v2


COVID-19 Literature Topic-Based Search via Hierarchical NMF
http://arxiv.org/abs/2009.09074v1


CUNI Systems for the Unsupervised and Very Low Resource Translation Task in WMT20
http://arxiv.org/abs/2010.11747v1


CURL: Contrastive Unsupervised Representations for Reinforcement Learning
http://arxiv.org/abs/2004.04136v4


Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data
http://arxiv.org/abs/2010.11506v1


Calibrated Surrogate Losses for Adversarially Robust Classification
http://arxiv.org/abs/2005.13748v1


Calibrated Surrogate Maximization of Linear-fractional Utility in Binary Classification
http://arxiv.org/abs/1905.12511v2


Calibrated Top-1 Uncertainty estimates for classification by score based models
http://arxiv.org/abs/1903.09215v4


Calibrating Structured Output Predictors for Natural Language Processing
http://arxiv.org/abs/2004.04361v2


Calibration of Pre-trained Transformers
http://arxiv.org/abs/2003.07892v3


Calibration, Entropy Rates, and Memory in Language Models
http://arxiv.org/abs/1906.05664v1


CamemBERT: a Tasty French Language Model
http://arxiv.org/abs/1911.03894v3


Can Automatic Post-Editing Improve NMT?
http://arxiv.org/abs/2009.14395v1


Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?
http://arxiv.org/abs/2006.14911v2


Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?
http://arxiv.org/abs/2003.01629v2


Can Neural Machine Translation be Improved with User Feedback?
http://arxiv.org/abs/1804.05958v1


Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills
http://arxiv.org/abs/2004.08449v1


Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
http://arxiv.org/abs/1809.02789v1


Carbontracker: Tracking and Predicting the Carbon Footprint of Training Deep Learning Models
http://arxiv.org/abs/2007.03051v1


Cascaded Mutual Modulation for Visual Reasoning
http://arxiv.org/abs/1809.01943v1


Catch Me if I Can: Detecting Strategic Behaviour in Peer Assessment
http://arxiv.org/abs/2010.04041v1


Categorical Metadata Representation for Customized Text Classification
http://arxiv.org/abs/1902.05196v1


Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech
http://arxiv.org/abs/2006.08387v2


Causal Bayesian Optimization
http://arxiv.org/abs/2005.11741v2


Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning
http://arxiv.org/abs/2010.03110v1


Causal Effect Estimation and Optimal Dose Suggestions in Mobile Health
http://arxiv.org/abs/2007.09812v2


Causal Feature Discovery through Strategic Modification
http://arxiv.org/abs/2002.07024v2


Causal Inference of Script Knowledge
http://arxiv.org/abs/2004.01174v1


Causal Inference using Gaussian Processes with Structured Latent Confounders
http://arxiv.org/abs/2007.07127v1


Causal Learning by a Robot with Semantic-Episodic Memory in an Aesop's Fable Experiment
http://arxiv.org/abs/2003.00274v1


Causal Modeling for Fairness in Dynamical Systems
http://arxiv.org/abs/1909.09141v2


Causal Structure Discovery from Distributions Arising from Mixtures of DAGs
http://arxiv.org/abs/2001.11940v2


Causal inference in degenerate systems: An impossibility result
http://arxiv.org/abs/1711.04466v3


Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings
http://arxiv.org/abs/2008.06622v1


Censored Quantile Regression Forest
http://arxiv.org/abs/2001.03458v1


Certified Data Removal from Machine Learning Models
http://arxiv.org/abs/1911.03030v5


Certified Robustness to Label-Flipping Attacks via Randomized Smoothing
http://arxiv.org/abs/2002.03018v4


Challenges in Emotion Style Transfer: An Exploration with a Lexical Substitution Pipeline
http://arxiv.org/abs/2005.07617v1


Channel Equilibrium Networks for Learning Deep Representation
http://arxiv.org/abs/2003.00214v1


Chapter Captor: Text Segmentation in Novels
http://arxiv.org/abs/2011.04163v1


CharManteau: Character Embedding Models For Portmanteau Creation
http://arxiv.org/abs/1707.01176v2


Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT
http://arxiv.org/abs/2011.04308v1


Characterization of Overlap in Observational Studies
http://arxiv.org/abs/1907.04138v3


Characterizing Distribution Equivalence and Structure Learning for Cyclic and Acyclic Directed Graphs
http://arxiv.org/abs/1910.12993v3


Characterizing Private Clipped Gradient Descent on Convex Generalized Linear Problems
http://arxiv.org/abs/2006.06783v1


Characterizing the Latent Space of Molecular Deep Generative Models with Persistent Homology Metrics
http://arxiv.org/abs/2010.08548v1


CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT
http://arxiv.org/abs/2004.09167v3


Choice Set Optimization Under Discrete Choice Models of Group Decisions
http://arxiv.org/abs/2002.00421v2


ChrEn: Cherokee-English Machine Translation for Endangered Language Revitalization
http://arxiv.org/abs/2010.04791v1


Circuit-Based Intrinsic Methods to Detect Overfitting
http://arxiv.org/abs/1907.01991v2


ClarQ: A large-scale and diverse dataset for Clarification Question Generation
http://arxiv.org/abs/2006.05986v2


Classical Structured Prediction Losses for Sequence to Sequence Learning
http://arxiv.org/abs/1711.04956v5


Classification with Strategically Withheld Data
http://arxiv.org/abs/2012.10203v2


Classifying Syntactic Errors in Learner Language
http://arxiv.org/abs/2010.11032v2


Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset
http://arxiv.org/abs/2005.00574v1


Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation
http://arxiv.org/abs/1912.11975v1


Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning
http://arxiv.org/abs/2006.06649v2


Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain
http://arxiv.org/abs/2005.09397v1


Closing the convergence gap of SGD without replacement
http://arxiv.org/abs/2002.10400v6


Closure Properties for Private Classification and Online Prediction
http://arxiv.org/abs/2003.04509v3


Clue: Cross-modal Coherence Modeling for Caption Generation
http://arxiv.org/abs/2005.00908v1


CoDEx: A Comprehensive Knowledge Graph Completion Benchmark
http://arxiv.org/abs/2009.07810v2


Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling
http://arxiv.org/abs/2004.11727v1


Coarse-to-Fine Decoding for Neural Semantic Parsing
http://arxiv.org/abs/1805.04793v1


Code and Named Entity Recognition in StackOverflow
http://arxiv.org/abs/2005.01634v3


Code-switching patterns can be an effective route to improve performance of downstream NLP applications: A case study of humour, sarcasm and hate speech detection
http://arxiv.org/abs/2005.02295v1


Cognitive Graph for Multi-Hop Reading Comprehension at Scale
http://arxiv.org/abs/1905.05460v2


CognitiveCNN: Mimicking Human Cognitive Models to resolve Texture-Shape Bias
http://arxiv.org/abs/2006.14722v1


Cold-start Active Learning through Self-supervised Language Modeling
http://arxiv.org/abs/2010.09535v2


Collaborative Machine Learning with Incentive-Aware Model Rewards
http://arxiv.org/abs/2010.12797v1


Collapsed Amortized Variational Inference for Switching Nonlinear Dynamical Systems
http://arxiv.org/abs/1910.09588v2


Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation
http://arxiv.org/abs/1804.08207v2


Colorless green recurrent networks dream hierarchically
http://arxiv.org/abs/1803.11138v1


Colors in Context: A Pragmatic Neural Model for Grounded Language Understanding
http://arxiv.org/abs/1703.10186v2


Combating False Negatives in Adversarial Imitation Learning
http://arxiv.org/abs/2002.00412v1


Combining Pretrained High-Resource Embeddings and Subword Representations for Low-Resource Languages
http://arxiv.org/abs/2003.04419v3


Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection
http://arxiv.org/abs/2010.15360v1


Combining Sentiment Lexica with a Multi-View Variational Autoencoder
http://arxiv.org/abs/1904.02839v1


Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
http://arxiv.org/abs/2005.11787v2


Commonsense for Generative Multi-Hop Question Answering Tasks
http://arxiv.org/abs/1809.06309v3


CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge
http://arxiv.org/abs/1811.00937v2


Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls
http://arxiv.org/abs/1910.07703v1


Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks
http://arxiv.org/abs/2005.02426v2


CompRes: A Dataset for Narrative Structure in News
http://arxiv.org/abs/2007.04874v1


Compact Personalized Models for Neural Machine Translation
http://arxiv.org/abs/1811.01990v1


Comparative Analysis of Text Classification Approaches in Electronic Health Records
http://arxiv.org/abs/2005.06624v1


Comparatives, Quantifiers, Proportions: A Multi-Task Model for the Learning of Quantities from Vision
http://arxiv.org/abs/1804.05018v1


Comparing recurrent and convolutional neural networks for predicting wave propagation
http://arxiv.org/abs/2002.08981v3


Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models
http://arxiv.org/abs/2011.02998v1


Competence-based Curriculum Learning for Neural Machine Translation
http://arxiv.org/abs/1903.09848v2


Competing Bandits in Matching Markets
http://arxiv.org/abs/1906.05363v2


Competitive Mirror Descent
http://arxiv.org/abs/2006.10179v1


Complete Multilingual Neural Machine Translation
http://arxiv.org/abs/2010.10239v1


Complexity Guarantees for Polyak Steps with Momentum
http://arxiv.org/abs/2002.00915v2


Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification
http://arxiv.org/abs/1904.02767v1


Compositional Demographic Word Embeddings
http://arxiv.org/abs/2010.02986v2


Compositional Questions Do Not Necessitate Multi-hop Reasoning
http://arxiv.org/abs/1906.02900v1


Compositional Semantic Parsing on Semi-Structured Tables
http://arxiv.org/abs/1508.00305v1


Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA
http://arxiv.org/abs/2009.08257v1


Compositionality and Generalization in Emergent Languages
http://arxiv.org/abs/2004.09124v1


Comprehensive Supersense Disambiguation of English Prepositions and Possessives
http://arxiv.org/abs/1805.04905v1


Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
http://arxiv.org/abs/2002.08307v2


Compressive Summarization with Plausibility and Salience Modeling
http://arxiv.org/abs/2010.07886v1


Computing Tight Differential Privacy Guarantees Using FFT
http://arxiv.org/abs/1906.03049v2


ConQUR: Mitigating Delusional Bias in Deep Q-learning
http://arxiv.org/abs/2002.12399v1


ConStance: Modeling Annotation Contexts to Improve Stance Classification
http://arxiv.org/abs/1708.06309v1


Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions
http://arxiv.org/abs/1901.00997v2


Concept Bottleneck Models
http://arxiv.org/abs/2007.04612v3


Concise Explanations of Neural Networks using Adversarial Training
http://arxiv.org/abs/1810.06583v9


Concluding remarks
http://arxiv.org/abs/astro-ph/0612056v1


Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation
http://arxiv.org/abs/2004.14769v2


Conditional Flow Variational Autoencoders for Structured Sequence Prediction
http://arxiv.org/abs/1908.09008v3


Conditional Generation and Snapshot Learning in Neural Dialogue Systems
http://arxiv.org/abs/1606.03352v1


Conditional Importance Sampling for Off-Policy Learning
http://arxiv.org/abs/1910.07479v2


Conditional Normalizing Flows for Low-Dose Computed Tomography Image Reconstruction
http://arxiv.org/abs/2006.06270v1


Conditional Set Generation with Transformers
http://arxiv.org/abs/2006.16841v2


Conditional gradient methods for stochastically constrained convex minimization
http://arxiv.org/abs/2007.03795v1


Conditioning of Reinforcement Learning Agents and its Policy Regularization Application
http://arxiv.org/abs/1906.05437v2


Confidence Intervals for Policy Evaluation in Adaptive Experiments
http://arxiv.org/abs/1911.02768v3


Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting
http://arxiv.org/abs/2002.10399v2


Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks
http://arxiv.org/abs/1910.06259v4


ConjNLI: Natural Language Inference Over Conjunctive Sentences
http://arxiv.org/abs/2010.10418v2


Connecting Embeddings for Knowledge Graph Entity Typing
http://arxiv.org/abs/2007.10873v1


Conservative Exploration in Reinforcement Learning
http://arxiv.org/abs/2002.03218v2


Conservative Safety Critics for Exploration
http://arxiv.org/abs/2010.14497v1


Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling
http://arxiv.org/abs/2004.09890v1


Consistency by Agreement in Zero-shot Neural Machine Translation
http://arxiv.org/abs/1904.02338v2


Consistency of a Recurrent Language Model With Respect to Incomplete Decoding
http://arxiv.org/abs/2002.02492v2


Consistent Estimators for Learning to Defer to an Expert
http://arxiv.org/abs/2006.01862v2


Consistent Structured Prediction with Max-Min Margin Markov Networks
http://arxiv.org/abs/2007.01012v2


Consistent Transcription and Translation of Speech
http://arxiv.org/abs/2007.12741v2


Consistent recovery threshold of hidden nearest neighbor graphs
http://arxiv.org/abs/1911.08004v1


Constant Curvature Graph Convolutional Networks
http://arxiv.org/abs/1911.05076v3


Constituent Parsing as Sequence Labeling
http://arxiv.org/abs/1810.08994v2


Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue
http://arxiv.org/abs/1906.07220v1


Constrained Markov Decision Processes via Backward Value Functions
http://arxiv.org/abs/2008.11811v1


Constrained Neural Ordinary Differential Equations with Stability Guarantees
http://arxiv.org/abs/2004.10883v1


Constructing a provably adversarially-robust classifier from a high accuracy one
http://arxiv.org/abs/1912.07561v1


Constructive Universal High-Dimensional Distribution Generation through Deep ReLU Networks
http://arxiv.org/abs/2006.16664v1


Content Planning for Neural Story Generation with Aristotelian Rescoring
http://arxiv.org/abs/2009.09870v2


Content Selection in Deep Learning Models of Summarization
http://arxiv.org/abs/1810.12343v2


Context Gates for Neural Machine Translation
http://arxiv.org/abs/1608.06043v3


Context Mover's Distance & Barycenters: Optimal Transport of Contexts for Building Representations
http://arxiv.org/abs/1808.09663v6


Context-Aware Answer Extraction in Question Answering
http://arxiv.org/abs/2011.02687v1


Context-Aware Local Differential Privacy
http://arxiv.org/abs/1911.00038v2


Context-Aware Neural Machine Translation Learns Anaphora Resolution
http://arxiv.org/abs/1805.10163v1


Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning
http://arxiv.org/abs/2005.06800v3


Contextual Constrained Learning for Dose-Finding Clinical Trials
http://arxiv.org/abs/2001.02463v2


Contextual Embeddings: When Are They Worth It?
http://arxiv.org/abs/2005.09117v1


Contextual Memory Trees
http://arxiv.org/abs/1807.06473v3


Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns
http://arxiv.org/abs/2004.09894v2


Contextual Online False Discovery Rate Control
http://arxiv.org/abs/1902.02885v2


Contextualization of Morphological Inflection
http://arxiv.org/abs/1905.01420v1


Contextualized Sparse Representations for Real-Time Open-Domain Question Answering
http://arxiv.org/abs/1911.02896v2


Contextualizing Hate Speech Classifiers with Post-hoc Explanation
http://arxiv.org/abs/2005.02439v3


Continual Learning from the Perspective of Compression
http://arxiv.org/abs/2006.15078v1


Continual Model-Based Reinforcement Learning with Hypernetworks
http://arxiv.org/abs/2009.11997v1


Continual adaptation for efficient machine communication
http://arxiv.org/abs/1911.09896v2


Continual and Multi-Task Architecture Search
http://arxiv.org/abs/1906.05226v1


Continual learning with direction-constrained optimization
http://arxiv.org/abs/2011.12581v1


Continuous Graph Flow
http://arxiv.org/abs/1908.02436v2


Continuous Graph Neural Networks
http://arxiv.org/abs/1912.00967v3


Continuous Online Learning and New Insights to Online Imitation Learning
http://arxiv.org/abs/1912.01261v1


Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks
http://arxiv.org/abs/1708.04358v1


Continuous-time Lower Bounds for Gradient-based Algorithms
http://arxiv.org/abs/2002.03546v2


Continuously Indexed Domain Adaptation
http://arxiv.org/abs/2007.01807v2


Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning
http://arxiv.org/abs/2101.05265v1


Contrastive Graph Neural Network Explanation
http://arxiv.org/abs/2010.13663v1


Contrastive Multi-View Representation Learning on Graphs
http://arxiv.org/abs/2006.05582v1


Contrastive Self-Supervised Learning for Commonsense Reasoning
http://arxiv.org/abs/2005.00669v1


Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning
http://arxiv.org/abs/2002.06836v2


Controlled Crowdsourcing for High-Quality QA-SRL Annotation
http://arxiv.org/abs/1911.03243v2


Controlling Output Length in Neural Encoder-Decoders
http://arxiv.org/abs/1609.09552v1


Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics
http://arxiv.org/abs/2005.04269v1


ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems
http://arxiv.org/abs/2002.04793v2


Convergence Analysis of Block Coordinate Algorithms with Determinantal Sampling
http://arxiv.org/abs/1910.11561v3


Convergence Rates of Smooth Message Passing with Rounding in Entropy-Regularized MAP Inference
http://arxiv.org/abs/1907.01127v2


Convergence Rates of Variational Inference in Sparse Deep Learning
http://arxiv.org/abs/1908.04847v2


Conversation Modeling on Reddit using a Graph-Structured LSTM
http://arxiv.org/abs/1704.02080v1


Conversational Document Prediction to Assist Customer Care Agents
http://arxiv.org/abs/2010.02305v1


Conversational Semantic Parsing
http://arxiv.org/abs/2009.13655v1


Conversational Semantic Parsing for Dialog State Tracking
http://arxiv.org/abs/2010.12770v1


Conversational Word Embedding for Retrieval-Based Dialog System
http://arxiv.org/abs/2004.13249v1


Conversations Gone Awry: Detecting Early Signs of Conversational Failure
http://arxiv.org/abs/1805.05345v1


Convex Calibrated Surrogates for the Multi-Label F-Measure
http://arxiv.org/abs/2009.07801v1


Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space
http://arxiv.org/abs/2004.12209v3


Convolutional Kernel Networks for Graph-Structured Data
http://arxiv.org/abs/2003.05189v2


Convolutional Neural Networks with Recurrent Neural Filters
http://arxiv.org/abs/1808.09315v1


Convolutional dictionary learning based auto-encoders for natural exponential-family distributions
http://arxiv.org/abs/1907.03211v4


Cooperative Learning of Disjoint Syntax and Semantics
http://arxiv.org/abs/1902.09393v2


Cooperative Multi-Agent Bandits with Heavy Tails
http://arxiv.org/abs/2008.06244v1


Coordination without communication: optimal regret in two players multi-armed bandits
http://arxiv.org/abs/2002.07596v2


Coreferential Reasoning Learning for Language Representation
http://arxiv.org/abs/2004.06870v2


Coresets for Clustering in Graphs of Bounded Treewidth
http://arxiv.org/abs/1907.04733v4


Coresets for Data-efficient Training of Machine Learning Models
http://arxiv.org/abs/1906.01827v3


Correlating neural and symbolic representations of language
http://arxiv.org/abs/1905.06401v2


Corruption-Tolerant Gaussian Process Bandit Optimization
http://arxiv.org/abs/2003.01971v1


Counterfactual Cross-Validation: Stable Model Selection Procedure for Causal Inference Models
http://arxiv.org/abs/1909.05299v5


Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology
http://arxiv.org/abs/1906.04571v3


Counterfactual Data Augmentation using Locally Factored Dynamics
http://arxiv.org/abs/2007.02863v2


Countering Language Drift with Seeded Iterated Learning
http://arxiv.org/abs/2003.12694v3


Countering hate on social media: Large scale classification of hate and counter speech
http://arxiv.org/abs/2006.01974v3


Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation
http://arxiv.org/abs/2007.08186v2


Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing
http://arxiv.org/abs/1906.07108v1


Course Concept Expansion in MOOCs with External Knowledge and Interactive Game
http://arxiv.org/abs/1909.07739v1


Creating Causal Embeddings for Question Answering with Minimal Supervision
http://arxiv.org/abs/1609.08097v1


Cross Copy Network for Dialogue Generation
http://arxiv.org/abs/2010.11539v1


Cross-Domain Generalization of Neural Constituency Parsers
http://arxiv.org/abs/1907.04347v1


Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing
http://arxiv.org/abs/1902.09492v2


Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus
http://arxiv.org/abs/2004.06295v2


Cross-Lingual Syntactic Transfer with Limited Resources
http://arxiv.org/abs/1610.06227v2


Cross-Lingual Training for Automatic Question Generation
http://arxiv.org/abs/1906.02525v1


Cross-Linguistic Syntactic Evaluation of Word Prediction Models
http://arxiv.org/abs/2005.00187v2


Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings
http://arxiv.org/abs/2011.01565v1


Cross-Modal Data Programming Enables Rapid Medical Machine Learning
http://arxiv.org/abs/1903.11101v1


Cross-Modality Relevance for Reasoning on Language and Vision
http://arxiv.org/abs/2005.06035v1


Cross-Sentence N-ary Relation Extraction with Graph LSTMs
http://arxiv.org/abs/1708.03743v1


Cross-Target Stance Classification with Self-Attention Networks
http://arxiv.org/abs/1805.06593v2


Cross-Thought for Sentence Encoder Pre-training
http://arxiv.org/abs/2010.03652v1


Cross-lingual Abstract Meaning Representation Parsing
http://arxiv.org/abs/1704.04539v2


Cross-lingual Spoken Language Understanding with Regularized Representation Alignment
http://arxiv.org/abs/2009.14510v1


Cross-lingual Visual Verb Sense Disambiguation
http://arxiv.org/abs/1904.05092v2


Cross-media Structured Common Space for Multimedia Event Extraction
http://arxiv.org/abs/2005.02472v1


Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage
http://arxiv.org/abs/2005.00246v1


Cross-topic distributional semantic representations via unsupervised mappings
http://arxiv.org/abs/1904.05674v1


CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset
http://arxiv.org/abs/2002.11893v2


Crossing Variational Autoencoders for Answer Retrieval
http://arxiv.org/abs/2005.02557v2


CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models
http://arxiv.org/abs/2010.00133v1


Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation
http://arxiv.org/abs/1904.05929v1


Cumulo: A Dataset for Learning Cloud Classes
http://arxiv.org/abs/1911.04227v2


Curriculum Pre-training for End-to-End Speech Translation
http://arxiv.org/abs/2004.10093v1


Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness
http://arxiv.org/abs/2002.03239v2


Cycles in Causal Learning
http://arxiv.org/abs/2007.12335v1


D2RL: Deep Dense Architectures in Reinforcement Learning
http://arxiv.org/abs/2010.09163v2


DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning
http://arxiv.org/abs/1910.13983v1


DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks
http://arxiv.org/abs/2011.01549v1


DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate
http://arxiv.org/abs/1906.00506v3


DERAIL: Diagnostic Environments for Reward And Imitation Learning
http://arxiv.org/abs/2012.01365v1


DGST: a Dual-Generator Network for Text Style Transfer
http://arxiv.org/abs/2010.14557v1


DLGNet: A Transformer-based Model for Dialogue Response Generation
http://arxiv.org/abs/1908.01841v2


DOC: Deep Open Classification of Text Documents
http://arxiv.org/abs/1709.08716v1


DORB: Dynamically Optimizing Multiple Rewards with Bandits
http://arxiv.org/abs/2011.07635v1


DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference
http://arxiv.org/abs/1802.05577v2


DRS at MRP 2020: Dressing up Discourse Representation Structures as Graphs
http://arxiv.org/abs/2012.14837v1


DRTS Parsing with Structure-Aware Encoding and Decoding
http://arxiv.org/abs/2005.06901v1


DRWR: A Differentiable Renderer without Rendering for Unsupervised 3D Structure Learning from Silhouette Images
http://arxiv.org/abs/2007.06127v1


DTCA: Decision Tree-based Co-Attention Networks for Explainable Claim Verification
http://arxiv.org/abs/2004.13455v1


DYSAN: Dynamically sanitizing motion sensor data against sensitive inferences through adversarial networks
http://arxiv.org/abs/2003.10325v2


DagoBERT: Generating Derivational Morphology with a Pretrained Language Model
http://arxiv.org/abs/2005.00672v2


Data Amplification: Instance-Optimal Property Estimation
http://arxiv.org/abs/1903.01432v2


Data Appraisal Without Data Sharing
http://arxiv.org/abs/2012.06430v1


Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors
http://arxiv.org/abs/2006.05635v1


Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation
http://arxiv.org/abs/2012.02952v1


Data Generation for Neural Programming by Example
http://arxiv.org/abs/1911.02624v1


Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight
http://arxiv.org/abs/2004.02594v5


Data Rejuvenation: Exploiting Inactive Training Examples for Neural Machine Translation
http://arxiv.org/abs/2010.02552v1


Data Valuation using Reinforcement Learning
http://arxiv.org/abs/1909.11671v1


Data Weighted Training Strategies for Grammatical Error Correction
http://arxiv.org/abs/2008.02976v2


Data and Representation for Turkish Natural Language Inference
http://arxiv.org/abs/2004.14963v3


Data preprocessing to mitigate bias: A maximum entropy based approach
http://arxiv.org/abs/1906.02164v2


Data-Dependent Differentially Private Parameter Learning for Directed Graphical Models
http://arxiv.org/abs/1905.12813v3


Data-Efficient Image Recognition with Contrastive Predictive Coding
http://arxiv.org/abs/1905.09272v3


Data-driven confidence bands for distributed nonparametric regression
http://arxiv.org/abs/1912.06689v2


Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics
http://arxiv.org/abs/2009.10795v2


DeBayes: a Bayesian method for debiasing network embeddings
http://arxiv.org/abs/2002.11442v2


DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning
http://arxiv.org/abs/1809.06416v1


DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering
http://arxiv.org/abs/2005.00697v1


DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking
http://arxiv.org/abs/2004.12864v1


Debiased Sinkhorn barycenters
http://arxiv.org/abs/2006.02575v1


Debiasing Evaluations That are Biased by Evaluations
http://arxiv.org/abs/2012.00714v1


Decentralised Learning with Random Features and Distributed Gradient Descent
http://arxiv.org/abs/2007.00360v1


Decentralized Multi-player Multi-armed Bandits with No Collision Information
http://arxiv.org/abs/2003.00162v1


Decentralized gradient methods: does topology matter?
http://arxiv.org/abs/2002.12688v1


Decision Trees for Decision-Making under the Predict-then-Optimize Framework
http://arxiv.org/abs/2003.00360v2


Decomposable Neural Paraphrase Generation
http://arxiv.org/abs/1906.09741v1


Deconstructing word embedding algorithms
http://arxiv.org/abs/2011.07013v1


Decoupled Greedy Learning of CNNs
http://arxiv.org/abs/1901.08164v4


Decoupling Strategy and Generation in Negotiation Dialogues
http://arxiv.org/abs/1808.09637v1


DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference
http://arxiv.org/abs/2004.12993v1


Deep Active Learning: Unified and Principled Method for Query and Training
http://arxiv.org/abs/1911.09162v2


Deep Bayesian Quadrature Policy Optimization
http://arxiv.org/abs/2006.15637v3


Deep Claim: Payer Response Prediction from Claims Data with Deep Learning
http://arxiv.org/abs/2007.06229v1


Deep Context-Aware Novelty Detection
http://arxiv.org/abs/2006.01168v2


Deep Contextualized Self-training for Low Resource Dependency Parsing
http://arxiv.org/abs/1911.04286v1


Deep Coordination Graphs
http://arxiv.org/abs/1910.00091v4


Deep Divergence Learning
http://arxiv.org/abs/2005.02612v1


Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning
http://arxiv.org/abs/1801.06176v3


Deep Gaussian Markov Random Fields
http://arxiv.org/abs/2002.07467v2


Deep Generative Model for Joint Alignment and Word Representation
http://arxiv.org/abs/1802.05883v3


Deep Graph Contrastive Representation Learning
http://arxiv.org/abs/2006.04131v2


Deep Hierarchical Classification for Category Prediction in E-commerce System
http://arxiv.org/abs/2005.06692v1


Deep Isometric Learning for Visual Recognition
http://arxiv.org/abs/2006.16992v2


Deep Keyphrase Generation
http://arxiv.org/abs/1704.06879v2


Deep Molecular Programming: A Natural Implementation of Binary-Weight ReLU Neural Networks
http://arxiv.org/abs/2003.13720v3


Deep Networks and the Multiple Manifold Problem
http://arxiv.org/abs/2008.11245v1


Deep Neural Machine Translation with Linear Associative Unit
http://arxiv.org/abs/1705.00861v1


Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision
http://arxiv.org/abs/1808.08485v1


Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation
http://arxiv.org/abs/1606.04199v3


Deep Reinforcement Learning amidst Lifelong Non-Stationarity
http://arxiv.org/abs/2006.10701v1


Deep Reinforcement Learning for Dialogue Generation
http://arxiv.org/abs/1606.01541v4


Deep Reinforcement Learning for Mention-Ranking Coreference Models
http://arxiv.org/abs/1609.08667v3


Deep Relevance Ranking Using Enhanced Document-Query Interactions
http://arxiv.org/abs/1809.01682v2


Deep Ritz revisited
http://arxiv.org/abs/1912.03937v2


Deep Structured Mixtures of Gaussian Processes
http://arxiv.org/abs/1910.04536v2


Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time
http://arxiv.org/abs/1711.05626v2


Deep contextualized word representations
http://arxiv.org/abs/1802.05365v2


Deep k-NN for Noisy Labels
http://arxiv.org/abs/2004.12289v1


Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme
http://arxiv.org/abs/1807.03491v1


DeepCoDA: personalized interpretability for compositional health data
http://arxiv.org/abs/2006.01392v2


DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training
http://arxiv.org/abs/1802.05664v1


DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning
http://arxiv.org/abs/1707.06690v3


DeepSeqSLAM: A Trainable CNN+RNN for Joint Global Description and Sequence-based Place Recognition
http://arxiv.org/abs/2011.08518v1


Defense Through Diverse Directions
http://arxiv.org/abs/2003.10602v1


Defining Benchmarks for Continual Few-Shot Learning
http://arxiv.org/abs/2004.11967v1


Defining and Evaluating Fair Natural Language Generation
http://arxiv.org/abs/2008.01548v1


Defoiling Foiled Image Captions
http://arxiv.org/abs/1805.06549v1


Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer
http://arxiv.org/abs/1804.06437v1


DeltaGrad: Rapid retraining of machine learning models
http://arxiv.org/abs/2006.14755v2


Demand-Weighted Completeness Prediction for a Knowledge Base
http://arxiv.org/abs/1804.11109v1


Demographic Dialectal Variation in Social Media: A Case Study of African-American English
http://arxiv.org/abs/1608.08868v1


Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting
http://arxiv.org/abs/2004.14088v3


Demoting Racial Bias in Hate Speech Detection
http://arxiv.org/abs/2005.12246v1


Denoising Relation Extraction from Document-level Distant Supervision
http://arxiv.org/abs/2011.03888v1


Dense Passage Retrieval for Open-Domain Question Answering
http://arxiv.org/abs/2004.04906v3


Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA
http://arxiv.org/abs/2005.06409v1


Densely Connected Graph Convolutional Networks for Graph-to-Sequence Learning
http://arxiv.org/abs/1908.05957v2


Density Deconvolution with Normalizing Flows
http://arxiv.org/abs/2006.09396v2


Density Matching for Bilingual Word Embedding
http://arxiv.org/abs/1904.02343v3


Deontological Ethics By Monotonicity Shape Constraints
http://arxiv.org/abs/2001.11990v2


Dependency-based Hybrid Trees for Semantic Parsing
http://arxiv.org/abs/1809.00107v1


Dependent randomized rounding for clustering and partition systems with knapsack constraints
http://arxiv.org/abs/1709.06995v9


Depth Completion via Deep Basis Fitting
http://arxiv.org/abs/1912.10336v1


Depth Uncertainty in Neural Networks
http://arxiv.org/abs/2006.08437v3


DepthNet Nano: A Highly Compact Self-Normalizing Neural Network for Monocular Depth Estimation
http://arxiv.org/abs/2004.08008v1


Deriving Machine Attention from Human Rationales
http://arxiv.org/abs/1808.09367v1


Description Based Text Classification with Reinforcement Learning
http://arxiv.org/abs/2002.03067v3


Design Challenges in Low-resource Cross-lingual Entity Linking
http://arxiv.org/abs/2005.00692v2


Designing Differentially Private Estimators in High Dimensions
http://arxiv.org/abs/2006.01944v3


Designing Precise and Robust Dialogue Response Evaluators
http://arxiv.org/abs/2004.04908v2


Detecting Attackable Sentences in Arguments
http://arxiv.org/abs/2010.02660v1


Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News
http://arxiv.org/abs/2009.07698v5


Detecting East Asian Prejudice on Social Media
http://arxiv.org/abs/2005.03909v1


Detecting Egregious Conversations between Customers and Virtual Agents
http://arxiv.org/abs/1711.05780v2


Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank
http://arxiv.org/abs/2010.03662v1


Detecting Gang-Involved Escalation on Social Media Using Context
http://arxiv.org/abs/1809.03632v1


Detecting Perceived Emotions in Hurricane Disasters
http://arxiv.org/abs/2004.14299v1


Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks
http://arxiv.org/abs/2011.01846v1


Detecting dementia in Mandarin Chinese using transfer learning from a parallel corpus
http://arxiv.org/abs/1903.00933v2


Determining Semantic Textual Similarity using Natural Deduction Proofs
http://arxiv.org/abs/1707.08713v1


Deterministic Decoding for Discrete Data in Variational Autoencoders
http://arxiv.org/abs/2003.02174v1


Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement
http://arxiv.org/abs/1802.06901v3


Dexterous Robotic Grasping with Object-Centric Visual Affordances
http://arxiv.org/abs/2009.01439v1


Dialogue Coherence Assessment Without Explicit Dialogue Act Labels
http://arxiv.org/abs/1908.08486v2


Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired Data
http://arxiv.org/abs/2009.09427v2


Dialogue Response Ranking Training with Large-Scale Human Feedback Data
http://arxiv.org/abs/2009.06978v1


Diameter-based Interactive Structure Discovery
http://arxiv.org/abs/1906.02101v2


Dice Loss for Data-imbalanced NLP Tasks
http://arxiv.org/abs/1911.02855v3


Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text-to-SQL
http://arxiv.org/abs/2010.12634v1


Did the Model Understand the Question?
http://arxiv.org/abs/1805.05492v1


Differentiable Causal Backdoor Discovery
http://arxiv.org/abs/2003.01461v1


Differentiable Graph Module (DGM) for Graph Convolutional Networks
http://arxiv.org/abs/2002.04999v3


Differentiable Likelihoods for Fast Inversion of 'Likelihood-Free' Dynamical Systems
http://arxiv.org/abs/2002.09301v2


Differentiable Sampling with Flexible Reference Word Order for Neural Machine Translation
http://arxiv.org/abs/1904.04079v2


Differentiable Window for Dynamic Local Attention
http://arxiv.org/abs/2006.13561v1


Differential Evolution for Neural Architecture Search
http://arxiv.org/abs/2012.06400v1


Differentially Private Language Models Benefit from Public Pre-training
http://arxiv.org/abs/2009.05886v2


Differentially Private Set Union
http://arxiv.org/abs/2002.09745v1


Differentially Private Stochastic Coordinate Descent
http://arxiv.org/abs/2006.07272v3


Differentially private cross-silo federated learning
http://arxiv.org/abs/2007.05553v1


Differentiating through the Fréchet Mean
http://arxiv.org/abs/2003.00335v3


Digital Voicing of Silent Speech
http://arxiv.org/abs/2010.02960v1


Dilated Convolutional Attention Network for Medical Code Assignment from Clinical Text
http://arxiv.org/abs/2009.14578v1


Diptychs of human and machine perceptions
http://arxiv.org/abs/2010.13864v1


DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
http://arxiv.org/abs/2003.07305v1


Discern: Discourse-Aware Entailment Reasoning Network for Conversational Machine Reading
http://arxiv.org/abs/2010.01838v3


DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion
http://arxiv.org/abs/1902.10526v3


Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle
http://arxiv.org/abs/1904.00615v1


Discontinuous Constituent Parsing as Sequence Labeling
http://arxiv.org/abs/2010.00633v1


Discount Factor as a Regularizer in Reinforcement Learning
http://arxiv.org/abs/2007.02040v1


Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference
http://arxiv.org/abs/1907.09692v1


Discourse structure interacts with reference but not syntax in neural language models
http://arxiv.org/abs/2010.04887v1


Discourse-Aware Neural Extractive Text Summarization
http://arxiv.org/abs/1910.14142v2


Discovering and interpreting transcriptomic drivers of imaging traits using neural networks
http://arxiv.org/abs/1912.05071v1


Discrete Action On-Policy Learning with Action-Value Critic
http://arxiv.org/abs/2002.03534v2


Discrete Latent Variable Representations for Low-Resource Text Classification
http://arxiv.org/abs/2006.06226v1


Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction
http://arxiv.org/abs/2005.01791v1


Discriminative Adversarial Search for Abstractive Summarization
http://arxiv.org/abs/2002.10375v2


Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions
http://arxiv.org/abs/2007.13481v1


Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference
http://arxiv.org/abs/2010.13009v1


Discriminative Neural Sentence Modeling by Tree-Based Convolution
http://arxiv.org/abs/1504.01106v5


Discriminatively-Tuned Generative Classifiers for Robust Natural Language Inference
http://arxiv.org/abs/2010.03760v1


Disentangle-based Continual Graph Representation Learning
http://arxiv.org/abs/2010.02565v4


Disentangled Planning and Control in Vision Based Robotics via Reward Machines
http://arxiv.org/abs/2012.14464v1


Disentangling Language and Knowledge in Task-Oriented Dialogs
http://arxiv.org/abs/1805.01216v3


Disentangling Trainability and Generalization in Deep Neural Networks
http://arxiv.org/abs/1912.13053v2


Dispersed Exponential Family Mixture VAEs for Interpretable Text Generation
http://arxiv.org/abs/1906.06719v4


Dissecting Lottery Ticket Transformers: Structural and Behavioral Study of Sparse Neural Machine Translation
http://arxiv.org/abs/2009.13270v2


Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation
http://arxiv.org/abs/1909.03009v2


Dissecting Span Identification Tasks with Performance Prediction
http://arxiv.org/abs/2010.02587v1


Dissipative SymODEN: Encoding Hamiltonian Dynamics with Dissipation and Control into Deep Learning
http://arxiv.org/abs/2002.08860v3


Distant Supervision and Noisy Label Learning for Low Resource Named Entity Recognition: A Study on Hausa and Yorùbá
http://arxiv.org/abs/2003.08370v2


Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging
http://arxiv.org/abs/1808.09733v1


Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation
http://arxiv.org/abs/2003.02877v3


Distilling Knowledge Learned in BERT for Text Generation
http://arxiv.org/abs/1911.03829v3


Distilling Knowledge for Search-based Structured Prediction
http://arxiv.org/abs/1805.11224v1


Distilling Neural Networks for Greener and Faster Dependency Parsing
http://arxiv.org/abs/2006.00844v1


Distinguish Confusing Law Articles for Legal Judgment Prediction
http://arxiv.org/abs/2004.02557v3


Distributed Differentially Private Averaging with Improved Utility and Robustness to Malicious Parties
http://arxiv.org/abs/2006.07218v1


Distributed Learning: Sequential Decision Making in Resource-Constrained Environments
http://arxiv.org/abs/2004.06171v1


Distributed, partially collapsed MCMC for Bayesian Nonparametrics
http://arxiv.org/abs/2001.05591v3


Distributionally Robust Bayesian Optimization
http://arxiv.org/abs/2002.09038v3


Distributionally Robust Bayesian Quadrature Optimization
http://arxiv.org/abs/2001.06814v1


Distributionally Robust Formulation and Model Selection for the Graphical Lasso
http://arxiv.org/abs/1905.08975v2


Diverse Exploration via InfoMax Options
http://arxiv.org/abs/2010.02756v1


Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation
http://arxiv.org/abs/2004.03875v2


Diversifying Dialogue Generation with Non-Conversational Text
http://arxiv.org/abs/2005.04346v2


Diversifying Reply Suggestions using a Matching-Conditional Variational Autoencoder
http://arxiv.org/abs/1903.10630v1


Diversity driven Attention Model for Query-based Abstractive Summarization
http://arxiv.org/abs/1704.08300v2


Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support
http://arxiv.org/abs/1910.13324v3


Diving Deep into Context-Aware Neural Machine Translation
http://arxiv.org/abs/2010.09482v1


Do Explicit Alignments Robustly Improve Multilingual Encoders?
http://arxiv.org/abs/2010.02537v1


Do Multi-Sense Embeddings Improve Natural Language Understanding?
http://arxiv.org/abs/1506.01070v3


Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study
http://arxiv.org/abs/1906.01603v2


Do Neural Language Models Show Preferences for Syntactic Formalisms?
http://arxiv.org/abs/2004.14096v1


Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language?
http://arxiv.org/abs/2004.14839v2


Do Neural Network Cross-Modal Mappings Really Bridge Modalities?
http://arxiv.org/abs/1805.07616v2


Do RNN and LSTM have Long Memory?
http://arxiv.org/abs/2006.03860v2


Do We Need Zero Training Loss After Achieving Zero Training Error?
http://arxiv.org/abs/2002.08709v1


Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation
http://arxiv.org/abs/2002.08546v5


Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods
http://arxiv.org/abs/2007.06162v1


Do You See What I Mean? Visual Resolution of Linguistic Ambiguities
http://arxiv.org/abs/1603.08079v1


Do latent tree learning models identify meaningful structure in sentences?
http://arxiv.org/abs/1709.01121v2


Do sequence-to-sequence VAEs learn global features of sentences?
http://arxiv.org/abs/2004.07683v1


Document Context Neural Machine Translation with Memory Networks
http://arxiv.org/abs/1711.03688v2


Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension
http://arxiv.org/abs/2005.05806v2


Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding
http://arxiv.org/abs/2005.06579v1


Document-aligned Japanese-English Conversation Parallel Corpus
http://arxiv.org/abs/2012.06143v1


Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation
http://arxiv.org/abs/2005.03393v2


Does label smoothing mitigate label noise?
http://arxiv.org/abs/2003.02819v1


Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks
http://arxiv.org/abs/2001.03632v1


Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making
http://arxiv.org/abs/2002.01751v1


Does the Objective Matter? Comparing Training Objectives for Pronoun Resolution
http://arxiv.org/abs/2010.02570v1


Domain Adaptation with Adversarial Training and Graph Embeddings
http://arxiv.org/abs/1805.05151v1


Domain Adaptive Dialog Generation via Meta Learning
http://arxiv.org/abs/1906.03520v2


Domain Adaptive Imitation Learning
http://arxiv.org/abs/1910.00105v2


Domain Adaptive Inference for Neural Machine Translation
http://arxiv.org/abs/1906.00408v1


Domain Aggregation Networks for Multi-Source Domain Adaptation
http://arxiv.org/abs/1909.05352v2


Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction
http://arxiv.org/abs/2009.07373v2


Domain Knowledge Integration By Gradient Matching For Sample-Efficient Reinforcement Learning
http://arxiv.org/abs/2005.13778v1


Domain-Liftability of Relational Marginal Polytopes
http://arxiv.org/abs/2001.05198v1


Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents
http://arxiv.org/abs/2010.16363v1


Don't Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation
http://arxiv.org/abs/2004.14325v3


Don't Read Too Much into It: Adaptive Computation for Open-Domain Question Answering
http://arxiv.org/abs/2011.05435v1


Don't Use English Dev: On the Zero-Shot Cross-Lingual Evaluation of Contextual Embeddings
http://arxiv.org/abs/2004.15001v2


Double Graph Based Reasoning for Document-level Relation Extraction
http://arxiv.org/abs/2009.13752v1


Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation
http://arxiv.org/abs/2005.00965v1


Double-Loop Unadjusted Langevin Algorithm
http://arxiv.org/abs/2007.01147v1


Doubly Sparse Variational Gaussian Processes
http://arxiv.org/abs/2001.05363v1


Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables
http://arxiv.org/abs/2008.09469v2


Doubly robust off-policy evaluation with shrinkage
http://arxiv.org/abs/1907.09623v2


Dream and Search to Control: Latent Space Planning for Continuous Control
http://arxiv.org/abs/2010.09832v1


Driving Behavior Explanation with Multi-level Fusion
http://arxiv.org/abs/2012.04983v1


Dual Mirror Descent for Online Allocation Problems
http://arxiv.org/abs/2002.10421v4


DualTKB: A Dual Learning Bridge between Text and Knowledge Base
http://arxiv.org/abs/2010.14660v1


DyERNIE: Dynamic Evolution of Riemannian Manifold Embeddings for Temporal Knowledge Graph Completion
http://arxiv.org/abs/2011.03984v2


Dyna-AIL : Adversarial Imitation Learning by Planning
http://arxiv.org/abs/1903.03234v1


Dynamic Anticipation and Completion for Multi-Hop Reasoning over Sparse Knowledge Graph
http://arxiv.org/abs/2010.01899v1


Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning
http://arxiv.org/abs/2010.04314v1


Dynamic Data Selection and Weighting for Iterative Back-Translation
http://arxiv.org/abs/2004.03672v2


Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog
http://arxiv.org/abs/2004.11019v3


Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising
http://arxiv.org/abs/2006.16312v1


Dynamic Memory Induction Networks for Few-Shot Text Classification
http://arxiv.org/abs/2005.05727v1


Dynamic Oracles for Top-Down and In-Order Shift-Reduce Constituent Parsing
http://arxiv.org/abs/1810.10882v1


Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation
http://arxiv.org/abs/2005.06606v2


Dynamic Regions Graph Neural Networks for Spatio-Temporal Reasoning
http://arxiv.org/abs/2009.08427v1


Dynamical systems theory for causal inference with application to synthetic control methods
http://arxiv.org/abs/1808.08778v3


Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change
http://arxiv.org/abs/2005.02008v1


ELI5: Long Form Question Answering
http://arxiv.org/abs/1907.09190v1


ELITR Non-Native Speech Translation at IWSLT 2020
http://arxiv.org/abs/2006.03331v1


EM Converges for a Mixture of Many Linear Regressions
http://arxiv.org/abs/1905.12106v2


ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation
http://arxiv.org/abs/2005.00850v2


ENT-DESC: Entity Description Generation by Exploring Knowledge Graph
http://arxiv.org/abs/2004.14813v2


ERASER: A Benchmark to Evaluate Rationalized NLP Models
http://arxiv.org/abs/1911.03429v2


ESPRIT: Explaining Solutions to Physical Reasoning Tasks
http://arxiv.org/abs/2005.00730v2


ESPnet-ST: All-in-One Speech Translation Toolkit
http://arxiv.org/abs/2004.10234v2


ETC: Encoding Long and Structured Inputs in Transformers
http://arxiv.org/abs/2004.08483v5


EXP4-DFDC: A Non-Stochastic Multi-Armed Bandit for Cache Replacement
http://arxiv.org/abs/2009.11330v2


Early Disease Diagnosis for Rice Crop
http://arxiv.org/abs/2004.04775v1


Easy-First Dependency Parsing with Hierarchical Tree LSTMs
http://arxiv.org/abs/1603.00375v2


Ecological Semantics: Programming Environments for Situated Language Understanding
http://arxiv.org/abs/2003.04567v2


EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing
http://arxiv.org/abs/1906.08104v1


Educating Text Autoencoders: Latent Representation Guidance via Denoising
http://arxiv.org/abs/1905.12777v3


Effective Approaches to Attention-based Neural Machine Translation
http://arxiv.org/abs/1508.04025v5


Effective Estimation of Deep Generative Language Models
http://arxiv.org/abs/1904.08194v3


Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models
http://arxiv.org/abs/2010.01739v1


Effectiveness of MPC-friendly Softmax Replacement
http://arxiv.org/abs/2011.11202v1


Efficient Competitive Self-Play Policy Optimization
http://arxiv.org/abs/2009.06086v1


Efficient Constituency Parsing by Pointing
http://arxiv.org/abs/2006.13557v1


Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
http://arxiv.org/abs/1804.07827v2


Efficient Continuous Pareto Exploration in Multi-Task Learning
http://arxiv.org/abs/2006.16434v2


Efficient Deployment of Conversational Natural Language Interfaces over Databases
http://arxiv.org/abs/2006.00591v2


Efficient Dialogue State Tracking by Selectively Overwriting Memory
http://arxiv.org/abs/1911.03906v2


Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy
http://arxiv.org/abs/1810.11507v2


Efficient Domain Generalization via Common-Specific Low-Rank Decomposition
http://arxiv.org/abs/2003.12815v2


Efficient EUD Parsing
http://arxiv.org/abs/2006.00838v1


Efficient Estimation of Influence of a Training Instance
http://arxiv.org/abs/2012.04207v1


Efficient Inference For Neural Machine Translation
http://arxiv.org/abs/2010.02416v2


Efficient Intent Detection with Dual Sentence Encoders
http://arxiv.org/abs/2003.04807v1


Efficient Intervention Design for Causal Discovery with Latents
http://arxiv.org/abs/2005.11736v2


Efficient Low-rank Multimodal Fusion with Modality-Specific Factors
http://arxiv.org/abs/1806.00064v1


Efficient Meta Lifelong-Learning with Limited Memory
http://arxiv.org/abs/2010.02500v1


Efficient One-Pass End-to-End Entity Linking for Questions
http://arxiv.org/abs/2010.02413v1


Efficient Online Scalar Annotation with Bounded Support
http://arxiv.org/abs/1806.01170v1


Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation
http://arxiv.org/abs/2007.06482v1


Efficient Parameter Estimation of Truncated Boolean Product Distributions
http://arxiv.org/abs/2007.02392v1


Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning
http://arxiv.org/abs/1911.05010v2


Efficient Privacy-Preserving Stochastic Nonconvex Optimization
http://arxiv.org/abs/1910.13659v2


Efficient Proximal Mapping of the 1-path-norm of Shallow Networks
http://arxiv.org/abs/2007.01003v2


Efficient Reservoir Management through Deep Reinforcement Learning
http://arxiv.org/abs/2012.03822v1


Efficient Robustness Certificates for Discrete Data: Sparsity-Aware Randomized Smoothing for Graphs, Images and More
http://arxiv.org/abs/2008.12952v1


Efficient Second-Order TreeCRF for Neural Dependency Parsing
http://arxiv.org/abs/2005.00975v2


Efficient allocation of law enforcement resources using predictive police patrolling
http://arxiv.org/abs/1811.12880v1


Efficient and Robust Algorithms for Adversarial Linear Contextual Bandits
http://arxiv.org/abs/2002.00287v2


Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors
http://arxiv.org/abs/2005.07186v2


Efficient improper learning for online logistic regression
http://arxiv.org/abs/2003.08109v3


Efficient non-conjugate Gaussian process factor models for spike count data using polynomial approximations
http://arxiv.org/abs/1906.03318v2


Efficient strategies for hierarchical text classification: External knowledge and auxiliary tasks
http://arxiv.org/abs/2005.02473v2


Efficient, Noise-Tolerant, and Private Learning via Boosting
http://arxiv.org/abs/2002.01100v1


Efficiently Learning Adversarially Robust Halfspaces with Noise
http://arxiv.org/abs/2005.07652v1


Efficiently Sampling Functions from Gaussian Process Posteriors
http://arxiv.org/abs/2002.09309v4


Efficiently Solving MDPs with Stochastic Mirror Descent
http://arxiv.org/abs/2008.12776v1


Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models
http://arxiv.org/abs/2003.11743v2


Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits
http://arxiv.org/abs/2004.06231v1


Embarrassingly Simple Unsupervised Aspect Extraction
http://arxiv.org/abs/2004.13580v1


Embedding Multimodal Relational Data for Knowledge Base Completion
http://arxiv.org/abs/1809.01341v2


Embedding Words in Non-Vector Space with Unsupervised Graph Learning
http://arxiv.org/abs/2010.02598v1


Embedding time expressions for deep temporal ordering models
http://arxiv.org/abs/1906.08287v1


Embedding-based Scientific Literature Discovery in a Text Editor Application
http://arxiv.org/abs/2005.04961v1


Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition
http://arxiv.org/abs/2006.01372v2


Emergence of Syntax Needs Minimal Supervision
http://arxiv.org/abs/2005.01119v1


Emergent Road Rules In Multi-Agent Driving Environments
http://arxiv.org/abs/2011.10753v1


Emerging Cross-lingual Structure in Pretrained Language Models
http://arxiv.org/abs/1911.01464v3


Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models
http://arxiv.org/abs/1907.00030v3


Empower Entity Set Expansion via Language Model Probing
http://arxiv.org/abs/2004.13897v2


Empowering Active Learning to Jointly Optimize System and User Demands
http://arxiv.org/abs/2005.04470v2


Enabling Language Models to Fill in the Blanks
http://arxiv.org/abs/2005.05339v2


Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction
http://arxiv.org/abs/2005.00987v2


Encoder-decoder neural network for solving the nonlinear Fokker-Planck-Landau collision operator in XGC
http://arxiv.org/abs/2009.06534v2


Encoding Musical Style with Transformer Autoencoders
http://arxiv.org/abs/1912.05537v2


Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
http://arxiv.org/abs/1703.04826v4


Encoding Source Language with Convolutional Neural Network for Machine Translation
http://arxiv.org/abs/1503.01838v5


Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages
http://arxiv.org/abs/2005.08177v1


End to End Binarized Neural Networks for Text Classification
http://arxiv.org/abs/2010.05223v1


End-to-End Bias Mitigation by Modelling Biases in Corpora
http://arxiv.org/abs/1909.06321v3


End-to-End Neural Word Alignment Outperforms GIZA++
http://arxiv.org/abs/2004.14675v1


End-to-End Slot Alignment and Recognition for Cross-Lingual NLU
http://arxiv.org/abs/2004.14353v2


End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems
http://arxiv.org/abs/2010.06028v1


End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures
http://arxiv.org/abs/1911.08460v3


End-to-end Graph-based TAG Parsing with Neural Networks
http://arxiv.org/abs/1804.06610v3


End-to-end Neural Coreference Resolution
http://arxiv.org/abs/1707.07045v2


Energy and Policy Considerations for Deep Learning in NLP
http://arxiv.org/abs/1906.02243v1


Energy-Based Continuous Inverse Optimal Control
http://arxiv.org/abs/1904.05453v4


Energy-Based Processes for Exchangeable Data
http://arxiv.org/abs/2003.07521v2


Energy-based Surprise Minimization for Multi-Agent Value Factorization
http://arxiv.org/abs/2009.09842v3


Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions
http://arxiv.org/abs/2003.08536v2


Enhanced Universal Dependency Parsing with Second-Order Inference and Mixture of Training Data
http://arxiv.org/abs/2006.01414v2


Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension
http://arxiv.org/abs/2004.14069v2


Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information
http://arxiv.org/abs/1805.05593v1


Enhancing Machine Translation with Dependency-Aware Self-Attention
http://arxiv.org/abs/1909.03149v3


Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention
http://arxiv.org/abs/1911.02821v2


Enhancing Simple Models by Exploiting What They Already Know
http://arxiv.org/abs/1905.13565v3


Enhancing Stratospheric Weather Analyses and Forecasts by Deploying Sensors from a Weather Balloon
http://arxiv.org/abs/1912.02276v1


Enhancing Word Embeddings with Knowledge Extracted from Lexical Resources
http://arxiv.org/abs/2005.10048v1


Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing
http://arxiv.org/abs/2005.13334v1


Enriching Word Embeddings with Temporal and Spatial Information
http://arxiv.org/abs/2010.00761v1


Enriching Word Vectors with Subword Information
http://arxiv.org/abs/1607.04606v2


Entities as Experts: Sparse Memory Access with Entity Supervision
http://arxiv.org/abs/2004.07202v2


Entity Commonsense Representation for Neural Abstractive Summarization
http://arxiv.org/abs/1806.05504v1


Entity Linking for Queries by Searching Wikipedia Sentences
http://arxiv.org/abs/1704.02788v3


Entity Linking in 100 Languages
http://arxiv.org/abs/2011.02690v1


Entity Recognition at First Sight: Improving NER with Eye Movement Information
http://arxiv.org/abs/1902.10068v2


Entity-Enriched Neural Models for Clinical Question Answering
http://arxiv.org/abs/2005.06587v1


Entropy Minimization In Emergent Languages
http://arxiv.org/abs/1905.13687v3


Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data
http://arxiv.org/abs/1903.06164v3


Equalized odds postprocessing under imperfect group information
http://arxiv.org/abs/1906.03284v3


Equivariant Hamiltonian Flows
http://arxiv.org/abs/1909.13739v1


Equivariant Neural Rendering
http://arxiv.org/abs/2006.07630v2


Error Estimation for Sketched SVD via the Bootstrap
http://arxiv.org/abs/2003.04937v1


Error bounds in estimating the out-of-sample prediction error using leave-one-out cross validation in high-dimensions
http://arxiv.org/abs/2003.01770v1


Error-Bounded Correction of Noisy Labels
http://arxiv.org/abs/2011.10077v1


Estimating Grape Yield on the Vine from Multiple Images
http://arxiv.org/abs/2004.04278v1


Estimating Principal Components under Adversarial Perturbations
http://arxiv.org/abs/2006.00602v2


Estimating Q(s,s') with Deep Deterministic Dynamics Gradients
http://arxiv.org/abs/2002.09505v2


Estimating localized complexity of white-matter wiring with GANs
http://arxiv.org/abs/1910.04868v2


Estimating predictive uncertainty for rumour verification models
http://arxiv.org/abs/2005.07174v1


Estimating the number and effect sizes of non-null hypotheses
http://arxiv.org/abs/2002.07297v2


Estimation and Inference with Trees and Forests in High Dimensions
http://arxiv.org/abs/2007.03210v2


Estimation of Bounds on Potential Outcomes For Decision Making
http://arxiv.org/abs/1910.04817v4


Evaluating Agents without Rewards
http://arxiv.org/abs/2012.11538v1


Evaluating Amharic Machine Translation
http://arxiv.org/abs/2003.14386v1


Evaluating Attribution Methods using White-Box LSTMs
http://arxiv.org/abs/2010.08606v1


Evaluating Dialogue Generation Systems via Response Selection
http://arxiv.org/abs/2004.14302v1


Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior?
http://arxiv.org/abs/2005.01831v1


Evaluating Explanation Methods for Neural Machine Translation
http://arxiv.org/abs/2005.01672v1


Evaluating Gender Bias in Machine Translation
http://arxiv.org/abs/1906.00591v1


Evaluating Logical Generalization in Graph Neural Networks
http://arxiv.org/abs/2003.06560v1


Evaluating Lossy Compression Rates of Deep Generative Models
http://arxiv.org/abs/2008.06653v1


Evaluating Neural Morphological Taggers for Sanskrit
http://arxiv.org/abs/2005.10893v1


Evaluating Robustness to Input Perturbations for Neural Machine Translation
http://arxiv.org/abs/2005.00580v1


Evaluating Theory of Mind in Question Answering
http://arxiv.org/abs/1808.09352v1


Evaluating and Characterizing Human Rationales
http://arxiv.org/abs/2010.04736v1


Evaluating the Calibration of Knowledge Graph Embeddings for Trustworthy Link Prediction
http://arxiv.org/abs/2004.01168v3


Evaluating the Factual Consistency of Abstractive Text Summarization
http://arxiv.org/abs/1910.12840v1


Evaluating the Utility of Hand-crafted Features in Sequence Labelling
http://arxiv.org/abs/1808.09075v1


Evaluation of Model Selection for Kernel Fragment Recognition in Corn Silage
http://arxiv.org/abs/2004.00292v1


Event Extraction by Answering (Almost) Natural Questions
http://arxiv.org/abs/2004.13625v1


Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks
http://arxiv.org/abs/2004.13826v2


Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder
http://arxiv.org/abs/2006.08101v1


Evolution-based Fine-tuning of CNNs for Prostate Cancer Detection
http://arxiv.org/abs/1911.01477v1


EvolveGraph: Multi-Agent Trajectory Prediction with Dynamic Relational Reasoning
http://arxiv.org/abs/2003.13924v4


Evolving Reinforcement Learning Algorithms
http://arxiv.org/abs/2101.03958v1


Examination and Extension of Strategies for Improving Personalized Language Modeling via Interpolation
http://arxiv.org/abs/2006.05469v1


Examining Citations of Natural Language Processing Literature
http://arxiv.org/abs/2005.00912v1


Examining the State-of-the-Art in News Timeline Summarization
http://arxiv.org/abs/2005.10107v1


Exclusive Hierarchical Decoding for Deep Keyphrase Generation
http://arxiv.org/abs/2004.08511v1


ExpBERT: Representation Engineering with Natural Language Explanations
http://arxiv.org/abs/2005.01932v1


Experience Grounds Language
http://arxiv.org/abs/2004.10151v3


Experimental Evaluation and Development of a Silver-Standard for the MIMIC-III Clinical Coding Dataset
http://arxiv.org/abs/2006.07332v1


Expertise Style Transfer: A New Task Towards Better Communication between Experts and Laymen
http://arxiv.org/abs/2005.00701v1


Explainable Automated Fact-Checking for Public Health Claims
http://arxiv.org/abs/2010.09926v1


Explainable and Discourse Topic-aware Neural Language Understanding
http://arxiv.org/abs/2006.10632v2


Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions
http://arxiv.org/abs/2005.06676v1


Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules?
http://arxiv.org/abs/1808.09551v1


Explaining Groups of Points in Low-Dimensional Representations
http://arxiv.org/abs/2003.01640v3


Explaining the Explainer: A First Theoretical Analysis of LIME
http://arxiv.org/abs/2001.03447v2


Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning
http://arxiv.org/abs/2006.14804v3


Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation
http://arxiv.org/abs/2002.02584v1


Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading
http://arxiv.org/abs/2005.12484v2


Exploiting Categorical Structure Using Tree-Based Methods
http://arxiv.org/abs/2004.07383v1


Exploiting Cross-Sentence Context for Neural Machine Translation
http://arxiv.org/abs/1704.04347v3


Exploiting Deep Representations for Neural Machine Translation
http://arxiv.org/abs/1810.10181v1


Exploiting Domain Knowledge via Grouped Weight Sharing with Application to Text Categorization
http://arxiv.org/abs/1702.02535v3


Exploiting Explicit Paths for Multi-hop Reading Comprehension
http://arxiv.org/abs/1811.01127v2


Exploiting Rich Syntactic Information for Semantic Parsing with Graph-to-Sequence Model
http://arxiv.org/abs/1808.07624v1


Exploiting Sentence Order in Document Alignment
http://arxiv.org/abs/2004.14523v2


Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning
http://arxiv.org/abs/2004.14224v1


Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach
http://arxiv.org/abs/2005.05864v1


Exploration by Optimisation in Partial Monitoring
http://arxiv.org/abs/1907.05772v3


Exploratory Analysis of COVID-19 Related Tweets in North America to Inform Public Health Institutes
http://arxiv.org/abs/2007.02452v1


Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills
http://arxiv.org/abs/2002.03647v4


Explore, Propose, and Assemble: An Interpretable Model for Multi-Hop Reading Comprehension
http://arxiv.org/abs/1906.05210v1


Exploring Author Context for Detecting Intended vs Perceived Sarcasm
http://arxiv.org/abs/1910.11932v1


Exploring Content Selection in Summarization of Novel Chapters
http://arxiv.org/abs/2005.01840v2


Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer
http://arxiv.org/abs/2005.02049v1


Exploring Contextualized Neural Language Models for Temporal Dependency Parsing
http://arxiv.org/abs/2004.14577v2


Exploring Exploration: Comparing Children with RL Agents in Unified Environments
http://arxiv.org/abs/2005.02880v2


Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation
http://arxiv.org/abs/1906.01199v1


Exploring Recombination for Efficient Decoding of Neural Machine Translation
http://arxiv.org/abs/1808.08482v2


Exploring Semantic Capacity of Terms
http://arxiv.org/abs/2010.01898v1


Exploring Weaknesses of VQA Models through Attribution Driven Insights
http://arxiv.org/abs/2006.06637v2


Exploring and Predicting Transferability across NLP Tasks
http://arxiv.org/abs/2005.00770v2


Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types
http://arxiv.org/abs/2005.12762v2


Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation
http://arxiv.org/abs/2009.09435v2


Exploring the Role of Argument Structure in Online Debate Persuasion
http://arxiv.org/abs/2010.03538v1


Exploring the Role of Prior Beliefs for Argument Persuasion
http://arxiv.org/abs/1906.11301v1


Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data
http://arxiv.org/abs/2010.03656v1


Expressing Visual Relationships via Language
http://arxiv.org/abs/1906.07689v2


Expressive Interviewing: A Conversational System for Coping with COVID-19
http://arxiv.org/abs/2007.03819v1


Expressiveness and Learning of Hidden Quantum Markov Models
http://arxiv.org/abs/1912.02098v1


Extending Implicit Discourse Relation Recognition to the PDTB-3
http://arxiv.org/abs/2010.06294v1


Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples
http://arxiv.org/abs/1805.06556v1


Extensively Matching for Few-shot Learning Event Detection
http://arxiv.org/abs/2006.10093v1


Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation
http://arxiv.org/abs/1904.02331v1


Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches
http://arxiv.org/abs/2005.03035v1


Extracting Implicitly Asserted Propositions in Argumentation
http://arxiv.org/abs/2010.02654v1


Extracting Symptoms and their Status from Clinical Conversations
http://arxiv.org/abs/1906.02239v1


Extractive Summarization as Text Matching
http://arxiv.org/abs/2004.08795v1


Extragradient with player sampling for faster Nash equilibrium finding
http://arxiv.org/abs/1905.12363v5


Extrapolating the profile of a finite population
http://arxiv.org/abs/2005.10561v1


Extreme Multi-label Classification from Aggregated Labels
http://arxiv.org/abs/2004.00198v1


FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization
http://arxiv.org/abs/2005.03754v1


FFR V1.0: Fon-French Neural Machine Translation
http://arxiv.org/abs/2003.12111v1


FFR v1.1: Fon-French Neural Machine Translation
http://arxiv.org/abs/2006.09217v1


FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms
http://arxiv.org/abs/1906.12230v1


FLAT: Chinese NER Using Flat-Lattice Transformer
http://arxiv.org/abs/2004.11795v2


F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax
http://arxiv.org/abs/2009.09417v2


Facebook AI's WMT20 News Translation Task Submission
http://arxiv.org/abs/2011.08298v1


Facet-Aware Evaluation for Extractive Summarization
http://arxiv.org/abs/1908.10383v2


Facilitating the Communication of Politeness through Fine-Grained Paraphrasing
http://arxiv.org/abs/2012.00012v1


Fact or Fiction: Verifying Scientific Claims
http://arxiv.org/abs/2004.14974v6


Fact-based Text Editing
http://arxiv.org/abs/2007.00916v1


Factorising AMR generation through syntax
http://arxiv.org/abs/1804.07707v2


Factual Error Correction for Abstractive Summarization Models
http://arxiv.org/abs/2010.08712v1


Fair Bayesian Optimization
http://arxiv.org/abs/2006.05109v1


Fair Correlation Clustering
http://arxiv.org/abs/2002.02274v2


Fair Decisions Despite Imperfect Predictions
http://arxiv.org/abs/1902.02979v4


Fair Embedding Engine: A Library for Analyzing and Mitigating Gender Bias in Word Embeddings
http://arxiv.org/abs/2010.13168v1


Fair Generative Modeling via Weak Supervision
http://arxiv.org/abs/1910.12008v2


Fair Learning with Private Demographic Data
http://arxiv.org/abs/2002.11651v2


Fairness in the Eyes of the Data: Certifying Machine-Learning Models
http://arxiv.org/abs/2009.01534v1


Fairwashing Explanations with Off-Manifold Detergent
http://arxiv.org/abs/2007.09969v1


Familywise Error Rate Control by Interactive Unmasking
http://arxiv.org/abs/2002.08545v3


Fast Adaptation via Policy-Dynamics Value Functions
http://arxiv.org/abs/2007.02879v1


Fast Algorithms for Computational Optimal Transport and Wasserstein Barycenter
http://arxiv.org/abs/1905.09952v4


Fast Differentiable Sorting and Ranking
http://arxiv.org/abs/2002.08871v2


Fast Interleaved Bidirectional Sequence Generation
http://arxiv.org/abs/2010.14481v1


Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case
http://arxiv.org/abs/2006.14117v1


Fast Linear Convergence of Randomized BFGS
http://arxiv.org/abs/2002.11337v3


Fast Markov Chain Monte Carlo Algorithms via Lie Groups
http://arxiv.org/abs/1901.08606v2


Fast OSCAR and OWL Regression via Safe Screening Rules
http://arxiv.org/abs/2006.16433v1


Fast Physical Activity Suggestions: Efficient Hyperparameter Learning in Mobile Health
http://arxiv.org/abs/2012.11646v1


Fast Rates for Online Prediction with Abstention
http://arxiv.org/abs/2001.10623v2


Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning
http://arxiv.org/abs/2004.08097v1


Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation
http://arxiv.org/abs/1910.04920v2


Fast and Scalable Expansion of Natural Language Understanding Functionality for Intelligent Agents
http://arxiv.org/abs/1805.01542v1


Fast semantic parsing with well-typedness guarantees
http://arxiv.org/abs/2009.07365v2


Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set
http://arxiv.org/abs/1708.09403v1


Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees
http://arxiv.org/abs/1608.04465v1


FastBERT: a Self-distilling BERT with Adaptive Inference Time
http://arxiv.org/abs/2004.02178v2


FastFormers: Highly Efficient Transformer Models for Natural Language Understanding
http://arxiv.org/abs/2010.13382v1


Faster Graph Embeddings via Coarsening
http://arxiv.org/abs/2007.02817v3


Faster Projection-free Online Learning
http://arxiv.org/abs/2001.11568v2


Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training
http://arxiv.org/abs/2009.11538v3


Feature Noise Induces Loss Discrepancy Across Groups
http://arxiv.org/abs/1911.09876v2


Feature Quantization Improves GAN Training
http://arxiv.org/abs/2004.02088v2


Feature Selection using Stochastic Gates
http://arxiv.org/abs/1810.04247v7


Feature relevance quantification in explainable AI: A causal problem
http://arxiv.org/abs/1910.13413v2


Feature-map-level Online Adversarial Knowledge Distillation
http://arxiv.org/abs/2002.01775v3


FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization
http://arxiv.org/abs/1909.13014v4


Federated Heavy Hitters Discovery with Differential Privacy
http://arxiv.org/abs/1902.08534v4


Federated Learning with Only Positive Labels
http://arxiv.org/abs/2004.10342v1


Fenchel Lifted Networks: A Lagrange Relaxation of Neural Network Training
http://arxiv.org/abs/1811.08039v3


FetchSGD: Communication-Efficient Federated Learning with Sketching
http://arxiv.org/abs/2007.07682v2


Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning
http://arxiv.org/abs/2010.15877v1


Few-Shot Learning for Opinion Summarization
http://arxiv.org/abs/2004.14884v3


Few-Shot NLG with Pre-Trained Language Model
http://arxiv.org/abs/1904.09521v3


Few-shot Domain Adaptation by Causal Mechanism Transfer
http://arxiv.org/abs/2002.03497v2


Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs
http://arxiv.org/abs/2007.02387v1


Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network
http://arxiv.org/abs/2006.05702v1


Few-shot link prediction via graph neural networks for Covid-19 drug-repurposing
http://arxiv.org/abs/2007.10261v1


FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation
http://arxiv.org/abs/1810.10147v2


Fiduciary Bandits
http://arxiv.org/abs/1905.07043v3


Fiedler Regularization: Learning Neural Networks with Graph Sparsity
http://arxiv.org/abs/2003.00992v3


Field-Level Crop Type Classification with k Nearest Neighbors: A Baseline for a New Kenya Smallholder Dataset
http://arxiv.org/abs/2004.03023v1


Fill in the BLANC: Human-free quality estimation of document summaries
http://arxiv.org/abs/2002.09836v2


Filling Missing Paths: Modeling Co-occurrences of Word Pairs and Dependency Paths for Recognizing Lexical Semantic Relations
http://arxiv.org/abs/1809.03411v1


Filtering Noisy Dialogue Corpora by Connectivity and Content Relatedness
http://arxiv.org/abs/2004.14008v2


FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance
http://arxiv.org/abs/2011.09607v1


Finding Convincing Arguments Using Scalable Bayesian Preference Learning
http://arxiv.org/abs/1806.02418v1


Finding Syntax in Human Encephalography with Beam Search
http://arxiv.org/abs/1806.04127v1


Finding Universal Grammatical Relations in Multilingual BERT
http://arxiv.org/abs/2005.04511v2


Finding Your Voice: The Linguistic Development of Mental Health Counselors
http://arxiv.org/abs/1906.07194v1


Finding trainable sparse networks through Neural Tangent Transfer
http://arxiv.org/abs/2006.08228v2


Fine Grained Citation Span for References in Wikipedia
http://arxiv.org/abs/1707.07278v1


Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences
http://arxiv.org/abs/2005.03436v2


Fine-Grained Prediction of Syntactic Typology: Discovering Latent Structure with Supervised Learning
http://arxiv.org/abs/1710.03877v1


Fine-Grained Temporal Relation Extraction
http://arxiv.org/abs/1902.01390v2


Fine-grained Fact Verification with Kernel Graph Attention Network
http://arxiv.org/abs/1910.09796v3


Fine-grained linguistic evaluation for state-of-the-art Machine Translation
http://arxiv.org/abs/2010.06359v2


Finite Regret and Cycles with Fixed Step-Size via Alternating Gradient Descent-Ascent
http://arxiv.org/abs/1907.04392v1


Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise
http://arxiv.org/abs/2002.01268v1


Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation
http://arxiv.org/abs/1911.00934v2


Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning
http://arxiv.org/abs/2002.00260v1


Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games
http://arxiv.org/abs/2002.09806v4


Fixed-Confidence Guarantees for Bayesian Best-Arm Identification
http://arxiv.org/abs/1910.10945v3


Flexible and Efficient Long-Range Planning Through Curious Exploration
http://arxiv.org/abs/2004.10876v2


Flexible retrieval with NMSLIB and FlexNeuART
http://arxiv.org/abs/2010.14848v2


Flow Models for Arbitrary Conditional Likelihoods
http://arxiv.org/abs/1909.06319v2


Fluent Response Generation for Conversational Question Answering
http://arxiv.org/abs/2005.10464v2


Forecasting Sequential Data using Consistent Koopman Autoencoders
http://arxiv.org/abs/2003.02236v2


Formal Limitations on the Measurement of Mutual Information
http://arxiv.org/abs/1811.04251v4


Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages
http://arxiv.org/abs/1804.06024v1


Fortifying Toxic Speech Detectors Against Veiled Toxicity
http://arxiv.org/abs/2010.03154v1


Fractal Gaussian Networks: A sparse random graph model based on Gaussian Multiplicative Chaos
http://arxiv.org/abs/2008.03038v1


Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise
http://arxiv.org/abs/2002.05685v2


Free Energy Wells and Overlap Gap Property in Sparse PCA
http://arxiv.org/abs/2006.10689v1


Frequency Bias in Neural Networks for Input of Non-Uniform Density
http://arxiv.org/abs/2003.04560v1


Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions
http://arxiv.org/abs/2006.13707v2


Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts
http://arxiv.org/abs/1704.07828v2


From Arguments to Key Points: Towards Automatic Argument Summarization
http://arxiv.org/abs/2005.01619v2


From Data to Decisions: Distributionally Robust Optimization is Optimal
http://arxiv.org/abs/1704.04118v3


From Dataset Recycling to Multi-Property Extraction and Beyond
http://arxiv.org/abs/2011.03228v1


From English to Code-Switching: Transfer Learning with Strong Morphological Clues
http://arxiv.org/abs/1909.05158v3


From ImageNet to Image Classification: Contextualizing Progress on Benchmarks
http://arxiv.org/abs/2005.11295v1


From Importance Sampling to Doubly Robust Policy Gradient
http://arxiv.org/abs/1910.09066v3


From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood
http://arxiv.org/abs/1704.07926v1


From Machine Reading Comprehension to Dialogue State Tracking: Bridging the Gap
http://arxiv.org/abs/2004.05827v1


From Nesterov's Estimate Sequence to Riemannian Acceleration
http://arxiv.org/abs/2001.08876v1


From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model
http://arxiv.org/abs/1903.00558v2


From Paraphrase Database to Compositional Paraphrase Model and Back
http://arxiv.org/abs/1506.03487v2


From Predictions to Decisions: Using Lookahead Regularization
http://arxiv.org/abs/2006.11638v2


From Speech-to-Speech Translation to Automatic Dubbing
http://arxiv.org/abs/2001.06785v3


From tree matching to sparse graph alignment
http://arxiv.org/abs/2002.01258v2


Frowning Frodo, Wincing Leia, and a Seriously Great Friendship: Learning to Classify Emotional Relationships of Fictional Characters
http://arxiv.org/abs/1903.12453v2


Frustratingly Hard Evidence Retrieval for QA Over Books
http://arxiv.org/abs/2007.09878v1


Frustratingly Simple Few-Shot Object Detection
http://arxiv.org/abs/2003.06957v1


Fully Character-Level Neural Machine Translation without Explicit Segmentation
http://arxiv.org/abs/1610.03017v3


Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs
http://arxiv.org/abs/1901.08460v4


Fully Parallel Hyperparameter Search: Reshaped Space-Filling
http://arxiv.org/abs/1910.08406v2


Fully reversible neural networks for large-scale surface and sub-surface characterization via remote sensing
http://arxiv.org/abs/2003.07474v1


Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations
http://arxiv.org/abs/2002.04599v2


GANterpretations
http://arxiv.org/abs/2011.05158v1


GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media
http://arxiv.org/abs/2004.11648v1


GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification
http://arxiv.org/abs/1908.01843v1


GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation
http://arxiv.org/abs/1906.12192v5


GP-VAE: Deep Probabilistic Time Series Imputation
http://arxiv.org/abs/1907.04155v5


GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems
http://arxiv.org/abs/2010.03994v1


Gaining Insight into SARS-CoV-2 Infection and COVID-19 Severity Using Self-supervised Edge Features and Graph Neural Networks
http://arxiv.org/abs/2006.12971v2


Games for Fairness and Interpretability
http://arxiv.org/abs/2004.09551v1


Gamification of Pure Exploration for Linear Bandits
http://arxiv.org/abs/2007.00953v1


Gated Convolutional Bidirectional Attention-based Model for Off-topic Spoken Response Detection
http://arxiv.org/abs/2004.09036v4


Gaussian Mixture Latent Vector Grammars
http://arxiv.org/abs/1805.04688v1


Gaussian Sketching yields a J-L Lemma in RKHS
http://arxiv.org/abs/1908.05818v2


Gaussianization Flows
http://arxiv.org/abs/2003.01941v1


GenAug: Data Augmentation for Finetuning Text Generators
http://arxiv.org/abs/2010.01794v2


Gender Bias in Contextualized Word Embeddings
http://arxiv.org/abs/1904.03310v1


Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer
http://arxiv.org/abs/2005.00699v1


Gender Coreference and Bias Evaluation at WMT 2020
http://arxiv.org/abs/2010.06018v1


Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations
http://arxiv.org/abs/2005.00962v2


Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus
http://arxiv.org/abs/2006.05754v1


Gender-preserving Debiasing for Pre-trained Word Embeddings
http://arxiv.org/abs/1906.00742v1


General Identification of Dynamic Treatment Regimes Under Interference
http://arxiv.org/abs/2004.01218v1


Generalisation error in learning with random features and the hidden manifold model
http://arxiv.org/abs/2002.09339v2


Generalization Error of Generalized Linear Models in High Dimensions
http://arxiv.org/abs/2005.00180v1


Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features
http://arxiv.org/abs/2002.04195v1


Generalization and Representational Limits of Graph Neural Networks
http://arxiv.org/abs/2002.06157v1


Generalization to New Actions in Reinforcement Learning
http://arxiv.org/abs/2011.01928v1


Generalized Data Augmentation for Low-Resource Translation
http://arxiv.org/abs/1906.03785v1


Generalized and Scalable Optimal Sparse Decision Trees
http://arxiv.org/abs/2006.08690v3


Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data
http://arxiv.org/abs/2002.12880v3


Generalizing Natural Language Analysis through Span-relation Representations
http://arxiv.org/abs/1911.03822v2


Generalizing Word Embeddings using Bag of Subwords
http://arxiv.org/abs/1809.04259v1


Generalizing and Hybridizing Count-based and Neural Language Models
http://arxiv.org/abs/1606.00499v2


Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation
http://arxiv.org/abs/2004.07672v4


Generating Automatic Curricula via Self-Supervised Active Domain Randomization
http://arxiv.org/abs/2002.07911v2


Generating Counter Narratives against Online Hate Speech: Data and Strategies
http://arxiv.org/abs/2004.04216v1


Generating Dialogue Responses from a Semantic Latent Space
http://arxiv.org/abs/2010.01658v1


Generating Diverse Translation from Model Distribution with Dropout
http://arxiv.org/abs/2010.08178v1


Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs
http://arxiv.org/abs/2005.13837v5


Generating Fact Checking Briefs
http://arxiv.org/abs/2011.05448v1


Generating Fact Checking Explanations
http://arxiv.org/abs/2004.05773v1


Generating Fine-Grained Open Vocabulary Entity Type Descriptions
http://arxiv.org/abs/1805.10564v1


Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection
http://arxiv.org/abs/2004.02015v3


Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze
http://arxiv.org/abs/2011.04592v1


Generating Label Cohesive and Well-Formed Adversarial Claims
http://arxiv.org/abs/2009.08205v1


Generating Logical Forms from Graph Representations of Text and Entities
http://arxiv.org/abs/1905.08407v3


Generating Narrative Text in a Switching Dynamical System
http://arxiv.org/abs/2004.03762v1


Generating Negative Commonsense Knowledge
http://arxiv.org/abs/2011.07497v1


Generating Novel Glyph without Human Data by Learning to Communicate
http://arxiv.org/abs/2010.04402v2


Generating Question Relevant Captions to Aid Visual Question Answering
http://arxiv.org/abs/1906.00513v3


Generating Radiology Reports via Memory-driven Transformer
http://arxiv.org/abs/2010.16056v1


Generating Sentences by Editing Prototypes
http://arxiv.org/abs/1709.08878v2


Generating Summaries with Topic Templates and Structured Convolutional Decoders
http://arxiv.org/abs/1906.04687v1


Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution
http://arxiv.org/abs/1606.01603v3


Generative Adversarial Imitation from Observation
http://arxiv.org/abs/1807.06158v4


Generative Adversarial User Privacy in Lossy Single-Server Information Retrieval
http://arxiv.org/abs/2012.03902v1


Generative Flows with Matrix Exponential
http://arxiv.org/abs/2007.09651v1


Generative ODE Modeling with Known Unknowns
http://arxiv.org/abs/2003.10775v1


Generative Semantic Hashing Enhanced via Boltzmann Machines
http://arxiv.org/abs/2006.08858v1


Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data
http://arxiv.org/abs/1912.07768v1


Geometric Dataset Distances via Optimal Transport
http://arxiv.org/abs/2002.02923v1


Geometry-aware Domain Adaptation for Unsupervised Alignment of Word Embeddings
http://arxiv.org/abs/2004.08243v2


Geoopt: Riemannian Optimization in PyTorch
http://arxiv.org/abs/2005.02819v5


Getting a CLUE: A Method for Explaining Uncertainty Estimates
http://arxiv.org/abs/2006.06848v1


Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis?
http://arxiv.org/abs/2005.13213v1


Giving Attention to the Unexpected: Using Prosody Innovations in Disfluency Detection
http://arxiv.org/abs/1904.04388v1


Global Neural CCG Parsing with Optimality Guarantees
http://arxiv.org/abs/1607.01432v2


Global-to-Local Neural Networks for Document-Level Relation Extraction
http://arxiv.org/abs/2009.10359v1


Globally Normalized Reader
http://arxiv.org/abs/1709.02828v1


Go Wide, Then Narrow: Efficient Training of Deep Thin Networks
http://arxiv.org/abs/2007.00811v2


Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection
http://arxiv.org/abs/2003.01794v3


Good-Enough Compositional Data Augmentation
http://arxiv.org/abs/1904.09545v4


GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing
http://arxiv.org/abs/2009.13845v1


Gradient Based Memory Editing for Task-Free Continual Learning
http://arxiv.org/abs/2006.15294v1


Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
http://arxiv.org/abs/1903.11680v3


Gradient Temporal-Difference Learning with Regularized Corrections
http://arxiv.org/abs/2007.00611v4


Gradient descent algorithms for Bures-Wasserstein barycenters
http://arxiv.org/abs/2001.01700v2


Gradient descent follows the regularization path for general losses
http://arxiv.org/abs/2006.11226v1


Gradient-free Online Learning in Games with Delayed Rewards
http://arxiv.org/abs/2006.10911v1


GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values
http://arxiv.org/abs/2001.11113v7


Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses
http://arxiv.org/abs/2010.07574v1


Graph Clustering with Graph Neural Networks
http://arxiv.org/abs/2006.16904v1


Graph Coarsening with Preserved Spectral Properties
http://arxiv.org/abs/1802.04447v2


Graph Convolutional Gaussian Processes For Link Prediction
http://arxiv.org/abs/2002.04337v1


Graph Convolutional Network for Recommendation with Low-pass Collaborative Filters
http://arxiv.org/abs/2006.15516v2


Graph Convolutions over Constituent Trees for Syntax-Aware Semantic Role Labeling
http://arxiv.org/abs/1909.09814v3


Graph DNA: Deep Neighborhood Aware Graph Encoding for Collaborative Filtering
http://arxiv.org/abs/1905.12217v1


Graph Filtration Learning
http://arxiv.org/abs/1905.10996v2


Graph Homomorphism Convolution
http://arxiv.org/abs/2005.01214v2


Graph Learning for Inverse Landscape Genetics
http://arxiv.org/abs/2006.12334v2


Graph Neural Networks for Massive MIMO Detection
http://arxiv.org/abs/2007.05703v1


Graph Neural Networks for the Prediction of Substrate-Specific Organic Reaction Conditions
http://arxiv.org/abs/2007.04275v2


Graph Neural Networks in TensorFlow and Keras with Spektral
http://arxiv.org/abs/2006.12138v1


Graph Optimal Transport for Cross-Domain Alignment
http://arxiv.org/abs/2006.14744v3


Graph Pattern Entity Ranking Model for Knowledge Graph Completion
http://arxiv.org/abs/1904.02856v1


Graph Random Neural Features for Distance-Preserving Graph Representations
http://arxiv.org/abs/1909.03790v3


Graph Structure of Neural Networks
http://arxiv.org/abs/2007.06559v2


Graph based Neural Networks for Event Factuality Prediction using Syntactic and Semantic Structures
http://arxiv.org/abs/1907.03227v1


Graph neural induction of value iteration
http://arxiv.org/abs/2009.12604v1


Graph-based Nearest Neighbor Search: From Practice to Theory
http://arxiv.org/abs/1907.00845v4


Graph-based, Self-Supervised Program Repair from Diagnostic Feedback
http://arxiv.org/abs/2005.10636v2


GraphDialog: Integrating Graph Knowledge into End-to-End Task-Oriented Dialogue Systems
http://arxiv.org/abs/2010.01447v1


GraphOpt: Learning Optimization Models of Graph Formation
http://arxiv.org/abs/2007.03619v1


Graphs, Entities, and Step Mixture
http://arxiv.org/abs/2005.08485v2


Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection
http://arxiv.org/abs/1709.00575v1


Greedy Search with Probabilistic N-gram Matching for Neural Machine Translation
http://arxiv.org/abs/1809.03132v1


Gromov-Wasserstein Alignment of Word Embedding Spaces
http://arxiv.org/abs/1809.00013v1


Grounded Adaptation for Zero-shot Executable Semantic Parsing
http://arxiv.org/abs/2009.07396v2


Grounded Compositional Outputs for Adaptive Language Modeling
http://arxiv.org/abs/2009.11523v2


Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs
http://arxiv.org/abs/1911.02707v3


Grounding Conversations with Improvised Dialogues
http://arxiv.org/abs/2004.09544v2


Group Equivariant Deep Reinforcement Learning
http://arxiv.org/abs/2007.03437v1


Growing Action Spaces
http://arxiv.org/abs/1906.12266v1


Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation
http://arxiv.org/abs/2006.04050v1


Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis
http://arxiv.org/abs/1906.09231v2


Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization
http://arxiv.org/abs/2006.16840v1


Guiding Attention for Self-Supervised Learning with Transformers
http://arxiv.org/abs/2010.02399v1


Guiding Variational Response Generator to Exploit Persona
http://arxiv.org/abs/1911.02390v2


HABERTOR: An Efficient and Effective Deep Hatespeech Detector
http://arxiv.org/abs/2010.08865v1


HAT: Hardware-Aware Transformers for Efficient Natural Language Processing
http://arxiv.org/abs/2005.14187v1


HEAD-QA: A Healthcare Dataset for Complex Reasoning
http://arxiv.org/abs/1906.04701v1


HENIN: Learning Heterogeneous Neural Interaction Networks for Explainable Cyberbullying Detection on Social Media
http://arxiv.org/abs/2010.04576v1


HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
http://arxiv.org/abs/2005.00200v2


HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization
http://arxiv.org/abs/1905.06566v1


HNHN: Hypergraph Networks with Hyperedge Neurons
http://arxiv.org/abs/2006.12278v1


Haar Graph Pooling
http://arxiv.org/abs/1909.11580v3


Haar Wavelet based Block Autoregressive Flows for Trajectories
http://arxiv.org/abs/2009.09878v1


Hallucinative Topological Memory for Zero-Shot Visual Planning
http://arxiv.org/abs/2002.12336v1


Halpern Iteration for Near-Optimal and Parameter-Free Monotone Inclusion and Strong Solutions to Variational Inequalities
http://arxiv.org/abs/2002.08872v3


Hamiltonian Graph Networks with ODE Integrators
http://arxiv.org/abs/1909.12790v1


Hamiltonian Monte Carlo Swindles
http://arxiv.org/abs/2001.05033v2


Handling Divergent Reference Texts when Evaluating Table-to-Text Generation
http://arxiv.org/abs/1906.01081v1


Handling Noisy Labels for Robustly Learning from Self-Training Data for Low-Resource Sequence Labeling
http://arxiv.org/abs/1903.12008v1


Handling the Positive-Definite Constraint in the Bayesian Learning Rule
http://arxiv.org/abs/2002.10060v13


Hard-Coded Gaussian Attention for Neural Machine Translation
http://arxiv.org/abs/2005.00742v1


Hardness of Identity Testing for Restricted Boltzmann Machines and Potts models
http://arxiv.org/abs/2004.10805v1


Harmonic Decompositions of Convolutional Networks
http://arxiv.org/abs/2003.12756v2


Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity
http://arxiv.org/abs/2011.02614v1


Harnessing the linguistic signal to predict scalar inferences
http://arxiv.org/abs/1910.14254v2


Harry Potter and the Action Prediction Challenge from Natural Language
http://arxiv.org/abs/1905.11037v1


Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia
http://arxiv.org/abs/1805.05942v1


Harvesting and Refining Question-Answer Pairs for Unsupervised QA
http://arxiv.org/abs/2005.02925v1


Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation
http://arxiv.org/abs/1808.07048v1


Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora
http://arxiv.org/abs/1806.03191v1


Helping Reduce Environmental Impact of Aviation with Machine Learning
http://arxiv.org/abs/2012.09433v1


Hermitian matrices for clustering directed graphs: insights and applications
http://arxiv.org/abs/1908.02096v1


Heterogeneous Graph Neural Networks for Extractive Document Summarization
http://arxiv.org/abs/2004.12393v1


Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach
http://arxiv.org/abs/1707.00166v2


Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization
http://arxiv.org/abs/2006.15766v1


Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling
http://arxiv.org/abs/2012.12803v2


Hierarchical Clustering: a 0.585 Revenue Approximation
http://arxiv.org/abs/2006.01933v1


Hierarchical Entity Typing via Multi-level Learning to Rank
http://arxiv.org/abs/2004.02286v1


Hierarchical Evidence Set Modeling for Automated Fact Extraction and Verification
http://arxiv.org/abs/2010.05111v1


Hierarchical Generation of Molecular Graphs using Structural Motifs
http://arxiv.org/abs/2002.03230v2


Hierarchical Graph Network for Multi-hop Question Answering
http://arxiv.org/abs/1911.03631v4


Hierarchical Inter-Message Passing for Learning on Molecular Graphs
http://arxiv.org/abs/2006.12179v1


Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking
http://arxiv.org/abs/1807.05127v1


Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts
http://arxiv.org/abs/1808.06161v1


Hierarchical Neural Story Generation
http://arxiv.org/abs/1805.04833v1


Hierarchical Protein Function Prediction with Tail-GNNs
http://arxiv.org/abs/2007.12804v1


Hierarchical Quantized Representations for Script Generation
http://arxiv.org/abs/1808.09542v1


Hierarchical Structured Model for Fine-to-coarse Manifesto Text Analysis
http://arxiv.org/abs/1805.02823v1


Hierarchical Transformers for Multi-Document Summarization
http://arxiv.org/abs/1905.13164v1


Hierarchical Verification for Adversarial Robustness
http://arxiv.org/abs/2007.11826v1


Hierarchically Decoupled Imitation for Morphological Transfer
http://arxiv.org/abs/2003.01709v2


High Dimensional Robust Sparse Regression
http://arxiv.org/abs/1805.11643v3


High Resolution Medical Image Analysis with Spatial Partitioning
http://arxiv.org/abs/1909.03108v3


High-Dimensional Robust Mean Estimation via Gradient Descent
http://arxiv.org/abs/2005.01378v1


HighRES: Highlight-based Reference-less Evaluation of Summarization
http://arxiv.org/abs/1906.01361v1


Higher-order Coreference Resolution with Coarse-to-fine Inference
http://arxiv.org/abs/1804.05392v1


Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
http://arxiv.org/abs/2004.08178v5


History for Visual Dialog: Do we really need it?
http://arxiv.org/abs/2005.07493v1


History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms
http://arxiv.org/abs/1910.09670v4


Hooks in the Headline: Learning to Generate Headlines with Controlled Styles
http://arxiv.org/abs/2004.01980v3


HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
http://arxiv.org/abs/1809.09600v1


How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
http://arxiv.org/abs/2005.00955v1


How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence
http://arxiv.org/abs/2004.12158v5


How Does Selective Mechanism Improve Self-Attention Networks?
http://arxiv.org/abs/2005.00979v1


How Furiously Can Colourless Green Ideas Sleep? Sentence Acceptability in Context
http://arxiv.org/abs/2004.00881v1


How Good is the Bayes Posterior in Deep Neural Networks Really?
http://arxiv.org/abs/2002.02405v2


How Large a Vocabulary Does Text Classification Need? A Variational Approach to Vocabulary Selection
http://arxiv.org/abs/1902.10339v4


How Much Knowledge Can You Pack Into the Parameters of a Language Model?
http://arxiv.org/abs/2002.08910v4


How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks
http://arxiv.org/abs/1808.04926v2


How To Backdoor Federated Learning
http://arxiv.org/abs/1807.00459v3


How agents see things: On visual representations in an emergent language game
http://arxiv.org/abs/1808.10696v2


How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking
http://arxiv.org/abs/2004.14992v2


How much complexity does an RNN architecture need to learn syntax-sensitive dependencies?
http://arxiv.org/abs/2005.08199v2


How multilingual is Multilingual BERT?
http://arxiv.org/abs/1906.01502v1


How recurrent networks implement contextual processing in sentiment analysis
http://arxiv.org/abs/2004.08013v1


How to Grow a (Product) Tree: Personalized Category Suggestions for eCommerce Type-Ahead
http://arxiv.org/abs/2005.12781v1


How to Make Deep RL Work in Practice
http://arxiv.org/abs/2010.13083v2


How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation
http://arxiv.org/abs/2006.09109v2


How to trap a gradient flow
http://arxiv.org/abs/2001.02968v3


How well does surprisal explain N400 amplitude under different experimental conditions?
http://arxiv.org/abs/2010.04844v1


Howl: A Deployed, Open-Source Wake Word Detection System
http://arxiv.org/abs/2008.09606v1


Human computation requires and enables a new approach to ethical review
http://arxiv.org/abs/2011.10754v1


Human-Like Active Learning: Machines Simulating the Human Learning Process
http://arxiv.org/abs/2011.03733v1


Human-Paraphrased References Improve Neural Machine Translation
http://arxiv.org/abs/2010.10245v1


Human-centric Dialog Training via Offline Reinforcement Learning
http://arxiv.org/abs/2010.05848v1


Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning
http://arxiv.org/abs/1702.03274v2


Hybrid Session-based News Recommendation using Recurrent Neural Networks
http://arxiv.org/abs/2006.13063v1


Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization
http://arxiv.org/abs/2009.09835v1


HydroNets: Leveraging River Structure for Hydrologic Modeling
http://arxiv.org/abs/2007.00595v1


Hyper-spectral NIR and MIR data and optimal wavebands for detection of apple tree diseases
http://arxiv.org/abs/2004.02325v3


Hyperbolic Manifold Regression
http://arxiv.org/abs/2005.13885v1


Hypernetwork approach to generating point clouds
http://arxiv.org/abs/2003.00802v2


Hyperparameter Auto-tuning in Self-Supervised Robotic Learning
http://arxiv.org/abs/2010.08252v3


Hypothesis Testing Interpretations and Renyi Differential Privacy
http://arxiv.org/abs/1905.09982v2


ID3 Learns Juntas for Smoothed Product Distributions
http://arxiv.org/abs/1906.08654v1


IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation
http://arxiv.org/abs/2011.05744v1


IIRC: A Dataset of Incomplete Information Reading Comprehension Questions
http://arxiv.org/abs/2011.07127v1


IMHO Fine-Tuning Improves Claim Detection
http://arxiv.org/abs/1905.07000v1


IMoJIE: Iterative Memory-Based Joint Open Information Extraction
http://arxiv.org/abs/2005.08178v1


INFOTABS: Inference on Tables as Semi-structured Data
http://arxiv.org/abs/2005.06117v1


INSET: Sentence Infilling with INter-SEntential Transformer
http://arxiv.org/abs/1911.03892v2


INSPIRED: Toward Sociable Recommendation Dialog Systems
http://arxiv.org/abs/2009.14306v2


IROF: a low resource evaluation metric for explanation methods
http://arxiv.org/abs/2003.08747v1


IV-Posterior: Inverse Value Estimation for Interpretable Policy Certificates
http://arxiv.org/abs/2012.01925v1


Identifying Semantic Divergences in Parallel Text without Annotations
http://arxiv.org/abs/1803.11112v1


Identifying and Correcting Label Bias in Machine Learning
http://arxiv.org/abs/1901.04966v1


Identifying and Reducing Gender Bias in Word-Level Language Models
http://arxiv.org/abs/1904.03035v1


Identifying civilians killed by police with distantly supervised entity-event extraction
http://arxiv.org/abs/1707.07086v1


If MaxEnt RL is the Answer, What is the Question?
http://arxiv.org/abs/1910.01913v1


If beam search is the answer, what was the question?
http://arxiv.org/abs/2010.02650v1


Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels
http://arxiv.org/abs/2004.13649v3


Image Generation With Neural Cellular Automatas
http://arxiv.org/abs/2010.04949v2


Image Pivoting for Learning Multilingual Multimodal Representations
http://arxiv.org/abs/1707.07601v1


Image-based phenotyping of diverse Rice (Oryza Sativa L.) Genotypes
http://arxiv.org/abs/2004.02498v1


Imitation Attacks and Defenses for Black-box Machine Translation Systems
http://arxiv.org/abs/2004.15015v3


Imitation Learning Approach for AI Driving Olympics Trained on Real-world and Simulation Data Simultaneously
http://arxiv.org/abs/2007.03514v1


Imitation Learning for Neural Morphological String Transduction
http://arxiv.org/abs/1808.10701v1


Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
http://arxiv.org/abs/2002.04486v4


Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation
http://arxiv.org/abs/2006.04996v1


Implicit Generative Modeling for Efficient Exploration
http://arxiv.org/abs/1911.08017v3


Implicit Geometric Regularization for Learning Shapes
http://arxiv.org/abs/2002.10099v2


Implicit Regularization of Random Feature Models
http://arxiv.org/abs/2002.08404v2


Implicit competitive regularization in GANs
http://arxiv.org/abs/1910.05852v4


Implicit regularization and solution uniqueness in over-parameterized matrix sensing
http://arxiv.org/abs/1806.02046v2


Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process
http://arxiv.org/abs/1904.09080v2


Improper Learning for Non-Stochastic Control
http://arxiv.org/abs/2001.09254v3


Improved Natural Language Generation via Loss Truncation
http://arxiv.org/abs/2004.14589v2


Improved Neural Relation Detection for Knowledge Base Question Answering
http://arxiv.org/abs/1704.06194v2


Improved Optimistic Algorithms for Logistic Bandits
http://arxiv.org/abs/2002.07530v2


Improved Regret Bounds for Projection-free Bandit Convex Optimization
http://arxiv.org/abs/1910.03374v1


Improved Relation Extraction with Feature-Rich Compositional Embedding Models
http://arxiv.org/abs/1505.02419v3


Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
http://arxiv.org/abs/1503.00075v3


Improved Semantic-Aware Network Embedding with Fine-Grained Word Alignment
http://arxiv.org/abs/1808.09633v1


Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text
http://arxiv.org/abs/1906.05725v1


Improved Speech Representations with Multi-Target Autoregressive Predictive Coding
http://arxiv.org/abs/2004.05274v1


Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs
http://arxiv.org/abs/1508.00657v2


Improving AMR Parsing with Sequence-to-Sequence Pre-training
http://arxiv.org/abs/2010.01771v1


Improving Abstraction in Text Summarization
http://arxiv.org/abs/1808.07913v1


Improving Adversarial Text Generation by Modeling the Distant Future
http://arxiv.org/abs/2005.01279v1


Improving Candidate Generation for Low-resource Cross-lingual Entity Linking
http://arxiv.org/abs/2003.01343v1


Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation
http://arxiv.org/abs/1804.06506v1


Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining
http://arxiv.org/abs/2009.11321v1


Improving Dialogue State Tracking by Discerning the Relevant Context
http://arxiv.org/abs/1904.02800v1


Improving Disentangled Text Representation Learning with Information-Theoretic Guidance
http://arxiv.org/abs/2006.00693v2


Improving Disfluency Detection by Self-Training a Self-Attentive Model
http://arxiv.org/abs/2004.05323v2


Improving Domain Adaptation Translation with Domain Invariant and Specific Information
http://arxiv.org/abs/1904.03879v2


Improving Generalization by Controlling Label-Noise Information in Neural Network Weights
http://arxiv.org/abs/2002.07933v2


Improving Generative Imagination in Object-Centric World Models
http://arxiv.org/abs/2010.02054v1


Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data
http://arxiv.org/abs/1903.00138v3


Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting
http://arxiv.org/abs/2006.09252v2


Improving Human Text Comprehension through Semi-Markov CRF-based Neural Section Title Generation
http://arxiv.org/abs/1904.07142v1


Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning
http://arxiv.org/abs/1603.07954v3


Improving Knowledge Graph Embedding Using Simple Constraints
http://arxiv.org/abs/1805.02408v2


Improving Lemmatization of Non-Standard Languages with Joint Learning
http://arxiv.org/abs/1903.06939v1


Improving Lexical Choice in Neural Machine Translation
http://arxiv.org/abs/1710.01329v3


Improving Machine Reading Comprehension with General Reading Strategies
http://arxiv.org/abs/1810.13441v2


Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation
http://arxiv.org/abs/2004.11867v1


Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation
http://arxiv.org/abs/2007.06018v1


Improving Molecular Design by Stochastic Iterative Target Augmentation
http://arxiv.org/abs/2002.04720v2


Improving Multi-turn Dialogue Modelling with Utterance ReWriter
http://arxiv.org/abs/1906.07004v1


Improving Multilingual Models with Language-Clustered Vocabularies
http://arxiv.org/abs/2010.12777v1


Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping
http://arxiv.org/abs/1707.02459v1


Improving Neural Conversational Models with Entropy-Based Data Filtering
http://arxiv.org/abs/1905.05471v3


Improving Neural Parsing by Disentangling Model Combination and Reranking Effects
http://arxiv.org/abs/1707.03058v1


Improving Non-autoregressive Neural Machine Translation with Monolingual Data
http://arxiv.org/abs/2005.00932v3


Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader
http://arxiv.org/abs/1905.07098v2


Improving Robustness of Deep-Learning-Based Image Reconstruction
http://arxiv.org/abs/2002.11821v1


Improving Segmentation for Technical Support Problems
http://arxiv.org/abs/2005.11055v1


Improving Slot Filling by Utilizing Contextual Information
http://arxiv.org/abs/1911.01680v2


Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance
http://arxiv.org/abs/2010.06150v1


Improving Text Generation with Student-Forcing Optimal Transport
http://arxiv.org/abs/2010.05994v1


Improving Topic Models with Latent Feature Word Representations
http://arxiv.org/abs/1810.06306v1


Improving Transformer Models by Reordering their Sublayers
http://arxiv.org/abs/1911.03864v2


Improving Truthfulness of Headline Generation
http://arxiv.org/abs/2005.00882v2


Improving Unsupervised Word-by-Word Translation with Language Model and Denoising Autoencoder
http://arxiv.org/abs/1901.01590v1


Improving Yorùbá Diacritic Restoration
http://arxiv.org/abs/2003.10564v1


Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback
http://arxiv.org/abs/1805.01252v2


Improving fairness in machine learning systems: What do industry practitioners need?
http://arxiv.org/abs/1812.05239v2


Improving robustness against common corruptions by covariate shift adaptation
http://arxiv.org/abs/2006.16971v2


Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction
http://arxiv.org/abs/2010.03260v1


Improving the Gating Mechanism of Recurrent Neural Networks
http://arxiv.org/abs/1910.09890v2


Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization
http://arxiv.org/abs/1906.00072v1


Imputation estimators for unnormalized models with missing data
http://arxiv.org/abs/1903.03630v2


Imputer: Sequence Modelling via Imputation and Dynamic Programming
http://arxiv.org/abs/2002.08926v2


In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology
http://arxiv.org/abs/2005.13575v1


In-domain representation learning for remote sensing
http://arxiv.org/abs/1911.06721v1


Incentive-Compatible Forecasting Competitions
http://arxiv.org/abs/2101.01816v1


Incentives for Federated Learning: a Hypothesis Elicitation Approach
http://arxiv.org/abs/2007.10596v1


Incidence Networks for Geometric Deep Learning
http://arxiv.org/abs/1905.11460v4


Incomplete Utterance Rewriting as Semantic Segmentation
http://arxiv.org/abs/2009.13166v1


Incorporate Semantic Structures into Machine Translation Evaluation via UCCA
http://arxiv.org/abs/2010.08728v2


Incorporating Behavioral Hypotheses for Query Generation
http://arxiv.org/abs/2010.02667v1


Incorporating External Knowledge through Pre-training for Natural Language to Code Generation
http://arxiv.org/abs/2004.09015v1


Incorporating Subword Information into Matrix Factorization Word Embeddings
http://arxiv.org/abs/1805.03710v1


Incorporating Terminology Constraints in Automatic Post-Editing
http://arxiv.org/abs/2010.09608v1


Incorporating Uncertain Segmentation Information into Chinese NER for Social Media Text
http://arxiv.org/abs/2004.06384v2


Incorporating a Local Translation Mechanism into Non-autoregressive Translation
http://arxiv.org/abs/2011.06132v1


Increasing performance of electric vehicles in ride-hailing services using deep reinforcement learning
http://arxiv.org/abs/1912.03408v1


Incremental Neural Coreference Resolution in Constant Memory
http://arxiv.org/abs/2005.00128v2


Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU
http://arxiv.org/abs/2010.05330v1


Incremental Sampling Without Replacement for Sequence Models
http://arxiv.org/abs/2002.09067v1


Incremental Transformer with Deliberation Decoder for Document Grounded Conversations
http://arxiv.org/abs/1907.08854v3


Independent Subspace Analysis for Unsupervised Learning of Disentangled Representations
http://arxiv.org/abs/1909.05063v1


Individual Calibration with Randomized Forecasting
http://arxiv.org/abs/2006.10288v3


Induced Inflection-Set Keyword Search in Speech
http://arxiv.org/abs/1910.12299v2


Inductive Relation Prediction by Subgraph Reasoning
http://arxiv.org/abs/1911.06962v2


Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization
http://arxiv.org/abs/1903.01818v3


Inexact Tensor Methods with Dynamic Accuracies
http://arxiv.org/abs/2002.09403v2


Inference Strategies for Machine Translation with Conditional Masking
http://arxiv.org/abs/2010.02352v2


Inference of Dynamic Graph Changes for Functional Connectome
http://arxiv.org/abs/1905.09993v2


Inferring Which Medical Treatments Work from Reports of Clinical Trials
http://arxiv.org/abs/1904.01606v2


Inferring astrophysical X-ray polarization with deep learning
http://arxiv.org/abs/2005.08126v1


Infinite attention: NNGP and NTK for deep attention networks
http://arxiv.org/abs/2006.10540v1


Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models
http://arxiv.org/abs/2005.01190v1


Information Aggregation for Multi-Head Attention with Routing-by-Agreement
http://arxiv.org/abs/1904.03100v1


Information Directed Sampling for Linear Partial Monitoring
http://arxiv.org/abs/2002.11182v1


Information Extraction from Swedish Medical Prescriptions with Sig-Transformer Encoder
http://arxiv.org/abs/2010.04897v1


Information Seeking in the Spirit of Learning: a Dataset for Conversational Curiosity
http://arxiv.org/abs/2005.00172v2


Information Theoretic Optimal Learning of Gaussian Graphical Models
http://arxiv.org/abs/1703.04886v3


Information-Theoretic Local Minima Characterization and Regularization
http://arxiv.org/abs/1911.08192v2


Information-Theoretic Probing for Linguistic Structure
http://arxiv.org/abs/2004.03061v2


Information-Theoretic Probing with Minimum Description Length
http://arxiv.org/abs/2003.12298v1


Informative Dropout for Robust Representation Learning: A Shape-bias Perspective
http://arxiv.org/abs/2008.04254v1


Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition
http://arxiv.org/abs/2010.03746v1


Injecting Numerical Reasoning Skills into Language Models
http://arxiv.org/abs/2004.04487v1


Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets
http://arxiv.org/abs/1904.02668v4


Input-Sparsity Low Rank Approximation in Schatten Norm
http://arxiv.org/abs/2004.12646v3


Inquisitive Question Generation for High Level Text Comprehension
http://arxiv.org/abs/2010.01657v1


Insights into Fairness through Trust: Multi-scale Trust Quantification for Financial Deep Learning
http://arxiv.org/abs/2011.01961v1


InstaHide: Instance-hiding Schemes for Private Distributed Learning
http://arxiv.org/abs/2010.02772v1


Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition
http://arxiv.org/abs/2004.14514v1


Instance-wise Depth and Motion Learning from Monocular Videos
http://arxiv.org/abs/1912.09351v2


Integrals over Gaussians under Linear Domain Constraints
http://arxiv.org/abs/1910.09328v2


Integrating Multimodal Information in Large Pretrained Transformers
http://arxiv.org/abs/1908.05787v3


Integrating Semantic Knowledge to Tackle Zero-shot Text Classification
http://arxiv.org/abs/1903.12626v1


Integrating Semantic and Structural Information with Graph Convolutional Network for Controversy Detection
http://arxiv.org/abs/2005.07886v1


Integrating Transformer and Paraphrase Rules for Sentence Simplification
http://arxiv.org/abs/1810.11193v1


Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation
http://arxiv.org/abs/1810.02614v1


Inter-Level Cooperation in Hierarchical Reinforcement Learning
http://arxiv.org/abs/1912.02368v2


Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network
http://arxiv.org/abs/1906.04684v1


Interactive Classification by Asking Informative Questions
http://arxiv.org/abs/1911.03598v2


Interactive Extractive Search over Biomedical Corpora
http://arxiv.org/abs/2006.04148v1


Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning
http://arxiv.org/abs/2010.02386v1


Interactive Machine Comprehension with Information Seeking Agents
http://arxiv.org/abs/1908.10449v3


Interactive Refinement of Cross-Lingual Word Embeddings
http://arxiv.org/abs/1911.03070v3


Interactive Text Ranking with Bayesian Optimisation: A Case Study on Community QA and Summarisation
http://arxiv.org/abs/1911.10183v3


Interactive Visualization for Debugging RL
http://arxiv.org/abs/2008.07331v2


Interconnected Question Generation with Coreference Alignment and Conversation Flow Modeling
http://arxiv.org/abs/1906.06893v1


Interference and Generalization in Temporal Difference Learning
http://arxiv.org/abs/2003.06350v1


Interpolation between Residual and Non-Residual Networks
http://arxiv.org/abs/2006.05749v4


Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions
http://arxiv.org/abs/1802.08504v1


Interpretable Companions for Black-Box Models
http://arxiv.org/abs/2002.03494v2


Interpretable Multi-dataset Evaluation for Named Entity Recognition
http://arxiv.org/abs/2011.06854v2


Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions
http://arxiv.org/abs/2002.03478v3


Interpretable Question Answering on Knowledge Bases and Text
http://arxiv.org/abs/1906.10924v1


Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder
http://arxiv.org/abs/1805.09547v1


Interpretable deep Gaussian processes with moments
http://arxiv.org/abs/1905.10963v3


Interpretation of NLP models through input marginalization
http://arxiv.org/abs/2010.13984v1


Interpretations are useful: penalizing explanations to align neural networks with prior knowledge
http://arxiv.org/abs/1909.13584v4


Interpreting Attention Models with Human Visual Attention in Machine Reading Comprehension
http://arxiv.org/abs/2010.06396v2


Intrinsic Probing through Dimension Selection
http://arxiv.org/abs/2010.02812v1


Intrinsic Reward Driven Imitation Learning via Generative Model
http://arxiv.org/abs/2006.15061v4


Introducing Syntactic Structures into Target Opinion Word Extraction with Deep Learning
http://arxiv.org/abs/2010.13378v1


Invariant Causal Prediction for Block MDPs
http://arxiv.org/abs/2003.06016v2


Invariant Risk Minimization Games
http://arxiv.org/abs/2002.04692v2


Inverse Active Sensing: Modeling and Understanding Timely Decision-Making
http://arxiv.org/abs/2006.14141v1


Invertible Generative Modeling using Linear Rational Splines
http://arxiv.org/abs/2001.05168v4


Invertible generative models for inverse problems: mitigating representation error and dataset bias
http://arxiv.org/abs/1905.11672v4


Investigating African-American Vernacular English in Transformer-Based Text Generation
http://arxiv.org/abs/2010.02510v2


Investigating Capsule Networks with Dynamic Routing for Text Classification
http://arxiv.org/abs/1804.00538v4


Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model
http://arxiv.org/abs/2010.04755v1


Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension
http://arxiv.org/abs/1904.09679v3


Investigating representations of verb bias in neural language models
http://arxiv.org/abs/2010.02375v2


Investigating the Effect of Sensor Modalities in Multi-Sensor Detection-Prediction Models
http://arxiv.org/abs/2101.03279v1


Involutive MCMC: a Unifying Framework
http://arxiv.org/abs/2006.16653v1


Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?
http://arxiv.org/abs/2006.01080v1


Is Graph Structure Necessary for Multi-hop Question Answering?
http://arxiv.org/abs/2004.03096v2


Is Local SGD Better than Minibatch SGD?
http://arxiv.org/abs/2002.07839v2


Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing
http://arxiv.org/abs/1910.07870v2


Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds
http://arxiv.org/abs/2004.12332v1


Is the Best Better? Bayesian Statistical Model Comparison for Natural Language Processing
http://arxiv.org/abs/2010.03088v1


It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information
http://arxiv.org/abs/2005.02354v2


It's Not What Machines Can Learn, It's What We Cannot Teach
http://arxiv.org/abs/2002.09398v2


Iterative Edit-Based Unsupervised Sentence Simplification
http://arxiv.org/abs/2006.09639v1


Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation
http://arxiv.org/abs/2009.07177v1


Ivy: Instrumental Variable Synthesis for Causal Inference
http://arxiv.org/abs/2004.05316v1


Job Recommendation through Progression of Job Selection
http://arxiv.org/abs/1905.13136v2


Joint Bootstrapping Machines for High Confidence Relation Extraction
http://arxiv.org/abs/1805.00254v1


Joint Constrained Learning for Event-Event Relation Extraction
http://arxiv.org/abs/2010.06727v1


Joint Detection and Location of English Puns
http://arxiv.org/abs/1909.00175v1


Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging
http://arxiv.org/abs/1910.02267v1


Joint Effects of Context and User History for Predicting Online Conversation Re-entries
http://arxiv.org/abs/1906.01185v1


Joint Entity Extraction and Assertion Detection for Clinical Text
http://arxiv.org/abs/1812.05270v5


Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme
http://arxiv.org/abs/1706.05075v1


Joint Learning of Pre-Trained and Random Units for Domain Adaptation in Part-of-Speech Tagging
http://arxiv.org/abs/1904.03595v1


Joint Modeling of Content and Discourse Relations in Dialogues
http://arxiv.org/abs/1705.05039v1


Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora
http://arxiv.org/abs/1706.00593v1


Joint Modelling of Emotion and Abusive Language Detection
http://arxiv.org/abs/2005.14028v1


Joint Multilingual Supervision for Cross-lingual Entity Linking
http://arxiv.org/abs/1809.07657v1


Joint Multitask Learning for Community Question Answering Using Task-Specific Embeddings
http://arxiv.org/abs/1809.08928v1


Joint Reasoning for Temporal and Causal Relations
http://arxiv.org/abs/1906.04941v1


Joint Semantic Synthesis and Morphological Analysis of the Derived Word
http://arxiv.org/abs/1701.00946v3


Joint translation and unit conversion for end-to-end localization
http://arxiv.org/abs/2004.05219v1


Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation
http://arxiv.org/abs/1809.09078v2


Jointly Optimizing Diversity and Relevance in Neural Response Generation
http://arxiv.org/abs/1902.11205v3


Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling
http://arxiv.org/abs/1805.04787v2


KLEJ: Comprehensive Benchmark for Polish Language Understanding
http://arxiv.org/abs/2005.00630v1


KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation
http://arxiv.org/abs/2004.04100v1


Keep CALM and Explore: Language Models for Action Generation in Text-based Games
http://arxiv.org/abs/2010.02903v1


Keeping Up Appearances: Computational Modeling of Face Acts in Persuasion Oriented Discussions
http://arxiv.org/abs/2009.10815v2


Kernel Conditional Density Operators
http://arxiv.org/abs/1905.11255v2


Kernel and Rich Regimes in Overparametrized Models
http://arxiv.org/abs/1906.05827v3


Kernel interpolation with continuous volume sampling
http://arxiv.org/abs/2002.09677v1


Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization
http://arxiv.org/abs/1910.04086v2


Key-Value Memory Networks for Directly Reading Documents
http://arxiv.org/abs/1606.03126v2


Keyphrase Generation: A Text Summarization Struggle
http://arxiv.org/abs/1904.00110v2


KinGDOM: Knowledge-Guided DOMain adaptation for sentiment analysis
http://arxiv.org/abs/2005.00791v2


Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning
http://arxiv.org/abs/1911.05815v1


Knowing The What But Not The Where in Bayesian Optimization
http://arxiv.org/abs/1905.02685v5


Knowledge Association with Hyperbolic Knowledge Graph Embeddings
http://arxiv.org/abs/2010.02162v1


Knowledge Completion for Generics using Guided Tensor Factorization
http://arxiv.org/abs/1612.03871v3


Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation
http://arxiv.org/abs/2004.10171v1


Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward
http://arxiv.org/abs/2005.01159v1


Knowledge-Grounded Dialogue Generation with Pre-trained Language Models
http://arxiv.org/abs/2010.08824v1


Knowledge-aware Pronoun Coreference Resolution
http://arxiv.org/abs/1907.03663v1


Knowledge-guided Open Attribute Value Extraction with Reinforcement Learning
http://arxiv.org/abs/2010.09189v1


Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge
http://arxiv.org/abs/1805.07858v1


KutralNet: A Portable Deep Learning Model for Fire Recognition
http://arxiv.org/abs/2008.06866v1


Køpsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding
http://arxiv.org/abs/2005.12094v2


LAReQA: Language-agnostic answer retrieval from a multilingual pool
http://arxiv.org/abs/2004.05484v1


LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation
http://arxiv.org/abs/2004.07499v1


LEEP: A New Measure to Evaluate Transferability of Learned Representations
http://arxiv.org/abs/2002.12462v2


LIBRE: Learning Interpretable Boolean Rule Ensembles
http://arxiv.org/abs/1911.06537v1


LINSPECTOR: Multilingual Probing Tasks for Word Representations
http://arxiv.org/abs/1903.09442v2


LOGAN: Local Group Bias Detection by Clustering
http://arxiv.org/abs/2010.02867v1


LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction
http://arxiv.org/abs/2001.04437v3


LRTA: A Transparent Neural-Symbolic Reasoning Framework with Modular Supervision for Visual Question Answering
http://arxiv.org/abs/2011.10731v1


LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
http://arxiv.org/abs/2010.01057v1


Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition
http://arxiv.org/abs/1804.09021v2


Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks
http://arxiv.org/abs/1912.10095v2


Langevin Monte Carlo without smoothness
http://arxiv.org/abs/1905.13285v3


Language (Re)modelling: Towards Embodied Language Understanding
http://arxiv.org/abs/2005.00311v2


Language (Technology) is Power: A Critical Survey of "Bias" in NLP
http://arxiv.org/abs/2005.14050v2


Language Generation with Multi-Hop Reasoning on Commonsense Knowledge Graph
http://arxiv.org/abs/2009.11692v1


Language Model Prior for Low-Resource Neural Machine Translation
http://arxiv.org/abs/2004.14928v3


Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation
http://arxiv.org/abs/1906.10007v1


Language Models as Fact Checkers?
http://arxiv.org/abs/2006.04102v2


Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese
http://arxiv.org/abs/2005.00842v1


Language Models not just for Pre-training: Fast Online Neural Noisy Channel Modeling
http://arxiv.org/abs/2011.07164v1


Language Understanding for Text-based Games Using Deep Reinforcement Learning
http://arxiv.org/abs/1506.08941v2


Language as a Latent Variable: Discrete Generative Models for Sentence Compression
http://arxiv.org/abs/1609.07317v2


Large Margin Neural Language Model
http://arxiv.org/abs/1808.08987v1


Large Product Key Memory for Pretrained Language Models
http://arxiv.org/abs/2010.03881v1


Large Scale Multi-Actor Generative Dialog Modeling
http://arxiv.org/abs/2005.06114v1


Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing
http://arxiv.org/abs/1807.06517v1


Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health
http://arxiv.org/abs/1605.04462v3


Large-scale Cloze Test Dataset Created by Teachers
http://arxiv.org/abs/1711.03225v3


Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems
http://arxiv.org/abs/2002.00057v2


Latent Alignment of Procedural Concepts in Multimodal Recipes
http://arxiv.org/abs/2101.04727v1


Latent Space Factorisation and Manipulation via Matrix Subspace Projection
http://arxiv.org/abs/1907.12385v3


Latent Space Oddity: Exploring Latent Spaces to Design Guitar Timbres
http://arxiv.org/abs/2010.15989v2


Latent Variable Modelling with Hyperbolic Normalizing Flows
http://arxiv.org/abs/2002.06336v4


Latent-CF: A Simple Baseline for Reverse Counterfactual Explanations
http://arxiv.org/abs/2012.09301v1


Layered Sampling for Robust Optimization Problems
http://arxiv.org/abs/2002.11904v1


LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments
http://arxiv.org/abs/2006.09670v1


LdSM: Logarithm-depth Streaming Multi-label Decision Trees
http://arxiv.org/abs/1905.10428v5


Learnable Bernoulli Dropout for Bayesian Deep Learning
http://arxiv.org/abs/2002.05155v1


Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning
http://arxiv.org/abs/2012.09156v1


Learning Adaptive Language Interfaces through Decomposition
http://arxiv.org/abs/2010.05190v1


Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization
http://arxiv.org/abs/2002.11798v2


Learning Algebraic Multigrid Using Graph Neural Networks
http://arxiv.org/abs/2003.05744v2


Learning Architectures from an Extended Search Space for Language Modeling
http://arxiv.org/abs/2005.02593v2


Learning Autoencoders with Relational Regularization
http://arxiv.org/abs/2002.02913v4


Learning Canonical Transformations
http://arxiv.org/abs/2011.08822v1


Learning Collaborative Agents with Rule Guidance for Knowledge Graph Reasoning
http://arxiv.org/abs/2005.00571v2


Learning Compressed Sentence Representations for On-Device Text Processing
http://arxiv.org/abs/1906.08340v1


Learning Constraints for Structured Prediction Using Rectifier Networks
http://arxiv.org/abs/2006.01209v1


Learning Context-Free Languages with Nondeterministic Stack RNNs
http://arxiv.org/abs/2010.04674v1


Learning Context-Sensitive Convolutional Filters for Text Processing
http://arxiv.org/abs/1709.08294v3


Learning Contextualized Knowledge Structures for Commonsense Reasoning
http://arxiv.org/abs/2010.12873v2


Learning Cross-lingual Distributed Logical Representations for Semantic Parsing
http://arxiv.org/abs/1806.05461v1


Learning Crosslingual Word Embeddings without Bilingual Corpora
http://arxiv.org/abs/1606.09403v1


Learning De-biased Representations with Biased Representations
http://arxiv.org/abs/1910.02806v3


Learning Deep Transformer Models for Machine Translation
http://arxiv.org/abs/1906.01787v1


Learning Dialog Policies from Weak Demonstrations
http://arxiv.org/abs/2004.11054v2


Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders
http://arxiv.org/abs/1703.10960v3


Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information
http://arxiv.org/abs/2004.03991v2


Learning Dynamic Feature Selection for Fast Sequential Prediction
http://arxiv.org/abs/1505.06169v1


Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes
http://arxiv.org/abs/2001.02585v2


Learning Efficient Multi-agent Communication: An Information Bottleneck Approach
http://arxiv.org/abs/1911.06992v2


Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use
http://arxiv.org/abs/1907.07638v1


Learning Entangled Single-Sample Gaussians in the Subset-of-Signals Model
http://arxiv.org/abs/2007.05557v1


Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards
http://arxiv.org/abs/2008.07773v1


Learning Fair Representations for Kernel Models
http://arxiv.org/abs/1906.11813v2


Learning Flat Latent Manifolds with VAEs
http://arxiv.org/abs/2002.04881v3


Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning
http://arxiv.org/abs/2002.05954v3


Learning Gaussian Graphical Models via Multiplicative Weights
http://arxiv.org/abs/2002.08663v2


Learning Generic Sentence Representations Using Convolutional Neural Networks
http://arxiv.org/abs/1611.07897v2


Learning Geometric Word Meta-Embeddings
http://arxiv.org/abs/2004.09219v1


Learning Graph Models for Template-Free Retrosynthesis
http://arxiv.org/abs/2006.07038v1


Learning Graph Structure With A Finite-State Automaton Layer
http://arxiv.org/abs/2007.04929v2


Learning Group Structure and Disentangled Representations of Dynamical Environments
http://arxiv.org/abs/2002.06991v2


Learning Halfspaces with Massart Noise Under Structured Distributions
http://arxiv.org/abs/2002.05632v1


Learning Hierarchical Interactions at Scale: A Convex Optimization Approach
http://arxiv.org/abs/1902.01542v5


Learning High-dimensional Gaussian Graphical Models under Total Positivity without Adjustment of Tuning Parameters
http://arxiv.org/abs/1906.05159v4


Learning Human Objectives by Evaluating Hypothetical Behavior
http://arxiv.org/abs/1912.05652v1


Learning Hyperbolic Representations for Unsupervised 3D Segmentation
http://arxiv.org/abs/2012.01644v2


Learning Implicit Text Generation via Feature Matching
http://arxiv.org/abs/2005.03588v2


Learning Implicitly with Noisy Data in Linear Arithmetic
http://arxiv.org/abs/2010.12619v1


Learning Informative Representations of Biomedical Relations with Latent Variable Models
http://arxiv.org/abs/2011.10285v1


Learning Intrinsic Symbolic Rewards in Reinforcement Learning
http://arxiv.org/abs/2010.03694v2


Learning Invariant Representations for Reinforcement Learning without Reconstruction
http://arxiv.org/abs/2006.10742v1


Learning Joint Semantic Parsers from Disjoint Data
http://arxiv.org/abs/1804.05990v1


Learning Lexico-Functional Patterns for First-Person Affect
http://arxiv.org/abs/1708.09789v1


Learning Long-term Visual Dynamics with Region Proposal Interaction Networks
http://arxiv.org/abs/2008.02265v1


Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots
http://arxiv.org/abs/1805.02333v2


Learning Mixtures of Graphs from Epidemic Cascades
http://arxiv.org/abs/1906.06057v2


Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach
http://arxiv.org/abs/1808.08773v3


Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models
http://arxiv.org/abs/2004.14601v3


Learning Near Optimal Policies with Low Inherent Bellman Error
http://arxiv.org/abs/2003.00153v3


Learning Neural Sequence-to-Sequence Models from Weak Feedback with Bipolar Ramp Loss
http://arxiv.org/abs/1907.03748v1


Learning Neural Templates for Text Generation
http://arxiv.org/abs/1808.10122v3


Learning Object-Centric Video Models by Contrasting Sets
http://arxiv.org/abs/2011.10287v1


Learning Optimal Tree Models Under Beam Search
http://arxiv.org/abs/2006.15408v1


Learning Outside the Box: Discourse-level Features Improve Metaphor Identification
http://arxiv.org/abs/1904.02246v2


Learning Overlapping Representations for the Estimation of Individualized Treatment Effects
http://arxiv.org/abs/2001.04754v3


Learning Portable Representations for High-Level Planning
http://arxiv.org/abs/1905.12006v1


Learning Probabilistic Sentence Representations from Paraphrases
http://arxiv.org/abs/2005.08105v1


Learning Quadratic Games on Networks
http://arxiv.org/abs/1811.08790v3


Learning Reasoning Strategies in End-to-End Differentiable Proving
http://arxiv.org/abs/2007.06477v3


Learning Representations that Support Extrapolation
http://arxiv.org/abs/2007.05059v2


Learning Robot Skills with Temporal Variational Inference
http://arxiv.org/abs/2006.16232v1


Learning Robust Models for e-Commerce Product Search
http://arxiv.org/abs/2005.03624v1


Learning Sequence Encoders for Temporal Knowledge Graph Completion
http://arxiv.org/abs/1809.03202v1


Learning Similarity Metrics for Numerical Simulations
http://arxiv.org/abs/2002.07863v2


Learning Source Phrase Representations for Neural Machine Translation
http://arxiv.org/abs/2006.14405v1


Learning Sparse Nonparametric DAGs
http://arxiv.org/abs/1909.13189v2


Learning Spoken Language Representations with Neural Lattice Language Modeling
http://arxiv.org/abs/2007.02629v2


Learning Structural Kernels for Natural Language Processing
http://arxiv.org/abs/1508.02131v1


Learning Structured Representations of Entity Names using Active Learning and Weak Supervision
http://arxiv.org/abs/2011.00105v1


Learning Symbolic Physics with Graph Networks
http://arxiv.org/abs/1909.05862v2


Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning
http://arxiv.org/abs/2004.12485v2


Learning To Solve Differential Equations Across Initial Conditions
http://arxiv.org/abs/2003.12159v2


Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers
http://arxiv.org/abs/2010.00667v3


Learning Visually Grounded Sentence Representations
http://arxiv.org/abs/1707.06320v2


Learning What to Defer for Maximum Independent Sets
http://arxiv.org/abs/2006.09607v2


Learning Word-Like Units from Joint Audio-Visual Analysis
http://arxiv.org/abs/1701.07481v3


Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
http://arxiv.org/abs/2002.07066v3


Learning a Cost-Effective Annotation Policy for Question Answering
http://arxiv.org/abs/2010.03476v2


Learning a Multi-Domain Curriculum for Neural Machine Translation
http://arxiv.org/abs/1908.10940v2


Learning a Neural Semantic Parser from User Feedback
http://arxiv.org/abs/1704.08760v1


Learning a Policy for Opportunistic Active Learning
http://arxiv.org/abs/1808.10009v1


Learning a Simple and Effective Model for Multi-turn Response Generation with Auxiliary Tasks
http://arxiv.org/abs/2004.01972v2


Learning a Single Neuron with Gradient Methods
http://arxiv.org/abs/2001.05205v2


Learning an Unreferenced Metric for Online Dialogue Evaluation
http://arxiv.org/abs/2005.00583v1


Learning and Evaluating Contextual Embedding of Source Code
http://arxiv.org/abs/2001.00059v3


Learning and Evaluating Emotion Lexicons for 91 Languages
http://arxiv.org/abs/2005.05672v1


Learning and Sampling of Atomic Interventions from Observations
http://arxiv.org/abs/2002.04232v2


Learning beyond datasets: Knowledge Graph Augmented Neural Networks for Natural language Processing
http://arxiv.org/abs/1802.05930v2


Learning distributed representations of graphs with Geo2DR
http://arxiv.org/abs/2003.05926v3


Learning for Dose Allocation in Adaptive Clinical Trials with Safety Constraints
http://arxiv.org/abs/2006.05026v2


Learning from Context or Names? An Empirical Study on Neural Relation Extraction
http://arxiv.org/abs/2010.01923v2


Learning from Irregularly-Sampled Time Series: A Missing Data Perspective
http://arxiv.org/abs/2008.07599v1


Learning from Task Descriptions
http://arxiv.org/abs/2011.08115v1


Learning how to Active Learn: A Deep Reinforcement Learning Approach
http://arxiv.org/abs/1708.02383v1


Learning in Gated Neural Networks
http://arxiv.org/abs/1906.02777v2


Learning piecewise Lipschitz functions in changing environments
http://arxiv.org/abs/1907.09137v4


Learning robust visual representations using data augmentation invariance
http://arxiv.org/abs/1906.04547v1


Learning spectrograms with convolutional spectral kernels
http://arxiv.org/abs/1905.09917v2


Learning the piece-wise constant graph structure of a varying Ising model
http://arxiv.org/abs/1910.08512v2


Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information
http://arxiv.org/abs/1805.04655v2


Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders
http://arxiv.org/abs/1805.04843v1


Learning to Ask Unanswerable Questions for Machine Reading Comprehension
http://arxiv.org/abs/1906.06045v1


Learning to Branch for Multi-Task Learning
http://arxiv.org/abs/2006.01895v2


Learning to Classify Intents and Slot Labels Given a Handful of Examples
http://arxiv.org/abs/2004.10793v1


Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules
http://arxiv.org/abs/2006.16981v3


Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling
http://arxiv.org/abs/1910.04289v2


Learning to Continually Learn
http://arxiv.org/abs/2002.09571v2


Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks
http://arxiv.org/abs/1910.14326v2


Learning to Deceive with Attention-Based Explanations
http://arxiv.org/abs/1909.07913v2


Learning to Decipher Hate Symbols
http://arxiv.org/abs/1904.02418v1


Learning to Encode Position for Transformer with Continuous Dynamical Model
http://arxiv.org/abs/2003.09229v1


Learning to Evaluate Translation Beyond English: BLEURT Submissions to the WMT Metrics 2020 Shared Task
http://arxiv.org/abs/2010.04297v3


Learning to Faithfully Rationalize by Construction
http://arxiv.org/abs/2005.00115v1


Learning to Fuse Sentences with Transformers for Summarization
http://arxiv.org/abs/2010.03726v1


Learning to Generate Compositional Color Descriptions
http://arxiv.org/abs/1606.03821v2


Learning to Generate Multiple Style Transfer Outputs for an Input Sentence
http://arxiv.org/abs/2002.06525v1


Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks
http://arxiv.org/abs/2010.02807v3


Learning to Learn Kernels with Variational Random Features
http://arxiv.org/abs/2006.06707v2


Learning to Map Context-Dependent Sentences to Executable Formal Queries
http://arxiv.org/abs/1804.06868v2


Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
http://arxiv.org/abs/1904.04195v1


Learning to Parse and Translate Improves Neural Machine Translation
http://arxiv.org/abs/1702.03525v2


Learning to Prune Deep Neural Networks via Reinforcement Learning
http://arxiv.org/abs/2007.04756v1


Learning to Rank Learning Curves
http://arxiv.org/abs/2006.03361v1


Learning to Reach Goals via Iterated Supervised Learning
http://arxiv.org/abs/1912.06088v4


Learning to Recognize Discontiguous Entities
http://arxiv.org/abs/1810.08579v3


Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation
http://arxiv.org/abs/2006.05165v1


Learning to Represent Action Values as a Hypergraph on the Action Vertices
http://arxiv.org/abs/2010.14680v1


Learning to Sample with Local and Global Contexts in Experience Replay Buffer
http://arxiv.org/abs/2007.07358v1


Learning to Score Behaviors for Guided Policy Optimization
http://arxiv.org/abs/1906.04349v4


Learning to Segment Actions from Observation and Narration
http://arxiv.org/abs/2005.03684v2


Learning to Simulate Complex Physics with Graph Networks
http://arxiv.org/abs/2002.09405v2


Learning to Stop While Learning to Predict
http://arxiv.org/abs/2006.05082v1


Learning to Understand Child-directed and Adult-directed Speech
http://arxiv.org/abs/2005.02721v3


Learning to Update Natural Language Comments Based on Code Changes
http://arxiv.org/abs/2004.12169v2


Learning to simulate and design for structural engineering
http://arxiv.org/abs/2003.09103v3


Learning with Bounded Instance- and Label-dependent Label Noise
http://arxiv.org/abs/1709.03768v3


Learning with Good Feature Representations in Bandits and in RL with a Generative Model
http://arxiv.org/abs/1911.07676v2


Learning with Multiple Complementary Labels
http://arxiv.org/abs/1912.12927v3


Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data
http://arxiv.org/abs/2001.00980v1


Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning
http://arxiv.org/abs/1912.10389v1


Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation
http://arxiv.org/abs/1804.10184v1


Let Me Choose: From Verbal Context to Font Selection
http://arxiv.org/abs/2005.01151v1


Let's Agree to Agree: Neural Networks Share Classification Order on Real Datasets
http://arxiv.org/abs/1905.10854v7


Levels of Analysis for Machine Learning
http://arxiv.org/abs/2004.05107v1


Leveraging Declarative Knowledge in Text and First-Order Logic for Fine-Grained Propaganda Detection
http://arxiv.org/abs/2004.14201v2


Leveraging Frequency Analysis for Deep Fake Image Recognition
http://arxiv.org/abs/2003.08685v3


Leveraging Graph to Improve Abstractive Multi-Document Summarization
http://arxiv.org/abs/2005.10043v1


Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation
http://arxiv.org/abs/2005.04816v1


Leveraging Multimodal Behavioral Analytics for Automated Job Interview Performance Assessment and Feedback
http://arxiv.org/abs/2006.07909v2


Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
http://arxiv.org/abs/1907.12461v2


Leveraging Procedural Generation to Benchmark Reinforcement Learning
http://arxiv.org/abs/1912.01588v2


Leveraging Sentence Similarity in Natural Language Generation: Improving Beam Search using Range Voting
http://arxiv.org/abs/1908.06288v2


Lexical Features in Coreference Resolution: To be Used With Caution
http://arxiv.org/abs/1704.06779v1


Lexically Constrained Neural Machine Translation with Levenshtein Transformer
http://arxiv.org/abs/2004.12681v1


Lexicosyntactic Inference in Neural Models
http://arxiv.org/abs/1808.06232v1


Lifelong Language Knowledge Distillation
http://arxiv.org/abs/2010.02123v1


Lifelong Learning CRF for Supervised Aspect Extraction
http://arxiv.org/abs/1705.00251v1


Lifted Disjoint Paths with Application in Multiple Object Tracking
http://arxiv.org/abs/2006.14550v1


Lifted Rule Injection for Relation Embeddings
http://arxiv.org/abs/1606.08359v2


Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation
http://arxiv.org/abs/2010.04383v1


Like a Baby: Visually Situated Neural Language Acquisition
http://arxiv.org/abs/1805.11546v2


Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions
http://arxiv.org/abs/2010.03205v1


Likelihood Regret: An Out-of-Distribution Detection Score For Variational Auto-encoder
http://arxiv.org/abs/2003.02977v3


Linear Bandits with Stochastic Delayed Feedback
http://arxiv.org/abs/1807.02089v3


Linear Convergence of Adaptive Stochastic Gradient Descent
http://arxiv.org/abs/1908.10525v2


Linear Convergence of Randomized Primal-Dual Coordinate Method for Large-scale Linear Constrained Convex Programming
http://arxiv.org/abs/2008.12946v1


Linear Dynamics: Clustering without identification
http://arxiv.org/abs/1908.01039v3


Linear Lower Bounds and Conditioning of Differentiable Games
http://arxiv.org/abs/1906.07300v3


Linear Mode Connectivity and the Lottery Ticket Hypothesis
http://arxiv.org/abs/1912.05671v4


Linear-Time Constituency Parsing with RNNs and Dynamic Programming
http://arxiv.org/abs/1805.06995v2


Linearly Convergent Frank-Wolfe with Backtracking Line-Search
http://arxiv.org/abs/1806.05123v4


Linguistic Features for Readability Assessment
http://arxiv.org/abs/2006.00377v1


Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game
http://arxiv.org/abs/1506.04744v1


Linguistic Knowledge and Transferability of Contextual Representations
http://arxiv.org/abs/1903.08855v5


Lipschitz Constrained Parameter Initialization for Deep Transformers
http://arxiv.org/abs/1911.03179v2


Lipschitz and Comparator-Norm Adaptivity in Online Learning
http://arxiv.org/abs/2002.12242v2


Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them
http://arxiv.org/abs/1903.03862v2


List Decodable Subspace Recovery
http://arxiv.org/abs/2002.03004v1


Lite Training Strategies for Portuguese-English and English-Portuguese Translation
http://arxiv.org/abs/2008.08769v1


Local Differentially Private Regret Minimization in Reinforcement Learning
http://arxiv.org/abs/2010.07778v1


Localizing Moments in Video with Temporal Language
http://arxiv.org/abs/1809.01337v1


Locally Accelerated Conditional Gradients
http://arxiv.org/abs/1906.07867v2


Locally Private Hypothesis Selection
http://arxiv.org/abs/2002.09465v2


Location Attention for Extrapolation to Longer Sequences
http://arxiv.org/abs/1911.03872v2


Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently
http://arxiv.org/abs/2002.08095v2


Logarithmic Regret for Online Control
http://arxiv.org/abs/1909.05062v1


Logic-Guided Data Augmentation and Regularization for Consistent Question Answering
http://arxiv.org/abs/2004.10157v2


Logical Inferences with Comparatives and Generalized Quantifiers
http://arxiv.org/abs/2005.07954v1


Logical Natural Language Generation from Open-Domain Tables
http://arxiv.org/abs/2004.10404v2


LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network
http://arxiv.org/abs/2004.13659v1


Logistic Regression for Massive Data with Rare Events
http://arxiv.org/abs/2006.00683v1


Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum
http://arxiv.org/abs/1805.03716v1


Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors
http://arxiv.org/abs/2006.13205v2


Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation
http://arxiv.org/abs/2009.09127v1


Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks
http://arxiv.org/abs/1903.01306v1


Look It Up: Bilingual and Monolingual Dictionaries Improve Neural Machine Translation
http://arxiv.org/abs/2010.05997v1


Look at the First Sentence: Position Bias in Question Answering
http://arxiv.org/abs/2004.14602v3


Lookahead-Bounded Q-Learning
http://arxiv.org/abs/2006.15690v1


Loss Function Search for Face Recognition
http://arxiv.org/abs/2007.06542v1


Lossless Compression of Deep Neural Networks
http://arxiv.org/abs/2001.00218v3


Low Rank Fusion based Transformers for Multimodal Sequences
http://arxiv.org/abs/2007.02038v1


Low Resource Neural Machine Translation: A Benchmark for Five African Languages
http://arxiv.org/abs/2003.14402v1


Low Shot Learning with Untrained Neural Networks for Imaging Inverse Problems
http://arxiv.org/abs/1910.10797v1


Low-Dimensional Hyperbolic Knowledge Graph Embeddings
http://arxiv.org/abs/2005.00545v1


Low-Rank Bottleneck in Multi-head Attention Models
http://arxiv.org/abs/2002.07028v1


Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing
http://arxiv.org/abs/2010.03546v1


Low-Variance and Zero-Variance Baselines for Extensive-Form Games
http://arxiv.org/abs/1907.09633v1


Low-loss connection of weight vectors: distribution-based approaches
http://arxiv.org/abs/2008.00741v1


Low-resource Deep Entity Resolution with Transfer and Active Learning
http://arxiv.org/abs/1906.08042v1


LowFER: Low-rank Bilinear Pooling for Link Prediction
http://arxiv.org/abs/2008.10858v1


MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer
http://arxiv.org/abs/2005.00052v3


MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding
http://arxiv.org/abs/2010.05379v1


MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
http://arxiv.org/abs/2005.05402v1


MAST: Multimodal Abstractive Summarization with Trimodal Hierarchical Attention
http://arxiv.org/abs/2010.08021v1


MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization
http://arxiv.org/abs/2004.12302v2


MAVEN: A Massive General Domain Event Detection Dataset
http://arxiv.org/abs/2004.13590v2


MCMH: Learning Multi-Chain Multi-Hop Rules for Knowledge Graph Reasoning
http://arxiv.org/abs/2010.01735v1


MEGA RST Discourse Treebanks with Structure and Nuclearity from Scalable Distant Sentiment Supervision
http://arxiv.org/abs/2011.03017v1


MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models
http://arxiv.org/abs/2010.00840v1


MGHRL: Meta Goal-generation for Hierarchical Reinforcement Learning
http://arxiv.org/abs/1909.13607v4


MIME: MIMicking Emotions for Empathetic Response Generation
http://arxiv.org/abs/2010.01454v1


MLSUM: The Multilingual Summarization Corpus
http://arxiv.org/abs/2004.14900v1


MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics
http://arxiv.org/abs/2010.03636v2


MOPO: Model-based Offline Policy Optimization
http://arxiv.org/abs/2005.13239v6


MORSE: Semantic-ally Drive-n MORpheme SEgment-er
http://arxiv.org/abs/1702.02212v3


MPC-guided Imitation Learning of Neural Network Policies for the Artificial Pancreas
http://arxiv.org/abs/2003.01283v1


MTL2L: A Context Aware Neural Optimiser
http://arxiv.org/abs/2007.09343v1


MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics
http://arxiv.org/abs/1909.13111v2


Machine Learning in Population and Public Health
http://arxiv.org/abs/2008.07278v1


Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
http://arxiv.org/abs/2004.09813v2


Mapping Natural Language Instructions to Mobile UI Action Sequences
http://arxiv.org/abs/2005.03776v2


Mapping Natural-language Problems to Formal-language Solutions Using Structured Neural Representations
http://arxiv.org/abs/1910.02339v3


Mapping to Declarative Knowledge for Word Problem Solving
http://arxiv.org/abs/1712.09391v1


Marrying up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding
http://arxiv.org/abs/1805.05588v1


Masked Language Model Scoring
http://arxiv.org/abs/1910.14659v3


Masking as an Efficient Alternative to Finetuning for Pretrained Language Models
http://arxiv.org/abs/2004.12406v2


Massively Multilingual Adversarial Speech Recognition
http://arxiv.org/abs/1904.02210v1


Massively Multilingual Transfer for NER
http://arxiv.org/abs/1902.00193v4


Matching the Blanks: Distributional Similarity for Relation Learning
http://arxiv.org/abs/1906.03158v1


Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning
http://arxiv.org/abs/2007.02832v1


Maximum Likelihood with Bias-Corrected Calibration is Hard-To-Beat at Label Shift Adaptation
http://arxiv.org/abs/1901.06852v5


Maximum Mutation Reinforcement Learning for Scalable Control
http://arxiv.org/abs/2007.13690v6


Maximum Reward Formulation In Reinforcement Learning
http://arxiv.org/abs/2010.03744v1


MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining
http://arxiv.org/abs/2012.13978v1


Meaning to Form: Measuring Systematicity as Information
http://arxiv.org/abs/1906.05906v2


Measuring Emotions in the COVID-19 Real World Worry Dataset
http://arxiv.org/abs/2004.04225v2


Measuring Forecasting Skill from Text
http://arxiv.org/abs/2006.07425v2


Measuring Impact of Climate Change on Tree Species: analysis of JSDM on FIA data
http://arxiv.org/abs/1910.04932v1


Measuring Information Propagation in Literary Social Networks
http://arxiv.org/abs/2004.13980v2


Measuring Non-Expert Comprehension of Machine Learning Fairness Metrics
http://arxiv.org/abs/2001.00089v3


Measuring Thematic Fit with Distributional Feature Overlap
http://arxiv.org/abs/1707.05967v2


Measuring Visual Generalization in Continuous Control from Pixels
http://arxiv.org/abs/2010.06740v2


Median Matrix Completion: from Embarrassment to Optimality
http://arxiv.org/abs/2006.10400v1


Memory-enhanced Decoder for Neural Machine Translation
http://arxiv.org/abs/1606.02003v1


Mention Extraction and Linking for SQL Query Generation
http://arxiv.org/abs/2012.10074v1


Merge and Label: A novel neural network architecture for nested NER
http://arxiv.org/abs/1907.00464v1


Message Passing Query Embedding
http://arxiv.org/abs/2002.02406v2


Message Passing for Hyper-Relational Knowledge Graphs
http://arxiv.org/abs/2009.10847v1


Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining
http://arxiv.org/abs/2003.13003v2


Meta Learning Deep Visual Words for Fast Video Object Segmentation
http://arxiv.org/abs/1812.01397v3


Meta-Learning for Few-Shot NMT Adaptation
http://arxiv.org/abs/2004.02745v1


Meta-Learning with Shared Amortized Variational Inference
http://arxiv.org/abs/2008.12037v1


Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling
http://arxiv.org/abs/2006.07178v2


Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks
http://arxiv.org/abs/2004.14404v2


Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient
http://arxiv.org/abs/2007.01932v2


Meta-Transfer Learning for Code-Switched Speech Recognition
http://arxiv.org/abs/2004.14228v1


Meta-learning with Stochastic Linear Bandits
http://arxiv.org/abs/2005.08531v1


MetaFun: Meta-Learning with Iterative Functional Updates
http://arxiv.org/abs/1912.02738v4


Microblog Hashtag Generation via Encoding Conversation Contexts
http://arxiv.org/abs/1905.07584v1


Mimicking Word Embeddings using Subword RNNs
http://arxiv.org/abs/1707.06961v1


MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems
http://arxiv.org/abs/2009.12005v2


Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance
http://arxiv.org/abs/2005.00315v1


Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack
http://arxiv.org/abs/1907.02044v2


Minimax Pareto Fairness: A Multi Objective Perspective
http://arxiv.org/abs/2011.01821v1


Minimax Testing of Identity to a Reference Ergodic Markov Chain
http://arxiv.org/abs/1902.00080v3


Minimax Weight and Q-Function Learning for Off-Policy Evaluation
http://arxiv.org/abs/1910.12809v4


Minimizing Dynamic Regret and Adaptive Regret Simultaneously
http://arxiv.org/abs/2002.02085v1


Minimizing Interference and Selection Bias in Network Experiment Design
http://arxiv.org/abs/2004.07225v1


Mining Discourse Markers for Unsupervised Sentence Representation Learning
http://arxiv.org/abs/1903.11850v1


Mining Documentation to Extract Hyperparameter Schemas
http://arxiv.org/abs/2006.16984v2


Mirror Descent Policy Optimization
http://arxiv.org/abs/2005.09814v3


Missing Data Imputation using Optimal Transport
http://arxiv.org/abs/2002.03860v3


Mitigating Gender Bias Amplification in Distribution by Posterior Regularization
http://arxiv.org/abs/2005.06251v1


Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning
http://arxiv.org/abs/2009.13028v2


Mitigating Gender Bias in Machine Translation with Target Gender Annotations
http://arxiv.org/abs/2010.06203v2


Mitigating Gender Bias in Natural Language Processing: Literature Review
http://arxiv.org/abs/1906.08976v1


Mitigating Leakage in Federated Learning with Trusted Hardware
http://arxiv.org/abs/2011.04948v3


Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments
http://arxiv.org/abs/2006.16437v2


Mitigating Overfitting in Supervised Classification from Two Unlabeled Datasets: A Consistent Risk Correction Approach
http://arxiv.org/abs/1910.08974v4


Mitigating Uncertainty in Document Classification
http://arxiv.org/abs/1907.07590v1


MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification
http://arxiv.org/abs/2004.12239v1


Mixed Strategies for Robust Optimization of Unknown Objectives
http://arxiv.org/abs/2002.12613v2


MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform
http://arxiv.org/abs/2005.08365v2


MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time
http://arxiv.org/abs/2006.16007v1


Mobile-Based Deep Learning Models for Banana Diseases Detection
http://arxiv.org/abs/2004.03718v1


MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
http://arxiv.org/abs/2004.02984v2


Model Fusion with Kullback-Leibler Divergence
http://arxiv.org/abs/2007.06168v1


Model selection for contextual bandits
http://arxiv.org/abs/1906.00531v3


Model-Agnostic Counterfactual Explanations for Consequential Decisions
http://arxiv.org/abs/1905.11190v5


Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal
http://arxiv.org/abs/1906.03804v3


Model-Based Visual Planning with Self-Supervised Functional Distances
http://arxiv.org/abs/2012.15373v1


Modeling Cloud Reflectance Fields using Conditional Generative Adversarial Networks
http://arxiv.org/abs/2002.07579v2


Modeling Continuous Stochastic Processes with Dynamic Normalizing Flows
http://arxiv.org/abs/2002.10516v3


Modeling Discourse Structure for Document-level Neural Machine Translation
http://arxiv.org/abs/2006.04721v1


Modeling Empathy and Distress in Reaction to News Stories
http://arxiv.org/abs/1808.10399v1


Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs
http://arxiv.org/abs/2001.11003v2


Modeling Label Semantics for Predicting Emotional Reactions
http://arxiv.org/abs/2006.05489v2


Modeling Long Context for Task-Oriented Dialogue State Generation
http://arxiv.org/abs/2004.14080v1


Modeling Naive Psychology of Characters in Simple Commonsense Stories
http://arxiv.org/abs/1805.06533v1


Modeling Protagonist Emotions for Emotion-Aware Storytelling
http://arxiv.org/abs/2010.06822v2


Modeling Recurrence for Transformer
http://arxiv.org/abs/1904.03092v1


Modeling Semantic Compositionality with Sememe Knowledge
http://arxiv.org/abs/1907.04744v1


Modeling Semantic Expectation: Using Script Knowledge for Referent Prediction
http://arxiv.org/abs/1702.03121v1


Modeling Semantic Plausibility by Injecting World Knowledge
http://arxiv.org/abs/1804.00619v3


Modeling Source Syntax for Neural Machine Translation
http://arxiv.org/abs/1705.01020v1


Modeling Subjective Assessments of Guilt in Newspaper Crime Narratives
http://arxiv.org/abs/2006.09589v2


Modeling the Music Genre Perception across Language-Bound Cultures
http://arxiv.org/abs/2010.06325v2


Modeling, Visualization, and Analysis of African Innovation Performance
http://arxiv.org/abs/2008.07882v1


Modelling Lexical Ambiguity with Density Matrices
http://arxiv.org/abs/2010.05670v1


Modelling Suspense in Short Stories as Uncertainty Reduction over Neural Representation
http://arxiv.org/abs/2004.14905v1


Modular Block-diagonal Curvature Approximations for Feedforward Architectures
http://arxiv.org/abs/1902.01813v3


Modularized Transfomer-based Ranking Framework
http://arxiv.org/abs/2004.13313v3


Modulated Fusion using Transformer for Linguistic-Acoustic Emotion Recognition
http://arxiv.org/abs/2010.02057v1


Modulating Surrogates for Bayesian Optimization
http://arxiv.org/abs/1906.11152v4


MojiTalk: Generating Emotional Responses at Scale
http://arxiv.org/abs/1711.04090v2


Molecule Edit Graph Attention Network: Modeling Chemical Reactions as Sequences of Graph Edits
http://arxiv.org/abs/2006.15426v1


Momentum Improves Normalized SGD
http://arxiv.org/abs/2002.03305v2


Momentum in Reinforcement Learning
http://arxiv.org/abs/1910.09322v2


Moniqua: Modulo Quantized Communication in Decentralized SGD
http://arxiv.org/abs/2002.11787v3


Monitoring and explainability of models in production
http://arxiv.org/abs/2007.06299v1


More Data Can Expand the Generalization Gap Between Adversarially Robust and Standard Models
http://arxiv.org/abs/2002.04725v3


More Information Supervised Probabilistic Deep Face Embedding Learning
http://arxiv.org/abs/2006.04518v2


More Powerful Selective Kernel Tests for Feature Selection
http://arxiv.org/abs/1910.06134v2


Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules
http://arxiv.org/abs/1706.00377v1


Morphological Irregularity Correlates with Frequency
http://arxiv.org/abs/1906.11483v1


Morphological Segmentation Inside-Out
http://arxiv.org/abs/1911.04916v1


MuTual: A Dataset for Multi-Turn Dialogue Reasoning
http://arxiv.org/abs/2004.04494v1


Multi-Agent Determinantal Q-Learning
http://arxiv.org/abs/2006.01482v4


Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition
http://arxiv.org/abs/2004.03809v2


Multi-Attribute Bayesian Optimization With Interactive Preference Learning
http://arxiv.org/abs/1911.05934v2


Multi-Dimensional Gender Bias Classification
http://arxiv.org/abs/2005.00614v1


Multi-Domain Dialogue Acts and Response Co-Generation
http://arxiv.org/abs/2004.12363v1


Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing
http://arxiv.org/abs/1911.02692v2


Multi-Fact Correction in Abstractive Text Summarization
http://arxiv.org/abs/2010.02443v1


Multi-Hop Knowledge Graph Reasoning with Reward Shaping
http://arxiv.org/abs/1808.10568v2


Multi-Instance Multi-Label Learning Networks for Aspect-Category Sentiment Analysis
http://arxiv.org/abs/2010.02656v1


Multi-Level Matching and Aggregation Network for Few-Shot Relation Classification
http://arxiv.org/abs/1906.06678v1


Multi-Modal Generative Adversarial Network for Short Product Title Generation in Mobile E-Commerce
http://arxiv.org/abs/1904.01735v1


Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model
http://arxiv.org/abs/1906.01749v3


Multi-Objective Molecule Generation using Interpretable Substructures
http://arxiv.org/abs/2002.03244v3


Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification
http://arxiv.org/abs/1805.02220v2


Multi-Principal Assistance Games
http://arxiv.org/abs/2007.09540v1


Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation
http://arxiv.org/abs/1808.09564v1


Multi-Relational Question Answering from Narratives: Machine Reading and Reasoning in Simulated Worlds
http://arxiv.org/abs/1902.09093v1


Multi-Sentence Argument Linking
http://arxiv.org/abs/1911.03766v3


Multi-Source Unsupervised Hyperparameter Optimization
http://arxiv.org/abs/2006.10600v1


Multi-Step Inference for Reasoning Over Paragraphs
http://arxiv.org/abs/2004.02995v1


Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction
http://arxiv.org/abs/1808.09602v1


Multi-Task Learning in Histo-pathology for Widely Generalizable Model
http://arxiv.org/abs/2005.08645v1


Multi-Task Networks With Universe, Group, and Task Feature Learning
http://arxiv.org/abs/1907.01791v1


Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media
http://arxiv.org/abs/1904.00542v1


Multi-Task Reinforcement Learning with Soft Modularization
http://arxiv.org/abs/2003.13661v2


Multi-Task Video Captioning with Video and Entailment Generation
http://arxiv.org/abs/1704.07489v2


Multi-Unit Transformers for Neural Machine Translation
http://arxiv.org/abs/2010.10743v2


Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization
http://arxiv.org/abs/2010.01672v1


Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles
http://arxiv.org/abs/2010.14235v1


Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning
http://arxiv.org/abs/2005.07064v1


Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning
http://arxiv.org/abs/2010.00117v1


Multi-hop Inference for Question-driven Summarization
http://arxiv.org/abs/2010.03738v1


Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs
http://arxiv.org/abs/1905.07374v2


Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs
http://arxiv.org/abs/2010.07459v1


Multi-lingual neural title generation for e-Commerce browse pages
http://arxiv.org/abs/1804.01041v1


Multi-objective Bayesian Optimization using Pareto-frontier Entropy
http://arxiv.org/abs/1906.00127v2


Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction
http://arxiv.org/abs/1704.01691v2


Multi-step Greedy Reinforcement Learning Algorithms
http://arxiv.org/abs/1910.02919v3


Multi-task Learning for Multilingual Neural Machine Translation
http://arxiv.org/abs/2010.02523v1


Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces
http://arxiv.org/abs/1802.09913v2


Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
http://arxiv.org/abs/1809.06963v3


Multi-task Reinforcement Learning with a Planning Quasi-Metric
http://arxiv.org/abs/2002.03240v3


Multi-turn Response Selection using Dialogue Dependency Relations
http://arxiv.org/abs/2010.01502v1


Multi-view Story Characterization from Movie Plot Synopses and Reviews
http://arxiv.org/abs/1908.09083v2


MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale
http://arxiv.org/abs/2010.00980v1


MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension
http://arxiv.org/abs/1905.13453v1


MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech
http://arxiv.org/abs/2005.00812v2


MultiWOZ 2.2: A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines
http://arxiv.org/abs/2007.12720v1


Multidimensional Persistence Module Classification via Lattice-Theoretic Convolutions
http://arxiv.org/abs/2011.14057v1


Multidirectional Associative Optimization of Function-Specific Word Representations
http://arxiv.org/abs/2005.05264v1


Multigrid Neural Memory
http://arxiv.org/abs/1906.05948v4


Multilevel Text Alignment with Cross-Document Attention
http://arxiv.org/abs/2010.01263v1


Multilinear Latent Conditioning for Generating Unseen Attribute Combinations
http://arxiv.org/abs/2009.04075v1


Multilingual AMR-to-Text Generation
http://arxiv.org/abs/2011.05443v1


Multilingual Constituency Parsing with Self-Attention and Pre-Training
http://arxiv.org/abs/1812.11760v2


Multilingual Denoising Pre-training for Neural Machine Translation
http://arxiv.org/abs/2001.08210v2


Multilingual Factor Analysis
http://arxiv.org/abs/1905.05547v2


Multilingual Jointly Trained Acoustic and Written Word Embeddings
http://arxiv.org/abs/2006.14007v1


Multilingual Offensive Language Identification with Cross-lingual Embeddings
http://arxiv.org/abs/2010.05324v1


Multilingual Universal Sentence Encoder for Semantic Retrieval
http://arxiv.org/abs/1907.04307v1


Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment
http://arxiv.org/abs/1805.08660v1


Multimodal Emoji Prediction
http://arxiv.org/abs/1803.02392v2


Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product
http://arxiv.org/abs/2009.07162v1


Multimodal Language Analysis with Recurrent Multistage Fusion
http://arxiv.org/abs/1808.03920v1


Multimodal Machine Translation with Embedding Prediction
http://arxiv.org/abs/1904.00639v1


Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis
http://arxiv.org/abs/2004.14198v2


Multimodal Self-Supervised Learning for Medical Image Analysis
http://arxiv.org/abs/1912.05396v2


Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems
http://arxiv.org/abs/1907.01166v1


Multimodal and Multi-view Models for Emotion Recognition
http://arxiv.org/abs/1906.10198v1


Multinomial Logit Bandit with Low Switching Cost
http://arxiv.org/abs/2007.04876v1


Multiple Instance Learning Networks for Fine-Grained Sentiment Analysis
http://arxiv.org/abs/1711.09645v2


Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis
http://arxiv.org/abs/2002.05578v5


Multiscale Collaborative Deep Models for Neural Machine Translation
http://arxiv.org/abs/2004.14021v3


Musical Word Embedding: Bridging the Gap between Listening Contexts and Music
http://arxiv.org/abs/2008.01190v1


Mutual Information Maximization for Simple and Accurate Part-Of-Speech Induction
http://arxiv.org/abs/1804.07849v4


My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits
http://arxiv.org/abs/2002.09808v4


NADS: Neural Architecture Distribution Search for Uncertainty Awareness
http://arxiv.org/abs/2006.06646v1


NARMADA: Need and Available Resource Managing Assistant for Disasters and Adversities
http://arxiv.org/abs/2005.13524v1


NASH: Toward End-to-End Neural Architecture for Generative Semantic Hashing
http://arxiv.org/abs/1805.05361v1


NAT: Noise-Aware Training for Robust Neural Sequence Labeling
http://arxiv.org/abs/2005.07162v1


NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation
http://arxiv.org/abs/1810.00671v2


NGBoost: Natural Gradient Boosting for Probabilistic Prediction
http://arxiv.org/abs/1910.03225v4


NILE: Natural Language Inference with Faithful Natural Language Explanations
http://arxiv.org/abs/2005.12116v1


NLP Scholar: An Interactive Visual Explorer for Natural Language Processing Literature
http://arxiv.org/abs/2006.01131v1


NSTM: Real-Time Query-Driven News Overview Composition at Bloomberg
http://arxiv.org/abs/2006.01117v1


Naive Exploration is Optimal for Online LQR
http://arxiv.org/abs/2001.09576v2


Naive Feature Selection: Sparsity in Naive Bayes
http://arxiv.org/abs/1905.09884v2


Nakdan: Professional Hebrew Diacritizer
http://arxiv.org/abs/2005.03312v1


Named Entity Recognition Only from Word Embeddings
http://arxiv.org/abs/1909.00164v2


Named Entity Recognition as Dependency Parsing
http://arxiv.org/abs/2005.07150v3


Named Entity Recognition for Social Media Texts with Semantic Augmentation
http://arxiv.org/abs/2010.15458v1


Named Entity Recognition without Labelled Data: A Weak Supervision Approach
http://arxiv.org/abs/2004.14723v1


Native Language Cognate Effects on Second Language Lexical Choice
http://arxiv.org/abs/1805.09590v1


Natural Language Comprehension with the EpiReader
http://arxiv.org/abs/1606.02270v2


Natural Language Processing with Small Feed-Forward Networks
http://arxiv.org/abs/1708.00214v1


Natural language processing for achieving sustainable development: the case of neural labelling to enhance community profiling
http://arxiv.org/abs/2004.12935v2


Naturalizing a Programming Language via Interactive Learning
http://arxiv.org/abs/1704.06956v1


Navigating the Dynamics of Financial Embeddings over Time
http://arxiv.org/abs/2007.00591v1


Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation
http://arxiv.org/abs/1804.05945v1


Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling
http://arxiv.org/abs/2007.03927v2


Near-Optimal Algorithms for Minimax Optimization
http://arxiv.org/abs/2002.02417v5


Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond
http://arxiv.org/abs/1906.11985v1


Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding
http://arxiv.org/abs/2010.00677v1


Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification
http://arxiv.org/abs/2002.09954v2


Near-optimal Regret Bounds for Stochastic Shortest Path
http://arxiv.org/abs/2002.09869v1


Nearly Linear Row Sampling Algorithm for Quantile Regression
http://arxiv.org/abs/2006.08397v1


Necessary and Sufficient Geometries for Gradient Methods
http://arxiv.org/abs/1909.10455v2


Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly
http://arxiv.org/abs/1911.03343v3


Negative Training for Neural Dialogue Response Generation
http://arxiv.org/abs/1903.02134v5


Negative sampling in semi-supervised learning
http://arxiv.org/abs/1911.05166v2


Neighborhood Growth Determines Geometric Priors for Relational Representation Learning
http://arxiv.org/abs/1910.05565v1


Neighborhood Matching Network for Entity Alignment
http://arxiv.org/abs/2005.05607v1


Nested Named Entity Recognition via Second-best Sequence Learning and Decoding
http://arxiv.org/abs/1909.02250v3


Nested Reasoning About Autonomous Agents Using Probabilistic Programs
http://arxiv.org/abs/1812.01569v2


Nested Subspace Arrangement for Representation of Relational Data
http://arxiv.org/abs/2007.02007v1


Neural AMR: Sequence-to-Sequence Models for Parsing and Generation
http://arxiv.org/abs/1704.08381v3


Neural Abstract Reasoner
http://arxiv.org/abs/2011.09860v1


Neural Argument Generation Augmented with Externally Retrieved Evidence
http://arxiv.org/abs/1805.10254v1


Neural Bipartite Matching
http://arxiv.org/abs/2005.11304v3


Neural CRF Model for Sentence Alignment in Text Simplification
http://arxiv.org/abs/2005.02324v3


Neural CRF Parsing
http://arxiv.org/abs/1507.03641v1


Neural Clustering Processes
http://arxiv.org/abs/1901.00409v4


Neural Contextual Bandits with UCB-based Exploration
http://arxiv.org/abs/1911.04462v3


Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking
http://arxiv.org/abs/1806.10201v1


Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence
http://arxiv.org/abs/2005.01096v1


Neural Decomposition: Functional ANOVA with Variational Autoencoders
http://arxiv.org/abs/2006.14293v2


Neural Deepfake Detection with Factual Structure of Text
http://arxiv.org/abs/2010.07475v1


Neural Differential Equations for Single Image Super-resolution
http://arxiv.org/abs/2005.00865v1


Neural Discourse Structure for Text Categorization
http://arxiv.org/abs/1702.01829v2


Neural Dynamic Policies for End-to-End Sensorimotor Learning
http://arxiv.org/abs/2012.02788v1


Neural End-to-End Learning for Computational Argumentation Mining
http://arxiv.org/abs/1704.06104v2


Neural Fine-Grained Entity Type Classification with Hierarchy-Aware Loss
http://arxiv.org/abs/1803.03378v2


Neural Generation of Dialogue Response Timings
http://arxiv.org/abs/2005.09128v1


Neural Grammatical Error Correction with Finite State Transducers
http://arxiv.org/abs/1903.10625v2


Neural Kernels Without Tangents
http://arxiv.org/abs/2003.02237v2


Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State
http://arxiv.org/abs/1903.03260v1


Neural Latent Relational Analysis to Capture Lexical Semantic Relations in a Vector Space
http://arxiv.org/abs/1809.03401v1


Neural Legal Judgment Prediction in English
http://arxiv.org/abs/1906.02059v1


Neural Machine Translation of Text from Non-Native Speakers
http://arxiv.org/abs/1808.06267v2


Neural Machine Translation via Binary Code Prediction
http://arxiv.org/abs/1704.06918v1


Neural Machine Translation with Source-Side Latent Graph Parsing
http://arxiv.org/abs/1702.02265v4


Neural Manifold Ordinary Differential Equations
http://arxiv.org/abs/2006.10254v1


Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation
http://arxiv.org/abs/2010.02705v1


Neural Metaphor Detection in Context
http://arxiv.org/abs/1808.09653v1


Neural Models for Documents with Metadata
http://arxiv.org/abs/1705.09296v2


Neural Open Information Extraction
http://arxiv.org/abs/1805.04270v1


Neural Operator: Graph Kernel Network for Partial Differential Equations
http://arxiv.org/abs/2003.03485v1


Neural Ordinary Differential Equations on Manifolds
http://arxiv.org/abs/2006.06663v1


Neural Proof Nets
http://arxiv.org/abs/2009.12702v1


Neural Related Work Summarization with a Joint Context-driven Attention Mechanism
http://arxiv.org/abs/1901.09492v1


Neural Responding Machine for Short-Text Conversation
http://arxiv.org/abs/1503.02364v2


Neural Segmental Hypergraphs for Overlapping Mention Recognition
http://arxiv.org/abs/1810.01817v1


Neural Simultaneous Speech Translation Using Alignment-Based Chunking
http://arxiv.org/abs/2005.14489v1


Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision
http://arxiv.org/abs/1611.00020v4


Neural Syntactic Preordering for Controlled Paraphrase Generation
http://arxiv.org/abs/2005.02013v1


Neural Temporal Opinion Modelling for Opinion Prediction on Twitter
http://arxiv.org/abs/2005.13486v1


Neural Text Generation from Structured Data with Application to the Biography Domain
http://arxiv.org/abs/1603.07771v3


Neural Topic Modeling by Incorporating Document Relationship Graph
http://arxiv.org/abs/2009.13972v1


Neural Topic Modeling with Bidirectional Adversarial Training
http://arxiv.org/abs/2004.12331v1


Neural Topic Modeling with Continual Lifelong Learning
http://arxiv.org/abs/2006.10909v1


Neural Topic Modeling with Cycle-Consistent Adversarial Training
http://arxiv.org/abs/2009.13971v1


Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting
http://arxiv.org/abs/1809.08733v2


Neural Word Segmentation with Rich Pretraining
http://arxiv.org/abs/1704.08960v1


Neural models of factuality
http://arxiv.org/abs/1804.02472v1


Neural reparameterization improves structural optimization
http://arxiv.org/abs/1909.04240v2


Neural versus Phrase-Based Machine Translation Quality: a Case Study
http://arxiv.org/abs/1608.04631v2


NeuralREG: An end-to-end approach to referring expression generation
http://arxiv.org/abs/1805.08093v1


Neurals Networks for Projecting Named Entities from English to Ewondo
http://arxiv.org/abs/2004.13841v1


Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning"
http://arxiv.org/abs/2006.11524v3


New Oracle-Efficient Algorithms for Private Synthetic Data Release
http://arxiv.org/abs/2007.05453v1


New Potential-Based Bounds for Prediction with Expert Advice
http://arxiv.org/abs/1911.01641v3


New Protocols and Negative Results for Textual Entailment Data Collection
http://arxiv.org/abs/2004.11997v2


Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies
http://arxiv.org/abs/1804.11283v2


No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling
http://arxiv.org/abs/1804.09160v2


No Permanent Friends or Enemies: Tracking Relationships between Nations from News
http://arxiv.org/abs/1904.08950v1


No-Regret Prediction in Marginally Stable Systems
http://arxiv.org/abs/2002.02064v3


Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency
http://arxiv.org/abs/1809.01812v1


Noise-tolerant, Reliable Active Classification with Comparison Queries
http://arxiv.org/abs/2001.05497v1


Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization
http://arxiv.org/abs/2002.02820v1


Non-Autoregressive Machine Translation with Latent Alignments
http://arxiv.org/abs/2004.07437v3


Non-Parametric Calibration for Classification
http://arxiv.org/abs/1906.04933v3


Non-Projective Dependency Parsing with Non-Local Transitions
http://arxiv.org/abs/1710.09340v3


Non-convex Learning via Replica Exchange Stochastic Gradient MCMC
http://arxiv.org/abs/2008.05367v2


Non-exchangeable feature allocation models with sublinear growth of the feature sizes
http://arxiv.org/abs/2003.13491v1


Non-linear interlinkages and key objectives amongst the Paris Agreement and the Sustainable Development Goals
http://arxiv.org/abs/2004.09318v1


Nonmyopic Gaussian Process Optimization with Macro-Actions
http://arxiv.org/abs/2002.09670v1


Nonparametric Estimation in the Dynamic Bradley-Terry Model
http://arxiv.org/abs/2003.00083v1


Nonparametric Score Estimators
http://arxiv.org/abs/2005.10099v2


Norm-Based Curriculum Learning for Neural Machine Translation
http://arxiv.org/abs/2006.02014v1


Normalized Flat Minima: Exploring Scale Invariant Definition of Flat Minima for Neural Networks using PAC-Bayesian Analysis
http://arxiv.org/abs/1901.04653v2


Normalized Loss Functions for Deep Learning with Noisy Labels
http://arxiv.org/abs/2006.13554v1


Normalizing Flows Across Dimensions
http://arxiv.org/abs/2006.13070v1


Normalizing Flows on Tori and Spheres
http://arxiv.org/abs/2002.02428v2


Normalizing Flows with Multi-Scale Autoregressive Priors
http://arxiv.org/abs/2004.03891v1


Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses
http://arxiv.org/abs/1911.03850v3


Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation
http://arxiv.org/abs/2009.09359v2


Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection
http://arxiv.org/abs/2004.07667v2


Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers
http://arxiv.org/abs/1805.08154v1


Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings
http://arxiv.org/abs/2006.01938v1


NwQM: A neural quality assessment framework for Wikipedia
http://arxiv.org/abs/2010.06969v1


OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts
http://arxiv.org/abs/1707.07102v1


ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020
http://arxiv.org/abs/2005.11861v1


OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning
http://arxiv.org/abs/2010.13611v2


OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits
http://arxiv.org/abs/1905.10040v4


Obfuscation for Privacy-preserving Syntactic Parsing
http://arxiv.org/abs/1904.09585v2


Obfuscation via Information Density Estimation
http://arxiv.org/abs/1910.08109v1


Object Ordering with Bidirectional Matchings for Visual Reasoning
http://arxiv.org/abs/1804.06870v2


Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes
http://arxiv.org/abs/1907.00326v1


Obtaining Adjustable Regularization for Free via Iterate Averaging
http://arxiv.org/abs/2008.06736v1


Obtaining Faithful Interpretations from Compositional Neural Networks
http://arxiv.org/abs/2005.00724v2


Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers
http://arxiv.org/abs/2006.13916v1


Off-Policy Actor-Critic with Shared Experience Replay
http://arxiv.org/abs/1909.11583v2


Offline Meta-Reinforcement Learning with Advantage Weighting
http://arxiv.org/abs/2008.06043v2


Old Dog Learns New Tricks: Randomized UCB for Bandit Problems
http://arxiv.org/abs/1910.04928v2


On Contrastive Learning for Likelihood-free Inference
http://arxiv.org/abs/2002.03712v2


On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent
http://arxiv.org/abs/2007.00534v1


On Coresets For Regularized Regression
http://arxiv.org/abs/2006.05440v3


On Cross-Dataset Generalization in Automatic Detection of Online Abuse
http://arxiv.org/abs/2010.07414v2


On Detecting Data Pollution Attacks On Recommender Systems Using Sequential GANs
http://arxiv.org/abs/2012.02509v1


On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data
http://arxiv.org/abs/2010.11082v1


On Dimensional Linguistic Properties of the Word Embedding Space
http://arxiv.org/abs/1910.02211v2


On Effective Parallelization of Monte Carlo Tree Search
http://arxiv.org/abs/2006.08785v2


On Efficient Constructions of Checkpoints
http://arxiv.org/abs/2009.13003v1


On Efficient Low Distortion Ultrametric Embedding
http://arxiv.org/abs/2008.06700v1


On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models
http://arxiv.org/abs/1903.06620v2


On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation
http://arxiv.org/abs/2005.03642v1


On Extractive and Abstractive Neural Document Summarization with Transformer Language Models
http://arxiv.org/abs/1909.03186v2


On Faithfulness and Factuality in Abstractive Summarization
http://arxiv.org/abs/2005.00661v1


On Generalization Bounds of a Family of Recurrent Neural Networks
http://arxiv.org/abs/1910.12947v2


On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems
http://arxiv.org/abs/1906.00331v6


On Graph Classification Networks, Datasets and Baselines
http://arxiv.org/abs/1905.04682v1


On Incorporating Structural Information to improve Dialogue Response Generation
http://arxiv.org/abs/2005.14315v1


On Iterative Neural Network Pruning, Reinitialization, and the Similarity of Masks
http://arxiv.org/abs/2001.05050v1


On Layer Normalization in the Transformer Architecture
http://arxiv.org/abs/2002.04745v2


On Learning Language-Invariant Representations for Universal Machine Translation
http://arxiv.org/abs/2008.04510v1


On Learning Sets of Symmetric Elements
http://arxiv.org/abs/2002.08599v4


On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration
http://arxiv.org/abs/2004.04719v1


On Losses for Modern Language Models
http://arxiv.org/abs/2010.01694v1


On Maximization of Weakly Modular Functions: Guarantees of Multi-stage Algorithms, Tractability, and Hardness
http://arxiv.org/abs/1805.11251v5


On Measuring Social Biases in Sentence Encoders
http://arxiv.org/abs/1903.10561v1


On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment
http://arxiv.org/abs/2010.03017v1


On Optimal Transformer Depth for Low-Resource Language Translation
http://arxiv.org/abs/2004.04418v2


On Polynomial Approximations for Privacy-Preserving and Verifiable ReLU Networks
http://arxiv.org/abs/2011.05530v1


On Primes, Log-Loss Scores and (No) Privacy
http://arxiv.org/abs/2009.08559v1


On Random Subsampling of Gaussian Process Regression: A Graphon-Based Analysis
http://arxiv.org/abs/1901.09541v1


On Second-Order Group Influence Functions for Black-Box Predictions
http://arxiv.org/abs/1911.00418v2


On Suboptimality of Least Squares with Application to Estimation of Convex Bodies
http://arxiv.org/abs/2006.04046v1


On The Evaluation of Machine Translation Systems Trained With Back-Translation
http://arxiv.org/abs/1908.05204v2


On Thompson Sampling for Smoother-than-Lipschitz Bandits
http://arxiv.org/abs/2001.02323v2


On Unbalanced Optimal Transport: An Analysis of Sinkhorn Algorithm
http://arxiv.org/abs/2002.03293v2


On Using Very Large Target Vocabulary for Neural Machine Translation
http://arxiv.org/abs/1412.2007v2


On Variational Learning of Controllable Representations for Text without Supervision
http://arxiv.org/abs/1905.11975v4


On conditional versus marginal bias in multi-armed bandits
http://arxiv.org/abs/2002.08422v2


On the Benefits of Models with Perceptually-Aligned Gradients
http://arxiv.org/abs/2005.01499v1


On the Choice of Auxiliary Languages for Improved Sequence Tagging
http://arxiv.org/abs/2005.09389v1


On the Complementary Nature of Knowledge Graph Embedding, Fine Grain Entity Types, and Language Modeling
http://arxiv.org/abs/2010.05732v1


On the Computational Power of Transformers and its Implications in Sequence Modeling
http://arxiv.org/abs/2006.09286v3


On the Consistency of Top-k Surrogate Losses
http://arxiv.org/abs/1901.11141v2


On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
http://arxiv.org/abs/1808.05671v3


On the Convergence of Continuous Constrained Optimization for Structure Learning
http://arxiv.org/abs/2011.11150v2


On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings
http://arxiv.org/abs/2002.12414v2


On the Convergence of SARAH and Beyond
http://arxiv.org/abs/1906.02351v2


On the Convergence of Stochastic Gradient Descent with Low-Rank Projections for Convex Low-Rank Matrix Problems
http://arxiv.org/abs/2001.11668v2


On the Cross-lingual Transferability of Monolingual Representations
http://arxiv.org/abs/1910.11856v3


On the Encoder-Decoder Incompatibility in Variational Text Modeling and Beyond
http://arxiv.org/abs/2004.09189v1


On the Expressivity of Neural Networks for Deep Reinforcement Learning
http://arxiv.org/abs/1910.05927v3


On the Frailty of Universal POS Tags for Neural UD Parsers
http://arxiv.org/abs/2010.01830v3


On the Generalization Benefit of Noise in Stochastic Gradient Descent
http://arxiv.org/abs/2006.15081v1


On the Global Convergence Rates of Softmax Policy Gradient Methods
http://arxiv.org/abs/2005.06392v2


On the Idiosyncrasies of the Mandarin Chinese Classifier System
http://arxiv.org/abs/1902.10193v3


On the Inference Calibration of Neural Machine Translation
http://arxiv.org/abs/2005.00963v1


On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation
http://arxiv.org/abs/2005.01196v3


On the Limitations of Unsupervised Bilingual Dictionary Induction
http://arxiv.org/abs/1805.03620v1


On the Linguistic Representational Power of Neural Machine Translation Models
http://arxiv.org/abs/1911.00317v1


On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels
http://arxiv.org/abs/1908.10292v2


On the Noisy Gradient Descent that Generalizes as SGD
http://arxiv.org/abs/1906.07405v3


On the Number of Linear Regions of Convolutional Neural Networks
http://arxiv.org/abs/2006.00978v2


On the Practical Computational Power of Finite Precision RNNs for Language Recognition
http://arxiv.org/abs/1805.04908v1


On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation
http://arxiv.org/abs/2007.01488v2


On the Robustness of Language Encoders against Grammatical Errors
http://arxiv.org/abs/2005.05683v1


On the Role of Supervision in Unsupervised Constituency Parsing
http://arxiv.org/abs/2010.02423v2


On the Sample Complexity of Adversarial Multi-Source PAC Learning
http://arxiv.org/abs/2002.10384v2


On the Sample Complexity of Learning Sum-Product Networks
http://arxiv.org/abs/1912.02765v2


On the Sentence Embeddings from Pre-trained Language Models
http://arxiv.org/abs/2011.05864v1


On the Sparsity of Neural Machine Translation Models
http://arxiv.org/abs/2010.02646v1


On the Spontaneous Emergence of Discrete and Compositional Signals
http://arxiv.org/abs/2005.00110v1


On the Theoretical Properties of the Network Jackknife
http://arxiv.org/abs/2004.08935v2


On the Unreasonable Effectiveness of the Greedy Algorithm: Greedy Adapts to Sharpness
http://arxiv.org/abs/2002.04063v1


On the diminishing return of labeling clinical reports
http://arxiv.org/abs/2010.14587v1


On the importance of pre-training data volume for compact language models
http://arxiv.org/abs/2010.03813v2


On the interplay between noise and curvature and its effect on optimization and generalization
http://arxiv.org/abs/1906.07774v2


On the optimality of kernels for high-dimensional clustering
http://arxiv.org/abs/1912.00458v1


On the space-time expressivity of ResNets
http://arxiv.org/abs/1910.09599v4


On-The-Fly Information Retrieval Augmentation for Language Models
http://arxiv.org/abs/2007.01528v1


One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control
http://arxiv.org/abs/2007.04976v1


One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases
http://arxiv.org/abs/1810.05241v4


One Size Fits All: Can We Train One Denoiser for All Noise Levels?
http://arxiv.org/abs/2005.09627v3


One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL
http://arxiv.org/abs/2010.14484v2


Online Continuous DR-Submodular Maximization with Long-Term Budget Constraints
http://arxiv.org/abs/1907.00316v1


Online Conversation Disentanglement with Pointer Networks
http://arxiv.org/abs/2010.11080v1


Online Dense Subgraph Discovery via Blurred-Graph Feedback
http://arxiv.org/abs/2006.13642v1


Online Forecasting of Total-Variation-bounded Sequences
http://arxiv.org/abs/1906.03364v2


Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies
http://arxiv.org/abs/2006.07554v1


Online Learning Using Only Peer Prediction
http://arxiv.org/abs/1910.04382v2


Online Learning for Active Cache Synchronization
http://arxiv.org/abs/2002.12014v2


Online Learning with Continuous Variations: Dynamic Regret and Reductions
http://arxiv.org/abs/1902.07286v3


Online Learning with Imperfect Hints
http://arxiv.org/abs/2002.04726v2


Online Pricing with Offline Data: Phase Transition and Inverse Square Law
http://arxiv.org/abs/1910.08693v6


Online Safety Assurance for Deep Reinforcement Learning
http://arxiv.org/abs/2010.03625v1


Online Segment to Segment Neural Transduction
http://arxiv.org/abs/1609.08194v1


Online metric algorithms with untrusted predictions
http://arxiv.org/abs/2003.02144v2


Online mirror descent and dual averaging: keeping pace in the dynamic case
http://arxiv.org/abs/2006.02585v2


Open Domain Event Extraction Using Neural Latent Variable Models
http://arxiv.org/abs/1906.06947v1


Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text
http://arxiv.org/abs/1809.00782v1


Open Korean Corpora: A Practical Report
http://arxiv.org/abs/2012.15621v1


Operation-Aware Soft Channel Pruning using Differentiable Masks
http://arxiv.org/abs/2007.03938v2


OpinionDigest: A Simple Framework for Opinion Summarization
http://arxiv.org/abs/2005.01901v1


Opportunistic Decoding with Timely Correction for Simultaneous Translation
http://arxiv.org/abs/2005.00675v1


Optimal Client Sampling for Federated Learning
http://arxiv.org/abs/2010.13723v1


Optimal Continual Learning has Perfect Memory and is NP-hard
http://arxiv.org/abs/2006.05188v1


Optimal Randomized First-Order Methods for Least-Squares Problems
http://arxiv.org/abs/2002.09488v2


Optimal Robust Learning of Discrete Distributions from Batches
http://arxiv.org/abs/1911.08532v2


Optimal Transport-based Alignment of Learned Character Representations for String Similarity
http://arxiv.org/abs/1907.10165v1


Optimal approximation for unconstrained non-submodular minimization
http://arxiv.org/abs/1905.12145v3


Optimal group testing
http://arxiv.org/abs/1911.02287v3


Optimal transport mapping via input convex neural networks
http://arxiv.org/abs/1908.10962v2


Optimistic Policy Optimization with Bandit Feedback
http://arxiv.org/abs/2002.08243v2


Optimistic bounds for multi-output prediction
http://arxiv.org/abs/2002.09769v1


Optimization Theory for ReLU Neural Networks Trained with Normalization Layers
http://arxiv.org/abs/2006.06878v1


Optimization from Structured Samples for Coverage Functions
http://arxiv.org/abs/2007.02738v1


Optimization of Graph Total Variation via Active-Set-based Combinatorial Reconditioning
http://arxiv.org/abs/2002.12236v1


Optimized Score Transformation for Fair Classification
http://arxiv.org/abs/1906.00066v2


Optimizer Benchmarking Needs to Account for Hyperparameter Tuning
http://arxiv.org/abs/1910.11758v4


Optimizing Black-box Metrics with Adaptive Surrogates
http://arxiv.org/abs/2002.08605v1


Optimizing Data Usage via Differentiable Rewards
http://arxiv.org/abs/1911.10088v2


Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach
http://arxiv.org/abs/2008.00104v2


Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning
http://arxiv.org/abs/2007.07298v2


Optimizing Millions of Hyperparameters by Implicit Differentiation
http://arxiv.org/abs/1911.02590v1


Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports
http://arxiv.org/abs/1911.02541v3


Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
http://arxiv.org/abs/2004.04092v4


Option Discovery in the Absence of Rewards with Manifold Analysis
http://arxiv.org/abs/2003.05878v2


Oracle Efficient Private Non-Convex Optimization
http://arxiv.org/abs/1909.01783v3


Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization
http://arxiv.org/abs/1907.04371v5


Ordinal Non-negative Matrix Factorization for Recommendation
http://arxiv.org/abs/2006.01034v4


Orthogonal Gradient Descent for Continual Learning
http://arxiv.org/abs/1910.07104v1


Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding
http://arxiv.org/abs/1911.04910v3


Orthogonalized SGD and Nested Architectures for Anytime Neural Networks
http://arxiv.org/abs/2008.06635v1


Out of the Echo Chamber: Detecting Countering Debate Speeches
http://arxiv.org/abs/2005.01157v1


Overcoming Language Variation in Sentiment Analysis with Social Attention
http://arxiv.org/abs/1511.06052v4


Overfitting in adversarially robust deep learning
http://arxiv.org/abs/2002.11569v2


P-SIF: Document Embeddings Using Partition Averaging
http://arxiv.org/abs/2005.09069v1


PAC Bounds for Imitation and Model-based Batch Learning of Contextual Markov Decision Processes
http://arxiv.org/abs/2006.06352v2


PAC learning with stable and private predictions
http://arxiv.org/abs/1911.10541v2


PACRR: A Position-Aware Neural IR Model for Relevance Matching
http://arxiv.org/abs/1704.03940v3


PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization
http://arxiv.org/abs/2008.10898v2


PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation
http://arxiv.org/abs/2010.02301v1


PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation
http://arxiv.org/abs/2004.07159v2


PAN: Path Integral Based Convolution for Deep Graph Neural Networks
http://arxiv.org/abs/1904.10996v1


PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge
http://arxiv.org/abs/2010.03725v1


PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions
http://arxiv.org/abs/2007.10408v2


PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
http://arxiv.org/abs/1912.08777v3


PENNI: Pruned Kernel Sharing for Efficient CNN Inference
http://arxiv.org/abs/2005.07133v2


PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models
http://arxiv.org/abs/2006.09075v1


PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation
http://arxiv.org/abs/2010.05143v1


PLAS: Latent Action Space for Offline Reinforcement Learning
http://arxiv.org/abs/2011.07213v1


PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable
http://arxiv.org/abs/1910.07931v3


POPCORN: Partially Observed Prediction COnstrained ReiNforcement Learning
http://arxiv.org/abs/2001.04032v2


POSEIDON: Privacy-Preserving Federated Neural Network Learning
http://arxiv.org/abs/2009.00349v3


PRover: Proof Generation for Interpretable Reasoning over Rules
http://arxiv.org/abs/2010.02830v1


PackIt: A Virtual Environment for Geometric Planning
http://arxiv.org/abs/2007.11121v1


Pan-Private Uniformity Testing
http://arxiv.org/abs/1911.01452v3


Parallel Algorithm for Non-Monotone DR-Submodular Maximization
http://arxiv.org/abs/1905.13272v1


Parallel Corpus Filtering via Pre-trained Language Models
http://arxiv.org/abs/2005.06166v1


Parallel Data Augmentation for Formality Style Transfer
http://arxiv.org/abs/2005.07522v1


Parallel Interactive Networks for Multi-Domain Dialogue State Generation
http://arxiv.org/abs/2009.07616v3


Parallels Between Phase Transitions and Circuit Complexity?
http://arxiv.org/abs/1904.05483v2


Parameters Estimation from the 21 cm signal using Variational Inference
http://arxiv.org/abs/2005.02299v1


Parametric Gaussian Process Regressors
http://arxiv.org/abs/1910.07123v3


Paraphrase Augmented Task-Oriented Dialog Generation
http://arxiv.org/abs/2004.07462v2


Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity
http://arxiv.org/abs/2008.04935v2


Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations
http://arxiv.org/abs/1805.02442v1


PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks
http://arxiv.org/abs/2007.11752v2


Pareto Probing: Trading Off Accuracy for Complexity
http://arxiv.org/abs/2010.02180v2


Parrot: Data-Driven Behavioral Priors for Reinforcement Learning
http://arxiv.org/abs/2011.10024v1


Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information
http://arxiv.org/abs/1704.07287v2


Parsing as Reduction
http://arxiv.org/abs/1503.00030v1


Partial Trace Regression and Low-Rank Kraus Decomposition
http://arxiv.org/abs/2007.00935v2


Partially-Aligned Data-to-Text Generation with Distant Supervision
http://arxiv.org/abs/2010.01268v1


Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages
http://arxiv.org/abs/1704.08914v2


Pathologies of Neural Models Make Interpretations Difficult
http://arxiv.org/abs/1804.07781v3


Patient-Specific Effects of Medication Using Latent Force Models with Gaussian Processes
http://arxiv.org/abs/1906.00226v1


PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D
http://arxiv.org/abs/2012.07773v1


PeTra: A Sparsely Supervised Memory Model for People Tracking
http://arxiv.org/abs/2005.02990v1


Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates
http://arxiv.org/abs/1910.03231v7


Perceptual Generative Autoencoders
http://arxiv.org/abs/1906.10335v2


Performative Prediction
http://arxiv.org/abs/2002.06673v3


Permutation Invariant Graph Generation via Score-Based Generative Modeling
http://arxiv.org/abs/2003.00638v1


Permutation invariant networks to learn Wasserstein metrics
http://arxiv.org/abs/2010.05820v3


PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures
http://arxiv.org/abs/1904.09378v4


Personality Trait Detection Using Bagged SVM over BERT Word Embedding Ensembles
http://arxiv.org/abs/2010.01309v1


Personalized Language Model for Query Auto-Completion
http://arxiv.org/abs/1804.09661v1


Personalized Neural Embeddings for Collaborative Filtering with Text
http://arxiv.org/abs/1903.07860v1


Personalized neural language models for real-world query auto completion
http://arxiv.org/abs/1804.06439v3


Personalizing Dialogue Agents: I have a dog, do you have pets too?
http://arxiv.org/abs/1801.07243v5


Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good
http://arxiv.org/abs/1906.06725v2


Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT
http://arxiv.org/abs/2004.14786v2


Pessimism About Unknown Unknowns Inspires Conservatism
http://arxiv.org/abs/2006.08753v1


Phone Features Improve Speech Translation
http://arxiv.org/abs/2005.13681v1


Phonetic and Visual Priors for Decipherment of Informal Romanization
http://arxiv.org/abs/2005.02517v1


Phonotactic Complexity and its Trade-offs
http://arxiv.org/abs/2005.03774v1


Phrase-Based & Neural Unsupervised Machine Translation
http://arxiv.org/abs/1804.07755v2


Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension
http://arxiv.org/abs/1804.07726v2


Pieces of Eight: 8-bit Neural Machine Translation
http://arxiv.org/abs/1804.05038v1


Piecewise Linear Regression via a Difference of Convex Functions
http://arxiv.org/abs/2007.02422v3


Planning from Pixels using Inverse Dynamics Models
http://arxiv.org/abs/2012.02419v1


Planning to Explore via Self-Supervised World Models
http://arxiv.org/abs/2005.05960v2


Playing 20 Question Game with Policy-Based Reinforcement Learning
http://arxiv.org/abs/1808.07645v3


Playing Text-Adventure Games with Graph-Based Deep Reinforcement Learning
http://arxiv.org/abs/1812.01628v2


Please Mind the Root: Decoding Arborescences for Dependency Parsing
http://arxiv.org/abs/2010.02550v2


PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking
http://arxiv.org/abs/2004.14967v2


Plug and Play Autoencoders for Conditional Text Generation
http://arxiv.org/abs/2010.02983v2


PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination
http://arxiv.org/abs/2001.08950v5


Pointer Graph Networks
http://arxiv.org/abs/2006.06380v2


Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions
http://arxiv.org/abs/1809.00800v1


Pointwise Paraphrase Appraisal is Potentially Problematic
http://arxiv.org/abs/2005.11996v2


Poisson Learning: Graph Based Semi-Supervised Learning At Very Low Label Rates
http://arxiv.org/abs/2006.11184v2


Policy Gradient as a Proxy for Dynamic Oracles in Constituency Parsing
http://arxiv.org/abs/1806.03290v1


Policy Learning Using Weak Supervision
http://arxiv.org/abs/2010.01748v2


Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations
http://arxiv.org/abs/1809.01299v1


Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning
http://arxiv.org/abs/2003.12909v2


Politeness Transfer: A Tag and Generate Approach
http://arxiv.org/abs/2004.14257v2


Political Advertising Dataset: the use case of the Polish 2020 Presidential Elections
http://arxiv.org/abs/2006.10207v1


PolyGen: An Autoregressive Generative Model of 3D Meshes
http://arxiv.org/abs/2002.10880v1


Polyglot Semantic Parsing in APIs
http://arxiv.org/abs/1803.06966v2


Polyglot Semantic Role Labeling
http://arxiv.org/abs/1805.11598v1


Population Mapping in Informal Settlements with High-Resolution Satellite Imagery and Equitable Ground-Truth
http://arxiv.org/abs/2009.08410v1


Population-Based Black-Box Optimization for Biological Sequence Design
http://arxiv.org/abs/2006.03227v2


Position-Aware Tagging for Aspect Sentiment Triplet Extraction
http://arxiv.org/abs/2010.02609v2


Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information
http://arxiv.org/abs/2003.05955v1


Posterior Calibrated Training on Sentence Classification Tasks
http://arxiv.org/abs/2004.14500v2


Posterior Control of Blackbox Generation
http://arxiv.org/abs/2005.04560v1


PowerNorm: Rethinking Batch Normalization in Transformers
http://arxiv.org/abs/2003.07845v2


PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction
http://arxiv.org/abs/2010.13816v1


Pragmatically Informative Image Captioning with Character-Level Inference
http://arxiv.org/abs/1804.05417v2


Pragmatically Informative Text Generation
http://arxiv.org/abs/1904.01301v2


Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following
http://arxiv.org/abs/1907.09671v1


Pre-Training Transformers as Energy-Based Cloze Models
http://arxiv.org/abs/2012.08561v1


Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders
http://arxiv.org/abs/1911.03882v4


Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning
http://arxiv.org/abs/2004.14074v1


Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information
http://arxiv.org/abs/2010.03142v2


Pre-training for Abstractive Document Summarization by Reinstating Source Text
http://arxiv.org/abs/2004.01853v4


Pre-training on high-resource speech recognition improves low-resource speech-to-text translation
http://arxiv.org/abs/1809.01431v2


PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution
http://arxiv.org/abs/1810.09807v1


Precise Task Formalization Matters in Winograd Schema Evaluations
http://arxiv.org/abs/2010.04043v1


Precise Tradeoffs in Adversarial Training for Linear Regression
http://arxiv.org/abs/2002.10477v1


Predicting Choice with Set-Dependent Aggregation
http://arxiv.org/abs/1906.06365v2


Predicting Clinical Trial Results by Implicit Evidence Integration
http://arxiv.org/abs/2010.05639v1


Predicting Declension Class from Form and Meaning
http://arxiv.org/abs/2005.00626v2


Predicting In-game Actions from Interviews of NBA Players
http://arxiv.org/abs/1910.11292v3


Predicting Native Language from Gaze
http://arxiv.org/abs/1704.07398v2


Predicting Performance for Natural Language Processing Tasks
http://arxiv.org/abs/2005.00870v1


Predicting Semantic Relations using Global Graph Properties
http://arxiv.org/abs/1808.08644v1


Predicting Unplanned Readmissions with Highly Unstructured Data
http://arxiv.org/abs/2003.11622v2


Predicting and Analyzing Law-Making in Kenya
http://arxiv.org/abs/2006.05493v1


Prediction Focused Topic Models via Feature Selection
http://arxiv.org/abs/1910.05495v2


Prediction of Bayesian Intervals for Tropical Storms
http://arxiv.org/abs/2003.05024v1


Prediction of neonatal mortality in Sub-Saharan African countries using data-level linkage of multiple surveys
http://arxiv.org/abs/2011.12707v1


Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview
http://arxiv.org/abs/1912.11078v2


Predictive Coding for Locally-Linear Control
http://arxiv.org/abs/2003.01086v1


Predictive Multiplicity in Classification
http://arxiv.org/abs/1909.06677v4


Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning
http://arxiv.org/abs/2011.13093v1


Predictive Sampling with Forecasting Autoregressive Models
http://arxiv.org/abs/2002.09928v2


Pretrained Language Model Embryology: The Birth of ALBERT
http://arxiv.org/abs/2010.02480v2


Pretrained Transformers Improve Out-of-Distribution Robustness
http://arxiv.org/abs/2004.06100v2


Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models
http://arxiv.org/abs/2005.10389v1


Principal Neighbourhood Aggregation for Graph Nets
http://arxiv.org/abs/2004.05718v5


Principled learning method for Wasserstein distributionally robust optimization with local perturbations
http://arxiv.org/abs/2006.03333v2


Privacy Amplification by Decentralization
http://arxiv.org/abs/2012.05326v1


Privacy-Preserving XGBoost Inference
http://arxiv.org/abs/2011.04789v4


Privacy-preserving Neural Representations of Text
http://arxiv.org/abs/1808.09408v1


Privacy-preserving collaborative machine learning on genomic data using TensorFlow
http://arxiv.org/abs/2002.04344v2


Private Outsourced Bayesian Optimization
http://arxiv.org/abs/2010.12799v1


Private Query Release Assisted by Public Data
http://arxiv.org/abs/2004.10941v1


Private Reinforcement Learning with PAC and Regret Guarantees
http://arxiv.org/abs/2009.09052v1


Private Stochastic Convex Optimization: Optimal Rates in Linear Time
http://arxiv.org/abs/2005.04763v1


Privately Learning Markov Random Fields
http://arxiv.org/abs/2002.09463v2


Privately Learning Thresholds: Closing the Exponential Gap
http://arxiv.org/abs/1911.10137v1


Privately detecting changes in unknown distributions
http://arxiv.org/abs/1910.01327v2


Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering
http://arxiv.org/abs/2005.01898v1


Probabilistic FastText for Multi-Sense Word Embeddings
http://arxiv.org/abs/1806.02901v1


Probabilistic Frame Induction
http://arxiv.org/abs/1302.4813v1


Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling
http://arxiv.org/abs/2009.03954v1


Probabilistic Typology: Deep Generative Models of Vowel Inventories
http://arxiv.org/abs/1705.01684v1


Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order
http://arxiv.org/abs/2004.11579v1


Probing Emergent Semantics in Predictive Agents via Question Answering
http://arxiv.org/abs/2006.01016v1


Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction
http://arxiv.org/abs/2004.08134v1


Probing Linguistic Systematicity
http://arxiv.org/abs/2005.04315v2


Probing Neural Dialog Models for Conversational Understanding
http://arxiv.org/abs/2006.08331v1


Probing Pretrained Language Models for Lexical Semantics
http://arxiv.org/abs/2010.05731v1


Probing Task-Oriented Dialogue Representation from Language Models
http://arxiv.org/abs/2010.13912v1


Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings
http://arxiv.org/abs/1906.03608v1


Probing the Need for Visual Context in Multimodal Machine Translation
http://arxiv.org/abs/1903.08678v2


Problems with Shapley-value-based explanations as feature importance measures
http://arxiv.org/abs/2002.11097v2


Profile Consistency Identification for Open-domain Dialogue Agents
http://arxiv.org/abs/2009.09680v3


Program Enhanced Fact Verification with Verbalization and Graph Attention Network
http://arxiv.org/abs/2010.03084v5


Progressive Graph Learning for Open-Set Domain Adaptation
http://arxiv.org/abs/2006.12087v2


Progressive Growing of Neural ODEs
http://arxiv.org/abs/2003.03695v1


Progressive Identification of True Labels for Partial-Label Learning
http://arxiv.org/abs/2002.08053v3


Progressive growing of self-organized hierarchical representations for exploration
http://arxiv.org/abs/2005.06369v1


Projective Preferential Bayesian Optimization
http://arxiv.org/abs/2002.03113v4


Pronoun-Targeted Fine-tuning for NMT with Hybrid Losses
http://arxiv.org/abs/2010.07638v1


Proper Learning, Helly Number, and an Optimal SVM Bound
http://arxiv.org/abs/2005.11818v1


Proper Network Interpretability Helps Adversarial Robustness in Classification
http://arxiv.org/abs/2006.14748v2


Prophets, Secretaries, and Maximizing the Probability of Choosing the Best
http://arxiv.org/abs/1910.03798v1


ProtoQA: A Question Answering Dataset for Prototypical Common-Sense Reasoning
http://arxiv.org/abs/2005.00771v3


Provable Representation Learning for Imitation Learning via Bi-level Optimization
http://arxiv.org/abs/2002.10544v1


Provable Self-Play Algorithms for Competitive Reinforcement Learning
http://arxiv.org/abs/2002.04017v3


Provable Smoothness Guarantees for Black-Box Variational Inference
http://arxiv.org/abs/1901.08431v4


Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation
http://arxiv.org/abs/1911.04384v9


Provably Efficient Exploration in Policy Optimization
http://arxiv.org/abs/1912.05830v3


Provably Efficient Model-based Policy Adaptation
http://arxiv.org/abs/2006.08051v1


Provably Efficient Reinforcement Learning with Linear Function Approximation
http://arxiv.org/abs/1907.05388v2


Proving the Lottery Ticket Hypothesis: Pruning is All You Need
http://arxiv.org/abs/2002.00585v1


Prta: A System to Support the Analysis of Propaganda Techniques in the News
http://arxiv.org/abs/2005.05854v1


Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering
http://arxiv.org/abs/1906.04229v1


Pun Generation with Surprise
http://arxiv.org/abs/1904.06828v1


Putting An End to End-to-End: Gradient-Isolated Learning of Representations
http://arxiv.org/abs/1905.11786v3


PuzzLing Machines: A Challenge on Learning From Small Data
http://arxiv.org/abs/2004.13161v1


Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup
http://arxiv.org/abs/2009.06962v2


PyHessian: Neural Networks Through the Lens of the Hessian
http://arxiv.org/abs/1912.07145v3


PyMT5: multi-mode translation of natural language and Python code with transformers
http://arxiv.org/abs/2010.03150v1


PySBD: Pragmatic Sentence Boundary Disambiguation
http://arxiv.org/abs/2010.09657v1


Pyramid Convolutional RNN for MRI Reconstruction
http://arxiv.org/abs/1912.00543v5


Q-learning with Language Model for Edit-based Unsupervised Summarization
http://arxiv.org/abs/2010.04379v1


Q-value Path Decomposition for Deep Multiagent Reinforcement Learning
http://arxiv.org/abs/2002.03950v1


QA2Explanation: Generating and Evaluating Explanations for Question Answering Systems over Knowledge Graph
http://arxiv.org/abs/2010.08323v1


QuASE: Question-Answer Driven Sentence Encoding
http://arxiv.org/abs/1909.00333v3


Quantifying Attention Flow in Transformers
http://arxiv.org/abs/2005.00928v2


Quantifying Differences in Reward Functions
http://arxiv.org/abs/2006.13900v2


Quantifying Intimacy in Language
http://arxiv.org/abs/2011.03020v1


Quantifying Privacy Leakage in Graph Embedding
http://arxiv.org/abs/2010.00906v1


Quantifying Similarity between Relations with Fact Distribution
http://arxiv.org/abs/1907.08937v1


Quantifying the Effects of COVID-19 on Mental Health Support Forums
http://arxiv.org/abs/2009.04008v1


Quantitative Argument Summarization and Beyond: Cross-Domain Key Point Analysis
http://arxiv.org/abs/2010.05369v1


Quantitative stability of optimal transport maps and linearization of the 2-Wasserstein space
http://arxiv.org/abs/1910.05954v1


Quantized Decentralized Stochastic Learning over Directed Graphs
http://arxiv.org/abs/2002.09964v5


Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free
http://arxiv.org/abs/1902.06332v3


Quantum Boosting
http://arxiv.org/abs/2002.05056v2


Quantum Expectation-Maximization for Gaussian Mixture Models
http://arxiv.org/abs/1908.06657v2


Quaternion Graph Neural Networks
http://arxiv.org/abs/2008.05089v3


Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation
http://arxiv.org/abs/1911.03842v2


Question Directed Graph Attention Network for Numerical Reasoning over Text
http://arxiv.org/abs/2009.07448v1


R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games
http://arxiv.org/abs/2006.16679v1


R4C: A Benchmark for Evaluating RC Systems to Get the Right Answer for the Right Reason
http://arxiv.org/abs/1910.04601v2


RAMP-CNN: A Novel Neural Network for Enhanced Automotive Radar Object Recognition
http://arxiv.org/abs/2011.08981v1


RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers
http://arxiv.org/abs/1911.04942v4


RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization
http://arxiv.org/abs/1908.08200v3


RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information
http://arxiv.org/abs/1812.04361v2


RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr
http://arxiv.org/abs/2007.03349v1


RL agents Implicitly Learning Human Preferences
http://arxiv.org/abs/2002.06137v1


RNNs can generate bounded hierarchical languages with optimal memory
http://arxiv.org/abs/2010.07515v1


ROMA: Multi-Agent Reinforcement Learning with Emergent Roles
http://arxiv.org/abs/2003.08039v3


RPD: A Distance Function Between Word Embeddings
http://arxiv.org/abs/2005.08113v1


Radial Bayesian Neural Networks: Beyond Discrete Support In Large-Scale Bayesian Deep Learning
http://arxiv.org/abs/1907.00865v3


Radioactive data: tracing through training
http://arxiv.org/abs/2002.00937v1


Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization
http://arxiv.org/abs/2006.04655v2


Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures
http://arxiv.org/abs/2001.08370v1


Random Search and Reproducibility for Neural Architecture Search
http://arxiv.org/abs/1902.07638v3


Random extrapolation for primal-dual coordinate descent
http://arxiv.org/abs/2007.06528v1


Randomized Block-Diagonal Preconditioning for Parallel Learning
http://arxiv.org/abs/2006.13591v2


Randomized Exploration in Generalized Linear Bandits
http://arxiv.org/abs/1906.08947v2


Randomized Smoothing of All Shapes and Sizes
http://arxiv.org/abs/2002.08118v5


Randomly Projected Additive Gaussian Processes for Regression
http://arxiv.org/abs/1912.12834v1


Rank and run-time aware compression of NLP Applications
http://arxiv.org/abs/2010.03193v1


Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering
http://arxiv.org/abs/1810.00494v1


Ranking and Selecting Multi-Hop Knowledge Paths to Better Predict Human Needs
http://arxiv.org/abs/1904.00676v1


Rapid Adaptation of Neural Machine Translation to New Languages
http://arxiv.org/abs/1808.04189v1


Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned
http://arxiv.org/abs/2004.05125v1


Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space
http://arxiv.org/abs/1910.04329v4


Rational Recurrences
http://arxiv.org/abs/1808.09357v1


Rationalizing Medical Relation Prediction from Corpus-level Statistics
http://arxiv.org/abs/2005.00889v1


Rationalizing Neural Predictions
http://arxiv.org/abs/1606.04155v2


Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport
http://arxiv.org/abs/2005.13111v1


Re-evaluating Evaluation in Text Summarization
http://arxiv.org/abs/2010.07100v1


Re-translation versus Streaming for Simultaneous Translation
http://arxiv.org/abs/2004.03643v3


ReLU Code Space: A Basis for Rating Network Quality Besides Accuracy
http://arxiv.org/abs/2005.09903v1


Reactive Supervision: A New Method for Collecting Sarcasm Data
http://arxiv.org/abs/2009.13080v1


Reading Between the Lines: Exploring Infilling in Visual Narratives
http://arxiv.org/abs/2010.13944v1


Ready Policy One: World Building Through Active Learning
http://arxiv.org/abs/2002.02693v1


Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index
http://arxiv.org/abs/1906.05807v2


Real-Time Optimisation for Online Learning in Auctions
http://arxiv.org/abs/2010.10070v1


Real-time Classification from Short Event-Camera Streams using Input-filtering Neural ODEs
http://arxiv.org/abs/2004.03156v1


Reasoning About Generalization via Conditional Mutual Information
http://arxiv.org/abs/2001.09122v3


Reasoning About Pragmatics with Neural Listeners and Speakers
http://arxiv.org/abs/1604.00562v2


Reasoning Over History: Context Aware Visual Dialog
http://arxiv.org/abs/2011.00669v1


Reasoning Over Semantic-Level Graph for Fact Checking
http://arxiv.org/abs/1909.03745v3


Reasoning about Actions and State Changes by Injecting Commonsense Knowledge
http://arxiv.org/abs/1808.10012v1


Reasoning about Goals, Steps, and Temporal Ordering with WikiHow
http://arxiv.org/abs/2009.07690v2


Reasoning with Latent Structure Refinement for Document-Level Relation Extraction
http://arxiv.org/abs/2005.06312v3


Reasoning with Sarcasm by Reading In-between
http://arxiv.org/abs/1805.02856v1


Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting
http://arxiv.org/abs/2004.12651v1


RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes
http://arxiv.org/abs/1809.00812v1


Recipes for building an open-domain chatbot
http://arxiv.org/abs/2004.13637v2


Recognizing Implicit Discourse Relations via Repeated Reading: Neural Networks with Multi-Level Attention
http://arxiv.org/abs/1609.06380v1


Recovery of Sparse Signals from a Mixture of Linear Samples
http://arxiv.org/abs/2006.16406v2


Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension
http://arxiv.org/abs/2005.08056v2


Recurrent Event Network: Autoregressive Structure Inference over Temporal Knowledge Graphs
http://arxiv.org/abs/1904.05530v4


Recurrent Hierarchical Topic-Guided RNN for Language Generation
http://arxiv.org/abs/1912.10337v2


Recurrent Interaction Network for Jointly Extracting Entities and Classifying Relations
http://arxiv.org/abs/2005.00162v2


Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment
http://arxiv.org/abs/2005.00165v3


Recurrent Neural Networks as Weighted Language Recognizers
http://arxiv.org/abs/1711.05408v2


Recurrent Neural Networks in Linguistic Theory: Revisiting Pinker and Prince (1988) and the Past Tense Debate
http://arxiv.org/abs/1807.04783v2


Recurrent babbling: evaluating the acquisition of grammar from limited input data
http://arxiv.org/abs/2010.04637v1


Recursive Subtree Composition in LSTM-Based Dependency Parsing
http://arxiv.org/abs/1902.09781v1


Reducibility and Statistical-Computational Gaps from Secret Leakage
http://arxiv.org/abs/2005.08099v2


Reducing Gender Bias in Abusive Language Detection
http://arxiv.org/abs/1808.07231v1


Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem
http://arxiv.org/abs/2004.04498v3


Reducing Sampling Error in Batch Temporal Difference Learning
http://arxiv.org/abs/2008.06738v1


Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts
http://arxiv.org/abs/2011.04554v1


Refined bounds for algorithm configuration: The knife-edge of dual class approximability
http://arxiv.org/abs/2006.11827v2


Reflection-based Word Attribute Transfer
http://arxiv.org/abs/2007.02598v2


Reformulating Unsupervised Style Transfer as Paraphrase Generation
http://arxiv.org/abs/2010.05700v1


Regression Networks for Meta-Learning Few-Shot Classification
http://arxiv.org/abs/1905.13613v2


Regularity as Regularization: Smooth and Strongly Convex Brenier Potentials in Optimal Transport
http://arxiv.org/abs/1905.10812v5


Regularization via Structural Label Smoothing
http://arxiv.org/abs/2001.01900v2


Regularized Autoencoders via Relaxed Injective Probability Flow
http://arxiv.org/abs/2002.08927v1


Regularized Context Gates on Transformer for Machine Translation
http://arxiv.org/abs/1908.11020v2


Regularized Inverse Reinforcement Learning
http://arxiv.org/abs/2010.03691v2


Regularized Optimal Transport is Ground Cost Adversarial
http://arxiv.org/abs/2002.03967v3


Regularizing Dialogue Generation by Imitating Implicit Scenarios
http://arxiv.org/abs/2010.01893v2


Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus
http://arxiv.org/abs/1903.10671v2


Reinforcement Learning Generalization with Surprise Minimization
http://arxiv.org/abs/2004.12399v2


Reinforcement Learning based Curriculum Optimization for Neural Machine Translation
http://arxiv.org/abs/1903.00041v1


Reinforcement Learning for Integer Programming: Learning to Cut
http://arxiv.org/abs/1906.04859v3


Reinforcement Learning for Molecular Design Guided by Quantum Mechanics
http://arxiv.org/abs/2002.07717v2


Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound
http://arxiv.org/abs/1905.10389v2


Reinforcement Learning through Active Inference
http://arxiv.org/abs/2002.12636v1


Reinforcement Learning with Chromatic Networks for Compact Architecture Search
http://arxiv.org/abs/1907.06511v3


Reinforcement Learning with Latent Flow
http://arxiv.org/abs/2101.01857v1


Relabel the Noise: Joint Extraction of Entities and Relations via Cooperative Multiagents
http://arxiv.org/abs/2004.09930v1


Relating Simple Sentence Representations in Deep Neural Networks and the Brain
http://arxiv.org/abs/1906.11861v1


Relation Embedding with Dihedral Group in Knowledge Graph
http://arxiv.org/abs/1906.00687v1


Relation Extraction with Explanation
http://arxiv.org/abs/2005.14271v1


Relational Graph Attention Network for Aspect-based Sentiment Analysis
http://arxiv.org/abs/2004.12362v1


Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment
http://arxiv.org/abs/1605.05433v2


Relative gradient optimization of the Jacobian term in unsupervised deep learning
http://arxiv.org/abs/2006.15090v2


Relaxing Bijectivity Constraints with Continuously Indexed Normalising Flows
http://arxiv.org/abs/1909.13833v4


Relevance of Rotationally Equivariant Convolutions for Predicting Molecular Properties
http://arxiv.org/abs/2008.08461v4


Reliable Fidelity and Diversity Metrics for Generative Models
http://arxiv.org/abs/2002.09797v2


Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks
http://arxiv.org/abs/2003.01690v2


Rep the Set: Neural Networks for Learning Set Representations
http://arxiv.org/abs/1904.01962v2


Replicability Analysis for Natural Language Processing: Testing Significance with Multiple Datasets
http://arxiv.org/abs/1709.09500v1


Representation Learning for Discovering Phonemic Tone Contours
http://arxiv.org/abs/1910.08987v2


Representation Learning for Grounded Spatial Reasoning
http://arxiv.org/abs/1707.03938v2


Representations of language in a model of visually grounded speech signal
http://arxiv.org/abs/1702.01991v3


Representing Unordered Data Using Complex-Weighted Multiset Automata
http://arxiv.org/abs/2001.00610v3


Representing and Denoising Wearable ECG Recordings
http://arxiv.org/abs/2012.00110v1


Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference
http://arxiv.org/abs/2009.09364v2


Repurposing Entailment for Multi-Hop Question Answering Tasks
http://arxiv.org/abs/1904.09380v1


Reserve Pricing in Repeated Second-Price Auctions with Strategic Bidders
http://arxiv.org/abs/1906.09331v1


Reset-Free Lifelong Learning with Skill-Space Planning
http://arxiv.org/abs/2012.03548v2


Resolution Dependent GAN Interpolation for Controllable Image Synthesis Between Domains
http://arxiv.org/abs/2010.05334v3


Resolving Spurious Correlations in Causal Models of Environments via Interventions
http://arxiv.org/abs/2002.05217v2


Response Selection for Multi-Party Conversations with Dynamic Topic Tracking
http://arxiv.org/abs/2010.07785v1


Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation
http://arxiv.org/abs/2005.06128v1


RethinkCWS: Is Chinese Word Segmentation a Solved Task?
http://arxiv.org/abs/2011.06858v2


Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models
http://arxiv.org/abs/1902.08858v2


Rethinking Dialogue State Tracking with Reasoning
http://arxiv.org/abs/2005.13129v2


Retrieval-Based Neural Code Generation
http://arxiv.org/abs/1808.10025v1


Retrofitting Structure-aware Transformer Language Model for End Tasks
http://arxiv.org/abs/2009.07408v1


Reusability and Transferability of Macro Actions for Reinforcement Learning
http://arxiv.org/abs/1908.01478v2


Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT
http://arxiv.org/abs/2009.07610v3


Reverse Engineering Configurations of Neural Text Generation Models
http://arxiv.org/abs/2004.06201v1


Reverse-Engineering Deep ReLU Networks
http://arxiv.org/abs/1910.00744v2


Review-based Question Generation with Adaptive Instance Transfer and Augmentation
http://arxiv.org/abs/1911.01556v2


Revisiting Character-Based Neural Machine Translation with Capacity and Compression
http://arxiv.org/abs/1808.09943v1


Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy
http://arxiv.org/abs/2002.11572v1


Revisiting Fundamentals of Experience Replay
http://arxiv.org/abs/2007.06700v1


Revisiting Joint Modeling of Cross-document Entity and Event Coreference Resolution
http://arxiv.org/abs/1906.01753v1


Revisiting Low-Resource Neural Machine Translation: A Case Study
http://arxiv.org/abs/1905.11901v1


Revisiting Modularized Multilingual NMT to Meet Industrial Demands
http://arxiv.org/abs/2010.09402v1


Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research
http://arxiv.org/abs/2011.14826v1


Revisiting Stochastic Extragradient
http://arxiv.org/abs/1905.11373v2


Revisiting Unsupervised Relation Extraction
http://arxiv.org/abs/2005.00087v1


Revisiting the Context Window for Cross-lingual Word Embeddings
http://arxiv.org/abs/2004.10813v1


Revisiting the Importance of Encoding Logic Rules in Sentiment Classification
http://arxiv.org/abs/1808.07733v1


RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling
http://arxiv.org/abs/2010.08738v1


Rigging the Lottery: Making All Tickets Winners
http://arxiv.org/abs/1911.11134v2


Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference
http://arxiv.org/abs/1902.01007v4


Rigid Formats Controlled Text Generation
http://arxiv.org/abs/2004.08022v1


RikiNet: Reading Wikipedia Pages for Natural Question Answering
http://arxiv.org/abs/2004.14560v1


Risk Assessment for Machine Learning Models
http://arxiv.org/abs/2011.04328v1


Risk Bounds for Learning Multiple Components with Permutation-Invariant Losses
http://arxiv.org/abs/1904.07594v2


Rk-means: Fast Clustering for Relational Data
http://arxiv.org/abs/1910.04939v1


Robust Bayesian Classification Using an Optimistic Score Ratio
http://arxiv.org/abs/2007.04458v1


Robust Cross-lingual Hypernymy Detection using Dependency Context
http://arxiv.org/abs/1803.11291v1


Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning
http://arxiv.org/abs/1805.09927v1


Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation
http://arxiv.org/abs/2012.04839v1


Robust Encodings: A Framework for Combating Adversarial Typos
http://arxiv.org/abs/2005.01229v1


Robust Learning from Discriminative Feature Feedback
http://arxiv.org/abs/2003.03946v1


Robust Optimisation Monte Carlo
http://arxiv.org/abs/1904.00670v3


Robust Outlier Arm Identification
http://arxiv.org/abs/2009.09988v1


Robust Prediction of Punctuation and Truecasing for Medical ASR
http://arxiv.org/abs/2007.02025v2


Robust Reinforcement Learning using Adversarial Populations
http://arxiv.org/abs/2008.01825v2


Robust Variational Autoencoders for Outlier Detection and Repair of Mixed-Type Data
http://arxiv.org/abs/1907.06671v2


Robust Visual Domain Randomization for Reinforcement Learning
http://arxiv.org/abs/1910.10537v2


Robust and Private Learning of Halfspaces
http://arxiv.org/abs/2011.14580v1


Robust and Stable Black Box Explanations
http://arxiv.org/abs/2011.06169v1


Robust model training and generalisation with Studentising flows
http://arxiv.org/abs/2006.06599v2


Robust posterior inference when statistically emulating forward simulations
http://arxiv.org/abs/2004.11929v1


Robustifying Sequential Neural Processes
http://arxiv.org/abs/2006.15987v1


Robustness for Non-Parametric Classification: A Generic Attack and Defense
http://arxiv.org/abs/1906.03310v2


Robustness to Programmable String Transformations via Augmented Abstract Training
http://arxiv.org/abs/2002.09579v4


Robustness to Spurious Correlations via Human Annotations
http://arxiv.org/abs/2007.06661v2


Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding
http://arxiv.org/abs/2010.07954v1


RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark
http://arxiv.org/abs/2010.15925v2


S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking
http://arxiv.org/abs/1609.08075v1


S2ORC: The Semantic Scholar Open Research Corpus
http://arxiv.org/abs/1911.02782v3


S2RMs: Spatially Structured Recurrent Modules
http://arxiv.org/abs/2007.06533v1


SAFENet: Self-Supervised Monocular Depth Estimation with Semantic-Aware Feature Extraction
http://arxiv.org/abs/2010.02893v3


SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions
http://arxiv.org/abs/2005.14424v1


SCAFFOLD: Stochastic Controlled Averaging for Federated Learning
http://arxiv.org/abs/1910.06378v3


SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations
http://arxiv.org/abs/2004.12934v1


SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations
http://arxiv.org/abs/1612.06778v3


SDE-Net: Equipping Deep Neural Networks with Uncertainty Estimates
http://arxiv.org/abs/2008.10546v1


SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks
http://arxiv.org/abs/2006.10503v3


SECTOR: A Neural Model for Coherent Topic Segmentation and Classification
http://arxiv.org/abs/1902.04793v1


SGD Learns One-Layer Networks in WGANs
http://arxiv.org/abs/1910.07030v2


SHAPED: Shared-Private Encoder-Decoder for Text Style Adaptation
http://arxiv.org/abs/1804.04093v1


SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection
http://arxiv.org/abs/2006.11572v2


SIGN: Scalable Inception Graph Neural Networks
http://arxiv.org/abs/2004.11198v3


SIGTYP 2020 Shared Task: Prediction of Typological Features
http://arxiv.org/abs/2010.08246v2


SIGUA: Forgetting May Make Learning with Noisy Labels More Robust
http://arxiv.org/abs/1809.11008v3


SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task
http://arxiv.org/abs/2010.05122v1


SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis
http://arxiv.org/abs/2005.05635v2


SLEDGE-Z: A Zero-Shot Baseline for COVID-19 Literature Search
http://arxiv.org/abs/2010.05987v1


SLM: Learning a Discourse Language Representation with Sentence Unshuffling
http://arxiv.org/abs/2010.16249v1


SLURP: A Spoken Language Understanding Resource Package
http://arxiv.org/abs/2011.13205v1


SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization
http://arxiv.org/abs/1911.03437v2


SMArtCast: Predicting soil moisture interpolations into the future using Earth observation data in a deep learning framework
http://arxiv.org/abs/2003.10823v2


SOTERIA: In Search of Efficient Neural Networks for Private Inference
http://arxiv.org/abs/2007.12934v1


SOrT-ing VQA Models : Contrastive Gradient Learning for Improved Consistency
http://arxiv.org/abs/2010.10038v2


SQuAD: 100,000+ Questions for Machine Comprehension of Text
http://arxiv.org/abs/1606.05250v3


SRLGRN: Semantic Role Labeling Graph Reasoning Network
http://arxiv.org/abs/2010.03604v2


SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning
http://arxiv.org/abs/2009.09566v2


SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness
http://arxiv.org/abs/2009.10195v2


STARC: Structured Annotations for Reading Comprehension
http://arxiv.org/abs/2004.14797v1


STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation
http://arxiv.org/abs/2010.01717v1


SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization
http://arxiv.org/abs/2005.03724v1


SUPP.AI: Finding Evidence for Supplement-Drug Interactions
http://arxiv.org/abs/1909.08135v3


SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference
http://arxiv.org/abs/1808.05326v1


SacreROUGE: An Open-Source Library for Using and Developing Summarization Evaluation Metrics
http://arxiv.org/abs/2007.05374v1


Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences
http://arxiv.org/abs/2002.09089v4


Safe Reinforcement Learning in Constrained Markov Decision Processes
http://arxiv.org/abs/2008.06626v1


Safe Reinforcement Learning with Natural Language Constraints
http://arxiv.org/abs/2010.05150v1


SafeCity: Understanding Diverse Forms of Sexual Harassment Personal Stories
http://arxiv.org/abs/1809.04739v2


Saliency Learning: Teaching the Model Where to Pay Attention
http://arxiv.org/abs/1902.08649v3


SalsaNext: Fast, Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving
http://arxiv.org/abs/2003.03653v3


Sample Amplification: Increasing Dataset Size even when Learning is Impossible
http://arxiv.org/abs/1904.12053v2


Sample Complexity Bounds for 1-bit Compressive Sensing and Binary Stable Embeddings with Generative Priors
http://arxiv.org/abs/2002.01697v3


Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems
http://arxiv.org/abs/1901.08562v2


Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles
http://arxiv.org/abs/1910.10597v1


Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication
http://arxiv.org/abs/2011.00424v1


Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning
http://arxiv.org/abs/2006.11751v2


Sample-efficient proper PAC learning with approximate differential privacy
http://arxiv.org/abs/2012.03893v1


Sarcasm Detection in Tweets with BERT and GloVe Embeddings
http://arxiv.org/abs/2006.11512v1


Sarcasm Detection using Context Separators in Online Discourse
http://arxiv.org/abs/2006.00850v1


Satellite-based Prediction of Forage Conditions for Livestock in Northern Kenya
http://arxiv.org/abs/2004.04081v2


Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features
http://arxiv.org/abs/1709.01189v1


Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling
http://arxiv.org/abs/1611.08034v2


Scalable Deep Generative Modeling for Sparse Graphs
http://arxiv.org/abs/2006.15502v1


Scalable Differentiable Physics for Learning and Control
http://arxiv.org/abs/2007.02168v1


Scalable Differential Privacy with Certified Robustness in Adversarial Learning
http://arxiv.org/abs/1903.09822v5


Scalable Exact Inference in Multi-Output Gaussian Processes
http://arxiv.org/abs/1911.06287v3


Scalable Gaussian Process Regression for Kernels with a Non-Stationary Phase
http://arxiv.org/abs/1912.11713v1


Scalable Gradients for Stochastic Differential Equations
http://arxiv.org/abs/2001.01328v6


Scalable Identification of Partially Observed Systems with Certainty-Equivalent EM
http://arxiv.org/abs/2006.11615v1


Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering
http://arxiv.org/abs/2005.00646v2


Scalable Nearest Neighbor Search for Optimal Transport
http://arxiv.org/abs/1910.04126v4


Scalable Syntax-Aware Language Models Using Knowledge Distillation
http://arxiv.org/abs/1906.06438v1


Scalable Zero-shot Entity Linking with Dense Entity Retrieval
http://arxiv.org/abs/1911.03814v3


Scalable and Efficient Comparison-based Search without Features
http://arxiv.org/abs/1905.05049v3


Scaling Hidden Markov Language Models
http://arxiv.org/abs/2011.04640v1


Scaling up Hybrid Probabilistic Inference with Logical and Arithmetic Constraints via Message Passing
http://arxiv.org/abs/2003.00126v2


Scattering GCN: Overcoming Oversmoothness in Graph Convolutional Networks
http://arxiv.org/abs/2003.08414v2


Scene Graph Parsing as Dependency Parsing
http://arxiv.org/abs/1803.09189v1


Scene Graph Reasoning for Visual Question Answering
http://arxiv.org/abs/2007.01072v1


Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension
http://arxiv.org/abs/1907.05457v2


SciDTB: Discourse Dependency TreeBank for Scientific Abstracts
http://arxiv.org/abs/1806.03653v1


SciREX: A Challenge Dataset for Document-Level Information Extraction
http://arxiv.org/abs/2005.00512v1


Score Combination for Improved Parallel Corpus Filtering for Low Resource Conditions
http://arxiv.org/abs/2011.07933v1


Scoring Lexical Entailment with a Supervised Directional Similarity Network
http://arxiv.org/abs/1805.09355v1


Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions
http://arxiv.org/abs/1912.02566v3


Screenplay Quality Assessment: Can We Predict Who Gets Nominated?
http://arxiv.org/abs/2005.06123v1


Screenplay Summarization Using Latent Narrative Structure
http://arxiv.org/abs/2004.12727v1


ScriptWriter: Narrative-Guided Script Generation
http://arxiv.org/abs/2005.10331v2


Secure Medical Image Analysis with CrypTFlow
http://arxiv.org/abs/2012.05064v1


Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation
http://arxiv.org/abs/2005.00308v1


Selecting Machine-Translated Data for Quick Bootstrapping of a Natural Language Understanding System
http://arxiv.org/abs/1805.09119v1


Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets
http://arxiv.org/abs/1905.06221v4


Selective Attention for Context-aware Neural Machine Translation
http://arxiv.org/abs/1903.08788v2


Selective Dyna-style Planning Under Limited Model Capacity
http://arxiv.org/abs/2007.02418v2


Selective Encoding for Abstractive Sentence Summarization
http://arxiv.org/abs/1704.07073v1


Selective Question Answering under Domain Shift
http://arxiv.org/abs/2006.09462v1


Self-Attentive Associative Memory
http://arxiv.org/abs/2002.03519v3


Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation
http://arxiv.org/abs/2004.03151v2


Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training
http://arxiv.org/abs/2006.11280v1


Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks
http://arxiv.org/abs/2009.08445v2


Self-Supervised Policy Adaptation during Deployment
http://arxiv.org/abs/2007.04309v2


Self-Training for Unsupervised Parsing with PRPN
http://arxiv.org/abs/2005.13455v1


Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering
http://arxiv.org/abs/2005.00316v2


Self-supervised Label Augmentation via Input Transformations
http://arxiv.org/abs/1910.05872v2


SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction
http://arxiv.org/abs/2004.02438v2


Selfish Robustness and Equilibria in Multi-Player Bandits
http://arxiv.org/abs/2002.01197v2


Semantic Annotation for Microblog Topics Using Wikipedia Temporal Information
http://arxiv.org/abs/1701.03939v1


Semantic Drift in Multilingual Representations
http://arxiv.org/abs/1904.10820v4


Semantic Enrichment of Nigerian Pidgin English for Contextual Sentiment Classification
http://arxiv.org/abs/2003.12450v1


Semantic Graphs for Generating Deep Questions
http://arxiv.org/abs/2004.12704v1


Semantic Label Smoothing for Sequence to Sequence Problems
http://arxiv.org/abs/2010.07447v1


Semantic Parsing for Task Oriented Dialog using Hierarchical Representations
http://arxiv.org/abs/1810.07942v1


Semantic Parsing to Probabilistic Programs for Situated Question Answering
http://arxiv.org/abs/1606.07046v2


Semantic Parsing with Dual Learning
http://arxiv.org/abs/1907.05343v2


Semantic Parsing with Semi-Supervised Sequential Autoencoders
http://arxiv.org/abs/1609.09315v1


Semantic Role Labeling Guided Multi-turn Dialogue ReWriter
http://arxiv.org/abs/2010.01417v1


Semantic Role Labeling as Syntactic Dependency Parsing
http://arxiv.org/abs/2010.11170v1


Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data
http://arxiv.org/abs/1808.09409v2


Semantic Scaffolds for Pseudocode-to-Code Generation
http://arxiv.org/abs/2005.05927v1


Semantic Structural Evaluation for Text Simplification
http://arxiv.org/abs/1810.05022v1


Semantic expressive capacity with bounded memory
http://arxiv.org/abs/1906.11752v1


Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems
http://arxiv.org/abs/1508.01745v2


Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems
http://arxiv.org/abs/1811.00720v2


Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems
http://arxiv.org/abs/2010.06823v1


Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components
http://arxiv.org/abs/2003.06804v1


Semi-Supervised Bilingual Lexicon Induction with Two-way Interaction
http://arxiv.org/abs/2010.07101v1


Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation
http://arxiv.org/abs/2005.04379v1


Semi-Supervised Learning with Normalizing Flows
http://arxiv.org/abs/1912.13025v1


Semi-Supervised QA with Generative Domain-Adaptive Nets
http://arxiv.org/abs/1702.02206v2


Semi-Supervised StyleGAN for Disentanglement Learning
http://arxiv.org/abs/2003.03461v3


Semi-supervised User Geolocation via Graph Convolutional Networks
http://arxiv.org/abs/1804.08049v4


Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees
http://arxiv.org/abs/2003.01013v1


SenseBERT: Driving Some Sense into BERT
http://arxiv.org/abs/1908.05646v2


Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity
http://arxiv.org/abs/1911.03700v3


Sentence Simplification with Deep Reinforcement Learning
http://arxiv.org/abs/1703.10931v2


Sentences with Gapping: Parsing and Reconstructing Elided Predicates
http://arxiv.org/abs/1804.06922v1


SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics
http://arxiv.org/abs/2005.04114v4


SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge
http://arxiv.org/abs/1911.02493v3


Seq2Edits: Sequence Transduction Using Span-level Edit Operations
http://arxiv.org/abs/2009.11136v1


SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup
http://arxiv.org/abs/2010.02322v1


Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation
http://arxiv.org/abs/1906.01569v1


Sequence-Level Knowledge Distillation
http://arxiv.org/abs/1606.07947v4


Sequence-Level Mixed Sample Data Augmentation
http://arxiv.org/abs/2011.09039v1


Sequence-to-Action: End-to-End Semantic Graph Generation for Semantic Parsing
http://arxiv.org/abs/1809.00773v1


Sequential Cooperative Bayesian Inference
http://arxiv.org/abs/2002.05706v3


Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-based Chatbots
http://arxiv.org/abs/1612.01627v2


Sequential Transfer in Reinforcement Learning with a Generative Model
http://arxiv.org/abs/2007.00722v1


Serverless inferencing on Kubernetes
http://arxiv.org/abs/2007.07366v2


Set Functions for Time Series
http://arxiv.org/abs/1909.12064v3


Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events
http://arxiv.org/abs/2004.04295v1


Shape of synth to come: Why we should use synthetic data for English surface realization
http://arxiv.org/abs/2005.02693v1


Shaping Visual Representations with Language for Few-shot Classification
http://arxiv.org/abs/1911.02683v2


Shared-Private Bilingual Word Embeddings for Neural Machine Translation
http://arxiv.org/abs/1906.03100v1


Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models
http://arxiv.org/abs/1902.00194v3


Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion
http://arxiv.org/abs/2003.04493v2


Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU
http://arxiv.org/abs/1705.01991v1


Sharper bounds for uniformly stable algorithms
http://arxiv.org/abs/1910.07833v2


Sheaf Neural Networks
http://arxiv.org/abs/2012.06333v1


SherLIiC: A Typed Event-Focused Lexical Inference Benchmark for Evaluating Natural Language Inference
http://arxiv.org/abs/1906.01393v1


Short-Term Meaning Shift: A Distributional Exploration
http://arxiv.org/abs/1809.03169v3


Should All Cross-Lingual Embeddings Speak English?
http://arxiv.org/abs/1911.03058v2


Showing Your Work Doesn't Always Work
http://arxiv.org/abs/2004.13705v1


SimGANs: Simulator-Based Generative Adversarial Networks for ECG Synthesis to Improve Deep ECG Classification
http://arxiv.org/abs/2006.15353v1


SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity
http://arxiv.org/abs/1608.00869v4


Similarity Analysis of Contextual Word Representation Models
http://arxiv.org/abs/2005.01172v1


Simple Unsupervised Summarization by Contextual Matching
http://arxiv.org/abs/1907.13337v1


Simple and Deep Graph Convolutional Networks
http://arxiv.org/abs/2007.02133v1


Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives
http://arxiv.org/abs/1905.10847v1


Simple and Effective Multi-Paragraph Reading Comprehension
http://arxiv.org/abs/1710.10723v2


Simple and Effective Text Simplification Using Semantic and Neural Methods
http://arxiv.org/abs/1810.05104v1


Simple and sharp analysis of k-means


SimpleQuestions Nearly Solved: A New Upperbound and Baseline Approach
http://arxiv.org/abs/1804.08798v1


Simpler but More Accurate Semantic Dependency Parsing
http://arxiv.org/abs/1807.01396v1


Simplify the Usage of Lexicon in Chinese NER
http://arxiv.org/abs/1908.05969v2


Simplifying Neural Machine Translation with Addition-Subtraction Twin-Gated Recurrent Networks
http://arxiv.org/abs/1810.12546v1


Simulator Calibration under Covariate Shift with Kernels
http://arxiv.org/abs/1809.08159v4


Simultaneous Inference for Massive Data: Distributed Bootstrap
http://arxiv.org/abs/2002.08443v1


Simultaneous Machine Translation with Visual Context
http://arxiv.org/abs/2009.07310v3


Simultaneous Translation Policies: From Fixed to Adaptive
http://arxiv.org/abs/2004.13169v2


Simultaneous Translation with Flexible Policy via Restricted Imitation Learning
http://arxiv.org/abs/1906.01135v2


Simultaneous paraphrasing and translation by fine-tuning Transformer models
http://arxiv.org/abs/2005.05570v1


Single Model Ensemble using Pseudo-Tags and Distinct Vectors
http://arxiv.org/abs/2005.00879v1


Single Point Transductive Prediction
http://arxiv.org/abs/1908.02341v4


Single Shot Multitask Pedestrian Detection and Behavior Prediction
http://arxiv.org/abs/2101.02232v1


Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language
http://arxiv.org/abs/2004.12440v2


Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation
http://arxiv.org/abs/1805.10209v2


Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory
http://arxiv.org/abs/1809.05296v5


Sketch-Driven Regular Expression Generation from Natural Language and Examples
http://arxiv.org/abs/1908.05848v2


Sketching Transformed Matrices with Applications to Natural Language Processing
http://arxiv.org/abs/2002.09812v1


Skill Transfer via Partially Amortized Hierarchical Planning
http://arxiv.org/abs/2011.13897v1


SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling
http://arxiv.org/abs/2010.02693v2


Small Data, Big Decisions: Model Selection in the Small-Data Regime
http://arxiv.org/abs/2009.12583v1


Small-GAN: Speeding Up GAN Training Using Core-sets
http://arxiv.org/abs/1910.13540v1


Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes
http://arxiv.org/abs/1909.02553v4


Social Bias Frames: Reasoning about Social and Power Implications of Language
http://arxiv.org/abs/1911.03891v3


Social Biases in NLP Models as Barriers for Persons with Disabilities
http://arxiv.org/abs/2005.00813v1


Social Chemistry 101: Learning to Reason about Social and Moral Norms
http://arxiv.org/abs/2011.00620v1


Social Media Attributions in the Context of Water Crisis
http://arxiv.org/abs/2001.01697v1


Soft Gazetteers for Low-Resource Named Entity Recognition
http://arxiv.org/abs/2005.01866v1


Soft Threshold Weight Reparameterization for Learnable Sparsity
http://arxiv.org/abs/2002.03231v9


SoftSort: A Continuous Relaxation for the argsort Operator
http://arxiv.org/abs/2006.16038v1


Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs
http://arxiv.org/abs/2007.01231v2


Solving Constrained CASH Problems with ADMM
http://arxiv.org/abs/2006.09635v2


Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity
http://arxiv.org/abs/1908.11071v1


Solving General Arithmetic Word Problems
http://arxiv.org/abs/1608.01413v2


Solving Physics Puzzles by Reasoning about Paths
http://arxiv.org/abs/2011.07357v1


Source Separation with Deep Generative Priors
http://arxiv.org/abs/2002.07942v2


Sources of Transfer in Multilingual Named Entity Recognition
http://arxiv.org/abs/2005.00847v1


Span Selection Pre-training for Question Answering
http://arxiv.org/abs/1909.04120v2


Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles
http://arxiv.org/abs/1612.06475v1


Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations
http://arxiv.org/abs/2005.08866v2


Span-based Localizing Network for Natural Language Video Localization
http://arxiv.org/abs/2004.13931v2


Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3)
http://arxiv.org/abs/2003.13785v1


SpanBERT: Improving Pre-training by Representing and Predicting Spans
http://arxiv.org/abs/1907.10529v3


Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling
http://arxiv.org/abs/1612.07130v1


Sparse Gaussian Processes with Spherical Harmonic Features
http://arxiv.org/abs/2006.16649v1


Sparse Orthogonal Variational Inference for Gaussian Processes
http://arxiv.org/abs/1910.10596v3


Sparse Overcomplete Word Vector Representations
http://arxiv.org/abs/1506.02004v1


Sparse Parallel Training of Hierarchical Dirichlet Process Topic Models
http://arxiv.org/abs/1906.02416v2


Sparse Sinkhorn Attention
http://arxiv.org/abs/2002.11296v1


Sparse Text Generation
http://arxiv.org/abs/2004.02644v3


Sparse and Constrained Attention for Neural Machine Translation
http://arxiv.org/abs/1805.08241v1


Sparse and Low-rank Tensor Estimation via Cubic Sketchings
http://arxiv.org/abs/1801.09326v4


Sparsified Linear Programming for Zero-Sum Equilibrium Finding
http://arxiv.org/abs/2006.03451v2


SpatialSim: Recognizing Spatial Configurations of Objects with Graph Neural Networks
http://arxiv.org/abs/2004.04546v2


Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback
http://arxiv.org/abs/2005.02539v2


Speaker Sensitive Response Evaluation Model
http://arxiv.org/abs/2006.07015v1


Speakers Fill Lexical Semantic Gaps with Context
http://arxiv.org/abs/2010.02172v2


Specialising Word Vectors for Lexical Entailment
http://arxiv.org/abs/1710.06371v2


Spectral Clustering with Graph Neural Networks for Graph Pooling
http://arxiv.org/abs/1907.00481v6


Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence
http://arxiv.org/abs/2006.01719v4


Spectral Subsampling MCMC for Stationary Time Series
http://arxiv.org/abs/1910.13627v2


Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
http://arxiv.org/abs/2002.02561v6


Speech Translation and the End-to-End Promise: Taking Stock of Where We Are
http://arxiv.org/abs/2004.06358v1


Speeding Up Neural Machine Translation Decoding by Cube Pruning
http://arxiv.org/abs/1809.02992v1


SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check
http://arxiv.org/abs/2004.14166v2


Spelling Error Correction with Soft-Masked BERT
http://arxiv.org/abs/2005.07421v1


Split and Rephrase
http://arxiv.org/abs/1707.06971v1


Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems
http://arxiv.org/abs/2010.02140v1


Spying on your neighbors: Fine-grained probing of contextual embeddings for information about surrounding words
http://arxiv.org/abs/2005.01810v1


SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
http://arxiv.org/abs/2006.11316v1


Stabilizing Bi-Level Hyperparameter Optimization using Moreau-Yosida Regularization
http://arxiv.org/abs/2007.13322v1


Stabilizing Differentiable Architecture Search via Perturbation-based Regularization
http://arxiv.org/abs/2002.05283v3


Stabilizing Transformers for Reinforcement Learning
http://arxiv.org/abs/1910.06764v1


Stack-Pointer Networks for Dependency Parsing
http://arxiv.org/abs/1805.01087v1


Stance Prediction and Claim Verification: An Arabic Perspective
http://arxiv.org/abs/2005.10410v1


Stance Prediction for Contemporary Issues: Data and Experiments
http://arxiv.org/abs/2006.00052v1


Stanza: A Python Natural Language Processing Toolkit for Many Human Languages
http://arxiv.org/abs/2003.07082v2


State Space Expectation Propagation: Efficient Inference Schemes for Temporal Gaussian Processes
http://arxiv.org/abs/2007.05994v1


Statistical Machine Translation Features with Multitask Tensor Networks
http://arxiv.org/abs/1506.00698v1


Statistically Efficient Off-Policy Policy Gradients
http://arxiv.org/abs/2002.04014v2


Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization
http://arxiv.org/abs/2002.10726v1


Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation
http://arxiv.org/abs/1905.12255v3


Staying True to Your Word: (How) Can Attention Become Explanation?
http://arxiv.org/abs/2005.09379v1


Stepwise Extractive Summarization and Planning with Structured Transformers
http://arxiv.org/abs/2010.02744v1


Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning
http://arxiv.org/abs/2001.03898v3


Stereo Endoscopic Image Super-Resolution Using Disparity-Constrained Parallel Attention
http://arxiv.org/abs/2003.08539v1


Stimulating Creativity with FunLines: A Case Study of Humor Generation in Headlines
http://arxiv.org/abs/2002.02031v1


Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization
http://arxiv.org/abs/2003.05482v1


Stochastic Differential Equations with Variational Wishart Diffusions
http://arxiv.org/abs/2006.14895v1


Stochastic Flows and Geometric Optimization on the Orthogonal Group
http://arxiv.org/abs/2003.13563v1


Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization
http://arxiv.org/abs/2002.11860v5


Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization
http://arxiv.org/abs/2002.07290v2


Stochastic Gradient and Langevin Processes
http://arxiv.org/abs/1907.03215v7


Stochastic Hamiltonian Gradient Methods for Smooth Games
http://arxiv.org/abs/2007.04202v1


Stochastic Latent Residual Video Prediction
http://arxiv.org/abs/2002.09219v4


Stochastic Linear Contextual Bandits with Diverse Contexts
http://arxiv.org/abs/2003.02681v1


Stochastic Neural Network with Kronecker Flow
http://arxiv.org/abs/1906.04282v2


Stochastic Normalizing Flows
http://arxiv.org/abs/2002.09547v2


Stochastic Optimization for Regularized Wasserstein Estimators
http://arxiv.org/abs/2002.08695v1


Stochastic Particle-Optimization Sampling and the Non-Asymptotic Convergence Theory
http://arxiv.org/abs/1809.01293v5


Stochastic Recursive Variance-Reduced Cubic Regularization Methods
http://arxiv.org/abs/1901.11518v2


Stochastic Regret Minimization in Extensive-Form Games
http://arxiv.org/abs/2002.08493v1


Stochastic Subspace Cubic Newton Method
http://arxiv.org/abs/2002.09526v1


Stochastic Top-k ListNet
http://arxiv.org/abs/1511.00271v1


Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation
http://arxiv.org/abs/1806.08462v2


Stochastic bandits with arm-dependent delays
http://arxiv.org/abs/2006.10459v1


Stochastic-YOLO: Efficient Probabilistic Object Detection under Dataset Shifts
http://arxiv.org/abs/2009.02967v2


Stochastically Dominant Distributional Reinforcement Learning
http://arxiv.org/abs/1905.07318v4


Stochasticity in Neural ODEs: An Empirical Study
http://arxiv.org/abs/2002.09779v2


Stolen Probability: A Structural Weakness of Neural Language Models
http://arxiv.org/abs/2005.02433v1


Stopping criterion for active learning based on deterministic generalization bounds
http://arxiv.org/abs/2005.07402v1


Straight to the Tree: Constituency Parsing with Neural Syntactic Distance
http://arxiv.org/abs/1806.04168v1


Strategic Classification is Causal Modeling in Disguise
http://arxiv.org/abs/1910.10362v3


Strategies for Structuring Story Generation
http://arxiv.org/abs/1902.01109v2


Strategizing against No-regret Learners
http://arxiv.org/abs/1909.13861v1


Streamlining Tensor and Network Pruning in PyTorch
http://arxiv.org/abs/2004.13770v1


Strength from Weakness: Fast Learning Using Weak Supervision
http://arxiv.org/abs/2002.08483v1


Stretching the Effectiveness of MLE from Accuracy to Bias for Pairwise Comparisons
http://arxiv.org/abs/1906.04066v1


Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling
http://arxiv.org/abs/1910.02208v4


Strong Baselines for Neural Semi-supervised Learning under Domain Shift
http://arxiv.org/abs/1804.09530v1


Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks
http://arxiv.org/abs/1712.01969v2


Strong and Simple Baselines for Multimodal Utterance Embeddings
http://arxiv.org/abs/1906.02125v2


Stronger and Faster Wasserstein Adversarial Attacks
http://arxiv.org/abs/2008.02883v1


StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing
http://arxiv.org/abs/1806.07832v1


Structural Language Models of Code
http://arxiv.org/abs/1910.00577v4


Structural Neural Encoders for AMR-to-text Generation
http://arxiv.org/abs/1903.11410v2


Structural Scaffolds for Citation Intent Classification in Scientific Publications
http://arxiv.org/abs/1904.01608v2


Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models
http://arxiv.org/abs/2010.05725v1


Structure Adaptive Algorithms for Stochastic Bandits
http://arxiv.org/abs/2007.00969v1


Structure Aware Negative Sampling in Knowledge Graphs
http://arxiv.org/abs/2009.11355v2


Structure Mapping for Transferability of Causal Models
http://arxiv.org/abs/2007.09445v1


Structure-Level Knowledge Distillation For Multilingual Sequence Labeling
http://arxiv.org/abs/2004.03846v3


Structured Attention for Unsupervised Dialogue Structure Induction
http://arxiv.org/abs/2009.08552v2


Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis
http://arxiv.org/abs/2002.11332v1


Structured Minimally Supervised Learning for Neural Relation Extraction
http://arxiv.org/abs/1904.00118v5


Structured Multi-Label Biomedical Text Tagging via Attentive Neural Tree Decoding
http://arxiv.org/abs/1810.01468v1


Structured Policy Iteration for Linear Quadratic Regulator
http://arxiv.org/abs/2007.06202v1


Structured Prediction with Partial Labelling through the Infimum Loss
http://arxiv.org/abs/2003.00920v2


Structured Pruning of Large Language Models
http://arxiv.org/abs/1910.04732v1


Structured Training for Neural Network Transition-Based Parsing
http://arxiv.org/abs/1506.06158v1


Structured Tuning for Semantic Role Labeling
http://arxiv.org/abs/2005.00496v2


Student-Teacher Curriculum Learning via Reinforcement Learning: Predicting Hospital Inpatient Admission Location
http://arxiv.org/abs/2007.01135v1


Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages
http://arxiv.org/abs/1903.06400v2


Style Transfer Through Back-Translation
http://arxiv.org/abs/1804.09000v3


Sub-Instruction Aware Vision-and-Language Navigation
http://arxiv.org/abs/2004.02707v2


Subgoal Discovery for Hierarchical Dialogue Policy Learning
http://arxiv.org/abs/1804.07855v3


SubjQA: A Dataset for Subjectivity and Review Comprehension
http://arxiv.org/abs/2004.14283v3


Sublinear Optimal Policy Value Estimation in Contextual Bandits
http://arxiv.org/abs/1912.06111v2


Substance over Style: Document-Level Targeted Content Transfer
http://arxiv.org/abs/2010.08618v1


Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates
http://arxiv.org/abs/1804.10959v1


Subword-Level Language Identification for Intra-Word Code-Switching
http://arxiv.org/abs/1904.01989v1


Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture
http://arxiv.org/abs/2005.03454v2


Summarizing Opinions: Aspect Extraction Meets Sentiment Prediction and They Are Both Weakly Supervised
http://arxiv.org/abs/1808.08858v1


Summarizing Text on Any Aspects: A Knowledge-Informed Weakly-Supervised Approach
http://arxiv.org/abs/2010.06792v2


Super-efficiency of automatic differentiation for functions defined as a minimum
http://arxiv.org/abs/2002.03722v1


Supermasks in Superposition
http://arxiv.org/abs/2006.14769v3


Supertagging Combinatory Categorial Grammar with Attentive Graph Convolutional Networks
http://arxiv.org/abs/2010.06115v2


Supervised Attentions for Neural Machine Translation
http://arxiv.org/abs/1608.00112v1


Supervised Domain Enablement Attention for Personalized Domain Classification
http://arxiv.org/abs/1812.07546v1


Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi
http://arxiv.org/abs/2004.10353v2


Supervised Learning of Universal Sentence Representations from Natural Language Inference Data
http://arxiv.org/abs/1705.02364v5


Supervised Learning: No Loss No Cry
http://arxiv.org/abs/2002.03555v1


Supervised Seeded Iterated Learning for Interactive Language Learning
http://arxiv.org/abs/2010.02975v1


Support recovery and sup-norm convergence rates for sparse pivotal estimation
http://arxiv.org/abs/2001.05401v3


Surrogate sea ice model enables efficient tuning
http://arxiv.org/abs/2006.12977v1


SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation
http://arxiv.org/abs/1808.07512v2


Symbolic Network: Generalized Neural Policies for Relational MDPs
http://arxiv.org/abs/2002.07375v2


Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation
http://arxiv.org/abs/2004.08694v3


SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and Synonym Discovery
http://arxiv.org/abs/2009.13827v1


Synchronous Bidirectional Neural Machine Translation
http://arxiv.org/abs/1905.04847v1


Syntactic Data Augmentation Increases Robustness to Inference Heuristics
http://arxiv.org/abs/2004.11999v1


Syntactic Scaffolds for Semantic Structures
http://arxiv.org/abs/1808.10485v1


Syntactic Search by Example
http://arxiv.org/abs/2006.03010v1


Syntactic Structure Distillation Pretraining For Bidirectional Encoders
http://arxiv.org/abs/2005.13482v1


Syntax-Enhanced Neural Machine Translation with Syntax-Aware Word Representations
http://arxiv.org/abs/1905.02878v1


Syntax-guided Controlled Generation of Paraphrases
http://arxiv.org/abs/2005.08417v1


T-Basis: a Compact Representation for Neural Networks
http://arxiv.org/abs/2007.06631v1


T-GD: Transferable GAN-generated Images Detection Framework
http://arxiv.org/abs/2008.04115v1


T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack
http://arxiv.org/abs/1912.10375v2


TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task
http://arxiv.org/abs/2004.14855v1


TAG : Type Auxiliary Guiding for Code Comment Generation
http://arxiv.org/abs/2005.02835v1


TAPAS: Weakly Supervised Table Parsing via Pre-training
http://arxiv.org/abs/2004.02349v2


TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue
http://arxiv.org/abs/2004.06871v3


TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions
http://arxiv.org/abs/2005.00242v2


TUDataset: A collection of benchmark datasets for learning with graphs
http://arxiv.org/abs/2007.08663v1


TUNIZI: a Tunisian Arabizi sentiment analysis Dataset
http://arxiv.org/abs/2004.14303v1


TVQA+: Spatio-Temporal Grounding for Video Question Answering
http://arxiv.org/abs/1904.11574v2


TVQA: Localized, Compositional Video Question Answering
http://arxiv.org/abs/1809.01696v2


TWEETQA: A Social Media Focused Question Answering Dataset
http://arxiv.org/abs/1907.06292v1


TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories
http://arxiv.org/abs/2004.13852v2


TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data
http://arxiv.org/abs/2005.08314v1


Tabula nearly rasa: Probing the Linguistic Knowledge of Character-Level Neural Language Models Trained on Unsegmented Text
http://arxiv.org/abs/1906.07285v1


Tackling the Low-resource Challenge for Canonical Segmentation
http://arxiv.org/abs/2010.02804v1


Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time
http://arxiv.org/abs/2009.10623v2


Tails of Lipschitz Triangular Flows
http://arxiv.org/abs/1907.04481v3


Taking a hint: How to leverage loss predictors in contextual bandits?
http://arxiv.org/abs/2003.01922v2


Talk to Papers: Bringing Neural Question Answering to Academic Search
http://arxiv.org/abs/2004.02002v3


Talking to the crowd: What do people react to in online discussions?
http://arxiv.org/abs/1507.02205v2


Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics
http://arxiv.org/abs/2006.06264v2


Target Conditioned Sampling: Optimizing Data Selection for Multilingual Neural Machine Translation
http://arxiv.org/abs/1905.08212v1


Target-Guided Open-Domain Conversation
http://arxiv.org/abs/1905.11553v2


Targeted Syntactic Evaluation of Language Models
http://arxiv.org/abs/1808.09031v1


Task-Oriented Dialogue as Dataflow Synthesis
http://arxiv.org/abs/2009.11423v2


Task-Oriented Query Reformulation with Reinforcement Learning
http://arxiv.org/abs/1704.04572v4


TaskNorm: Rethinking Batch Normalization for Meta-Learning
http://arxiv.org/abs/2003.03284v2


Tasty Burgers, Soggy Fries: Probing Aspect Robustness in Aspect-Based Sentiment Analysis
http://arxiv.org/abs/2009.07964v4


TaxiNLI: Taking a Ride up the NLU Hill
http://arxiv.org/abs/2009.14505v3


Taxonomy of Dual Block-Coordinate Ascent Methods for Discrete Energy Minimization
http://arxiv.org/abs/2004.07715v1


Taylor Expansion Policy Optimization
http://arxiv.org/abs/2003.06259v1


TeMP: Temporal Message Passing for Temporal Knowledge Graph Completion
http://arxiv.org/abs/2010.03526v1


TeaForN: Teacher-Forcing with N-grams
http://arxiv.org/abs/2010.03494v2


Teacher-Student Domain Adaptation for Biosensor Models
http://arxiv.org/abs/2003.07896v2


Teacher-Student chain for efficient semi-supervised histology image classification
http://arxiv.org/abs/2003.08797v2


Technology Readiness Levels for Machine Learning Systems
http://arxiv.org/abs/2101.03989v1


Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space
http://arxiv.org/abs/2010.01475v1


Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions
http://arxiv.org/abs/1801.09041v1


Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering
http://arxiv.org/abs/2004.11892v1


Temporal Common Sense Acquisition with Minimal Supervision
http://arxiv.org/abs/2005.04304v1


Temporal Information Extraction by Predicting Relative Time-lines
http://arxiv.org/abs/1808.09401v1


Temporal Mental Health Dynamics on Social Media
http://arxiv.org/abs/2008.13121v3


Temporal Phenotyping using Deep Predictive Clustering of Disease Progression
http://arxiv.org/abs/2006.08600v1


Temporally-Continuous Probabilistic Prediction using Polynomial Trajectory Parameterization
http://arxiv.org/abs/2011.00399v1


TenIPS: Inverse Propensity Sampling for Tensor Completion
http://arxiv.org/abs/2101.00323v1


Tensor Fusion Network for Multimodal Sentiment Analysis
http://arxiv.org/abs/1707.07250v1


Tensor denoising and completion based on ordinal observations
http://arxiv.org/abs/2002.06524v3


Tensors over Semirings for Latent-Variable Weighted Logic Programs
http://arxiv.org/abs/2006.04232v1


TernaryBERT: Distillation-aware Ultra-low Bit BERT
http://arxiv.org/abs/2009.12812v3


Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
http://arxiv.org/abs/1909.13231v3


Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference
http://arxiv.org/abs/1904.09745v2


Text Classification Using Label Names Only: A Language Model Self-Training Approach
http://arxiv.org/abs/2010.07245v1


Text Classification with Few Examples using Controlled Generalization
http://arxiv.org/abs/2005.08469v1


Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems
http://arxiv.org/abs/1903.11508v2


Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates
http://arxiv.org/abs/2005.00649v1


Text to 3D Scene Generation with Rich Lexical Grounding
http://arxiv.org/abs/1505.06289v2


Text-Based Ideal Points
http://arxiv.org/abs/2005.04232v2


TextAttack: Lessons learned in designing Python frameworks for NLP
http://arxiv.org/abs/2010.01724v1


TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing
http://arxiv.org/abs/2002.12620v2


TextHide: Tackling Data Privacy in Language Understanding Tasks
http://arxiv.org/abs/2010.06053v1


That is a Known Lie: Detecting Previously Fact-Checked Claims
http://arxiv.org/abs/2005.06058v1


The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers
http://arxiv.org/abs/1905.12676v2


The ADAPT Enhanced Dependency Parser at the IWPT 2020 Shared Task
http://arxiv.org/abs/2009.01712v1


The Area of the Convex Hull of Sampled Curves: a Robust Functional Statistical Depth Measure
http://arxiv.org/abs/1910.04085v2


The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants
http://arxiv.org/abs/1708.01425v4


The Boomerang Sampler
http://arxiv.org/abs/2006.13777v2


The Cascade Transformer: an Application for Efficient Answer Sentence Selection
http://arxiv.org/abs/2005.02534v2


The Complexity of Finding Stationary Points with Stochastic Gradient Descent
http://arxiv.org/abs/1910.01845v2


The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions
http://arxiv.org/abs/2004.13606v2


The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
http://arxiv.org/abs/1911.03768v2


The EOS Decision and Length Extrapolation
http://arxiv.org/abs/2010.07174v1


The Effect of Natural Distribution Shift on Question Answering Models
http://arxiv.org/abs/2004.14444v1


The Expressive Power of a Class of Normalizing Flow Models
http://arxiv.org/abs/2006.00392v1


The FAST Algorithm for Submodular Maximization
http://arxiv.org/abs/1907.06173v1


The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions
http://arxiv.org/abs/2003.03830v2


The Galactic Dependencies Treebanks: Getting More Data by Synthesizing New Languages
http://arxiv.org/abs/1710.03838v1


The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits
http://arxiv.org/abs/2001.05452v3


The Grammar of Emergent Languages
http://arxiv.org/abs/2010.02069v2


The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent
http://arxiv.org/abs/1904.06963v5


The Implicit Regularization of Ordinary Least Squares Ensembles
http://arxiv.org/abs/1910.04743v2


The Implicit Regularization of Stochastic Gradient Flow for Least Squares
http://arxiv.org/abs/2003.07802v2


The Implicit and Explicit Regularization Effects of Dropout
http://arxiv.org/abs/2002.12915v3


The Importance of Being Recurrent for Modeling Hierarchical Structure
http://arxiv.org/abs/1803.03585v2


The Importance of Category Labels in Grammar Induction with Child-directed Utterances
http://arxiv.org/abs/2006.11646v1


The Influence of Shape Constraints on the Thresholding Bandit Problem
http://arxiv.org/abs/2006.10006v2


The Interplay between Lexical Resources and Natural Language Processing
http://arxiv.org/abs/1807.00571v1


The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation
http://arxiv.org/abs/1906.01528v2


The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task
http://arxiv.org/abs/2010.13192v1


The Language of Legal and Illegal Activity on the Darknet
http://arxiv.org/abs/1905.05543v2


The Lipschitz Constant of Self-Attention
http://arxiv.org/abs/2006.04710v1


The Lower The Simpler: Simplifying Hierarchical Recurrent Models
http://arxiv.org/abs/1809.02790v4


The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
http://arxiv.org/abs/1808.00023v2


The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding
http://arxiv.org/abs/2002.07972v2


The Multilingual Amazon Reviews Corpus
http://arxiv.org/abs/2010.02573v1


The NarrativeQA Reading Comprehension Challenge
http://arxiv.org/abs/1712.07040v1


The NetHack Learning Environment
http://arxiv.org/abs/2006.13760v2


The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization
http://arxiv.org/abs/2008.06786v1


The Non-IID Data Quagmire of Decentralized Machine Learning
http://arxiv.org/abs/1910.00189v2


The Paradigm Discovery Problem
http://arxiv.org/abs/2005.01630v1


The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue
http://arxiv.org/abs/1906.01530v2


The Power Spherical distribution
http://arxiv.org/abs/2006.04437v2


The Power of Batching in Multiple Hypothesis Testing
http://arxiv.org/abs/1910.04968v2


The Referential Reader: A Recurrent Entity Network for Anaphora Resolution
http://arxiv.org/abs/1902.01541v2


The Return of Lexical Dependencies: Neural Lexicalized PCFGs
http://arxiv.org/abs/2007.15135v1


The Right Tool for the Job: Matching Model and Instance Complexities
http://arxiv.org/abs/2004.07453v2


The SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion
http://arxiv.org/abs/2005.13756v1


The SOFC-Exp Corpus and Neural Approaches to Information Extraction in the Materials Science Domain
http://arxiv.org/abs/2006.03039v1


The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures
http://arxiv.org/abs/2001.11136v2


The Sensitivity of Language Models and Humans to Winograd Schema Perturbations
http://arxiv.org/abs/2005.01348v2


The State and Fate of Linguistic Diversity and Inclusion in the NLP World
http://arxiv.org/abs/2004.09095v2


The Sylvester Graphical Lasso (SyGlasso)
http://arxiv.org/abs/2002.00288v1


The TechQA Dataset
http://arxiv.org/abs/1911.02984v1


The Tree Ensemble Layer: Differentiability meets Conditional Computation
http://arxiv.org/abs/2002.07772v2


The True Sample Complexity of Identifying Good Arms
http://arxiv.org/abs/1906.06594v1


The Unreasonable Volatility of Neural Machine Translation Models
http://arxiv.org/abs/2005.12398v1


The Unstoppable Rise of Computational Linguistics in Deep Learning
http://arxiv.org/abs/2005.06420v3


The Usual Suspects? Reassessing Blame for VAE Posterior Collapse
http://arxiv.org/abs/1912.10702v1


The Volctrans Machine Translation System for WMT20
http://arxiv.org/abs/2010.14806v2


The Web as a Knowledge-base for Answering Complex Questions
http://arxiv.org/abs/1803.06643v1


The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection
http://arxiv.org/abs/2004.02421v4


The continuous categorical: a novel simplex-valued exponential family
http://arxiv.org/abs/2002.08563v2


The cost-free nature of optimally tuning Tikhonov regularizers and other ordered smoothers
http://arxiv.org/abs/1905.12517v1


The elephant in the interpretability room: Why use attention as explanation when we have saliency methods?
http://arxiv.org/abs/2010.05607v1


The emergence of number and syntax units in LSTM language models
http://arxiv.org/abs/1903.07435v2


The equivalence between Stein variational gradient descent and black-box variational inference
http://arxiv.org/abs/2004.01822v1


The importance of fillers for text representations of speech transcripts
http://arxiv.org/abs/2009.11340v2


The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks
http://arxiv.org/abs/2002.02655v2


The many Shapley values for model explanation
http://arxiv.org/abs/1908.08474v2


The perceptual boost of visual attention is task-dependent in naturalistic settings
http://arxiv.org/abs/2003.00882v2


The role of context in neural pitch accent detection in English
http://arxiv.org/abs/2004.14846v2


The role of regularization in classification of high-dimensional noisy Gaussian mixture
http://arxiv.org/abs/2002.11544v1


The unreasonable effectiveness of Batch-Norm statistics in addressing catastrophic forgetting across medical institutions
http://arxiv.org/abs/2011.08096v1


Theoretical Limitations of Self-Attention in Neural Sequence Models
http://arxiv.org/abs/1906.06755v2


Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning
http://arxiv.org/abs/2009.07445v1


Thermodynamic Consistent Neural Networks for Learning Material Interfacial Mechanics
http://arxiv.org/abs/2011.14172v1


Thompson Sampling Algorithms for Mean-Variance Bandits
http://arxiv.org/abs/2002.00232v3


Thompson Sampling for Linearly Constrained Bandits
http://arxiv.org/abs/2004.09258v2


Thompson Sampling via Local Uncertainty
http://arxiv.org/abs/1910.13673v3


Thresholding Bandit Problem with Both Duels and Pulls
http://arxiv.org/abs/1910.06368v2


Thresholding Graph Bandits with GrAPL
http://arxiv.org/abs/1905.09190v3


Tied Multitask Learning for Neural Speech Translation
http://arxiv.org/abs/1802.06655v2


Tight Differential Privacy for Discrete-Valued Mechanisms and for the Subsampled Gaussian Mechanism Using FFT
http://arxiv.org/abs/2006.07134v2


Tight Lower Bounds for Combinatorial Multi-Armed Bandits
http://arxiv.org/abs/2002.05392v3


Tightening Exploration in Upper Confidence Reinforcement Learning
http://arxiv.org/abs/2004.09656v2


Tigrinya Neural Machine Translation with Transfer Learning for Humanitarian Response
http://arxiv.org/abs/2003.11523v1


Tilde at WMT 2020: News Task Systems
http://arxiv.org/abs/2010.15423v1


Time Adaptive Reinforcement Learning
http://arxiv.org/abs/2004.08600v1


Time Dependence in Non-Autonomous Neural ODEs
http://arxiv.org/abs/2005.01906v2


Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders
http://arxiv.org/abs/1902.00450v4


Time Series Source Separation with Slow Flows
http://arxiv.org/abs/2007.10182v1


Time-aware Large Kernel Convolutions
http://arxiv.org/abs/2002.03184v2


Tiny Video Networks
http://arxiv.org/abs/1910.06961v1


To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging
http://arxiv.org/abs/2010.14042v1


To Schedule or not to Schedule: Extracting Task Specific Temporal Entities and Associated Negation Constraints
http://arxiv.org/abs/2012.02594v1


To Test Machine Comprehension, Start by Defining Comprehension
http://arxiv.org/abs/2005.01525v2


ToTTo: A Controlled Table-To-Text Generation Dataset
http://arxiv.org/abs/2004.14373v3


Token-level and sequence-level loss smoothing for RNN language models
http://arxiv.org/abs/1805.05062v1


Top-Rank-Focused Adaptive Vote Collection for the Evaluation of Domain-Specific Semantic Models
http://arxiv.org/abs/2010.04486v1


Topic Memory Networks for Short Text Classification
http://arxiv.org/abs/1809.03664v1


Topic Modeling in Embedding Spaces
http://arxiv.org/abs/1907.04907v1


Topic Modeling via Full Dependence Mixtures
http://arxiv.org/abs/1906.06181v3


Topic Sensitive Attention on Generic Corpora Corrects Sense Bias in Pretrained Embeddings
http://arxiv.org/abs/1906.02688v2


Topically Driven Neural Language Model
http://arxiv.org/abs/1704.08012v2


Topological Autoencoders
http://arxiv.org/abs/1906.00722v4


Topological Sort for Sentence Ordering
http://arxiv.org/abs/2005.00432v1


Topologically Densified Distributions
http://arxiv.org/abs/2002.04805v1


Torch-Struct: Deep Structured Prediction Library
http://arxiv.org/abs/2002.00876v1


Toward A Neuro-inspired Creative Decoder
http://arxiv.org/abs/1902.02399v4


Toward Better Storylines with Sentence-Level Language Models
http://arxiv.org/abs/2005.05255v1


Toward Fast and Accurate Neural Discourse Segmentation
http://arxiv.org/abs/1808.09147v1


Toward Gender-Inclusive Coreference Resolution
http://arxiv.org/abs/1910.13913v4


Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments
http://arxiv.org/abs/2010.04900v2


Towards A Sign Language Gloss Representation Of Modern Standard Arabic
http://arxiv.org/abs/2005.01497v1


Towards Accurate and Reliable Energy Measurement of NLP Models
http://arxiv.org/abs/2010.05248v1


Towards Content Transfer through Grounded Text Generation
http://arxiv.org/abs/1905.05293v1


Towards Conversational Recommendation over Multi-Type Dialogs
http://arxiv.org/abs/2005.03954v3


Towards Debiasing NLU Models from Unknown Biases
http://arxiv.org/abs/2009.12303v4


Towards Debiasing Sentence Representations
http://arxiv.org/abs/2007.08100v1


Towards Dynamic Computation Graphs via Sparse Latent Structure
http://arxiv.org/abs/1809.00653v1


Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning
http://arxiv.org/abs/2009.13891v3


Towards End-to-End In-Image Neural Machine Translation
http://arxiv.org/abs/2010.10648v1


Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access
http://arxiv.org/abs/1609.00777v3


Towards Explainable Graph Representations in Digital Pathology
http://arxiv.org/abs/2007.00311v1


Towards Exploiting Background Knowledge for Building Conversation Systems
http://arxiv.org/abs/1809.08205v1


Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints
http://arxiv.org/abs/2005.00969v1


Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness?
http://arxiv.org/abs/2004.03685v3


Towards Induction of Structured Phoneme Inventories
http://arxiv.org/abs/2010.05959v1


Towards Interpretable Reasoning over Paragraph Effects in Situation
http://arxiv.org/abs/2010.01272v1


Towards Interpreting BERT for Reading Comprehension Based QA
http://arxiv.org/abs/2010.08983v1


Towards Map-Based Validation of Semantic Segmentation Masks
http://arxiv.org/abs/2011.08008v2


Towards Multimodal Simultaneous Neural Machine Translation
http://arxiv.org/abs/2004.03180v2


Towards Near-imperceptible Steganographic Text
http://arxiv.org/abs/1907.06679v2


Towards Neural Machine Translation for Edoid Languages
http://arxiv.org/abs/2003.10704v1


Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation
http://arxiv.org/abs/2005.11355v1


Towards Persona-Based Empathetic Conversational Models
http://arxiv.org/abs/2004.12316v7


Towards Physics-informed Deep Learning for Turbulent Flow Prediction
http://arxiv.org/abs/1911.08655v4


Towards Reasonably-Sized Character-Level Transformer NMT by Finetuning Subword Systems
http://arxiv.org/abs/2004.14280v2


Towards Robustifying NLI Models Against Lexical Dataset Biases
http://arxiv.org/abs/2005.04732v2


Towards String-to-Tree Neural Machine Translation
http://arxiv.org/abs/1704.04743v3


Towards Supervised and Unsupervised Neural Machine Translation Baselines for Nigerian Pidgin
http://arxiv.org/abs/2003.12660v1


Towards Transparent and Explainable Attention Models
http://arxiv.org/abs/2004.14243v1


Towards Understanding Gender Bias in Relation Extraction
http://arxiv.org/abs/1911.03642v3


Towards Understanding the Dynamics of the First-Order Adversaries
http://arxiv.org/abs/2010.10650v1


Towards Understanding the Regularization of Adversarial Robustness on Neural Networks
http://arxiv.org/abs/2011.07478v1


Towards Universal Dialogue State Tracking
http://arxiv.org/abs/1810.09587v1


Towards Unsupervised Language Understanding and Generation by Joint Dual Learning
http://arxiv.org/abs/2004.14710v1


Towards a General Theory of Infinite-Width Limits of Neural Classifiers
http://arxiv.org/abs/2003.05884v3


Towards a predictive spatio-temporal representation of brain data
http://arxiv.org/abs/2003.03290v1


Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses
http://arxiv.org/abs/1708.07149v2


Towards classification parity across cohorts
http://arxiv.org/abs/2005.08033v1


Towards intervention-centric causal reasoning in learning agents
http://arxiv.org/abs/2005.12968v1


Toxicity Detection: Does Context Really Matter?
http://arxiv.org/abs/2006.00998v1


Train No Evil: Selective Masking for Task-Guided Pre-Training
http://arxiv.org/abs/2004.09733v2


Trainable Greedy Decoding for Neural Machine Translation
http://arxiv.org/abs/1702.02429v1


Training Binary Neural Networks through Learning with Noisy Supervision
http://arxiv.org/abs/2010.04871v1


Training Binary Neural Networks using the Bayesian Learning Rule
http://arxiv.org/abs/2002.10778v4


Training Classifiers with Natural Language Explanations
http://arxiv.org/abs/1805.03818v4


Training Deep Energy-Based Models with f-Divergence Minimization
http://arxiv.org/abs/2003.03463v2


Training Linear Neural Networks: Non-Local Convergence and Complexity Results
http://arxiv.org/abs/2002.09852v3


Training Millions of Personalized Dialogue Agents
http://arxiv.org/abs/1809.01984v1


Training Neural Networks for and by Interpolation
http://arxiv.org/abs/1906.05661v2


Training Production Language Models without Memorizing User Data
http://arxiv.org/abs/2009.10031v1


Training Question Answering Models From Synthetic Data
http://arxiv.org/abs/2002.09599v1


Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration
http://arxiv.org/abs/1906.10114v2


TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics
http://arxiv.org/abs/2002.04461v2


TransQuest at WMT2020: Sentence-Level Direct Assessment
http://arxiv.org/abs/2010.05318v1


Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages
http://arxiv.org/abs/2010.03179v1


Transfer Learning of Photometric Phenotypes in Agriculture Using Metadata
http://arxiv.org/abs/2004.00303v1


Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources
http://arxiv.org/abs/2007.08714v2


Transfer NAS: Knowledge Transfer between Search Spaces with Transformer Agents
http://arxiv.org/abs/1906.08102v1


Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems
http://arxiv.org/abs/1905.08743v2


Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya
http://arxiv.org/abs/2006.07698v2


Transform the Set: Memory Attentive Generation of Guided and Unguided Image Collages
http://arxiv.org/abs/1910.07236v2


Transformation Importance with Applications to Cosmology
http://arxiv.org/abs/2003.01926v1


Transformation Networks for Target-Oriented Sentiment Classification
http://arxiv.org/abs/1805.01086v1


Transformer Based Multi-Source Domain Adaptation
http://arxiv.org/abs/2009.07806v1


Transformer Hawkes Process
http://arxiv.org/abs/2002.09291v4


Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
http://arxiv.org/abs/1901.02860v3


Transformer-based Context-aware Sarcasm Detection in Conversation Threads from Social Media
http://arxiv.org/abs/2005.11424v1


Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
http://arxiv.org/abs/2006.16236v3


Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering
http://arxiv.org/abs/2004.03561v2


Transformers without Tears: Improving the Normalization of Self-Attention
http://arxiv.org/abs/1910.05895v2


Transforming Complex Sentences into a Semantic Hierarchy
http://arxiv.org/abs/1906.01038v1


Transition-Based Dependency Parsing with Stack Long Short-Term Memory
http://arxiv.org/abs/1505.08075v1


Transition-based Semantic Dependency Parsing with Pointer Networks
http://arxiv.org/abs/2005.13344v2


Translating Natural Language Instructions for Behavioral Robot Navigation with a Multi-Head Attention Mechanism
http://arxiv.org/abs/2006.00697v3


Translating Neuralese
http://arxiv.org/abs/1704.06960v5


Translating Similar Languages: Role of Mutual Intelligibility in Multilingual Transformers
http://arxiv.org/abs/2011.05037v1


Translation Artifacts in Cross-lingual Transfer Learning
http://arxiv.org/abs/2004.04721v4


Translationese as a Language in "Multilingual" NMT
http://arxiv.org/abs/1911.03823v2


Traversing Knowledge Graphs in Vector Space
http://arxiv.org/abs/1506.01094v2


Tree-Projected Gradient Descent for Estimating Gradient-Sparse Parameters on Graphs
http://arxiv.org/abs/2006.01662v1


Treebank Embedding Vectors for Out-of-domain Dependency Parsing
http://arxiv.org/abs/2005.00800v1


Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time
http://arxiv.org/abs/2005.10865v1


Triangular Architecture for Rare Language Translation
http://arxiv.org/abs/1805.04813v2


TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition
http://arxiv.org/abs/2004.07493v4


Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning
http://arxiv.org/abs/2004.03168v1


Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems
http://arxiv.org/abs/2002.09611v2


Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data
http://arxiv.org/abs/1909.10158v2


Two Routes to Scalable Credit Assignment without Weight Symmetry
http://arxiv.org/abs/2003.01513v2


Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders
http://arxiv.org/abs/2010.03851v1


Two-sample Testing Using Deep Learning
http://arxiv.org/abs/1910.06239v2


TwoWingOS: A Two-Wing Optimization Strategy for Evidential Claim Verification
http://arxiv.org/abs/1808.03465v2


TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages
http://arxiv.org/abs/2003.05002v1


Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias
http://arxiv.org/abs/2009.11982v2


UDapter: Language Adaptation for Truly Universal Dependency Parsing
http://arxiv.org/abs/2004.14327v2


UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation
http://arxiv.org/abs/2009.07602v1


USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation
http://arxiv.org/abs/2005.00456v1


Ultra-Fine Entity Typing
http://arxiv.org/abs/1807.04905v1


Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels
http://arxiv.org/abs/2007.02235v3


Uncertain Natural Language Inference
http://arxiv.org/abs/1909.03042v2


Uncertainty Estimation Using a Single Deep Deterministic Neural Network
http://arxiv.org/abs/2003.02037v2


Uncertainty Estimation in Cancer Survival Prediction
http://arxiv.org/abs/2003.08573v2


Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery
http://arxiv.org/abs/2003.01753v1


Uncertainty Quantification for Sparse Deep Learning
http://arxiv.org/abs/2002.11815v2


Uncertainty in Neural Networks: Approximately Bayesian Ensembling
http://arxiv.org/abs/1810.05546v5


Uncertainty in Neural Relational Inference Trajectory Reconstruction
http://arxiv.org/abs/2006.13666v2


Uncertainty quantification using martingales for misspecified Gaussian processes
http://arxiv.org/abs/2006.07368v1


Uncertainty-Aware Label Refinement for Sequence Labeling
http://arxiv.org/abs/2012.10608v1


Uncertainty-Aware Semantic Augmentation for Neural Machine Translation
http://arxiv.org/abs/2010.04411v1


Uncertainty-Aware Vehicle Orientation Estimation for Joint Detection-Prediction Models
http://arxiv.org/abs/2011.03114v1


Uncovering the Folding Landscape of RNA Secondary Structure with Deep Graph Embeddings
http://arxiv.org/abs/2006.06885v2


Understanding Climate Impacts on Vegetation with Gaussian Processes in Granger Causality
http://arxiv.org/abs/2012.03338v1


Understanding Dataset Design Choices for Multi-hop Reasoning
http://arxiv.org/abs/1904.12106v1


Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study
http://arxiv.org/abs/1702.04811v3


Understanding Generalization in Deep Learning via Tensor Methods
http://arxiv.org/abs/2001.05070v2


Understanding Learned Reward Functions
http://arxiv.org/abs/2012.05862v1


Understanding Neural Abstractive Summarization Models via Uncertainty
http://arxiv.org/abs/2010.07882v1


Understanding Points of Correspondence between Sentences for Abstractive Summarization
http://arxiv.org/abs/2006.05621v1


Understanding Self-Attention of Self-Supervised Audio Transformers
http://arxiv.org/abs/2006.03265v2


Understanding Self-Training for Gradual Domain Adaptation
http://arxiv.org/abs/2002.11361v1


Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection
http://arxiv.org/abs/1704.05753v2


Understanding Undesirable Word Embedding Associations
http://arxiv.org/abs/1908.06361v1


Understanding Unintended Memorization in Federated Learning
http://arxiv.org/abs/2006.07490v1


Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
http://arxiv.org/abs/1906.02762v1


Understanding and Mitigating the Tradeoff Between Robustness and Accuracy
http://arxiv.org/abs/2002.10716v2


Understanding language-elicited EEG data by predicting it from a fine-tuned language model
http://arxiv.org/abs/1904.01548v1


Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling
http://arxiv.org/abs/1910.06508v2


Understanding the Difficulty of Training Transformers
http://arxiv.org/abs/2004.08249v2


Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle
http://arxiv.org/abs/2007.03509v1


Understanding the Intrinsic Robustness of Image Distributions using Conditional Generative Models
http://arxiv.org/abs/2003.00378v1


Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning
http://arxiv.org/abs/2010.02357v1


Understanding the robustness of deep neural network classifiers for breast cancer screening
http://arxiv.org/abs/2003.10041v1


Undirected Graphical Models as Approximate Posteriors
http://arxiv.org/abs/1901.03440v2


Unfolding and Shrinking Neural Machine Translation Ensembles
http://arxiv.org/abs/1704.03279v2


UniConv: A Unified Conversational Neural Architecture for Multi-domain Task-oriented Dialogues
http://arxiv.org/abs/2004.14307v2


Unified Pragmatic Models for Generating and Following Instructions
http://arxiv.org/abs/1711.04987v3


Unifying Human and Statistical Evaluation for Natural Language Generation
http://arxiv.org/abs/1904.02792v1


Universal Approximation Property of Neural Ordinary Differential Equations
http://arxiv.org/abs/2012.02414v1


Universal Approximation with Deep Narrow Networks
http://arxiv.org/abs/1905.08539v2


Universal Average-Case Optimality of Polyak Momentum
http://arxiv.org/abs/2002.04664v3


Universal Decompositional Semantic Parsing
http://arxiv.org/abs/1910.10138v3


Universal Equivariant Multilayer Perceptrons
http://arxiv.org/abs/2002.02912v2


Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start
http://arxiv.org/abs/2010.02584v1


Universal Neural Machine Translation for Extremely Low Resource Languages
http://arxiv.org/abs/1802.05368v2


Universal Semantic Parsing
http://arxiv.org/abs/1702.03196v4


Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift
http://arxiv.org/abs/2006.14988v1


Unlocking the Potential of Deep Counterfactual Value Networks
http://arxiv.org/abs/2007.10442v1


Unnatural Language Processing: Bridging the Gap Between Synthetic and Natural Language Data
http://arxiv.org/abs/2004.13645v1


Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach
http://arxiv.org/abs/1805.05181v2


Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks
http://arxiv.org/abs/2002.06753v3


Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering
http://arxiv.org/abs/2005.01218v1


Unsupervised Commonsense Question Answering with Self-Talk
http://arxiv.org/abs/2004.05483v2


Unsupervised Cross-lingual Transfer of Word Embedding Spaces
http://arxiv.org/abs/1809.03633v1


Unsupervised Discovery of Implicit Gender Bias
http://arxiv.org/abs/2004.08361v2


Unsupervised Discovery of Interpretable Directions in the GAN Latent Space
http://arxiv.org/abs/2002.03754v3


Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation
http://arxiv.org/abs/1804.08069v1


Unsupervised Domain Adaptation for Visual Navigation
http://arxiv.org/abs/2010.14543v2


Unsupervised Domain Clusters in Pretrained Language Models
http://arxiv.org/abs/2004.02105v2


Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing
http://arxiv.org/abs/2005.13485v3


Unsupervised Grammar Induction with Depth-bounded PCFG
http://arxiv.org/abs/1802.08545v2


Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces
http://arxiv.org/abs/1911.02536v2


Unsupervised Identification of Translationese
http://arxiv.org/abs/1609.03205v1


Unsupervised Induction of Semantic Roles within a Reconstruction-Error Minimization Framework
http://arxiv.org/abs/1412.2812v1


Unsupervised Learning of Morphological Forests
http://arxiv.org/abs/1702.07015v1


Unsupervised Learning of Syntactic Structure with Invertible Neural Projections
http://arxiv.org/abs/1808.09111v1


Unsupervised Morphological Paradigm Completion
http://arxiv.org/abs/2005.00970v2


Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting
http://arxiv.org/abs/2005.03119v1


Unsupervised Natural Language Inference via Decoupled Multimodal Contrastive Learning
http://arxiv.org/abs/2010.08200v1


Unsupervised Neural Machine Translation with Weight Sharing
http://arxiv.org/abs/1804.09057v1


Unsupervised Online Grounding of Natural Language during Human-Robot Interactions
http://arxiv.org/abs/2007.04304v1


Unsupervised Opinion Summarization as Copycat-Review Generation
http://arxiv.org/abs/1911.02247v2


Unsupervised Opinion Summarization with Noising and Denoising
http://arxiv.org/abs/2004.10150v1


Unsupervised Paraphrasing by Simulated Annealing
http://arxiv.org/abs/1909.03588v2


Unsupervised Parsing via Constituency Tests
http://arxiv.org/abs/2010.03146v1


Unsupervised Pidgin Text Generation By Pivoting English Data and Self-Training
http://arxiv.org/abs/2003.08272v1


Unsupervised Pivot Translation for Distant Languages
http://arxiv.org/abs/1906.02461v3


Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction
http://arxiv.org/abs/2001.10603v2


Unsupervised Quality Estimation for Neural Machine Translation
http://arxiv.org/abs/2005.10608v2


Unsupervised Question Answering by Cloze Translation
http://arxiv.org/abs/1906.04980v2


Unsupervised Question Decomposition for Question Answering
http://arxiv.org/abs/2002.09758v3


Unsupervised Recurrent Neural Network Grammars
http://arxiv.org/abs/1904.03746v6


Unsupervised Reference-Free Summary Quality Evaluation via Contrastive Learning
http://arxiv.org/abs/2010.01781v1


Unsupervised Speech Decomposition via Triple Information Bottleneck
http://arxiv.org/abs/2004.11284v5


Unsupervised Statistical Machine Translation
http://arxiv.org/abs/1809.01272v1


Unsupervised Text Style Transfer with Padded Masked Language Models
http://arxiv.org/abs/2010.01054v1


Unsupervised Transfer Learning for Spatiotemporal Predictive Networks
http://arxiv.org/abs/2009.11763v1


Unsupervised deep clustering for predictive texture pattern discovery in medical images
http://arxiv.org/abs/2002.03721v1


Up or Down? Adaptive Rounding for Post-Training Quantization
http://arxiv.org/abs/2004.10568v2


Urban Driving with Conditional Imitation Learning
http://arxiv.org/abs/1912.00177v2


Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection
http://arxiv.org/abs/1906.06703v1


Using Context in Neural Machine Translation Training Objectives
http://arxiv.org/abs/2005.01483v1


Using Convolutional Variational Autoencoders to Predict Post-Trauma Health Outcomes from Actigraphy Data
http://arxiv.org/abs/2011.07406v2


Using Large Pretrained Language Models for Answering User Queries from Product Specifications
http://arxiv.org/abs/2005.14613v1


Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers
http://arxiv.org/abs/1708.00160v2


Using Natural Language Relations between Answer Choices for Machine Comprehension
http://arxiv.org/abs/2012.15837v1


Using Punkt for Sentence Segmentation in non-Latin Scripts: Experiments on Kurdish (Sorani) Texts
http://arxiv.org/abs/2004.14134v2


Using Type Information to Improve Entity Coreference Resolution
http://arxiv.org/abs/2010.05738v1


Using competency questions to select optimal clustering structures for residential energy consumption patterns
http://arxiv.org/abs/2006.00934v1


Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm
http://arxiv.org/abs/1708.00524v2


Utility is in the Eye of the User: A Critique of NLP Leaderboards
http://arxiv.org/abs/2009.13888v3


Utility/Privacy Trade-off through the lens of Optimal Transport
http://arxiv.org/abs/1905.11148v3


VCDM: Leveraging Variational Bi-encoding and Deep Contextualized Word Representations for Improved Definition Modeling
http://arxiv.org/abs/2010.03124v1


VD-BERT: A Unified Vision and Dialog Transformer with BERT
http://arxiv.org/abs/2004.13278v3


VFlow: More Expressive Generative Flows with Variational Data Augmentation
http://arxiv.org/abs/2002.09741v2


Validated Variational Inference via Practical Posterior Error Bounds
http://arxiv.org/abs/1910.04102v4


Validation of Approximate Likelihood and Emulator Models for Computationally Intensive Simulations
http://arxiv.org/abs/1905.11505v2


Variable Skipping for Autoregressive Range Density Estimation
http://arxiv.org/abs/2007.05572v1


Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems
http://arxiv.org/abs/2002.04670v1


Variance Reduction for Matrix Games
http://arxiv.org/abs/1907.02056v2


Variance Reduction in Stochastic Particle-Optimization Sampling
http://arxiv.org/abs/1811.08052v1


Variational Autoencoders and Nonlinear ICA: A Unifying Framework
http://arxiv.org/abs/1907.04809v4


Variational Autoencoders for Sparse and Overdispersed Discrete Data
http://arxiv.org/abs/1905.00616v2


Variational Autoencoders with Riemannian Brownian Motion Priors
http://arxiv.org/abs/2002.05227v3


Variational Bayesian Quantization
http://arxiv.org/abs/2002.08158v2


Variational Depth Search in ResNets
http://arxiv.org/abs/2002.02797v4


Variational Inference for Learning Representations of Natural Language Edits
http://arxiv.org/abs/2004.09143v4


Variational Inference with Continuously-Indexed Normalizing Flows
http://arxiv.org/abs/2007.05426v1


Variational Knowledge Graph Reasoning
http://arxiv.org/abs/1803.06581v3


Variational Neural Machine Translation with Normalizing Flows
http://arxiv.org/abs/2005.13978v1


Variational Optimization on Lie Groups, with Examples of Leading (Generalized) Eigenvalue Problems
http://arxiv.org/abs/2001.10006v1


Variational Pretraining for Semi-supervised Text Classification
http://arxiv.org/abs/1906.02242v1


Variational Sequential Labelers for Semi-Supervised Learning
http://arxiv.org/abs/1906.09535v1


Vector-Vector-Matrix Architecture: A Novel Hardware-Aware Framework for Low-Latency Inference in NLP Applications
http://arxiv.org/abs/2010.08412v1


Vehicle Trajectory Prediction by Transfer Learning of Semi-Supervised Models
http://arxiv.org/abs/2007.06781v2


Verb Physics: Relative Physical Knowledge of Actions and Objects
http://arxiv.org/abs/1706.03799v2


Video Prediction via Example Guidance
http://arxiv.org/abs/2007.01738v1


Video-Grounded Dialogues with Pretrained Generation Language Models
http://arxiv.org/abs/2006.15319v1


Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning
http://arxiv.org/abs/2003.05162v3


Visual Grounding of Learned Physical Models
http://arxiv.org/abs/2004.13664v2


Visually Grounded Continual Learning of Compositional Phrases
http://arxiv.org/abs/2005.00785v5


Visually Grounded Neural Syntax Acquisition
http://arxiv.org/abs/1906.02890v2


Voice Separation with an Unknown Number of Multiple Speakers
http://arxiv.org/abs/2003.01531v4


Volctrans Parallel Corpus Filtering System for WMT 2020
http://arxiv.org/abs/2010.14029v1


Wandering Within a World: Online Contextualized Few-Shot Learning
http://arxiv.org/abs/2007.04546v2


Wasserstein Control of Mirror Langevin Monte Carlo
http://arxiv.org/abs/2002.04363v1


Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains
http://arxiv.org/abs/2010.07717v2


Wasserstein Smoothing: Certified Robustness against Wasserstein Adversarial Attacks
http://arxiv.org/abs/1910.10783v1


Wasserstein Style Transfer
http://arxiv.org/abs/1905.12828v1


WaveFlow: A Compact Flow-based Model for Raw Audio
http://arxiv.org/abs/1912.01219v4


WaveNODE: A Continuous Normalizing Flow for Speech Synthesis
http://arxiv.org/abs/2006.04598v4


We Can Detect Your Bias: Predicting the Political Ideology of News Articles
http://arxiv.org/abs/2010.05338v1


WeChat Neural Machine Translation Systems for WMT20
http://arxiv.org/abs/2010.00247v2


Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging
http://arxiv.org/abs/2003.09070v1


Weakly Supervised Learning of Nuanced Frames for Analyzing Polarization in News Media
http://arxiv.org/abs/2009.09609v1


Weakly Supervised Medication Regimen Extraction from Medical Conversations
http://arxiv.org/abs/2010.05317v1


Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding
http://arxiv.org/abs/2010.06705v1


Weakly-Supervised Disentanglement Without Compromises
http://arxiv.org/abs/2002.02886v4


Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video
http://arxiv.org/abs/1906.02549v1


WeatherBench: A benchmark dataset for data-driven weather forecasting
http://arxiv.org/abs/2002.00469v3


Weight Poisoning Attacks on Pre-trained Models
http://arxiv.org/abs/2004.06660v1


Weird AI Yankovic: Generating Parody Lyrics
http://arxiv.org/abs/2009.12240v1


Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings
http://arxiv.org/abs/1904.01543v3


What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models
http://arxiv.org/abs/1907.13528v2


What Can Learned Intrinsic Rewards Capture?
http://arxiv.org/abs/1912.05500v3


What Can We Learn from Collective Human Opinions on Natural Language Inference Data?
http://arxiv.org/abs/2010.03532v2


What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes
http://arxiv.org/abs/2011.05064v1


What Do Position Embeddings Learn? An Empirical Study of Pre-Trained Language Model Positional Encoding
http://arxiv.org/abs/2010.04903v1


What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge
http://arxiv.org/abs/1912.13337v2


What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets
http://arxiv.org/abs/2007.03626v1


What Happens To BERT Embeddings During Fine-tuning?
http://arxiv.org/abs/2004.14448v1


What Have We Achieved on Text Summarization?
http://arxiv.org/abs/2010.04529v1


What Kind of Language Is Hard to Language-Model?
http://arxiv.org/abs/1906.04726v2


What Makes Reading Comprehension Questions Easier?
http://arxiv.org/abs/1808.09384v1


What Question Answering can Learn from Trivia Nerds
http://arxiv.org/abs/1910.14464v3


What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context
http://arxiv.org/abs/2005.04518v1


What You Say and How You Say it: Joint Modeling of Topics and Discourse in Microblog Conversations
http://arxiv.org/abs/1903.07319v1


What Your Username Says About You
http://arxiv.org/abs/1507.02045v2


What are the Goals of Distributional Semantics?
http://arxiv.org/abs/2005.02982v1


What are the Statistical Limits of Offline RL with Linear Function Approximation?
http://arxiv.org/abs/2010.11895v1


What do Models Learn from Question Answering Datasets?
http://arxiv.org/abs/2004.03490v2


What do Neural Machine Translation Models Learn about Morphology?
http://arxiv.org/abs/1704.03471v3


What is Learned in Visually Grounded Neural Syntax Acquisition
http://arxiv.org/abs/2005.01678v2


What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization?
http://arxiv.org/abs/1902.00618v3


What is More Likely to Happen Next? Video-and-Language Future Event Prediction
http://arxiv.org/abs/2010.07999v1


What makes a good conversation? How controllable attributes affect human judgments
http://arxiv.org/abs/1902.08654v2


What's in a Name? Are BERT Named Entity Representations just as Good for any other Name?
http://arxiv.org/abs/2007.06897v1


When Are Tree Structures Necessary for Deep Learning of Representations?
http://arxiv.org/abs/1503.00185v5


When BERT Plays the Lottery, All Tickets Are Winning
http://arxiv.org/abs/2005.00561v2


When Does Self-Supervision Help Graph Convolutional Networks?
http://arxiv.org/abs/2006.09136v4


When Does Unsupervised Machine Translation Work?
http://arxiv.org/abs/2004.05516v3


When Explanations Lie: Why Many Modified BP Attributions Fail
http://arxiv.org/abs/1912.09818v6


When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models
http://arxiv.org/abs/2010.04941v1


When and Why is Unsupervised Neural Machine Translation Useless?
http://arxiv.org/abs/2004.10581v1


When deep denoising meets iterative phase retrieval
http://arxiv.org/abs/2003.01792v1


When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People?
http://arxiv.org/abs/2004.12043v1


Where Are You? Localization from Embodied Dialog
http://arxiv.org/abs/2011.08277v1


Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News
http://arxiv.org/abs/2010.03159v1


Where's the Question? A Multi-channel Deep Convolutional Neural Network for Question Identification in Textual Data
http://arxiv.org/abs/2010.07816v1


Which Tasks Should Be Learned Together in Multi-task Learning?
http://arxiv.org/abs/1905.07553v4


Who did What: A Large-Scale Person-Centered Cloze Dataset
http://arxiv.org/abs/1608.05457v1


Whodunnit? Crime Drama as a Case for Natural Language Understanding
http://arxiv.org/abs/1710.11601v1


Why Non-myopic Bayesian Optimization is Promising and How Far Should We Look-ahead? A Study via Rollout
http://arxiv.org/abs/1911.01004v2


Why Normalizing Flows Fail to Detect Out-of-Distribution Data
http://arxiv.org/abs/2006.08545v1


Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries
http://arxiv.org/abs/2005.00524v1


Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures
http://arxiv.org/abs/1808.08946v3


Why Skip If You Can Combine: A Simple Knowledge Distillation Technique for Intermediate Layers
http://arxiv.org/abs/2010.03034v1


Why bigger is not always better: on finite and infinite neural networks
http://arxiv.org/abs/1910.08013v3


Why is unsupervised alignment of English embeddings from different algorithms so hard?
http://arxiv.org/abs/1809.00150v1


Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements
http://arxiv.org/abs/2010.04295v1


Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks
http://arxiv.org/abs/2007.02901v1


WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community
http://arxiv.org/abs/1810.13181v1


Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness
http://arxiv.org/abs/2004.05816v2


Will-They-Won't-They: A Very Large Dataset for Stance Detection on Twitter
http://arxiv.org/abs/2005.00388v1


Winning on the Merits: The Joint Effects of Content and Style on Debate Outcomes
http://arxiv.org/abs/1705.05040v1


WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge
http://arxiv.org/abs/2005.05763v1


With Little Power Comes Great Responsibility
http://arxiv.org/abs/2010.06595v1


Woodbury Transformations for Deep Generative Flows
http://arxiv.org/abs/2002.12229v3


Word Embeddings for Chemical Patent Natural Language Processing
http://arxiv.org/abs/2010.12912v1


Word Frequency Does Not Predict Grammatical Knowledge in Language Models
http://arxiv.org/abs/2010.13870v1


Word Ordering Without Syntax
http://arxiv.org/abs/1604.08633v2


Word Rotator's Distance
http://arxiv.org/abs/2004.15003v3


Word class flexibility: A deep contextualized approach
http://arxiv.org/abs/2009.09241v1


Word-level Speech Recognition with a Letter to Word Encoder
http://arxiv.org/abs/1906.04323v2


Word-level Textual Adversarial Attacking as Combinatorial Optimization
http://arxiv.org/abs/1910.12196v4


Word-order biases in deep-agent emergent communication
http://arxiv.org/abs/1905.12330v3


Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions
http://arxiv.org/abs/2005.01655v1


Working Memory Networks: Augmenting Memory Networks with a Relational Reasoning Module
http://arxiv.org/abs/1805.09354v1


World Model as a Graph: Learning Latent Landmarks for Planning
http://arxiv.org/abs/2011.12491v1


Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation
http://arxiv.org/abs/2005.10678v2


X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models
http://arxiv.org/abs/2010.06189v3


X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers
http://arxiv.org/abs/2009.11278v1


XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation
http://arxiv.org/abs/2004.01401v3


XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization
http://arxiv.org/abs/2010.06478v1


XLNet: Generalized Autoregressive Pretraining for Language Understanding
http://arxiv.org/abs/1906.08237v2


XLVIN: eXecuted Latent Value Iteration Nets
http://arxiv.org/abs/2010.13146v2


XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization
http://arxiv.org/abs/2003.11080v5


Xiaomingbot: A Multilingual Robot News Reporter
http://arxiv.org/abs/2007.08005v1


XtarNet: Learning to Extract Task-Adaptive Representation for Incremental Few-Shot Learning
http://arxiv.org/abs/2003.08561v2


XtremeDistil: Multi-stage Distillation for Massive Multilingual Models
http://arxiv.org/abs/2004.05686v2


YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design
http://arxiv.org/abs/2009.05697v2


You Impress Me: Dialogue Generation via Mutual Persona Perception
http://arxiv.org/abs/2004.05388v1


Zeno++: Robust Fully Asynchronous SGD
http://arxiv.org/abs/1903.07020v4


Zero-Resource Translation with Multi-Lingual Neural Machine Translation
http://arxiv.org/abs/1606.04164v1


Zero-Shot Cross-Lingual Opinion Target Extraction
http://arxiv.org/abs/1904.09122v1


Zero-Shot Stance Detection: A Dataset and Model using Generalized Topic Representations
http://arxiv.org/abs/2010.03640v1


Zero-Shot Transfer Learning for Event Extraction
http://arxiv.org/abs/1707.01066v1


Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking
http://arxiv.org/abs/2005.00891v1


Zero-Shot Translation Quality Estimation with Explicit Cross-Lingual Patterns
http://arxiv.org/abs/2010.04989v1


Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens
http://arxiv.org/abs/1805.02214v1


Zero-shot User Intent Detection via Capsule Neural Networks
http://arxiv.org/abs/1809.00385v1


ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages
http://arxiv.org/abs/2005.07105v1


doc2dial: A Goal-Oriented Document-Grounded Dialogue Dataset
http://arxiv.org/abs/2011.06623v2


emrQA: A Large Corpus for Question Answering on Electronic Medical Records
http://arxiv.org/abs/1809.00732v1


giotto-tda: A Topological Data Analysis Toolkit for Machine Learning and Data Exploration
http://arxiv.org/abs/2004.02551v1


i-RIM applied to the fastMRI challenge
http://arxiv.org/abs/1910.08952v1


iNLTK: Natural Language Toolkit for Indic Languages
http://arxiv.org/abs/2009.12534v2


iSarcasm: A Dataset of Intended Sarcasm
http://arxiv.org/abs/1911.03123v2


jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models
http://arxiv.org/abs/2003.02249v2


k-simplex2vec: a simplicial extension of node2vec
http://arxiv.org/abs/2010.05636v2


pyBART: Evidence-based Syntactic Transformations for IE
http://arxiv.org/abs/2005.01306v2


scGNN: scRNA-seq Dropout Imputation via Induced Hierarchical Cell Similarity Graph
http://arxiv.org/abs/2008.03322v1


schuBERT: Optimizing Elements of BERT
http://arxiv.org/abs/2005.06628v1


simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions
http://arxiv.org/abs/1808.08732v1