@amitness

amitness/papers.md

Created Jan 17, 2021
title url
(Locally) Differentially Private Combinatorial Semi-Bandits http://arxiv.org/abs/2006.00706v2
(Re)construing Meaning in NLP http://arxiv.org/abs/2005.09099v1
2kenize: Tying Subword Sequences for Chinese Script Conversion http://arxiv.org/abs/2005.03375v1
3D-LaneNet+: Anchor Free Lane Detection using a Semi-Local Representation http://arxiv.org/abs/2011.01535v2
A Batch Normalized Inference Network Keeps the KL Vanishing Away http://arxiv.org/abs/2004.12585v2
A Benchmark of Medical Out of Distribution Detection http://arxiv.org/abs/2007.04250v2
A Bilingual Generative Transformer for Semantic Sentence Embedding http://arxiv.org/abs/1911.03895v2
A Boolean Task Algebra for Reinforcement Learning http://arxiv.org/abs/2001.01394v2
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference http://arxiv.org/abs/1704.05426v4
A Call for More Rigor in Unsupervised Cross-lingual Learning http://arxiv.org/abs/2004.14958v1
A Characterization of Mean Squared Error for Estimator with Bagging http://arxiv.org/abs/1908.02718v1
A Closer Look at Accuracy vs. Robustness http://arxiv.org/abs/2003.02460v3
A Closer Look at Small-loss Bounds for Bandits with Graph Feedback http://arxiv.org/abs/2002.00315v2
A Co-Matching Model for Multi-choice Reading Comprehension http://arxiv.org/abs/1806.04068v1
A Computational Approach to Understanding Empathy Expressed in Text-Based Mental Health Support http://arxiv.org/abs/2009.08441v1
A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking http://arxiv.org/abs/2006.01554v2
A Continuous-time Perspective for Modeling Acceleration in Riemannian Optimization http://arxiv.org/abs/1910.10782v3
A Convolutional Encoder Model for Neural Machine Translation http://arxiv.org/abs/1611.02344v3
A Corpus for Large-Scale Phonetic Typology http://arxiv.org/abs/2005.13962v1
A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature http://arxiv.org/abs/1806.04185v1
A Cross-Task Analysis of Text Span Representations http://arxiv.org/abs/2006.03866v1
A Crowdsourced Frame Disambiguation Corpus with Ambiguity http://arxiv.org/abs/1904.06101v1
A Data and Compute Efficient Design for Limited-Resources Deep Learning http://arxiv.org/abs/2004.09691v2
A Data-driven Approach for Noise Reduction in Distantly Supervised Biomedical Relation Extraction http://arxiv.org/abs/2005.12565v1
A Decomposable Attention Model for Natural Language Inference http://arxiv.org/abs/1606.01933v2
A Deep Generative Model for Fragment-Based Molecule Generation http://arxiv.org/abs/2002.12826v1
A Deep Generative Model of Vowel Formant Typology http://arxiv.org/abs/1807.02745v1
A Deep Learning Approach for Determining Effects of Tuta Absoluta in Tomato Plants http://arxiv.org/abs/2004.04023v1
A Deep Learning System for Sentiment Analysis of Service Calls http://arxiv.org/abs/2004.10320v1
A Deep Neural Network Sentence Level Classification Method with Context Information http://arxiv.org/abs/1809.00934v1
A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards http://arxiv.org/abs/2006.15454v1
A Diagnostic Study of Explainability Techniques for Text Classification http://arxiv.org/abs/2009.13295v1
A Differentiable Newton Euler Algorithm for Multi-body Model Learning http://arxiv.org/abs/2010.09802v1
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms http://arxiv.org/abs/2003.12239v1
A Distributional Framework for Data Valuation http://arxiv.org/abs/2002.12334v1
A Distributional View on Multi-Objective Policy Optimization http://arxiv.org/abs/2005.07513v1
A Double Residual Compression Algorithm for Efficient Distributed Learning http://arxiv.org/abs/1910.07561v1
A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates http://arxiv.org/abs/1908.04468v2
A Formal Hierarchy of RNN Architectures http://arxiv.org/abs/2004.08500v4
A Fourier State Space Model for Bayesian ODE Filters http://arxiv.org/abs/2007.09118v2
A Framework and Dataset for Abstract Art Generation via CalligraphyGAN http://arxiv.org/abs/2012.00744v1
A Framework for Sample Efficient Interval Estimation with Control Variates http://arxiv.org/abs/2006.10287v1
A Free-Energy Principle for Representation Learning http://arxiv.org/abs/2002.12406v1
A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing http://arxiv.org/abs/1706.03367v1
A General Framework for Information Extraction using Dynamic Span Graphs http://arxiv.org/abs/1904.03296v1
A Generative Approach to Titling and Clustering Wikipedia Sections http://arxiv.org/abs/2005.11216v1
A Generative Model for Joint Natural Language Understanding and Generation http://arxiv.org/abs/2006.07499v1
A Generative Model for Molecular Distance Geometry http://arxiv.org/abs/1909.11459v4
A Generative Parser with a Discriminative Recognition Algorithm http://arxiv.org/abs/1708.00415v2
A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton http://arxiv.org/abs/2006.04045v2
A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples http://arxiv.org/abs/2010.01345v1
A Girl Has A Name: Detecting Authorship Obfuscation http://arxiv.org/abs/2005.00702v1
A Graph to Graphs Framework for Retrosynthesis Prediction http://arxiv.org/abs/2003.12725v1
A Hierarchical Latent Structure for Variational Conversation Modeling http://arxiv.org/abs/1804.03424v2
A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities http://arxiv.org/abs/1905.13077v1
A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer http://arxiv.org/abs/1906.01833v1
A Hierarchical Transformer for Unsupervised Parsing http://arxiv.org/abs/2003.13841v1
A Hybrid Convolutional Variational Autoencoder for Text Generation http://arxiv.org/abs/1702.02390v1
A Hybrid Stochastic Policy Gradient Algorithm for Reinforcement Learning http://arxiv.org/abs/2003.00430v2
A Joint Named-Entity Recognizer for Heterogeneous Tag-sets Using a Tag Hierarchy http://arxiv.org/abs/1905.09135v2
A Just and Comprehensive Strategy for Using NLP to Address Online Abuse http://arxiv.org/abs/1906.01738v2
A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation http://arxiv.org/abs/2001.05139v1
A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors http://arxiv.org/abs/1805.05388v1
A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal http://arxiv.org/abs/2005.10070v1
A Locally Adaptive Bayesian Cubature Method http://arxiv.org/abs/1910.02995v1
A Meaning-based Statistical English Math Word Problem Solver http://arxiv.org/abs/1803.06064v2
A Mention-Ranking Model for Abstract Anaphora Resolution http://arxiv.org/abs/1706.02256v2
A Meta-Learning Approach for Graph Representation Learning in Multi-Task Settings http://arxiv.org/abs/2012.06755v1
A Methodology for Creating Question Answering Corpora Using Inverse Data Annotation http://arxiv.org/abs/2004.07633v2
A Minimal Span-Based Neural Constituency Parser http://arxiv.org/abs/1705.03919v1
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages http://arxiv.org/abs/2006.06202v2
A Multi-Axis Annotation Scheme for Event Temporal Relations http://arxiv.org/abs/1804.07828v2
A Multi-Perspective Architecture for Semantic Code Search http://arxiv.org/abs/2005.06980v1
A Multi-Task Incremental Learning Framework with Category Name Embedding for Aspect-Category Sentiment Analysis http://arxiv.org/abs/2010.02784v1
A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews http://arxiv.org/abs/2005.13362v2
A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification http://arxiv.org/abs/1807.04990v1
A Multiclass Classification Approach to Label Ranking http://arxiv.org/abs/2002.09420v1
A Multilingual Neural Machine Translation Model for Biomedical Data http://arxiv.org/abs/2008.02878v1
A Multitask Learning Approach for Diacritic Restoration http://arxiv.org/abs/2006.04016v1
A Narration-based Reward Shaping Approach using Grounded Natural Language Commands http://arxiv.org/abs/1911.00497v1
A Nested Attention Neural Hybrid Model for Grammatical Error Correction http://arxiv.org/abs/1707.02026v2
A Neural Attention Model for Abstractive Sentence Summarization http://arxiv.org/abs/1509.00685v2
A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings http://arxiv.org/abs/2008.04702v1
A Neural Model for User Geolocation and Lexical Dialectology http://arxiv.org/abs/1704.04008v3
A Neural Model of Adaptation in Reading http://arxiv.org/abs/1808.09930v2
A Neural Network for Coordination Boundary Prediction http://arxiv.org/abs/1610.03946v1
A Neuro-AI Interface for Evaluating Generative Adversarial Networks http://arxiv.org/abs/2003.03193v2
A New Neural Network Architecture Invariant to the Action of Symmetry Subgroups http://arxiv.org/abs/2012.06452v1
A Nonparametric Off-Policy Policy Gradient http://arxiv.org/abs/2001.02435v3
A Note on Data Biases in Generative Models http://arxiv.org/abs/2012.02516v1
A Note on Over-Smoothing for Graph Neural Networks http://arxiv.org/abs/2006.13318v1
A Novel Cascade Binary Tagging Framework for Relational Triple Extraction http://arxiv.org/abs/1909.03227v4
A Novel Confidence-Based Algorithm for Structured Bandits http://arxiv.org/abs/2005.11593v1
A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation http://arxiv.org/abs/2007.08742v1
A Pairwise Fair and Community-preserving Approach to k-Center Clustering http://arxiv.org/abs/2007.07384v1
A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players http://arxiv.org/abs/1902.01239v4
A Principled Approach to Learning Stochastic Representations for Privacy in Deep Neural Inference http://arxiv.org/abs/2003.12154v1
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning http://arxiv.org/abs/2009.08115v3
A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing http://arxiv.org/abs/2005.01646v1
A Probabilistic Generative Model of Linguistic Typology http://arxiv.org/abs/1903.10950v3
A Probabilistic Model with Commonsense Constraints for Pattern-based Temporal Fact Extraction http://arxiv.org/abs/2006.06436v1
A Re-evaluation of Knowledge Graph Completion Methods http://arxiv.org/abs/1911.03903v3
A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks http://arxiv.org/abs/2005.09606v1
A Reduction from Reinforcement Learning to No-Regret Online Learning http://arxiv.org/abs/1911.05873v2
A Reinforced Generation of Adversarial Examples for Neural Machine Translation http://arxiv.org/abs/1911.03677v2
A Relational Memory-based Embedding Model for Triple Classification and Search Personalization http://arxiv.org/abs/1907.06080v2
A Relaxed Matching Procedure for Unsupervised BLI http://arxiv.org/abs/2010.07095v1
A Report on the 2020 Sarcasm Detection Shared Task http://arxiv.org/abs/2005.05814v2
A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity http://arxiv.org/abs/1906.01926v1
A Rigorous Study on Named Entity Recognition: Can Fine-tuning Pretrained Model Lead to the Promised Land? http://arxiv.org/abs/2004.12126v2
A Sample Complexity Separation between Non-Convex and Convex Meta-Learning http://arxiv.org/abs/2002.11172v1
A Scalable Neural Shortlisting-Reranking Approach for Large-Scale Domain Classification in Natural Language Understanding http://arxiv.org/abs/1804.08064v1
A Self-Training Method for Machine Reading Comprehension with Soft Evidence Extraction http://arxiv.org/abs/2005.05189v2
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition http://arxiv.org/abs/2007.00144v1
A Simple Approach to Learning Unsupervised Multilingual Embeddings http://arxiv.org/abs/2004.05991v2
A Simple Joint Model for Improved Contextual Neural Lemmatization http://arxiv.org/abs/1904.02306v4
A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings http://arxiv.org/abs/1902.00184v1
A Simple Theoretical Model of Importance for Summarization http://arxiv.org/abs/1801.08991v2
A Simple Yet Strong Pipeline for HotpotQA http://arxiv.org/abs/2004.06753v1
A Simple and Effective Model for Answering Multi-span Questions http://arxiv.org/abs/1909.13375v4
A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation http://arxiv.org/abs/1808.06945v2
A Span-based Linearization for Constituent Trees http://arxiv.org/abs/2004.14704v2
A Stein Goodness-of-fit Test for Directional Distributions http://arxiv.org/abs/2002.06843v1
A Stochastic Decoder for Neural Machine Translation http://arxiv.org/abs/1805.10844v1
A Streaming Approach For Efficient Batched Beam Search http://arxiv.org/abs/2010.02164v2
A Study of Deep Learning Colon Cancer Detection in Limited Data Access Scenarios http://arxiv.org/abs/2005.10326v2
A Study of Reinforcement Learning for Neural Machine Translation http://arxiv.org/abs/1808.08866v1
A Study on Encodings for Neural Architecture Search http://arxiv.org/abs/2007.04965v1
A Stylometric Inquiry into Hyperpartisan and Fake News http://arxiv.org/abs/1702.05638v1
A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT http://arxiv.org/abs/2004.14516v1
A Survey on Recognizing Textual Entailment as an NLP Evaluation http://arxiv.org/abs/2010.03061v1
A Syntactic Neural Model for General-Purpose Code Generation http://arxiv.org/abs/1704.01696v1
A System for Worldwide COVID-19 Information Aggregation http://arxiv.org/abs/2008.01523v2
A Systematic Assessment of Syntactic Generalization in Neural Language Models http://arxiv.org/abs/2005.03692v2
A Tale of a Probe and a Parser http://arxiv.org/abs/2005.01641v2
A Theoretical Case Study of Structured Variational Inference for Community Detection http://arxiv.org/abs/1907.12203v5
A Top-Down Neural Architecture towards Text-Level Parsing of Discourse Rhetorical Structure http://arxiv.org/abs/2005.02680v3
A Topology Layer for Machine Learning http://arxiv.org/abs/1905.12200v2
A Trainable Optimal Transport Embedding for Feature Aggregation http://arxiv.org/abs/2006.12065v3
A Transformer-based Approach for Source Code Summarization http://arxiv.org/abs/2005.00653v1
A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis http://arxiv.org/abs/2006.15955v1
A Transition-Based Directed Acyclic Graph Parser for UCCA http://arxiv.org/abs/1704.00552v2
A Two-Stage Masked LM Method for Term Set Expansion http://arxiv.org/abs/2005.01063v1
A Unified Linear-Time Framework for Sentence-Level Discourse Parsing http://arxiv.org/abs/1905.05682v2
A Unified MRC Framework for Named Entity Recognition http://arxiv.org/abs/1910.11476v6
A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss http://arxiv.org/abs/1805.06266v2
A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments http://arxiv.org/abs/1911.00294v2
A Unified Theory of Decentralized SGD with Changing Topology and Local Updates http://arxiv.org/abs/2003.10422v2
A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent http://arxiv.org/abs/1905.11261v1
A Unified View of Label Shift Estimation http://arxiv.org/abs/2003.07554v3
A Visual Attention Grounding Neural Model for Multimodal Machine Translation http://arxiv.org/abs/1808.08266v2
A Wasserstein Minimum Velocity Approach to Learning Unnormalized Models http://arxiv.org/abs/2002.07501v1
A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification http://arxiv.org/abs/1810.05754v1
A greedy anytime algorithm for sparse PCA http://arxiv.org/abs/1910.06846v5
A large annotated corpus for learning natural language inference http://arxiv.org/abs/1508.05326v1
A negative case analysis of visual grounding methods for VQA http://arxiv.org/abs/2004.05704v2
A neurally plausible model learns successor representations in partially observable environments http://arxiv.org/abs/1906.09480v1
A new regret analysis for Adam-type algorithms http://arxiv.org/abs/2003.09729v1
A nonasymptotic law of iterated logarithm for general M-estimators http://arxiv.org/abs/1903.06576v2
A principled approach for generating adversarial images under non-smooth dissimilarity metrics http://arxiv.org/abs/1908.01667v2
A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings http://arxiv.org/abs/1805.06297v2
A single image deep learning approach to restoration of corrupted remote sensing products http://arxiv.org/abs/2004.04209v1
A strong baseline for question relevancy ranking http://arxiv.org/abs/1808.08836v1
AD3: Attentive Deep Document Dater http://arxiv.org/abs/1902.02161v1
ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents http://arxiv.org/abs/2005.01777v1
AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network http://arxiv.org/abs/2009.08229v2
ALICE: Active Learning with Contrastive Natural Language Explanations http://arxiv.org/abs/2009.10259v1
AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC http://arxiv.org/abs/2003.00193v1
AMR Dependency Parsing with a Typed Semantic Algebra http://arxiv.org/abs/1805.11465v1
AMR Parsing as Sequence-to-Graph Transduction http://arxiv.org/abs/1905.08704v2
AMR Parsing via Graph-Sequence Iterative Inference http://arxiv.org/abs/2004.05572v2
AMR-to-text Generation with Synchronous Node Replacement Grammar http://arxiv.org/abs/1702.00500v4
AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning http://arxiv.org/abs/1912.00965v2
AR-DAE: Towards Unbiased Neural Entropy Gradient Estimation http://arxiv.org/abs/2006.05164v1
ASAP: Architecture Search, Anneal and Prune http://arxiv.org/abs/1904.04123v2
ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations http://arxiv.org/abs/2005.00481v1
Abstract Syntax Networks for Code Generation and Semantic Parsing http://arxiv.org/abs/1704.07535v1
Abstraction Mechanisms Predict Generalization in Deep Neural Networks http://arxiv.org/abs/1905.11515v2
Abstractive Multi-Document Summarization via Phrase Selection and Merging http://arxiv.org/abs/1506.01597v2
Abusive Language Detection with Graph Convolutional Networks http://arxiv.org/abs/1904.04073v1
Accelerated Message Passing for Entropy-Regularized MAP Inference http://arxiv.org/abs/2007.00699v1
Accelerated Primal-Dual Algorithms for Distributed Smooth Convex Optimization over Networks http://arxiv.org/abs/1910.10666v2
Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction http://arxiv.org/abs/1809.01694v2
Accelerated Stochastic Gradient-free and Projection-free Methods http://arxiv.org/abs/2007.12625v2
Accelerating Large-Scale Inference with Anisotropic Vector Quantization http://arxiv.org/abs/1908.10396v5
Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment http://arxiv.org/abs/1804.11324v1
Accelerating Natural Language Understanding in Task-Oriented Dialog http://arxiv.org/abs/2006.03701v1
Accelerating Online Reinforcement Learning with Offline Datasets http://arxiv.org/abs/2006.09359v3
Accelerating Reinforcement Learning with Learned Skill Priors http://arxiv.org/abs/2010.11944v1
Accurate Word Alignment Induction from Neural Machine Translation http://arxiv.org/abs/2004.14837v2
Acrostic Poem Generation http://arxiv.org/abs/2010.02239v1
Action and Perception as Divergence Minimization http://arxiv.org/abs/2009.01791v2
Active Community Detection with Maximal Expected Model Change http://arxiv.org/abs/1801.05856v2
Active Imitation Learning with Noisy Guidance http://arxiv.org/abs/2005.12801v1
Active Learning for Coreference Resolution using Discrete Annotation http://arxiv.org/abs/2004.13671v3
Active Learning for Identification of Linear Dynamical Systems http://arxiv.org/abs/2002.00495v2
Active Learning from Crowd in Document Screening http://arxiv.org/abs/2012.02297v1
Active World Model Learning with Progress Curiosity http://arxiv.org/abs/2007.07853v1
AdaScale SGD: A User-Friendly Algorithm for Distributed Training http://arxiv.org/abs/2007.05105v1
Adapting End-to-End Speech Recognition for Readable Subtitles http://arxiv.org/abs/2005.12143v1
Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations http://arxiv.org/abs/1808.09500v1
Adaptive Attention Span in Transformers http://arxiv.org/abs/1905.07799v2
Adaptive Attentional Network for Few-Shot Knowledge Graph Completion http://arxiv.org/abs/2010.09638v1
Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE http://arxiv.org/abs/2006.02493v1
Adaptive Document Retrieval for Deep Question Answering http://arxiv.org/abs/1808.06528v1
Adaptive Estimator Selection for Off-Policy Evaluation http://arxiv.org/abs/2002.07729v2
Adaptive Exploration in Linear Contextual Bandit http://arxiv.org/abs/1910.06996v2
Adaptive Gradient Descent without Descent http://arxiv.org/abs/1910.09529v2
Adaptive Prediction Timing for Electronic Health Records http://arxiv.org/abs/2003.02554v1
Adaptive Region-Based Active Learning http://arxiv.org/abs/2002.07348v1
Adaptive Reward-Poisoning Attacks against Reinforcement Learning http://arxiv.org/abs/2003.12613v2
Adaptive Risk Minimization: A Meta-Learning Approach for Tackling Group Shift http://arxiv.org/abs/2007.02931v2
Adaptive Scaling for Sparse Detection in Information Extraction http://arxiv.org/abs/1805.00250v2
Adaptive Transformers for Learning Multimodal Representations http://arxiv.org/abs/2005.07486v3
Adding Seemingly Uninformative Labels Helps in Low Data Regimes http://arxiv.org/abs/2008.00807v2
Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization http://arxiv.org/abs/2006.11771v1
Addressing Ancestry Disparities in Genomic Medicine: A Geographic-aware Algorithm http://arxiv.org/abs/2004.12053v1
Addressing Exposure Bias With Document Minimum Risk Training: Cambridge at the WMT20 Biomedical Translation Task http://arxiv.org/abs/2010.05333v1
Addressing reward bias in Adversarial Imitation Learning with neutral reward functions http://arxiv.org/abs/2009.09467v1
Addressing the Rare Word Problem in Neural Machine Translation http://arxiv.org/abs/1410.8206v4
AdvAug: Robust Adversarial Augmentation for Neural Machine Translation http://arxiv.org/abs/2006.11834v3
Advancing Renewable Electricity Consumption With Reinforcement Learning http://arxiv.org/abs/2003.04310v1
Adversarial Alignment of Multilingual Models for Extracting Temporal Expressions from Text http://arxiv.org/abs/2005.09392v1
Adversarial Attack and Defense of Structured Prediction Models http://arxiv.org/abs/2010.01610v2
Adversarial Attacks on Probabilistic Autoregressive Forecasting Models http://arxiv.org/abs/2003.03778v1
Adversarial Connective-exploiting Networks for Implicit Discourse Relation Classification http://arxiv.org/abs/1704.00217v1
Adversarial Contrastive Estimation http://arxiv.org/abs/1805.03642v3
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification http://arxiv.org/abs/1606.01614v5
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks http://arxiv.org/abs/1804.06059v1
Adversarial Examples for Evaluating Reading Comprehension Systems http://arxiv.org/abs/1707.07328v1
Adversarial Filters of Dataset Biases http://arxiv.org/abs/2002.04108v3
Adversarial Learning of Privacy-Preserving Text Representations for De-Identification of Medical Records http://arxiv.org/abs/1906.05000v1
Adversarial Multi-Criteria Learning for Chinese Word Segmentation http://arxiv.org/abs/1704.07556v1
Adversarial Multi-task Learning for Text Classification http://arxiv.org/abs/1704.05742v1
Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling http://arxiv.org/abs/1910.12702v1
Adversarial Mutual Information for Text Generation http://arxiv.org/abs/2007.00067v1
Adversarial NLI: A New Benchmark for Natural Language Understanding http://arxiv.org/abs/1910.14599v2
Adversarial Neural Pruning with Latent Vulnerability Suppression http://arxiv.org/abs/1908.04355v4
Adversarial Removal of Demographic Attributes from Text Data http://arxiv.org/abs/1808.06640v2
Adversarial Risk via Optimal Transport and Optimal Couplings http://arxiv.org/abs/1912.02794v2
Adversarial Robustness Guarantees for Classification with Gaussian Processes http://arxiv.org/abs/1905.11876v3
Adversarial Robustness for Code http://arxiv.org/abs/2002.04694v2
Adversarial Robustness of Flow-Based Generative Models http://arxiv.org/abs/1911.08654v1
Adversarial Self-Supervised Data-Free Distillation for Text Classification http://arxiv.org/abs/2010.04883v1
Adversarial Semantic Collisions http://arxiv.org/abs/2011.04743v1
Adversarial Training for Commonsense Inference http://arxiv.org/abs/2005.08156v1
Adversarial Training for Satire Detection: Controlling for Confounding Variables http://arxiv.org/abs/1902.11145v2
Adversarial attacks on Copyright Detection Systems http://arxiv.org/abs/1906.07153v2
Adversarial representation learning for private speech generation http://arxiv.org/abs/2006.09114v2
Adversarial training for multi-context joint entity and relation extraction http://arxiv.org/abs/1808.06876v3
Affect-LM: A Neural Language Model for Customizable Affective Text Generation http://arxiv.org/abs/1704.06851v1
Afro-MNIST: Synthetic generation of MNIST-style datasets for low-resource languages http://arxiv.org/abs/2009.13509v1
Agent57: Outperforming the Atari Human Benchmark http://arxiv.org/abs/2003.13350v1
Aggregation of Multiple Knockoffs http://arxiv.org/abs/2002.09269v2
Algorithmic Recourse: from Counterfactual Explanations to Interventions http://arxiv.org/abs/2002.06278v4
Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks http://arxiv.org/abs/2006.12476v1
Aligned Cross Entropy for Non-Autoregressive Machine Translation http://arxiv.org/abs/2004.01655v1
Alignment-based compositional semantics for instruction following http://arxiv.org/abs/1508.06491v2
All Fingers are not Equal: Intensity of References in Scientific Articles http://arxiv.org/abs/1609.00081v1
All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference http://arxiv.org/abs/2007.00642v1
Alleviating Privacy Attacks via Causal Learning http://arxiv.org/abs/1909.12732v4
Almost Tune-Free Variance Reduction http://arxiv.org/abs/1908.09345v2
Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference http://arxiv.org/abs/2003.00964v1
AmbigQA: Answering Ambiguous Open-domain Questions http://arxiv.org/abs/2004.10645v2
Amharic Abstractive Text Summarization http://arxiv.org/abs/2003.13721v1
Amodal 3D Reconstruction for Robotic Manipulation via Stability and Connectivity http://arxiv.org/abs/2009.13146v1
Amortised Learning by Wake-Sleep http://arxiv.org/abs/2002.09737v2
Amortized Inference of Variational Bounds for Learning Noisy-OR http://arxiv.org/abs/1906.02428v2
Amortized Population Gibbs Samplers with Neural Sufficient Statistics http://arxiv.org/abs/1911.01382v3
Amortized learning of neural causal representations http://arxiv.org/abs/2008.09301v1
An AMR Aligner Tuned by Transition-based Parser http://arxiv.org/abs/1810.03541v1
An Accelerated DFO Algorithm for Finite-sum Convex Functions http://arxiv.org/abs/2007.03311v2
An Analysis of Action Recognition Datasets for Language and Vision Tasks http://arxiv.org/abs/1704.07129v1
An Analysis of the Utility of Explicit Negative Examples to Improve the Syntactic Abilities of Neural Language Models http://arxiv.org/abs/2004.02451v3
An EM Approach to Non-autoregressive Conditional Sequence Generation http://arxiv.org/abs/2006.16378v1
An Effective Approach to Unsupervised Machine Translation http://arxiv.org/abs/1902.01313v2
An Effective Transition-based Model for Discontinuous NER http://arxiv.org/abs/2004.13454v1
An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results http://arxiv.org/abs/2006.01245v1
An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models http://arxiv.org/abs/1902.10547v3
An Empirical Investigation Towards Efficient Multi-Domain Language Model Pre-training http://arxiv.org/abs/2010.00784v1
An Empirical Investigation of Contextualized Number Prediction http://arxiv.org/abs/2011.07961v1
An Empirical Investigation of Global and Local Normalization for Recurrent Neural Sequence Models Using a Continuous Relaxation to Beam Search http://arxiv.org/abs/1904.06834v1
An Empirical Study of Generation Order for Machine Translation http://arxiv.org/abs/1910.13437v1
An Empirical Study of Pre-trained Transformers for Arabic Information Extraction http://arxiv.org/abs/2004.14519v5
An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels http://arxiv.org/abs/2010.01653v1
An Empirical Study on Model-agnostic Debiasing Strategies for Robust Natural Language Inference http://arxiv.org/abs/2010.03777v2
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models http://arxiv.org/abs/2007.06778v3
An Experiment on Leveraging SHAP Values to Investigate Racial Bias http://arxiv.org/abs/2011.09865v1
An Explicitly Relational Neural Network Architecture http://arxiv.org/abs/1905.10307v4
An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks http://arxiv.org/abs/2010.02789v1
An Exploratory Study of Argumentative Writing by Young Students: A Transformer-based Approach http://arxiv.org/abs/2006.09873v1
An Imitation Game for Learning Semantic Parsers from User Interaction http://arxiv.org/abs/2005.00689v3
An Imitation Learning Approach for Cache Replacement http://arxiv.org/abs/2006.16239v2
An Imitation Learning Approach to Unsupervised Parsing http://arxiv.org/abs/1906.02276v1
An Interpretable Knowledge Transfer Model for Knowledge Base Completion http://arxiv.org/abs/1704.05908v2
An Inverse-free Truncated Rayleigh-Ritz Method for Sparse Generalized Eigenvalue Problem http://arxiv.org/abs/2003.10897v1
An Investigation of Why Overparameterization Exacerbates Spurious Correlations http://arxiv.org/abs/2005.04345v3
An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays http://arxiv.org/abs/1910.06054v2
An Unsupervised Joint System for Text Generation from Knowledge Graphs and Semantic Parsing http://arxiv.org/abs/1904.09447v4
An Unsupervised Method for Uncovering Morphological Chains http://arxiv.org/abs/1503.02335v1
An Unsupervised Probability Model for Speech-to-Translation Alignment of Low-Resource Languages http://arxiv.org/abs/1609.08139v1
An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm http://arxiv.org/abs/1805.10341v3
An end-to-end approach for the verification problem: learning the right distance http://arxiv.org/abs/2002.09469v4
An information theoretic view on selecting linguistic probes http://arxiv.org/abs/2009.07364v2
Analogies minus analogy test: measuring regularities in word embeddings http://arxiv.org/abs/2010.03446v1
Analogous Process Structure Induction for Sub-event Sequence Prediction http://arxiv.org/abs/2010.08525v1
Analogs of Linguistic Structure in Deep Representations http://arxiv.org/abs/1707.08139v1
Analysing Lexical Semantic Change with Contextualised Word Representations http://arxiv.org/abs/2004.14118v1
Analysis of Automatic Annotation Suggestions for Hard Discourse-Level Tasks in Expert Domains http://arxiv.org/abs/1906.02564v1
Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks http://arxiv.org/abs/2002.06597v1
Analyzing Individual Neurons in Pre-trained Language Models http://arxiv.org/abs/2010.02695v1
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned http://arxiv.org/abs/1905.09418v2
Analyzing Neural Discourse Coherence Models http://arxiv.org/abs/2011.06306v1
Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings http://arxiv.org/abs/1904.01596v2
Analyzing Political Parody in Social Media http://arxiv.org/abs/2004.13878v2
Analyzing Redundancy in Pretrained Transformer Models http://arxiv.org/abs/2004.04010v2
Analyzing analytical methods: The case of phonology in neural models of spoken language http://arxiv.org/abs/2004.07070v2
Analyzing autoencoder-based acoustic word embeddings http://arxiv.org/abs/2004.01647v1
Analyzing the Limitations of Cross-lingual Word Embedding Mappings http://arxiv.org/abs/1906.05407v1
Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics http://arxiv.org/abs/2007.07400v1
Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge http://arxiv.org/abs/1611.10277v4
Anchoring and Agreement in Syntactic Annotations http://arxiv.org/abs/1605.04481v3
Anderson Acceleration of Proximal Gradient Methods http://arxiv.org/abs/1910.08590v2
Angular Visual Hardness http://arxiv.org/abs/1912.02279v4
Answer-based Adversarial Training for Generating Clarification Questions http://arxiv.org/abs/1904.02281v1
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task http://arxiv.org/abs/1804.05940v1
Approximate Cross-Validation in High Dimensions with Guarantees http://arxiv.org/abs/1905.13657v4
Approximate Cross-validation: Guarantees for Model Assessment and Selection http://arxiv.org/abs/2003.00617v2
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions http://arxiv.org/abs/1910.06862v1
Approximate is Good Enough: Probabilistic Variants of Dimensional and Margin Complexity http://arxiv.org/abs/2003.04180v1
Approximating Stacked and Bidirectional Recurrent Architectures with the Delayed Recurrent Neural Network http://arxiv.org/abs/1909.00021v2
Approximation Capabilities of Neural ODEs and Invertible Residual Networks http://arxiv.org/abs/1907.12998v2
Approximation Guarantees of Local Search Algorithms via Localizability of Set Functions http://arxiv.org/abs/2006.01400v1
Approximation Schemes for ReLU Regression http://arxiv.org/abs/2005.12844v2
Approximation-Aware Dependency Parsing by Belief Propagation http://arxiv.org/abs/1508.02375v1
AraDIC: Arabic Document Classification using Image-Based Character Embeddings and Class-Balanced Loss http://arxiv.org/abs/2006.11586v1
Arc-swift: A Novel Transition System for Dependency Parsing http://arxiv.org/abs/1705.04434v1
Architecture Agnostic Neural Networks http://arxiv.org/abs/2011.02712v2
Are All Good Word Vector Spaces Isomorphic? http://arxiv.org/abs/2004.04070v2
Are All Languages Created Equal in Multilingual BERT? http://arxiv.org/abs/2005.09093v2
Are BLEU and Meaning Representation in Opposition? http://arxiv.org/abs/1805.06536v1
Are Hyperbolic Representations in Graphs Created Equal? http://arxiv.org/abs/2007.07698v1
Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition http://arxiv.org/abs/2004.03066v2
Are Pretrained Language Models Symbolic Reasoners Over Knowledge? http://arxiv.org/abs/2006.10413v2
Are Some Words Worth More than Others? http://arxiv.org/abs/2010.06069v2
Are You Convinced? Choosing the More Convincing Evidence with a Siamese Network http://arxiv.org/abs/1907.08971v2
Argument Generation with Retrieval, Planning, and Realization http://arxiv.org/abs/1906.03717v1
Argument Invention from First Principles http://arxiv.org/abs/1908.08336v1
Argument Mining for Understanding Peer Reviews http://arxiv.org/abs/1903.10104v1
Argument Mining with Structured SVMs and RNNs http://arxiv.org/abs/1704.06869v1
Artemis: A Novel Annotation Methodology for Indicative Single Document Summarization http://arxiv.org/abs/2005.02146v2
Artificial Intelligence for Global Health: Learning From a Decade of Digital Transformation in Health Care http://arxiv.org/abs/2005.12378v2
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries http://arxiv.org/abs/2004.04228v1
Asking without Telling: Exploring Latent Ontologies in Contextual Representations http://arxiv.org/abs/2004.14513v2
Aspect Level Sentiment Classification with Deep Memory Network http://arxiv.org/abs/1605.08900v2
Assessing Human Translations from French to Bambara for Machine Learning: a Pilot Study http://arxiv.org/abs/2004.00068v1
Assessing Phrasal Representation and Composition in Transformers http://arxiv.org/abs/2010.03763v2
Assessing Robustness to Noise: Low-Cost Head CT Triage http://arxiv.org/abs/2003.07977v2
Assessing racial inequality in COVID-19 testing with Bayesian threshold tests http://arxiv.org/abs/2011.01179v1
Assessing the Ability of Self-Attention Networks to Learn Word Order http://arxiv.org/abs/1906.00592v1
Assessing the Helpfulness of Learning Materials with Inference-Based Learner-Like Agent http://arxiv.org/abs/2010.02179v1
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders http://arxiv.org/abs/2006.16540v2
Asymmetric Private Set Intersection with Applications to Contact Tracing and Private Vertical Federated Machine Learning http://arxiv.org/abs/2011.09350v1
Asymmetric self-play for automatic goal discovery in robotic manipulation http://arxiv.org/abs/2101.04882v1
Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms http://arxiv.org/abs/2002.10526v1
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning http://arxiv.org/abs/2001.10742v1
Asynchronous Gibbs Sampling http://arxiv.org/abs/1509.08999v7
Attacking Neural Text Detectors http://arxiv.org/abs/2002.11768v3
Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization http://arxiv.org/abs/2005.00163v1
Attending the Emotions to Detect Online Abusive Language http://arxiv.org/abs/1909.03100v1
Attention Guided Graph Convolutional Networks for Relation Extraction http://arxiv.org/abs/1906.07510v8
Attention Is All You Need for Chinese Word Segmentation http://arxiv.org/abs/1910.14537v3
Attention Strategies for Multi-Source Sequence-to-Sequence Learning http://arxiv.org/abs/1704.06567v1
Attention is Not Only a Weight: Analyzing Transformers with Vector Norms http://arxiv.org/abs/2004.10102v2
Attention is not Explanation http://arxiv.org/abs/1902.10186v3
Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation http://arxiv.org/abs/1904.07209v1
Attention-over-Attention Neural Networks for Reading Comprehension http://arxiv.org/abs/1607.04423v4
Attentive Group Equivariant Convolutional Networks http://arxiv.org/abs/2002.03830v3
Audio-Visual Understanding of Passenger Intents for In-Cabin Conversational Agents http://arxiv.org/abs/2007.03876v1
Augmented Natural Language for Generative Sequence Labeling http://arxiv.org/abs/2009.13272v1
Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context http://arxiv.org/abs/2006.06259v1
Augmenting Neural Networks with First-order Logic http://arxiv.org/abs/1906.06298v3
Augmenting word2vec with latent Dirichlet allocation within a clinical application http://arxiv.org/abs/1808.03967v1
Author Commitment and Social Power: Automatic Belief Tagging to Infer the Social Context of Interactions http://arxiv.org/abs/1805.06016v1
Auto-Rotating Perceptrons http://arxiv.org/abs/1910.02483v2
Auto-Sizing Neural Networks: With Applications to n-gram Language Models http://arxiv.org/abs/1508.05051v1
AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes http://arxiv.org/abs/1507.01127v1
AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data http://arxiv.org/abs/2003.06505v1
AutoML-Zero: Evolving Machine Learning Algorithms From Scratch http://arxiv.org/abs/2003.03384v2
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization http://arxiv.org/abs/1805.04869v1
Autoencoding Pixies: Amortised Variational Inference with Graph Convolutions for Functional Distributional Semantics http://arxiv.org/abs/2005.02991v2
Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models http://arxiv.org/abs/2002.11451v1
Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring http://arxiv.org/abs/2008.01809v1
Automatic Detection of Generated Text is Easiest when Humans are Fooled http://arxiv.org/abs/1911.00650v2
Automatic Differentiation of Some First-Order Methods in Parametric Optimization http://arxiv.org/abs/1910.05696v1
Automatic Estimation of Simultaneous Interpreter Performance http://arxiv.org/abs/1805.04016v2
Automatic Event Salience Identification http://arxiv.org/abs/1809.00647v1
Automatic Extraction of Rules Governing Morphological Agreement http://arxiv.org/abs/2010.01160v2
Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation http://arxiv.org/abs/1906.01834v1
Automatic Metric Validation for Grammatical Error Correction http://arxiv.org/abs/1804.11225v2
Automatic Reference-Based Evaluation of Pronoun Translation Misses the Point http://arxiv.org/abs/1808.04164v1
Automatic Shortcut Removal for Self-Supervised Representation Learning http://arxiv.org/abs/2002.08822v3
Automatic semantic segmentation for prediction of tuberculosis using lens-free microscopy images http://arxiv.org/abs/2007.02482v1
Automatically Identifying Complaints in Social Media http://arxiv.org/abs/1906.03890v1
Automatically Ranked Russian Paraphrase Corpus for Text Generation http://arxiv.org/abs/2006.09719v1
Autoregressive Knowledge Distillation through Imitation Learning http://arxiv.org/abs/2009.07253v2
Average-case Acceleration Through Spectral Density Estimation http://arxiv.org/abs/2002.04756v5
Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA http://arxiv.org/abs/1906.07132v1
Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training http://arxiv.org/abs/2004.07790v4
AxCell: Automatic Extraction of Results from Machine Learning Papers http://arxiv.org/abs/2004.14356v1
BAE: BERT-based Adversarial Examples for Text Classification http://arxiv.org/abs/2004.01970v3
BAM! Born-Again Multi-Task Networks for Natural Language Understanding http://arxiv.org/abs/1907.04829v1
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension http://arxiv.org/abs/1910.13461v1
BERT Fine-tuning For Arabic Text Summarization http://arxiv.org/abs/2004.14135v1
BERT Knows Punta Cana is not just beautiful, it's gorgeous: Ranking Scalar Adjectives with Contextualised Representations http://arxiv.org/abs/2010.02686v1
BERT-ATTACK: Adversarial Attack Against BERT Using BERT http://arxiv.org/abs/2004.09984v3
BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance http://arxiv.org/abs/2010.06133v1
BERT-XML: Large Scale Automated ICD Coding Using BERT Pretraining http://arxiv.org/abs/2006.03685v1
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing http://arxiv.org/abs/2002.02925v4
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding http://arxiv.org/abs/1810.04805v2
BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance http://arxiv.org/abs/1910.07181v3
BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding http://arxiv.org/abs/1909.04948v2
BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance http://arxiv.org/abs/1911.02969v2
BINOCULARS for Efficient, Nonmyopic Sequential Experimental Design http://arxiv.org/abs/1909.04568v3
BLEU Neighbors: A Reference-less Approach to Automatic Evaluation http://arxiv.org/abs/2004.12726v3
BLEU might be Guilty but References are not Innocent http://arxiv.org/abs/2004.06063v2
BLEURT: Learning Robust Metrics for Text Generation http://arxiv.org/abs/2004.04696v5
BPE-Dropout: Simple and Effective Subword Regularization http://arxiv.org/abs/1910.13267v2
BabyAI++: Towards Grounded-Language Learning beyond Memorization http://arxiv.org/abs/2004.07200v1
BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps http://arxiv.org/abs/2005.04625v2
Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning http://arxiv.org/abs/2010.05906v3
Backpropagating through Structured Argmax using a SPIGOT http://arxiv.org/abs/1805.04658v1
Balanced off-policy evaluation in general action spaces http://arxiv.org/abs/1906.03694v4
Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning http://arxiv.org/abs/2003.06740v4
Balancing Cost and Benefit with Tied-Multi Transformers http://arxiv.org/abs/2002.08614v1
Balancing Gaussian vectors in high dimension http://arxiv.org/abs/1910.13972v2
Balancing Objectives in Counseling Conversations: Advancing Forwards or Looking Backwards http://arxiv.org/abs/2005.04245v1
Balancing Training for Multilingual Neural Machine Translation http://arxiv.org/abs/2004.06748v4
Bandit Convex Optimization in Non-stationary Environments http://arxiv.org/abs/1907.12340v2
Bandit optimisation of functions in the Matérn kernel RKHS http://arxiv.org/abs/2001.10396v2
BanditSum: Extractive Summarization as a Contextual Bandit http://arxiv.org/abs/1809.09672v3
Bandits for BMO Functions http://arxiv.org/abs/2007.08703v1
Bandits with adversarial scaling http://arxiv.org/abs/2003.02287v2
Barking up the right tree: an approach to search over molecule synthesis DAGs http://arxiv.org/abs/2012.11522v1
BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders http://arxiv.org/abs/2003.03462v1
Batch Stationary Distribution Estimation http://arxiv.org/abs/2003.00722v1
Batch-Constrained Distributional Reinforcement Learning for Session-based Recommendation http://arxiv.org/abs/2012.08984v1
Batched Multi-armed Bandits Problem http://arxiv.org/abs/1904.01763v3
Bayesian Differential Privacy for Machine Learning http://arxiv.org/abs/1901.09697v5
Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation http://arxiv.org/abs/2002.08129v3
Bayesian Graph Neural Networks with Adaptive Connection Sampling http://arxiv.org/abs/2006.04064v3
Bayesian Hierarchical Words Representation Learning http://arxiv.org/abs/2004.07126v1
Bayesian Image Classification with Deep Convolutional Gaussian Processes http://arxiv.org/abs/1902.05888v2
Bayesian Learning from Sequential Data using Gaussian Processes with Signature Covariances http://arxiv.org/abs/1906.08215v2
Bayesian Optimisation over Multiple Continuous and Categorical Inputs http://arxiv.org/abs/1906.08878v2
Bayesian Optimization for Iterative Learning http://arxiv.org/abs/1909.09593v4
Bayesian Optimization of Text Representations http://arxiv.org/abs/1503.00693v1
Bayesian Reinforcement Learning via Deep, Sparse Sampling http://arxiv.org/abs/1902.02661v4
Bayesian aggregation improves traditional single image crop classification approaches http://arxiv.org/abs/2004.03468v1
Bayesian experimental design using regularized determinantal point processes http://arxiv.org/abs/1906.04133v1
Be More with Less: Hypergraph Attention Networks for Inductive Text Classification http://arxiv.org/abs/2011.00387v1
BeBold: Exploration Beyond the Boundary of Explored Regions http://arxiv.org/abs/2012.08621v1
Before Name-calling: Dynamics and Triggers of Ad Hominem Fallacies in Web Argumentation http://arxiv.org/abs/1802.06613v2
Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness http://arxiv.org/abs/1805.04212v1
Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks http://arxiv.org/abs/2002.10118v2
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets http://arxiv.org/abs/1704.07121v2
Benchmarking Graph Neural Networks http://arxiv.org/abs/2003.00982v3
Benchmarking Multimodal Regex Synthesis with Complex Structures http://arxiv.org/abs/2005.00663v1
Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting http://arxiv.org/abs/2001.08655v3
Best-First Beam Search http://arxiv.org/abs/2007.03909v2
Best-item Learning in Random Utility Models with Subset Choices http://arxiv.org/abs/2002.07994v1
Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs http://arxiv.org/abs/2010.11465v1
Better Depth-Width Trade-offs for Neural Networks through the lens of Dynamical Systems http://arxiv.org/abs/2003.00777v2
Better Document-Level Machine Translation with Bayes' Rule http://arxiv.org/abs/1910.00553v2
Better Highlighting: Creating Sub-Sentence Summary Highlights http://arxiv.org/abs/2010.10566v1
Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer http://arxiv.org/abs/1905.11978v2
Beyond Accuracy: Behavioral Testing of NLP models with CheckList http://arxiv.org/abs/2005.04118v1
Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter http://arxiv.org/abs/1809.00120v2
Beyond Exponentially Discounted Sum: Automatic Learning of Return Function http://arxiv.org/abs/1905.11591v2
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube http://arxiv.org/abs/2004.14338v2
Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels http://arxiv.org/abs/1911.09781v3
Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles http://arxiv.org/abs/2002.04926v2
Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation http://arxiv.org/abs/2005.10716v2
Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness http://arxiv.org/abs/1906.08482v3
Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat http://arxiv.org/abs/1809.03408v2
Bi-Level Graph Neural Networks for Drug-Drug Interaction Prediction http://arxiv.org/abs/2006.14002v1
Bi-directional Attention with Agreement for Dependency Parsing http://arxiv.org/abs/1608.02076v2
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues http://arxiv.org/abs/2010.10095v1
Bidirectional Attentive Memory Networks for Question Answering over Knowledge Bases http://arxiv.org/abs/1903.02188v3
Bidirectional Model-based Policy Optimization http://arxiv.org/abs/2007.01995v2
Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences http://arxiv.org/abs/2007.02671v1
Bilingual Lexicon Induction through Unsupervised Machine Translation http://arxiv.org/abs/1907.10761v1
Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces http://arxiv.org/abs/1908.06625v1
Bio-Inspired Hashing for Unsupervised Similarity Search http://arxiv.org/abs/2001.04907v2
BioMegatron: Larger Biomedical Domain Language Model http://arxiv.org/abs/2010.06060v2
Biomedical Entity Representations with Synonym Marginalization http://arxiv.org/abs/2005.00239v1
Biomedical Information Extraction for Disease Gene Prioritization http://arxiv.org/abs/2011.05188v2
Bipartite Flat-Graph Network for Nested Named Entity Recognition http://arxiv.org/abs/2005.00436v1
Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models http://arxiv.org/abs/2005.00683v2
Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection http://arxiv.org/abs/1912.06983v1
Black Box Submodular Maximization: Discrete and Continuous Settings http://arxiv.org/abs/1901.09515v2
Black Loans Matter: Distributionally Robust Fairness for Fighting Subgroup Discrimination http://arxiv.org/abs/2012.01193v1
Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings http://arxiv.org/abs/1904.04047v3
Black-box Certification and Learning under Adversarial Perturbations http://arxiv.org/abs/2006.16520v1
Black-box Methods for Restoring Monotonicity http://arxiv.org/abs/2003.09554v1
Blank Language Models http://arxiv.org/abs/2002.03079v2
Bleaching Text: Abstract Features for Cross-lingual Gender Prediction http://arxiv.org/abs/1805.03122v1
BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates http://arxiv.org/abs/2006.14218v2
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions http://arxiv.org/abs/1905.10044v1
Boosting Entity Linking Performance by Leveraging Unlabeled Documents http://arxiv.org/abs/1906.01250v1
Boosting Frank-Wolfe by Chasing Gradients http://arxiv.org/abs/2003.06369v2
Boosting for Control of Dynamical Systems http://arxiv.org/abs/1906.08720v2
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning http://arxiv.org/abs/2004.14646v1
Bootstrapped Q-learning with Context Relevant Observation Pruning to Generalize in Text-based Games http://arxiv.org/abs/2009.11896v1
Bootstrapping Generators from Noisy Data http://arxiv.org/abs/1804.06385v4
Bootstrapping Named Entity Recognition in E-Commerce with Positive Unlabeled Learning http://arxiv.org/abs/2005.11075v1
Bootstrapping Techniques for Polysynthetic Morphological Analysis http://arxiv.org/abs/2005.00956v1
Born-Again Tree Ensembles http://arxiv.org/abs/2003.11132v3
Bounding, Concentrating, and Truncating: Unifying Privacy Loss Composition for Data Analytics http://arxiv.org/abs/2004.07223v3
Bounds in Query Learning http://arxiv.org/abs/1904.10122v1
Break It Down: A Question Understanding Benchmark http://arxiv.org/abs/2001.11770v1
Breaking NLI Systems with Sentences that Require Simple Lexical Inferences http://arxiv.org/abs/1805.02266v1
Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning http://arxiv.org/abs/2006.11917v1
Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search http://arxiv.org/abs/2007.07197v2
Breast Cancer Detection Using Convolutional Neural Networks http://arxiv.org/abs/2003.07911v3
Bridging Anaphora Resolution as Question Answering http://arxiv.org/abs/2004.07898v3
Bridging Information-Seeking Human Gaze and Machine Reading Comprehension http://arxiv.org/abs/2009.14780v2
Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations http://arxiv.org/abs/2004.14923v2
Bridging the Gap between Training and Inference for Neural Machine Translation http://arxiv.org/abs/1906.02448v2
Bringing Stories Alive: Generating Interactive Fiction Worlds http://arxiv.org/abs/2001.10161v1
Budget Learning via Bracketing http://arxiv.org/abs/2004.06298v1
Budget-Constrained Bandits over General Cost and Reward Distributions http://arxiv.org/abs/2003.00365v1
C-Learning: Horizon-Aware Cumulative Accessibility Estimation http://arxiv.org/abs/2011.12363v2
C-Learning: Learning to Achieve Goals via Recursive Classification http://arxiv.org/abs/2011.08909v1
CAT-Gen: Improving Robustness in NLP Models via Controlled Adversarial Text Generation http://arxiv.org/abs/2010.02338v1
CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods http://arxiv.org/abs/2002.07906v1
CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management http://arxiv.org/abs/2005.03975v3
CDL: Curriculum Dual Learning for Emotion-Controllable Response Generation http://arxiv.org/abs/2005.00329v5
CITE: A Corpus of Image-Text Discourse Relations http://arxiv.org/abs/1904.06286v2
CLEVR Parser: A Graph Parser Library for Geometric Learning on Language Grounded Image Scenes http://arxiv.org/abs/2009.09154v2
CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog http://arxiv.org/abs/1903.03166v2
CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information http://arxiv.org/abs/2006.12013v6
CNM: An Interpretable Complex-valued Network for Matching http://arxiv.org/abs/1904.05298v1
CNN-based Approach for Cervical Cancer Classification in Whole-Slide Histopathology Images http://arxiv.org/abs/2005.13924v1
COD3S: Diverse Generation with Discrete Semantic Signatures http://arxiv.org/abs/2010.02882v1
COMET: A Neural Framework for MT Evaluation http://arxiv.org/abs/2009.09025v2
COMETA: A Corpus for Medical Entity Linking in the Social Media http://arxiv.org/abs/2010.03295v2
COVID-19 Literature Topic-Based Search via Hierarchical NMF http://arxiv.org/abs/2009.09074v1
CUNI Systems for the Unsupervised and Very Low Resource Translation Task in WMT20 http://arxiv.org/abs/2010.11747v1
CURL: Contrastive Unsupervised Representations for Reinforcement Learning http://arxiv.org/abs/2004.04136v4
Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data http://arxiv.org/abs/2010.11506v1
Calibrated Surrogate Losses for Adversarially Robust Classification http://arxiv.org/abs/2005.13748v1
Calibrated Surrogate Maximization of Linear-fractional Utility in Binary Classification http://arxiv.org/abs/1905.12511v2
Calibrated Top-1 Uncertainty estimates for classification by score based models http://arxiv.org/abs/1903.09215v4
Calibrating Structured Output Predictors for Natural Language Processing http://arxiv.org/abs/2004.04361v2
Calibration of Pre-trained Transformers http://arxiv.org/abs/2003.07892v3
Calibration, Entropy Rates, and Memory in Language Models http://arxiv.org/abs/1906.05664v1
CamemBERT: a Tasty French Language Model http://arxiv.org/abs/1911.03894v3
Can Automatic Post-Editing Improve NMT? http://arxiv.org/abs/2009.14395v1
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? http://arxiv.org/abs/2006.14911v2
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? http://arxiv.org/abs/2003.01629v2
Can Neural Machine Translation be Improved with User Feedback? http://arxiv.org/abs/1804.05958v1
Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills http://arxiv.org/abs/2004.08449v1
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering http://arxiv.org/abs/1809.02789v1
Carbontracker: Tracking and Predicting the Carbon Footprint of Training Deep Learning Models http://arxiv.org/abs/2007.03051v1
Cascaded Mutual Modulation for Visual Reasoning http://arxiv.org/abs/1809.01943v1
Catch Me if I Can: Detecting Strategic Behaviour in Peer Assessment http://arxiv.org/abs/2010.04041v1
Categorical Metadata Representation for Customized Text Classification http://arxiv.org/abs/1902.05196v1
Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech http://arxiv.org/abs/2006.08387v2
Causal Bayesian Optimization http://arxiv.org/abs/2005.11741v2
Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning http://arxiv.org/abs/2010.03110v1
Causal Effect Estimation and Optimal Dose Suggestions in Mobile Health http://arxiv.org/abs/2007.09812v2
Causal Feature Discovery through Strategic Modification http://arxiv.org/abs/2002.07024v2
Causal Inference of Script Knowledge http://arxiv.org/abs/2004.01174v1
Causal Inference using Gaussian Processes with Structured Latent Confounders http://arxiv.org/abs/2007.07127v1
Causal Learning by a Robot with Semantic-Episodic Memory in an Aesop's Fable Experiment http://arxiv.org/abs/2003.00274v1
Causal Modeling for Fairness in Dynamical Systems http://arxiv.org/abs/1909.09141v2
Causal Structure Discovery from Distributions Arising from Mixtures of DAGs http://arxiv.org/abs/2001.11940v2
Causal inference in degenerate systems: An impossibility result http://arxiv.org/abs/1711.04466v3
Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings http://arxiv.org/abs/2008.06622v1
Censored Quantile Regression Forest http://arxiv.org/abs/2001.03458v1
Certified Data Removal from Machine Learning Models http://arxiv.org/abs/1911.03030v5
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing http://arxiv.org/abs/2002.03018v4
Challenges in Emotion Style Transfer: An Exploration with a Lexical Substitution Pipeline http://arxiv.org/abs/2005.07617v1
Channel Equilibrium Networks for Learning Deep Representation http://arxiv.org/abs/2003.00214v1
Chapter Captor: Text Segmentation in Novels http://arxiv.org/abs/2011.04163v1
CharManteau: Character Embedding Models For Portmanteau Creation http://arxiv.org/abs/1707.01176v2
Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT http://arxiv.org/abs/2011.04308v1
Characterization of Overlap in Observational Studies http://arxiv.org/abs/1907.04138v3
Characterizing Distribution Equivalence and Structure Learning for Cyclic and Acyclic Directed Graphs http://arxiv.org/abs/1910.12993v3
Characterizing Private Clipped Gradient Descent on Convex Generalized Linear Problems http://arxiv.org/abs/2006.06783v1
Characterizing the Latent Space of Molecular Deep Generative Models with Persistent Homology Metrics http://arxiv.org/abs/2010.08548v1
CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT http://arxiv.org/abs/2004.09167v3
Choice Set Optimization Under Discrete Choice Models of Group Decisions http://arxiv.org/abs/2002.00421v2
ChrEn: Cherokee-English Machine Translation for Endangered Language Revitalization http://arxiv.org/abs/2010.04791v1
Circuit-Based Intrinsic Methods to Detect Overfitting http://arxiv.org/abs/1907.01991v2
ClarQ: A large-scale and diverse dataset for Clarification Question Generation http://arxiv.org/abs/2006.05986v2
Classical Structured Prediction Losses for Sequence to Sequence Learning http://arxiv.org/abs/1711.04956v5
Classification with Strategically Withheld Data http://arxiv.org/abs/2012.10203v2
Classifying Syntactic Errors in Learner Language http://arxiv.org/abs/2010.11032v2
Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset http://arxiv.org/abs/2005.00574v1
Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation http://arxiv.org/abs/1912.11975v1
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning http://arxiv.org/abs/2006.06649v2
Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain http://arxiv.org/abs/2005.09397v1
Closing the convergence gap of SGD without replacement http://arxiv.org/abs/2002.10400v6
Closure Properties for Private Classification and Online Prediction http://arxiv.org/abs/2003.04509v3
Clue: Cross-modal Coherence Modeling for Caption Generation http://arxiv.org/abs/2005.00908v1
CoDEx: A Comprehensive Knowledge Graph Completion Benchmark http://arxiv.org/abs/2009.07810v2
Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling http://arxiv.org/abs/2004.11727v1
Coarse-to-Fine Decoding for Neural Semantic Parsing http://arxiv.org/abs/1805.04793v1
Code and Named Entity Recognition in StackOverflow http://arxiv.org/abs/2005.01634v3
Code-switching patterns can be an effective route to improve performance of downstream NLP applications: A case study of humour, sarcasm and hate speech detection http://arxiv.org/abs/2005.02295v1
Cognitive Graph for Multi-Hop Reading Comprehension at Scale http://arxiv.org/abs/1905.05460v2
CognitiveCNN: Mimicking Human Cognitive Models to resolve Texture-Shape Bias http://arxiv.org/abs/2006.14722v1
Cold-start Active Learning through Self-supervised Language Modeling http://arxiv.org/abs/2010.09535v2
Collaborative Machine Learning with Incentive-Aware Model Rewards http://arxiv.org/abs/2010.12797v1
Collapsed Amortized Variational Inference for Switching Nonlinear Dynamical Systems http://arxiv.org/abs/1910.09588v2
Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation http://arxiv.org/abs/1804.08207v2
Colorless green recurrent networks dream hierarchically http://arxiv.org/abs/1803.11138v1
Colors in Context: A Pragmatic Neural Model for Grounded Language Understanding http://arxiv.org/abs/1703.10186v2
Combating False Negatives in Adversarial Imitation Learning http://arxiv.org/abs/2002.00412v1
Combining Pretrained High-Resource Embeddings and Subword Representations for Low-Resource Languages http://arxiv.org/abs/2003.04419v3
Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection http://arxiv.org/abs/2010.15360v1
Combining Sentiment Lexica with a Multi-View Variational Autoencoder http://arxiv.org/abs/1904.02839v1
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers http://arxiv.org/abs/2005.11787v2
Commonsense for Generative Multi-Hop Question Answering Tasks http://arxiv.org/abs/1809.06309v3
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge http://arxiv.org/abs/1811.00937v2
Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls http://arxiv.org/abs/1910.07703v1
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks http://arxiv.org/abs/2005.02426v2
CompRes: A Dataset for Narrative Structure in News http://arxiv.org/abs/2007.04874v1
Compact Personalized Models for Neural Machine Translation http://arxiv.org/abs/1811.01990v1
Comparative Analysis of Text Classification Approaches in Electronic Health Records http://arxiv.org/abs/2005.06624v1
Comparatives, Quantifiers, Proportions: A Multi-Task Model for the Learning of Quantities from Vision http://arxiv.org/abs/1804.05018v1
Comparing recurrent and convolutional neural networks for predicting wave propagation http://arxiv.org/abs/2002.08981v3
Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models http://arxiv.org/abs/2011.02998v1
Competence-based Curriculum Learning for Neural Machine Translation http://arxiv.org/abs/1903.09848v2
Competing Bandits in Matching Markets http://arxiv.org/abs/1906.05363v2
Competitive Mirror Descent http://arxiv.org/abs/2006.10179v1
Complete Multilingual Neural Machine Translation http://arxiv.org/abs/2010.10239v1
Complexity Guarantees for Polyak Steps with Momentum http://arxiv.org/abs/2002.00915v2
Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification http://arxiv.org/abs/1904.02767v1
Compositional Demographic Word Embeddings http://arxiv.org/abs/2010.02986v2
Compositional Questions Do Not Necessitate Multi-hop Reasoning http://arxiv.org/abs/1906.02900v1
Compositional Semantic Parsing on Semi-Structured Tables http://arxiv.org/abs/1508.00305v1
Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA http://arxiv.org/abs/2009.08257v1
Compositionality and Generalization in Emergent Languages http://arxiv.org/abs/2004.09124v1
Comprehensive Supersense Disambiguation of English Prepositions and Possessives http://arxiv.org/abs/1805.04905v1
Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning http://arxiv.org/abs/2002.08307v2
Compressive Summarization with Plausibility and Salience Modeling http://arxiv.org/abs/2010.07886v1
Computing Tight Differential Privacy Guarantees Using FFT http://arxiv.org/abs/1906.03049v2
ConQUR: Mitigating Delusional Bias in Deep Q-learning http://arxiv.org/abs/2002.12399v1
ConStance: Modeling Annotation Contexts to Improve Stance Classification http://arxiv.org/abs/1708.06309v1
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions http://arxiv.org/abs/1901.00997v2
Concept Bottleneck Models http://arxiv.org/abs/2007.04612v3
Concise Explanations of Neural Networks using Adversarial Training http://arxiv.org/abs/1810.06583v9
Concluding remarks http://arxiv.org/abs/astro-ph/0612056v1
Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation http://arxiv.org/abs/2004.14769v2
Conditional Flow Variational Autoencoders for Structured Sequence Prediction http://arxiv.org/abs/1908.09008v3
Conditional Generation and Snapshot Learning in Neural Dialogue Systems http://arxiv.org/abs/1606.03352v1
Conditional Importance Sampling for Off-Policy Learning http://arxiv.org/abs/1910.07479v2
Conditional Normalizing Flows for Low-Dose Computed Tomography Image Reconstruction http://arxiv.org/abs/2006.06270v1
Conditional Set Generation with Transformers http://arxiv.org/abs/2006.16841v2
Conditional gradient methods for stochastically constrained convex minimization http://arxiv.org/abs/2007.03795v1
Conditioning of Reinforcement Learning Agents and its Policy Regularization Application http://arxiv.org/abs/1906.05437v2
Confidence Intervals for Policy Evaluation in Adaptive Experiments http://arxiv.org/abs/1911.02768v3
Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting http://arxiv.org/abs/2002.10399v2
Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks http://arxiv.org/abs/1910.06259v4
ConjNLI: Natural Language Inference Over Conjunctive Sentences http://arxiv.org/abs/2010.10418v2
Connecting Embeddings for Knowledge Graph Entity Typing http://arxiv.org/abs/2007.10873v1
Conservative Exploration in Reinforcement Learning http://arxiv.org/abs/2002.03218v2
Conservative Safety Critics for Exploration http://arxiv.org/abs/2010.14497v1
Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling http://arxiv.org/abs/2004.09890v1
Consistency by Agreement in Zero-shot Neural Machine Translation http://arxiv.org/abs/1904.02338v2
Consistency of a Recurrent Language Model With Respect to Incomplete Decoding http://arxiv.org/abs/2002.02492v2
Consistent Estimators for Learning to Defer to an Expert http://arxiv.org/abs/2006.01862v2
Consistent Structured Prediction with Max-Min Margin Markov Networks http://arxiv.org/abs/2007.01012v2
Consistent Transcription and Translation of Speech http://arxiv.org/abs/2007.12741v2
Consistent recovery threshold of hidden nearest neighbor graphs http://arxiv.org/abs/1911.08004v1
Constant Curvature Graph Convolutional Networks http://arxiv.org/abs/1911.05076v3
Constituent Parsing as Sequence Labeling http://arxiv.org/abs/1810.08994v2
Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue http://arxiv.org/abs/1906.07220v1
Constrained Markov Decision Processes via Backward Value Functions http://arxiv.org/abs/2008.11811v1
Constrained Neural Ordinary Differential Equations with Stability Guarantees http://arxiv.org/abs/2004.10883v1
Constructing a provably adversarially-robust classifier from a high accuracy one http://arxiv.org/abs/1912.07561v1
Constructive Universal High-Dimensional Distribution Generation through Deep ReLU Networks http://arxiv.org/abs/2006.16664v1
Content Planning for Neural Story Generation with Aristotelian Rescoring http://arxiv.org/abs/2009.09870v2
Content Selection in Deep Learning Models of Summarization http://arxiv.org/abs/1810.12343v2
Context Gates for Neural Machine Translation http://arxiv.org/abs/1608.06043v3
Context Mover's Distance & Barycenters: Optimal Transport of Contexts for Building Representations http://arxiv.org/abs/1808.09663v6
Context-Aware Answer Extraction in Question Answering http://arxiv.org/abs/2011.02687v1
Context-Aware Local Differential Privacy http://arxiv.org/abs/1911.00038v2
Context-Aware Neural Machine Translation Learns Anaphora Resolution http://arxiv.org/abs/1805.10163v1
Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning http://arxiv.org/abs/2005.06800v3
Contextual Constrained Learning for Dose-Finding Clinical Trials http://arxiv.org/abs/2001.02463v2
Contextual Embeddings: When Are They Worth It? http://arxiv.org/abs/2005.09117v1
Contextual Memory Trees http://arxiv.org/abs/1807.06473v3
Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns http://arxiv.org/abs/2004.09894v2
Contextual Online False Discovery Rate Control http://arxiv.org/abs/1902.02885v2
Contextualization of Morphological Inflection http://arxiv.org/abs/1905.01420v1
Contextualized Sparse Representations for Real-Time Open-Domain Question Answering http://arxiv.org/abs/1911.02896v2
Contextualizing Hate Speech Classifiers with Post-hoc Explanation http://arxiv.org/abs/2005.02439v3
Continual Learning from the Perspective of Compression http://arxiv.org/abs/2006.15078v1
Continual Model-Based Reinforcement Learning with Hypernetworks http://arxiv.org/abs/2009.11997v1
Continual adaptation for efficient machine communication http://arxiv.org/abs/1911.09896v2
Continual and Multi-Task Architecture Search http://arxiv.org/abs/1906.05226v1
Continual learning with direction-constrained optimization http://arxiv.org/abs/2011.12581v1
Continuous Graph Flow http://arxiv.org/abs/1908.02436v2
Continuous Graph Neural Networks http://arxiv.org/abs/1912.00967v3
Continuous Online Learning and New Insights to Online Imitation Learning http://arxiv.org/abs/1912.01261v1
Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks http://arxiv.org/abs/1708.04358v1
Continuous-time Lower Bounds for Gradient-based Algorithms http://arxiv.org/abs/2002.03546v2
Continuously Indexed Domain Adaptation http://arxiv.org/abs/2007.01807v2
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning http://arxiv.org/abs/2101.05265v1
Contrastive Graph Neural Network Explanation http://arxiv.org/abs/2010.13663v1
Contrastive Multi-View Representation Learning on Graphs http://arxiv.org/abs/2006.05582v1
Contrastive Self-Supervised Learning for Commonsense Reasoning http://arxiv.org/abs/2005.00669v1
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning http://arxiv.org/abs/2002.06836v2
Controlled Crowdsourcing for High-Quality QA-SRL Annotation http://arxiv.org/abs/1911.03243v2
Controlling Output Length in Neural Encoder-Decoders http://arxiv.org/abs/1609.09552v1
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics http://arxiv.org/abs/2005.04269v1
ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems http://arxiv.org/abs/2002.04793v2
Convergence Analysis of Block Coordinate Algorithms with Determinantal Sampling http://arxiv.org/abs/1910.11561v3
Convergence Rates of Smooth Message Passing with Rounding in Entropy-Regularized MAP Inference http://arxiv.org/abs/1907.01127v2
Convergence Rates of Variational Inference in Sparse Deep Learning http://arxiv.org/abs/1908.04847v2
Conversation Modeling on Reddit using a Graph-Structured LSTM http://arxiv.org/abs/1704.02080v1
Conversational Document Prediction to Assist Customer Care Agents http://arxiv.org/abs/2010.02305v1
Conversational Semantic Parsing http://arxiv.org/abs/2009.13655v1
Conversational Semantic Parsing for Dialog State Tracking http://arxiv.org/abs/2010.12770v1
Conversational Word Embedding for Retrieval-Based Dialog System http://arxiv.org/abs/2004.13249v1
Conversations Gone Awry: Detecting Early Signs of Conversational Failure http://arxiv.org/abs/1805.05345v1
Convex Calibrated Surrogates for the Multi-Label F-Measure http://arxiv.org/abs/2009.07801v1
Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space http://arxiv.org/abs/2004.12209v3
Convolutional Kernel Networks for Graph-Structured Data http://arxiv.org/abs/2003.05189v2
Convolutional Neural Networks with Recurrent Neural Filters http://arxiv.org/abs/1808.09315v1
Convolutional dictionary learning based auto-encoders for natural exponential-family distributions http://arxiv.org/abs/1907.03211v4
Cooperative Learning of Disjoint Syntax and Semantics http://arxiv.org/abs/1902.09393v2
Cooperative Multi-Agent Bandits with Heavy Tails http://arxiv.org/abs/2008.06244v1
Coordination without communication: optimal regret in two players multi-armed bandits http://arxiv.org/abs/2002.07596v2
Coreferential Reasoning Learning for Language Representation http://arxiv.org/abs/2004.06870v2
Coresets for Clustering in Graphs of Bounded Treewidth http://arxiv.org/abs/1907.04733v4
Coresets for Data-efficient Training of Machine Learning Models http://arxiv.org/abs/1906.01827v3
Correlating neural and symbolic representations of language http://arxiv.org/abs/1905.06401v2
Corruption-Tolerant Gaussian Process Bandit Optimization http://arxiv.org/abs/2003.01971v1
Counterfactual Cross-Validation: Stable Model Selection Procedure for Causal Inference Models http://arxiv.org/abs/1909.05299v5
Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology http://arxiv.org/abs/1906.04571v3
Counterfactual Data Augmentation using Locally Factored Dynamics http://arxiv.org/abs/2007.02863v2
Countering Language Drift with Seeded Iterated Learning http://arxiv.org/abs/2003.12694v3
Countering hate on social media: Large scale classification of hate and counter speech http://arxiv.org/abs/2006.01974v3
Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation http://arxiv.org/abs/2007.08186v2
Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing http://arxiv.org/abs/1906.07108v1
Course Concept Expansion in MOOCs with External Knowledge and Interactive Game http://arxiv.org/abs/1909.07739v1
Creating Causal Embeddings for Question Answering with Minimal Supervision http://arxiv.org/abs/1609.08097v1
Cross Copy Network for Dialogue Generation http://arxiv.org/abs/2010.11539v1
Cross-Domain Generalization of Neural Constituency Parsers http://arxiv.org/abs/1907.04347v1
Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing http://arxiv.org/abs/1902.09492v2
Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus http://arxiv.org/abs/2004.06295v2
Cross-Lingual Syntactic Transfer with Limited Resources http://arxiv.org/abs/1610.06227v2
Cross-Lingual Training for Automatic Question Generation http://arxiv.org/abs/1906.02525v1
Cross-Linguistic Syntactic Evaluation of Word Prediction Models http://arxiv.org/abs/2005.00187v2
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings http://arxiv.org/abs/2011.01565v1
Cross-Modal Data Programming Enables Rapid Medical Machine Learning http://arxiv.org/abs/1903.11101v1
Cross-Modality Relevance for Reasoning on Language and Vision http://arxiv.org/abs/2005.06035v1
Cross-Sentence N-ary Relation Extraction with Graph LSTMs http://arxiv.org/abs/1708.03743v1
Cross-Target Stance Classification with Self-Attention Networks http://arxiv.org/abs/1805.06593v2
Cross-Thought for Sentence Encoder Pre-training http://arxiv.org/abs/2010.03652v1
Cross-lingual Abstract Meaning Representation Parsing http://arxiv.org/abs/1704.04539v2
Cross-lingual Spoken Language Understanding with Regularized Representation Alignment http://arxiv.org/abs/2009.14510v1
Cross-lingual Visual Verb Sense Disambiguation http://arxiv.org/abs/1904.05092v2
Cross-media Structured Common Space for Multimedia Event Extraction http://arxiv.org/abs/2005.02472v1
Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage http://arxiv.org/abs/2005.00246v1
Cross-topic distributional semantic representations via unsupervised mappings http://arxiv.org/abs/1904.05674v1
CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset http://arxiv.org/abs/2002.11893v2
Crossing Variational Autoencoders for Answer Retrieval http://arxiv.org/abs/2005.02557v2
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models http://arxiv.org/abs/2010.00133v1
Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation http://arxiv.org/abs/1904.05929v1
Cumulo: A Dataset for Learning Cloud Classes http://arxiv.org/abs/1911.04227v2
Curriculum Pre-training for End-to-End Speech Translation http://arxiv.org/abs/2004.10093v1
Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness http://arxiv.org/abs/2002.03239v2
Cycles in Causal Learning http://arxiv.org/abs/2007.12335v1
D2RL: Deep Dense Architectures in Reinforcement Learning http://arxiv.org/abs/2010.09163v2
DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning http://arxiv.org/abs/1910.13983v1
DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks http://arxiv.org/abs/2011.01549v1
DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate http://arxiv.org/abs/1906.00506v3
DERAIL: Diagnostic Environments for Reward And Imitation Learning http://arxiv.org/abs/2012.01365v1
DGST: a Dual-Generator Network for Text Style Transfer http://arxiv.org/abs/2010.14557v1
DLGNet: A Transformer-based Model for Dialogue Response Generation http://arxiv.org/abs/1908.01841v2
DOC: Deep Open Classification of Text Documents http://arxiv.org/abs/1709.08716v1
DORB: Dynamically Optimizing Multiple Rewards with Bandits http://arxiv.org/abs/2011.07635v1
DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference http://arxiv.org/abs/1802.05577v2
DRS at MRP 2020: Dressing up Discourse Representation Structures as Graphs http://arxiv.org/abs/2012.14837v1
DRTS Parsing with Structure-Aware Encoding and Decoding http://arxiv.org/abs/2005.06901v1
DRWR: A Differentiable Renderer without Rendering for Unsupervised 3D Structure Learning from Silhouette Images http://arxiv.org/abs/2007.06127v1
DTCA: Decision Tree-based Co-Attention Networks for Explainable Claim Verification http://arxiv.org/abs/2004.13455v1
DYSAN: Dynamically sanitizing motion sensor data against sensitive inferences through adversarial networks http://arxiv.org/abs/2003.10325v2
DagoBERT: Generating Derivational Morphology with a Pretrained Language Model http://arxiv.org/abs/2005.00672v2
Data Amplification: Instance-Optimal Property Estimation http://arxiv.org/abs/1903.01432v2
Data Appraisal Without Data Sharing http://arxiv.org/abs/2012.06430v1
Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors http://arxiv.org/abs/2006.05635v1
Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation http://arxiv.org/abs/2012.02952v1
Data Generation for Neural Programming by Example http://arxiv.org/abs/1911.02624v1
Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight http://arxiv.org/abs/2004.02594v5
Data Rejuvenation: Exploiting Inactive Training Examples for Neural Machine Translation http://arxiv.org/abs/2010.02552v1
Data Valuation using Reinforcement Learning http://arxiv.org/abs/1909.11671v1
Data Weighted Training Strategies for Grammatical Error Correction http://arxiv.org/abs/2008.02976v2
Data and Representation for Turkish Natural Language Inference http://arxiv.org/abs/2004.14963v3
Data preprocessing to mitigate bias: A maximum entropy based approach http://arxiv.org/abs/1906.02164v2
Data-Dependent Differentially Private Parameter Learning for Directed Graphical Models http://arxiv.org/abs/1905.12813v3
Data-Efficient Image Recognition with Contrastive Predictive Coding http://arxiv.org/abs/1905.09272v3
Data-driven confidence bands for distributed nonparametric regression http://arxiv.org/abs/1912.06689v2
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics http://arxiv.org/abs/2009.10795v2
DeBayes: a Bayesian method for debiasing network embeddings http://arxiv.org/abs/2002.11442v2
DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning http://arxiv.org/abs/1809.06416v1
DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering http://arxiv.org/abs/2005.00697v1
DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking http://arxiv.org/abs/2004.12864v1
Debiased Sinkhorn barycenters http://arxiv.org/abs/2006.02575v1
Debiasing Evaluations That are Biased by Evaluations http://arxiv.org/abs/2012.00714v1
Decentralised Learning with Random Features and Distributed Gradient Descent http://arxiv.org/abs/2007.00360v1
Decentralized Multi-player Multi-armed Bandits with No Collision Information http://arxiv.org/abs/2003.00162v1
Decentralized gradient methods: does topology matter? http://arxiv.org/abs/2002.12688v1
Decision Trees for Decision-Making under the Predict-then-Optimize Framework http://arxiv.org/abs/2003.00360v2
Decomposable Neural Paraphrase Generation http://arxiv.org/abs/1906.09741v1
Deconstructing word embedding algorithms http://arxiv.org/abs/2011.07013v1
Decoupled Greedy Learning of CNNs http://arxiv.org/abs/1901.08164v4
Decoupling Strategy and Generation in Negotiation Dialogues http://arxiv.org/abs/1808.09637v1
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference http://arxiv.org/abs/2004.12993v1
Deep Active Learning: Unified and Principled Method for Query and Training http://arxiv.org/abs/1911.09162v2
Deep Bayesian Quadrature Policy Optimization http://arxiv.org/abs/2006.15637v3
Deep Claim: Payer Response Prediction from Claims Data with Deep Learning http://arxiv.org/abs/2007.06229v1
Deep Context-Aware Novelty Detection http://arxiv.org/abs/2006.01168v2
Deep Contextualized Self-training for Low Resource Dependency Parsing http://arxiv.org/abs/1911.04286v1
Deep Coordination Graphs http://arxiv.org/abs/1910.00091v4
Deep Divergence Learning http://arxiv.org/abs/2005.02612v1
Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning http://arxiv.org/abs/1801.06176v3
Deep Gaussian Markov Random Fields http://arxiv.org/abs/2002.07467v2
Deep Generative Model for Joint Alignment and Word Representation http://arxiv.org/abs/1802.05883v3
Deep Graph Contrastive Representation Learning http://arxiv.org/abs/2006.04131v2
Deep Hierarchical Classification for Category Prediction in E-commerce System http://arxiv.org/abs/2005.06692v1
Deep Isometric Learning for Visual Recognition http://arxiv.org/abs/2006.16992v2
Deep Keyphrase Generation http://arxiv.org/abs/1704.06879v2
Deep Molecular Programming: A Natural Implementation of Binary-Weight ReLU Neural Networks http://arxiv.org/abs/2003.13720v3
Deep Networks and the Multiple Manifold Problem http://arxiv.org/abs/2008.11245v1
Deep Neural Machine Translation with Linear Associative Unit http://arxiv.org/abs/1705.00861v1
Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision http://arxiv.org/abs/1808.08485v1
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation http://arxiv.org/abs/1606.04199v3
Deep Reinforcement Learning amidst Lifelong Non-Stationarity http://arxiv.org/abs/2006.10701v1
Deep Reinforcement Learning for Dialogue Generation http://arxiv.org/abs/1606.01541v4
Deep Reinforcement Learning for Mention-Ranking Coreference Models http://arxiv.org/abs/1609.08667v3
Deep Relevance Ranking Using Enhanced Document-Query Interactions http://arxiv.org/abs/1809.01682v2
Deep Ritz revisited http://arxiv.org/abs/1912.03937v2
Deep Structured Mixtures of Gaussian Processes http://arxiv.org/abs/1910.04536v2
Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time http://arxiv.org/abs/1711.05626v2
Deep contextualized word representations http://arxiv.org/abs/1802.05365v2
Deep k-NN for Noisy Labels http://arxiv.org/abs/2004.12289v1
Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme http://arxiv.org/abs/1807.03491v1
DeepCoDA: personalized interpretability for compositional health data http://arxiv.org/abs/2006.01392v2
DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training http://arxiv.org/abs/1802.05664v1
DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning http://arxiv.org/abs/1707.06690v3
DeepSeqSLAM: A Trainable CNN+RNN for Joint Global Description and Sequence-based Place Recognition http://arxiv.org/abs/2011.08518v1
Defense Through Diverse Directions http://arxiv.org/abs/2003.10602v1
Defining Benchmarks for Continual Few-Shot Learning http://arxiv.org/abs/2004.11967v1
Defining and Evaluating Fair Natural Language Generation http://arxiv.org/abs/2008.01548v1
Defoiling Foiled Image Captions http://arxiv.org/abs/1805.06549v1
Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer http://arxiv.org/abs/1804.06437v1
DeltaGrad: Rapid retraining of machine learning models http://arxiv.org/abs/2006.14755v2
Demand-Weighted Completeness Prediction for a Knowledge Base http://arxiv.org/abs/1804.11109v1
Demographic Dialectal Variation in Social Media: A Case Study of African-American English http://arxiv.org/abs/1608.08868v1
Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting http://arxiv.org/abs/2004.14088v3
Demoting Racial Bias in Hate Speech Detection http://arxiv.org/abs/2005.12246v1
Denoising Relation Extraction from Document-level Distant Supervision http://arxiv.org/abs/2011.03888v1
Dense Passage Retrieval for Open-Domain Question Answering http://arxiv.org/abs/2004.04906v3
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA http://arxiv.org/abs/2005.06409v1
Densely Connected Graph Convolutional Networks for Graph-to-Sequence Learning http://arxiv.org/abs/1908.05957v2
Density Deconvolution with Normalizing Flows http://arxiv.org/abs/2006.09396v2
Density Matching for Bilingual Word Embedding http://arxiv.org/abs/1904.02343v3
Deontological Ethics By Monotonicity Shape Constraints http://arxiv.org/abs/2001.11990v2
Dependency-based Hybrid Trees for Semantic Parsing http://arxiv.org/abs/1809.00107v1
Dependent randomized rounding for clustering and partition systems with knapsack constraints http://arxiv.org/abs/1709.06995v9
Depth Completion via Deep Basis Fitting http://arxiv.org/abs/1912.10336v1
Depth Uncertainty in Neural Networks http://arxiv.org/abs/2006.08437v3
DepthNet Nano: A Highly Compact Self-Normalizing Neural Network for Monocular Depth Estimation http://arxiv.org/abs/2004.08008v1
Deriving Machine Attention from Human Rationales http://arxiv.org/abs/1808.09367v1
Description Based Text Classification with Reinforcement Learning http://arxiv.org/abs/2002.03067v3
Design Challenges in Low-resource Cross-lingual Entity Linking http://arxiv.org/abs/2005.00692v2
Designing Differentially Private Estimators in High Dimensions http://arxiv.org/abs/2006.01944v3
Designing Precise and Robust Dialogue Response Evaluators http://arxiv.org/abs/2004.04908v2
Detecting Attackable Sentences in Arguments http://arxiv.org/abs/2010.02660v1
Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News http://arxiv.org/abs/2009.07698v5
Detecting East Asian Prejudice on Social Media http://arxiv.org/abs/2005.03909v1
Detecting Egregious Conversations between Customers and Virtual Agents http://arxiv.org/abs/1711.05780v2
Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank http://arxiv.org/abs/2010.03662v1
Detecting Gang-Involved Escalation on Social Media Using Context http://arxiv.org/abs/1809.03632v1
Detecting Perceived Emotions in Hurricane Disasters http://arxiv.org/abs/2004.14299v1
Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks http://arxiv.org/abs/2011.01846v1
Detecting dementia in Mandarin Chinese using transfer learning from a parallel corpus http://arxiv.org/abs/1903.00933v2
Determining Semantic Textual Similarity using Natural Deduction Proofs http://arxiv.org/abs/1707.08713v1
Deterministic Decoding for Discrete Data in Variational Autoencoders http://arxiv.org/abs/2003.02174v1
Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement http://arxiv.org/abs/1802.06901v3
Dexterous Robotic Grasping with Object-Centric Visual Affordances http://arxiv.org/abs/2009.01439v1
Dialogue Coherence Assessment Without Explicit Dialogue Act Labels http://arxiv.org/abs/1908.08486v2
Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired Data http://arxiv.org/abs/2009.09427v2
Dialogue Response Ranking Training with Large-Scale Human Feedback Data http://arxiv.org/abs/2009.06978v1
Diameter-based Interactive Structure Discovery http://arxiv.org/abs/1906.02101v2
Dice Loss for Data-imbalanced NLP Tasks http://arxiv.org/abs/1911.02855v3
Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text-to-SQL http://arxiv.org/abs/2010.12634v1
Did the Model Understand the Question? http://arxiv.org/abs/1805.05492v1
Differentiable Causal Backdoor Discovery http://arxiv.org/abs/2003.01461v1
Differentiable Graph Module (DGM) for Graph Convolutional Networks http://arxiv.org/abs/2002.04999v3
Differentiable Likelihoods for Fast Inversion of 'Likelihood-Free' Dynamical Systems http://arxiv.org/abs/2002.09301v2
Differentiable Sampling with Flexible Reference Word Order for Neural Machine Translation http://arxiv.org/abs/1904.04079v2
Differentiable Window for Dynamic Local Attention http://arxiv.org/abs/2006.13561v1
Differential Evolution for Neural Architecture Search http://arxiv.org/abs/2012.06400v1
Differentially Private Language Models Benefit from Public Pre-training http://arxiv.org/abs/2009.05886v2
Differentially Private Set Union http://arxiv.org/abs/2002.09745v1
Differentially Private Stochastic Coordinate Descent http://arxiv.org/abs/2006.07272v3
Differentially private cross-silo federated learning http://arxiv.org/abs/2007.05553v1
Differentiating through the Fréchet Mean http://arxiv.org/abs/2003.00335v3
Digital Voicing of Silent Speech http://arxiv.org/abs/2010.02960v1
Dilated Convolutional Attention Network for Medical Code Assignment from Clinical Text http://arxiv.org/abs/2009.14578v1
Diptychs of human and machine perceptions http://arxiv.org/abs/2010.13864v1
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction http://arxiv.org/abs/2003.07305v1
Discern: Discourse-Aware Entailment Reasoning Network for Conversational Machine Reading http://arxiv.org/abs/2010.01838v3
DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion http://arxiv.org/abs/1902.10526v3
Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle http://arxiv.org/abs/1904.00615v1
Discontinuous Constituent Parsing as Sequence Labeling http://arxiv.org/abs/2010.00633v1
Discount Factor as a Regularizer in Reinforcement Learning http://arxiv.org/abs/2007.02040v1
Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference http://arxiv.org/abs/1907.09692v1
Discourse structure interacts with reference but not syntax in neural language models http://arxiv.org/abs/2010.04887v1
Discourse-Aware Neural Extractive Text Summarization http://arxiv.org/abs/1910.14142v2
Discovering and interpreting transcriptomic drivers of imaging traits using neural networks http://arxiv.org/abs/1912.05071v1
Discrete Action On-Policy Learning with Action-Value Critic http://arxiv.org/abs/2002.03534v2
Discrete Latent Variable Representations for Low-Resource Text Classification http://arxiv.org/abs/2006.06226v1
Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction http://arxiv.org/abs/2005.01791v1
Discriminative Adversarial Search for Abstractive Summarization http://arxiv.org/abs/2002.10375v2
Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions http://arxiv.org/abs/2007.13481v1
Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference http://arxiv.org/abs/2010.13009v1
Discriminative Neural Sentence Modeling by Tree-Based Convolution http://arxiv.org/abs/1504.01106v5
Discriminatively-Tuned Generative Classifiers for Robust Natural Language Inference http://arxiv.org/abs/2010.03760v1
Disentangle-based Continual Graph Representation Learning http://arxiv.org/abs/2010.02565v4
Disentangled Planning and Control in Vision Based Robotics via Reward Machines http://arxiv.org/abs/2012.14464v1
Disentangling Language and Knowledge in Task-Oriented Dialogs http://arxiv.org/abs/1805.01216v3
Disentangling Trainability and Generalization in Deep Neural Networks http://arxiv.org/abs/1912.13053v2
Dispersed Exponential Family Mixture VAEs for Interpretable Text Generation http://arxiv.org/abs/1906.06719v4
Dissecting Lottery Ticket Transformers: Structural and Behavioral Study of Sparse Neural Machine Translation http://arxiv.org/abs/2009.13270v2
Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation http://arxiv.org/abs/1909.03009v2
Dissecting Span Identification Tasks with Performance Prediction http://arxiv.org/abs/2010.02587v1
Dissipative SymODEN: Encoding Hamiltonian Dynamics with Dissipation and Control into Deep Learning http://arxiv.org/abs/2002.08860v3
Distant Supervision and Noisy Label Learning for Low Resource Named Entity Recognition: A Study on Hausa and Yorùbá http://arxiv.org/abs/2003.08370v2
Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging http://arxiv.org/abs/1808.09733v1
Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation http://arxiv.org/abs/2003.02877v3
Distilling Knowledge Learned in BERT for Text Generation http://arxiv.org/abs/1911.03829v3
Distilling Knowledge for Search-based Structured Prediction http://arxiv.org/abs/1805.11224v1
Distilling Neural Networks for Greener and Faster Dependency Parsing http://arxiv.org/abs/2006.00844v1
Distinguish Confusing Law Articles for Legal Judgment Prediction http://arxiv.org/abs/2004.02557v3
Distributed Differentially Private Averaging with Improved Utility and Robustness to Malicious Parties http://arxiv.org/abs/2006.07218v1
Distributed Learning: Sequential Decision Making in Resource-Constrained Environments http://arxiv.org/abs/2004.06171v1
Distributed, partially collapsed MCMC for Bayesian Nonparametrics http://arxiv.org/abs/2001.05591v3
Distributionally Robust Bayesian Optimization http://arxiv.org/abs/2002.09038v3
Distributionally Robust Bayesian Quadrature Optimization http://arxiv.org/abs/2001.06814v1
Distributionally Robust Formulation and Model Selection for the Graphical Lasso http://arxiv.org/abs/1905.08975v2
Diverse Exploration via InfoMax Options http://arxiv.org/abs/2010.02756v1
Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation http://arxiv.org/abs/2004.03875v2
Diversifying Dialogue Generation with Non-Conversational Text http://arxiv.org/abs/2005.04346v2
Diversifying Reply Suggestions using a Matching-Conditional Variational Autoencoder http://arxiv.org/abs/1903.10630v1
Diversity driven Attention Model for Query-based Abstractive Summarization http://arxiv.org/abs/1704.08300v2
Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support http://arxiv.org/abs/1910.13324v3
Diving Deep into Context-Aware Neural Machine Translation http://arxiv.org/abs/2010.09482v1
Do Explicit Alignments Robustly Improve Multilingual Encoders? http://arxiv.org/abs/2010.02537v1
Do Multi-Sense Embeddings Improve Natural Language Understanding? http://arxiv.org/abs/1506.01070v3
Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study http://arxiv.org/abs/1906.01603v2
Do Neural Language Models Show Preferences for Syntactic Formalisms? http://arxiv.org/abs/2004.14096v1
Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language? http://arxiv.org/abs/2004.14839v2
Do Neural Network Cross-Modal Mappings Really Bridge Modalities? http://arxiv.org/abs/1805.07616v2
Do RNN and LSTM have Long Memory? http://arxiv.org/abs/2006.03860v2
Do We Need Zero Training Loss After Achieving Zero Training Error? http://arxiv.org/abs/2002.08709v1
Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation http://arxiv.org/abs/2002.08546v5
Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods http://arxiv.org/abs/2007.06162v1
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities http://arxiv.org/abs/1603.08079v1
Do latent tree learning models identify meaningful structure in sentences? http://arxiv.org/abs/1709.01121v2
Do sequence-to-sequence VAEs learn global features of sentences? http://arxiv.org/abs/2004.07683v1
Document Context Neural Machine Translation with Memory Networks http://arxiv.org/abs/1711.03688v2
Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension http://arxiv.org/abs/2005.05806v2
Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding http://arxiv.org/abs/2005.06579v1
Document-aligned Japanese-English Conversation Parallel Corpus http://arxiv.org/abs/2012.06143v1
Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation http://arxiv.org/abs/2005.03393v2
Does label smoothing mitigate label noise? http://arxiv.org/abs/2003.02819v1
Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks http://arxiv.org/abs/2001.03632v1
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making http://arxiv.org/abs/2002.01751v1
Does the Objective Matter? Comparing Training Objectives for Pronoun Resolution http://arxiv.org/abs/2010.02570v1
Domain Adaptation with Adversarial Training and Graph Embeddings http://arxiv.org/abs/1805.05151v1
Domain Adaptive Dialog Generation via Meta Learning http://arxiv.org/abs/1906.03520v2
Domain Adaptive Imitation Learning http://arxiv.org/abs/1910.00105v2
Domain Adaptive Inference for Neural Machine Translation http://arxiv.org/abs/1906.00408v1
Domain Aggregation Networks for Multi-Source Domain Adaptation http://arxiv.org/abs/1909.05352v2
Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction http://arxiv.org/abs/2009.07373v2
Domain Knowledge Integration By Gradient Matching For Sample-Efficient Reinforcement Learning http://arxiv.org/abs/2005.13778v1
Domain-Liftability of Relational Marginal Polytopes http://arxiv.org/abs/2001.05198v1
Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents http://arxiv.org/abs/2010.16363v1
Don't Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation http://arxiv.org/abs/2004.14325v3
Don't Read Too Much into It: Adaptive Computation for Open-Domain Question Answering http://arxiv.org/abs/2011.05435v1
Don't Use English Dev: On the Zero-Shot Cross-Lingual Evaluation of Contextual Embeddings http://arxiv.org/abs/2004.15001v2
Double Graph Based Reasoning for Document-level Relation Extraction http://arxiv.org/abs/2009.13752v1
Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation http://arxiv.org/abs/2005.00965v1
Double-Loop Unadjusted Langevin Algorithm http://arxiv.org/abs/2007.01147v1
Doubly Sparse Variational Gaussian Processes http://arxiv.org/abs/2001.05363v1
Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables http://arxiv.org/abs/2008.09469v2
Doubly robust off-policy evaluation with shrinkage http://arxiv.org/abs/1907.09623v2
Dream and Search to Control: Latent Space Planning for Continuous Control http://arxiv.org/abs/2010.09832v1
Driving Behavior Explanation with Multi-level Fusion http://arxiv.org/abs/2012.04983v1
Dual Mirror Descent for Online Allocation Problems http://arxiv.org/abs/2002.10421v4
DualTKB: A Dual Learning Bridge between Text and Knowledge Base http://arxiv.org/abs/2010.14660v1
DyERNIE: Dynamic Evolution of Riemannian Manifold Embeddings for Temporal Knowledge Graph Completion http://arxiv.org/abs/2011.03984v2
Dyna-AIL: Adversarial Imitation Learning by Planning http://arxiv.org/abs/1903.03234v1
Dynamic Anticipation and Completion for Multi-Hop Reasoning over Sparse Knowledge Graph http://arxiv.org/abs/2010.01899v1
Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning http://arxiv.org/abs/2010.04314v1
Dynamic Data Selection and Weighting for Iterative Back-Translation http://arxiv.org/abs/2004.03672v2
Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog http://arxiv.org/abs/2004.11019v3
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising http://arxiv.org/abs/2006.16312v1
Dynamic Memory Induction Networks for Few-Shot Text Classification http://arxiv.org/abs/2005.05727v1
Dynamic Oracles for Top-Down and In-Order Shift-Reduce Constituent Parsing http://arxiv.org/abs/1810.10882v1
Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation http://arxiv.org/abs/2005.06606v2
Dynamic Regions Graph Neural Networks for Spatio-Temporal Reasoning http://arxiv.org/abs/2009.08427v1
Dynamical systems theory for causal inference with application to synthetic control methods http://arxiv.org/abs/1808.08778v3
Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change http://arxiv.org/abs/2005.02008v1
ELI5: Long Form Question Answering http://arxiv.org/abs/1907.09190v1
ELITR Non-Native Speech Translation at IWSLT 2020 http://arxiv.org/abs/2006.03331v1
EM Converges for a Mixture of Many Linear Regressions http://arxiv.org/abs/1905.12106v2
ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation http://arxiv.org/abs/2005.00850v2
ENT-DESC: Entity Description Generation by Exploring Knowledge Graph http://arxiv.org/abs/2004.14813v2
ERASER: A Benchmark to Evaluate Rationalized NLP Models http://arxiv.org/abs/1911.03429v2
ESPRIT: Explaining Solutions to Physical Reasoning Tasks http://arxiv.org/abs/2005.00730v2
ESPnet-ST: All-in-One Speech Translation Toolkit http://arxiv.org/abs/2004.10234v2
ETC: Encoding Long and Structured Inputs in Transformers http://arxiv.org/abs/2004.08483v5
EXP4-DFDC: A Non-Stochastic Multi-Armed Bandit for Cache Replacement http://arxiv.org/abs/2009.11330v2
Early Disease Diagnosis for Rice Crop http://arxiv.org/abs/2004.04775v1
Easy-First Dependency Parsing with Hierarchical Tree LSTMs http://arxiv.org/abs/1603.00375v2
Ecological Semantics: Programming Environments for Situated Language Understanding http://arxiv.org/abs/2003.04567v2
EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing http://arxiv.org/abs/1906.08104v1
Educating Text Autoencoders: Latent Representation Guidance via Denoising http://arxiv.org/abs/1905.12777v3
Effective Approaches to Attention-based Neural Machine Translation http://arxiv.org/abs/1508.04025v5
Effective Estimation of Deep Generative Language Models http://arxiv.org/abs/1904.08194v3
Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models http://arxiv.org/abs/2010.01739v1
Effectiveness of MPC-friendly Softmax Replacement http://arxiv.org/abs/2011.11202v1
Efficient Competitive Self-Play Policy Optimization http://arxiv.org/abs/2009.06086v1
Efficient Constituency Parsing by Pointing http://arxiv.org/abs/2006.13557v1
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling http://arxiv.org/abs/1804.07827v2
Efficient Continuous Pareto Exploration in Multi-Task Learning http://arxiv.org/abs/2006.16434v2
Efficient Deployment of Conversational Natural Language Interfaces over Databases http://arxiv.org/abs/2006.00591v2
Efficient Dialogue State Tracking by Selectively Overwriting Memory http://arxiv.org/abs/1911.03906v2
Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy http://arxiv.org/abs/1810.11507v2
Efficient Domain Generalization via Common-Specific Low-Rank Decomposition http://arxiv.org/abs/2003.12815v2
Efficient EUD Parsing http://arxiv.org/abs/2006.00838v1
Efficient Estimation of Influence of a Training Instance http://arxiv.org/abs/2012.04207v1
Efficient Inference For Neural Machine Translation http://arxiv.org/abs/2010.02416v2
Efficient Intent Detection with Dual Sentence Encoders http://arxiv.org/abs/2003.04807v1
Efficient Intervention Design for Causal Discovery with Latents http://arxiv.org/abs/2005.11736v2
Efficient Low-rank Multimodal Fusion with Modality-Specific Factors http://arxiv.org/abs/1806.00064v1
Efficient Meta Lifelong-Learning with Limited Memory http://arxiv.org/abs/2010.02500v1
Efficient One-Pass End-to-End Entity Linking for Questions http://arxiv.org/abs/2010.02413v1
Efficient Online Scalar Annotation with Bounded Support http://arxiv.org/abs/1806.01170v1
Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation http://arxiv.org/abs/2007.06482v1
Efficient Parameter Estimation of Truncated Boolean Product Distributions http://arxiv.org/abs/2007.02392v1
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning http://arxiv.org/abs/1911.05010v2
Efficient Privacy-Preserving Stochastic Nonconvex Optimization http://arxiv.org/abs/1910.13659v2
Efficient Proximal Mapping of the 1-path-norm of Shallow Networks http://arxiv.org/abs/2007.01003v2
Efficient Reservoir Management through Deep Reinforcement Learning http://arxiv.org/abs/2012.03822v1
Efficient Robustness Certificates for Discrete Data: Sparsity-Aware Randomized Smoothing for Graphs, Images and More http://arxiv.org/abs/2008.12952v1
Efficient Second-Order TreeCRF for Neural Dependency Parsing http://arxiv.org/abs/2005.00975v2
Efficient allocation of law enforcement resources using predictive police patrolling http://arxiv.org/abs/1811.12880v1
Efficient and Robust Algorithms for Adversarial Linear Contextual Bandits http://arxiv.org/abs/2002.00287v2
Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors http://arxiv.org/abs/2005.07186v2
Efficient improper learning for online logistic regression http://arxiv.org/abs/2003.08109v3
Efficient non-conjugate Gaussian process factor models for spike count data using polynomial approximations http://arxiv.org/abs/1906.03318v2
Efficient strategies for hierarchical text classification: External knowledge and auxiliary tasks http://arxiv.org/abs/2005.02473v2
Efficient, Noise-Tolerant, and Private Learning via Boosting http://arxiv.org/abs/2002.01100v1
Efficiently Learning Adversarially Robust Halfspaces with Noise http://arxiv.org/abs/2005.07652v1
Efficiently Sampling Functions from Gaussian Process Posteriors http://arxiv.org/abs/2002.09309v4
Efficiently Solving MDPs with Stochastic Mirror Descent http://arxiv.org/abs/2008.12776v1
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models http://arxiv.org/abs/2003.11743v2
Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits http://arxiv.org/abs/2004.06231v1
Embarrassingly Simple Unsupervised Aspect Extraction http://arxiv.org/abs/2004.13580v1
Embedding Multimodal Relational Data for Knowledge Base Completion http://arxiv.org/abs/1809.01341v2
Embedding Words in Non-Vector Space with Unsupervised Graph Learning http://arxiv.org/abs/2010.02598v1
Embedding time expressions for deep temporal ordering models http://arxiv.org/abs/1906.08287v1
Embedding-based Scientific Literature Discovery in a Text Editor Application http://arxiv.org/abs/2005.04961v1
Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition http://arxiv.org/abs/2006.01372v2
Emergence of Syntax Needs Minimal Supervision http://arxiv.org/abs/2005.01119v1
Emergent Road Rules In Multi-Agent Driving Environments http://arxiv.org/abs/2011.10753v1
Emerging Cross-lingual Structure in Pretrained Language Models http://arxiv.org/abs/1911.01464v3
Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models http://arxiv.org/abs/1907.00030v3
Empower Entity Set Expansion via Language Model Probing http://arxiv.org/abs/2004.13897v2
Empowering Active Learning to Jointly Optimize System and User Demands http://arxiv.org/abs/2005.04470v2
Enabling Language Models to Fill in the Blanks http://arxiv.org/abs/2005.05339v2
Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction http://arxiv.org/abs/2005.00987v2
Encoder-decoder neural network for solving the nonlinear Fokker-Planck-Landau collision operator in XGC http://arxiv.org/abs/2009.06534v2
Encoding Musical Style with Transformer Autoencoders http://arxiv.org/abs/1912.05537v2
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling http://arxiv.org/abs/1703.04826v4
Encoding Source Language with Convolutional Neural Network for Machine Translation http://arxiv.org/abs/1503.01838v5
Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages http://arxiv.org/abs/2005.08177v1
End to End Binarized Neural Networks for Text Classification http://arxiv.org/abs/2010.05223v1
End-to-End Bias Mitigation by Modelling Biases in Corpora http://arxiv.org/abs/1909.06321v3
End-to-End Neural Word Alignment Outperforms GIZA++ http://arxiv.org/abs/2004.14675v1
End-to-End Slot Alignment and Recognition for Cross-Lingual NLU http://arxiv.org/abs/2004.14353v2
End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems http://arxiv.org/abs/2010.06028v1
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures http://arxiv.org/abs/1911.08460v3
End-to-end Graph-based TAG Parsing with Neural Networks http://arxiv.org/abs/1804.06610v3
End-to-end Neural Coreference Resolution http://arxiv.org/abs/1707.07045v2
Energy and Policy Considerations for Deep Learning in NLP http://arxiv.org/abs/1906.02243v1
Energy-Based Continuous Inverse Optimal Control http://arxiv.org/abs/1904.05453v4
Energy-Based Processes for Exchangeable Data http://arxiv.org/abs/2003.07521v2
Energy-based Surprise Minimization for Multi-Agent Value Factorization http://arxiv.org/abs/2009.09842v3
Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions http://arxiv.org/abs/2003.08536v2
Enhanced Universal Dependency Parsing with Second-Order Inference and Mixture of Training Data http://arxiv.org/abs/2006.01414v2
Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension http://arxiv.org/abs/2004.14069v2
Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information http://arxiv.org/abs/1805.05593v1
Enhancing Machine Translation with Dependency-Aware Self-Attention http://arxiv.org/abs/1909.03149v3
Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention http://arxiv.org/abs/1911.02821v2
Enhancing Simple Models by Exploiting What They Already Know http://arxiv.org/abs/1905.13565v3
Enhancing Stratospheric Weather Analyses and Forecasts by Deploying Sensors from a Weather Balloon http://arxiv.org/abs/1912.02276v1
Enhancing Word Embeddings with Knowledge Extracted from Lexical Resources http://arxiv.org/abs/2005.10048v1
Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing http://arxiv.org/abs/2005.13334v1
Enriching Word Embeddings with Temporal and Spatial Information http://arxiv.org/abs/2010.00761v1
Enriching Word Vectors with Subword Information http://arxiv.org/abs/1607.04606v2
Entities as Experts: Sparse Memory Access with Entity Supervision http://arxiv.org/abs/2004.07202v2
Entity Commonsense Representation for Neural Abstractive Summarization http://arxiv.org/abs/1806.05504v1
Entity Linking for Queries by Searching Wikipedia Sentences http://arxiv.org/abs/1704.02788v3
Entity Linking in 100 Languages http://arxiv.org/abs/2011.02690v1
Entity Recognition at First Sight: Improving NER with Eye Movement Information http://arxiv.org/abs/1902.10068v2
Entity-Enriched Neural Models for Clinical Question Answering http://arxiv.org/abs/2005.06587v1
Entropy Minimization In Emergent Languages http://arxiv.org/abs/1905.13687v3
Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data http://arxiv.org/abs/1903.06164v3
Equalized odds postprocessing under imperfect group information http://arxiv.org/abs/1906.03284v3
Equivariant Hamiltonian Flows http://arxiv.org/abs/1909.13739v1
Equivariant Neural Rendering http://arxiv.org/abs/2006.07630v2
Error Estimation for Sketched SVD via the Bootstrap http://arxiv.org/abs/2003.04937v1
Error bounds in estimating the out-of-sample prediction error using leave-one-out cross validation in high-dimensions http://arxiv.org/abs/2003.01770v1
Error-Bounded Correction of Noisy Labels http://arxiv.org/abs/2011.10077v1
Estimating Grape Yield on the Vine from Multiple Images http://arxiv.org/abs/2004.04278v1
Estimating Principal Components under Adversarial Perturbations http://arxiv.org/abs/2006.00602v2
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients http://arxiv.org/abs/2002.09505v2
Estimating localized complexity of white-matter wiring with GANs http://arxiv.org/abs/1910.04868v2
Estimating predictive uncertainty for rumour verification models http://arxiv.org/abs/2005.07174v1
Estimating the number and effect sizes of non-null hypotheses http://arxiv.org/abs/2002.07297v2
Estimation and Inference with Trees and Forests in High Dimensions http://arxiv.org/abs/2007.03210v2
Estimation of Bounds on Potential Outcomes For Decision Making http://arxiv.org/abs/1910.04817v4
Evaluating Agents without Rewards http://arxiv.org/abs/2012.11538v1
Evaluating Amharic Machine Translation http://arxiv.org/abs/2003.14386v1
Evaluating Attribution Methods using White-Box LSTMs http://arxiv.org/abs/2010.08606v1
Evaluating Dialogue Generation Systems via Response Selection http://arxiv.org/abs/2004.14302v1
Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? http://arxiv.org/abs/2005.01831v1
Evaluating Explanation Methods for Neural Machine Translation http://arxiv.org/abs/2005.01672v1
Evaluating Gender Bias in Machine Translation http://arxiv.org/abs/1906.00591v1
Evaluating Logical Generalization in Graph Neural Networks http://arxiv.org/abs/2003.06560v1
Evaluating Lossy Compression Rates of Deep Generative Models http://arxiv.org/abs/2008.06653v1
Evaluating Neural Morphological Taggers for Sanskrit http://arxiv.org/abs/2005.10893v1
Evaluating Robustness to Input Perturbations for Neural Machine Translation http://arxiv.org/abs/2005.00580v1
Evaluating Theory of Mind in Question Answering http://arxiv.org/abs/1808.09352v1
Evaluating and Characterizing Human Rationales http://arxiv.org/abs/2010.04736v1
Evaluating the Calibration of Knowledge Graph Embeddings for Trustworthy Link Prediction http://arxiv.org/abs/2004.01168v3
Evaluating the Factual Consistency of Abstractive Text Summarization http://arxiv.org/abs/1910.12840v1
Evaluating the Utility of Hand-crafted Features in Sequence Labelling http://arxiv.org/abs/1808.09075v1
Evaluation of Model Selection for Kernel Fragment Recognition in Corn Silage http://arxiv.org/abs/2004.00292v1
Event Extraction by Answering (Almost) Natural Questions http://arxiv.org/abs/2004.13625v1
Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks http://arxiv.org/abs/2004.13826v2
Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder http://arxiv.org/abs/2006.08101v1
Evolution-based Fine-tuning of CNNs for Prostate Cancer Detection http://arxiv.org/abs/1911.01477v1
EvolveGraph: Multi-Agent Trajectory Prediction with Dynamic Relational Reasoning http://arxiv.org/abs/2003.13924v4
Evolving Reinforcement Learning Algorithms http://arxiv.org/abs/2101.03958v1
Examination and Extension of Strategies for Improving Personalized Language Modeling via Interpolation http://arxiv.org/abs/2006.05469v1
Examining Citations of Natural Language Processing Literature http://arxiv.org/abs/2005.00912v1
Examining the State-of-the-Art in News Timeline Summarization http://arxiv.org/abs/2005.10107v1
Exclusive Hierarchical Decoding for Deep Keyphrase Generation http://arxiv.org/abs/2004.08511v1
ExpBERT: Representation Engineering with Natural Language Explanations http://arxiv.org/abs/2005.01932v1
Experience Grounds Language http://arxiv.org/abs/2004.10151v3
Experimental Evaluation and Development of a Silver-Standard for the MIMIC-III Clinical Coding Dataset http://arxiv.org/abs/2006.07332v1
Expertise Style Transfer: A New Task Towards Better Communication between Experts and Laymen http://arxiv.org/abs/2005.00701v1
Explainable Automated Fact-Checking for Public Health Claims http://arxiv.org/abs/2010.09926v1
Explainable and Discourse Topic-aware Neural Language Understanding http://arxiv.org/abs/2006.10632v2
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions http://arxiv.org/abs/2005.06676v1
Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules? http://arxiv.org/abs/1808.09551v1
Explaining Groups of Points in Low-Dimensional Representations http://arxiv.org/abs/2003.01640v3
Explaining the Explainer: A First Theoretical Analysis of LIME http://arxiv.org/abs/2001.03447v2
Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning http://arxiv.org/abs/2006.14804v3
Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation http://arxiv.org/abs/2002.02584v1
Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading http://arxiv.org/abs/2005.12484v2
Exploiting Categorical Structure Using Tree-Based Methods http://arxiv.org/abs/2004.07383v1
Exploiting Cross-Sentence Context for Neural Machine Translation http://arxiv.org/abs/1704.04347v3
Exploiting Deep Representations for Neural Machine Translation http://arxiv.org/abs/1810.10181v1
Exploiting Domain Knowledge via Grouped Weight Sharing with Application to Text Categorization http://arxiv.org/abs/1702.02535v3
Exploiting Explicit Paths for Multi-hop Reading Comprehension http://arxiv.org/abs/1811.01127v2
Exploiting Rich Syntactic Information for Semantic Parsing with Graph-to-Sequence Model http://arxiv.org/abs/1808.07624v1
Exploiting Sentence Order in Document Alignment http://arxiv.org/abs/2004.14523v2
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning http://arxiv.org/abs/2004.14224v1
Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach http://arxiv.org/abs/2005.05864v1
Exploration by Optimisation in Partial Monitoring http://arxiv.org/abs/1907.05772v3
Exploratory Analysis of COVID-19 Related Tweets in North America to Inform Public Health Institutes http://arxiv.org/abs/2007.02452v1
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills http://arxiv.org/abs/2002.03647v4
Explore, Propose, and Assemble: An Interpretable Model for Multi-Hop Reading Comprehension http://arxiv.org/abs/1906.05210v1
Exploring Author Context for Detecting Intended vs Perceived Sarcasm http://arxiv.org/abs/1910.11932v1
Exploring Content Selection in Summarization of Novel Chapters http://arxiv.org/abs/2005.01840v2
Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer http://arxiv.org/abs/2005.02049v1
Exploring Contextualized Neural Language Models for Temporal Dependency Parsing http://arxiv.org/abs/2004.14577v2
Exploring Exploration: Comparing Children with RL Agents in Unified Environments http://arxiv.org/abs/2005.02880v2
Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation http://arxiv.org/abs/1906.01199v1
Exploring Recombination for Efficient Decoding of Neural Machine Translation http://arxiv.org/abs/1808.08482v2
Exploring Semantic Capacity of Terms http://arxiv.org/abs/2010.01898v1
Exploring Weaknesses of VQA Models through Attribution Driven Insights http://arxiv.org/abs/2006.06637v2
Exploring and Predicting Transferability across NLP Tasks http://arxiv.org/abs/2005.00770v2
Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types http://arxiv.org/abs/2005.12762v2
Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation http://arxiv.org/abs/2009.09435v2
Exploring the Role of Argument Structure in Online Debate Persuasion http://arxiv.org/abs/2010.03538v1
Exploring the Role of Prior Beliefs for Argument Persuasion http://arxiv.org/abs/1906.11301v1
Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data http://arxiv.org/abs/2010.03656v1
Expressing Visual Relationships via Language http://arxiv.org/abs/1906.07689v2
Expressive Interviewing: A Conversational System for Coping with COVID-19 http://arxiv.org/abs/2007.03819v1
Expressiveness and Learning of Hidden Quantum Markov Models http://arxiv.org/abs/1912.02098v1
Extending Implicit Discourse Relation Recognition to the PDTB-3 http://arxiv.org/abs/2010.06294v1
Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples http://arxiv.org/abs/1805.06556v1
Extensively Matching for Few-shot Learning Event Detection http://arxiv.org/abs/2006.10093v1
Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation http://arxiv.org/abs/1904.02331v1
Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches http://arxiv.org/abs/2005.03035v1
Extracting Implicitly Asserted Propositions in Argumentation http://arxiv.org/abs/2010.02654v1
Extracting Symptoms and their Status from Clinical Conversations http://arxiv.org/abs/1906.02239v1
Extractive Summarization as Text Matching http://arxiv.org/abs/2004.08795v1
Extragradient with player sampling for faster Nash equilibrium finding http://arxiv.org/abs/1905.12363v5
Extrapolating the profile of a finite population http://arxiv.org/abs/2005.10561v1
Extreme Multi-label Classification from Aggregated Labels http://arxiv.org/abs/2004.00198v1
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization http://arxiv.org/abs/2005.03754v1
FFR V1.0: Fon-French Neural Machine Translation http://arxiv.org/abs/2003.12111v1
FFR v1.1: Fon-French Neural Machine Translation http://arxiv.org/abs/2006.09217v1
FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms http://arxiv.org/abs/1906.12230v1
FLAT: Chinese NER Using Flat-Lattice Transformer http://arxiv.org/abs/2004.11795v2
F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax http://arxiv.org/abs/2009.09417v2
Facebook AI's WMT20 News Translation Task Submission http://arxiv.org/abs/2011.08298v1
Facet-Aware Evaluation for Extractive Summarization http://arxiv.org/abs/1908.10383v2
Facilitating the Communication of Politeness through Fine-Grained Paraphrasing http://arxiv.org/abs/2012.00012v1
Fact or Fiction: Verifying Scientific Claims http://arxiv.org/abs/2004.14974v6
Fact-based Text Editing http://arxiv.org/abs/2007.00916v1
Factorising AMR generation through syntax http://arxiv.org/abs/1804.07707v2
Factual Error Correction for Abstractive Summarization Models http://arxiv.org/abs/2010.08712v1
Fair Bayesian Optimization http://arxiv.org/abs/2006.05109v1
Fair Correlation Clustering http://arxiv.org/abs/2002.02274v2
Fair Decisions Despite Imperfect Predictions http://arxiv.org/abs/1902.02979v4
Fair Embedding Engine: A Library for Analyzing and Mitigating Gender Bias in Word Embeddings http://arxiv.org/abs/2010.13168v1
Fair Generative Modeling via Weak Supervision http://arxiv.org/abs/1910.12008v2
Fair Learning with Private Demographic Data http://arxiv.org/abs/2002.11651v2
Fairness in the Eyes of the Data: Certifying Machine-Learning Models http://arxiv.org/abs/2009.01534v1
Fairwashing Explanations with Off-Manifold Detergent http://arxiv.org/abs/2007.09969v1
Familywise Error Rate Control by Interactive Unmasking http://arxiv.org/abs/2002.08545v3
Fast Adaptation via Policy-Dynamics Value Functions http://arxiv.org/abs/2007.02879v1
Fast Algorithms for Computational Optimal Transport and Wasserstein Barycenter http://arxiv.org/abs/1905.09952v4
Fast Differentiable Sorting and Ranking http://arxiv.org/abs/2002.08871v2
Fast Interleaved Bidirectional Sequence Generation http://arxiv.org/abs/2010.14481v1
Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case http://arxiv.org/abs/2006.14117v1
Fast Linear Convergence of Randomized BFGS http://arxiv.org/abs/2002.11337v3
Fast Markov Chain Monte Carlo Algorithms via Lie Groups http://arxiv.org/abs/1901.08606v2
Fast OSCAR and OWL Regression via Safe Screening Rules http://arxiv.org/abs/2006.16433v1
Fast Physical Activity Suggestions: Efficient Hyperparameter Learning in Mobile Health http://arxiv.org/abs/2012.11646v1
Fast Rates for Online Prediction with Abstention http://arxiv.org/abs/2001.10623v2
Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning http://arxiv.org/abs/2004.08097v1
Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation http://arxiv.org/abs/1910.04920v2
Fast and Scalable Expansion of Natural Language Understanding Functionality for Intelligent Agents http://arxiv.org/abs/1805.01542v1
Fast semantic parsing with well-typedness guarantees http://arxiv.org/abs/2009.07365v2
Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set http://arxiv.org/abs/1708.09403v1
Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees http://arxiv.org/abs/1608.04465v1
FastBERT: a Self-distilling BERT with Adaptive Inference Time http://arxiv.org/abs/2004.02178v2
FastFormers: Highly Efficient Transformer Models for Natural Language Understanding http://arxiv.org/abs/2010.13382v1
Faster Graph Embeddings via Coarsening http://arxiv.org/abs/2007.02817v3
Faster Projection-free Online Learning http://arxiv.org/abs/2001.11568v2
Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training http://arxiv.org/abs/2009.11538v3
Feature Noise Induces Loss Discrepancy Across Groups http://arxiv.org/abs/1911.09876v2
Feature Quantization Improves GAN Training http://arxiv.org/abs/2004.02088v2
Feature Selection using Stochastic Gates http://arxiv.org/abs/1810.04247v7
Feature relevance quantification in explainable AI: A causal problem http://arxiv.org/abs/1910.13413v2
Feature-map-level Online Adversarial Knowledge Distillation http://arxiv.org/abs/2002.01775v3
FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization http://arxiv.org/abs/1909.13014v4
Federated Heavy Hitters Discovery with Differential Privacy http://arxiv.org/abs/1902.08534v4
Federated Learning with Only Positive Labels http://arxiv.org/abs/2004.10342v1
Fenchel Lifted Networks: A Lagrange Relaxation of Neural Network Training http://arxiv.org/abs/1811.08039v3
FetchSGD: Communication-Efficient Federated Learning with Sketching http://arxiv.org/abs/2007.07682v2
Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning http://arxiv.org/abs/2010.15877v1
Few-Shot Learning for Opinion Summarization http://arxiv.org/abs/2004.14884v3
Few-Shot NLG with Pre-Trained Language Model http://arxiv.org/abs/1904.09521v3
Few-shot Domain Adaptation by Causal Mechanism Transfer http://arxiv.org/abs/2002.03497v2
Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs http://arxiv.org/abs/2007.02387v1
Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network http://arxiv.org/abs/2006.05702v1
Few-shot link prediction via graph neural networks for Covid-19 drug-repurposing http://arxiv.org/abs/2007.10261v1
FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation http://arxiv.org/abs/1810.10147v2
Fiduciary Bandits http://arxiv.org/abs/1905.07043v3
Fiedler Regularization: Learning Neural Networks with Graph Sparsity http://arxiv.org/abs/2003.00992v3
Field-Level Crop Type Classification with k Nearest Neighbors: A Baseline for a New Kenya Smallholder Dataset http://arxiv.org/abs/2004.03023v1
Fill in the BLANC: Human-free quality estimation of document summaries http://arxiv.org/abs/2002.09836v2
Filling Missing Paths: Modeling Co-occurrences of Word Pairs and Dependency Paths for Recognizing Lexical Semantic Relations http://arxiv.org/abs/1809.03411v1
Filtering Noisy Dialogue Corpora by Connectivity and Content Relatedness http://arxiv.org/abs/2004.14008v2
FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance http://arxiv.org/abs/2011.09607v1
Finding Convincing Arguments Using Scalable Bayesian Preference Learning http://arxiv.org/abs/1806.02418v1
Finding Syntax in Human Encephalography with Beam Search http://arxiv.org/abs/1806.04127v1
Finding Universal Grammatical Relations in Multilingual BERT http://arxiv.org/abs/2005.04511v2
Finding Your Voice: The Linguistic Development of Mental Health Counselors http://arxiv.org/abs/1906.07194v1
Finding trainable sparse networks through Neural Tangent Transfer http://arxiv.org/abs/2006.08228v2
Fine Grained Citation Span for References in Wikipedia http://arxiv.org/abs/1707.07278v1
Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences http://arxiv.org/abs/2005.03436v2
Fine-Grained Prediction of Syntactic Typology: Discovering Latent Structure with Supervised Learning http://arxiv.org/abs/1710.03877v1
Fine-Grained Temporal Relation Extraction http://arxiv.org/abs/1902.01390v2
Fine-grained Fact Verification with Kernel Graph Attention Network http://arxiv.org/abs/1910.09796v3
Fine-grained linguistic evaluation for state-of-the-art Machine Translation http://arxiv.org/abs/2010.06359v2
Finite Regret and Cycles with Fixed Step-Size via Alternating Gradient Descent-Ascent http://arxiv.org/abs/1907.04392v1
Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise http://arxiv.org/abs/2002.01268v1
Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation http://arxiv.org/abs/1911.00934v2
Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning http://arxiv.org/abs/2002.00260v1
Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games http://arxiv.org/abs/2002.09806v4
Fixed-Confidence Guarantees for Bayesian Best-Arm Identification http://arxiv.org/abs/1910.10945v3
Flexible and Efficient Long-Range Planning Through Curious Exploration http://arxiv.org/abs/2004.10876v2
Flexible retrieval with NMSLIB and FlexNeuART http://arxiv.org/abs/2010.14848v2
Flow Models for Arbitrary Conditional Likelihoods http://arxiv.org/abs/1909.06319v2
Fluent Response Generation for Conversational Question Answering http://arxiv.org/abs/2005.10464v2
Forecasting Sequential Data using Consistent Koopman Autoencoders http://arxiv.org/abs/2003.02236v2
Formal Limitations on the Measurement of Mutual Information http://arxiv.org/abs/1811.04251v4
Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages http://arxiv.org/abs/1804.06024v1
Fortifying Toxic Speech Detectors Against Veiled Toxicity http://arxiv.org/abs/2010.03154v1
Fractal Gaussian Networks: A sparse random graph model based on Gaussian Multiplicative Chaos http://arxiv.org/abs/2008.03038v1
Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise http://arxiv.org/abs/2002.05685v2
Free Energy Wells and Overlap Gap Property in Sparse PCA http://arxiv.org/abs/2006.10689v1
Frequency Bias in Neural Networks for Input of Non-Uniform Density http://arxiv.org/abs/2003.04560v1
Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions http://arxiv.org/abs/2006.13707v2
Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts http://arxiv.org/abs/1704.07828v2
From Arguments to Key Points: Towards Automatic Argument Summarization http://arxiv.org/abs/2005.01619v2
From Data to Decisions: Distributionally Robust Optimization is Optimal http://arxiv.org/abs/1704.04118v3
From Dataset Recycling to Multi-Property Extraction and Beyond http://arxiv.org/abs/2011.03228v1
From English to Code-Switching: Transfer Learning with Strong Morphological Clues http://arxiv.org/abs/1909.05158v3
From ImageNet to Image Classification: Contextualizing Progress on Benchmarks http://arxiv.org/abs/2005.11295v1
From Importance Sampling to Doubly Robust Policy Gradient http://arxiv.org/abs/1910.09066v3
From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood http://arxiv.org/abs/1704.07926v1
From Machine Reading Comprehension to Dialogue State Tracking: Bridging the Gap http://arxiv.org/abs/2004.05827v1
From Nesterov's Estimate Sequence to Riemannian Acceleration http://arxiv.org/abs/2001.08876v1
From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model http://arxiv.org/abs/1903.00558v2
From Paraphrase Database to Compositional Paraphrase Model and Back http://arxiv.org/abs/1506.03487v2
From Predictions to Decisions: Using Lookahead Regularization http://arxiv.org/abs/2006.11638v2
From Speech-to-Speech Translation to Automatic Dubbing http://arxiv.org/abs/2001.06785v3
From tree matching to sparse graph alignment http://arxiv.org/abs/2002.01258v2
Frowning Frodo, Wincing Leia, and a Seriously Great Friendship: Learning to Classify Emotional Relationships of Fictional Characters http://arxiv.org/abs/1903.12453v2
Frustratingly Hard Evidence Retrieval for QA Over Books http://arxiv.org/abs/2007.09878v1
Frustratingly Simple Few-Shot Object Detection http://arxiv.org/abs/2003.06957v1
Fully Character-Level Neural Machine Translation without Explicit Segmentation http://arxiv.org/abs/1610.03017v3
Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs http://arxiv.org/abs/1901.08460v4
Fully Parallel Hyperparameter Search: Reshaped Space-Filling http://arxiv.org/abs/1910.08406v2
Fully reversible neural networks for large-scale surface and sub-surface characterization via remote sensing http://arxiv.org/abs/2003.07474v1
Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations http://arxiv.org/abs/2002.04599v2
GANterpretations http://arxiv.org/abs/2011.05158v1
GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media http://arxiv.org/abs/2004.11648v1
GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification http://arxiv.org/abs/1908.01843v1
GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation http://arxiv.org/abs/1906.12192v5
GP-VAE: Deep Probabilistic Time Series Imputation http://arxiv.org/abs/1907.04155v5
GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems http://arxiv.org/abs/2010.03994v1
Gaining Insight into SARS-CoV-2 Infection and COVID-19 Severity Using Self-supervised Edge Features and Graph Neural Networks http://arxiv.org/abs/2006.12971v2
Games for Fairness and Interpretability http://arxiv.org/abs/2004.09551v1
Gamification of Pure Exploration for Linear Bandits http://arxiv.org/abs/2007.00953v1
Gated Convolutional Bidirectional Attention-based Model for Off-topic Spoken Response Detection http://arxiv.org/abs/2004.09036v4
Gaussian Mixture Latent Vector Grammars http://arxiv.org/abs/1805.04688v1
Gaussian Sketching yields a J-L Lemma in RKHS http://arxiv.org/abs/1908.05818v2
Gaussianization Flows http://arxiv.org/abs/2003.01941v1
GenAug: Data Augmentation for Finetuning Text Generators http://arxiv.org/abs/2010.01794v2
Gender Bias in Contextualized Word Embeddings http://arxiv.org/abs/1904.03310v1
Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer http://arxiv.org/abs/2005.00699v1
Gender Coreference and Bias Evaluation at WMT 2020 http://arxiv.org/abs/2010.06018v1
Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations http://arxiv.org/abs/2005.00962v2
Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus http://arxiv.org/abs/2006.05754v1
Gender-preserving Debiasing for Pre-trained Word Embeddings http://arxiv.org/abs/1906.00742v1
General Identification of Dynamic Treatment Regimes Under Interference http://arxiv.org/abs/2004.01218v1
Generalisation error in learning with random features and the hidden manifold model http://arxiv.org/abs/2002.09339v2
Generalization Error of Generalized Linear Models in High Dimensions http://arxiv.org/abs/2005.00180v1
Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features http://arxiv.org/abs/2002.04195v1
Generalization and Representational Limits of Graph Neural Networks http://arxiv.org/abs/2002.06157v1
Generalization to New Actions in Reinforcement Learning http://arxiv.org/abs/2011.01928v1
Generalized Data Augmentation for Low-Resource Translation http://arxiv.org/abs/1906.03785v1
Generalized and Scalable Optimal Sparse Decision Trees http://arxiv.org/abs/2006.08690v3
Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data http://arxiv.org/abs/2002.12880v3
Generalizing Natural Language Analysis through Span-relation Representations http://arxiv.org/abs/1911.03822v2
Generalizing Word Embeddings using Bag of Subwords http://arxiv.org/abs/1809.04259v1
Generalizing and Hybridizing Count-based and Neural Language Models http://arxiv.org/abs/1606.00499v2
Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation http://arxiv.org/abs/2004.07672v4
Generating Automatic Curricula via Self-Supervised Active Domain Randomization http://arxiv.org/abs/2002.07911v2
Generating Counter Narratives against Online Hate Speech: Data and Strategies http://arxiv.org/abs/2004.04216v1
Generating Dialogue Responses from a Semantic Latent Space http://arxiv.org/abs/2010.01658v1
Generating Diverse Translation from Model Distribution with Dropout http://arxiv.org/abs/2010.08178v1
Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs http://arxiv.org/abs/2005.13837v5
Generating Fact Checking Briefs http://arxiv.org/abs/2011.05448v1
Generating Fact Checking Explanations http://arxiv.org/abs/2004.05773v1
Generating Fine-Grained Open Vocabulary Entity Type Descriptions http://arxiv.org/abs/1805.10564v1
Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection http://arxiv.org/abs/2004.02015v3
Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze http://arxiv.org/abs/2011.04592v1
Generating Label Cohesive and Well-Formed Adversarial Claims http://arxiv.org/abs/2009.08205v1
Generating Logical Forms from Graph Representations of Text and Entities http://arxiv.org/abs/1905.08407v3
Generating Narrative Text in a Switching Dynamical System http://arxiv.org/abs/2004.03762v1
Generating Negative Commonsense Knowledge http://arxiv.org/abs/2011.07497v1
Generating Novel Glyph without Human Data by Learning to Communicate http://arxiv.org/abs/2010.04402v2
Generating Question Relevant Captions to Aid Visual Question Answering http://arxiv.org/abs/1906.00513v3
Generating Radiology Reports via Memory-driven Transformer http://arxiv.org/abs/2010.16056v1
Generating Sentences by Editing Prototypes http://arxiv.org/abs/1709.08878v2
Generating Summaries with Topic Templates and Structured Convolutional Decoders http://arxiv.org/abs/1906.04687v1
Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution http://arxiv.org/abs/1606.01603v3
Generative Adversarial Imitation from Observation http://arxiv.org/abs/1807.06158v4
Generative Adversarial User Privacy in Lossy Single-Server Information Retrieval http://arxiv.org/abs/2012.03902v1
Generative Flows with Matrix Exponential http://arxiv.org/abs/2007.09651v1
Generative ODE Modeling with Known Unknowns http://arxiv.org/abs/2003.10775v1
Generative Semantic Hashing Enhanced via Boltzmann Machines http://arxiv.org/abs/2006.08858v1
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data http://arxiv.org/abs/1912.07768v1
Geometric Dataset Distances via Optimal Transport http://arxiv.org/abs/2002.02923v1
Geometry-aware Domain Adaptation for Unsupervised Alignment of Word Embeddings http://arxiv.org/abs/2004.08243v2
Geoopt: Riemannian Optimization in PyTorch http://arxiv.org/abs/2005.02819v5
Getting a CLUE: A Method for Explaining Uncertainty Estimates http://arxiv.org/abs/2006.06848v1
Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? http://arxiv.org/abs/2005.13213v1
Giving Attention to the Unexpected: Using Prosody Innovations in Disfluency Detection http://arxiv.org/abs/1904.04388v1
Global Neural CCG Parsing with Optimality Guarantees http://arxiv.org/abs/1607.01432v2
Global-to-Local Neural Networks for Document-Level Relation Extraction http://arxiv.org/abs/2009.10359v1
Globally Normalized Reader http://arxiv.org/abs/1709.02828v1
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks http://arxiv.org/abs/2007.00811v2
Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection http://arxiv.org/abs/2003.01794v3
Good-Enough Compositional Data Augmentation http://arxiv.org/abs/1904.09545v4
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing http://arxiv.org/abs/2009.13845v1
Gradient Based Memory Editing for Task-Free Continual Learning http://arxiv.org/abs/2006.15294v1
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks http://arxiv.org/abs/1903.11680v3
Gradient Temporal-Difference Learning with Regularized Corrections http://arxiv.org/abs/2007.00611v4
Gradient descent algorithms for Bures-Wasserstein barycenters http://arxiv.org/abs/2001.01700v2
Gradient descent follows the regularization path for general losses http://arxiv.org/abs/2006.11226v1
Gradient-free Online Learning in Games with Delayed Rewards http://arxiv.org/abs/2006.10911v1
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values http://arxiv.org/abs/2001.11113v7
Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses http://arxiv.org/abs/2010.07574v1
Graph Clustering with Graph Neural Networks http://arxiv.org/abs/2006.16904v1
Graph Coarsening with Preserved Spectral Properties http://arxiv.org/abs/1802.04447v2
Graph Convolutional Gaussian Processes For Link Prediction http://arxiv.org/abs/2002.04337v1
Graph Convolutional Network for Recommendation with Low-pass Collaborative Filters http://arxiv.org/abs/2006.15516v2
Graph Convolutions over Constituent Trees for Syntax-Aware Semantic Role Labeling http://arxiv.org/abs/1909.09814v3
Graph DNA: Deep Neighborhood Aware Graph Encoding for Collaborative Filtering http://arxiv.org/abs/1905.12217v1
Graph Filtration Learning http://arxiv.org/abs/1905.10996v2
Graph Homomorphism Convolution http://arxiv.org/abs/2005.01214v2
Graph Learning for Inverse Landscape Genetics http://arxiv.org/abs/2006.12334v2
Graph Neural Networks for Massive MIMO Detection http://arxiv.org/abs/2007.05703v1
Graph Neural Networks for the Prediction of Substrate-Specific Organic Reaction Conditions http://arxiv.org/abs/2007.04275v2
Graph Neural Networks in TensorFlow and Keras with Spektral http://arxiv.org/abs/2006.12138v1
Graph Optimal Transport for Cross-Domain Alignment http://arxiv.org/abs/2006.14744v3
Graph Pattern Entity Ranking Model for Knowledge Graph Completion http://arxiv.org/abs/1904.02856v1
Graph Random Neural Features for Distance-Preserving Graph Representations http://arxiv.org/abs/1909.03790v3
Graph Structure of Neural Networks http://arxiv.org/abs/2007.06559v2
Graph based Neural Networks for Event Factuality Prediction using Syntactic and Semantic Structures http://arxiv.org/abs/1907.03227v1
Graph neural induction of value iteration http://arxiv.org/abs/2009.12604v1
Graph-based Nearest Neighbor Search: From Practice to Theory http://arxiv.org/abs/1907.00845v4
Graph-based, Self-Supervised Program Repair from Diagnostic Feedback http://arxiv.org/abs/2005.10636v2
GraphDialog: Integrating Graph Knowledge into End-to-End Task-Oriented Dialogue Systems http://arxiv.org/abs/2010.01447v1
GraphOpt: Learning Optimization Models of Graph Formation http://arxiv.org/abs/2007.03619v1
Graphs, Entities, and Step Mixture http://arxiv.org/abs/2005.08485v2
Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection http://arxiv.org/abs/1709.00575v1
Greedy Search with Probabilistic N-gram Matching for Neural Machine Translation http://arxiv.org/abs/1809.03132v1
Gromov-Wasserstein Alignment of Word Embedding Spaces http://arxiv.org/abs/1809.00013v1
Grounded Adaptation for Zero-shot Executable Semantic Parsing http://arxiv.org/abs/2009.07396v2
Grounded Compositional Outputs for Adaptive Language Modeling http://arxiv.org/abs/2009.11523v2
Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs http://arxiv.org/abs/1911.02707v3
Grounding Conversations with Improvised Dialogues http://arxiv.org/abs/2004.09544v2
Group Equivariant Deep Reinforcement Learning http://arxiv.org/abs/2007.03437v1
Growing Action Spaces http://arxiv.org/abs/1906.12266v1
Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation http://arxiv.org/abs/2006.04050v1
Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis http://arxiv.org/abs/1906.09231v2
Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization http://arxiv.org/abs/2006.16840v1
Guiding Attention for Self-Supervised Learning with Transformers http://arxiv.org/abs/2010.02399v1
Guiding Variational Response Generator to Exploit Persona http://arxiv.org/abs/1911.02390v2
HABERTOR: An Efficient and Effective Deep Hatespeech Detector http://arxiv.org/abs/2010.08865v1
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing http://arxiv.org/abs/2005.14187v1
HEAD-QA: A Healthcare Dataset for Complex Reasoning http://arxiv.org/abs/1906.04701v1
HENIN: Learning Heterogeneous Neural Interaction Networks for Explainable Cyberbullying Detection on Social Media http://arxiv.org/abs/2010.04576v1
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training http://arxiv.org/abs/2005.00200v2
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization http://arxiv.org/abs/1905.06566v1
HNHN: Hypergraph Networks with Hyperedge Neurons http://arxiv.org/abs/2006.12278v1
Haar Graph Pooling http://arxiv.org/abs/1909.11580v3
Haar Wavelet based Block Autoregressive Flows for Trajectories http://arxiv.org/abs/2009.09878v1
Hallucinative Topological Memory for Zero-Shot Visual Planning http://arxiv.org/abs/2002.12336v1
Halpern Iteration for Near-Optimal and Parameter-Free Monotone Inclusion and Strong Solutions to Variational Inequalities http://arxiv.org/abs/2002.08872v3
Hamiltonian Graph Networks with ODE Integrators http://arxiv.org/abs/1909.12790v1
Hamiltonian Monte Carlo Swindles http://arxiv.org/abs/2001.05033v2
Handling Divergent Reference Texts when Evaluating Table-to-Text Generation http://arxiv.org/abs/1906.01081v1
Handling Noisy Labels for Robustly Learning from Self-Training Data for Low-Resource Sequence Labeling http://arxiv.org/abs/1903.12008v1
Handling the Positive-Definite Constraint in the Bayesian Learning Rule http://arxiv.org/abs/2002.10060v13
Hard-Coded Gaussian Attention for Neural Machine Translation http://arxiv.org/abs/2005.00742v1
Hardness of Identity Testing for Restricted Boltzmann Machines and Potts models http://arxiv.org/abs/2004.10805v1
Harmonic Decompositions of Convolutional Networks http://arxiv.org/abs/2003.12756v2
Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity http://arxiv.org/abs/2011.02614v1
Harnessing the linguistic signal to predict scalar inferences http://arxiv.org/abs/1910.14254v2
Harry Potter and the Action Prediction Challenge from Natural Language http://arxiv.org/abs/1905.11037v1
Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia http://arxiv.org/abs/1805.05942v1
Harvesting and Refining Question-Answer Pairs for Unsupervised QA http://arxiv.org/abs/2005.02925v1
Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation http://arxiv.org/abs/1808.07048v1
Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora http://arxiv.org/abs/1806.03191v1
Helping Reduce Environmental Impact of Aviation with Machine Learning http://arxiv.org/abs/2012.09433v1
Hermitian matrices for clustering directed graphs: insights and applications http://arxiv.org/abs/1908.02096v1
Heterogeneous Graph Neural Networks for Extractive Document Summarization http://arxiv.org/abs/2004.12393v1
Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach http://arxiv.org/abs/1707.00166v2
Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization http://arxiv.org/abs/2006.15766v1
Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling http://arxiv.org/abs/2012.12803v2
Hierarchical Clustering: a 0.585 Revenue Approximation http://arxiv.org/abs/2006.01933v1
Hierarchical Entity Typing via Multi-level Learning to Rank http://arxiv.org/abs/2004.02286v1
Hierarchical Evidence Set Modeling for Automated Fact Extraction and Verification http://arxiv.org/abs/2010.05111v1
Hierarchical Generation of Molecular Graphs using Structural Motifs http://arxiv.org/abs/2002.03230v2
Hierarchical Graph Network for Multi-hop Question Answering http://arxiv.org/abs/1911.03631v4
Hierarchical Inter-Message Passing for Learning on Molecular Graphs http://arxiv.org/abs/2006.12179v1
Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking http://arxiv.org/abs/1807.05127v1
Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts http://arxiv.org/abs/1808.06161v1
Hierarchical Neural Story Generation http://arxiv.org/abs/1805.04833v1
Hierarchical Protein Function Prediction with Tail-GNNs http://arxiv.org/abs/2007.12804v1
Hierarchical Quantized Representations for Script Generation http://arxiv.org/abs/1808.09542v1
Hierarchical Structured Model for Fine-to-coarse Manifesto Text Analysis http://arxiv.org/abs/1805.02823v1
Hierarchical Transformers for Multi-Document Summarization http://arxiv.org/abs/1905.13164v1
Hierarchical Verification for Adversarial Robustness http://arxiv.org/abs/2007.11826v1
Hierarchically Decoupled Imitation for Morphological Transfer http://arxiv.org/abs/2003.01709v2
High Dimensional Robust Sparse Regression http://arxiv.org/abs/1805.11643v3
High Resolution Medical Image Analysis with Spatial Partitioning http://arxiv.org/abs/1909.03108v3
High-Dimensional Robust Mean Estimation via Gradient Descent http://arxiv.org/abs/2005.01378v1
HighRES: Highlight-based Reference-less Evaluation of Summarization http://arxiv.org/abs/1906.01361v1
Higher-order Coreference Resolution with Coarse-to-fine Inference http://arxiv.org/abs/1804.05392v1
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks http://arxiv.org/abs/2004.08178v5
History for Visual Dialog: Do we really need it? http://arxiv.org/abs/2005.07493v1
History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms http://arxiv.org/abs/1910.09670v4
Hooks in the Headline: Learning to Generate Headlines with Controlled Styles http://arxiv.org/abs/2004.01980v3
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering http://arxiv.org/abs/1809.09600v1
How Can We Accelerate Progress Towards Human-like Linguistic Generalization? http://arxiv.org/abs/2005.00955v1
How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence http://arxiv.org/abs/2004.12158v5
How Does Selective Mechanism Improve Self-Attention Networks? http://arxiv.org/abs/2005.00979v1
How Furiously Can Colourless Green Ideas Sleep? Sentence Acceptability in Context http://arxiv.org/abs/2004.00881v1
How Good is the Bayes Posterior in Deep Neural Networks Really? http://arxiv.org/abs/2002.02405v2
How Large a Vocabulary Does Text Classification Need? A Variational Approach to Vocabulary Selection http://arxiv.org/abs/1902.10339v4
How Much Knowledge Can You Pack Into the Parameters of a Language Model? http://arxiv.org/abs/2002.08910v4
How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks http://arxiv.org/abs/1808.04926v2
How To Backdoor Federated Learning http://arxiv.org/abs/1807.00459v3
How agents see things: On visual representations in an emergent language game http://arxiv.org/abs/1808.10696v2
How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking http://arxiv.org/abs/2004.14992v2
How much complexity does an RNN architecture need to learn syntax-sensitive dependencies? http://arxiv.org/abs/2005.08199v2
How multilingual is Multilingual BERT? http://arxiv.org/abs/1906.01502v1
How recurrent networks implement contextual processing in sentiment analysis http://arxiv.org/abs/2004.08013v1
How to Grow a (Product) Tree: Personalized Category Suggestions for eCommerce Type-Ahead http://arxiv.org/abs/2005.12781v1
How to Make Deep RL Work in Practice http://arxiv.org/abs/2010.13083v2
How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation http://arxiv.org/abs/2006.09109v2
How to trap a gradient flow http://arxiv.org/abs/2001.02968v3
How well does surprisal explain N400 amplitude under different experimental conditions? http://arxiv.org/abs/2010.04844v1
Howl: A Deployed, Open-Source Wake Word Detection System http://arxiv.org/abs/2008.09606v1
Human computation requires and enables a new approach to ethical review http://arxiv.org/abs/2011.10754v1
Human-Like Active Learning: Machines Simulating the Human Learning Process http://arxiv.org/abs/2011.03733v1
Human-Paraphrased References Improve Neural Machine Translation http://arxiv.org/abs/2010.10245v1
Human-centric Dialog Training via Offline Reinforcement Learning http://arxiv.org/abs/2010.05848v1
Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning http://arxiv.org/abs/1702.03274v2
Hybrid Session-based News Recommendation using Recurrent Neural Networks http://arxiv.org/abs/2006.13063v1
Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization http://arxiv.org/abs/2009.09835v1
HydroNets: Leveraging River Structure for Hydrologic Modeling http://arxiv.org/abs/2007.00595v1
Hyper-spectral NIR and MIR data and optimal wavebands for detection of apple tree diseases http://arxiv.org/abs/2004.02325v3
Hyperbolic Manifold Regression http://arxiv.org/abs/2005.13885v1
Hypernetwork approach to generating point clouds http://arxiv.org/abs/2003.00802v2
Hyperparameter Auto-tuning in Self-Supervised Robotic Learning http://arxiv.org/abs/2010.08252v3
Hypothesis Testing Interpretations and Renyi Differential Privacy http://arxiv.org/abs/1905.09982v2
ID3 Learns Juntas for Smoothed Product Distributions http://arxiv.org/abs/1906.08654v1
IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation http://arxiv.org/abs/2011.05744v1
IIRC: A Dataset of Incomplete Information Reading Comprehension Questions http://arxiv.org/abs/2011.07127v1
IMHO Fine-Tuning Improves Claim Detection http://arxiv.org/abs/1905.07000v1
IMoJIE: Iterative Memory-Based Joint Open Information Extraction http://arxiv.org/abs/2005.08178v1
INFOTABS: Inference on Tables as Semi-structured Data http://arxiv.org/abs/2005.06117v1
INSET: Sentence Infilling with INter-SEntential Transformer http://arxiv.org/abs/1911.03892v2
INSPIRED: Toward Sociable Recommendation Dialog Systems http://arxiv.org/abs/2009.14306v2
IROF: a low resource evaluation metric for explanation methods http://arxiv.org/abs/2003.08747v1
IV-Posterior: Inverse Value Estimation for Interpretable Policy Certificates http://arxiv.org/abs/2012.01925v1
Identifying Semantic Divergences in Parallel Text without Annotations http://arxiv.org/abs/1803.11112v1
Identifying and Correcting Label Bias in Machine Learning http://arxiv.org/abs/1901.04966v1
Identifying and Reducing Gender Bias in Word-Level Language Models http://arxiv.org/abs/1904.03035v1
Identifying civilians killed by police with distantly supervised entity-event extraction http://arxiv.org/abs/1707.07086v1
If MaxEnt RL is the Answer, What is the Question? http://arxiv.org/abs/1910.01913v1
If beam search is the answer, what was the question? http://arxiv.org/abs/2010.02650v1
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels http://arxiv.org/abs/2004.13649v3
Image Generation With Neural Cellular Automatas http://arxiv.org/abs/2010.04949v2
Image Pivoting for Learning Multilingual Multimodal Representations http://arxiv.org/abs/1707.07601v1
Image-based phenotyping of diverse Rice (Oryza Sativa L.) Genotypes http://arxiv.org/abs/2004.02498v1
Imitation Attacks and Defenses for Black-box Machine Translation Systems http://arxiv.org/abs/2004.15015v3
Imitation Learning Approach for AI Driving Olympics Trained on Real-world and Simulation Data Simultaneously http://arxiv.org/abs/2007.03514v1
Imitation Learning for Neural Morphological String Transduction http://arxiv.org/abs/1808.10701v1
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss http://arxiv.org/abs/2002.04486v4
Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation http://arxiv.org/abs/2006.04996v1
Implicit Generative Modeling for Efficient Exploration http://arxiv.org/abs/1911.08017v3
Implicit Geometric Regularization for Learning Shapes http://arxiv.org/abs/2002.10099v2
Implicit Regularization of Random Feature Models http://arxiv.org/abs/2002.08404v2
Implicit competitive regularization in GANs http://arxiv.org/abs/1910.05852v4
Implicit regularization and solution uniqueness in over-parameterized matrix sensing http://arxiv.org/abs/1806.02046v2
Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process http://arxiv.org/abs/1904.09080v2
Improper Learning for Non-Stochastic Control http://arxiv.org/abs/2001.09254v3
Improved Natural Language Generation via Loss Truncation http://arxiv.org/abs/2004.14589v2
Improved Neural Relation Detection for Knowledge Base Question Answering http://arxiv.org/abs/1704.06194v2
Improved Optimistic Algorithms for Logistic Bandits http://arxiv.org/abs/2002.07530v2
Improved Regret Bounds for Projection-free Bandit Convex Optimization http://arxiv.org/abs/1910.03374v1
Improved Relation Extraction with Feature-Rich Compositional Embedding Models http://arxiv.org/abs/1505.02419v3
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks http://arxiv.org/abs/1503.00075v3
Improved Semantic-Aware Network Embedding with Fine-Grained Word Alignment http://arxiv.org/abs/1808.09633v1
Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text http://arxiv.org/abs/1906.05725v1
Improved Speech Representations with Multi-Target Autoregressive Predictive Coding http://arxiv.org/abs/2004.05274v1
Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs http://arxiv.org/abs/1508.00657v2
Improving AMR Parsing with Sequence-to-Sequence Pre-training http://arxiv.org/abs/2010.01771v1
Improving Abstraction in Text Summarization http://arxiv.org/abs/1808.07913v1
Improving Adversarial Text Generation by Modeling the Distant Future http://arxiv.org/abs/2005.01279v1
Improving Candidate Generation for Low-resource Cross-lingual Entity Linking http://arxiv.org/abs/2003.01343v1
Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation http://arxiv.org/abs/1804.06506v1
Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining http://arxiv.org/abs/2009.11321v1
Improving Dialogue State Tracking by Discerning the Relevant Context http://arxiv.org/abs/1904.02800v1
Improving Disentangled Text Representation Learning with Information-Theoretic Guidance http://arxiv.org/abs/2006.00693v2
Improving Disfluency Detection by Self-Training a Self-Attentive Model http://arxiv.org/abs/2004.05323v2
Improving Domain Adaptation Translation with Domain Invariant and Specific Information http://arxiv.org/abs/1904.03879v2
Improving Generalization by Controlling Label-Noise Information in Neural Network Weights http://arxiv.org/abs/2002.07933v2
Improving Generative Imagination in Object-Centric World Models http://arxiv.org/abs/2010.02054v1
Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data http://arxiv.org/abs/1903.00138v3
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting http://arxiv.org/abs/2006.09252v2
Improving Human Text Comprehension through Semi-Markov CRF-based Neural Section Title Generation http://arxiv.org/abs/1904.07142v1
Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning http://arxiv.org/abs/1603.07954v3
Improving Knowledge Graph Embedding Using Simple Constraints http://arxiv.org/abs/1805.02408v2
Improving Lemmatization of Non-Standard Languages with Joint Learning http://arxiv.org/abs/1903.06939v1
Improving Lexical Choice in Neural Machine Translation http://arxiv.org/abs/1710.01329v3
Improving Machine Reading Comprehension with General Reading Strategies http://arxiv.org/abs/1810.13441v2
Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation http://arxiv.org/abs/2004.11867v1
Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation http://arxiv.org/abs/2007.06018v1
Improving Molecular Design by Stochastic Iterative Target Augmentation http://arxiv.org/abs/2002.04720v2
Improving Multi-turn Dialogue Modelling with Utterance ReWriter http://arxiv.org/abs/1906.07004v1
Improving Multilingual Models with Language-Clustered Vocabularies http://arxiv.org/abs/2010.12777v1
Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping http://arxiv.org/abs/1707.02459v1
Improving Neural Conversational Models with Entropy-Based Data Filtering http://arxiv.org/abs/1905.05471v3
Improving Neural Parsing by Disentangling Model Combination and Reranking Effects http://arxiv.org/abs/1707.03058v1
Improving Non-autoregressive Neural Machine Translation with Monolingual Data http://arxiv.org/abs/2005.00932v3
Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader http://arxiv.org/abs/1905.07098v2
Improving Robustness of Deep-Learning-Based Image Reconstruction http://arxiv.org/abs/2002.11821v1
Improving Segmentation for Technical Support Problems http://arxiv.org/abs/2005.11055v1
Improving Slot Filling by Utilizing Contextual Information http://arxiv.org/abs/1911.01680v2
Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance http://arxiv.org/abs/2010.06150v1
Improving Text Generation with Student-Forcing Optimal Transport http://arxiv.org/abs/2010.05994v1
Improving Topic Models with Latent Feature Word Representations http://arxiv.org/abs/1810.06306v1
Improving Transformer Models by Reordering their Sublayers http://arxiv.org/abs/1911.03864v2
Improving Truthfulness of Headline Generation http://arxiv.org/abs/2005.00882v2
Improving Unsupervised Word-by-Word Translation with Language Model and Denoising Autoencoder http://arxiv.org/abs/1901.01590v1
Improving Yorùbá Diacritic Restoration http://arxiv.org/abs/2003.10564v1
Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback http://arxiv.org/abs/1805.01252v2
Improving fairness in machine learning systems: What do industry practitioners need? http://arxiv.org/abs/1812.05239v2
Improving robustness against common corruptions by covariate shift adaptation http://arxiv.org/abs/2006.16971v2
Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction http://arxiv.org/abs/2010.03260v1
Improving the Gating Mechanism of Recurrent Neural Networks http://arxiv.org/abs/1910.09890v2
Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization http://arxiv.org/abs/1906.00072v1
Imputation estimators for unnormalized models with missing data http://arxiv.org/abs/1903.03630v2
Imputer: Sequence Modelling via Imputation and Dynamic Programming http://arxiv.org/abs/2002.08926v2
In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology http://arxiv.org/abs/2005.13575v1
In-domain representation learning for remote sensing http://arxiv.org/abs/1911.06721v1
Incentive-Compatible Forecasting Competitions http://arxiv.org/abs/2101.01816v1
Incentives for Federated Learning: a Hypothesis Elicitation Approach http://arxiv.org/abs/2007.10596v1
Incidence Networks for Geometric Deep Learning http://arxiv.org/abs/1905.11460v4
Incomplete Utterance Rewriting as Semantic Segmentation http://arxiv.org/abs/2009.13166v1
Incorporate Semantic Structures into Machine Translation Evaluation via UCCA http://arxiv.org/abs/2010.08728v2
Incorporating Behavioral Hypotheses for Query Generation http://arxiv.org/abs/2010.02667v1
Incorporating External Knowledge through Pre-training for Natural Language to Code Generation http://arxiv.org/abs/2004.09015v1
Incorporating Subword Information into Matrix Factorization Word Embeddings http://arxiv.org/abs/1805.03710v1
Incorporating Terminology Constraints in Automatic Post-Editing http://arxiv.org/abs/2010.09608v1
Incorporating Uncertain Segmentation Information into Chinese NER for Social Media Text http://arxiv.org/abs/2004.06384v2
Incorporating a Local Translation Mechanism into Non-autoregressive Translation http://arxiv.org/abs/2011.06132v1
Increasing performance of electric vehicles in ride-hailing services using deep reinforcement learning http://arxiv.org/abs/1912.03408v1
Incremental Neural Coreference Resolution in Constant Memory http://arxiv.org/abs/2005.00128v2
Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU http://arxiv.org/abs/2010.05330v1
Incremental Sampling Without Replacement for Sequence Models http://arxiv.org/abs/2002.09067v1
Incremental Transformer with Deliberation Decoder for Document Grounded Conversations http://arxiv.org/abs/1907.08854v3
Independent Subspace Analysis for Unsupervised Learning of Disentangled Representations http://arxiv.org/abs/1909.05063v1
Individual Calibration with Randomized Forecasting http://arxiv.org/abs/2006.10288v3
Induced Inflection-Set Keyword Search in Speech http://arxiv.org/abs/1910.12299v2
Inductive Relation Prediction by Subgraph Reasoning http://arxiv.org/abs/1911.06962v2
Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization http://arxiv.org/abs/1903.01818v3
Inexact Tensor Methods with Dynamic Accuracies http://arxiv.org/abs/2002.09403v2
Inference Strategies for Machine Translation with Conditional Masking http://arxiv.org/abs/2010.02352v2
Inference of Dynamic Graph Changes for Functional Connectome http://arxiv.org/abs/1905.09993v2
Inferring Which Medical Treatments Work from Reports of Clinical Trials http://arxiv.org/abs/1904.01606v2
Inferring astrophysical X-ray polarization with deep learning http://arxiv.org/abs/2005.08126v1
Infinite attention: NNGP and NTK for deep attention networks http://arxiv.org/abs/2006.10540v1
Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models http://arxiv.org/abs/2005.01190v1
Information Aggregation for Multi-Head Attention with Routing-by-Agreement http://arxiv.org/abs/1904.03100v1
Information Directed Sampling for Linear Partial Monitoring http://arxiv.org/abs/2002.11182v1
Information Extraction from Swedish Medical Prescriptions with Sig-Transformer Encoder http://arxiv.org/abs/2010.04897v1
Information Seeking in the Spirit of Learning: a Dataset for Conversational Curiosity http://arxiv.org/abs/2005.00172v2
Information Theoretic Optimal Learning of Gaussian Graphical Models http://arxiv.org/abs/1703.04886v3
Information-Theoretic Local Minima Characterization and Regularization http://arxiv.org/abs/1911.08192v2
Information-Theoretic Probing for Linguistic Structure http://arxiv.org/abs/2004.03061v2
Information-Theoretic Probing with Minimum Description Length http://arxiv.org/abs/2003.12298v1
Informative Dropout for Robust Representation Learning: A Shape-bias Perspective http://arxiv.org/abs/2008.04254v1
Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition http://arxiv.org/abs/2010.03746v1
Injecting Numerical Reasoning Skills into Language Models http://arxiv.org/abs/2004.04487v1
Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets http://arxiv.org/abs/1904.02668v4
Input-Sparsity Low Rank Approximation in Schatten Norm http://arxiv.org/abs/2004.12646v3
Inquisitive Question Generation for High Level Text Comprehension http://arxiv.org/abs/2010.01657v1
Insights into Fairness through Trust: Multi-scale Trust Quantification for Financial Deep Learning http://arxiv.org/abs/2011.01961v1
InstaHide: Instance-hiding Schemes for Private Distributed Learning http://arxiv.org/abs/2010.02772v1
Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition http://arxiv.org/abs/2004.14514v1
Instance-wise Depth and Motion Learning from Monocular Videos http://arxiv.org/abs/1912.09351v2
Integrals over Gaussians under Linear Domain Constraints http://arxiv.org/abs/1910.09328v2
Integrating Multimodal Information in Large Pretrained Transformers http://arxiv.org/abs/1908.05787v3
Integrating Semantic Knowledge to Tackle Zero-shot Text Classification http://arxiv.org/abs/1903.12626v1
Integrating Semantic and Structural Information with Graph Convolutional Network for Controversy Detection http://arxiv.org/abs/2005.07886v1
Integrating Transformer and Paraphrase Rules for Sentence Simplification http://arxiv.org/abs/1810.11193v1
Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation http://arxiv.org/abs/1810.02614v1
Inter-Level Cooperation in Hierarchical Reinforcement Learning http://arxiv.org/abs/1912.02368v2
Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network http://arxiv.org/abs/1906.04684v1
Interactive Classification by Asking Informative Questions http://arxiv.org/abs/1911.03598v2
Interactive Extractive Search over Biomedical Corpora http://arxiv.org/abs/2006.04148v1
Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning http://arxiv.org/abs/2010.02386v1
Interactive Machine Comprehension with Information Seeking Agents http://arxiv.org/abs/1908.10449v3
Interactive Refinement of Cross-Lingual Word Embeddings http://arxiv.org/abs/1911.03070v3
Interactive Text Ranking with Bayesian Optimisation: A Case Study on Community QA and Summarisation http://arxiv.org/abs/1911.10183v3
Interactive Visualization for Debugging RL http://arxiv.org/abs/2008.07331v2
Interconnected Question Generation with Coreference Alignment and Conversation Flow Modeling http://arxiv.org/abs/1906.06893v1
Interference and Generalization in Temporal Difference Learning http://arxiv.org/abs/2003.06350v1
Interpolation between Residual and Non-Residual Networks http://arxiv.org/abs/2006.05749v4
Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions http://arxiv.org/abs/1802.08504v1
Interpretable Companions for Black-Box Models http://arxiv.org/abs/2002.03494v2
Interpretable Multi-dataset Evaluation for Named Entity Recognition http://arxiv.org/abs/2011.06854v2
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions http://arxiv.org/abs/2002.03478v3
Interpretable Question Answering on Knowledge Bases and Text http://arxiv.org/abs/1906.10924v1
Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder http://arxiv.org/abs/1805.09547v1
Interpretable deep Gaussian processes with moments http://arxiv.org/abs/1905.10963v3
Interpretation of NLP models through input marginalization http://arxiv.org/abs/2010.13984v1
Interpretations are useful: penalizing explanations to align neural networks with prior knowledge http://arxiv.org/abs/1909.13584v4
Interpreting Attention Models with Human Visual Attention in Machine Reading Comprehension http://arxiv.org/abs/2010.06396v2
Intrinsic Probing through Dimension Selection http://arxiv.org/abs/2010.02812v1
Intrinsic Reward Driven Imitation Learning via Generative Model http://arxiv.org/abs/2006.15061v4
Introducing Syntactic Structures into Target Opinion Word Extraction with Deep Learning http://arxiv.org/abs/2010.13378v1
Invariant Causal Prediction for Block MDPs http://arxiv.org/abs/2003.06016v2
Invariant Risk Minimization Games http://arxiv.org/abs/2002.04692v2
Inverse Active Sensing: Modeling and Understanding Timely Decision-Making http://arxiv.org/abs/2006.14141v1
Invertible Generative Modeling using Linear Rational Splines http://arxiv.org/abs/2001.05168v4
Invertible generative models for inverse problems: mitigating representation error and dataset bias http://arxiv.org/abs/1905.11672v4
Investigating African-American Vernacular English in Transformer-Based Text Generation http://arxiv.org/abs/2010.02510v2
Investigating Capsule Networks with Dynamic Routing for Text Classification http://arxiv.org/abs/1804.00538v4
Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model http://arxiv.org/abs/2010.04755v1
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension http://arxiv.org/abs/1904.09679v3
Investigating representations of verb bias in neural language models http://arxiv.org/abs/2010.02375v2
Investigating the Effect of Sensor Modalities in Multi-Sensor Detection-Prediction Models http://arxiv.org/abs/2101.03279v1
Involutive MCMC: a Unifying Framework http://arxiv.org/abs/2006.16653v1
Is 42 the Answer to Everything in Subtitling-oriented Speech Translation? http://arxiv.org/abs/2006.01080v1
Is Graph Structure Necessary for Multi-hop Question Answering? http://arxiv.org/abs/2004.03096v2
Is Local SGD Better than Minibatch SGD? http://arxiv.org/abs/2002.07839v2
Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing http://arxiv.org/abs/1910.07870v2
Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds http://arxiv.org/abs/2004.12332v1
Is the Best Better? Bayesian Statistical Model Comparison for Natural Language Processing http://arxiv.org/abs/2010.03088v1
It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information http://arxiv.org/abs/2005.02354v2
It's Not What Machines Can Learn, It's What We Cannot Teach http://arxiv.org/abs/2002.09398v2
Iterative Edit-Based Unsupervised Sentence Simplification http://arxiv.org/abs/2006.09639v1
Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation http://arxiv.org/abs/2009.07177v1
Ivy: Instrumental Variable Synthesis for Causal Inference http://arxiv.org/abs/2004.05316v1
Job Recommendation through Progression of Job Selection http://arxiv.org/abs/1905.13136v2
Joint Bootstrapping Machines for High Confidence Relation Extraction http://arxiv.org/abs/1805.00254v1
Joint Constrained Learning for Event-Event Relation Extraction http://arxiv.org/abs/2010.06727v1
Joint Detection and Location of English Puns http://arxiv.org/abs/1909.00175v1
Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging http://arxiv.org/abs/1910.02267v1
Joint Effects of Context and User History for Predicting Online Conversation Re-entries http://arxiv.org/abs/1906.01185v1
Joint Entity Extraction and Assertion Detection for Clinical Text http://arxiv.org/abs/1812.05270v5
Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme http://arxiv.org/abs/1706.05075v1
Joint Learning of Pre-Trained and Random Units for Domain Adaptation in Part-of-Speech Tagging http://arxiv.org/abs/1904.03595v1
Joint Modeling of Content and Discourse Relations in Dialogues http://arxiv.org/abs/1705.05039v1
Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora http://arxiv.org/abs/1706.00593v1
Joint Modelling of Emotion and Abusive Language Detection http://arxiv.org/abs/2005.14028v1
Joint Multilingual Supervision for Cross-lingual Entity Linking http://arxiv.org/abs/1809.07657v1
Joint Multitask Learning for Community Question Answering Using Task-Specific Embeddings http://arxiv.org/abs/1809.08928v1
Joint Reasoning for Temporal and Causal Relations http://arxiv.org/abs/1906.04941v1
Joint Semantic Synthesis and Morphological Analysis of the Derived Word http://arxiv.org/abs/1701.00946v3
Joint translation and unit conversion for end-to-end localization http://arxiv.org/abs/2004.05219v1
Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation http://arxiv.org/abs/1809.09078v2
Jointly Optimizing Diversity and Relevance in Neural Response Generation http://arxiv.org/abs/1902.11205v3
Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling http://arxiv.org/abs/1805.04787v2
KLEJ: Comprehensive Benchmark for Polish Language Understanding http://arxiv.org/abs/2005.00630v1
KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation http://arxiv.org/abs/2004.04100v1
Keep CALM and Explore: Language Models for Action Generation in Text-based Games http://arxiv.org/abs/2010.02903v1
Keeping Up Appearances: Computational Modeling of Face Acts in Persuasion Oriented Discussions http://arxiv.org/abs/2009.10815v2
Kernel Conditional Density Operators http://arxiv.org/abs/1905.11255v2
Kernel and Rich Regimes in Overparametrized Models http://arxiv.org/abs/1906.05827v3
Kernel interpolation with continuous volume sampling http://arxiv.org/abs/2002.09677v1
Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization http://arxiv.org/abs/1910.04086v2
Key-Value Memory Networks for Directly Reading Documents http://arxiv.org/abs/1606.03126v2
Keyphrase Generation: A Text Summarization Struggle http://arxiv.org/abs/1904.00110v2
KinGDOM: Knowledge-Guided DOMain adaptation for sentiment analysis http://arxiv.org/abs/2005.00791v2
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning http://arxiv.org/abs/1911.05815v1
Knowing The What But Not The Where in Bayesian Optimization http://arxiv.org/abs/1905.02685v5
Knowledge Association with Hyperbolic Knowledge Graph Embeddings http://arxiv.org/abs/2010.02162v1
Knowledge Completion for Generics using Guided Tensor Factorization http://arxiv.org/abs/1612.03871v3
Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation http://arxiv.org/abs/2004.10171v1
Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward http://arxiv.org/abs/2005.01159v1
Knowledge-Grounded Dialogue Generation with Pre-trained Language Models http://arxiv.org/abs/2010.08824v1
Knowledge-aware Pronoun Coreference Resolution http://arxiv.org/abs/1907.03663v1
Knowledge-guided Open Attribute Value Extraction with Reinforcement Learning http://arxiv.org/abs/2010.09189v1
Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge http://arxiv.org/abs/1805.07858v1
KutralNet: A Portable Deep Learning Model for Fire Recognition http://arxiv.org/abs/2008.06866v1
Køpsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding http://arxiv.org/abs/2005.12094v2
LAReQA: Language-agnostic answer retrieval from a multilingual pool http://arxiv.org/abs/2004.05484v1
LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation http://arxiv.org/abs/2004.07499v1
LEEP: A New Measure to Evaluate Transferability of Learned Representations http://arxiv.org/abs/2002.12462v2
LIBRE: Learning Interpretable Boolean Rule Ensembles http://arxiv.org/abs/1911.06537v1
LINSPECTOR: Multilingual Probing Tasks for Word Representations http://arxiv.org/abs/1903.09442v2
LOGAN: Local Group Bias Detection by Clustering http://arxiv.org/abs/2010.02867v1
LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction http://arxiv.org/abs/2001.04437v3
LRTA: A Transparent Neural-Symbolic Reasoning Framework with Modular Supervision for Visual Question Answering http://arxiv.org/abs/2011.10731v1
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention http://arxiv.org/abs/2010.01057v1
Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition http://arxiv.org/abs/1804.09021v2
Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks http://arxiv.org/abs/1912.10095v2
Langevin Monte Carlo without smoothness http://arxiv.org/abs/1905.13285v3
Language (Re)modelling: Towards Embodied Language Understanding http://arxiv.org/abs/2005.00311v2
Language (Technology) is Power: A Critical Survey of "Bias" in NLP http://arxiv.org/abs/2005.14050v2
Language Generation with Multi-Hop Reasoning on Commonsense Knowledge Graph http://arxiv.org/abs/2009.11692v1
Language Model Prior for Low-Resource Neural Machine Translation http://arxiv.org/abs/2004.14928v3
Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation http://arxiv.org/abs/1906.10007v1
Language Models as Fact Checkers? http://arxiv.org/abs/2006.04102v2
Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese http://arxiv.org/abs/2005.00842v1
Language Models not just for Pre-training: Fast Online Neural Noisy Channel Modeling http://arxiv.org/abs/2011.07164v1
Language Understanding for Text-based Games Using Deep Reinforcement Learning http://arxiv.org/abs/1506.08941v2
Language as a Latent Variable: Discrete Generative Models for Sentence Compression http://arxiv.org/abs/1609.07317v2
Large Margin Neural Language Model http://arxiv.org/abs/1808.08987v1
Large Product Key Memory for Pretrained Language Models http://arxiv.org/abs/2010.03881v1
Large Scale Multi-Actor Generative Dialog Modeling http://arxiv.org/abs/2005.06114v1
Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing http://arxiv.org/abs/1807.06517v1
Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health http://arxiv.org/abs/1605.04462v3
Large-scale Cloze Test Dataset Created by Teachers http://arxiv.org/abs/1711.03225v3
Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems http://arxiv.org/abs/2002.00057v2
Latent Alignment of Procedural Concepts in Multimodal Recipes http://arxiv.org/abs/2101.04727v1
Latent Space Factorisation and Manipulation via Matrix Subspace Projection http://arxiv.org/abs/1907.12385v3
Latent Space Oddity: Exploring Latent Spaces to Design Guitar Timbres http://arxiv.org/abs/2010.15989v2
Latent Variable Modelling with Hyperbolic Normalizing Flows http://arxiv.org/abs/2002.06336v4
Latent-CF: A Simple Baseline for Reverse Counterfactual Explanations http://arxiv.org/abs/2012.09301v1
Layered Sampling for Robust Optimization Problems http://arxiv.org/abs/2002.11904v1
LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments http://arxiv.org/abs/2006.09670v1
LdSM: Logarithm-depth Streaming Multi-label Decision Trees http://arxiv.org/abs/1905.10428v5
Learnable Bernoulli Dropout for Bayesian Deep Learning http://arxiv.org/abs/2002.05155v1
Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning http://arxiv.org/abs/2012.09156v1
Learning Adaptive Language Interfaces through Decomposition http://arxiv.org/abs/2010.05190v1
Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization http://arxiv.org/abs/2002.11798v2
Learning Algebraic Multigrid Using Graph Neural Networks http://arxiv.org/abs/2003.05744v2
Learning Architectures from an Extended Search Space for Language Modeling http://arxiv.org/abs/2005.02593v2
Learning Autoencoders with Relational Regularization http://arxiv.org/abs/2002.02913v4
Learning Canonical Transformations http://arxiv.org/abs/2011.08822v1
Learning Collaborative Agents with Rule Guidance for Knowledge Graph Reasoning http://arxiv.org/abs/2005.00571v2
Learning Compressed Sentence Representations for On-Device Text Processing http://arxiv.org/abs/1906.08340v1
Learning Constraints for Structured Prediction Using Rectifier Networks http://arxiv.org/abs/2006.01209v1
Learning Context-Free Languages with Nondeterministic Stack RNNs http://arxiv.org/abs/2010.04674v1
Learning Context-Sensitive Convolutional Filters for Text Processing http://arxiv.org/abs/1709.08294v3
Learning Contextualized Knowledge Structures for Commonsense Reasoning http://arxiv.org/abs/2010.12873v2
Learning Cross-lingual Distributed Logical Representations for Semantic Parsing http://arxiv.org/abs/1806.05461v1
Learning Crosslingual Word Embeddings without Bilingual Corpora http://arxiv.org/abs/1606.09403v1
Learning De-biased Representations with Biased Representations http://arxiv.org/abs/1910.02806v3
Learning Deep Transformer Models for Machine Translation http://arxiv.org/abs/1906.01787v1
Learning Dialog Policies from Weak Demonstrations http://arxiv.org/abs/2004.11054v2
Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders http://arxiv.org/abs/1703.10960v3
Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information http://arxiv.org/abs/2004.03991v2
Learning Dynamic Feature Selection for Fast Sequential Prediction http://arxiv.org/abs/1505.06169v1
Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes http://arxiv.org/abs/2001.02585v2
Learning Efficient Multi-agent Communication: An Information Bottleneck Approach http://arxiv.org/abs/1911.06992v2
Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use http://arxiv.org/abs/1907.07638v1
Learning Entangled Single-Sample Gaussians in the Subset-of-Signals Model http://arxiv.org/abs/2007.05557v1
Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards http://arxiv.org/abs/2008.07773v1
Learning Fair Representations for Kernel Models http://arxiv.org/abs/1906.11813v2
Learning Flat Latent Manifolds with VAEs http://arxiv.org/abs/2002.04881v3
Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning http://arxiv.org/abs/2002.05954v3
Learning Gaussian Graphical Models via Multiplicative Weights http://arxiv.org/abs/2002.08663v2
Learning Generic Sentence Representations Using Convolutional Neural Networks http://arxiv.org/abs/1611.07897v2
Learning Geometric Word Meta-Embeddings http://arxiv.org/abs/2004.09219v1
Learning Graph Models for Template-Free Retrosynthesis http://arxiv.org/abs/2006.07038v1
Learning Graph Structure With A Finite-State Automaton Layer http://arxiv.org/abs/2007.04929v2
Learning Group Structure and Disentangled Representations of Dynamical Environments http://arxiv.org/abs/2002.06991v2
Learning Halfspaces with Massart Noise Under Structured Distributions http://arxiv.org/abs/2002.05632v1
Learning Hierarchical Interactions at Scale: A Convex Optimization Approach http://arxiv.org/abs/1902.01542v5
Learning High-dimensional Gaussian Graphical Models under Total Positivity without Adjustment of Tuning Parameters http://arxiv.org/abs/1906.05159v4
Learning Human Objectives by Evaluating Hypothetical Behavior http://arxiv.org/abs/1912.05652v1
Learning Hyperbolic Representations for Unsupervised 3D Segmentation http://arxiv.org/abs/2012.01644v2
Learning Implicit Text Generation via Feature Matching http://arxiv.org/abs/2005.03588v2
Learning Implicitly with Noisy Data in Linear Arithmetic http://arxiv.org/abs/2010.12619v1
Learning Informative Representations of Biomedical Relations with Latent Variable Models http://arxiv.org/abs/2011.10285v1
Learning Intrinsic Symbolic Rewards in Reinforcement Learning http://arxiv.org/abs/2010.03694v2
Learning Invariant Representations for Reinforcement Learning without Reconstruction http://arxiv.org/abs/2006.10742v1
Learning Joint Semantic Parsers from Disjoint Data http://arxiv.org/abs/1804.05990v1
Learning Lexico-Functional Patterns for First-Person Affect http://arxiv.org/abs/1708.09789v1
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks http://arxiv.org/abs/2008.02265v1
Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots http://arxiv.org/abs/1805.02333v2
Learning Mixtures of Graphs from Epidemic Cascades http://arxiv.org/abs/1906.06057v2
Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach http://arxiv.org/abs/1808.08773v3
Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models http://arxiv.org/abs/2004.14601v3
Learning Near Optimal Policies with Low Inherent Bellman Error http://arxiv.org/abs/2003.00153v3
Learning Neural Sequence-to-Sequence Models from Weak Feedback with Bipolar Ramp Loss http://arxiv.org/abs/1907.03748v1
Learning Neural Templates for Text Generation http://arxiv.org/abs/1808.10122v3
Learning Object-Centric Video Models by Contrasting Sets http://arxiv.org/abs/2011.10287v1
Learning Optimal Tree Models Under Beam Search http://arxiv.org/abs/2006.15408v1
Learning Outside the Box: Discourse-level Features Improve Metaphor Identification http://arxiv.org/abs/1904.02246v2
Learning Overlapping Representations for the Estimation of Individualized Treatment Effects http://arxiv.org/abs/2001.04754v3
Learning Portable Representations for High-Level Planning http://arxiv.org/abs/1905.12006v1
Learning Probabilistic Sentence Representations from Paraphrases http://arxiv.org/abs/2005.08105v1
Learning Quadratic Games on Networks http://arxiv.org/abs/1811.08790v3
Learning Reasoning Strategies in End-to-End Differentiable Proving http://arxiv.org/abs/2007.06477v3
Learning Representations that Support Extrapolation http://arxiv.org/abs/2007.05059v2
Learning Robot Skills with Temporal Variational Inference http://arxiv.org/abs/2006.16232v1
Learning Robust Models for e-Commerce Product Search http://arxiv.org/abs/2005.03624v1
Learning Sequence Encoders for Temporal Knowledge Graph Completion http://arxiv.org/abs/1809.03202v1
Learning Similarity Metrics for Numerical Simulations http://arxiv.org/abs/2002.07863v2
Learning Source Phrase Representations for Neural Machine Translation http://arxiv.org/abs/2006.14405v1
Learning Sparse Nonparametric DAGs http://arxiv.org/abs/1909.13189v2
Learning Spoken Language Representations with Neural Lattice Language Modeling http://arxiv.org/abs/2007.02629v2
Learning Structural Kernels for Natural Language Processing http://arxiv.org/abs/1508.02131v1
Learning Structured Representations of Entity Names using Active Learning and Weak Supervision http://arxiv.org/abs/2011.00105v1
Learning Symbolic Physics with Graph Networks http://arxiv.org/abs/1909.05862v2
Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning http://arxiv.org/abs/2004.12485v2
Learning To Solve Differential Equations Across Initial Conditions http://arxiv.org/abs/2003.12159v2
Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers http://arxiv.org/abs/2010.00667v3
Learning Visually Grounded Sentence Representations http://arxiv.org/abs/1707.06320v2
Learning What to Defer for Maximum Independent Sets http://arxiv.org/abs/2006.09607v2
Learning Word-Like Units from Joint Audio-Visual Analysis http://arxiv.org/abs/1701.07481v3
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium http://arxiv.org/abs/2002.07066v3
Learning a Cost-Effective Annotation Policy for Question Answering http://arxiv.org/abs/2010.03476v2
Learning a Multi-Domain Curriculum for Neural Machine Translation http://arxiv.org/abs/1908.10940v2
Learning a Neural Semantic Parser from User Feedback http://arxiv.org/abs/1704.08760v1
Learning a Policy for Opportunistic Active Learning http://arxiv.org/abs/1808.10009v1
Learning a Simple and Effective Model for Multi-turn Response Generation with Auxiliary Tasks http://arxiv.org/abs/2004.01972v2
Learning a Single Neuron with Gradient Methods http://arxiv.org/abs/2001.05205v2
Learning an Unreferenced Metric for Online Dialogue Evaluation http://arxiv.org/abs/2005.00583v1
Learning and Evaluating Contextual Embedding of Source Code http://arxiv.org/abs/2001.00059v3
Learning and Evaluating Emotion Lexicons for 91 Languages http://arxiv.org/abs/2005.05672v1
Learning and Sampling of Atomic Interventions from Observations http://arxiv.org/abs/2002.04232v2
Learning beyond datasets: Knowledge Graph Augmented Neural Networks for Natural language Processing http://arxiv.org/abs/1802.05930v2
Learning distributed representations of graphs with Geo2DR http://arxiv.org/abs/2003.05926v3
Learning for Dose Allocation in Adaptive Clinical Trials with Safety Constraints http://arxiv.org/abs/2006.05026v2
Learning from Context or Names? An Empirical Study on Neural Relation Extraction http://arxiv.org/abs/2010.01923v2
Learning from Irregularly-Sampled Time Series: A Missing Data Perspective http://arxiv.org/abs/2008.07599v1
Learning from Task Descriptions http://arxiv.org/abs/2011.08115v1
Learning how to Active Learn: A Deep Reinforcement Learning Approach http://arxiv.org/abs/1708.02383v1
Learning in Gated Neural Networks http://arxiv.org/abs/1906.02777v2
Learning piecewise Lipschitz functions in changing environments http://arxiv.org/abs/1907.09137v4
Learning robust visual representations using data augmentation invariance http://arxiv.org/abs/1906.04547v1
Learning spectrograms with convolutional spectral kernels http://arxiv.org/abs/1905.09917v2
Learning the piece-wise constant graph structure of a varying Ising model http://arxiv.org/abs/1910.08512v2
Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information http://arxiv.org/abs/1805.04655v2
Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders http://arxiv.org/abs/1805.04843v1
Learning to Ask Unanswerable Questions for Machine Reading Comprehension http://arxiv.org/abs/1906.06045v1
Learning to Branch for Multi-Task Learning http://arxiv.org/abs/2006.01895v2
Learning to Classify Intents and Slot Labels Given a Handful of Examples http://arxiv.org/abs/2004.10793v1
Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules http://arxiv.org/abs/2006.16981v3
Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling http://arxiv.org/abs/1910.04289v2
Learning to Continually Learn http://arxiv.org/abs/2002.09571v2
Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks http://arxiv.org/abs/1910.14326v2
Learning to Deceive with Attention-Based Explanations http://arxiv.org/abs/1909.07913v2
Learning to Decipher Hate Symbols http://arxiv.org/abs/1904.02418v1
Learning to Encode Position for Transformer with Continuous Dynamical Model http://arxiv.org/abs/2003.09229v1
Learning to Evaluate Translation Beyond English: BLEURT Submissions to the WMT Metrics 2020 Shared Task http://arxiv.org/abs/2010.04297v3
Learning to Faithfully Rationalize by Construction http://arxiv.org/abs/2005.00115v1
Learning to Fuse Sentences with Transformers for Summarization http://arxiv.org/abs/2010.03726v1
Learning to Generate Compositional Color Descriptions http://arxiv.org/abs/1606.03821v2
Learning to Generate Multiple Style Transfer Outputs for an Input Sentence http://arxiv.org/abs/2002.06525v1
Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks http://arxiv.org/abs/2010.02807v3
Learning to Learn Kernels with Variational Random Features http://arxiv.org/abs/2006.06707v2
Learning to Map Context-Dependent Sentences to Executable Formal Queries http://arxiv.org/abs/1804.06868v2
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout http://arxiv.org/abs/1904.04195v1
Learning to Parse and Translate Improves Neural Machine Translation http://arxiv.org/abs/1702.03525v2
Learning to Prune Deep Neural Networks via Reinforcement Learning http://arxiv.org/abs/2007.04756v1
Learning to Rank Learning Curves http://arxiv.org/abs/2006.03361v1
Learning to Reach Goals via Iterated Supervised Learning http://arxiv.org/abs/1912.06088v4
Learning to Recognize Discontiguous Entities http://arxiv.org/abs/1810.08579v3
Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation http://arxiv.org/abs/2006.05165v1
Learning to Represent Action Values as a Hypergraph on the Action Vertices http://arxiv.org/abs/2010.14680v1
Learning to Sample with Local and Global Contexts in Experience Replay Buffer http://arxiv.org/abs/2007.07358v1
Learning to Score Behaviors for Guided Policy Optimization http://arxiv.org/abs/1906.04349v4
Learning to Segment Actions from Observation and Narration http://arxiv.org/abs/2005.03684v2
Learning to Simulate Complex Physics with Graph Networks http://arxiv.org/abs/2002.09405v2
Learning to Stop While Learning to Predict http://arxiv.org/abs/2006.05082v1
Learning to Understand Child-directed and Adult-directed Speech http://arxiv.org/abs/2005.02721v3
Learning to Update Natural Language Comments Based on Code Changes http://arxiv.org/abs/2004.12169v2
Learning to simulate and design for structural engineering http://arxiv.org/abs/2003.09103v3
Learning with Bounded Instance- and Label-dependent Label Noise http://arxiv.org/abs/1709.03768v3
Learning with Good Feature Representations in Bandits and in RL with a Generative Model http://arxiv.org/abs/1911.07676v2
Learning with Multiple Complementary Labels http://arxiv.org/abs/1912.12927v3
Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data http://arxiv.org/abs/2001.00980v1
Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning http://arxiv.org/abs/1912.10389v1
Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation http://arxiv.org/abs/1804.10184v1
Let Me Choose: From Verbal Context to Font Selection http://arxiv.org/abs/2005.01151v1
Let's Agree to Agree: Neural Networks Share Classification Order on Real Datasets http://arxiv.org/abs/1905.10854v7
Levels of Analysis for Machine Learning http://arxiv.org/abs/2004.05107v1
Leveraging Declarative Knowledge in Text and First-Order Logic for Fine-Grained Propaganda Detection http://arxiv.org/abs/2004.14201v2
Leveraging Frequency Analysis for Deep Fake Image Recognition http://arxiv.org/abs/2003.08685v3
Leveraging Graph to Improve Abstractive Multi-Document Summarization http://arxiv.org/abs/2005.10043v1
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation http://arxiv.org/abs/2005.04816v1
Leveraging Multimodal Behavioral Analytics for Automated Job Interview Performance Assessment and Feedback http://arxiv.org/abs/2006.07909v2
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks http://arxiv.org/abs/1907.12461v2
Leveraging Procedural Generation to Benchmark Reinforcement Learning http://arxiv.org/abs/1912.01588v2
Leveraging Sentence Similarity in Natural Language Generation: Improving Beam Search using Range Voting http://arxiv.org/abs/1908.06288v2
Lexical Features in Coreference Resolution: To be Used With Caution http://arxiv.org/abs/1704.06779v1
Lexically Constrained Neural Machine Translation with Levenshtein Transformer http://arxiv.org/abs/2004.12681v1
Lexicosyntactic Inference in Neural Models http://arxiv.org/abs/1808.06232v1
Lifelong Language Knowledge Distillation http://arxiv.org/abs/2010.02123v1
Lifelong Learning CRF for Supervised Aspect Extraction http://arxiv.org/abs/1705.00251v1
Lifted Disjoint Paths with Application in Multiple Object Tracking http://arxiv.org/abs/2006.14550v1
Lifted Rule Injection for Relation Embeddings http://arxiv.org/abs/1606.08359v2
Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation http://arxiv.org/abs/2010.04383v1
Like a Baby: Visually Situated Neural Language Acquisition http://arxiv.org/abs/1805.11546v2
Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions http://arxiv.org/abs/2010.03205v1
Likelihood Regret: An Out-of-Distribution Detection Score For Variational Auto-encoder http://arxiv.org/abs/2003.02977v3
Linear Bandits with Stochastic Delayed Feedback http://arxiv.org/abs/1807.02089v3
Linear Convergence of Adaptive Stochastic Gradient Descent http://arxiv.org/abs/1908.10525v2
Linear Convergence of Randomized Primal-Dual Coordinate Method for Large-scale Linear Constrained Convex Programming http://arxiv.org/abs/2008.12946v1
Linear Dynamics: Clustering without identification http://arxiv.org/abs/1908.01039v3
Linear Lower Bounds and Conditioning of Differentiable Games http://arxiv.org/abs/1906.07300v3
Linear Mode Connectivity and the Lottery Ticket Hypothesis http://arxiv.org/abs/1912.05671v4
Linear-Time Constituency Parsing with RNNs and Dynamic Programming http://arxiv.org/abs/1805.06995v2
Linearly Convergent Frank-Wolfe with Backtracking Line-Search http://arxiv.org/abs/1806.05123v4
Linguistic Features for Readability Assessment http://arxiv.org/abs/2006.00377v1
Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game http://arxiv.org/abs/1506.04744v1
Linguistic Knowledge and Transferability of Contextual Representations http://arxiv.org/abs/1903.08855v5
Lipschitz Constrained Parameter Initialization for Deep Transformers http://arxiv.org/abs/1911.03179v2
Lipschitz and Comparator-Norm Adaptivity in Online Learning http://arxiv.org/abs/2002.12242v2
Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them http://arxiv.org/abs/1903.03862v2
List Decodable Subspace Recovery http://arxiv.org/abs/2002.03004v1
Lite Training Strategies for Portuguese-English and English-Portuguese Translation http://arxiv.org/abs/2008.08769v1
Local Differentially Private Regret Minimization in Reinforcement Learning http://arxiv.org/abs/2010.07778v1
Localizing Moments in Video with Temporal Language http://arxiv.org/abs/1809.01337v1
Locally Accelerated Conditional Gradients http://arxiv.org/abs/1906.07867v2
Locally Private Hypothesis Selection http://arxiv.org/abs/2002.09465v2
Location Attention for Extrapolation to Longer Sequences http://arxiv.org/abs/1911.03872v2
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently http://arxiv.org/abs/2002.08095v2
Logarithmic Regret for Online Control http://arxiv.org/abs/1909.05062v1
Logic-Guided Data Augmentation and Regularization for Consistent Question Answering http://arxiv.org/abs/2004.10157v2
Logical Inferences with Comparatives and Generalized Quantifiers http://arxiv.org/abs/2005.07954v1
Logical Natural Language Generation from Open-Domain Tables http://arxiv.org/abs/2004.10404v2
LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network http://arxiv.org/abs/2004.13659v1
Logistic Regression for Massive Data with Rare Events http://arxiv.org/abs/2006.00683v1
Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum http://arxiv.org/abs/1805.03716v1
Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors http://arxiv.org/abs/2006.13205v2
Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation http://arxiv.org/abs/2009.09127v1
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks http://arxiv.org/abs/1903.01306v1
Look It Up: Bilingual and Monolingual Dictionaries Improve Neural Machine Translation http://arxiv.org/abs/2010.05997v1
Look at the First Sentence: Position Bias in Question Answering http://arxiv.org/abs/2004.14602v3
Lookahead-Bounded Q-Learning http://arxiv.org/abs/2006.15690v1
Loss Function Search for Face Recognition http://arxiv.org/abs/2007.06542v1
Lossless Compression of Deep Neural Networks http://arxiv.org/abs/2001.00218v3
Low Rank Fusion based Transformers for Multimodal Sequences http://arxiv.org/abs/2007.02038v1
Low Resource Neural Machine Translation: A Benchmark for Five African Languages http://arxiv.org/abs/2003.14402v1
Low Shot Learning with Untrained Neural Networks for Imaging Inverse Problems http://arxiv.org/abs/1910.10797v1
Low-Dimensional Hyperbolic Knowledge Graph Embeddings http://arxiv.org/abs/2005.00545v1
Low-Rank Bottleneck in Multi-head Attention Models http://arxiv.org/abs/2002.07028v1
Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing http://arxiv.org/abs/2010.03546v1
Low-Variance and Zero-Variance Baselines for Extensive-Form Games http://arxiv.org/abs/1907.09633v1
Low-loss connection of weight vectors: distribution-based approaches http://arxiv.org/abs/2008.00741v1
Low-resource Deep Entity Resolution with Transfer and Active Learning http://arxiv.org/abs/1906.08042v1
LowFER: Low-rank Bilinear Pooling for Link Prediction http://arxiv.org/abs/2008.10858v1
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer http://arxiv.org/abs/2005.00052v3
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding http://arxiv.org/abs/2010.05379v1
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning http://arxiv.org/abs/2005.05402v1
MAST: Multimodal Abstractive Summarization with Trimodal Hierarchical Attention http://arxiv.org/abs/2010.08021v1
MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization http://arxiv.org/abs/2004.12302v2
MAVEN: A Massive General Domain Event Detection Dataset http://arxiv.org/abs/2004.13590v2
MCMH: Learning Multi-Chain Multi-Hop Rules for Knowledge Graph Reasoning http://arxiv.org/abs/2010.01735v1
MEGA RST Discourse Treebanks with Structure and Nuclearity from Scalable Distant Sentiment Supervision http://arxiv.org/abs/2011.03017v1
MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models http://arxiv.org/abs/2010.00840v1
MGHRL: Meta Goal-generation for Hierarchical Reinforcement Learning http://arxiv.org/abs/1909.13607v4
MIME: MIMicking Emotions for Empathetic Response Generation http://arxiv.org/abs/2010.01454v1
MLSUM: The Multilingual Summarization Corpus http://arxiv.org/abs/2004.14900v1
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics http://arxiv.org/abs/2010.03636v2
MOPO: Model-based Offline Policy Optimization http://arxiv.org/abs/2005.13239v6
MORSE: Semantic-ally Drive-n MORpheme SEgment-er http://arxiv.org/abs/1702.02212v3
MPC-guided Imitation Learning of Neural Network Policies for the Artificial Pancreas http://arxiv.org/abs/2003.01283v1
MTL2L: A Context Aware Neural Optimiser http://arxiv.org/abs/2007.09343v1
MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics http://arxiv.org/abs/1909.13111v2
Machine Learning in Population and Public Health http://arxiv.org/abs/2008.07278v1
Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation http://arxiv.org/abs/2004.09813v2
Mapping Natural Language Instructions to Mobile UI Action Sequences http://arxiv.org/abs/2005.03776v2
Mapping Natural-language Problems to Formal-language Solutions Using Structured Neural Representations http://arxiv.org/abs/1910.02339v3
Mapping to Declarative Knowledge for Word Problem Solving http://arxiv.org/abs/1712.09391v1
Marrying up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding http://arxiv.org/abs/1805.05588v1
Masked Language Model Scoring http://arxiv.org/abs/1910.14659v3
Masking as an Efficient Alternative to Finetuning for Pretrained Language Models http://arxiv.org/abs/2004.12406v2
Massively Multilingual Adversarial Speech Recognition http://arxiv.org/abs/1904.02210v1
Massively Multilingual Transfer for NER http://arxiv.org/abs/1902.00193v4
Matching the Blanks: Distributional Similarity for Relation Learning http://arxiv.org/abs/1906.03158v1
Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning http://arxiv.org/abs/2007.02832v1
Maximum Likelihood with Bias-Corrected Calibration is Hard-To-Beat at Label Shift Adaptation http://arxiv.org/abs/1901.06852v5
Maximum Mutation Reinforcement Learning for Scalable Control http://arxiv.org/abs/2007.13690v6
Maximum Reward Formulation In Reinforcement Learning http://arxiv.org/abs/2010.03744v1
MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining http://arxiv.org/abs/2012.13978v1
Meaning to Form: Measuring Systematicity as Information http://arxiv.org/abs/1906.05906v2
Measuring Emotions in the COVID-19 Real World Worry Dataset http://arxiv.org/abs/2004.04225v2
Measuring Forecasting Skill from Text http://arxiv.org/abs/2006.07425v2
Measuring Impact of Climate Change on Tree Species: analysis of JSDM on FIA data http://arxiv.org/abs/1910.04932v1
Measuring Information Propagation in Literary Social Networks http://arxiv.org/abs/2004.13980v2
Measuring Non-Expert Comprehension of Machine Learning Fairness Metrics http://arxiv.org/abs/2001.00089v3
Measuring Thematic Fit with Distributional Feature Overlap http://arxiv.org/abs/1707.05967v2
Measuring Visual Generalization in Continuous Control from Pixels http://arxiv.org/abs/2010.06740v2
Median Matrix Completion: from Embarrassment to Optimality http://arxiv.org/abs/2006.10400v1
Memory-enhanced Decoder for Neural Machine Translation http://arxiv.org/abs/1606.02003v1
Mention Extraction and Linking for SQL Query Generation http://arxiv.org/abs/2012.10074v1
Merge and Label: A novel neural network architecture for nested NER http://arxiv.org/abs/1907.00464v1
Message Passing Query Embedding http://arxiv.org/abs/2002.02406v2
Message Passing for Hyper-Relational Knowledge Graphs http://arxiv.org/abs/2009.10847v1
Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining http://arxiv.org/abs/2003.13003v2
Meta Learning Deep Visual Words for Fast Video Object Segmentation http://arxiv.org/abs/1812.01397v3
Meta-Learning for Few-Shot NMT Adaptation http://arxiv.org/abs/2004.02745v1
Meta-Learning with Shared Amortized Variational Inference http://arxiv.org/abs/2008.12037v1
Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling http://arxiv.org/abs/2006.07178v2
Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks http://arxiv.org/abs/2004.14404v2
Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient http://arxiv.org/abs/2007.01932v2
Meta-Transfer Learning for Code-Switched Speech Recognition http://arxiv.org/abs/2004.14228v1
Meta-learning with Stochastic Linear Bandits http://arxiv.org/abs/2005.08531v1
MetaFun: Meta-Learning with Iterative Functional Updates http://arxiv.org/abs/1912.02738v4
Microblog Hashtag Generation via Encoding Conversation Contexts http://arxiv.org/abs/1905.07584v1
Mimicking Word Embeddings using Subword RNNs http://arxiv.org/abs/1707.06961v1
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems http://arxiv.org/abs/2009.12005v2
Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance http://arxiv.org/abs/2005.00315v1
Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack http://arxiv.org/abs/1907.02044v2
Minimax Pareto Fairness: A Multi Objective Perspective http://arxiv.org/abs/2011.01821v1
Minimax Testing of Identity to a Reference Ergodic Markov Chain http://arxiv.org/abs/1902.00080v3
Minimax Weight and Q-Function Learning for Off-Policy Evaluation http://arxiv.org/abs/1910.12809v4
Minimizing Dynamic Regret and Adaptive Regret Simultaneously http://arxiv.org/abs/2002.02085v1
Minimizing Interference and Selection Bias in Network Experiment Design http://arxiv.org/abs/2004.07225v1
Mining Discourse Markers for Unsupervised Sentence Representation Learning http://arxiv.org/abs/1903.11850v1
Mining Documentation to Extract Hyperparameter Schemas http://arxiv.org/abs/2006.16984v2
Mirror Descent Policy Optimization http://arxiv.org/abs/2005.09814v3
Missing Data Imputation using Optimal Transport http://arxiv.org/abs/2002.03860v3
Mitigating Gender Bias Amplification in Distribution by Posterior Regularization http://arxiv.org/abs/2005.06251v1
Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning http://arxiv.org/abs/2009.13028v2
Mitigating Gender Bias in Machine Translation with Target Gender Annotations http://arxiv.org/abs/2010.06203v2
Mitigating Gender Bias in Natural Language Processing: Literature Review http://arxiv.org/abs/1906.08976v1
Mitigating Leakage in Federated Learning with Trusted Hardware http://arxiv.org/abs/2011.04948v3
Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments http://arxiv.org/abs/2006.16437v2
Mitigating Overfitting in Supervised Classification from Two Unlabeled Datasets: A Consistent Risk Correction Approach http://arxiv.org/abs/1910.08974v4
Mitigating Uncertainty in Document Classification http://arxiv.org/abs/1907.07590v1
MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification http://arxiv.org/abs/2004.12239v1
Mixed Strategies for Robust Optimization of Unknown Objectives http://arxiv.org/abs/2002.12613v2
MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform http://arxiv.org/abs/2005.08365v2
MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time http://arxiv.org/abs/2006.16007v1
Mobile-Based Deep Learning Models for Banana Diseases Detection http://arxiv.org/abs/2004.03718v1
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices http://arxiv.org/abs/2004.02984v2
Model Fusion with Kullback–Leibler Divergence http://arxiv.org/abs/2007.06168v1
Model selection for contextual bandits http://arxiv.org/abs/1906.00531v3
Model-Agnostic Counterfactual Explanations for Consequential Decisions http://arxiv.org/abs/1905.11190v5
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal http://arxiv.org/abs/1906.03804v3
Model-Based Visual Planning with Self-Supervised Functional Distances http://arxiv.org/abs/2012.15373v1
Modeling Cloud Reflectance Fields using Conditional Generative Adversarial Networks http://arxiv.org/abs/2002.07579v2
Modeling Continuous Stochastic Processes with Dynamic Normalizing Flows http://arxiv.org/abs/2002.10516v3
Modeling Discourse Structure for Document-level Neural Machine Translation http://arxiv.org/abs/2006.04721v1
Modeling Empathy and Distress in Reaction to News Stories http://arxiv.org/abs/1808.10399v1
Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs http://arxiv.org/abs/2001.11003v2
Modeling Label Semantics for Predicting Emotional Reactions http://arxiv.org/abs/2006.05489v2
Modeling Long Context for Task-Oriented Dialogue State Generation http://arxiv.org/abs/2004.14080v1
Modeling Naive Psychology of Characters in Simple Commonsense Stories http://arxiv.org/abs/1805.06533v1
Modeling Protagonist Emotions for Emotion-Aware Storytelling http://arxiv.org/abs/2010.06822v2
Modeling Recurrence for Transformer http://arxiv.org/abs/1904.03092v1
Modeling Semantic Compositionality with Sememe Knowledge http://arxiv.org/abs/1907.04744v1
Modeling Semantic Expectation: Using Script Knowledge for Referent Prediction http://arxiv.org/abs/1702.03121v1
Modeling Semantic Plausibility by Injecting World Knowledge http://arxiv.org/abs/1804.00619v3
Modeling Source Syntax for Neural Machine Translation http://arxiv.org/abs/1705.01020v1
Modeling Subjective Assessments of Guilt in Newspaper Crime Narratives http://arxiv.org/abs/2006.09589v2
Modeling the Music Genre Perception across Language-Bound Cultures http://arxiv.org/abs/2010.06325v2
Modeling, Visualization, and Analysis of African Innovation Performance http://arxiv.org/abs/2008.07882v1
Modelling Lexical Ambiguity with Density Matrices http://arxiv.org/abs/2010.05670v1
Modelling Suspense in Short Stories as Uncertainty Reduction over Neural Representation http://arxiv.org/abs/2004.14905v1
Modular Block-diagonal Curvature Approximations for Feedforward Architectures http://arxiv.org/abs/1902.01813v3
Modularized Transfomer-based Ranking Framework http://arxiv.org/abs/2004.13313v3
Modulated Fusion using Transformer for Linguistic-Acoustic Emotion Recognition http://arxiv.org/abs/2010.02057v1
Modulating Surrogates for Bayesian Optimization http://arxiv.org/abs/1906.11152v4
MojiTalk: Generating Emotional Responses at Scale http://arxiv.org/abs/1711.04090v2
Molecule Edit Graph Attention Network: Modeling Chemical Reactions as Sequences of Graph Edits http://arxiv.org/abs/2006.15426v1
Momentum Improves Normalized SGD http://arxiv.org/abs/2002.03305v2
Momentum in Reinforcement Learning http://arxiv.org/abs/1910.09322v2
Moniqua: Modulo Quantized Communication in Decentralized SGD http://arxiv.org/abs/2002.11787v3
Monitoring and explainability of models in production http://arxiv.org/abs/2007.06299v1
More Data Can Expand the Generalization Gap Between Adversarially Robust and Standard Models http://arxiv.org/abs/2002.04725v3
More Information Supervised Probabilistic Deep Face Embedding Learning http://arxiv.org/abs/2006.04518v2
More Powerful Selective Kernel Tests for Feature Selection http://arxiv.org/abs/1910.06134v2
Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules http://arxiv.org/abs/1706.00377v1
Morphological Irregularity Correlates with Frequency http://arxiv.org/abs/1906.11483v1
Morphological Segmentation Inside-Out http://arxiv.org/abs/1911.04916v1
MuTual: A Dataset for Multi-Turn Dialogue Reasoning http://arxiv.org/abs/2004.04494v1
Multi-Agent Determinantal Q-Learning http://arxiv.org/abs/2006.01482v4
Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition http://arxiv.org/abs/2004.03809v2
Multi-Attribute Bayesian Optimization With Interactive Preference Learning http://arxiv.org/abs/1911.05934v2
Multi-Dimensional Gender Bias Classification http://arxiv.org/abs/2005.00614v1
Multi-Domain Dialogue Acts and Response Co-Generation http://arxiv.org/abs/2004.12363v1
Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing http://arxiv.org/abs/1911.02692v2
Multi-Fact Correction in Abstractive Text Summarization http://arxiv.org/abs/2010.02443v1
Multi-Hop Knowledge Graph Reasoning with Reward Shaping http://arxiv.org/abs/1808.10568v2
Multi-Instance Multi-Label Learning Networks for Aspect-Category Sentiment Analysis http://arxiv.org/abs/2010.02656v1
Multi-Level Matching and Aggregation Network for Few-Shot Relation Classification http://arxiv.org/abs/1906.06678v1
Multi-Modal Generative Adversarial Network for Short Product Title Generation in Mobile E-Commerce http://arxiv.org/abs/1904.01735v1
Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model http://arxiv.org/abs/1906.01749v3
Multi-Objective Molecule Generation using Interpretable Substructures http://arxiv.org/abs/2002.03244v3
Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification http://arxiv.org/abs/1805.02220v2
Multi-Principal Assistance Games http://arxiv.org/abs/2007.09540v1
Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation http://arxiv.org/abs/1808.09564v1
Multi-Relational Question Answering from Narratives: Machine Reading and Reasoning in Simulated Worlds http://arxiv.org/abs/1902.09093v1
Multi-Sentence Argument Linking http://arxiv.org/abs/1911.03766v3
Multi-Source Unsupervised Hyperparameter Optimization http://arxiv.org/abs/2006.10600v1
Multi-Step Inference for Reasoning Over Paragraphs http://arxiv.org/abs/2004.02995v1
Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction http://arxiv.org/abs/1808.09602v1
Multi-Task Learning in Histo-pathology for Widely Generalizable Model http://arxiv.org/abs/2005.08645v1
Multi-Task Networks With Universe, Group, and Task Feature Learning http://arxiv.org/abs/1907.01791v1
Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media http://arxiv.org/abs/1904.00542v1
Multi-Task Reinforcement Learning with Soft Modularization http://arxiv.org/abs/2003.13661v2
Multi-Task Video Captioning with Video and Entailment Generation http://arxiv.org/abs/1704.07489v2
Multi-Unit Transformers for Neural Machine Translation http://arxiv.org/abs/2010.10743v2
Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization http://arxiv.org/abs/2010.01672v1
Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles http://arxiv.org/abs/2010.14235v1
Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning http://arxiv.org/abs/2005.07064v1
Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning http://arxiv.org/abs/2010.00117v1
Multi-hop Inference for Question-driven Summarization http://arxiv.org/abs/2010.03738v1
Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs http://arxiv.org/abs/1905.07374v2
Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs http://arxiv.org/abs/2010.07459v1
Multi-lingual neural title generation for e-Commerce browse pages http://arxiv.org/abs/1804.01041v1
Multi-objective Bayesian Optimization using Pareto-frontier Entropy http://arxiv.org/abs/1906.00127v2
Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction http://arxiv.org/abs/1704.01691v2
Multi-step Greedy Reinforcement Learning Algorithms http://arxiv.org/abs/1910.02919v3
Multi-task Learning for Multilingual Neural Machine Translation http://arxiv.org/abs/2010.02523v1
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces http://arxiv.org/abs/1802.09913v2
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension http://arxiv.org/abs/1809.06963v3
Multi-task Reinforcement Learning with a Planning Quasi-Metric http://arxiv.org/abs/2002.03240v3
Multi-turn Response Selection using Dialogue Dependency Relations http://arxiv.org/abs/2010.01502v1
Multi-view Story Characterization from Movie Plot Synopses and Reviews http://arxiv.org/abs/1908.09083v2
MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale http://arxiv.org/abs/2010.00980v1
MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension http://arxiv.org/abs/1905.13453v1
MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech http://arxiv.org/abs/2005.00812v2
MultiWOZ 2.2 : A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines http://arxiv.org/abs/2007.12720v1
Multidimensional Persistence Module Classification via Lattice-Theoretic Convolutions http://arxiv.org/abs/2011.14057v1
Multidirectional Associative Optimization of Function-Specific Word Representations http://arxiv.org/abs/2005.05264v1
Multigrid Neural Memory http://arxiv.org/abs/1906.05948v4
Multilevel Text Alignment with Cross-Document Attention http://arxiv.org/abs/2010.01263v1
Multilinear Latent Conditioning for Generating Unseen Attribute Combinations http://arxiv.org/abs/2009.04075v1
Multilingual AMR-to-Text Generation http://arxiv.org/abs/2011.05443v1
Multilingual Constituency Parsing with Self-Attention and Pre-Training http://arxiv.org/abs/1812.11760v2
Multilingual Denoising Pre-training for Neural Machine Translation http://arxiv.org/abs/2001.08210v2
Multilingual Factor Analysis http://arxiv.org/abs/1905.05547v2
Multilingual Jointly Trained Acoustic and Written Word Embeddings http://arxiv.org/abs/2006.14007v1
Multilingual Offensive Language Identification with Cross-lingual Embeddings http://arxiv.org/abs/2010.05324v1
Multilingual Universal Sentence Encoder for Semantic Retrieval http://arxiv.org/abs/1907.04307v1
Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment http://arxiv.org/abs/1805.08660v1
Multimodal Emoji Prediction http://arxiv.org/abs/1803.02392v2
Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product http://arxiv.org/abs/2009.07162v1
Multimodal Language Analysis with Recurrent Multistage Fusion http://arxiv.org/abs/1808.03920v1
Multimodal Machine Translation with Embedding Prediction http://arxiv.org/abs/1904.00639v1
Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis http://arxiv.org/abs/2004.14198v2
Multimodal Self-Supervised Learning for Medical Image Analysis http://arxiv.org/abs/1912.05396v2
Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems http://arxiv.org/abs/1907.01166v1
Multimodal and Multi-view Models for Emotion Recognition http://arxiv.org/abs/1906.10198v1
Multinomial Logit Bandit with Low Switching Cost http://arxiv.org/abs/2007.04876v1
Multiple Instance Learning Networks for Fine-Grained Sentiment Analysis http://arxiv.org/abs/1711.09645v2
Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis http://arxiv.org/abs/2002.05578v5
Multiscale Collaborative Deep Models for Neural Machine Translation http://arxiv.org/abs/2004.14021v3
Musical Word Embedding: Bridging the Gap between Listening Contexts and Music http://arxiv.org/abs/2008.01190v1
Mutual Information Maximization for Simple and Accurate Part-Of-Speech Induction http://arxiv.org/abs/1804.07849v4
My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits http://arxiv.org/abs/2002.09808v4
NADS: Neural Architecture Distribution Search for Uncertainty Awareness http://arxiv.org/abs/2006.06646v1
NARMADA: Need and Available Resource Managing Assistant for Disasters and Adversities http://arxiv.org/abs/2005.13524v1
NASH: Toward End-to-End Neural Architecture for Generative Semantic Hashing http://arxiv.org/abs/1805.05361v1
NAT: Noise-Aware Training for Robust Neural Sequence Labeling http://arxiv.org/abs/2005.07162v1
NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation http://arxiv.org/abs/1810.00671v2
NGBoost: Natural Gradient Boosting for Probabilistic Prediction http://arxiv.org/abs/1910.03225v4
NILE : Natural Language Inference with Faithful Natural Language Explanations http://arxiv.org/abs/2005.12116v1
NLP Scholar: An Interactive Visual Explorer for Natural Language Processing Literature http://arxiv.org/abs/2006.01131v1
NSTM: Real-Time Query-Driven News Overview Composition at Bloomberg http://arxiv.org/abs/2006.01117v1
Naive Exploration is Optimal for Online LQR http://arxiv.org/abs/2001.09576v2
Naive Feature Selection: Sparsity in Naive Bayes http://arxiv.org/abs/1905.09884v2
Nakdan: Professional Hebrew Diacritizer http://arxiv.org/abs/2005.03312v1
Named Entity Recognition Only from Word Embeddings http://arxiv.org/abs/1909.00164v2
Named Entity Recognition as Dependency Parsing http://arxiv.org/abs/2005.07150v3
Named Entity Recognition for Social Media Texts with Semantic Augmentation http://arxiv.org/abs/2010.15458v1
Named Entity Recognition without Labelled Data: A Weak Supervision Approach http://arxiv.org/abs/2004.14723v1
Native Language Cognate Effects on Second Language Lexical Choice http://arxiv.org/abs/1805.09590v1
Natural Language Comprehension with the EpiReader http://arxiv.org/abs/1606.02270v2
Natural Language Processing with Small Feed-Forward Networks http://arxiv.org/abs/1708.00214v1
Natural language processing for achieving sustainable development: the case of neural labelling to enhance community profiling http://arxiv.org/abs/2004.12935v2
Naturalizing a Programming Language via Interactive Learning http://arxiv.org/abs/1704.06956v1
Navigating the Dynamics of Financial Embeddings over Time http://arxiv.org/abs/2007.00591v1
Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation http://arxiv.org/abs/1804.05945v1
Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling http://arxiv.org/abs/2007.03927v2
Near-Optimal Algorithms for Minimax Optimization http://arxiv.org/abs/2002.02417v5
Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond http://arxiv.org/abs/1906.11985v1
Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding http://arxiv.org/abs/2010.00677v1
Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification http://arxiv.org/abs/2002.09954v2
Near-optimal Regret Bounds for Stochastic Shortest Path http://arxiv.org/abs/2002.09869v1
Nearly Linear Row Sampling Algorithm for Quantile Regression http://arxiv.org/abs/2006.08397v1
Necessary and Sufficient Geometries for Gradient Methods http://arxiv.org/abs/1909.10455v2
Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly http://arxiv.org/abs/1911.03343v3
Negative Training for Neural Dialogue Response Generation http://arxiv.org/abs/1903.02134v5
Negative sampling in semi-supervised learning http://arxiv.org/abs/1911.05166v2
Neighborhood Growth Determines Geometric Priors for Relational Representation Learning http://arxiv.org/abs/1910.05565v1
Neighborhood Matching Network for Entity Alignment http://arxiv.org/abs/2005.05607v1
Nested Named Entity Recognition via Second-best Sequence Learning and Decoding http://arxiv.org/abs/1909.02250v3
Nested Reasoning About Autonomous Agents Using Probabilistic Programs http://arxiv.org/abs/1812.01569v2
Nested Subspace Arrangement for Representation of Relational Data http://arxiv.org/abs/2007.02007v1
Neural AMR: Sequence-to-Sequence Models for Parsing and Generation http://arxiv.org/abs/1704.08381v3
Neural Abstract Reasoner http://arxiv.org/abs/2011.09860v1
Neural Argument Generation Augmented with Externally Retrieved Evidence http://arxiv.org/abs/1805.10254v1
Neural Bipartite Matching http://arxiv.org/abs/2005.11304v3
Neural CRF Model for Sentence Alignment in Text Simplification http://arxiv.org/abs/2005.02324v3
Neural CRF Parsing http://arxiv.org/abs/1507.03641v1
Neural Clustering Processes http://arxiv.org/abs/1901.00409v4
Neural Contextual Bandits with UCB-based Exploration http://arxiv.org/abs/1911.04462v3
Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking http://arxiv.org/abs/1806.10201v1
Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence http://arxiv.org/abs/2005.01096v1
Neural Decomposition: Functional ANOVA with Variational Autoencoders http://arxiv.org/abs/2006.14293v2
Neural Deepfake Detection with Factual Structure of Text http://arxiv.org/abs/2010.07475v1
Neural Differential Equations for Single Image Super-resolution http://arxiv.org/abs/2005.00865v1
Neural Discourse Structure for Text Categorization http://arxiv.org/abs/1702.01829v2
Neural Dynamic Policies for End-to-End Sensorimotor Learning http://arxiv.org/abs/2012.02788v1
Neural End-to-End Learning for Computational Argumentation Mining http://arxiv.org/abs/1704.06104v2
Neural Fine-Grained Entity Type Classification with Hierarchy-Aware Loss http://arxiv.org/abs/1803.03378v2
Neural Generation of Dialogue Response Timings http://arxiv.org/abs/2005.09128v1
Neural Grammatical Error Correction with Finite State Transducers http://arxiv.org/abs/1903.10625v2
Neural Kernels Without Tangents http://arxiv.org/abs/2003.02237v2
Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State http://arxiv.org/abs/1903.03260v1
Neural Latent Relational Analysis to Capture Lexical Semantic Relations in a Vector Space http://arxiv.org/abs/1809.03401v1
Neural Legal Judgment Prediction in English http://arxiv.org/abs/1906.02059v1
Neural Machine Translation of Text from Non-Native Speakers http://arxiv.org/abs/1808.06267v2
Neural Machine Translation via Binary Code Prediction http://arxiv.org/abs/1704.06918v1
Neural Machine Translation with Source-Side Latent Graph Parsing http://arxiv.org/abs/1702.02265v4
Neural Manifold Ordinary Differential Equations http://arxiv.org/abs/2006.10254v1
Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation http://arxiv.org/abs/2010.02705v1
Neural Metaphor Detection in Context http://arxiv.org/abs/1808.09653v1
Neural Models for Documents with Metadata http://arxiv.org/abs/1705.09296v2
Neural Open Information Extraction http://arxiv.org/abs/1805.04270v1
Neural Operator: Graph Kernel Network for Partial Differential Equations http://arxiv.org/abs/2003.03485v1
Neural Ordinary Differential Equations on Manifolds http://arxiv.org/abs/2006.06663v1
Neural Proof Nets http://arxiv.org/abs/2009.12702v1
Neural Related Work Summarization with a Joint Context-driven Attention Mechanism http://arxiv.org/abs/1901.09492v1
Neural Responding Machine for Short-Text Conversation http://arxiv.org/abs/1503.02364v2
Neural Segmental Hypergraphs for Overlapping Mention Recognition http://arxiv.org/abs/1810.01817v1
Neural Simultaneous Speech Translation Using Alignment-Based Chunking http://arxiv.org/abs/2005.14489v1
Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision http://arxiv.org/abs/1611.00020v4
Neural Syntactic Preordering for Controlled Paraphrase Generation http://arxiv.org/abs/2005.02013v1
Neural Temporal Opinion Modelling for Opinion Prediction on Twitter http://arxiv.org/abs/2005.13486v1
Neural Text Generation from Structured Data with Application to the Biography Domain http://arxiv.org/abs/1603.07771v3
Neural Topic Modeling by Incorporating Document Relationship Graph http://arxiv.org/abs/2009.13972v1
Neural Topic Modeling with Bidirectional Adversarial Training http://arxiv.org/abs/2004.12331v1
Neural Topic Modeling with Continual Lifelong Learning http://arxiv.org/abs/2006.10909v1
Neural Topic Modeling with Cycle-Consistent Adversarial Training http://arxiv.org/abs/2009.13971v1
Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting http://arxiv.org/abs/1809.08733v2
Neural Word Segmentation with Rich Pretraining http://arxiv.org/abs/1704.08960v1
Neural models of factuality http://arxiv.org/abs/1804.02472v1
Neural reparameterization improves structural optimization http://arxiv.org/abs/1909.04240v2
Neural versus Phrase-Based Machine Translation Quality: a Case Study http://arxiv.org/abs/1608.04631v2
NeuralREG: An end-to-end approach to referring expression generation http://arxiv.org/abs/1805.08093v1
Neurals Networks for Projecting Named Entities from English to Ewondo http://arxiv.org/abs/2004.13841v1
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning" http://arxiv.org/abs/2006.11524v3
New Oracle-Efficient Algorithms for Private Synthetic Data Release http://arxiv.org/abs/2007.05453v1
New Potential-Based Bounds for Prediction with Expert Advice http://arxiv.org/abs/1911.01641v3
New Protocols and Negative Results for Textual Entailment Data Collection http://arxiv.org/abs/2004.11997v2
Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies http://arxiv.org/abs/1804.11283v2
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling http://arxiv.org/abs/1804.09160v2
No Permanent Friends or Enemies: Tracking Relationships between Nations from News http://arxiv.org/abs/1904.08950v1
No-Regret Prediction in Marginally Stable Systems http://arxiv.org/abs/2002.02064v3
Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency http://arxiv.org/abs/1809.01812v1
Noise-tolerant, Reliable Active Classification with Comparison Queries http://arxiv.org/abs/2001.05497v1
Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization http://arxiv.org/abs/2002.02820v1
Non-Autoregressive Machine Translation with Latent Alignments http://arxiv.org/abs/2004.07437v3
Non-Parametric Calibration for Classification http://arxiv.org/abs/1906.04933v3
Non-Projective Dependency Parsing with Non-Local Transitions http://arxiv.org/abs/1710.09340v3
Non-convex Learning via Replica Exchange Stochastic Gradient MCMC http://arxiv.org/abs/2008.05367v2
Non-exchangeable feature allocation models with sublinear growth of the feature sizes http://arxiv.org/abs/2003.13491v1
Non-linear interlinkages and key objectives amongst the Paris Agreement and the Sustainable Development Goals http://arxiv.org/abs/2004.09318v1
Nonmyopic Gaussian Process Optimization with Macro-Actions http://arxiv.org/abs/2002.09670v1
Nonparametric Estimation in the Dynamic Bradley-Terry Model http://arxiv.org/abs/2003.00083v1
Nonparametric Score Estimators http://arxiv.org/abs/2005.10099v2
Norm-Based Curriculum Learning for Neural Machine Translation http://arxiv.org/abs/2006.02014v1
Normalized Flat Minima: Exploring Scale Invariant Definition of Flat Minima for Neural Networks using PAC-Bayesian Analysis http://arxiv.org/abs/1901.04653v2
Normalized Loss Functions for Deep Learning with Noisy Labels http://arxiv.org/abs/2006.13554v1
Normalizing Flows Across Dimensions http://arxiv.org/abs/2006.13070v1
Normalizing Flows on Tori and Spheres http://arxiv.org/abs/2002.02428v2
Normalizing Flows with Multi-Scale Autoregressive Priors http://arxiv.org/abs/2004.03891v1
Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses http://arxiv.org/abs/1911.03850v3
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation http://arxiv.org/abs/2009.09359v2
Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection http://arxiv.org/abs/2004.07667v2
Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers http://arxiv.org/abs/1805.08154v1
Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings http://arxiv.org/abs/2006.01938v1
NwQM: A neural quality assessment framework for Wikipedia http://arxiv.org/abs/2010.06969v1
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts http://arxiv.org/abs/1707.07102v1
ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020 http://arxiv.org/abs/2005.11861v1
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning http://arxiv.org/abs/2010.13611v2
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits http://arxiv.org/abs/1905.10040v4
Obfuscation for Privacy-preserving Syntactic Parsing http://arxiv.org/abs/1904.09585v2
Obfuscation via Information Density Estimation http://arxiv.org/abs/1910.08109v1
Object Ordering with Bidirectional Matchings for Visual Reasoning http://arxiv.org/abs/1804.06870v2
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes http://arxiv.org/abs/1907.00326v1
Obtaining Adjustable Regularization for Free via Iterate Averaging http://arxiv.org/abs/2008.06736v1
Obtaining Faithful Interpretations from Compositional Neural Networks http://arxiv.org/abs/2005.00724v2
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers http://arxiv.org/abs/2006.13916v1
Off-Policy Actor-Critic with Shared Experience Replay http://arxiv.org/abs/1909.11583v2
Offline Meta-Reinforcement Learning with Advantage Weighting http://arxiv.org/abs/2008.06043v2
Old Dog Learns New Tricks: Randomized UCB for Bandit Problems http://arxiv.org/abs/1910.04928v2
On Contrastive Learning for Likelihood-free Inference http://arxiv.org/abs/2002.03712v2
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent http://arxiv.org/abs/2007.00534v1
On Coresets For Regularized Regression http://arxiv.org/abs/2006.05440v3
On Cross-Dataset Generalization in Automatic Detection of Online Abuse http://arxiv.org/abs/2010.07414v2
On Detecting Data Pollution Attacks On Recommender Systems Using Sequential GANs http://arxiv.org/abs/2012.02509v1
On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data http://arxiv.org/abs/2010.11082v1
On Dimensional Linguistic Properties of the Word Embedding Space http://arxiv.org/abs/1910.02211v2
On Effective Parallelization of Monte Carlo Tree Search http://arxiv.org/abs/2006.08785v2
On Efficient Constructions of Checkpoints http://arxiv.org/abs/2009.13003v1
On Efficient Low Distortion Ultrametric Embedding http://arxiv.org/abs/2008.06700v1
On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models http://arxiv.org/abs/1903.06620v2
On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation http://arxiv.org/abs/2005.03642v1
On Extractive and Abstractive Neural Document Summarization with Transformer Language Models http://arxiv.org/abs/1909.03186v2
On Faithfulness and Factuality in Abstractive Summarization http://arxiv.org/abs/2005.00661v1
On Generalization Bounds of a Family of Recurrent Neural Networks http://arxiv.org/abs/1910.12947v2
On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems http://arxiv.org/abs/1906.00331v6
On Graph Classification Networks, Datasets and Baselines http://arxiv.org/abs/1905.04682v1
On Incorporating Structural Information to improve Dialogue Response Generation http://arxiv.org/abs/2005.14315v1
On Iterative Neural Network Pruning, Reinitialization, and the Similarity of Masks http://arxiv.org/abs/2001.05050v1
On Layer Normalization in the Transformer Architecture http://arxiv.org/abs/2002.04745v2
On Learning Language-Invariant Representations for Universal Machine Translation http://arxiv.org/abs/2008.04510v1
On Learning Sets of Symmetric Elements http://arxiv.org/abs/2002.08599v4
On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration http://arxiv.org/abs/2004.04719v1
On Losses for Modern Language Models http://arxiv.org/abs/2010.01694v1
On Maximization of Weakly Modular Functions: Guarantees of Multi-stage Algorithms, Tractability, and Hardness http://arxiv.org/abs/1805.11251v5
On Measuring Social Biases in Sentence Encoders http://arxiv.org/abs/1903.10561v1
On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment http://arxiv.org/abs/2010.03017v1
On Optimal Transformer Depth for Low-Resource Language Translation http://arxiv.org/abs/2004.04418v2
On Polynomial Approximations for Privacy-Preserving and Verifiable ReLU Networks http://arxiv.org/abs/2011.05530v1
On Primes, Log-Loss Scores and (No) Privacy http://arxiv.org/abs/2009.08559v1
On Random Subsampling of Gaussian Process Regression: A Graphon-Based Analysis http://arxiv.org/abs/1901.09541v1
On Second-Order Group Influence Functions for Black-Box Predictions http://arxiv.org/abs/1911.00418v2
On Suboptimality of Least Squares with Application to Estimation of Convex Bodies http://arxiv.org/abs/2006.04046v1
On The Evaluation of Machine Translation Systems Trained With Back-Translation http://arxiv.org/abs/1908.05204v2
On Thompson Sampling for Smoother-than-Lipschitz Bandits http://arxiv.org/abs/2001.02323v2
On Unbalanced Optimal Transport: An Analysis of Sinkhorn Algorithm http://arxiv.org/abs/2002.03293v2
On Using Very Large Target Vocabulary for Neural Machine Translation http://arxiv.org/abs/1412.2007v2
On Variational Learning of Controllable Representations for Text without Supervision http://arxiv.org/abs/1905.11975v4
On conditional versus marginal bias in multi-armed bandits http://arxiv.org/abs/2002.08422v2
On the Benefits of Models with Perceptually-Aligned Gradients http://arxiv.org/abs/2005.01499v1
On the Choice of Auxiliary Languages for Improved Sequence Tagging http://arxiv.org/abs/2005.09389v1
On the Complementary Nature of Knowledge Graph Embedding, Fine Grain Entity Types, and Language Modeling http://arxiv.org/abs/2010.05732v1
On the Computational Power of Transformers and its Implications in Sequence Modeling http://arxiv.org/abs/2006.09286v3
On the Consistency of Top-k Surrogate Losses http://arxiv.org/abs/1901.11141v2
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization http://arxiv.org/abs/1808.05671v3
On the Convergence of Continuous Constrained Optimization for Structure Learning http://arxiv.org/abs/2011.11150v2
On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings http://arxiv.org/abs/2002.12414v2
On the Convergence of SARAH and Beyond http://arxiv.org/abs/1906.02351v2
On the Convergence of Stochastic Gradient Descent with Low-Rank Projections for Convex Low-Rank Matrix Problems http://arxiv.org/abs/2001.11668v2
On the Cross-lingual Transferability of Monolingual Representations http://arxiv.org/abs/1910.11856v3
On the Encoder-Decoder Incompatibility in Variational Text Modeling and Beyond http://arxiv.org/abs/2004.09189v1
On the Expressivity of Neural Networks for Deep Reinforcement Learning http://arxiv.org/abs/1910.05927v3
On the Frailty of Universal POS Tags for Neural UD Parsers http://arxiv.org/abs/2010.01830v3
On the Generalization Benefit of Noise in Stochastic Gradient Descent http://arxiv.org/abs/2006.15081v1
On the Global Convergence Rates of Softmax Policy Gradient Methods http://arxiv.org/abs/2005.06392v2
On the Idiosyncrasies of the Mandarin Chinese Classifier System http://arxiv.org/abs/1902.10193v3
On the Inference Calibration of Neural Machine Translation http://arxiv.org/abs/2005.00963v1
On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation http://arxiv.org/abs/2005.01196v3
On the Limitations of Unsupervised Bilingual Dictionary Induction http://arxiv.org/abs/1805.03620v1
On the Linguistic Representational Power of Neural Machine Translation Models http://arxiv.org/abs/1911.00317v1
On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels http://arxiv.org/abs/1908.10292v2
On the Noisy Gradient Descent that Generalizes as SGD http://arxiv.org/abs/1906.07405v3
On the Number of Linear Regions of Convolutional Neural Networks http://arxiv.org/abs/2006.00978v2
On the Practical Computational Power of Finite Precision RNNs for Language Recognition http://arxiv.org/abs/1805.04908v1
On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation http://arxiv.org/abs/2007.01488v2
On the Robustness of Language Encoders against Grammatical Errors http://arxiv.org/abs/2005.05683v1
On the Role of Supervision in Unsupervised Constituency Parsing http://arxiv.org/abs/2010.02423v2
On the Sample Complexity of Adversarial Multi-Source PAC Learning http://arxiv.org/abs/2002.10384v2
On the Sample Complexity of Learning Sum-Product Networks http://arxiv.org/abs/1912.02765v2
On the Sentence Embeddings from Pre-trained Language Models http://arxiv.org/abs/2011.05864v1
On the Sparsity of Neural Machine Translation Models http://arxiv.org/abs/2010.02646v1
On the Spontaneous Emergence of Discrete and Compositional Signals http://arxiv.org/abs/2005.00110v1
On the Theoretical Properties of the Network Jackknife http://arxiv.org/abs/2004.08935v2
On the Unreasonable Effectiveness of the Greedy Algorithm: Greedy Adapts to Sharpness http://arxiv.org/abs/2002.04063v1
On the diminishing return of labeling clinical reports http://arxiv.org/abs/2010.14587v1
On the importance of pre-training data volume for compact language models http://arxiv.org/abs/2010.03813v2
On the interplay between noise and curvature and its effect on optimization and generalization http://arxiv.org/abs/1906.07774v2
On the optimality of kernels for high-dimensional clustering http://arxiv.org/abs/1912.00458v1
On the space-time expressivity of ResNets http://arxiv.org/abs/1910.09599v4
On-The-Fly Information Retrieval Augmentation for Language Models http://arxiv.org/abs/2007.01528v1
One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control http://arxiv.org/abs/2007.04976v1
One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases http://arxiv.org/abs/1810.05241v4
One Size Fits All: Can We Train One Denoiser for All Noise Levels? http://arxiv.org/abs/2005.09627v3
One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL http://arxiv.org/abs/2010.14484v2
Online Continuous DR-Submodular Maximization with Long-Term Budget Constraints http://arxiv.org/abs/1907.00316v1
Online Conversation Disentanglement with Pointer Networks http://arxiv.org/abs/2010.11080v1
Online Dense Subgraph Discovery via Blurred-Graph Feedback http://arxiv.org/abs/2006.13642v1
Online Forecasting of Total-Variation-bounded Sequences http://arxiv.org/abs/1906.03364v2
Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies http://arxiv.org/abs/2006.07554v1
Online Learning Using Only Peer Prediction http://arxiv.org/abs/1910.04382v2
Online Learning for Active Cache Synchronization http://arxiv.org/abs/2002.12014v2
Online Learning with Continuous Variations: Dynamic Regret and Reductions http://arxiv.org/abs/1902.07286v3
Online Learning with Imperfect Hints http://arxiv.org/abs/2002.04726v2
Online Pricing with Offline Data: Phase Transition and Inverse Square Law http://arxiv.org/abs/1910.08693v6
Online Safety Assurance for Deep Reinforcement Learning http://arxiv.org/abs/2010.03625v1
Online Segment to Segment Neural Transduction http://arxiv.org/abs/1609.08194v1
Online metric algorithms with untrusted predictions http://arxiv.org/abs/2003.02144v2
Online mirror descent and dual averaging: keeping pace in the dynamic case http://arxiv.org/abs/2006.02585v2
Open Domain Event Extraction Using Neural Latent Variable Models http://arxiv.org/abs/1906.06947v1
Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text http://arxiv.org/abs/1809.00782v1
Open Korean Corpora: A Practical Report http://arxiv.org/abs/2012.15621v1
Operation-Aware Soft Channel Pruning using Differentiable Masks http://arxiv.org/abs/2007.03938v2
OpinionDigest: A Simple Framework for Opinion Summarization http://arxiv.org/abs/2005.01901v1
Opportunistic Decoding with Timely Correction for Simultaneous Translation http://arxiv.org/abs/2005.00675v1
Optimal Client Sampling for Federated Learning http://arxiv.org/abs/2010.13723v1
Optimal Continual Learning has Perfect Memory and is NP-hard http://arxiv.org/abs/2006.05188v1
Optimal Randomized First-Order Methods for Least-Squares Problems http://arxiv.org/abs/2002.09488v2
Optimal Robust Learning of Discrete Distributions from Batches http://arxiv.org/abs/1911.08532v2
Optimal Transport-based Alignment of Learned Character Representations for String Similarity http://arxiv.org/abs/1907.10165v1
Optimal approximation for unconstrained non-submodular minimization http://arxiv.org/abs/1905.12145v3
Optimal group testing http://arxiv.org/abs/1911.02287v3
Optimal transport mapping via input convex neural networks http://arxiv.org/abs/1908.10962v2
Optimistic Policy Optimization with Bandit Feedback http://arxiv.org/abs/2002.08243v2
Optimistic bounds for multi-output prediction http://arxiv.org/abs/2002.09769v1
Optimization Theory for ReLU Neural Networks Trained with Normalization Layers http://arxiv.org/abs/2006.06878v1
Optimization from Structured Samples for Coverage Functions http://arxiv.org/abs/2007.02738v1
Optimization of Graph Total Variation via Active-Set-based Combinatorial Reconditioning http://arxiv.org/abs/2002.12236v1
Optimized Score Transformation for Fair Classification http://arxiv.org/abs/1906.00066v2
Optimizer Benchmarking Needs to Account for Hyperparameter Tuning http://arxiv.org/abs/1910.11758v4
Optimizing Black-box Metrics with Adaptive Surrogates http://arxiv.org/abs/2002.08605v1
Optimizing Data Usage via Differentiable Rewards http://arxiv.org/abs/1911.10088v2
Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach http://arxiv.org/abs/2008.00104v2
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning http://arxiv.org/abs/2007.07298v2
Optimizing Millions of Hyperparameters by Implicit Differentiation http://arxiv.org/abs/1911.02590v1
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports http://arxiv.org/abs/1911.02541v3
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space http://arxiv.org/abs/2004.04092v4
Option Discovery in the Absence of Rewards with Manifold Analysis http://arxiv.org/abs/2003.05878v2
Oracle Efficient Private Non-Convex Optimization http://arxiv.org/abs/1909.01783v3
Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization http://arxiv.org/abs/1907.04371v5
Ordinal Non-negative Matrix Factorization for Recommendation http://arxiv.org/abs/2006.01034v4
Orthogonal Gradient Descent for Continual Learning http://arxiv.org/abs/1910.07104v1
Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding http://arxiv.org/abs/1911.04910v3
Orthogonalized SGD and Nested Architectures for Anytime Neural Networks http://arxiv.org/abs/2008.06635v1
Out of the Echo Chamber: Detecting Countering Debate Speeches http://arxiv.org/abs/2005.01157v1
Overcoming Language Variation in Sentiment Analysis with Social Attention http://arxiv.org/abs/1511.06052v4
Overfitting in adversarially robust deep learning http://arxiv.org/abs/2002.11569v2
P-SIF: Document Embeddings Using Partition Averaging http://arxiv.org/abs/2005.09069v1
PAC Bounds for Imitation and Model-based Batch Learning of Contextual Markov Decision Processes http://arxiv.org/abs/2006.06352v2
PAC learning with stable and private predictions http://arxiv.org/abs/1911.10541v2
PACRR: A Position-Aware Neural IR Model for Relevance Matching http://arxiv.org/abs/1704.03940v3
PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization http://arxiv.org/abs/2008.10898v2
PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation http://arxiv.org/abs/2010.02301v1
PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation http://arxiv.org/abs/2004.07159v2
PAN: Path Integral Based Convolution for Deep Graph Neural Networks http://arxiv.org/abs/1904.10996v1
PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge http://arxiv.org/abs/2010.03725v1
PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions http://arxiv.org/abs/2007.10408v2
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization http://arxiv.org/abs/1912.08777v3
PENNI: Pruned Kernel Sharing for Efficient CNN Inference http://arxiv.org/abs/2005.07133v2
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models http://arxiv.org/abs/2006.09075v1
PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation http://arxiv.org/abs/2010.05143v1
PLAS: Latent Action Space for Offline Reinforcement Learning http://arxiv.org/abs/2011.07213v1
PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable http://arxiv.org/abs/1910.07931v3
POPCORN: Partially Observed Prediction COnstrained ReiNforcement Learning http://arxiv.org/abs/2001.04032v2
POSEIDON: Privacy-Preserving Federated Neural Network Learning http://arxiv.org/abs/2009.00349v3
PRover: Proof Generation for Interpretable Reasoning over Rules http://arxiv.org/abs/2010.02830v1
PackIt: A Virtual Environment for Geometric Planning http://arxiv.org/abs/2007.11121v1
Pan-Private Uniformity Testing http://arxiv.org/abs/1911.01452v3
Parallel Algorithm for Non-Monotone DR-Submodular Maximization http://arxiv.org/abs/1905.13272v1
Parallel Corpus Filtering via Pre-trained Language Models http://arxiv.org/abs/2005.06166v1
Parallel Data Augmentation for Formality Style Transfer http://arxiv.org/abs/2005.07522v1
Parallel Interactive Networks for Multi-Domain Dialogue State Generation http://arxiv.org/abs/2009.07616v3
Parallels Between Phase Transitions and Circuit Complexity? http://arxiv.org/abs/1904.05483v2
Parameters Estimation from the 21 cm signal using Variational Inference http://arxiv.org/abs/2005.02299v1
Parametric Gaussian Process Regressors http://arxiv.org/abs/1910.07123v3
Paraphrase Augmented Task-Oriented Dialog Generation http://arxiv.org/abs/2004.07462v2
Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity http://arxiv.org/abs/2008.04935v2
Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations http://arxiv.org/abs/1805.02442v1
PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks http://arxiv.org/abs/2007.11752v2
Pareto Probing: Trading Off Accuracy for Complexity http://arxiv.org/abs/2010.02180v2
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning http://arxiv.org/abs/2011.10024v1
Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information http://arxiv.org/abs/1704.07287v2
Parsing as Reduction http://arxiv.org/abs/1503.00030v1
Partial Trace Regression and Low-Rank Kraus Decomposition http://arxiv.org/abs/2007.00935v2
Partially-Aligned Data-to-Text Generation with Distant Supervision http://arxiv.org/abs/2010.01268v1
Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages http://arxiv.org/abs/1704.08914v2
Pathologies of Neural Models Make Interpretations Difficult http://arxiv.org/abs/1804.07781v3
Patient-Specific Effects of Medication Using Latent Force Models with Gaussian Processes http://arxiv.org/abs/1906.00226v1
PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D http://arxiv.org/abs/2012.07773v1
PeTra: A Sparsely Supervised Memory Model for People Tracking http://arxiv.org/abs/2005.02990v1
Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates http://arxiv.org/abs/1910.03231v7
Perceptual Generative Autoencoders http://arxiv.org/abs/1906.10335v2
Performative Prediction http://arxiv.org/abs/2002.06673v3
Permutation Invariant Graph Generation via Score-Based Generative Modeling http://arxiv.org/abs/2003.00638v1
Permutation invariant networks to learn Wasserstein metrics http://arxiv.org/abs/2010.05820v3
PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures http://arxiv.org/abs/1904.09378v4
Personality Trait Detection Using Bagged SVM over BERT Word Embedding Ensembles http://arxiv.org/abs/2010.01309v1
Personalized Language Model for Query Auto-Completion http://arxiv.org/abs/1804.09661v1
Personalized Neural Embeddings for Collaborative Filtering with Text http://arxiv.org/abs/1903.07860v1
Personalized neural language models for real-world query auto completion http://arxiv.org/abs/1804.06439v3
Personalizing Dialogue Agents: I have a dog, do you have pets too? http://arxiv.org/abs/1801.07243v5
Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good http://arxiv.org/abs/1906.06725v2
Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT http://arxiv.org/abs/2004.14786v2
Pessimism About Unknown Unknowns Inspires Conservatism http://arxiv.org/abs/2006.08753v1
Phone Features Improve Speech Translation http://arxiv.org/abs/2005.13681v1
Phonetic and Visual Priors for Decipherment of Informal Romanization http://arxiv.org/abs/2005.02517v1
Phonotactic Complexity and its Trade-offs http://arxiv.org/abs/2005.03774v1
Phrase-Based & Neural Unsupervised Machine Translation http://arxiv.org/abs/1804.07755v2
Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension http://arxiv.org/abs/1804.07726v2
Pieces of Eight: 8-bit Neural Machine Translation http://arxiv.org/abs/1804.05038v1
Piecewise Linear Regression via a Difference of Convex Functions http://arxiv.org/abs/2007.02422v3
Planning from Pixels using Inverse Dynamics Models http://arxiv.org/abs/2012.02419v1
Planning to Explore via Self-Supervised World Models http://arxiv.org/abs/2005.05960v2
Playing 20 Question Game with Policy-Based Reinforcement Learning http://arxiv.org/abs/1808.07645v3
Playing Text-Adventure Games with Graph-Based Deep Reinforcement Learning http://arxiv.org/abs/1812.01628v2
Please Mind the Root: Decoding Arborescences for Dependency Parsing http://arxiv.org/abs/2010.02550v2
PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking http://arxiv.org/abs/2004.14967v2
Plug and Play Autoencoders for Conditional Text Generation http://arxiv.org/abs/2010.02983v2
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination http://arxiv.org/abs/2001.08950v5
Pointer Graph Networks http://arxiv.org/abs/2006.06380v2
Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions http://arxiv.org/abs/1809.00800v1
Pointwise Paraphrase Appraisal is Potentially Problematic http://arxiv.org/abs/2005.11996v2
Poisson Learning: Graph Based Semi-Supervised Learning At Very Low Label Rates http://arxiv.org/abs/2006.11184v2
Policy Gradient as a Proxy for Dynamic Oracles in Constituency Parsing http://arxiv.org/abs/1806.03290v1
Policy Learning Using Weak Supervision http://arxiv.org/abs/2010.01748v2
Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations http://arxiv.org/abs/1809.01299v1
Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning http://arxiv.org/abs/2003.12909v2
Politeness Transfer: A Tag and Generate Approach http://arxiv.org/abs/2004.14257v2
Political Advertising Dataset: the use case of the Polish 2020 Presidential Elections http://arxiv.org/abs/2006.10207v1
PolyGen: An Autoregressive Generative Model of 3D Meshes http://arxiv.org/abs/2002.10880v1
Polyglot Semantic Parsing in APIs http://arxiv.org/abs/1803.06966v2
Polyglot Semantic Role Labeling http://arxiv.org/abs/1805.11598v1
Population Mapping in Informal Settlements with High-Resolution Satellite Imagery and Equitable Ground-Truth http://arxiv.org/abs/2009.08410v1
Population-Based Black-Box Optimization for Biological Sequence Design http://arxiv.org/abs/2006.03227v2
Position-Aware Tagging for Aspect Sentiment Triplet Extraction http://arxiv.org/abs/2010.02609v2
Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information http://arxiv.org/abs/2003.05955v1
Posterior Calibrated Training on Sentence Classification Tasks http://arxiv.org/abs/2004.14500v2
Posterior Control of Blackbox Generation http://arxiv.org/abs/2005.04560v1
PowerNorm: Rethinking Batch Normalization in Transformers http://arxiv.org/abs/2003.07845v2
PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction http://arxiv.org/abs/2010.13816v1
Pragmatically Informative Image Captioning with Character-Level Inference http://arxiv.org/abs/1804.05417v2
Pragmatically Informative Text Generation http://arxiv.org/abs/1904.01301v2
Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following http://arxiv.org/abs/1907.09671v1
Pre-Training Transformers as Energy-Based Cloze Models http://arxiv.org/abs/2012.08561v1
Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders http://arxiv.org/abs/1911.03882v4
Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning http://arxiv.org/abs/2004.14074v1
Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information http://arxiv.org/abs/2010.03142v2
Pre-training for Abstractive Document Summarization by Reinstating Source Text http://arxiv.org/abs/2004.01853v4
Pre-training on high-resource speech recognition improves low-resource speech-to-text translation http://arxiv.org/abs/1809.01431v2
PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution http://arxiv.org/abs/1810.09807v1
Precise Task Formalization Matters in Winograd Schema Evaluations http://arxiv.org/abs/2010.04043v1
Precise Tradeoffs in Adversarial Training for Linear Regression http://arxiv.org/abs/2002.10477v1
Predicting Choice with Set-Dependent Aggregation http://arxiv.org/abs/1906.06365v2
Predicting Clinical Trial Results by Implicit Evidence Integration http://arxiv.org/abs/2010.05639v1
Predicting Declension Class from Form and Meaning http://arxiv.org/abs/2005.00626v2
Predicting In-game Actions from Interviews of NBA Players http://arxiv.org/abs/1910.11292v3
Predicting Native Language from Gaze http://arxiv.org/abs/1704.07398v2
Predicting Performance for Natural Language Processing Tasks http://arxiv.org/abs/2005.00870v1
Predicting Semantic Relations using Global Graph Properties http://arxiv.org/abs/1808.08644v1
Predicting Unplanned Readmissions with Highly Unstructured Data http://arxiv.org/abs/2003.11622v2
Predicting and Analyzing Law-Making in Kenya http://arxiv.org/abs/2006.05493v1
Prediction Focused Topic Models via Feature Selection http://arxiv.org/abs/1910.05495v2
Prediction of Bayesian Intervals for Tropical Storms http://arxiv.org/abs/2003.05024v1
Prediction of neonatal mortality in Sub-Saharan African countries using data-level linkage of multiple surveys http://arxiv.org/abs/2011.12707v1
Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview http://arxiv.org/abs/1912.11078v2
Predictive Coding for Locally-Linear Control http://arxiv.org/abs/2003.01086v1
Predictive Multiplicity in Classification http://arxiv.org/abs/1909.06677v4
Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning http://arxiv.org/abs/2011.13093v1
Predictive Sampling with Forecasting Autoregressive Models http://arxiv.org/abs/2002.09928v2
Pretrained Language Model Embryology: The Birth of ALBERT http://arxiv.org/abs/2010.02480v2
Pretrained Transformers Improve Out-of-Distribution Robustness http://arxiv.org/abs/2004.06100v2
Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models http://arxiv.org/abs/2005.10389v1
Principal Neighbourhood Aggregation for Graph Nets http://arxiv.org/abs/2004.05718v5
Principled learning method for Wasserstein distributionally robust optimization with local perturbations http://arxiv.org/abs/2006.03333v2
Privacy Amplification by Decentralization http://arxiv.org/abs/2012.05326v1
Privacy-Preserving XGBoost Inference http://arxiv.org/abs/2011.04789v4
Privacy-preserving Neural Representations of Text http://arxiv.org/abs/1808.09408v1
Privacy-preserving collaborative machine learning on genomic data using TensorFlow http://arxiv.org/abs/2002.04344v2
Private Outsourced Bayesian Optimization http://arxiv.org/abs/2010.12799v1
Private Query Release Assisted by Public Data http://arxiv.org/abs/2004.10941v1
Private Reinforcement Learning with PAC and Regret Guarantees http://arxiv.org/abs/2009.09052v1
Private Stochastic Convex Optimization: Optimal Rates in Linear Time http://arxiv.org/abs/2005.04763v1
Privately Learning Markov Random Fields http://arxiv.org/abs/2002.09463v2
Privately Learning Thresholds: Closing the Exponential Gap http://arxiv.org/abs/1911.10137v1
Privately detecting changes in unknown distributions http://arxiv.org/abs/1910.01327v2
Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering http://arxiv.org/abs/2005.01898v1
Probabilistic FastText for Multi-Sense Word Embeddings http://arxiv.org/abs/1806.02901v1
Probabilistic Frame Induction http://arxiv.org/abs/1302.4813v1
Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling http://arxiv.org/abs/2009.03954v1
Probabilistic Typology: Deep Generative Models of Vowel Inventories http://arxiv.org/abs/1705.01684v1
Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order http://arxiv.org/abs/2004.11579v1
Probing Emergent Semantics in Predictive Agents via Question Answering http://arxiv.org/abs/2006.01016v1
Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction http://arxiv.org/abs/2004.08134v1
Probing Linguistic Systematicity http://arxiv.org/abs/2005.04315v2
Probing Neural Dialog Models for Conversational Understanding http://arxiv.org/abs/2006.08331v1
Probing Pretrained Language Models for Lexical Semantics http://arxiv.org/abs/2010.05731v1
Probing Task-Oriented Dialogue Representation from Language Models http://arxiv.org/abs/2010.13912v1
Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings http://arxiv.org/abs/1906.03608v1
Probing the Need for Visual Context in Multimodal Machine Translation http://arxiv.org/abs/1903.08678v2
Problems with Shapley-value-based explanations as feature importance measures http://arxiv.org/abs/2002.11097v2
Profile Consistency Identification for Open-domain Dialogue Agents http://arxiv.org/abs/2009.09680v3
Program Enhanced Fact Verification with Verbalization and Graph Attention Network http://arxiv.org/abs/2010.03084v5
Progressive Graph Learning for Open-Set Domain Adaptation http://arxiv.org/abs/2006.12087v2
Progressive Growing of Neural ODEs http://arxiv.org/abs/2003.03695v1
Progressive Identification of True Labels for Partial-Label Learning http://arxiv.org/abs/2002.08053v3
Progressive growing of self-organized hierarchical representations for exploration http://arxiv.org/abs/2005.06369v1
Projective Preferential Bayesian Optimization http://arxiv.org/abs/2002.03113v4
Pronoun-Targeted Fine-tuning for NMT with Hybrid Losses http://arxiv.org/abs/2010.07638v1
Proper Learning, Helly Number, and an Optimal SVM Bound http://arxiv.org/abs/2005.11818v1
Proper Network Interpretability Helps Adversarial Robustness in Classification http://arxiv.org/abs/2006.14748v2
Prophets, Secretaries, and Maximizing the Probability of Choosing the Best http://arxiv.org/abs/1910.03798v1
ProtoQA: A Question Answering Dataset for Prototypical Common-Sense Reasoning http://arxiv.org/abs/2005.00771v3
Provable Representation Learning for Imitation Learning via Bi-level Optimization http://arxiv.org/abs/2002.10544v1
Provable Self-Play Algorithms for Competitive Reinforcement Learning http://arxiv.org/abs/2002.04017v3
Provable Smoothness Guarantees for Black-Box Variational Inference http://arxiv.org/abs/1901.08431v4
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation http://arxiv.org/abs/1911.04384v9
Provably Efficient Exploration in Policy Optimization http://arxiv.org/abs/1912.05830v3
Provably Efficient Model-based Policy Adaptation http://arxiv.org/abs/2006.08051v1
Provably Efficient Reinforcement Learning with Linear Function Approximation http://arxiv.org/abs/1907.05388v2
Proving the Lottery Ticket Hypothesis: Pruning is All You Need http://arxiv.org/abs/2002.00585v1
Prta: A System to Support the Analysis of Propaganda Techniques in the News http://arxiv.org/abs/2005.05854v1
Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering http://arxiv.org/abs/1906.04229v1
Pun Generation with Surprise http://arxiv.org/abs/1904.06828v1
Putting An End to End-to-End: Gradient-Isolated Learning of Representations http://arxiv.org/abs/1905.11786v3
PuzzLing Machines: A Challenge on Learning From Small Data http://arxiv.org/abs/2004.13161v1
Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup http://arxiv.org/abs/2009.06962v2
PyHessian: Neural Networks Through the Lens of the Hessian http://arxiv.org/abs/1912.07145v3
PyMT5: multi-mode translation of natural language and Python code with transformers http://arxiv.org/abs/2010.03150v1
PySBD: Pragmatic Sentence Boundary Disambiguation http://arxiv.org/abs/2010.09657v1
Pyramid Convolutional RNN for MRI Reconstruction http://arxiv.org/abs/1912.00543v5
Q-learning with Language Model for Edit-based Unsupervised Summarization http://arxiv.org/abs/2010.04379v1
Q-value Path Decomposition for Deep Multiagent Reinforcement Learning http://arxiv.org/abs/2002.03950v1
QA2Explanation: Generating and Evaluating Explanations for Question Answering Systems over Knowledge Graph http://arxiv.org/abs/2010.08323v1
QuASE: Question-Answer Driven Sentence Encoding http://arxiv.org/abs/1909.00333v3
Quantifying Attention Flow in Transformers http://arxiv.org/abs/2005.00928v2
Quantifying Differences in Reward Functions http://arxiv.org/abs/2006.13900v2
Quantifying Intimacy in Language http://arxiv.org/abs/2011.03020v1
Quantifying Privacy Leakage in Graph Embedding http://arxiv.org/abs/2010.00906v1
Quantifying Similarity between Relations with Fact Distribution http://arxiv.org/abs/1907.08937v1
Quantifying the Effects of COVID-19 on Mental Health Support Forums http://arxiv.org/abs/2009.04008v1
Quantitative Argument Summarization and Beyond: Cross-Domain Key Point Analysis http://arxiv.org/abs/2010.05369v1
Quantitative stability of optimal transport maps and linearization of the 2-Wasserstein space http://arxiv.org/abs/1910.05954v1
Quantized Decentralized Stochastic Learning over Directed Graphs http://arxiv.org/abs/2002.09964v5
Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free http://arxiv.org/abs/1902.06332v3
Quantum Boosting http://arxiv.org/abs/2002.05056v2
Quantum Expectation-Maximization for Gaussian Mixture Models http://arxiv.org/abs/1908.06657v2
Quaternion Graph Neural Networks http://arxiv.org/abs/2008.05089v3
Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation http://arxiv.org/abs/1911.03842v2
Question Directed Graph Attention Network for Numerical Reasoning over Text http://arxiv.org/abs/2009.07448v1
R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games http://arxiv.org/abs/2006.16679v1
R4C: A Benchmark for Evaluating RC Systems to Get the Right Answer for the Right Reason http://arxiv.org/abs/1910.04601v2
RAMP-CNN: A Novel Neural Network for Enhanced Automotive Radar Object Recognition http://arxiv.org/abs/2011.08981v1
RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers http://arxiv.org/abs/1911.04942v4
RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization http://arxiv.org/abs/1908.08200v3
RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information http://arxiv.org/abs/1812.04361v2
RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr http://arxiv.org/abs/2007.03349v1
RL agents Implicitly Learning Human Preferences http://arxiv.org/abs/2002.06137v1
RNNs can generate bounded hierarchical languages with optimal memory http://arxiv.org/abs/2010.07515v1
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles http://arxiv.org/abs/2003.08039v3
RPD: A Distance Function Between Word Embeddings http://arxiv.org/abs/2005.08113v1
Radial Bayesian Neural Networks: Beyond Discrete Support In Large-Scale Bayesian Deep Learning http://arxiv.org/abs/1907.00865v3
Radioactive data: tracing through training http://arxiv.org/abs/2002.00937v1
Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization http://arxiv.org/abs/2006.04655v2
Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures http://arxiv.org/abs/2001.08370v1
Random Search and Reproducibility for Neural Architecture Search http://arxiv.org/abs/1902.07638v3
Random extrapolation for primal-dual coordinate descent http://arxiv.org/abs/2007.06528v1
Randomized Block-Diagonal Preconditioning for Parallel Learning http://arxiv.org/abs/2006.13591v2
Randomized Exploration in Generalized Linear Bandits http://arxiv.org/abs/1906.08947v2
Randomized Smoothing of All Shapes and Sizes http://arxiv.org/abs/2002.08118v5
Randomly Projected Additive Gaussian Processes for Regression http://arxiv.org/abs/1912.12834v1
Rank and run-time aware compression of NLP Applications http://arxiv.org/abs/2010.03193v1
Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering http://arxiv.org/abs/1810.00494v1
Ranking and Selecting Multi-Hop Knowledge Paths to Better Predict Human Needs http://arxiv.org/abs/1904.00676v1
Rapid Adaptation of Neural Machine Translation to New Languages http://arxiv.org/abs/1808.04189v1
Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned http://arxiv.org/abs/2004.05125v1
Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space http://arxiv.org/abs/1910.04329v4
Rational Recurrences http://arxiv.org/abs/1808.09357v1
Rationalizing Medical Relation Prediction from Corpus-level Statistics http://arxiv.org/abs/2005.00889v1
Rationalizing Neural Predictions http://arxiv.org/abs/1606.04155v2
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport http://arxiv.org/abs/2005.13111v1
Re-evaluating Evaluation in Text Summarization http://arxiv.org/abs/2010.07100v1
Re-translation versus Streaming for Simultaneous Translation http://arxiv.org/abs/2004.03643v3
ReLU Code Space: A Basis for Rating Network Quality Besides Accuracy http://arxiv.org/abs/2005.09903v1
Reactive Supervision: A New Method for Collecting Sarcasm Data http://arxiv.org/abs/2009.13080v1
Reading Between the Lines: Exploring Infilling in Visual Narratives http://arxiv.org/abs/2010.13944v1
Ready Policy One: World Building Through Active Learning http://arxiv.org/abs/2002.02693v1
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index http://arxiv.org/abs/1906.05807v2
Real-Time Optimisation for Online Learning in Auctions http://arxiv.org/abs/2010.10070v1
Real-time Classification from Short Event-Camera Streams using Input-filtering Neural ODEs http://arxiv.org/abs/2004.03156v1
Reasoning About Generalization via Conditional Mutual Information http://arxiv.org/abs/2001.09122v3
Reasoning About Pragmatics with Neural Listeners and Speakers http://arxiv.org/abs/1604.00562v2
Reasoning Over History: Context Aware Visual Dialog http://arxiv.org/abs/2011.00669v1
Reasoning Over Semantic-Level Graph for Fact Checking http://arxiv.org/abs/1909.03745v3
Reasoning about Actions and State Changes by Injecting Commonsense Knowledge http://arxiv.org/abs/1808.10012v1
Reasoning about Goals, Steps, and Temporal Ordering with WikiHow http://arxiv.org/abs/2009.07690v2
Reasoning with Latent Structure Refinement for Document-Level Relation Extraction http://arxiv.org/abs/2005.06312v3
Reasoning with Sarcasm by Reading In-between http://arxiv.org/abs/1805.02856v1
Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting http://arxiv.org/abs/2004.12651v1
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes http://arxiv.org/abs/1809.00812v1
Recipes for building an open-domain chatbot http://arxiv.org/abs/2004.13637v2
Recognizing Implicit Discourse Relations via Repeated Reading: Neural Networks with Multi-Level Attention http://arxiv.org/abs/1609.06380v1
Recovery of Sparse Signals from a Mixture of Linear Samples http://arxiv.org/abs/2006.16406v2
Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension http://arxiv.org/abs/2005.08056v2
Recurrent Event Network: Autoregressive Structure Inference over Temporal Knowledge Graphs http://arxiv.org/abs/1904.05530v4
Recurrent Hierarchical Topic-Guided RNN for Language Generation http://arxiv.org/abs/1912.10337v2
Recurrent Interaction Network for Jointly Extracting Entities and Classifying Relations http://arxiv.org/abs/2005.00162v2
Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment http://arxiv.org/abs/2005.00165v3
Recurrent Neural Networks as Weighted Language Recognizers http://arxiv.org/abs/1711.05408v2
Recurrent Neural Networks in Linguistic Theory: Revisiting Pinker and Prince (1988) and the Past Tense Debate http://arxiv.org/abs/1807.04783v2
Recurrent babbling: evaluating the acquisition of grammar from limited input data http://arxiv.org/abs/2010.04637v1
Recursive Subtree Composition in LSTM-Based Dependency Parsing http://arxiv.org/abs/1902.09781v1
Reducibility and Statistical-Computational Gaps from Secret Leakage http://arxiv.org/abs/2005.08099v2
Reducing Gender Bias in Abusive Language Detection http://arxiv.org/abs/1808.07231v1
Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem http://arxiv.org/abs/2004.04498v3
Reducing Sampling Error in Batch Temporal Difference Learning http://arxiv.org/abs/2008.06738v1
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts http://arxiv.org/abs/2011.04554v1
Refined bounds for algorithm configuration: The knife-edge of dual class approximability http://arxiv.org/abs/2006.11827v2
Reflection-based Word Attribute Transfer http://arxiv.org/abs/2007.02598v2
Reformulating Unsupervised Style Transfer as Paraphrase Generation http://arxiv.org/abs/2010.05700v1
Regression Networks for Meta-Learning Few-Shot Classification http://arxiv.org/abs/1905.13613v2
Regularity as Regularization: Smooth and Strongly Convex Brenier Potentials in Optimal Transport http://arxiv.org/abs/1905.10812v5
Regularization via Structural Label Smoothing http://arxiv.org/abs/2001.01900v2
Regularized Autoencoders via Relaxed Injective Probability Flow http://arxiv.org/abs/2002.08927v1
Regularized Context Gates on Transformer for Machine Translation http://arxiv.org/abs/1908.11020v2
Regularized Inverse Reinforcement Learning http://arxiv.org/abs/2010.03691v2
Regularized Optimal Transport is Ground Cost Adversarial http://arxiv.org/abs/2002.03967v3
Regularizing Dialogue Generation by Imitating Implicit Scenarios http://arxiv.org/abs/2010.01893v2
Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus http://arxiv.org/abs/1903.10671v2
Reinforcement Learning Generalization with Surprise Minimization http://arxiv.org/abs/2004.12399v2
Reinforcement Learning based Curriculum Optimization for Neural Machine Translation http://arxiv.org/abs/1903.00041v1
Reinforcement Learning for Integer Programming: Learning to Cut http://arxiv.org/abs/1906.04859v3
Reinforcement Learning for Molecular Design Guided by Quantum Mechanics http://arxiv.org/abs/2002.07717v2
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound http://arxiv.org/abs/1905.10389v2
Reinforcement Learning through Active Inference http://arxiv.org/abs/2002.12636v1
Reinforcement Learning with Chromatic Networks for Compact Architecture Search http://arxiv.org/abs/1907.06511v3
Reinforcement Learning with Latent Flow http://arxiv.org/abs/2101.01857v1
Relabel the Noise: Joint Extraction of Entities and Relations via Cooperative Multiagents http://arxiv.org/abs/2004.09930v1
Relating Simple Sentence Representations in Deep Neural Networks and the Brain http://arxiv.org/abs/1906.11861v1
Relation Embedding with Dihedral Group in Knowledge Graph http://arxiv.org/abs/1906.00687v1
Relation Extraction with Explanation http://arxiv.org/abs/2005.14271v1
Relational Graph Attention Network for Aspect-based Sentiment Analysis http://arxiv.org/abs/2004.12362v1
Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment http://arxiv.org/abs/1605.05433v2
Relative gradient optimization of the Jacobian term in unsupervised deep learning http://arxiv.org/abs/2006.15090v2
Relaxing Bijectivity Constraints with Continuously Indexed Normalising Flows http://arxiv.org/abs/1909.13833v4
Relevance of Rotationally Equivariant Convolutions for Predicting Molecular Properties http://arxiv.org/abs/2008.08461v4
Reliable Fidelity and Diversity Metrics for Generative Models http://arxiv.org/abs/2002.09797v2
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks http://arxiv.org/abs/2003.01690v2
Rep the Set: Neural Networks for Learning Set Representations http://arxiv.org/abs/1904.01962v2
Replicability Analysis for Natural Language Processing: Testing Significance with Multiple Datasets http://arxiv.org/abs/1709.09500v1
Representation Learning for Discovering Phonemic Tone Contours http://arxiv.org/abs/1910.08987v2
Representation Learning for Grounded Spatial Reasoning http://arxiv.org/abs/1707.03938v2
Representations of language in a model of visually grounded speech signal http://arxiv.org/abs/1702.01991v3
Representing Unordered Data Using Complex-Weighted Multiset Automata http://arxiv.org/abs/2001.00610v3
Representing and Denoising Wearable ECG Recordings http://arxiv.org/abs/2012.00110v1
Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference http://arxiv.org/abs/2009.09364v2
Repurposing Entailment for Multi-Hop Question Answering Tasks http://arxiv.org/abs/1904.09380v1
Reserve Pricing in Repeated Second-Price Auctions with Strategic Bidders http://arxiv.org/abs/1906.09331v1
Reset-Free Lifelong Learning with Skill-Space Planning http://arxiv.org/abs/2012.03548v2
Resolution Dependent GAN Interpolation for Controllable Image Synthesis Between Domains http://arxiv.org/abs/2010.05334v3
Resolving Spurious Correlations in Causal Models of Environments via Interventions http://arxiv.org/abs/2002.05217v2
Response Selection for Multi-Party Conversations with Dynamic Topic Tracking http://arxiv.org/abs/2010.07785v1
Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation http://arxiv.org/abs/2005.06128v1
RethinkCWS: Is Chinese Word Segmentation a Solved Task? http://arxiv.org/abs/2011.06858v2
Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models http://arxiv.org/abs/1902.08858v2
Rethinking Dialogue State Tracking with Reasoning http://arxiv.org/abs/2005.13129v2
Retrieval-Based Neural Code Generation http://arxiv.org/abs/1808.10025v1
Retrofitting Structure-aware Transformer Language Model for End Tasks http://arxiv.org/abs/2009.07408v1
Reusability and Transferability of Macro Actions for Reinforcement Learning http://arxiv.org/abs/1908.01478v2
Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT http://arxiv.org/abs/2009.07610v3
Reverse Engineering Configurations of Neural Text Generation Models http://arxiv.org/abs/2004.06201v1
Reverse-Engineering Deep ReLU Networks http://arxiv.org/abs/1910.00744v2
Review-based Question Generation with Adaptive Instance Transfer and Augmentation http://arxiv.org/abs/1911.01556v2
Revisiting Character-Based Neural Machine Translation with Capacity and Compression http://arxiv.org/abs/1808.09943v1
Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy http://arxiv.org/abs/2002.11572v1
Revisiting Fundamentals of Experience Replay http://arxiv.org/abs/2007.06700v1
Revisiting Joint Modeling of Cross-document Entity and Event Coreference Resolution http://arxiv.org/abs/1906.01753v1
Revisiting Low-Resource Neural Machine Translation: A Case Study http://arxiv.org/abs/1905.11901v1
Revisiting Modularized Multilingual NMT to Meet Industrial Demands http://arxiv.org/abs/2010.09402v1
Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research http://arxiv.org/abs/2011.14826v1
Revisiting Stochastic Extragradient http://arxiv.org/abs/1905.11373v2
Revisiting Unsupervised Relation Extraction http://arxiv.org/abs/2005.00087v1
Revisiting the Context Window for Cross-lingual Word Embeddings http://arxiv.org/abs/2004.10813v1
Revisiting the Importance of Encoding Logic Rules in Sentiment Classification http://arxiv.org/abs/1808.07733v1
RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling http://arxiv.org/abs/2010.08738v1
Rigging the Lottery: Making All Tickets Winners http://arxiv.org/abs/1911.11134v2
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference http://arxiv.org/abs/1902.01007v4
Rigid Formats Controlled Text Generation http://arxiv.org/abs/2004.08022v1
RikiNet: Reading Wikipedia Pages for Natural Question Answering http://arxiv.org/abs/2004.14560v1
Risk Assessment for Machine Learning Models http://arxiv.org/abs/2011.04328v1
Risk Bounds for Learning Multiple Components with Permutation-Invariant Losses http://arxiv.org/abs/1904.07594v2
Rk-means: Fast Clustering for Relational Data http://arxiv.org/abs/1910.04939v1
Robust Bayesian Classification Using an Optimistic Score Ratio http://arxiv.org/abs/2007.04458v1
Robust Cross-lingual Hypernymy Detection using Dependency Context http://arxiv.org/abs/1803.11291v1
Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning http://arxiv.org/abs/1805.09927v1
Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation http://arxiv.org/abs/2012.04839v1
Robust Encodings: A Framework for Combating Adversarial Typos http://arxiv.org/abs/2005.01229v1
Robust Learning from Discriminative Feature Feedback http://arxiv.org/abs/2003.03946v1
Robust Optimisation Monte Carlo http://arxiv.org/abs/1904.00670v3
Robust Outlier Arm Identification http://arxiv.org/abs/2009.09988v1
Robust Prediction of Punctuation and Truecasing for Medical ASR http://arxiv.org/abs/2007.02025v2
Robust Reinforcement Learning using Adversarial Populations http://arxiv.org/abs/2008.01825v2
Robust Variational Autoencoders for Outlier Detection and Repair of Mixed-Type Data http://arxiv.org/abs/1907.06671v2
Robust Visual Domain Randomization for Reinforcement Learning http://arxiv.org/abs/1910.10537v2
Robust and Private Learning of Halfspaces http://arxiv.org/abs/2011.14580v1
Robust and Stable Black Box Explanations http://arxiv.org/abs/2011.06169v1
Robust model training and generalisation with Studentising flows http://arxiv.org/abs/2006.06599v2
Robust posterior inference when statistically emulating forward simulations http://arxiv.org/abs/2004.11929v1
Robustifying Sequential Neural Processes http://arxiv.org/abs/2006.15987v1
Robustness for Non-Parametric Classification: A Generic Attack and Defense http://arxiv.org/abs/1906.03310v2
Robustness to Programmable String Transformations via Augmented Abstract Training http://arxiv.org/abs/2002.09579v4
Robustness to Spurious Correlations via Human Annotations http://arxiv.org/abs/2007.06661v2
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding http://arxiv.org/abs/2010.07954v1
RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark http://arxiv.org/abs/2010.15925v2
S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking http://arxiv.org/abs/1609.08075v1
S2ORC: The Semantic Scholar Open Research Corpus http://arxiv.org/abs/1911.02782v3
S2RMs: Spatially Structured Recurrent Modules http://arxiv.org/abs/2007.06533v1
SAFENet: Self-Supervised Monocular Depth Estimation with Semantic-Aware Feature Extraction http://arxiv.org/abs/2010.02893v3
SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions http://arxiv.org/abs/2005.14424v1
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning http://arxiv.org/abs/1910.06378v3
SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations http://arxiv.org/abs/2004.12934v1
SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations http://arxiv.org/abs/1612.06778v3
SDE-Net: Equipping Deep Neural Networks with Uncertainty Estimates http://arxiv.org/abs/2008.10546v1
SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks http://arxiv.org/abs/2006.10503v3
SECTOR: A Neural Model for Coherent Topic Segmentation and Classification http://arxiv.org/abs/1902.04793v1
SGD Learns One-Layer Networks in WGANs http://arxiv.org/abs/1910.07030v2
SHAPED: Shared-Private Encoder-Decoder for Text Style Adaptation http://arxiv.org/abs/1804.04093v1
SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection http://arxiv.org/abs/2006.11572v2
SIGN: Scalable Inception Graph Neural Networks http://arxiv.org/abs/2004.11198v3
SIGTYP 2020 Shared Task: Prediction of Typological Features http://arxiv.org/abs/2010.08246v2
SIGUA: Forgetting May Make Learning with Noisy Labels More Robust http://arxiv.org/abs/1809.11008v3
SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task http://arxiv.org/abs/2010.05122v1
SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis http://arxiv.org/abs/2005.05635v2
SLEDGE-Z: A Zero-Shot Baseline for COVID-19 Literature Search http://arxiv.org/abs/2010.05987v1
SLM: Learning a Discourse Language Representation with Sentence Unshuffling http://arxiv.org/abs/2010.16249v1
SLURP: A Spoken Language Understanding Resource Package http://arxiv.org/abs/2011.13205v1
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization http://arxiv.org/abs/1911.03437v2
SMArtCast: Predicting soil moisture interpolations into the future using Earth observation data in a deep learning framework http://arxiv.org/abs/2003.10823v2
SOTERIA: In Search of Efficient Neural Networks for Private Inference http://arxiv.org/abs/2007.12934v1
SOrT-ing VQA Models : Contrastive Gradient Learning for Improved Consistency http://arxiv.org/abs/2010.10038v2
SQuAD: 100,000+ Questions for Machine Comprehension of Text http://arxiv.org/abs/1606.05250v3
SRLGRN: Semantic Role Labeling Graph Reasoning Network http://arxiv.org/abs/2010.03604v2
SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning http://arxiv.org/abs/2009.09566v2
SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness http://arxiv.org/abs/2009.10195v2
STARC: Structured Annotations for Reading Comprehension http://arxiv.org/abs/2004.14797v1
STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation http://arxiv.org/abs/2010.01717v1
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization http://arxiv.org/abs/2005.03724v1
SUPP.AI: Finding Evidence for Supplement-Drug Interactions http://arxiv.org/abs/1909.08135v3
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference http://arxiv.org/abs/1808.05326v1
SacreROUGE: An Open-Source Library for Using and Developing Summarization Evaluation Metrics http://arxiv.org/abs/2007.05374v1
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences http://arxiv.org/abs/2002.09089v4
Safe Reinforcement Learning in Constrained Markov Decision Processes http://arxiv.org/abs/2008.06626v1
Safe Reinforcement Learning with Natural Language Constraints http://arxiv.org/abs/2010.05150v1
SafeCity: Understanding Diverse Forms of Sexual Harassment Personal Stories http://arxiv.org/abs/1809.04739v2
Saliency Learning: Teaching the Model Where to Pay Attention http://arxiv.org/abs/1902.08649v3
SalsaNext: Fast, Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving http://arxiv.org/abs/2003.03653v3
Sample Amplification: Increasing Dataset Size even when Learning is Impossible http://arxiv.org/abs/1904.12053v2
Sample Complexity Bounds for 1-bit Compressive Sensing and Binary Stable Embeddings with Generative Priors http://arxiv.org/abs/2002.01697v3
Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems http://arxiv.org/abs/1901.08562v2
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles http://arxiv.org/abs/1910.10597v1
Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication http://arxiv.org/abs/2011.00424v1
Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning http://arxiv.org/abs/2006.11751v2
Sample-efficient proper PAC learning with approximate differential privacy http://arxiv.org/abs/2012.03893v1
Sarcasm Detection in Tweets with BERT and GloVe Embeddings http://arxiv.org/abs/2006.11512v1
Sarcasm Detection using Context Separators in Online Discourse http://arxiv.org/abs/2006.00850v1
Satellite-based Prediction of Forage Conditions for Livestock in Northern Kenya http://arxiv.org/abs/2004.04081v2
Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features http://arxiv.org/abs/1709.01189v1
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling http://arxiv.org/abs/1611.08034v2
Scalable Deep Generative Modeling for Sparse Graphs http://arxiv.org/abs/2006.15502v1
Scalable Differentiable Physics for Learning and Control http://arxiv.org/abs/2007.02168v1
Scalable Differential Privacy with Certified Robustness in Adversarial Learning http://arxiv.org/abs/1903.09822v5
Scalable Exact Inference in Multi-Output Gaussian Processes http://arxiv.org/abs/1911.06287v3
Scalable Gaussian Process Regression for Kernels with a Non-Stationary Phase http://arxiv.org/abs/1912.11713v1
Scalable Gradients for Stochastic Differential Equations http://arxiv.org/abs/2001.01328v6
Scalable Identification of Partially Observed Systems with Certainty-Equivalent EM http://arxiv.org/abs/2006.11615v1
Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering http://arxiv.org/abs/2005.00646v2
Scalable Nearest Neighbor Search for Optimal Transport http://arxiv.org/abs/1910.04126v4
Scalable Syntax-Aware Language Models Using Knowledge Distillation http://arxiv.org/abs/1906.06438v1
Scalable Zero-shot Entity Linking with Dense Entity Retrieval http://arxiv.org/abs/1911.03814v3
Scalable and Efficient Comparison-based Search without Features http://arxiv.org/abs/1905.05049v3
Scaling Hidden Markov Language Models http://arxiv.org/abs/2011.04640v1
Scaling up Hybrid Probabilistic Inference with Logical and Arithmetic Constraints via Message Passing http://arxiv.org/abs/2003.00126v2
Scattering GCN: Overcoming Oversmoothness in Graph Convolutional Networks http://arxiv.org/abs/2003.08414v2
Scene Graph Parsing as Dependency Parsing http://arxiv.org/abs/1803.09189v1
Scene Graph Reasoning for Visual Question Answering http://arxiv.org/abs/2007.01072v1
Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension http://arxiv.org/abs/1907.05457v2
SciDTB: Discourse Dependency TreeBank for Scientific Abstracts http://arxiv.org/abs/1806.03653v1
SciREX: A Challenge Dataset for Document-Level Information Extraction http://arxiv.org/abs/2005.00512v1
Score Combination for Improved Parallel Corpus Filtering for Low Resource Conditions http://arxiv.org/abs/2011.07933v1
Scoring Lexical Entailment with a Supervised Directional Similarity Network http://arxiv.org/abs/1805.09355v1
Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions http://arxiv.org/abs/1912.02566v3
Screenplay Quality Assessment: Can We Predict Who Gets Nominated? http://arxiv.org/abs/2005.06123v1
Screenplay Summarization Using Latent Narrative Structure http://arxiv.org/abs/2004.12727v1
ScriptWriter: Narrative-Guided Script Generation http://arxiv.org/abs/2005.10331v2
Secure Medical Image Analysis with CrypTFlow http://arxiv.org/abs/2012.05064v1
Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation http://arxiv.org/abs/2005.00308v1
Selecting Machine-Translated Data for Quick Bootstrapping of a Natural Language Understanding System http://arxiv.org/abs/1805.09119v1
Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets http://arxiv.org/abs/1905.06221v4
Selective Attention for Context-aware Neural Machine Translation http://arxiv.org/abs/1903.08788v2
Selective Dyna-style Planning Under Limited Model Capacity http://arxiv.org/abs/2007.02418v2
Selective Encoding for Abstractive Sentence Summarization http://arxiv.org/abs/1704.07073v1
Selective Question Answering under Domain Shift http://arxiv.org/abs/2006.09462v1
Self-Attentive Associative Memory http://arxiv.org/abs/2002.03519v3
Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation http://arxiv.org/abs/2004.03151v2
Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training http://arxiv.org/abs/2006.11280v1
Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks http://arxiv.org/abs/2009.08445v2
Self-Supervised Policy Adaptation during Deployment http://arxiv.org/abs/2007.04309v2
Self-Training for Unsupervised Parsing with PRPN http://arxiv.org/abs/2005.13455v1
Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering http://arxiv.org/abs/2005.00316v2
Self-supervised Label Augmentation via Input Transformations http://arxiv.org/abs/1910.05872v2
SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction http://arxiv.org/abs/2004.02438v2
Selfish Robustness and Equilibria in Multi-Player Bandits http://arxiv.org/abs/2002.01197v2
Semantic Annotation for Microblog Topics Using Wikipedia Temporal Information http://arxiv.org/abs/1701.03939v1
Semantic Drift in Multilingual Representations http://arxiv.org/abs/1904.10820v4
Semantic Enrichment of Nigerian Pidgin English for Contextual Sentiment Classification http://arxiv.org/abs/2003.12450v1
Semantic Graphs for Generating Deep Questions http://arxiv.org/abs/2004.12704v1
Semantic Label Smoothing for Sequence to Sequence Problems http://arxiv.org/abs/2010.07447v1
Semantic Parsing for Task Oriented Dialog using Hierarchical Representations http://arxiv.org/abs/1810.07942v1
Semantic Parsing to Probabilistic Programs for Situated Question Answering http://arxiv.org/abs/1606.07046v2
Semantic Parsing with Dual Learning http://arxiv.org/abs/1907.05343v2
Semantic Parsing with Semi-Supervised Sequential Autoencoders http://arxiv.org/abs/1609.09315v1
Semantic Role Labeling Guided Multi-turn Dialogue ReWriter http://arxiv.org/abs/2010.01417v1
Semantic Role Labeling as Syntactic Dependency Parsing http://arxiv.org/abs/2010.11170v1
Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data http://arxiv.org/abs/1808.09409v2
Semantic Scaffolds for Pseudocode-to-Code Generation http://arxiv.org/abs/2005.05927v1
Semantic Structural Evaluation for Text Simplification http://arxiv.org/abs/1810.05022v1
Semantic expressive capacity with bounded memory http://arxiv.org/abs/1906.11752v1
Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems http://arxiv.org/abs/1508.01745v2
Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems http://arxiv.org/abs/1811.00720v2
Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems http://arxiv.org/abs/2010.06823v1
Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components http://arxiv.org/abs/2003.06804v1
Semi-Supervised Bilingual Lexicon Induction with Two-way Interaction http://arxiv.org/abs/2010.07101v1
Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation http://arxiv.org/abs/2005.04379v1
Semi-Supervised Learning with Normalizing Flows http://arxiv.org/abs/1912.13025v1
Semi-Supervised QA with Generative Domain-Adaptive Nets http://arxiv.org/abs/1702.02206v2
Semi-Supervised StyleGAN for Disentanglement Learning http://arxiv.org/abs/2003.03461v3
Semi-supervised User Geolocation via Graph Convolutional Networks http://arxiv.org/abs/1804.08049v4
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees http://arxiv.org/abs/2003.01013v1
SenseBERT: Driving Some Sense into BERT http://arxiv.org/abs/1908.05646v2
Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity http://arxiv.org/abs/1911.03700v3
Sentence Simplification with Deep Reinforcement Learning http://arxiv.org/abs/1703.10931v2
Sentences with Gapping: Parsing and Reconstructing Elided Predicates http://arxiv.org/abs/1804.06922v1
SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics http://arxiv.org/abs/2005.04114v4
SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge http://arxiv.org/abs/1911.02493v3
Seq2Edits: Sequence Transduction Using Span-level Edit Operations http://arxiv.org/abs/2009.11136v1
SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup http://arxiv.org/abs/2010.02322v1
Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation http://arxiv.org/abs/1906.01569v1
Sequence-Level Knowledge Distillation http://arxiv.org/abs/1606.07947v4
Sequence-Level Mixed Sample Data Augmentation http://arxiv.org/abs/2011.09039v1
Sequence-to-Action: End-to-End Semantic Graph Generation for Semantic Parsing http://arxiv.org/abs/1809.00773v1
Sequential Cooperative Bayesian Inference http://arxiv.org/abs/2002.05706v3
Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-based Chatbots http://arxiv.org/abs/1612.01627v2
Sequential Transfer in Reinforcement Learning with a Generative Model http://arxiv.org/abs/2007.00722v1
Serverless inferencing on Kubernetes http://arxiv.org/abs/2007.07366v2
Set Functions for Time Series http://arxiv.org/abs/1909.12064v3
Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events http://arxiv.org/abs/2004.04295v1
Shape of synth to come: Why we should use synthetic data for English surface realization http://arxiv.org/abs/2005.02693v1
Shaping Visual Representations with Language for Few-shot Classification http://arxiv.org/abs/1911.02683v2
Shared-Private Bilingual Word Embeddings for Neural Machine Translation http://arxiv.org/abs/1906.03100v1
Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models http://arxiv.org/abs/1902.00194v3
Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion http://arxiv.org/abs/2003.04493v2
Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU http://arxiv.org/abs/1705.01991v1
Sharper bounds for uniformly stable algorithms http://arxiv.org/abs/1910.07833v2
Sheaf Neural Networks http://arxiv.org/abs/2012.06333v1
SherLIiC: A Typed Event-Focused Lexical Inference Benchmark for Evaluating Natural Language Inference http://arxiv.org/abs/1906.01393v1
Short-Term Meaning Shift: A Distributional Exploration http://arxiv.org/abs/1809.03169v3
Should All Cross-Lingual Embeddings Speak English? http://arxiv.org/abs/1911.03058v2
Showing Your Work Doesn't Always Work http://arxiv.org/abs/2004.13705v1
SimGANs: Simulator-Based Generative Adversarial Networks for ECG Synthesis to Improve Deep ECG Classification http://arxiv.org/abs/2006.15353v1
SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity http://arxiv.org/abs/1608.00869v4
Similarity Analysis of Contextual Word Representation Models http://arxiv.org/abs/2005.01172v1
Simple Unsupervised Summarization by Contextual Matching http://arxiv.org/abs/1907.13337v1
Simple and Deep Graph Convolutional Networks http://arxiv.org/abs/2007.02133v1
Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives http://arxiv.org/abs/1905.10847v1
Simple and Effective Multi-Paragraph Reading Comprehension http://arxiv.org/abs/1710.10723v2
Simple and Effective Text Simplification Using Semantic and Neural Methods http://arxiv.org/abs/1810.05104v1
Simple and sharp analysis of k-means||
SimpleQuestions Nearly Solved: A New Upperbound and Baseline Approach http://arxiv.org/abs/1804.08798v1
Simpler but More Accurate Semantic Dependency Parsing http://arxiv.org/abs/1807.01396v1
Simplify the Usage of Lexicon in Chinese NER http://arxiv.org/abs/1908.05969v2
Simplifying Neural Machine Translation with Addition-Subtraction Twin-Gated Recurrent Networks http://arxiv.org/abs/1810.12546v1
Simulator Calibration under Covariate Shift with Kernels http://arxiv.org/abs/1809.08159v4
Simultaneous Inference for Massive Data: Distributed Bootstrap http://arxiv.org/abs/2002.08443v1
Simultaneous Machine Translation with Visual Context http://arxiv.org/abs/2009.07310v3
Simultaneous Translation Policies: From Fixed to Adaptive http://arxiv.org/abs/2004.13169v2
Simultaneous Translation with Flexible Policy via Restricted Imitation Learning http://arxiv.org/abs/1906.01135v2
Simultaneous paraphrasing and translation by fine-tuning Transformer models http://arxiv.org/abs/2005.05570v1
Single Model Ensemble using Pseudo-Tags and Distinct Vectors http://arxiv.org/abs/2005.00879v1
Single Point Transductive Prediction http://arxiv.org/abs/1908.02341v4
Single Shot Multitask Pedestrian Detection and Behavior Prediction http://arxiv.org/abs/2101.02232v1
Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language http://arxiv.org/abs/2004.12440v2
Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation http://arxiv.org/abs/1805.10209v2
Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory http://arxiv.org/abs/1809.05296v5
Sketch-Driven Regular Expression Generation from Natural Language and Examples http://arxiv.org/abs/1908.05848v2
Sketching Transformed Matrices with Applications to Natural Language Processing http://arxiv.org/abs/2002.09812v1
Skill Transfer via Partially Amortized Hierarchical Planning http://arxiv.org/abs/2011.13897v1
SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling http://arxiv.org/abs/2010.02693v2
Small Data, Big Decisions: Model Selection in the Small-Data Regime http://arxiv.org/abs/2009.12583v1
Small-GAN: Speeding Up GAN Training Using Core-sets http://arxiv.org/abs/1910.13540v1
Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes http://arxiv.org/abs/1909.02553v4
Social Bias Frames: Reasoning about Social and Power Implications of Language http://arxiv.org/abs/1911.03891v3
Social Biases in NLP Models as Barriers for Persons with Disabilities http://arxiv.org/abs/2005.00813v1
Social Chemistry 101: Learning to Reason about Social and Moral Norms http://arxiv.org/abs/2011.00620v1
Social Media Attributions in the Context of Water Crisis http://arxiv.org/abs/2001.01697v1
Soft Gazetteers for Low-Resource Named Entity Recognition http://arxiv.org/abs/2005.01866v1
Soft Threshold Weight Reparameterization for Learnable Sparsity http://arxiv.org/abs/2002.03231v9
SoftSort: A Continuous Relaxation for the argsort Operator http://arxiv.org/abs/2006.16038v1
Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs http://arxiv.org/abs/2007.01231v2
Solving Constrained CASH Problems with ADMM http://arxiv.org/abs/2006.09635v2
Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity http://arxiv.org/abs/1908.11071v1
Solving General Arithmetic Word Problems http://arxiv.org/abs/1608.01413v2
Solving Physics Puzzles by Reasoning about Paths http://arxiv.org/abs/2011.07357v1
Source Separation with Deep Generative Priors http://arxiv.org/abs/2002.07942v2
Sources of Transfer in Multilingual Named Entity Recognition http://arxiv.org/abs/2005.00847v1
Span Selection Pre-training for Question Answering http://arxiv.org/abs/1909.04120v2
Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles http://arxiv.org/abs/1612.06475v1
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations http://arxiv.org/abs/2005.08866v2
Span-based Localizing Network for Natural Language Video Localization http://arxiv.org/abs/2004.13931v2
Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3) http://arxiv.org/abs/2003.13785v1
SpanBERT: Improving Pre-training by Representing and Predicting Spans http://arxiv.org/abs/1907.10529v3
Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling http://arxiv.org/abs/1612.07130v1
Sparse Gaussian Processes with Spherical Harmonic Features http://arxiv.org/abs/2006.16649v1
Sparse Orthogonal Variational Inference for Gaussian Processes http://arxiv.org/abs/1910.10596v3
Sparse Overcomplete Word Vector Representations http://arxiv.org/abs/1506.02004v1
Sparse Parallel Training of Hierarchical Dirichlet Process Topic Models http://arxiv.org/abs/1906.02416v2
Sparse Sinkhorn Attention http://arxiv.org/abs/2002.11296v1
Sparse Text Generation http://arxiv.org/abs/2004.02644v3
Sparse and Constrained Attention for Neural Machine Translation http://arxiv.org/abs/1805.08241v1
Sparse and Low-rank Tensor Estimation via Cubic Sketchings http://arxiv.org/abs/1801.09326v4
Sparsified Linear Programming for Zero-Sum Equilibrium Finding http://arxiv.org/abs/2006.03451v2
SpatialSim: Recognizing Spatial Configurations of Objects with Graph Neural Networks http://arxiv.org/abs/2004.04546v2
Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback http://arxiv.org/abs/2005.02539v2
Speaker Sensitive Response Evaluation Model http://arxiv.org/abs/2006.07015v1
Speakers Fill Lexical Semantic Gaps with Context http://arxiv.org/abs/2010.02172v2
Specialising Word Vectors for Lexical Entailment http://arxiv.org/abs/1710.06371v2
Spectral Clustering with Graph Neural Networks for Graph Pooling http://arxiv.org/abs/1907.00481v6
Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence http://arxiv.org/abs/2006.01719v4
Spectral Subsampling MCMC for Stationary Time Series http://arxiv.org/abs/1910.13627v2
Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks http://arxiv.org/abs/2002.02561v6
Speech Translation and the End-to-End Promise: Taking Stock of Where We Are http://arxiv.org/abs/2004.06358v1
Speeding Up Neural Machine Translation Decoding by Cube Pruning http://arxiv.org/abs/1809.02992v1
SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check http://arxiv.org/abs/2004.14166v2
Spelling Error Correction with Soft-Masked BERT http://arxiv.org/abs/2005.07421v1
Split and Rephrase http://arxiv.org/abs/1707.06971v1
Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems http://arxiv.org/abs/2010.02140v1
Spying on your neighbors: Fine-grained probing of contextual embeddings for information about surrounding words http://arxiv.org/abs/2005.01810v1
SqueezeBERT: What can computer vision teach NLP about efficient neural networks? http://arxiv.org/abs/2006.11316v1
Stabilizing Bi-Level Hyperparameter Optimization using Moreau-Yosida Regularization http://arxiv.org/abs/2007.13322v1
Stabilizing Differentiable Architecture Search via Perturbation-based Regularization http://arxiv.org/abs/2002.05283v3
Stabilizing Transformers for Reinforcement Learning http://arxiv.org/abs/1910.06764v1
Stack-Pointer Networks for Dependency Parsing http://arxiv.org/abs/1805.01087v1
Stance Prediction and Claim Verification: An Arabic Perspective http://arxiv.org/abs/2005.10410v1
Stance Prediction for Contemporary Issues: Data and Experiments http://arxiv.org/abs/2006.00052v1
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages http://arxiv.org/abs/2003.07082v2
State Space Expectation Propagation: Efficient Inference Schemes for Temporal Gaussian Processes http://arxiv.org/abs/2007.05994v1
Statistical Machine Translation Features with Multitask Tensor Networks http://arxiv.org/abs/1506.00698v1
Statistically Efficient Off-Policy Policy Gradients http://arxiv.org/abs/2002.04014v2
Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization http://arxiv.org/abs/2002.10726v1
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation http://arxiv.org/abs/1905.12255v3
Staying True to Your Word: (How) Can Attention Become Explanation? http://arxiv.org/abs/2005.09379v1
Stepwise Extractive Summarization and Planning with Structured Transformers http://arxiv.org/abs/2010.02744v1
Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning http://arxiv.org/abs/2001.03898v3
Stereo Endoscopic Image Super-Resolution Using Disparity-Constrained Parallel Attention http://arxiv.org/abs/2003.08539v1
Stimulating Creativity with FunLines: A Case Study of Humor Generation in Headlines http://arxiv.org/abs/2002.02031v1
Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization http://arxiv.org/abs/2003.05482v1
Stochastic Differential Equations with Variational Wishart Diffusions http://arxiv.org/abs/2006.14895v1
Stochastic Flows and Geometric Optimization on the Orthogonal Group http://arxiv.org/abs/2003.13563v1
Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization http://arxiv.org/abs/2002.11860v5
Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization http://arxiv.org/abs/2002.07290v2
Stochastic Gradient and Langevin Processes http://arxiv.org/abs/1907.03215v7
Stochastic Hamiltonian Gradient Methods for Smooth Games http://arxiv.org/abs/2007.04202v1
Stochastic Latent Residual Video Prediction http://arxiv.org/abs/2002.09219v4
Stochastic Linear Contextual Bandits with Diverse Contexts http://arxiv.org/abs/2003.02681v1
Stochastic Neural Network with Kronecker Flow http://arxiv.org/abs/1906.04282v2
Stochastic Normalizing Flows http://arxiv.org/abs/2002.09547v2
Stochastic Optimization for Regularized Wasserstein Estimators http://arxiv.org/abs/2002.08695v1
Stochastic Particle-Optimization Sampling and the Non-Asymptotic Convergence Theory http://arxiv.org/abs/1809.01293v5
Stochastic Recursive Variance-Reduced Cubic Regularization Methods http://arxiv.org/abs/1901.11518v2
Stochastic Regret Minimization in Extensive-Form Games http://arxiv.org/abs/2002.08493v1
Stochastic Subspace Cubic Newton Method http://arxiv.org/abs/2002.09526v1
Stochastic Top-k ListNet http://arxiv.org/abs/1511.00271v1
Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation http://arxiv.org/abs/1806.08462v2
Stochastic bandits with arm-dependent delays http://arxiv.org/abs/2006.10459v1
Stochastic-YOLO: Efficient Probabilistic Object Detection under Dataset Shifts http://arxiv.org/abs/2009.02967v2
Stochastically Dominant Distributional Reinforcement Learning http://arxiv.org/abs/1905.07318v4
Stochasticity in Neural ODEs: An Empirical Study http://arxiv.org/abs/2002.09779v2
Stolen Probability: A Structural Weakness of Neural Language Models http://arxiv.org/abs/2005.02433v1
Stopping criterion for active learning based on deterministic generalization bounds http://arxiv.org/abs/2005.07402v1
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance http://arxiv.org/abs/1806.04168v1
Strategic Classification is Causal Modeling in Disguise http://arxiv.org/abs/1910.10362v3
Strategies for Structuring Story Generation http://arxiv.org/abs/1902.01109v2
Strategizing against No-regret Learners http://arxiv.org/abs/1909.13861v1
Streamlining Tensor and Network Pruning in PyTorch http://arxiv.org/abs/2004.13770v1
Strength from Weakness: Fast Learning Using Weak Supervision http://arxiv.org/abs/2002.08483v1
Stretching the Effectiveness of MLE from Accuracy to Bias for Pairwise Comparisons http://arxiv.org/abs/1906.04066v1
Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling http://arxiv.org/abs/1910.02208v4
Strong Baselines for Neural Semi-supervised Learning under Domain Shift http://arxiv.org/abs/1804.09530v1
Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks http://arxiv.org/abs/1712.01969v2
Strong and Simple Baselines for Multimodal Utterance Embeddings http://arxiv.org/abs/1906.02125v2
Stronger and Faster Wasserstein Adversarial Attacks http://arxiv.org/abs/2008.02883v1
StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing http://arxiv.org/abs/1806.07832v1
Structural Language Models of Code http://arxiv.org/abs/1910.00577v4
Structural Neural Encoders for AMR-to-text Generation http://arxiv.org/abs/1903.11410v2
Structural Scaffolds for Citation Intent Classification in Scientific Publications http://arxiv.org/abs/1904.01608v2
Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models http://arxiv.org/abs/2010.05725v1
Structure Adaptive Algorithms for Stochastic Bandits http://arxiv.org/abs/2007.00969v1
Structure Aware Negative Sampling in Knowledge Graphs http://arxiv.org/abs/2009.11355v2
Structure Mapping for Transferability of Causal Models http://arxiv.org/abs/2007.09445v1
Structure-Level Knowledge Distillation For Multilingual Sequence Labeling http://arxiv.org/abs/2004.03846v3
Structured Attention for Unsupervised Dialogue Structure Induction http://arxiv.org/abs/2009.08552v2
Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis http://arxiv.org/abs/2002.11332v1
Structured Minimally Supervised Learning for Neural Relation Extraction http://arxiv.org/abs/1904.00118v5
Structured Multi-Label Biomedical Text Tagging via Attentive Neural Tree Decoding http://arxiv.org/abs/1810.01468v1
Structured Policy Iteration for Linear Quadratic Regulator http://arxiv.org/abs/2007.06202v1
Structured Prediction with Partial Labelling through the Infimum Loss http://arxiv.org/abs/2003.00920v2
Structured Pruning of Large Language Models http://arxiv.org/abs/1910.04732v1
Structured Training for Neural Network Transition-Based Parsing http://arxiv.org/abs/1506.06158v1
Structured Tuning for Semantic Role Labeling http://arxiv.org/abs/2005.00496v2
Student-Teacher Curriculum Learning via Reinforcement Learning: Predicting Hospital Inpatient Admission Location http://arxiv.org/abs/2007.01135v1
Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages http://arxiv.org/abs/1903.06400v2
Style Transfer Through Back-Translation http://arxiv.org/abs/1804.09000v3
Sub-Instruction Aware Vision-and-Language Navigation http://arxiv.org/abs/2004.02707v2
Subgoal Discovery for Hierarchical Dialogue Policy Learning http://arxiv.org/abs/1804.07855v3
SubjQA: A Dataset for Subjectivity and Review Comprehension http://arxiv.org/abs/2004.14283v3
Sublinear Optimal Policy Value Estimation in Contextual Bandits http://arxiv.org/abs/1912.06111v2
Substance over Style: Document-Level Targeted Content Transfer http://arxiv.org/abs/2010.08618v1
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates http://arxiv.org/abs/1804.10959v1
Subword-Level Language Identification for Intra-Word Code-Switching http://arxiv.org/abs/1904.01989v1
Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture http://arxiv.org/abs/2005.03454v2
Summarizing Opinions: Aspect Extraction Meets Sentiment Prediction and They Are Both Weakly Supervised http://arxiv.org/abs/1808.08858v1
Summarizing Text on Any Aspects: A Knowledge-Informed Weakly-Supervised Approach http://arxiv.org/abs/2010.06792v2
Super-efficiency of automatic differentiation for functions defined as a minimum http://arxiv.org/abs/2002.03722v1
Supermasks in Superposition http://arxiv.org/abs/2006.14769v3
Supertagging Combinatory Categorial Grammar with Attentive Graph Convolutional Networks http://arxiv.org/abs/2010.06115v2
Supervised Attentions for Neural Machine Translation http://arxiv.org/abs/1608.00112v1
Supervised Domain Enablement Attention for Personalized Domain Classification http://arxiv.org/abs/1812.07546v1
Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi http://arxiv.org/abs/2004.10353v2
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data http://arxiv.org/abs/1705.02364v5
Supervised Learning: No Loss No Cry http://arxiv.org/abs/2002.03555v1
Supervised Seeded Iterated Learning for Interactive Language Learning http://arxiv.org/abs/2010.02975v1
Support recovery and sup-norm convergence rates for sparse pivotal estimation http://arxiv.org/abs/2001.05401v3
Surrogate sea ice model enables efficient tuning http://arxiv.org/abs/2006.12977v1
SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation http://arxiv.org/abs/1808.07512v2
Symbolic Network: Generalized Neural Policies for Relational MDPs http://arxiv.org/abs/2002.07375v2
Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation http://arxiv.org/abs/2004.08694v3
SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and Synonym Discovery http://arxiv.org/abs/2009.13827v1
Synchronous Bidirectional Neural Machine Translation http://arxiv.org/abs/1905.04847v1
Syntactic Data Augmentation Increases Robustness to Inference Heuristics http://arxiv.org/abs/2004.11999v1
Syntactic Scaffolds for Semantic Structures http://arxiv.org/abs/1808.10485v1
Syntactic Search by Example http://arxiv.org/abs/2006.03010v1
Syntactic Structure Distillation Pretraining For Bidirectional Encoders http://arxiv.org/abs/2005.13482v1
Syntax-Enhanced Neural Machine Translation with Syntax-Aware Word Representations http://arxiv.org/abs/1905.02878v1
Syntax-guided Controlled Generation of Paraphrases http://arxiv.org/abs/2005.08417v1
T-Basis: a Compact Representation for Neural Networks http://arxiv.org/abs/2007.06631v1
T-GD: Transferable GAN-generated Images Detection Framework http://arxiv.org/abs/2008.04115v1
T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack http://arxiv.org/abs/1912.10375v2
TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task http://arxiv.org/abs/2004.14855v1
TAG : Type Auxiliary Guiding for Code Comment Generation http://arxiv.org/abs/2005.02835v1
TAPAS: Weakly Supervised Table Parsing via Pre-training http://arxiv.org/abs/2004.02349v2
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue http://arxiv.org/abs/2004.06871v3
TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions http://arxiv.org/abs/2005.00242v2
TUDataset: A collection of benchmark datasets for learning with graphs http://arxiv.org/abs/2007.08663v1
TUNIZI: a Tunisian Arabizi sentiment analysis Dataset http://arxiv.org/abs/2004.14303v1
TVQA+: Spatio-Temporal Grounding for Video Question Answering http://arxiv.org/abs/1904.11574v2
TVQA: Localized, Compositional Video Question Answering http://arxiv.org/abs/1809.01696v2
TWEETQA: A Social Media Focused Question Answering Dataset http://arxiv.org/abs/1907.06292v1
TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories http://arxiv.org/abs/2004.13852v2
TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data http://arxiv.org/abs/2005.08314v1
Tabula nearly rasa: Probing the Linguistic Knowledge of Character-Level Neural Language Models Trained on Unsegmented Text http://arxiv.org/abs/1906.07285v1
Tackling the Low-resource Challenge for Canonical Segmentation http://arxiv.org/abs/2010.02804v1
Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time http://arxiv.org/abs/2009.10623v2
Tails of Lipschitz Triangular Flows http://arxiv.org/abs/1907.04481v3
Taking a hint: How to leverage loss predictors in contextual bandits? http://arxiv.org/abs/2003.01922v2
Talk to Papers: Bringing Neural Question Answering to Academic Search http://arxiv.org/abs/2004.02002v3
Talking to the crowd: What do people react to in online discussions? http://arxiv.org/abs/1507.02205v2
Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics http://arxiv.org/abs/2006.06264v2
Target Conditioned Sampling: Optimizing Data Selection for Multilingual Neural Machine Translation http://arxiv.org/abs/1905.08212v1
Target-Guided Open-Domain Conversation http://arxiv.org/abs/1905.11553v2
Targeted Syntactic Evaluation of Language Models http://arxiv.org/abs/1808.09031v1
Task-Oriented Dialogue as Dataflow Synthesis http://arxiv.org/abs/2009.11423v2
Task-Oriented Query Reformulation with Reinforcement Learning http://arxiv.org/abs/1704.04572v4
TaskNorm: Rethinking Batch Normalization for Meta-Learning http://arxiv.org/abs/2003.03284v2
Tasty Burgers, Soggy Fries: Probing Aspect Robustness in Aspect-Based Sentiment Analysis http://arxiv.org/abs/2009.07964v4
TaxiNLI: Taking a Ride up the NLU Hill http://arxiv.org/abs/2009.14505v3
Taxonomy of Dual Block-Coordinate Ascent Methods for Discrete Energy Minimization http://arxiv.org/abs/2004.07715v1
Taylor Expansion Policy Optimization http://arxiv.org/abs/2003.06259v1
TeMP: Temporal Message Passing for Temporal Knowledge Graph Completion http://arxiv.org/abs/2010.03526v1
TeaForN: Teacher-Forcing with N-grams http://arxiv.org/abs/2010.03494v2
Teacher-Student Domain Adaptation for Biosensor Models http://arxiv.org/abs/2003.07896v2
Teacher-Student chain for efficient semi-supervised histology image classification http://arxiv.org/abs/2003.08797v2
Technology Readiness Levels for Machine Learning Systems http://arxiv.org/abs/2101.03989v1
Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space http://arxiv.org/abs/2010.01475v1
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions http://arxiv.org/abs/1801.09041v1
Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering http://arxiv.org/abs/2004.11892v1
Temporal Common Sense Acquisition with Minimal Supervision http://arxiv.org/abs/2005.04304v1
Temporal Information Extraction by Predicting Relative Time-lines http://arxiv.org/abs/1808.09401v1
Temporal Mental Health Dynamics on Social Media http://arxiv.org/abs/2008.13121v3
Temporal Phenotyping using Deep Predictive Clustering of Disease Progression http://arxiv.org/abs/2006.08600v1
Temporally-Continuous Probabilistic Prediction using Polynomial Trajectory Parameterization http://arxiv.org/abs/2011.00399v1
TenIPS: Inverse Propensity Sampling for Tensor Completion http://arxiv.org/abs/2101.00323v1
Tensor Fusion Network for Multimodal Sentiment Analysis http://arxiv.org/abs/1707.07250v1
Tensor denoising and completion based on ordinal observations http://arxiv.org/abs/2002.06524v3
Tensors over Semirings for Latent-Variable Weighted Logic Programs http://arxiv.org/abs/2006.04232v1
TernaryBERT: Distillation-aware Ultra-low Bit BERT http://arxiv.org/abs/2009.12812v3
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts http://arxiv.org/abs/1909.13231v3
Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference http://arxiv.org/abs/1904.09745v2
Text Classification Using Label Names Only: A Language Model Self-Training Approach http://arxiv.org/abs/2010.07245v1
Text Classification with Few Examples using Controlled Generalization http://arxiv.org/abs/2005.08469v1
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems http://arxiv.org/abs/1903.11508v2
Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates http://arxiv.org/abs/2005.00649v1
Text to 3D Scene Generation with Rich Lexical Grounding http://arxiv.org/abs/1505.06289v2
Text-Based Ideal Points http://arxiv.org/abs/2005.04232v2
TextAttack: Lessons learned in designing Python frameworks for NLP http://arxiv.org/abs/2010.01724v1
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing http://arxiv.org/abs/2002.12620v2
TextHide: Tackling Data Privacy in Language Understanding Tasks http://arxiv.org/abs/2010.06053v1
That is a Known Lie: Detecting Previously Fact-Checked Claims http://arxiv.org/abs/2005.06058v1
The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers http://arxiv.org/abs/1905.12676v2
The ADAPT Enhanced Dependency Parser at the IWPT 2020 Shared Task http://arxiv.org/abs/2009.01712v1
The Area of the Convex Hull of Sampled Curves: a Robust Functional Statistical Depth Measure http://arxiv.org/abs/1910.04085v2
The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants http://arxiv.org/abs/1708.01425v4
The Boomerang Sampler http://arxiv.org/abs/2006.13777v2
The Cascade Transformer: an Application for Efficient Answer Sentence Selection http://arxiv.org/abs/2005.02534v2
The Complexity of Finding Stationary Points with Stochastic Gradient Descent http://arxiv.org/abs/1910.01845v2
The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions http://arxiv.org/abs/2004.13606v2
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents http://arxiv.org/abs/1911.03768v2
The EOS Decision and Length Extrapolation http://arxiv.org/abs/2010.07174v1
The Effect of Natural Distribution Shift on Question Answering Models http://arxiv.org/abs/2004.14444v1
The Expressive Power of a Class of Normalizing Flow Models http://arxiv.org/abs/2006.00392v1
The FAST Algorithm for Submodular Maximization http://arxiv.org/abs/1907.06173v1
The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions http://arxiv.org/abs/2003.03830v2
The Galactic Dependencies Treebanks: Getting More Data by Synthesizing New Languages http://arxiv.org/abs/1710.03838v1
The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits http://arxiv.org/abs/2001.05452v3
The Grammar of Emergent Languages http://arxiv.org/abs/2010.02069v2
The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent http://arxiv.org/abs/1904.06963v5
The Implicit Regularization of Ordinary Least Squares Ensembles http://arxiv.org/abs/1910.04743v2
The Implicit Regularization of Stochastic Gradient Flow for Least Squares http://arxiv.org/abs/2003.07802v2
The Implicit and Explicit Regularization Effects of Dropout http://arxiv.org/abs/2002.12915v3
The Importance of Being Recurrent for Modeling Hierarchical Structure http://arxiv.org/abs/1803.03585v2
The Importance of Category Labels in Grammar Induction with Child-directed Utterances http://arxiv.org/abs/2006.11646v1
The Influence of Shape Constraints on the Thresholding Bandit Problem http://arxiv.org/abs/2006.10006v2
The Interplay between Lexical Resources and Natural Language Processing http://arxiv.org/abs/1807.00571v1
The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation http://arxiv.org/abs/1906.01528v2
The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task http://arxiv.org/abs/2010.13192v1
The Language of Legal and Illegal Activity on the Darknet http://arxiv.org/abs/1905.05543v2
The Lipschitz Constant of Self-Attention http://arxiv.org/abs/2006.04710v1
The Lower The Simpler: Simplifying Hierarchical Recurrent Models http://arxiv.org/abs/1809.02790v4
The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning http://arxiv.org/abs/1808.00023v2
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding http://arxiv.org/abs/2002.07972v2
The Multilingual Amazon Reviews Corpus http://arxiv.org/abs/2010.02573v1
The NarrativeQA Reading Comprehension Challenge http://arxiv.org/abs/1712.07040v1
The NetHack Learning Environment http://arxiv.org/abs/2006.13760v2
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization http://arxiv.org/abs/2008.06786v1
The Non-IID Data Quagmire of Decentralized Machine Learning http://arxiv.org/abs/1910.00189v2
The Paradigm Discovery Problem http://arxiv.org/abs/2005.01630v1
The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue http://arxiv.org/abs/1906.01530v2
The Power Spherical distribution http://arxiv.org/abs/2006.04437v2
The Power of Batching in Multiple Hypothesis Testing http://arxiv.org/abs/1910.04968v2
The Referential Reader: A Recurrent Entity Network for Anaphora Resolution http://arxiv.org/abs/1902.01541v2
The Return of Lexical Dependencies: Neural Lexicalized PCFGs http://arxiv.org/abs/2007.15135v1
The Right Tool for the Job: Matching Model and Instance Complexities http://arxiv.org/abs/2004.07453v2
The SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion http://arxiv.org/abs/2005.13756v1
The SOFC-Exp Corpus and Neural Approaches to Information Extraction in the Materials Science Domain http://arxiv.org/abs/2006.03039v1
The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures http://arxiv.org/abs/2001.11136v2
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations http://arxiv.org/abs/2005.01348v2
The State and Fate of Linguistic Diversity and Inclusion in the NLP World http://arxiv.org/abs/2004.09095v2
The Sylvester Graphical Lasso (SyGlasso) http://arxiv.org/abs/2002.00288v1
The TechQA Dataset http://arxiv.org/abs/1911.02984v1
The Tree Ensemble Layer: Differentiability meets Conditional Computation http://arxiv.org/abs/2002.07772v2
The True Sample Complexity of Identifying Good Arms http://arxiv.org/abs/1906.06594v1
The Unreasonable Volatility of Neural Machine Translation Models http://arxiv.org/abs/2005.12398v1
The Unstoppable Rise of Computational Linguistics in Deep Learning http://arxiv.org/abs/2005.06420v3
The Usual Suspects? Reassessing Blame for VAE Posterior Collapse http://arxiv.org/abs/1912.10702v1
The Volctrans Machine Translation System for WMT20 http://arxiv.org/abs/2010.14806v2
The Web as a Knowledge-base for Answering Complex Questions http://arxiv.org/abs/1803.06643v1
The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection http://arxiv.org/abs/2004.02421v4
The continuous categorical: a novel simplex-valued exponential family http://arxiv.org/abs/2002.08563v2
The cost-free nature of optimally tuning Tikhonov regularizers and other ordered smoothers http://arxiv.org/abs/1905.12517v1
The elephant in the interpretability room: Why use attention as explanation when we have saliency methods? http://arxiv.org/abs/2010.05607v1
The emergence of number and syntax units in LSTM language models http://arxiv.org/abs/1903.07435v2
The equivalence between Stein variational gradient descent and black-box variational inference http://arxiv.org/abs/2004.01822v1
The importance of fillers for text representations of speech transcripts http://arxiv.org/abs/2009.11340v2
The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks http://arxiv.org/abs/2002.02655v2
The many Shapley values for model explanation http://arxiv.org/abs/1908.08474v2
The perceptual boost of visual attention is task-dependent in naturalistic settings http://arxiv.org/abs/2003.00882v2
The role of context in neural pitch accent detection in English http://arxiv.org/abs/2004.14846v2
The role of regularization in classification of high-dimensional noisy Gaussian mixture http://arxiv.org/abs/2002.11544v1
The unreasonable effectiveness of Batch-Norm statistics in addressing catastrophic forgetting across medical institutions http://arxiv.org/abs/2011.08096v1
Theoretical Limitations of Self-Attention in Neural Sequence Models http://arxiv.org/abs/1906.06755v2
Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning http://arxiv.org/abs/2009.07445v1
Thermodynamic Consistent Neural Networks for Learning Material Interfacial Mechanics http://arxiv.org/abs/2011.14172v1
Thompson Sampling Algorithms for Mean-Variance Bandits http://arxiv.org/abs/2002.00232v3
Thompson Sampling for Linearly Constrained Bandits http://arxiv.org/abs/2004.09258v2
Thompson Sampling via Local Uncertainty http://arxiv.org/abs/1910.13673v3
Thresholding Bandit Problem with Both Duels and Pulls http://arxiv.org/abs/1910.06368v2
Thresholding Graph Bandits with GrAPL http://arxiv.org/abs/1905.09190v3
Tied Multitask Learning for Neural Speech Translation http://arxiv.org/abs/1802.06655v2
Tight Differential Privacy for Discrete-Valued Mechanisms and for the Subsampled Gaussian Mechanism Using FFT http://arxiv.org/abs/2006.07134v2
Tight Lower Bounds for Combinatorial Multi-Armed Bandits http://arxiv.org/abs/2002.05392v3
Tightening Exploration in Upper Confidence Reinforcement Learning http://arxiv.org/abs/2004.09656v2
Tigrinya Neural Machine Translation with Transfer Learning for Humanitarian Response http://arxiv.org/abs/2003.11523v1
Tilde at WMT 2020: News Task Systems http://arxiv.org/abs/2010.15423v1
Time Adaptive Reinforcement Learning http://arxiv.org/abs/2004.08600v1
Time Dependence in Non-Autonomous Neural ODEs http://arxiv.org/abs/2005.01906v2
Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders http://arxiv.org/abs/1902.00450v4
Time Series Source Separation with Slow Flows http://arxiv.org/abs/2007.10182v1
Time-aware Large Kernel Convolutions http://arxiv.org/abs/2002.03184v2
Tiny Video Networks http://arxiv.org/abs/1910.06961v1
To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging http://arxiv.org/abs/2010.14042v1
To Schedule or not to Schedule: Extracting Task Specific Temporal Entities and Associated Negation Constraints http://arxiv.org/abs/2012.02594v1
To Test Machine Comprehension, Start by Defining Comprehension http://arxiv.org/abs/2005.01525v2
ToTTo: A Controlled Table-To-Text Generation Dataset http://arxiv.org/abs/2004.14373v3
Token-level and sequence-level loss smoothing for RNN language models http://arxiv.org/abs/1805.05062v1
Top-Rank-Focused Adaptive Vote Collection for the Evaluation of Domain-Specific Semantic Models http://arxiv.org/abs/2010.04486v1
Topic Memory Networks for Short Text Classification http://arxiv.org/abs/1809.03664v1
Topic Modeling in Embedding Spaces http://arxiv.org/abs/1907.04907v1
Topic Modeling via Full Dependence Mixtures http://arxiv.org/abs/1906.06181v3
Topic Sensitive Attention on Generic Corpora Corrects Sense Bias in Pretrained Embeddings http://arxiv.org/abs/1906.02688v2
Topically Driven Neural Language Model http://arxiv.org/abs/1704.08012v2
Topological Autoencoders http://arxiv.org/abs/1906.00722v4
Topological Sort for Sentence Ordering http://arxiv.org/abs/2005.00432v1
Topologically Densified Distributions http://arxiv.org/abs/2002.04805v1
Torch-Struct: Deep Structured Prediction Library http://arxiv.org/abs/2002.00876v1
Toward A Neuro-inspired Creative Decoder http://arxiv.org/abs/1902.02399v4
Toward Better Storylines with Sentence-Level Language Models http://arxiv.org/abs/2005.05255v1
Toward Fast and Accurate Neural Discourse Segmentation http://arxiv.org/abs/1808.09147v1
Toward Gender-Inclusive Coreference Resolution http://arxiv.org/abs/1910.13913v4
Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments http://arxiv.org/abs/2010.04900v2
Towards A Sign Language Gloss Representation Of Modern Standard Arabic http://arxiv.org/abs/2005.01497v1
Towards Accurate and Reliable Energy Measurement of NLP Models http://arxiv.org/abs/2010.05248v1
Towards Content Transfer through Grounded Text Generation http://arxiv.org/abs/1905.05293v1
Towards Conversational Recommendation over Multi-Type Dialogs http://arxiv.org/abs/2005.03954v3
Towards Debiasing NLU Models from Unknown Biases http://arxiv.org/abs/2009.12303v4
Towards Debiasing Sentence Representations http://arxiv.org/abs/2007.08100v1
Towards Dynamic Computation Graphs via Sparse Latent Structure http://arxiv.org/abs/1809.00653v1
Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning http://arxiv.org/abs/2009.13891v3
Towards End-to-End In-Image Neural Machine Translation http://arxiv.org/abs/2010.10648v1
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access http://arxiv.org/abs/1609.00777v3
Towards Explainable Graph Representations in Digital Pathology http://arxiv.org/abs/2007.00311v1
Towards Exploiting Background Knowledge for Building Conversation Systems http://arxiv.org/abs/1809.08205v1
Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints http://arxiv.org/abs/2005.00969v1
Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness? http://arxiv.org/abs/2004.03685v3
Towards Induction of Structured Phoneme Inventories http://arxiv.org/abs/2010.05959v1
Towards Interpretable Reasoning over Paragraph Effects in Situation http://arxiv.org/abs/2010.01272v1
Towards Interpreting BERT for Reading Comprehension Based QA http://arxiv.org/abs/2010.08983v1
Towards Map-Based Validation of Semantic Segmentation Masks http://arxiv.org/abs/2011.08008v2
Towards Multimodal Simultaneous Neural Machine Translation http://arxiv.org/abs/2004.03180v2
Towards Near-imperceptible Steganographic Text http://arxiv.org/abs/1907.06679v2
Towards Neural Machine Translation for Edoid Languages http://arxiv.org/abs/2003.10704v1
Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation http://arxiv.org/abs/2005.11355v1
Towards Persona-Based Empathetic Conversational Models http://arxiv.org/abs/2004.12316v7
Towards Physics-informed Deep Learning for Turbulent Flow Prediction http://arxiv.org/abs/1911.08655v4
Towards Reasonably-Sized Character-Level Transformer NMT by Finetuning Subword Systems http://arxiv.org/abs/2004.14280v2
Towards Robustifying NLI Models Against Lexical Dataset Biases http://arxiv.org/abs/2005.04732v2
Towards String-to-Tree Neural Machine Translation http://arxiv.org/abs/1704.04743v3
Towards Supervised and Unsupervised Neural Machine Translation Baselines for Nigerian Pidgin http://arxiv.org/abs/2003.12660v1
Towards Transparent and Explainable Attention Models http://arxiv.org/abs/2004.14243v1
Towards Understanding Gender Bias in Relation Extraction http://arxiv.org/abs/1911.03642v3
Towards Understanding the Dynamics of the First-Order Adversaries http://arxiv.org/abs/2010.10650v1
Towards Understanding the Regularization of Adversarial Robustness on Neural Networks http://arxiv.org/abs/2011.07478v1
Towards Universal Dialogue State Tracking http://arxiv.org/abs/1810.09587v1
Towards Unsupervised Language Understanding and Generation by Joint Dual Learning http://arxiv.org/abs/2004.14710v1
Towards a General Theory of Infinite-Width Limits of Neural Classifiers http://arxiv.org/abs/2003.05884v3
Towards a predictive spatio-temporal representation of brain data http://arxiv.org/abs/2003.03290v1
Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses http://arxiv.org/abs/1708.07149v2
Towards classification parity across cohorts http://arxiv.org/abs/2005.08033v1
Towards intervention-centric causal reasoning in learning agents http://arxiv.org/abs/2005.12968v1
Toxicity Detection: Does Context Really Matter? http://arxiv.org/abs/2006.00998v1
Train No Evil: Selective Masking for Task-Guided Pre-Training http://arxiv.org/abs/2004.09733v2
Trainable Greedy Decoding for Neural Machine Translation http://arxiv.org/abs/1702.02429v1
Training Binary Neural Networks through Learning with Noisy Supervision http://arxiv.org/abs/2010.04871v1
Training Binary Neural Networks using the Bayesian Learning Rule http://arxiv.org/abs/2002.10778v4
Training Classifiers with Natural Language Explanations http://arxiv.org/abs/1805.03818v4
Training Deep Energy-Based Models with f-Divergence Minimization http://arxiv.org/abs/2003.03463v2
Training Linear Neural Networks: Non-Local Convergence and Complexity Results http://arxiv.org/abs/2002.09852v3
Training Millions of Personalized Dialogue Agents http://arxiv.org/abs/1809.01984v1
Training Neural Networks for and by Interpolation http://arxiv.org/abs/1906.05661v2
Training Production Language Models without Memorizing User Data http://arxiv.org/abs/2009.10031v1
Training Question Answering Models From Synthetic Data http://arxiv.org/abs/2002.09599v1
Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration http://arxiv.org/abs/1906.10114v2
TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics http://arxiv.org/abs/2002.04461v2
TransQuest at WMT2020: Sentence-Level Direct Assessment http://arxiv.org/abs/2010.05318v1
Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages http://arxiv.org/abs/2010.03179v1
Transfer Learning of Photometric Phenotypes in Agriculture Using Metadata http://arxiv.org/abs/2004.00303v1
Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources http://arxiv.org/abs/2007.08714v2
Transfer NAS: Knowledge Transfer between Search Spaces with Transformer Agents http://arxiv.org/abs/1906.08102v1
Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems http://arxiv.org/abs/1905.08743v2
Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya http://arxiv.org/abs/2006.07698v2
Transform the Set: Memory Attentive Generation of Guided and Unguided Image Collages http://arxiv.org/abs/1910.07236v2
Transformation Importance with Applications to Cosmology http://arxiv.org/abs/2003.01926v1
Transformation Networks for Target-Oriented Sentiment Classification http://arxiv.org/abs/1805.01086v1
Transformer Based Multi-Source Domain Adaptation http://arxiv.org/abs/2009.07806v1
Transformer Hawkes Process http://arxiv.org/abs/2002.09291v4
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context http://arxiv.org/abs/1901.02860v3
Transformer-based Context-aware Sarcasm Detection in Conversation Threads from Social Media http://arxiv.org/abs/2005.11424v1
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention http://arxiv.org/abs/2006.16236v3
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering http://arxiv.org/abs/2004.03561v2
Transformers without Tears: Improving the Normalization of Self-Attention http://arxiv.org/abs/1910.05895v2
Transforming Complex Sentences into a Semantic Hierarchy http://arxiv.org/abs/1906.01038v1
Transition-Based Dependency Parsing with Stack Long Short-Term Memory http://arxiv.org/abs/1505.08075v1
Transition-based Semantic Dependency Parsing with Pointer Networks http://arxiv.org/abs/2005.13344v2
Translating Natural Language Instructions for Behavioral Robot Navigation with a Multi-Head Attention Mechanism http://arxiv.org/abs/2006.00697v3
Translating Neuralese http://arxiv.org/abs/1704.06960v5
Translating Similar Languages: Role of Mutual Intelligibility in Multilingual Transformers http://arxiv.org/abs/2011.05037v1
Translation Artifacts in Cross-lingual Transfer Learning http://arxiv.org/abs/2004.04721v4
Translationese as a Language in "Multilingual" NMT http://arxiv.org/abs/1911.03823v2
Traversing Knowledge Graphs in Vector Space http://arxiv.org/abs/1506.01094v2
Tree-Projected Gradient Descent for Estimating Gradient-Sparse Parameters on Graphs http://arxiv.org/abs/2006.01662v1
Treebank Embedding Vectors for Out-of-domain Dependency Parsing http://arxiv.org/abs/2005.00800v1
Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time http://arxiv.org/abs/2005.10865v1
Triangular Architecture for Rare Language Translation http://arxiv.org/abs/1805.04813v2
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition http://arxiv.org/abs/2004.07493v4
Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning http://arxiv.org/abs/2004.03168v1
Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems http://arxiv.org/abs/2002.09611v2
Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data http://arxiv.org/abs/1909.10158v2
Two Routes to Scalable Credit Assignment without Weight Symmetry http://arxiv.org/abs/2003.01513v2
Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders http://arxiv.org/abs/2010.03851v1
Two-sample Testing Using Deep Learning http://arxiv.org/abs/1910.06239v2
TwoWingOS: A Two-Wing Optimization Strategy for Evidential Claim Verification http://arxiv.org/abs/1808.03465v2
TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages http://arxiv.org/abs/2003.05002v1
Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias http://arxiv.org/abs/2009.11982v2
UDapter: Language Adaptation for Truly Universal Dependency Parsing http://arxiv.org/abs/2004.14327v2
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation http://arxiv.org/abs/2009.07602v1
USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation http://arxiv.org/abs/2005.00456v1
Ultra-Fine Entity Typing http://arxiv.org/abs/1807.04905v1
Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels http://arxiv.org/abs/2007.02235v3
Uncertain Natural Language Inference http://arxiv.org/abs/1909.03042v2
Uncertainty Estimation Using a Single Deep Deterministic Neural Network http://arxiv.org/abs/2003.02037v2
Uncertainty Estimation in Cancer Survival Prediction http://arxiv.org/abs/2003.08573v2
Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery http://arxiv.org/abs/2003.01753v1
Uncertainty Quantification for Sparse Deep Learning http://arxiv.org/abs/2002.11815v2
Uncertainty in Neural Networks: Approximately Bayesian Ensembling http://arxiv.org/abs/1810.05546v5
Uncertainty in Neural Relational Inference Trajectory Reconstruction http://arxiv.org/abs/2006.13666v2
Uncertainty quantification using martingales for misspecified Gaussian processes http://arxiv.org/abs/2006.07368v1
Uncertainty-Aware Label Refinement for Sequence Labeling http://arxiv.org/abs/2012.10608v1
Uncertainty-Aware Semantic Augmentation for Neural Machine Translation http://arxiv.org/abs/2010.04411v1
Uncertainty-Aware Vehicle Orientation Estimation for Joint Detection-Prediction Models http://arxiv.org/abs/2011.03114v1
Uncovering the Folding Landscape of RNA Secondary Structure with Deep Graph Embeddings http://arxiv.org/abs/2006.06885v2
Understanding Climate Impacts on Vegetation with Gaussian Processes in Granger Causality http://arxiv.org/abs/2012.03338v1
Understanding Dataset Design Choices for Multi-hop Reasoning http://arxiv.org/abs/1904.12106v1
Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study http://arxiv.org/abs/1702.04811v3
Understanding Generalization in Deep Learning via Tensor Methods http://arxiv.org/abs/2001.05070v2
Understanding Learned Reward Functions http://arxiv.org/abs/2012.05862v1
Understanding Neural Abstractive Summarization Models via Uncertainty http://arxiv.org/abs/2010.07882v1
Understanding Points of Correspondence between Sentences for Abstractive Summarization http://arxiv.org/abs/2006.05621v1
Understanding Self-Attention of Self-Supervised Audio Transformers http://arxiv.org/abs/2006.03265v2
Understanding Self-Training for Gradual Domain Adaptation http://arxiv.org/abs/2002.11361v1
Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection http://arxiv.org/abs/1704.05753v2
Understanding Undesirable Word Embedding Associations http://arxiv.org/abs/1908.06361v1
Understanding Unintended Memorization in Federated Learning http://arxiv.org/abs/2006.07490v1
Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View http://arxiv.org/abs/1906.02762v1
Understanding and Mitigating the Tradeoff Between Robustness and Accuracy http://arxiv.org/abs/2002.10716v2
Understanding language-elicited EEG data by predicting it from a fine-tuned language model http://arxiv.org/abs/1904.01548v1
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling http://arxiv.org/abs/1910.06508v2
Understanding the Difficulty of Training Transformers http://arxiv.org/abs/2004.08249v2
Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle http://arxiv.org/abs/2007.03509v1
Understanding the Intrinsic Robustness of Image Distributions using Conditional Generative Models http://arxiv.org/abs/2003.00378v1
Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning http://arxiv.org/abs/2010.02357v1
Understanding the robustness of deep neural network classifiers for breast cancer screening http://arxiv.org/abs/2003.10041v1
Undirected Graphical Models as Approximate Posteriors http://arxiv.org/abs/1901.03440v2
Unfolding and Shrinking Neural Machine Translation Ensembles http://arxiv.org/abs/1704.03279v2
UniConv: A Unified Conversational Neural Architecture for Multi-domain Task-oriented Dialogues http://arxiv.org/abs/2004.14307v2
Unified Pragmatic Models for Generating and Following Instructions http://arxiv.org/abs/1711.04987v3
Unifying Human and Statistical Evaluation for Natural Language Generation http://arxiv.org/abs/1904.02792v1
Universal Approximation Property of Neural Ordinary Differential Equations http://arxiv.org/abs/2012.02414v1
Universal Approximation with Deep Narrow Networks http://arxiv.org/abs/1905.08539v2
Universal Average-Case Optimality of Polyak Momentum http://arxiv.org/abs/2002.04664v3
Universal Decompositional Semantic Parsing http://arxiv.org/abs/1910.10138v3
Universal Equivariant Multilayer Perceptrons http://arxiv.org/abs/2002.02912v2
Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start http://arxiv.org/abs/2010.02584v1
Universal Neural Machine Translation for Extremely Low Resource Languages http://arxiv.org/abs/1802.05368v2
Universal Semantic Parsing http://arxiv.org/abs/1702.03196v4
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift http://arxiv.org/abs/2006.14988v1
Unlocking the Potential of Deep Counterfactual Value Networks http://arxiv.org/abs/2007.10442v1
Unnatural Language Processing: Bridging the Gap Between Synthetic and Natural Language Data http://arxiv.org/abs/2004.13645v1
Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach http://arxiv.org/abs/1805.05181v2
Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks http://arxiv.org/abs/2002.06753v3
Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering http://arxiv.org/abs/2005.01218v1
Unsupervised Commonsense Question Answering with Self-Talk http://arxiv.org/abs/2004.05483v2
Unsupervised Cross-lingual Transfer of Word Embedding Spaces http://arxiv.org/abs/1809.03633v1
Unsupervised Discovery of Implicit Gender Bias http://arxiv.org/abs/2004.08361v2
Unsupervised Discovery of Interpretable Directions in the GAN Latent Space http://arxiv.org/abs/2002.03754v3
Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation http://arxiv.org/abs/1804.08069v1
Unsupervised Domain Adaptation for Visual Navigation http://arxiv.org/abs/2010.14543v2
Unsupervised Domain Clusters in Pretrained Language Models http://arxiv.org/abs/2004.02105v2
Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing http://arxiv.org/abs/2005.13485v3
Unsupervised Grammar Induction with Depth-bounded PCFG http://arxiv.org/abs/1802.08545v2
Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces http://arxiv.org/abs/1911.02536v2
Unsupervised Identification of Translationese http://arxiv.org/abs/1609.03205v1
Unsupervised Induction of Semantic Roles within a Reconstruction-Error Minimization Framework http://arxiv.org/abs/1412.2812v1
Unsupervised Learning of Morphological Forests http://arxiv.org/abs/1702.07015v1
Unsupervised Learning of Syntactic Structure with Invertible Neural Projections http://arxiv.org/abs/1808.09111v1
Unsupervised Morphological Paradigm Completion http://arxiv.org/abs/2005.00970v2
Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting http://arxiv.org/abs/2005.03119v1
Unsupervised Natural Language Inference via Decoupled Multimodal Contrastive Learning http://arxiv.org/abs/2010.08200v1
Unsupervised Neural Machine Translation with Weight Sharing http://arxiv.org/abs/1804.09057v1
Unsupervised Online Grounding of Natural Language during Human-Robot Interactions http://arxiv.org/abs/2007.04304v1
Unsupervised Opinion Summarization as Copycat-Review Generation http://arxiv.org/abs/1911.02247v2
Unsupervised Opinion Summarization with Noising and Denoising http://arxiv.org/abs/2004.10150v1
Unsupervised Paraphrasing by Simulated Annealing http://arxiv.org/abs/1909.03588v2
Unsupervised Parsing via Constituency Tests http://arxiv.org/abs/2010.03146v1
Unsupervised Pidgin Text Generation By Pivoting English Data and Self-Training http://arxiv.org/abs/2003.08272v1
Unsupervised Pivot Translation for Distant Languages http://arxiv.org/abs/1906.02461v3
Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction http://arxiv.org/abs/2001.10603v2
Unsupervised Quality Estimation for Neural Machine Translation http://arxiv.org/abs/2005.10608v2
Unsupervised Question Answering by Cloze Translation http://arxiv.org/abs/1906.04980v2
Unsupervised Question Decomposition for Question Answering http://arxiv.org/abs/2002.09758v3
Unsupervised Recurrent Neural Network Grammars http://arxiv.org/abs/1904.03746v6
Unsupervised Reference-Free Summary Quality Evaluation via Contrastive Learning http://arxiv.org/abs/2010.01781v1
Unsupervised Speech Decomposition via Triple Information Bottleneck http://arxiv.org/abs/2004.11284v5
Unsupervised Statistical Machine Translation http://arxiv.org/abs/1809.01272v1
Unsupervised Text Style Transfer with Padded Masked Language Models http://arxiv.org/abs/2010.01054v1
Unsupervised Transfer Learning for Spatiotemporal Predictive Networks http://arxiv.org/abs/2009.11763v1
Unsupervised deep clustering for predictive texture pattern discovery in medical images http://arxiv.org/abs/2002.03721v1
Up or Down? Adaptive Rounding for Post-Training Quantization http://arxiv.org/abs/2004.10568v2
Urban Driving with Conditional Imitation Learning http://arxiv.org/abs/1912.00177v2
Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection http://arxiv.org/abs/1906.06703v1
Using Context in Neural Machine Translation Training Objectives http://arxiv.org/abs/2005.01483v1
Using Convolutional Variational Autoencoders to Predict Post-Trauma Health Outcomes from Actigraphy Data http://arxiv.org/abs/2011.07406v2
Using Large Pretrained Language Models for Answering User Queries from Product Specifications http://arxiv.org/abs/2005.14613v1
Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers http://arxiv.org/abs/1708.00160v2
Using Natural Language Relations between Answer Choices for Machine Comprehension http://arxiv.org/abs/2012.15837v1
Using Punkt for Sentence Segmentation in non-Latin Scripts: Experiments on Kurdish (Sorani) Texts http://arxiv.org/abs/2004.14134v2
Using Type Information to Improve Entity Coreference Resolution http://arxiv.org/abs/2010.05738v1
Using competency questions to select optimal clustering structures for residential energy consumption patterns http://arxiv.org/abs/2006.00934v1
Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm http://arxiv.org/abs/1708.00524v2
Utility is in the Eye of the User: A Critique of NLP Leaderboards http://arxiv.org/abs/2009.13888v3
Utility/Privacy Trade-off through the lens of Optimal Transport http://arxiv.org/abs/1905.11148v3
VCDM: Leveraging Variational Bi-encoding and Deep Contextualized Word Representations for Improved Definition Modeling http://arxiv.org/abs/2010.03124v1
VD-BERT: A Unified Vision and Dialog Transformer with BERT http://arxiv.org/abs/2004.13278v3
VFlow: More Expressive Generative Flows with Variational Data Augmentation http://arxiv.org/abs/2002.09741v2
Validated Variational Inference via Practical Posterior Error Bounds http://arxiv.org/abs/1910.04102v4
Validation of Approximate Likelihood and Emulator Models for Computationally Intensive Simulations http://arxiv.org/abs/1905.11505v2
Variable Skipping for Autoregressive Range Density Estimation http://arxiv.org/abs/2007.05572v1
Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems http://arxiv.org/abs/2002.04670v1
Variance Reduction for Matrix Games http://arxiv.org/abs/1907.02056v2
Variance Reduction in Stochastic Particle-Optimization Sampling http://arxiv.org/abs/1811.08052v1
Variational Autoencoders and Nonlinear ICA: A Unifying Framework http://arxiv.org/abs/1907.04809v4
Variational Autoencoders for Sparse and Overdispersed Discrete Data http://arxiv.org/abs/1905.00616v2
Variational Autoencoders with Riemannian Brownian Motion Priors http://arxiv.org/abs/2002.05227v3
Variational Bayesian Quantization http://arxiv.org/abs/2002.08158v2
Variational Depth Search in ResNets http://arxiv.org/abs/2002.02797v4
Variational Inference for Learning Representations of Natural Language Edits http://arxiv.org/abs/2004.09143v4
Variational Inference with Continuously-Indexed Normalizing Flows http://arxiv.org/abs/2007.05426v1
Variational Knowledge Graph Reasoning http://arxiv.org/abs/1803.06581v3
Variational Neural Machine Translation with Normalizing Flows http://arxiv.org/abs/2005.13978v1
Variational Optimization on Lie Groups, with Examples of Leading (Generalized) Eigenvalue Problems http://arxiv.org/abs/2001.10006v1
Variational Pretraining for Semi-supervised Text Classification http://arxiv.org/abs/1906.02242v1
Variational Sequential Labelers for Semi-Supervised Learning http://arxiv.org/abs/1906.09535v1
Vector-Vector-Matrix Architecture: A Novel Hardware-Aware Framework for Low-Latency Inference in NLP Applications http://arxiv.org/abs/2010.08412v1
Vehicle Trajectory Prediction by Transfer Learning of Semi-Supervised Models http://arxiv.org/abs/2007.06781v2
Verb Physics: Relative Physical Knowledge of Actions and Objects http://arxiv.org/abs/1706.03799v2
Video Prediction via Example Guidance http://arxiv.org/abs/2007.01738v1
Video-Grounded Dialogues with Pretrained Generation Language Models http://arxiv.org/abs/2006.15319v1
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning http://arxiv.org/abs/2003.05162v3
Visual Grounding of Learned Physical Models http://arxiv.org/abs/2004.13664v2
Visually Grounded Continual Learning of Compositional Phrases http://arxiv.org/abs/2005.00785v5
Visually Grounded Neural Syntax Acquisition http://arxiv.org/abs/1906.02890v2
Voice Separation with an Unknown Number of Multiple Speakers http://arxiv.org/abs/2003.01531v4
Volctrans Parallel Corpus Filtering System for WMT 2020 http://arxiv.org/abs/2010.14029v1
Wandering Within a World: Online Contextualized Few-Shot Learning http://arxiv.org/abs/2007.04546v2
Wasserstein Control of Mirror Langevin Monte Carlo http://arxiv.org/abs/2002.04363v1
Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains http://arxiv.org/abs/2010.07717v2
Wasserstein Smoothing: Certified Robustness against Wasserstein Adversarial Attacks http://arxiv.org/abs/1910.10783v1
Wasserstein Style Transfer http://arxiv.org/abs/1905.12828v1
WaveFlow: A Compact Flow-based Model for Raw Audio http://arxiv.org/abs/1912.01219v4
WaveNODE: A Continuous Normalizing Flow for Speech Synthesis http://arxiv.org/abs/2006.04598v4
We Can Detect Your Bias: Predicting the Political Ideology of News Articles http://arxiv.org/abs/2010.05338v1
WeChat Neural Machine Translation Systems for WMT20 http://arxiv.org/abs/2010.00247v2
Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging http://arxiv.org/abs/2003.09070v1
Weakly Supervised Learning of Nuanced Frames for Analyzing Polarization in News Media http://arxiv.org/abs/2009.09609v1
Weakly Supervised Medication Regimen Extraction from Medical Conversations http://arxiv.org/abs/2010.05317v1
Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding http://arxiv.org/abs/2010.06705v1
Weakly-Supervised Disentanglement Without Compromises http://arxiv.org/abs/2002.02886v4
Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video http://arxiv.org/abs/1906.02549v1
WeatherBench: A benchmark dataset for data-driven weather forecasting http://arxiv.org/abs/2002.00469v3
Weight Poisoning Attacks on Pre-trained Models http://arxiv.org/abs/2004.06660v1
Weird AI Yankovic: Generating Parody Lyrics http://arxiv.org/abs/2009.12240v1
Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings http://arxiv.org/abs/1904.01543v3
What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models http://arxiv.org/abs/1907.13528v2
What Can Learned Intrinsic Rewards Capture? http://arxiv.org/abs/1912.05500v3
What Can We Learn from Collective Human Opinions on Natural Language Inference Data? http://arxiv.org/abs/2010.03532v2
What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes http://arxiv.org/abs/2011.05064v1
What Do Position Embeddings Learn? An Empirical Study of Pre-Trained Language Model Positional Encoding http://arxiv.org/abs/2010.04903v1
What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge http://arxiv.org/abs/1912.13337v2
What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets http://arxiv.org/abs/2007.03626v1
What Happens To BERT Embeddings During Fine-tuning? http://arxiv.org/abs/2004.14448v1
What Have We Achieved on Text Summarization? http://arxiv.org/abs/2010.04529v1
What Kind of Language Is Hard to Language-Model? http://arxiv.org/abs/1906.04726v2
What Makes Reading Comprehension Questions Easier? http://arxiv.org/abs/1808.09384v1
What Question Answering can Learn from Trivia Nerds http://arxiv.org/abs/1910.14464v3
What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context http://arxiv.org/abs/2005.04518v1
What You Say and How You Say it: Joint Modeling of Topics and Discourse in Microblog Conversations http://arxiv.org/abs/1903.07319v1
What Your Username Says About You http://arxiv.org/abs/1507.02045v2
What are the Goals of Distributional Semantics? http://arxiv.org/abs/2005.02982v1
What are the Statistical Limits of Offline RL with Linear Function Approximation? http://arxiv.org/abs/2010.11895v1
What do Models Learn from Question Answering Datasets? http://arxiv.org/abs/2004.03490v2
What do Neural Machine Translation Models Learn about Morphology? http://arxiv.org/abs/1704.03471v3
What is Learned in Visually Grounded Neural Syntax Acquisition http://arxiv.org/abs/2005.01678v2
What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization? http://arxiv.org/abs/1902.00618v3
What is More Likely to Happen Next? Video-and-Language Future Event Prediction http://arxiv.org/abs/2010.07999v1
What makes a good conversation? How controllable attributes affect human judgments http://arxiv.org/abs/1902.08654v2
What's in a Name? Are BERT Named Entity Representations just as Good for any other Name? http://arxiv.org/abs/2007.06897v1
When Are Tree Structures Necessary for Deep Learning of Representations? http://arxiv.org/abs/1503.00185v5
When BERT Plays the Lottery, All Tickets Are Winning http://arxiv.org/abs/2005.00561v2
When Does Self-Supervision Help Graph Convolutional Networks? http://arxiv.org/abs/2006.09136v4
When Does Unsupervised Machine Translation Work? http://arxiv.org/abs/2004.05516v3
When Explanations Lie: Why Many Modified BP Attributions Fail http://arxiv.org/abs/1912.09818v6
When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models http://arxiv.org/abs/2010.04941v1
When and Why is Unsupervised Neural Machine Translation Useless? http://arxiv.org/abs/2004.10581v1
When deep denoising meets iterative phase retrieval http://arxiv.org/abs/2003.01792v1
When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People? http://arxiv.org/abs/2004.12043v1
Where Are You? Localization from Embodied Dialog http://arxiv.org/abs/2011.08277v1
Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News http://arxiv.org/abs/2010.03159v1
Where's the Question? A Multi-channel Deep Convolutional Neural Network for Question Identification in Textual Data http://arxiv.org/abs/2010.07816v1
Which Tasks Should Be Learned Together in Multi-task Learning? http://arxiv.org/abs/1905.07553v4
Who did What: A Large-Scale Person-Centered Cloze Dataset http://arxiv.org/abs/1608.05457v1
Whodunnit? Crime Drama as a Case for Natural Language Understanding http://arxiv.org/abs/1710.11601v1
Why Non-myopic Bayesian Optimization is Promising and How Far Should We Look-ahead? A Study via Rollout http://arxiv.org/abs/1911.01004v2
Why Normalizing Flows Fail to Detect Out-of-Distribution Data http://arxiv.org/abs/2006.08545v1
Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries http://arxiv.org/abs/2005.00524v1
Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures http://arxiv.org/abs/1808.08946v3
Why Skip If You Can Combine: A Simple Knowledge Distillation Technique for Intermediate Layers http://arxiv.org/abs/2010.03034v1
Why bigger is not always better: on finite and infinite neural networks http://arxiv.org/abs/1910.08013v3
Why is unsupervised alignment of English embeddings from different algorithms so hard? http://arxiv.org/abs/1809.00150v1
Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements http://arxiv.org/abs/2010.04295v1
Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks http://arxiv.org/abs/2007.02901v1
WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community http://arxiv.org/abs/1810.13181v1
Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness http://arxiv.org/abs/2004.05816v2
Will-They-Won't-They: A Very Large Dataset for Stance Detection on Twitter http://arxiv.org/abs/2005.00388v1
Winning on the Merits: The Joint Effects of Content and Style on Debate Outcomes http://arxiv.org/abs/1705.05040v1
WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge http://arxiv.org/abs/2005.05763v1
With Little Power Comes Great Responsibility http://arxiv.org/abs/2010.06595v1
Woodbury Transformations for Deep Generative Flows http://arxiv.org/abs/2002.12229v3
Word Embeddings for Chemical Patent Natural Language Processing http://arxiv.org/abs/2010.12912v1
Word Frequency Does Not Predict Grammatical Knowledge in Language Models http://arxiv.org/abs/2010.13870v1
Word Ordering Without Syntax http://arxiv.org/abs/1604.08633v2
Word Rotator's Distance http://arxiv.org/abs/2004.15003v3
Word class flexibility: A deep contextualized approach http://arxiv.org/abs/2009.09241v1
Word-level Speech Recognition with a Letter to Word Encoder http://arxiv.org/abs/1906.04323v2
Word-level Textual Adversarial Attacking as Combinatorial Optimization http://arxiv.org/abs/1910.12196v4
Word-order biases in deep-agent emergent communication http://arxiv.org/abs/1905.12330v3
Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions http://arxiv.org/abs/2005.01655v1
Working Memory Networks: Augmenting Memory Networks with a Relational Reasoning Module http://arxiv.org/abs/1805.09354v1
World Model as a Graph: Learning Latent Landmarks for Planning http://arxiv.org/abs/2011.12491v1
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation http://arxiv.org/abs/2005.10678v2
X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models http://arxiv.org/abs/2010.06189v3
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers http://arxiv.org/abs/2009.11278v1
XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation http://arxiv.org/abs/2004.01401v3
XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization http://arxiv.org/abs/2010.06478v1
XLNet: Generalized Autoregressive Pretraining for Language Understanding http://arxiv.org/abs/1906.08237v2
XLVIN: eXecuted Latent Value Iteration Nets http://arxiv.org/abs/2010.13146v2
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization http://arxiv.org/abs/2003.11080v5
Xiaomingbot: A Multilingual Robot News Reporter http://arxiv.org/abs/2007.08005v1
XtarNet: Learning to Extract Task-Adaptive Representation for Incremental Few-Shot Learning http://arxiv.org/abs/2003.08561v2
XtremeDistil: Multi-stage Distillation for Massive Multilingual Models http://arxiv.org/abs/2004.05686v2
YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design http://arxiv.org/abs/2009.05697v2
You Impress Me: Dialogue Generation via Mutual Persona Perception http://arxiv.org/abs/2004.05388v1
Zeno++: Robust Fully Asynchronous SGD http://arxiv.org/abs/1903.07020v4
Zero-Resource Translation with Multi-Lingual Neural Machine Translation http://arxiv.org/abs/1606.04164v1
Zero-Shot Cross-Lingual Opinion Target Extraction http://arxiv.org/abs/1904.09122v1
Zero-Shot Stance Detection: A Dataset and Model using Generalized Topic Representations http://arxiv.org/abs/2010.03640v1
Zero-Shot Transfer Learning for Event Extraction http://arxiv.org/abs/1707.01066v1
Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking http://arxiv.org/abs/2005.00891v1
Zero-Shot Translation Quality Estimation with Explicit Cross-Lingual Patterns http://arxiv.org/abs/2010.04989v1
Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens http://arxiv.org/abs/1805.02214v1
Zero-shot User Intent Detection via Capsule Neural Networks http://arxiv.org/abs/1809.00385v1
ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages http://arxiv.org/abs/2005.07105v1
doc2dial: A Goal-Oriented Document-Grounded Dialogue Dataset http://arxiv.org/abs/2011.06623v2
emrQA: A Large Corpus for Question Answering on Electronic Medical Records http://arxiv.org/abs/1809.00732v1
giotto-tda: A Topological Data Analysis Toolkit for Machine Learning and Data Exploration http://arxiv.org/abs/2004.02551v1
i-RIM applied to the fastMRI challenge http://arxiv.org/abs/1910.08952v1
iNLTK: Natural Language Toolkit for Indic Languages http://arxiv.org/abs/2009.12534v2
iSarcasm: A Dataset of Intended Sarcasm http://arxiv.org/abs/1911.03123v2
jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models http://arxiv.org/abs/2003.02249v2
k-simplex2vec: a simplicial extension of node2vec http://arxiv.org/abs/2010.05636v2
pyBART: Evidence-based Syntactic Transformations for IE http://arxiv.org/abs/2005.01306v2
scGNN: scRNA-seq Dropout Imputation via Induced Hierarchical Cell Similarity Graph http://arxiv.org/abs/2008.03322v1
schuBERT: Optimizing Elements of BERT http://arxiv.org/abs/2005.06628v1
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions http://arxiv.org/abs/1808.08732v1