Title | URL |
---|---|
(Locally) Differentially Private Combinatorial Semi-Bandits | http://arxiv.org/abs/2006.00706v2 |
(Re)construing Meaning in NLP | http://arxiv.org/abs/2005.09099v1 |
2kenize: Tying Subword Sequences for Chinese Script Conversion | http://arxiv.org/abs/2005.03375v1 |
3D-LaneNet+: Anchor Free Lane Detection using a Semi-Local Representation | http://arxiv.org/abs/2011.01535v2 |
A Batch Normalized Inference Network Keeps the KL Vanishing Away | http://arxiv.org/abs/2004.12585v2 |
A Benchmark of Medical Out of Distribution Detection | http://arxiv.org/abs/2007.04250v2 |
A Bilingual Generative Transformer for Semantic Sentence Embedding | http://arxiv.org/abs/1911.03895v2 |
A Boolean Task Algebra for Reinforcement Learning | http://arxiv.org/abs/2001.01394v2 |
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference | http://arxiv.org/abs/1704.05426v4 |
A Call for More Rigor in Unsupervised Cross-lingual Learning | http://arxiv.org/abs/2004.14958v1 |
A Characterization of Mean Squared Error for Estimator with Bagging | http://arxiv.org/abs/1908.02718v1 |
A Closer Look at Accuracy vs. Robustness | http://arxiv.org/abs/2003.02460v3 |
A Closer Look at Small-loss Bounds for Bandits with Graph Feedback | http://arxiv.org/abs/2002.00315v2 |
A Co-Matching Model for Multi-choice Reading Comprehension | http://arxiv.org/abs/1806.04068v1 |
A Computational Approach to Understanding Empathy Expressed in Text-Based Mental Health Support | http://arxiv.org/abs/2009.08441v1 |
A Contextual Hierarchical Attention Network with Adaptive Objective for Dialogue State Tracking | http://arxiv.org/abs/2006.01554v2 |
A Continuous-time Perspective for Modeling Acceleration in Riemannian Optimization | http://arxiv.org/abs/1910.10782v3 |
A Convolutional Encoder Model for Neural Machine Translation | http://arxiv.org/abs/1611.02344v3 |
A Corpus for Large-Scale Phonetic Typology | http://arxiv.org/abs/2005.13962v1 |
A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature | http://arxiv.org/abs/1806.04185v1 |
A Cross-Task Analysis of Text Span Representations | http://arxiv.org/abs/2006.03866v1 |
A Crowdsourced Frame Disambiguation Corpus with Ambiguity | http://arxiv.org/abs/1904.06101v1 |
A Data and Compute Efficient Design for Limited-Resources Deep Learning | http://arxiv.org/abs/2004.09691v2 |
A Data-driven Approach for Noise Reduction in Distantly Supervised Biomedical Relation Extraction | http://arxiv.org/abs/2005.12565v1 |
A Decomposable Attention Model for Natural Language Inference | http://arxiv.org/abs/1606.01933v2 |
A Deep Generative Model for Fragment-Based Molecule Generation | http://arxiv.org/abs/2002.12826v1 |
A Deep Generative Model of Vowel Formant Typology | http://arxiv.org/abs/1807.02745v1 |
A Deep Learning Approach for Determining Effects of Tuta Absoluta in Tomato Plants | http://arxiv.org/abs/2004.04023v1 |
A Deep Learning System for Sentiment Analysis of Service Calls | http://arxiv.org/abs/2004.10320v1 |
A Deep Neural Network Sentence Level Classification Method with Context Information | http://arxiv.org/abs/1809.00934v1 |
A Deep Reinforced Model for Zero-Shot Cross-Lingual Summarization with Bilingual Semantic Similarity Rewards | http://arxiv.org/abs/2006.15454v1 |
A Diagnostic Study of Explainability Techniques for Text Classification | http://arxiv.org/abs/2009.13295v1 |
A Differentiable Newton Euler Algorithm for Multi-body Model Learning | http://arxiv.org/abs/2010.09802v1 |
A Distributional Analysis of Sampling-Based Reinforcement Learning Algorithms | http://arxiv.org/abs/2003.12239v1 |
A Distributional Framework for Data Valuation | http://arxiv.org/abs/2002.12334v1 |
A Distributional View on Multi-Objective Policy Optimization | http://arxiv.org/abs/2005.07513v1 |
A Double Residual Compression Algorithm for Efficient Distributed Learning | http://arxiv.org/abs/1910.07561v1 |
A Fast Spectral Algorithm for Mean Estimation with Sub-Gaussian Rates | http://arxiv.org/abs/1908.04468v2 |
A Formal Hierarchy of RNN Architectures | http://arxiv.org/abs/2004.08500v4 |
A Fourier State Space Model for Bayesian ODE Filters | http://arxiv.org/abs/2007.09118v2 |
A Framework and Dataset for Abstract Art Generation via CalligraphyGAN | http://arxiv.org/abs/2012.00744v1 |
A Framework for Sample Efficient Interval Estimation with Control Variates | http://arxiv.org/abs/2006.10287v1 |
A Free-Energy Principle for Representation Learning | http://arxiv.org/abs/2002.12406v1 |
A Full Non-Monotonic Transition System for Unrestricted Non-Projective Parsing | http://arxiv.org/abs/1706.03367v1 |
A General Framework for Information Extraction using Dynamic Span Graphs | http://arxiv.org/abs/1904.03296v1 |
A Generative Approach to Titling and Clustering Wikipedia Sections | http://arxiv.org/abs/2005.11216v1 |
A Generative Model for Joint Natural Language Understanding and Generation | http://arxiv.org/abs/2006.07499v1 |
A Generative Model for Molecular Distance Geometry | http://arxiv.org/abs/1909.11459v4 |
A Generative Parser with a Discriminative Recognition Algorithm | http://arxiv.org/abs/1708.00415v2 |
A Generic First-Order Algorithmic Framework for Bi-Level Programming Beyond Lower-Level Singleton | http://arxiv.org/abs/2006.04045v2 |
A Geometry-Inspired Attack for Generating Natural Language Adversarial Examples | http://arxiv.org/abs/2010.01345v1 |
A Girl Has A Name: Detecting Authorship Obfuscation | http://arxiv.org/abs/2005.00702v1 |
A Graph to Graphs Framework for Retrosynthesis Prediction | http://arxiv.org/abs/2003.12725v1 |
A Hierarchical Latent Structure for Variational Conversation Modeling | http://arxiv.org/abs/1804.03424v2 |
A Hierarchical Probabilistic U-Net for Modeling Multi-Scale Ambiguities | http://arxiv.org/abs/1905.13077v1 |
A Hierarchical Reinforced Sequence Operation Method for Unsupervised Text Style Transfer | http://arxiv.org/abs/1906.01833v1 |
A Hierarchical Transformer for Unsupervised Parsing | http://arxiv.org/abs/2003.13841v1 |
A Hybrid Convolutional Variational Autoencoder for Text Generation | http://arxiv.org/abs/1702.02390v1 |
A Hybrid Stochastic Policy Gradient Algorithm for Reinforcement Learning | http://arxiv.org/abs/2003.00430v2 |
A Joint Named-Entity Recognizer for Heterogeneous Tag-sets Using a Tag Hierarchy | http://arxiv.org/abs/1905.09135v2 |
A Just and Comprehensive Strategy for Using NLP to Address Online Abuse | http://arxiv.org/abs/1906.01738v2 |
A Knowledge-Enhanced Pretraining Model for Commonsense Story Generation | http://arxiv.org/abs/2001.05139v1 |
A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors | http://arxiv.org/abs/1805.05388v1 |
A Large-Scale Multi-Document Summarization Dataset from the Wikipedia Current Events Portal | http://arxiv.org/abs/2005.10070v1 |
A Locally Adaptive Bayesian Cubature Method | http://arxiv.org/abs/1910.02995v1 |
A Meaning-based Statistical English Math Word Problem Solver | http://arxiv.org/abs/1803.06064v2 |
A Mention-Ranking Model for Abstract Anaphora Resolution | http://arxiv.org/abs/1706.02256v2 |
A Meta-Learning Approach for Graph Representation Learning in Multi-Task Settings | http://arxiv.org/abs/2012.06755v1 |
A Methodology for Creating Question Answering Corpora Using Inverse Data Annotation | http://arxiv.org/abs/2004.07633v2 |
A Minimal Span-Based Neural Constituency Parser | http://arxiv.org/abs/1705.03919v1 |
A Monolingual Approach to Contextualized Word Embeddings for Mid-Resource Languages | http://arxiv.org/abs/2006.06202v2 |
A Multi-Axis Annotation Scheme for Event Temporal Relations | http://arxiv.org/abs/1804.07828v2 |
A Multi-Perspective Architecture for Semantic Code Search | http://arxiv.org/abs/2005.06980v1 |
A Multi-Task Incremental Learning Framework with Category Name Embedding for Aspect-Category Sentiment Analysis | http://arxiv.org/abs/2010.02784v1 |
A Multi-modal Approach to Fine-grained Opinion Mining on Video Reviews | http://arxiv.org/abs/2005.13362v2 |
A Multi-sentiment-resource Enhanced Attention Network for Sentiment Classification | http://arxiv.org/abs/1807.04990v1 |
A Multiclass Classification Approach to Label Ranking | http://arxiv.org/abs/2002.09420v1 |
A Multilingual Neural Machine Translation Model for Biomedical Data | http://arxiv.org/abs/2008.02878v1 |
A Multitask Learning Approach for Diacritic Restoration | http://arxiv.org/abs/2006.04016v1 |
A Narration-based Reward Shaping Approach using Grounded Natural Language Commands | http://arxiv.org/abs/1911.00497v1 |
A Nested Attention Neural Hybrid Model for Grammatical Error Correction | http://arxiv.org/abs/1707.02026v2 |
A Neural Attention Model for Abstractive Sentence Summarization | http://arxiv.org/abs/1509.00685v2 |
A Neural Generative Model for Joint Learning Topics and Topic-Specific Word Embeddings | http://arxiv.org/abs/2008.04702v1 |
A Neural Model for User Geolocation and Lexical Dialectology | http://arxiv.org/abs/1704.04008v3 |
A Neural Model of Adaptation in Reading | http://arxiv.org/abs/1808.09930v2 |
A Neural Network for Coordination Boundary Prediction | http://arxiv.org/abs/1610.03946v1 |
A Neuro-AI Interface for Evaluating Generative Adversarial Networks | http://arxiv.org/abs/2003.03193v2 |
A New Neural Network Architecture Invariant to the Action of Symmetry Subgroups | http://arxiv.org/abs/2012.06452v1 |
A Nonparametric Off-Policy Policy Gradient | http://arxiv.org/abs/2001.02435v3 |
A Note on Data Biases in Generative Models | http://arxiv.org/abs/2012.02516v1 |
A Note on Over-Smoothing for Graph Neural Networks | http://arxiv.org/abs/2006.13318v1 |
A Novel Cascade Binary Tagging Framework for Relational Triple Extraction | http://arxiv.org/abs/1909.03227v4 |
A Novel Confidence-Based Algorithm for Structured Bandits | http://arxiv.org/abs/2005.11593v1 |
A Novel Graph-based Multi-modal Fusion Encoder for Neural Machine Translation | http://arxiv.org/abs/2007.08742v1 |
A Pairwise Fair and Community-preserving Approach to k-Center Clustering | http://arxiv.org/abs/2007.07384v1 |
A Practical Algorithm for Multiplayer Bandits when Arm Means Vary Among Players | http://arxiv.org/abs/1902.01239v4 |
A Principled Approach to Learning Stochastic Representations for Privacy in Deep Neural Inference | http://arxiv.org/abs/2003.12154v1 |
A Probabilistic End-To-End Task-Oriented Dialog Model with Latent Belief States towards Semi-Supervised Learning | http://arxiv.org/abs/2009.08115v3 |
A Probabilistic Generative Model for Typographical Analysis of Early Modern Printing | http://arxiv.org/abs/2005.01646v1 |
A Probabilistic Generative Model of Linguistic Typology | http://arxiv.org/abs/1903.10950v3 |
A Probabilistic Model with Commonsense Constraints for Pattern-based Temporal Fact Extraction | http://arxiv.org/abs/2006.06436v1 |
A Re-evaluation of Knowledge Graph Completion Methods | http://arxiv.org/abs/1911.03903v3 |
A Recipe for Creating Multimodal Aligned Datasets for Sequential Tasks | http://arxiv.org/abs/2005.09606v1 |
A Reduction from Reinforcement Learning to No-Regret Online Learning | http://arxiv.org/abs/1911.05873v2 |
A Reinforced Generation of Adversarial Examples for Neural Machine Translation | http://arxiv.org/abs/1911.03677v2 |
A Relational Memory-based Embedding Model for Triple Classification and Search Personalization | http://arxiv.org/abs/1907.06080v2 |
A Relaxed Matching Procedure for Unsupervised BLI | http://arxiv.org/abs/2010.07095v1 |
A Report on the 2020 Sarcasm Detection Shared Task | http://arxiv.org/abs/2005.05814v2 |
A Resource-Free Evaluation Metric for Cross-Lingual Word Embeddings Based on Graph Modularity | http://arxiv.org/abs/1906.01926v1 |
A Rigorous Study on Named Entity Recognition: Can Fine-tuning Pretrained Model Lead to the Promised Land? | http://arxiv.org/abs/2004.12126v2 |
A Sample Complexity Separation between Non-Convex and Convex Meta-Learning | http://arxiv.org/abs/2002.11172v1 |
A Scalable Neural Shortlisting-Reranking Approach for Large-Scale Domain Classification in Natural Language Understanding | http://arxiv.org/abs/1804.08064v1 |
A Self-Training Method for Machine Reading Comprehension with Soft Evidence Extraction | http://arxiv.org/abs/2005.05189v2 |
A Sequential Self Teaching Approach for Improving Generalization in Sound Event Recognition | http://arxiv.org/abs/2007.00144v1 |
A Simple Approach to Learning Unsupervised Multilingual Embeddings | http://arxiv.org/abs/2004.05991v2 |
A Simple Joint Model for Improved Contextual Neural Lemmatization | http://arxiv.org/abs/1904.02306v4 |
A Simple Regularization-based Algorithm for Learning Cross-Domain Word Embeddings | http://arxiv.org/abs/1902.00184v1 |
A Simple Theoretical Model of Importance for Summarization | http://arxiv.org/abs/1801.08991v2 |
A Simple Yet Strong Pipeline for HotpotQA | http://arxiv.org/abs/2004.06753v1 |
A Simple and Effective Model for Answering Multi-span Questions | http://arxiv.org/abs/1909.13375v4 |
A Skeleton-Based Model for Promoting Coherence Among Sentences in Narrative Story Generation | http://arxiv.org/abs/1808.06945v2 |
A Span-based Linearization for Constituent Trees | http://arxiv.org/abs/2004.14704v2 |
A Stein Goodness-of-fit Test for Directional Distributions | http://arxiv.org/abs/2002.06843v1 |
A Stochastic Decoder for Neural Machine Translation | http://arxiv.org/abs/1805.10844v1 |
A Streaming Approach For Efficient Batched Beam Search | http://arxiv.org/abs/2010.02164v2 |
A Study of Deep Learning Colon Cancer Detection in Limited Data Access Scenarios | http://arxiv.org/abs/2005.10326v2 |
A Study of Reinforcement Learning for Neural Machine Translation | http://arxiv.org/abs/1808.08866v1 |
A Study on Encodings for Neural Architecture Search | http://arxiv.org/abs/2007.04965v1 |
A Stylometric Inquiry into Hyperpartisan and Fake News | http://arxiv.org/abs/1702.05638v1 |
A Supervised Word Alignment Method based on Cross-Language Span Prediction using Multilingual BERT | http://arxiv.org/abs/2004.14516v1 |
A Survey on Recognizing Textual Entailment as an NLP Evaluation | http://arxiv.org/abs/2010.03061v1 |
A Syntactic Neural Model for General-Purpose Code Generation | http://arxiv.org/abs/1704.01696v1 |
A System for Worldwide COVID-19 Information Aggregation | http://arxiv.org/abs/2008.01523v2 |
A Systematic Assessment of Syntactic Generalization in Neural Language Models | http://arxiv.org/abs/2005.03692v2 |
A Tale of a Probe and a Parser | http://arxiv.org/abs/2005.01641v2 |
A Theoretical Case Study of Structured Variational Inference for Community Detection | http://arxiv.org/abs/1907.12203v5 |
A Top-Down Neural Architecture towards Text-Level Parsing of Discourse Rhetorical Structure | http://arxiv.org/abs/2005.02680v3 |
A Topology Layer for Machine Learning | http://arxiv.org/abs/1905.12200v2 |
A Trainable Optimal Transport Embedding for Feature Aggregation | http://arxiv.org/abs/2006.12065v3 |
A Transformer-based Approach for Source Code Summarization | http://arxiv.org/abs/2005.00653v1 |
A Transformer-based joint-encoding for Emotion Recognition and Sentiment Analysis | http://arxiv.org/abs/2006.15955v1 |
A Transition-Based Directed Acyclic Graph Parser for UCCA | http://arxiv.org/abs/1704.00552v2 |
A Two-Stage Masked LM Method for Term Set Expansion | http://arxiv.org/abs/2005.01063v1 |
A Unified Linear-Time Framework for Sentence-Level Discourse Parsing | http://arxiv.org/abs/1905.05682v2 |
A Unified MRC Framework for Named Entity Recognition | http://arxiv.org/abs/1910.11476v6 |
A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss | http://arxiv.org/abs/1805.06266v2 |
A Unified Stochastic Gradient Approach to Designing Bayesian-Optimal Experiments | http://arxiv.org/abs/1911.00294v2 |
A Unified Theory of Decentralized SGD with Changing Topology and Local Updates | http://arxiv.org/abs/2003.10422v2 |
A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent | http://arxiv.org/abs/1905.11261v1 |
A Unified View of Label Shift Estimation | http://arxiv.org/abs/2003.07554v3 |
A Visual Attention Grounding Neural Model for Multimodal Machine Translation | http://arxiv.org/abs/1808.08266v2 |
A Wasserstein Minimum Velocity Approach to Learning Unnormalized Models | http://arxiv.org/abs/2002.07501v1 |
A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification | http://arxiv.org/abs/1810.05754v1 |
A greedy anytime algorithm for sparse PCA | http://arxiv.org/abs/1910.06846v5 |
A large annotated corpus for learning natural language inference | http://arxiv.org/abs/1508.05326v1 |
A negative case analysis of visual grounding methods for VQA | http://arxiv.org/abs/2004.05704v2 |
A neurally plausible model learns successor representations in partially observable environments | http://arxiv.org/abs/1906.09480v1 |
A new regret analysis for Adam-type algorithms | http://arxiv.org/abs/2003.09729v1 |
A nonasymptotic law of iterated logarithm for general M-estimators | http://arxiv.org/abs/1903.06576v2 |
A principled approach for generating adversarial images under non-smooth dissimilarity metrics | http://arxiv.org/abs/1908.01667v2 |
A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings | http://arxiv.org/abs/1805.06297v2 |
A single image deep learning approach to restoration of corrupted remote sensing products | http://arxiv.org/abs/2004.04209v1 |
A strong baseline for question relevancy ranking | http://arxiv.org/abs/1808.08836v1 |
AD3: Attentive Deep Document Dater | http://arxiv.org/abs/1902.02161v1 |
ADVISER: A Toolkit for Developing Multi-modal, Multi-domain and Socially-engaged Conversational Agents | http://arxiv.org/abs/2005.01777v1 |
AIN: Fast and Accurate Sequence Labeling with Approximate Inference Network | http://arxiv.org/abs/2009.08229v2 |
ALICE: Active Learning with Contrastive Natural Language Explanations | http://arxiv.org/abs/2009.10259v1 |
AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic Gradient MCMC | http://arxiv.org/abs/2003.00193v1 |
AMR Dependency Parsing with a Typed Semantic Algebra | http://arxiv.org/abs/1805.11465v1 |
AMR Parsing as Sequence-to-Graph Transduction | http://arxiv.org/abs/1905.08704v2 |
AMR Parsing via Graph-Sequence Iterative Inference | http://arxiv.org/abs/2004.05572v2 |
AMR-to-text Generation with Synchronous Node Replacement Grammar | http://arxiv.org/abs/1702.00500v4 |
AP-Perf: Incorporating Generic Performance Metrics in Differentiable Learning | http://arxiv.org/abs/1912.00965v2 |
AR-DAE: Towards Unbiased Neural Entropy Gradient Estimation | http://arxiv.org/abs/2006.05164v1 |
ASAP: Architecture Search, Anneal and Prune | http://arxiv.org/abs/1904.04123v2 |
ASSET: A Dataset for Tuning and Evaluation of Sentence Simplification Models with Multiple Rewriting Transformations | http://arxiv.org/abs/2005.00481v1 |
Abstract Syntax Networks for Code Generation and Semantic Parsing | http://arxiv.org/abs/1704.07535v1 |
Abstraction Mechanisms Predict Generalization in Deep Neural Networks | http://arxiv.org/abs/1905.11515v2 |
Abstractive Multi-Document Summarization via Phrase Selection and Merging | http://arxiv.org/abs/1506.01597v2 |
Abusive Language Detection with Graph Convolutional Networks | http://arxiv.org/abs/1904.04073v1 |
Accelerated Message Passing for Entropy-Regularized MAP Inference | http://arxiv.org/abs/2007.00699v1 |
Accelerated Primal-Dual Algorithms for Distributed Smooth Convex Optimization over Networks | http://arxiv.org/abs/1910.10666v2 |
Accelerated Reinforcement Learning for Sentence Generation by Vocabulary Prediction | http://arxiv.org/abs/1809.01694v2 |
Accelerated Stochastic Gradient-free and Projection-free Methods | http://arxiv.org/abs/2007.12625v2 |
Accelerating Large-Scale Inference with Anisotropic Vector Quantization | http://arxiv.org/abs/1908.10396v5 |
Accelerating NMT Batched Beam Decoding with LMBR Posteriors for Deployment | http://arxiv.org/abs/1804.11324v1 |
Accelerating Natural Language Understanding in Task-Oriented Dialog | http://arxiv.org/abs/2006.03701v1 |
Accelerating Online Reinforcement Learning with Offline Datasets | http://arxiv.org/abs/2006.09359v3 |
Accelerating Reinforcement Learning with Learned Skill Priors | http://arxiv.org/abs/2010.11944v1 |
Accurate Word Alignment Induction from Neural Machine Translation | http://arxiv.org/abs/2004.14837v2 |
Acrostic Poem Generation | http://arxiv.org/abs/2010.02239v1 |
Action and Perception as Divergence Minimization | http://arxiv.org/abs/2009.01791v2 |
Active Community Detection with Maximal Expected Model Change | http://arxiv.org/abs/1801.05856v2 |
Active Imitation Learning with Noisy Guidance | http://arxiv.org/abs/2005.12801v1 |
Active Learning for Coreference Resolution using Discrete Annotation | http://arxiv.org/abs/2004.13671v3 |
Active Learning for Identification of Linear Dynamical Systems | http://arxiv.org/abs/2002.00495v2 |
Active Learning from Crowd in Document Screening | http://arxiv.org/abs/2012.02297v1 |
Active World Model Learning with Progress Curiosity | http://arxiv.org/abs/2007.07853v1 |
AdaScale SGD: A User-Friendly Algorithm for Distributed Training | http://arxiv.org/abs/2007.05105v1 |
Adapting End-to-End Speech Recognition for Readable Subtitles | http://arxiv.org/abs/2005.12143v1 |
Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations | http://arxiv.org/abs/1808.09500v1 |
Adaptive Attention Span in Transformers | http://arxiv.org/abs/1905.07799v2 |
Adaptive Attentional Network for Few-Shot Knowledge Graph Completion | http://arxiv.org/abs/2010.09638v1 |
Adaptive Checkpoint Adjoint Method for Gradient Estimation in Neural ODE | http://arxiv.org/abs/2006.02493v1 |
Adaptive Document Retrieval for Deep Question Answering | http://arxiv.org/abs/1808.06528v1 |
Adaptive Estimator Selection for Off-Policy Evaluation | http://arxiv.org/abs/2002.07729v2 |
Adaptive Exploration in Linear Contextual Bandit | http://arxiv.org/abs/1910.06996v2 |
Adaptive Gradient Descent without Descent | http://arxiv.org/abs/1910.09529v2 |
Adaptive Prediction Timing for Electronic Health Records | http://arxiv.org/abs/2003.02554v1 |
Adaptive Region-Based Active Learning | http://arxiv.org/abs/2002.07348v1 |
Adaptive Reward-Poisoning Attacks against Reinforcement Learning | http://arxiv.org/abs/2003.12613v2 |
Adaptive Risk Minimization: A Meta-Learning Approach for Tackling Group Shift | http://arxiv.org/abs/2007.02931v2 |
Adaptive Scaling for Sparse Detection in Information Extraction | http://arxiv.org/abs/1805.00250v2 |
Adaptive Transformers for Learning Multimodal Representations | http://arxiv.org/abs/2005.07486v3 |
Adding Seemingly Uninformative Labels Helps in Low Data Regimes | http://arxiv.org/abs/2008.00807v2 |
Additive Tree-Structured Covariance Function for Conditional Parameter Spaces in Bayesian Optimization | http://arxiv.org/abs/2006.11771v1 |
Addressing Ancestry Disparities in Genomic Medicine: A Geographic-aware Algorithm | http://arxiv.org/abs/2004.12053v1 |
Addressing Exposure Bias With Document Minimum Risk Training: Cambridge at the WMT20 Biomedical Translation Task | http://arxiv.org/abs/2010.05333v1 |
Addressing reward bias in Adversarial Imitation Learning with neutral reward functions | http://arxiv.org/abs/2009.09467v1 |
Addressing the Rare Word Problem in Neural Machine Translation | http://arxiv.org/abs/1410.8206v4 |
AdvAug: Robust Adversarial Augmentation for Neural Machine Translation | http://arxiv.org/abs/2006.11834v3 |
Advancing Renewable Electricity Consumption With Reinforcement Learning | http://arxiv.org/abs/2003.04310v1 |
Adversarial Alignment of Multilingual Models for Extracting Temporal Expressions from Text | http://arxiv.org/abs/2005.09392v1 |
Adversarial Attack and Defense of Structured Prediction Models | http://arxiv.org/abs/2010.01610v2 |
Adversarial Attacks on Probabilistic Autoregressive Forecasting Models | http://arxiv.org/abs/2003.03778v1 |
Adversarial Connective-exploiting Networks for Implicit Discourse Relation Classification | http://arxiv.org/abs/1704.00217v1 |
Adversarial Contrastive Estimation | http://arxiv.org/abs/1805.03642v3 |
Adversarial Deep Averaging Networks for Cross-Lingual Sentiment Classification | http://arxiv.org/abs/1606.01614v5 |
Adversarial Example Generation with Syntactically Controlled Paraphrase Networks | http://arxiv.org/abs/1804.06059v1 |
Adversarial Examples for Evaluating Reading Comprehension Systems | http://arxiv.org/abs/1707.07328v1 |
Adversarial Filters of Dataset Biases | http://arxiv.org/abs/2002.04108v3 |
Adversarial Learning of Privacy-Preserving Text Representations for De-Identification of Medical Records | http://arxiv.org/abs/1906.05000v1 |
Adversarial Multi-Criteria Learning for Chinese Word Segmentation | http://arxiv.org/abs/1704.07556v1 |
Adversarial Multi-task Learning for Text Classification | http://arxiv.org/abs/1704.05742v1 |
Adversarial Multitask Learning for Joint Multi-Feature and Multi-Dialect Morphological Modeling | http://arxiv.org/abs/1910.12702v1 |
Adversarial Mutual Information for Text Generation | http://arxiv.org/abs/2007.00067v1 |
Adversarial NLI: A New Benchmark for Natural Language Understanding | http://arxiv.org/abs/1910.14599v2 |
Adversarial Neural Pruning with Latent Vulnerability Suppression | http://arxiv.org/abs/1908.04355v4 |
Adversarial Removal of Demographic Attributes from Text Data | http://arxiv.org/abs/1808.06640v2 |
Adversarial Risk via Optimal Transport and Optimal Couplings | http://arxiv.org/abs/1912.02794v2 |
Adversarial Robustness Guarantees for Classification with Gaussian Processes | http://arxiv.org/abs/1905.11876v3 |
Adversarial Robustness for Code | http://arxiv.org/abs/2002.04694v2 |
Adversarial Robustness of Flow-Based Generative Models | http://arxiv.org/abs/1911.08654v1 |
Adversarial Self-Supervised Data-Free Distillation for Text Classification | http://arxiv.org/abs/2010.04883v1 |
Adversarial Semantic Collisions | http://arxiv.org/abs/2011.04743v1 |
Adversarial Training for Commonsense Inference | http://arxiv.org/abs/2005.08156v1 |
Adversarial Training for Satire Detection: Controlling for Confounding Variables | http://arxiv.org/abs/1902.11145v2 |
Adversarial attacks on Copyright Detection Systems | http://arxiv.org/abs/1906.07153v2 |
Adversarial representation learning for private speech generation | http://arxiv.org/abs/2006.09114v2 |
Adversarial training for multi-context joint entity and relation extraction | http://arxiv.org/abs/1808.06876v3 |
Affect-LM: A Neural Language Model for Customizable Affective Text Generation | http://arxiv.org/abs/1704.06851v1 |
Afro-MNIST: Synthetic generation of MNIST-style datasets for low-resource languages | http://arxiv.org/abs/2009.13509v1 |
Agent57: Outperforming the Atari Human Benchmark | http://arxiv.org/abs/2003.13350v1 |
Aggregation of Multiple Knockoffs | http://arxiv.org/abs/2002.09269v2 |
Algorithmic Recourse: from Counterfactual Explanations to Interventions | http://arxiv.org/abs/2002.06278v4 |
Algorithms and SQ Lower Bounds for PAC Learning One-Hidden-Layer ReLU Networks | http://arxiv.org/abs/2006.12476v1 |
Aligned Cross Entropy for Non-Autoregressive Machine Translation | http://arxiv.org/abs/2004.01655v1 |
Alignment-based compositional semantics for instruction following | http://arxiv.org/abs/1508.06491v2 |
All Fingers are not Equal: Intensity of References in Scientific Articles | http://arxiv.org/abs/1609.00081v1 |
All in the Exponential Family: Bregman Duality in Thermodynamic Variational Inference | http://arxiv.org/abs/2007.00642v1 |
Alleviating Privacy Attacks via Causal Learning | http://arxiv.org/abs/1909.12732v4 |
Almost Tune-Free Variance Reduction | http://arxiv.org/abs/1908.09345v2 |
Almost-Matching-Exactly for Treatment Effect Estimation under Network Interference | http://arxiv.org/abs/2003.00964v1 |
AmbigQA: Answering Ambiguous Open-domain Questions | http://arxiv.org/abs/2004.10645v2 |
Amharic Abstractive Text Summarization | http://arxiv.org/abs/2003.13721v1 |
Amodal 3D Reconstruction for Robotic Manipulation via Stability and Connectivity | http://arxiv.org/abs/2009.13146v1 |
Amortised Learning by Wake-Sleep | http://arxiv.org/abs/2002.09737v2 |
Amortized Inference of Variational Bounds for Learning Noisy-OR | http://arxiv.org/abs/1906.02428v2 |
Amortized Population Gibbs Samplers with Neural Sufficient Statistics | http://arxiv.org/abs/1911.01382v3 |
Amortized learning of neural causal representations | http://arxiv.org/abs/2008.09301v1 |
An AMR Aligner Tuned by Transition-based Parser | http://arxiv.org/abs/1810.03541v1 |
An Accelerated DFO Algorithm for Finite-sum Convex Functions | http://arxiv.org/abs/2007.03311v2 |
An Analysis of Action Recognition Datasets for Language and Vision Tasks | http://arxiv.org/abs/1704.07129v1 |
An Analysis of the Utility of Explicit Negative Examples to Improve the Syntactic Abilities of Neural Language Models | http://arxiv.org/abs/2004.02451v3 |
An EM Approach to Non-autoregressive Conditional Sequence Generation | http://arxiv.org/abs/2006.16378v1 |
An Effective Approach to Unsupervised Machine Translation | http://arxiv.org/abs/1902.01313v2 |
An Effective Transition-based Model for Discontinuous NER | http://arxiv.org/abs/2004.13454v1 |
An Effectiveness Metric for Ordinal Classification: Formal Properties and Experimental Results | http://arxiv.org/abs/2006.01245v1 |
An Embarrassingly Simple Approach for Transfer Learning from Pretrained Language Models | http://arxiv.org/abs/1902.10547v3 |
An Empirical Investigation Towards Efficient Multi-Domain Language Model Pre-training | http://arxiv.org/abs/2010.00784v1 |
An Empirical Investigation of Contextualized Number Prediction | http://arxiv.org/abs/2011.07961v1 |
An Empirical Investigation of Global and Local Normalization for Recurrent Neural Sequence Models Using a Continuous Relaxation to Beam Search | http://arxiv.org/abs/1904.06834v1 |
An Empirical Study of Generation Order for Machine Translation | http://arxiv.org/abs/1910.13437v1 |
An Empirical Study of Pre-trained Transformers for Arabic Information Extraction | http://arxiv.org/abs/2004.14519v5 |
An Empirical Study on Large-Scale Multi-Label Text Classification Including Few and Zero-Shot Labels | http://arxiv.org/abs/2010.01653v1 |
An Empirical Study on Model-agnostic Debiasing Strategies for Robust Natural Language Inference | http://arxiv.org/abs/2010.03777v2 |
An Empirical Study on Robustness to Spurious Correlations using Pre-trained Language Models | http://arxiv.org/abs/2007.06778v3 |
An Experiment on Leveraging SHAP Values to Investigate Racial Bias | http://arxiv.org/abs/2011.09865v1 |
An Explicitly Relational Neural Network Architecture | http://arxiv.org/abs/1905.10307v4 |
An Exploration of Arbitrary-Order Sequence Labeling via Energy-Based Inference Networks | http://arxiv.org/abs/2010.02789v1 |
An Exploratory Study of Argumentative Writing by Young Students: A Transformer-based Approach | http://arxiv.org/abs/2006.09873v1 |
An Imitation Game for Learning Semantic Parsers from User Interaction | http://arxiv.org/abs/2005.00689v3 |
An Imitation Learning Approach for Cache Replacement | http://arxiv.org/abs/2006.16239v2 |
An Imitation Learning Approach to Unsupervised Parsing | http://arxiv.org/abs/1906.02276v1 |
An Interpretable Knowledge Transfer Model for Knowledge Base Completion | http://arxiv.org/abs/1704.05908v2 |
An Inverse-free Truncated Rayleigh-Ritz Method for Sparse Generalized Eigenvalue Problem | http://arxiv.org/abs/2003.10897v1 |
An Investigation of Why Overparameterization Exacerbates Spurious Correlations | http://arxiv.org/abs/2005.04345v3 |
An Optimal Algorithm for Adversarial Bandits with Arbitrary Delays | http://arxiv.org/abs/1910.06054v2 |
An Unsupervised Joint System for Text Generation from Knowledge Graphs and Semantic Parsing | http://arxiv.org/abs/1904.09447v4 |
An Unsupervised Method for Uncovering Morphological Chains | http://arxiv.org/abs/1503.02335v1 |
An Unsupervised Probability Model for Speech-to-Translation Alignment of Low-Resource Languages | http://arxiv.org/abs/1609.08139v1 |
An end-to-end Differentially Private Latent Dirichlet Allocation Using a Spectral Algorithm | http://arxiv.org/abs/1805.10341v3 |
An end-to-end approach for the verification problem: learning the right distance | http://arxiv.org/abs/2002.09469v4 |
An information theoretic view on selecting linguistic probes | http://arxiv.org/abs/2009.07364v2 |
Analogies minus analogy test: measuring regularities in word embeddings | http://arxiv.org/abs/2010.03446v1 |
Analogous Process Structure Induction for Sub-event Sequence Prediction | http://arxiv.org/abs/2010.08525v1 |
Analogs of Linguistic Structure in Deep Representations | http://arxiv.org/abs/1707.08139v1 |
Analysing Lexical Semantic Change with Contextualised Word Representations | http://arxiv.org/abs/2004.14118v1 |
Analysis of Automatic Annotation Suggestions for Hard Discourse-Level Tasks in Expert Domains | http://arxiv.org/abs/1906.02564v1 |
Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks | http://arxiv.org/abs/2002.06597v1 |
Analyzing Individual Neurons in Pre-trained Language Models | http://arxiv.org/abs/2010.02695v1 |
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned | http://arxiv.org/abs/1905.09418v2 |
Analyzing Neural Discourse Coherence Models | http://arxiv.org/abs/2011.06306v1 |
Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings | http://arxiv.org/abs/1904.01596v2 |
Analyzing Political Parody in Social Media | http://arxiv.org/abs/2004.13878v2 |
Analyzing Redundancy in Pretrained Transformer Models | http://arxiv.org/abs/2004.04010v2 |
Analyzing analytical methods: The case of phonology in neural models of spoken language | http://arxiv.org/abs/2004.07070v2 |
Analyzing autoencoder-based acoustic word embeddings | http://arxiv.org/abs/2004.01647v1 |
Analyzing the Limitations of Cross-lingual Word Embedding Mappings | http://arxiv.org/abs/1906.05407v1 |
Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics | http://arxiv.org/abs/2007.07400v1 |
Anchored Correlation Explanation: Topic Modeling with Minimal Domain Knowledge | http://arxiv.org/abs/1611.10277v4 |
Anchoring and Agreement in Syntactic Annotations | http://arxiv.org/abs/1605.04481v3 |
Anderson Acceleration of Proximal Gradient Methods | http://arxiv.org/abs/1910.08590v2 |
Angular Visual Hardness | http://arxiv.org/abs/1912.02279v4 |
Answer-based Adversarial Training for Generating Clarification Questions | http://arxiv.org/abs/1904.02281v1 |
Approaching Neural Grammatical Error Correction as a Low-Resource Machine Translation Task | http://arxiv.org/abs/1804.05940v1 |
Approximate Cross-Validation in High Dimensions with Guarantees | http://arxiv.org/abs/1905.13657v4 |
Approximate Cross-validation: Guarantees for Model Assessment and Selection | http://arxiv.org/abs/2003.00617v2 |
Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions | http://arxiv.org/abs/1910.06862v1 |
Approximate is Good Enough: Probabilistic Variants of Dimensional and Margin Complexity | http://arxiv.org/abs/2003.04180v1 |
Approximating Stacked and Bidirectional Recurrent Architectures with the Delayed Recurrent Neural Network | http://arxiv.org/abs/1909.00021v2 |
Approximation Capabilities of Neural ODEs and Invertible Residual Networks | http://arxiv.org/abs/1907.12998v2 |
Approximation Guarantees of Local Search Algorithms via Localizability of Set Functions | http://arxiv.org/abs/2006.01400v1 |
Approximation Schemes for ReLU Regression | http://arxiv.org/abs/2005.12844v2 |
Approximation-Aware Dependency Parsing by Belief Propagation | http://arxiv.org/abs/1508.02375v1 |
AraDIC: Arabic Document Classification using Image-Based Character Embeddings and Class-Balanced Loss | http://arxiv.org/abs/2006.11586v1 |
Arc-swift: A Novel Transition System for Dependency Parsing | http://arxiv.org/abs/1705.04434v1 |
Architecture Agnostic Neural Networks | http://arxiv.org/abs/2011.02712v2 |
Are All Good Word Vector Spaces Isomorphic? | http://arxiv.org/abs/2004.04070v2 |
Are All Languages Created Equal in Multilingual BERT? | http://arxiv.org/abs/2005.09093v2 |
Are BLEU and Meaning Representation in Opposition? | http://arxiv.org/abs/1805.06536v1 |
Are Hyperbolic Representations in Graphs Created Equal? | http://arxiv.org/abs/2007.07698v1 |
Are Natural Language Inference Models IMPPRESsive? Learning IMPlicature and PRESupposition | http://arxiv.org/abs/2004.03066v2 |
Are Pretrained Language Models Symbolic Reasoners Over Knowledge? | http://arxiv.org/abs/2006.10413v2 |
Are Some Words Worth More than Others? | http://arxiv.org/abs/2010.06069v2 |
Are You Convinced? Choosing the More Convincing Evidence with a Siamese Network | http://arxiv.org/abs/1907.08971v2 |
Argument Generation with Retrieval, Planning, and Realization | http://arxiv.org/abs/1906.03717v1 |
Argument Invention from First Principles | http://arxiv.org/abs/1908.08336v1 |
Argument Mining for Understanding Peer Reviews | http://arxiv.org/abs/1903.10104v1 |
Argument Mining with Structured SVMs and RNNs | http://arxiv.org/abs/1704.06869v1 |
Artemis: A Novel Annotation Methodology for Indicative Single Document Summarization | http://arxiv.org/abs/2005.02146v2 |
Artificial Intelligence for Global Health: Learning From a Decade of Digital Transformation in Health Care | http://arxiv.org/abs/2005.12378v2 |
Asking and Answering Questions to Evaluate the Factual Consistency of Summaries | http://arxiv.org/abs/2004.04228v1 |
Asking without Telling: Exploring Latent Ontologies in Contextual Representations | http://arxiv.org/abs/2004.14513v2 |
Aspect Level Sentiment Classification with Deep Memory Network | http://arxiv.org/abs/1605.08900v2 |
Assessing Human Translations from French to Bambara for Machine Learning: a Pilot Study | http://arxiv.org/abs/2004.00068v1 |
Assessing Phrasal Representation and Composition in Transformers | http://arxiv.org/abs/2010.03763v2 |
Assessing Robustness to Noise: Low-Cost Head CT Triage | http://arxiv.org/abs/2003.07977v2 |
Assessing racial inequality in COVID-19 testing with Bayesian threshold tests | http://arxiv.org/abs/2011.01179v1 |
Assessing the Ability of Self-Attention Networks to Learn Word Order | http://arxiv.org/abs/1906.00592v1 |
Assessing the Helpfulness of Learning Materials with Inference-Based Learner-Like Agent | http://arxiv.org/abs/2010.02179v1 |
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders | http://arxiv.org/abs/2006.16540v2 |
Asymmetric Private Set Intersection with Applications to Contact Tracing and Private Vertical Federated Machine Learning | http://arxiv.org/abs/2011.09350v1 |
Asymmetric self-play for automatic goal discovery in robotic manipulation | http://arxiv.org/abs/2101.04882v1 |
Asymptotic Analysis of Sampling Estimators for Randomized Numerical Linear Algebra Algorithms | http://arxiv.org/abs/2002.10526v1 |
Asymptotically Efficient Off-Policy Evaluation for Tabular Reinforcement Learning | http://arxiv.org/abs/2001.10742v1 |
Asynchronous Gibbs Sampling | http://arxiv.org/abs/1509.08999v7 |
Attacking Neural Text Detectors | http://arxiv.org/abs/2002.11768v3 |
Attend to Medical Ontologies: Content Selection for Clinical Abstractive Summarization | http://arxiv.org/abs/2005.00163v1 |
Attending the Emotions to Detect Online Abusive Language | http://arxiv.org/abs/1909.03100v1 |
Attention Guided Graph Convolutional Networks for Relation Extraction | http://arxiv.org/abs/1906.07510v8 |
Attention Is All You Need for Chinese Word Segmentation | http://arxiv.org/abs/1910.14537v3 |
Attention Strategies for Multi-Source Sequence-to-Sequence Learning | http://arxiv.org/abs/1704.06567v1 |
Attention is Not Only a Weight: Analyzing Transformers with Vector Norms | http://arxiv.org/abs/2004.10102v2 |
Attention is not Explanation | http://arxiv.org/abs/1902.10186v3 |
Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation | http://arxiv.org/abs/1904.07209v1 |
Attention-over-Attention Neural Networks for Reading Comprehension | http://arxiv.org/abs/1607.04423v4 |
Attentive Group Equivariant Convolutional Networks | http://arxiv.org/abs/2002.03830v3 |
Audio-Visual Understanding of Passenger Intents for In-Cabin Conversational Agents | http://arxiv.org/abs/2007.03876v1 |
Augmented Natural Language for Generative Sequence Labeling | http://arxiv.org/abs/2009.13272v1 |
Augmenting Data for Sarcasm Detection with Unlabeled Conversation Context | http://arxiv.org/abs/2006.06259v1 |
Augmenting Neural Networks with First-order Logic | http://arxiv.org/abs/1906.06298v3 |
Augmenting word2vec with latent Dirichlet allocation within a clinical application | http://arxiv.org/abs/1808.03967v1 |
Author Commitment and Social Power: Automatic Belief Tagging to Infer the Social Context of Interactions | http://arxiv.org/abs/1805.06016v1 |
Auto-Rotating Perceptrons | http://arxiv.org/abs/1910.02483v2 |
Auto-Sizing Neural Networks: With Applications to n-gram Language Models | http://arxiv.org/abs/1508.05051v1 |
AutoExtend: Extending Word Embeddings to Embeddings for Synsets and Lexemes | http://arxiv.org/abs/1507.01127v1 |
AutoGluon-Tabular: Robust and Accurate AutoML for Structured Data | http://arxiv.org/abs/2003.06505v1 |
AutoML-Zero: Evolving Machine Learning Algorithms From Scratch | http://arxiv.org/abs/2003.03384v2 |
Autoencoder as Assistant Supervisor: Improving Text Representation for Chinese Social Media Text Summarization | http://arxiv.org/abs/1805.04869v1 |
Autoencoding Pixies: Amortised Variational Inference with Graph Convolutions for Functional Distributional Semantics | http://arxiv.org/abs/2005.02991v2 |
Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models | http://arxiv.org/abs/2002.11451v1 |
Automated Topical Component Extraction Using Neural Network Attention Scores from Source-based Essay Scoring | http://arxiv.org/abs/2008.01809v1 |
Automatic Detection of Generated Text is Easiest when Humans are Fooled | http://arxiv.org/abs/1911.00650v2 |
Automatic Differentiation of Some First-Order Methods in Parametric Optimization | http://arxiv.org/abs/1910.05696v1 |
Automatic Estimation of Simultaneous Interpreter Performance | http://arxiv.org/abs/1805.04016v2 |
Automatic Event Salience Identification | http://arxiv.org/abs/1809.00647v1 |
Automatic Extraction of Rules Governing Morphological Agreement | http://arxiv.org/abs/2010.01160v2 |
Automatic Generation of High Quality CCGbanks for Parser Domain Adaptation | http://arxiv.org/abs/1906.01834v1 |
Automatic Metric Validation for Grammatical Error Correction | http://arxiv.org/abs/1804.11225v2 |
Automatic Reference-Based Evaluation of Pronoun Translation Misses the Point | http://arxiv.org/abs/1808.04164v1 |
Automatic Shortcut Removal for Self-Supervised Representation Learning | http://arxiv.org/abs/2002.08822v3 |
Automatic semantic segmentation for prediction of tuberculosis using lens-free microscopy images | http://arxiv.org/abs/2007.02482v1 |
Automatically Identifying Complaints in Social Media | http://arxiv.org/abs/1906.03890v1 |
Automatically Ranked Russian Paraphrase Corpus for Text Generation | http://arxiv.org/abs/2006.09719v1 |
Autoregressive Knowledge Distillation through Imitation Learning | http://arxiv.org/abs/2009.07253v2 |
Average-case Acceleration Through Spectral Density Estimation | http://arxiv.org/abs/2002.04756v5 |
Avoiding Reasoning Shortcuts: Adversarial Evaluation, Training, and Model Development for Multi-Hop QA | http://arxiv.org/abs/1906.07132v1 |
Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training | http://arxiv.org/abs/2004.07790v4 |
AxCell: Automatic Extraction of Results from Machine Learning Papers | http://arxiv.org/abs/2004.14356v1 |
BAE: BERT-based Adversarial Examples for Text Classification | http://arxiv.org/abs/2004.01970v3 |
BAM! Born-Again Multi-Task Networks for Natural Language Understanding | http://arxiv.org/abs/1907.04829v1 |
BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension | http://arxiv.org/abs/1910.13461v1 |
BERT Fine-tuning For Arabic Text Summarization | http://arxiv.org/abs/2004.14135v1 |
BERT Knows Punta Cana is not just beautiful, it's gorgeous: Ranking Scalar Adjectives with Contextualised Representations | http://arxiv.org/abs/2010.02686v1 |
BERT-ATTACK: Adversarial Attack Against BERT Using BERT | http://arxiv.org/abs/2004.09984v3 |
BERT-EMD: Many-to-Many Layer Mapping for BERT Compression with Earth Mover's Distance | http://arxiv.org/abs/2010.06133v1 |
BERT-XML: Large Scale Automated ICD Coding Using BERT Pretraining | http://arxiv.org/abs/2006.03685v1 |
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing | http://arxiv.org/abs/2002.02925v4 |
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding | http://arxiv.org/abs/1810.04805v2 |
BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance | http://arxiv.org/abs/1910.07181v3 |
BERTgrid: Contextualized Embedding for 2D Document Representation and Understanding | http://arxiv.org/abs/1909.04948v2 |
BERTs of a feather do not generalize together: Large variability in generalization across models with similar test set performance | http://arxiv.org/abs/1911.02969v2 |
BINOCULARS for Efficient, Nonmyopic Sequential Experimental Design | http://arxiv.org/abs/1909.04568v3 |
BLEU Neighbors: A Reference-less Approach to Automatic Evaluation | http://arxiv.org/abs/2004.12726v3 |
BLEU might be Guilty but References are not Innocent | http://arxiv.org/abs/2004.06063v2 |
BLEURT: Learning Robust Metrics for Text Generation | http://arxiv.org/abs/2004.04696v5 |
BPE-Dropout: Simple and Effective Subword Regularization | http://arxiv.org/abs/1910.13267v2 |
BabyAI++: Towards Grounded-Language Learning beyond Memorization | http://arxiv.org/abs/2004.07200v1 |
BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps | http://arxiv.org/abs/2005.04625v2 |
Back to the Future: Unsupervised Backprop-based Decoding for Counterfactual and Abductive Commonsense Reasoning | http://arxiv.org/abs/2010.05906v3 |
Backpropagating through Structured Argmax using a SPIGOT | http://arxiv.org/abs/1805.04658v1 |
Balanced off-policy evaluation in general action spaces | http://arxiv.org/abs/1906.03694v4 |
Balancing Competing Objectives with Noisy Data: Score-Based Classifiers for Welfare-Aware Machine Learning | http://arxiv.org/abs/2003.06740v4 |
Balancing Cost and Benefit with Tied-Multi Transformers | http://arxiv.org/abs/2002.08614v1 |
Balancing Gaussian vectors in high dimension | http://arxiv.org/abs/1910.13972v2 |
Balancing Objectives in Counseling Conversations: Advancing Forwards or Looking Backwards | http://arxiv.org/abs/2005.04245v1 |
Balancing Training for Multilingual Neural Machine Translation | http://arxiv.org/abs/2004.06748v4 |
Bandit Convex Optimization in Non-stationary Environments | http://arxiv.org/abs/1907.12340v2 |
Bandit optimisation of functions in the Matérn kernel RKHS | http://arxiv.org/abs/2001.10396v2 |
BanditSum: Extractive Summarization as a Contextual Bandit | http://arxiv.org/abs/1809.09672v3 |
Bandits for BMO Functions | http://arxiv.org/abs/2007.08703v1 |
Bandits with adversarial scaling | http://arxiv.org/abs/2003.02287v2 |
Barking up the right tree: an approach to search over molecule synthesis DAGs | http://arxiv.org/abs/2012.11522v1 |
BasisVAE: Translation-invariant feature-level clustering with Variational Autoencoders | http://arxiv.org/abs/2003.03462v1 |
Batch Stationary Distribution Estimation | http://arxiv.org/abs/2003.00722v1 |
Batch-Constrained Distributional Reinforcement Learning for Session-based Recommendation | http://arxiv.org/abs/2012.08984v1 |
Batched Multi-armed Bandits Problem | http://arxiv.org/abs/1904.01763v3 |
Bayesian Differential Privacy for Machine Learning | http://arxiv.org/abs/1901.09697v5 |
Bayesian Experimental Design for Implicit Models by Mutual Information Neural Estimation | http://arxiv.org/abs/2002.08129v3 |
Bayesian Graph Neural Networks with Adaptive Connection Sampling | http://arxiv.org/abs/2006.04064v3 |
Bayesian Hierarchical Words Representation Learning | http://arxiv.org/abs/2004.07126v1 |
Bayesian Image Classification with Deep Convolutional Gaussian Processes | http://arxiv.org/abs/1902.05888v2 |
Bayesian Learning from Sequential Data using Gaussian Processes with Signature Covariances | http://arxiv.org/abs/1906.08215v2 |
Bayesian Optimisation over Multiple Continuous and Categorical Inputs | http://arxiv.org/abs/1906.08878v2 |
Bayesian Optimization for Iterative Learning | http://arxiv.org/abs/1909.09593v4 |
Bayesian Optimization of Text Representations | http://arxiv.org/abs/1503.00693v1 |
Bayesian Reinforcement Learning via Deep, Sparse Sampling | http://arxiv.org/abs/1902.02661v4 |
Bayesian aggregation improves traditional single image crop classification approaches | http://arxiv.org/abs/2004.03468v1 |
Bayesian experimental design using regularized determinantal point processes | http://arxiv.org/abs/1906.04133v1 |
Be More with Less: Hypergraph Attention Networks for Inductive Text Classification | http://arxiv.org/abs/2011.00387v1 |
BeBold: Exploration Beyond the Boundary of Explored Regions | http://arxiv.org/abs/2012.08621v1 |
Before Name-calling: Dynamics and Triggers of Ad Hominem Fallacies in Web Argumentation | http://arxiv.org/abs/1802.06613v2 |
Behavior Analysis of NLI Models: Uncovering the Influence of Three Factors on Robustness | http://arxiv.org/abs/1805.04212v1 |
Being Bayesian, Even Just a Bit, Fixes Overconfidence in ReLU Networks | http://arxiv.org/abs/2002.10118v2 |
Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets | http://arxiv.org/abs/1704.07121v2 |
Benchmarking Graph Neural Networks | http://arxiv.org/abs/2003.00982v3 |
Benchmarking Multimodal Regex Synthesis with Complex Structures | http://arxiv.org/abs/2005.00663v1 |
Best Arm Identification for Cascading Bandits in the Fixed Confidence Setting | http://arxiv.org/abs/2001.08655v3 |
Best-First Beam Search | http://arxiv.org/abs/2007.03909v2 |
Best-item Learning in Random Utility Models with Subset Choices | http://arxiv.org/abs/2002.07994v1 |
Beta Embeddings for Multi-Hop Logical Reasoning in Knowledge Graphs | http://arxiv.org/abs/2010.11465v1 |
Better Depth-Width Trade-offs for Neural Networks through the lens of Dynamical Systems | http://arxiv.org/abs/2003.00777v2 |
Better Document-Level Machine Translation with Bayes' Rule | http://arxiv.org/abs/1910.00553v2 |
Better Highlighting: Creating Sub-Sentence Summary Highlights | http://arxiv.org/abs/2010.10566v1 |
Better Long-Range Dependency By Bootstrapping A Mutual Information Regularizer | http://arxiv.org/abs/1905.11978v2 |
Beyond Accuracy: Behavioral Testing of NLP models with CheckList | http://arxiv.org/abs/2005.04118v1 |
Beyond Error Propagation in Neural Machine Translation: Characteristics of Language Also Matter | http://arxiv.org/abs/1809.00120v2 |
Beyond Exponentially Discounted Sum: Automatic Learning of Return Function | http://arxiv.org/abs/1905.11591v2 |
Beyond Instructional Videos: Probing for More Diverse Visual-Textual Grounding on YouTube | http://arxiv.org/abs/2004.14338v2 |
Beyond Synthetic Noise: Deep Learning on Controlled Noisy Labels | http://arxiv.org/abs/1911.09781v3 |
Beyond UCB: Optimal and Efficient Contextual Bandits with Regression Oracles | http://arxiv.org/abs/2002.04926v2 |
Beyond User Self-Reported Likert Scale Ratings: A Comparison Model for Automatic Dialog Evaluation | http://arxiv.org/abs/2005.10716v2 |
Beyond exploding and vanishing gradients: analysing RNN training using attractors and smoothness | http://arxiv.org/abs/1906.08482v3 |
Beyond task success: A closer look at jointly learning to see, ask, and GuessWhat | http://arxiv.org/abs/1809.03408v2 |
Bi-Level Graph Neural Networks for Drug-Drug Interaction Prediction | http://arxiv.org/abs/2006.14002v1 |
Bi-directional Attention with Agreement for Dependency Parsing | http://arxiv.org/abs/1608.02076v2 |
BiST: Bi-directional Spatio-Temporal Reasoning for Video-Grounded Dialogues | http://arxiv.org/abs/2010.10095v1 |
Bidirectional Attentive Memory Networks for Question Answering over Knowledge Bases | http://arxiv.org/abs/1903.02188v3 |
Bidirectional Model-based Policy Optimization | http://arxiv.org/abs/2007.01995v2 |
Bilingual Dictionary Based Neural Machine Translation without Using Parallel Sentences | http://arxiv.org/abs/2007.02671v1 |
Bilingual Lexicon Induction through Unsupervised Machine Translation | http://arxiv.org/abs/1907.10761v1 |
Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces | http://arxiv.org/abs/1908.06625v1 |
Bio-Inspired Hashing for Unsupervised Similarity Search | http://arxiv.org/abs/2001.04907v2 |
BioMegatron: Larger Biomedical Domain Language Model | http://arxiv.org/abs/2010.06060v2 |
Biomedical Entity Representations with Synonym Marginalization | http://arxiv.org/abs/2005.00239v1 |
Biomedical Information Extraction for Disease Gene Prioritization | http://arxiv.org/abs/2011.05188v2 |
Bipartite Flat-Graph Network for Nested Named Entity Recognition | http://arxiv.org/abs/2005.00436v1 |
Birds have four legs?! NumerSense: Probing Numerical Commonsense Knowledge of Pre-trained Language Models | http://arxiv.org/abs/2005.00683v2 |
Bisect and Conquer: Hierarchical Clustering via Max-Uncut Bisection | http://arxiv.org/abs/1912.06983v1 |
Black Box Submodular Maximization: Discrete and Continuous Settings | http://arxiv.org/abs/1901.09515v2 |
Black Loans Matter: Distributionally Robust Fairness for Fighting Subgroup Discrimination | http://arxiv.org/abs/2012.01193v1 |
Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings | http://arxiv.org/abs/1904.04047v3 |
Black-box Certification and Learning under Adversarial Perturbations | http://arxiv.org/abs/2006.16520v1 |
Black-box Methods for Restoring Monotonicity | http://arxiv.org/abs/2003.09554v1 |
Blank Language Models | http://arxiv.org/abs/2002.03079v2 |
Bleaching Text: Abstract Features for Cross-lingual Gender Prediction | http://arxiv.org/abs/1805.03122v1 |
BoXHED: Boosted eXact Hazard Estimator with Dynamic covariates | http://arxiv.org/abs/2006.14218v2 |
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions | http://arxiv.org/abs/1905.10044v1 |
Boosting Entity Linking Performance by Leveraging Unlabeled Documents | http://arxiv.org/abs/1906.01250v1 |
Boosting Frank-Wolfe by Chasing Gradients | http://arxiv.org/abs/2003.06369v2 |
Boosting for Control of Dynamical Systems | http://arxiv.org/abs/1906.08720v2 |
Bootstrap Latent-Predictive Representations for Multitask Reinforcement Learning | http://arxiv.org/abs/2004.14646v1 |
Bootstrapped Q-learning with Context Relevant Observation Pruning to Generalize in Text-based Games | http://arxiv.org/abs/2009.11896v1 |
Bootstrapping Generators from Noisy Data | http://arxiv.org/abs/1804.06385v4 |
Bootstrapping Named Entity Recognition in E-Commerce with Positive Unlabeled Learning | http://arxiv.org/abs/2005.11075v1 |
Bootstrapping Techniques for Polysynthetic Morphological Analysis | http://arxiv.org/abs/2005.00956v1 |
Born-Again Tree Ensembles | http://arxiv.org/abs/2003.11132v3 |
Bounding, Concentrating, and Truncating: Unifying Privacy Loss Composition for Data Analytics | http://arxiv.org/abs/2004.07223v3 |
Bounds in Query Learning | http://arxiv.org/abs/1904.10122v1 |
Break It Down: A Question Understanding Benchmark | http://arxiv.org/abs/2001.11770v1 |
Breaking NLI Systems with Sentences that Require Simple Lexical Inferences | http://arxiv.org/abs/1805.02266v1 |
Breaking the Curse of Many Agents: Provable Mean Embedding Q-Iteration for Mean-Field Reinforcement Learning | http://arxiv.org/abs/2006.11917v1 |
Breaking the Curse of Space Explosion: Towards Efficient NAS with Curriculum Search | http://arxiv.org/abs/2007.07197v2 |
Breast Cancer Detection Using Convolutional Neural Networks | http://arxiv.org/abs/2003.07911v3 |
Bridging Anaphora Resolution as Question Answering | http://arxiv.org/abs/2004.07898v3 |
Bridging Information-Seeking Human Gaze and Machine Reading Comprehension | http://arxiv.org/abs/2009.14780v2 |
Bridging Linguistic Typology and Multilingual Machine Translation with Multi-View Language Representations | http://arxiv.org/abs/2004.14923v2 |
Bridging the Gap between Training and Inference for Neural Machine Translation | http://arxiv.org/abs/1906.02448v2 |
Bringing Stories Alive: Generating Interactive Fiction Worlds | http://arxiv.org/abs/2001.10161v1 |
Budget Learning via Bracketing | http://arxiv.org/abs/2004.06298v1 |
Budget-Constrained Bandits over General Cost and Reward Distributions | http://arxiv.org/abs/2003.00365v1 |
C-Learning: Horizon-Aware Cumulative Accessibility Estimation | http://arxiv.org/abs/2011.12363v2 |
C-Learning: Learning to Achieve Goals via Recursive Classification | http://arxiv.org/abs/2011.08909v1 |
CAT-Gen: Improving Robustness in NLP Models via Controlled Adversarial Text Generation | http://arxiv.org/abs/2010.02338v1 |
CAUSE: Learning Granger Causality from Event Sequences using Attribution Methods | http://arxiv.org/abs/2002.07906v1 |
CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management | http://arxiv.org/abs/2005.03975v3 |
CDL: Curriculum Dual Learning for Emotion-Controllable Response Generation | http://arxiv.org/abs/2005.00329v5 |
CITE: A Corpus of Image-Text Discourse Relations | http://arxiv.org/abs/1904.06286v2 |
CLEVR Parser: A Graph Parser Library for Geometric Learning on Language Grounded Image Scenes | http://arxiv.org/abs/2009.09154v2 |
CLEVR-Dialog: A Diagnostic Dataset for Multi-Round Reasoning in Visual Dialog | http://arxiv.org/abs/1903.03166v2 |
CLUB: A Contrastive Log-ratio Upper Bound of Mutual Information | http://arxiv.org/abs/2006.12013v6 |
CNM: An Interpretable Complex-valued Network for Matching | http://arxiv.org/abs/1904.05298v1 |
CNN-based Approach for Cervical Cancer Classification in Whole-Slide Histopathology Images | http://arxiv.org/abs/2005.13924v1 |
COD3S: Diverse Generation with Discrete Semantic Signatures | http://arxiv.org/abs/2010.02882v1 |
COMET: A Neural Framework for MT Evaluation | http://arxiv.org/abs/2009.09025v2 |
COMETA: A Corpus for Medical Entity Linking in the Social Media | http://arxiv.org/abs/2010.03295v2 |
COVID-19 Literature Topic-Based Search via Hierarchical NMF | http://arxiv.org/abs/2009.09074v1 |
CUNI Systems for the Unsupervised and Very Low Resource Translation Task in WMT20 | http://arxiv.org/abs/2010.11747v1 |
CURL: Contrastive Unsupervised Representations for Reinforcement Learning | http://arxiv.org/abs/2004.04136v4 |
Calibrated Language Model Fine-Tuning for In- and Out-of-Distribution Data | http://arxiv.org/abs/2010.11506v1 |
Calibrated Surrogate Losses for Adversarially Robust Classification | http://arxiv.org/abs/2005.13748v1 |
Calibrated Surrogate Maximization of Linear-fractional Utility in Binary Classification | http://arxiv.org/abs/1905.12511v2 |
Calibrated Top-1 Uncertainty estimates for classification by score based models | http://arxiv.org/abs/1903.09215v4 |
Calibrating Structured Output Predictors for Natural Language Processing | http://arxiv.org/abs/2004.04361v2 |
Calibration of Pre-trained Transformers | http://arxiv.org/abs/2003.07892v3 |
Calibration, Entropy Rates, and Memory in Language Models | http://arxiv.org/abs/1906.05664v1 |
CamemBERT: a Tasty French Language Model | http://arxiv.org/abs/1911.03894v3 |
Can Automatic Post-Editing Improve NMT? | http://arxiv.org/abs/2009.14395v1 |
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? | http://arxiv.org/abs/2006.14911v2 |
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? | http://arxiv.org/abs/2003.01629v2 |
Can Neural Machine Translation be Improved with User Feedback? | http://arxiv.org/abs/1804.05958v1 |
Can You Put it All Together: Evaluating Conversational Agents' Ability to Blend Skills | http://arxiv.org/abs/2004.08449v1 |
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering | http://arxiv.org/abs/1809.02789v1 |
Carbontracker: Tracking and Predicting the Carbon Footprint of Training Deep Learning Models | http://arxiv.org/abs/2007.03051v1 |
Cascaded Mutual Modulation for Visual Reasoning | http://arxiv.org/abs/1809.01943v1 |
Catch Me if I Can: Detecting Strategic Behaviour in Peer Assessment | http://arxiv.org/abs/2010.04041v1 |
Categorical Metadata Representation for Customized Text Classification | http://arxiv.org/abs/1902.05196v1 |
Catplayinginthesnow: Impact of Prior Segmentation on a Model of Visually Grounded Speech | http://arxiv.org/abs/2006.08387v2 |
Causal Bayesian Optimization | http://arxiv.org/abs/2005.11741v2 |
Causal Curiosity: RL Agents Discovering Self-supervised Experiments for Causal Representation Learning | http://arxiv.org/abs/2010.03110v1 |
Causal Effect Estimation and Optimal Dose Suggestions in Mobile Health | http://arxiv.org/abs/2007.09812v2 |
Causal Feature Discovery through Strategic Modification | http://arxiv.org/abs/2002.07024v2 |
Causal Inference of Script Knowledge | http://arxiv.org/abs/2004.01174v1 |
Causal Inference using Gaussian Processes with Structured Latent Confounders | http://arxiv.org/abs/2007.07127v1 |
Causal Learning by a Robot with Semantic-Episodic Memory in an Aesop's Fable Experiment | http://arxiv.org/abs/2003.00274v1 |
Causal Modeling for Fairness in Dynamical Systems | http://arxiv.org/abs/1909.09141v2 |
Causal Structure Discovery from Distributions Arising from Mixtures of DAGs | http://arxiv.org/abs/2001.11940v2 |
Causal inference in degenerate systems: An impossibility result | http://arxiv.org/abs/1711.04466v3 |
Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings | http://arxiv.org/abs/2008.06622v1 |
Censored Quantile Regression Forest | http://arxiv.org/abs/2001.03458v1 |
Certified Data Removal from Machine Learning Models | http://arxiv.org/abs/1911.03030v5 |
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing | http://arxiv.org/abs/2002.03018v4 |
Challenges in Emotion Style Transfer: An Exploration with a Lexical Substitution Pipeline | http://arxiv.org/abs/2005.07617v1 |
Channel Equilibrium Networks for Learning Deep Representation | http://arxiv.org/abs/2003.00214v1 |
Chapter Captor: Text Segmentation in Novels | http://arxiv.org/abs/2011.04163v1 |
CharManteau: Character Embedding Models For Portmanteau Creation | http://arxiv.org/abs/1707.01176v2 |
Character-level Representations Improve DRS-based Semantic Parsing Even in the Age of BERT | http://arxiv.org/abs/2011.04308v1 |
Characterization of Overlap in Observational Studies | http://arxiv.org/abs/1907.04138v3 |
Characterizing Distribution Equivalence and Structure Learning for Cyclic and Acyclic Directed Graphs | http://arxiv.org/abs/1910.12993v3 |
Characterizing Private Clipped Gradient Descent on Convex Generalized Linear Problems | http://arxiv.org/abs/2006.06783v1 |
Characterizing the Latent Space of Molecular Deep Generative Models with Persistent Homology Metrics | http://arxiv.org/abs/2010.08548v1 |
CheXbert: Combining Automatic Labelers and Expert Annotations for Accurate Radiology Report Labeling Using BERT | http://arxiv.org/abs/2004.09167v3 |
Choice Set Optimization Under Discrete Choice Models of Group Decisions | http://arxiv.org/abs/2002.00421v2 |
ChrEn: Cherokee-English Machine Translation for Endangered Language Revitalization | http://arxiv.org/abs/2010.04791v1 |
Circuit-Based Intrinsic Methods to Detect Overfitting | http://arxiv.org/abs/1907.01991v2 |
ClarQ: A large-scale and diverse dataset for Clarification Question Generation | http://arxiv.org/abs/2006.05986v2 |
Classical Structured Prediction Losses for Sequence to Sequence Learning | http://arxiv.org/abs/1711.04956v5 |
Classification with Strategically Withheld Data | http://arxiv.org/abs/2012.10203v2 |
Classifying Syntactic Errors in Learner Language | http://arxiv.org/abs/2010.11032v2 |
Clinical Reading Comprehension: A Thorough Analysis of the emrQA Dataset | http://arxiv.org/abs/2005.00574v1 |
Clinical XLNet: Modeling Sequential Clinical Notes and Predicting Prolonged Mechanical Ventilation | http://arxiv.org/abs/1912.11975v1 |
Closed Loop Neural-Symbolic Learning via Integrating Neural Perception, Grammar Parsing, and Symbolic Reasoning | http://arxiv.org/abs/2006.06649v2 |
Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain | http://arxiv.org/abs/2005.09397v1 |
Closing the convergence gap of SGD without replacement | http://arxiv.org/abs/2002.10400v6 |
Closure Properties for Private Classification and Online Prediction | http://arxiv.org/abs/2003.04509v3 |
Clue: Cross-modal Coherence Modeling for Caption Generation | http://arxiv.org/abs/2005.00908v1 |
CoDEx: A Comprehensive Knowledge Graph Completion Benchmark | http://arxiv.org/abs/2009.07810v2 |
Coach: A Coarse-to-Fine Approach for Cross-domain Slot Filling | http://arxiv.org/abs/2004.11727v1 |
Coarse-to-Fine Decoding for Neural Semantic Parsing | http://arxiv.org/abs/1805.04793v1 |
Code and Named Entity Recognition in StackOverflow | http://arxiv.org/abs/2005.01634v3 |
Code-switching patterns can be an effective route to improve performance of downstream NLP applications: A case study of humour, sarcasm and hate speech detection | http://arxiv.org/abs/2005.02295v1 |
Cognitive Graph for Multi-Hop Reading Comprehension at Scale | http://arxiv.org/abs/1905.05460v2 |
CognitiveCNN: Mimicking Human Cognitive Models to resolve Texture-Shape Bias | http://arxiv.org/abs/2006.14722v1 |
Cold-start Active Learning through Self-supervised Language Modeling | http://arxiv.org/abs/2010.09535v2 |
Collaborative Machine Learning with Incentive-Aware Model Rewards | http://arxiv.org/abs/2010.12797v1 |
Collapsed Amortized Variational Inference for Switching Nonlinear Dynamical Systems | http://arxiv.org/abs/1910.09588v2 |
Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation | http://arxiv.org/abs/1804.08207v2 |
Colorless green recurrent networks dream hierarchically | http://arxiv.org/abs/1803.11138v1 |
Colors in Context: A Pragmatic Neural Model for Grounded Language Understanding | http://arxiv.org/abs/1703.10186v2 |
Combating False Negatives in Adversarial Imitation Learning | http://arxiv.org/abs/2002.00412v1 |
Combining Pretrained High-Resource Embeddings and Subword Representations for Low-Resource Languages | http://arxiv.org/abs/2003.04419v3 |
Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection | http://arxiv.org/abs/2010.15360v1 |
Combining Sentiment Lexica with a Multi-View Variational Autoencoder | http://arxiv.org/abs/1904.02839v1 |
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers | http://arxiv.org/abs/2005.11787v2 |
Commonsense for Generative Multi-Hop Question Answering Tasks | http://arxiv.org/abs/1809.06309v3 |
CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge | http://arxiv.org/abs/1811.00937v2 |
Communication-Efficient Asynchronous Stochastic Frank-Wolfe over Nuclear-norm Balls | http://arxiv.org/abs/1910.07703v1 |
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks | http://arxiv.org/abs/2005.02426v2 |
CompRes: A Dataset for Narrative Structure in News | http://arxiv.org/abs/2007.04874v1 |
Compact Personalized Models for Neural Machine Translation | http://arxiv.org/abs/1811.01990v1 |
Comparative Analysis of Text Classification Approaches in Electronic Health Records | http://arxiv.org/abs/2005.06624v1 |
Comparatives, Quantifiers, Proportions: A Multi-Task Model for the Learning of Quantities from Vision | http://arxiv.org/abs/1804.05018v1 |
Comparing recurrent and convolutional neural networks for predicting wave propagation | http://arxiv.org/abs/2002.08981v3 |
Competence-Level Prediction and Resume & Job Description Matching Using Context-Aware Transformer Models | http://arxiv.org/abs/2011.02998v1 |
Competence-based Curriculum Learning for Neural Machine Translation | http://arxiv.org/abs/1903.09848v2 |
Competing Bandits in Matching Markets | http://arxiv.org/abs/1906.05363v2 |
Competitive Mirror Descent | http://arxiv.org/abs/2006.10179v1 |
Complete Multilingual Neural Machine Translation | http://arxiv.org/abs/2010.10239v1 |
Complexity Guarantees for Polyak Steps with Momentum | http://arxiv.org/abs/2002.00915v2 |
Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification | http://arxiv.org/abs/1904.02767v1 |
Compositional Demographic Word Embeddings | http://arxiv.org/abs/2010.02986v2 |
Compositional Questions Do Not Necessitate Multi-hop Reasoning | http://arxiv.org/abs/1906.02900v1 |
Compositional Semantic Parsing on Semi-Structured Tables | http://arxiv.org/abs/1508.00305v1 |
Compositional and Lexical Semantics in RoBERTa, BERT and DistilBERT: A Case Study on CoQA | http://arxiv.org/abs/2009.08257v1 |
Compositionality and Generalization in Emergent Languages | http://arxiv.org/abs/2004.09124v1 |
Comprehensive Supersense Disambiguation of English Prepositions and Possessives | http://arxiv.org/abs/1805.04905v1 |
Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning | http://arxiv.org/abs/2002.08307v2 |
Compressive Summarization with Plausibility and Salience Modeling | http://arxiv.org/abs/2010.07886v1 |
Computing Tight Differential Privacy Guarantees Using FFT | http://arxiv.org/abs/1906.03049v2 |
ConQUR: Mitigating Delusional Bias in Deep Q-learning | http://arxiv.org/abs/2002.12399v1 |
ConStance: Modeling Annotation Contexts to Improve Stance Classification | http://arxiv.org/abs/1708.06309v1 |
Concentration bounds for CVaR estimation: The cases of light-tailed and heavy-tailed distributions | http://arxiv.org/abs/1901.00997v2 |
Concept Bottleneck Models | http://arxiv.org/abs/2007.04612v3 |
Concise Explanations of Neural Networks using Adversarial Training | http://arxiv.org/abs/1810.06583v9 |
Concluding remarks | http://arxiv.org/abs/astro-ph/0612056v1 |
Conditional Augmentation for Aspect Term Extraction via Masked Sequence-to-Sequence Generation | http://arxiv.org/abs/2004.14769v2 |
Conditional Flow Variational Autoencoders for Structured Sequence Prediction | http://arxiv.org/abs/1908.09008v3 |
Conditional Generation and Snapshot Learning in Neural Dialogue Systems | http://arxiv.org/abs/1606.03352v1 |
Conditional Importance Sampling for Off-Policy Learning | http://arxiv.org/abs/1910.07479v2 |
Conditional Normalizing Flows for Low-Dose Computed Tomography Image Reconstruction | http://arxiv.org/abs/2006.06270v1 |
Conditional Set Generation with Transformers | http://arxiv.org/abs/2006.16841v2 |
Conditional gradient methods for stochastically constrained convex minimization | http://arxiv.org/abs/2007.03795v1 |
Conditioning of Reinforcement Learning Agents and its Policy Regularization Application | http://arxiv.org/abs/1906.05437v2 |
Confidence Intervals for Policy Evaluation in Adaptive Experiments | http://arxiv.org/abs/1911.02768v3 |
Confidence Sets and Hypothesis Testing in a Likelihood-Free Inference Setting | http://arxiv.org/abs/2002.10399v2 |
Confidence-Calibrated Adversarial Training: Generalizing to Unseen Attacks | http://arxiv.org/abs/1910.06259v4 |
ConjNLI: Natural Language Inference Over Conjunctive Sentences | http://arxiv.org/abs/2010.10418v2 |
Connecting Embeddings for Knowledge Graph Entity Typing | http://arxiv.org/abs/2007.10873v1 |
Conservative Exploration in Reinforcement Learning | http://arxiv.org/abs/2002.03218v2 |
Conservative Safety Critics for Exploration | http://arxiv.org/abs/2010.14497v1 |
Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling | http://arxiv.org/abs/2004.09890v1 |
Consistency by Agreement in Zero-shot Neural Machine Translation | http://arxiv.org/abs/1904.02338v2 |
Consistency of a Recurrent Language Model With Respect to Incomplete Decoding | http://arxiv.org/abs/2002.02492v2 |
Consistent Estimators for Learning to Defer to an Expert | http://arxiv.org/abs/2006.01862v2 |
Consistent Structured Prediction with Max-Min Margin Markov Networks | http://arxiv.org/abs/2007.01012v2 |
Consistent Transcription and Translation of Speech | http://arxiv.org/abs/2007.12741v2 |
Consistent recovery threshold of hidden nearest neighbor graphs | http://arxiv.org/abs/1911.08004v1 |
Constant Curvature Graph Convolutional Networks | http://arxiv.org/abs/1911.05076v3 |
Constituent Parsing as Sequence Labeling | http://arxiv.org/abs/1810.08994v2 |
Constrained Decoding for Neural NLG from Compositional Representations in Task-Oriented Dialogue | http://arxiv.org/abs/1906.07220v1 |
Constrained Markov Decision Processes via Backward Value Functions | http://arxiv.org/abs/2008.11811v1 |
Constrained Neural Ordinary Differential Equations with Stability Guarantees | http://arxiv.org/abs/2004.10883v1 |
Constructing a provably adversarially-robust classifier from a high accuracy one | http://arxiv.org/abs/1912.07561v1 |
Constructive Universal High-Dimensional Distribution Generation through Deep ReLU Networks | http://arxiv.org/abs/2006.16664v1 |
Content Planning for Neural Story Generation with Aristotelian Rescoring | http://arxiv.org/abs/2009.09870v2 |
Content Selection in Deep Learning Models of Summarization | http://arxiv.org/abs/1810.12343v2 |
Context Gates for Neural Machine Translation | http://arxiv.org/abs/1608.06043v3 |
Context Mover's Distance & Barycenters: Optimal Transport of Contexts for Building Representations | http://arxiv.org/abs/1808.09663v6 |
Context-Aware Answer Extraction in Question Answering | http://arxiv.org/abs/2011.02687v1 |
Context-Aware Local Differential Privacy | http://arxiv.org/abs/1911.00038v2 |
Context-Aware Neural Machine Translation Learns Anaphora Resolution | http://arxiv.org/abs/1805.10163v1 |
Context-aware Dynamics Model for Generalization in Model-Based Reinforcement Learning | http://arxiv.org/abs/2005.06800v3 |
Contextual Constrained Learning for Dose-Finding Clinical Trials | http://arxiv.org/abs/2001.02463v2 |
Contextual Embeddings: When Are They Worth It? | http://arxiv.org/abs/2005.09117v1 |
Contextual Memory Trees | http://arxiv.org/abs/1807.06473v3 |
Contextual Neural Machine Translation Improves Translation of Cataphoric Pronouns | http://arxiv.org/abs/2004.09894v2 |
Contextual Online False Discovery Rate Control | http://arxiv.org/abs/1902.02885v2 |
Contextualization of Morphological Inflection | http://arxiv.org/abs/1905.01420v1 |
Contextualized Sparse Representations for Real-Time Open-Domain Question Answering | http://arxiv.org/abs/1911.02896v2 |
Contextualizing Hate Speech Classifiers with Post-hoc Explanation | http://arxiv.org/abs/2005.02439v3 |
Continual Learning from the Perspective of Compression | http://arxiv.org/abs/2006.15078v1 |
Continual Model-Based Reinforcement Learning with Hypernetworks | http://arxiv.org/abs/2009.11997v1 |
Continual adaptation for efficient machine communication | http://arxiv.org/abs/1911.09896v2 |
Continual and Multi-Task Architecture Search | http://arxiv.org/abs/1906.05226v1 |
Continual learning with direction-constrained optimization | http://arxiv.org/abs/2011.12581v1 |
Continuous Graph Flow | http://arxiv.org/abs/1908.02436v2 |
Continuous Graph Neural Networks | http://arxiv.org/abs/1912.00967v3 |
Continuous Online Learning and New Insights to Online Imitation Learning | http://arxiv.org/abs/1912.01261v1 |
Continuous Representation of Location for Geolocation and Lexical Dialectology using Mixture Density Networks | http://arxiv.org/abs/1708.04358v1 |
Continuous-time Lower Bounds for Gradient-based Algorithms | http://arxiv.org/abs/2002.03546v2 |
Continuously Indexed Domain Adaptation | http://arxiv.org/abs/2007.01807v2 |
Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning | http://arxiv.org/abs/2101.05265v1 |
Contrastive Graph Neural Network Explanation | http://arxiv.org/abs/2010.13663v1 |
Contrastive Multi-View Representation Learning on Graphs | http://arxiv.org/abs/2006.05582v1 |
Contrastive Self-Supervised Learning for Commonsense Reasoning | http://arxiv.org/abs/2005.00669v1 |
Control Frequency Adaptation via Action Persistence in Batch Reinforcement Learning | http://arxiv.org/abs/2002.06836v2 |
Controlled Crowdsourcing for High-Quality QA-SRL Annotation | http://arxiv.org/abs/1911.03243v2 |
Controlling Output Length in Neural Encoder-Decoders | http://arxiv.org/abs/1609.09552v1 |
Controlling Overestimation Bias with Truncated Mixture of Continuous Distributional Quantile Critics | http://arxiv.org/abs/2005.04269v1 |
ConvLab-2: An Open-Source Toolkit for Building, Evaluating, and Diagnosing Dialogue Systems | http://arxiv.org/abs/2002.04793v2 |
Convergence Analysis of Block Coordinate Algorithms with Determinantal Sampling | http://arxiv.org/abs/1910.11561v3 |
Convergence Rates of Smooth Message Passing with Rounding in Entropy-Regularized MAP Inference | http://arxiv.org/abs/1907.01127v2 |
Convergence Rates of Variational Inference in Sparse Deep Learning | http://arxiv.org/abs/1908.04847v2 |
Conversation Modeling on Reddit using a Graph-Structured LSTM | http://arxiv.org/abs/1704.02080v1 |
Conversational Document Prediction to Assist Customer Care Agents | http://arxiv.org/abs/2010.02305v1 |
Conversational Semantic Parsing | http://arxiv.org/abs/2009.13655v1 |
Conversational Semantic Parsing for Dialog State Tracking | http://arxiv.org/abs/2010.12770v1 |
Conversational Word Embedding for Retrieval-Based Dialog System | http://arxiv.org/abs/2004.13249v1 |
Conversations Gone Awry: Detecting Early Signs of Conversational Failure | http://arxiv.org/abs/1805.05345v1 |
Convex Calibrated Surrogates for the Multi-Label F-Measure | http://arxiv.org/abs/2009.07801v1 |
Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space | http://arxiv.org/abs/2004.12209v3 |
Convolutional Kernel Networks for Graph-Structured Data | http://arxiv.org/abs/2003.05189v2 |
Convolutional Neural Networks with Recurrent Neural Filters | http://arxiv.org/abs/1808.09315v1 |
Convolutional dictionary learning based auto-encoders for natural exponential-family distributions | http://arxiv.org/abs/1907.03211v4 |
Cooperative Learning of Disjoint Syntax and Semantics | http://arxiv.org/abs/1902.09393v2 |
Cooperative Multi-Agent Bandits with Heavy Tails | http://arxiv.org/abs/2008.06244v1 |
Coordination without communication: optimal regret in two players multi-armed bandits | http://arxiv.org/abs/2002.07596v2 |
Coreferential Reasoning Learning for Language Representation | http://arxiv.org/abs/2004.06870v2 |
Coresets for Clustering in Graphs of Bounded Treewidth | http://arxiv.org/abs/1907.04733v4 |
Coresets for Data-efficient Training of Machine Learning Models | http://arxiv.org/abs/1906.01827v3 |
Correlating neural and symbolic representations of language | http://arxiv.org/abs/1905.06401v2 |
Corruption-Tolerant Gaussian Process Bandit Optimization | http://arxiv.org/abs/2003.01971v1 |
Counterfactual Cross-Validation: Stable Model Selection Procedure for Causal Inference Models | http://arxiv.org/abs/1909.05299v5 |
Counterfactual Data Augmentation for Mitigating Gender Stereotypes in Languages with Rich Morphology | http://arxiv.org/abs/1906.04571v3 |
Counterfactual Data Augmentation using Locally Factored Dynamics | http://arxiv.org/abs/2007.02863v2 |
Countering Language Drift with Seeded Iterated Learning | http://arxiv.org/abs/2003.12694v3 |
Countering hate on social media: Large scale classification of hate and counter speech | http://arxiv.org/abs/2006.01974v3 |
Coupling Distant Annotation and Adversarial Training for Cross-Domain Chinese Word Segmentation | http://arxiv.org/abs/2007.08186v2 |
Coupling Retrieval and Meta-Learning for Context-Dependent Semantic Parsing | http://arxiv.org/abs/1906.07108v1 |
Course Concept Expansion in MOOCs with External Knowledge and Interactive Game | http://arxiv.org/abs/1909.07739v1 |
Creating Causal Embeddings for Question Answering with Minimal Supervision | http://arxiv.org/abs/1609.08097v1 |
Cross Copy Network for Dialogue Generation | http://arxiv.org/abs/2010.11539v1 |
Cross-Domain Generalization of Neural Constituency Parsers | http://arxiv.org/abs/1907.04347v1 |
Cross-Lingual Alignment of Contextual Word Embeddings, with Applications to Zero-shot Dependency Parsing | http://arxiv.org/abs/1902.09492v2 |
Cross-Lingual Semantic Role Labeling with High-Quality Translated Training Corpus | http://arxiv.org/abs/2004.06295v2 |
Cross-Lingual Syntactic Transfer with Limited Resources | http://arxiv.org/abs/1610.06227v2 |
Cross-Lingual Training for Automatic Question Generation | http://arxiv.org/abs/1906.02525v1 |
Cross-Linguistic Syntactic Evaluation of Word Prediction Models | http://arxiv.org/abs/2005.00187v2 |
Cross-Media Keyphrase Prediction: A Unified Framework with Multi-Modality Multi-Head Attention and Image Wordings | http://arxiv.org/abs/2011.01565v1 |
Cross-Modal Data Programming Enables Rapid Medical Machine Learning | http://arxiv.org/abs/1903.11101v1 |
Cross-Modality Relevance for Reasoning on Language and Vision | http://arxiv.org/abs/2005.06035v1 |
Cross-Sentence N-ary Relation Extraction with Graph LSTMs | http://arxiv.org/abs/1708.03743v1 |
Cross-Target Stance Classification with Self-Attention Networks | http://arxiv.org/abs/1805.06593v2 |
Cross-Thought for Sentence Encoder Pre-training | http://arxiv.org/abs/2010.03652v1 |
Cross-lingual Abstract Meaning Representation Parsing | http://arxiv.org/abs/1704.04539v2 |
Cross-lingual Spoken Language Understanding with Regularized Representation Alignment | http://arxiv.org/abs/2009.14510v1 |
Cross-lingual Visual Verb Sense Disambiguation | http://arxiv.org/abs/1904.05092v2 |
Cross-media Structured Common Space for Multimedia Event Extraction | http://arxiv.org/abs/2005.02472v1 |
Cross-modal Language Generation using Pivot Stabilization for Web-scale Language Coverage | http://arxiv.org/abs/2005.00246v1 |
Cross-topic distributional semantic representations via unsupervised mappings | http://arxiv.org/abs/1904.05674v1 |
CrossWOZ: A Large-Scale Chinese Cross-Domain Task-Oriented Dialogue Dataset | http://arxiv.org/abs/2002.11893v2 |
Crossing Variational Autoencoders for Answer Retrieval | http://arxiv.org/abs/2005.02557v2 |
CrowS-Pairs: A Challenge Dataset for Measuring Social Biases in Masked Language Models | http://arxiv.org/abs/2010.00133v1 |
Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation | http://arxiv.org/abs/1904.05929v1 |
Cumulo: A Dataset for Learning Cloud Classes | http://arxiv.org/abs/1911.04227v2 |
Curriculum Pre-training for End-to-End Speech Translation | http://arxiv.org/abs/2004.10093v1 |
Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness | http://arxiv.org/abs/2002.03239v2 |
Cycles in Causal Learning | http://arxiv.org/abs/2007.12335v1 |
D2RL: Deep Dense Architectures in Reinforcement Learning | http://arxiv.org/abs/2010.09163v2 |
DADI: Dynamic Discovery of Fair Information with Adversarial Reinforcement Learning | http://arxiv.org/abs/1910.13983v1 |
DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks | http://arxiv.org/abs/2011.01549v1 |
DAve-QN: A Distributed Averaged Quasi-Newton Method with Local Superlinear Convergence Rate | http://arxiv.org/abs/1906.00506v3 |
DERAIL: Diagnostic Environments for Reward And Imitation Learning | http://arxiv.org/abs/2012.01365v1 |
DGST: a Dual-Generator Network for Text Style Transfer | http://arxiv.org/abs/2010.14557v1 |
DLGNet: A Transformer-based Model for Dialogue Response Generation | http://arxiv.org/abs/1908.01841v2 |
DOC: Deep Open Classification of Text Documents | http://arxiv.org/abs/1709.08716v1 |
DORB: Dynamically Optimizing Multiple Rewards with Bandits | http://arxiv.org/abs/2011.07635v1 |
DR-BiLSTM: Dependent Reading Bidirectional LSTM for Natural Language Inference | http://arxiv.org/abs/1802.05577v2 |
DRS at MRP 2020: Dressing up Discourse Representation Structures as Graphs | http://arxiv.org/abs/2012.14837v1 |
DRTS Parsing with Structure-Aware Encoding and Decoding | http://arxiv.org/abs/2005.06901v1 |
DRWR: A Differentiable Renderer without Rendering for Unsupervised 3D Structure Learning from Silhouette Images | http://arxiv.org/abs/2007.06127v1 |
DTCA: Decision Tree-based Co-Attention Networks for Explainable Claim Verification | http://arxiv.org/abs/2004.13455v1 |
DYSAN: Dynamically sanitizing motion sensor data against sensitive inferences through adversarial networks | http://arxiv.org/abs/2003.10325v2 |
DagoBERT: Generating Derivational Morphology with a Pretrained Language Model | http://arxiv.org/abs/2005.00672v2 |
Data Amplification: Instance-Optimal Property Estimation | http://arxiv.org/abs/1903.01432v2 |
Data Appraisal Without Data Sharing | http://arxiv.org/abs/2012.06430v1 |
Data Augmentation for Training Dialog Models Robust to Speech Recognition Errors | http://arxiv.org/abs/2006.05635v1 |
Data Boost: Text Data Augmentation Through Reinforcement Learning Guided Conditional Generation | http://arxiv.org/abs/2012.02952v1 |
Data Generation for Neural Programming by Example | http://arxiv.org/abs/1911.02624v1 |
Data Manipulation: Towards Effective Instance Learning for Neural Dialogue Generation via Learning to Augment and Reweight | http://arxiv.org/abs/2004.02594v5 |
Data Rejuvenation: Exploiting Inactive Training Examples for Neural Machine Translation | http://arxiv.org/abs/2010.02552v1 |
Data Valuation using Reinforcement Learning | http://arxiv.org/abs/1909.11671v1 |
Data Weighted Training Strategies for Grammatical Error Correction | http://arxiv.org/abs/2008.02976v2 |
Data and Representation for Turkish Natural Language Inference | http://arxiv.org/abs/2004.14963v3 |
Data preprocessing to mitigate bias: A maximum entropy based approach | http://arxiv.org/abs/1906.02164v2 |
Data-Dependent Differentially Private Parameter Learning for Directed Graphical Models | http://arxiv.org/abs/1905.12813v3 |
Data-Efficient Image Recognition with Contrastive Predictive Coding | http://arxiv.org/abs/1905.09272v3 |
Data-driven confidence bands for distributed nonparametric regression | http://arxiv.org/abs/1912.06689v2 |
Dataset Cartography: Mapping and Diagnosing Datasets with Training Dynamics | http://arxiv.org/abs/2009.10795v2 |
DeBayes: a Bayesian method for debiasing network embeddings | http://arxiv.org/abs/2002.11442v2 |
DeClarE: Debunking Fake News and False Claims using Evidence-Aware Deep Learning | http://arxiv.org/abs/1809.06416v1 |
DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering | http://arxiv.org/abs/2005.00697v1 |
DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking | http://arxiv.org/abs/2004.12864v1 |
Debiased Sinkhorn barycenters | http://arxiv.org/abs/2006.02575v1 |
Debiasing Evaluations That are Biased by Evaluations | http://arxiv.org/abs/2012.00714v1 |
Decentralised Learning with Random Features and Distributed Gradient Descent | http://arxiv.org/abs/2007.00360v1 |
Decentralized Multi-player Multi-armed Bandits with No Collision Information | http://arxiv.org/abs/2003.00162v1 |
Decentralized gradient methods: does topology matter? | http://arxiv.org/abs/2002.12688v1 |
Decision Trees for Decision-Making under the Predict-then-Optimize Framework | http://arxiv.org/abs/2003.00360v2 |
Decomposable Neural Paraphrase Generation | http://arxiv.org/abs/1906.09741v1 |
Deconstructing word embedding algorithms | http://arxiv.org/abs/2011.07013v1 |
Decoupled Greedy Learning of CNNs | http://arxiv.org/abs/1901.08164v4 |
Decoupling Strategy and Generation in Negotiation Dialogues | http://arxiv.org/abs/1808.09637v1 |
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference | http://arxiv.org/abs/2004.12993v1 |
Deep Active Learning: Unified and Principled Method for Query and Training | http://arxiv.org/abs/1911.09162v2 |
Deep Bayesian Quadrature Policy Optimization | http://arxiv.org/abs/2006.15637v3 |
Deep Claim: Payer Response Prediction from Claims Data with Deep Learning | http://arxiv.org/abs/2007.06229v1 |
Deep Context-Aware Novelty Detection | http://arxiv.org/abs/2006.01168v2 |
Deep Contextualized Self-training for Low Resource Dependency Parsing | http://arxiv.org/abs/1911.04286v1 |
Deep Coordination Graphs | http://arxiv.org/abs/1910.00091v4 |
Deep Divergence Learning | http://arxiv.org/abs/2005.02612v1 |
Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning | http://arxiv.org/abs/1801.06176v3 |
Deep Gaussian Markov Random Fields | http://arxiv.org/abs/2002.07467v2 |
Deep Generative Model for Joint Alignment and Word Representation | http://arxiv.org/abs/1802.05883v3 |
Deep Graph Contrastive Representation Learning | http://arxiv.org/abs/2006.04131v2 |
Deep Hierarchical Classification for Category Prediction in E-commerce System | http://arxiv.org/abs/2005.06692v1 |
Deep Isometric Learning for Visual Recognition | http://arxiv.org/abs/2006.16992v2 |
Deep Keyphrase Generation | http://arxiv.org/abs/1704.06879v2 |
Deep Molecular Programming: A Natural Implementation of Binary-Weight ReLU Neural Networks | http://arxiv.org/abs/2003.13720v3 |
Deep Networks and the Multiple Manifold Problem | http://arxiv.org/abs/2008.11245v1 |
Deep Neural Machine Translation with Linear Associative Unit | http://arxiv.org/abs/1705.00861v1 |
Deep Probabilistic Logic: A Unifying Framework for Indirect Supervision | http://arxiv.org/abs/1808.08485v1 |
Deep Recurrent Models with Fast-Forward Connections for Neural Machine Translation | http://arxiv.org/abs/1606.04199v3 |
Deep Reinforcement Learning amidst Lifelong Non-Stationarity | http://arxiv.org/abs/2006.10701v1 |
Deep Reinforcement Learning for Dialogue Generation | http://arxiv.org/abs/1606.01541v4 |
Deep Reinforcement Learning for Mention-Ranking Coreference Models | http://arxiv.org/abs/1609.08667v3 |
Deep Relevance Ranking Using Enhanced Document-Query Interactions | http://arxiv.org/abs/1809.01682v2 |
Deep Ritz revisited | http://arxiv.org/abs/1912.03937v2 |
Deep Structured Mixtures of Gaussian Processes | http://arxiv.org/abs/1910.04536v2 |
Deep Temporal-Recurrent-Replicated-Softmax for Topical Trends over Time | http://arxiv.org/abs/1711.05626v2 |
Deep contextualized word representations | http://arxiv.org/abs/1802.05365v2 |
Deep k-NN for Noisy Labels | http://arxiv.org/abs/2004.12289v1 |
Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme | http://arxiv.org/abs/1807.03491v1 |
DeepCoDA: personalized interpretability for compositional health data | http://arxiv.org/abs/2006.01392v2 |
DeepMatch: Balancing Deep Covariate Representations for Causal Inference Using Adversarial Training | http://arxiv.org/abs/1802.05664v1 |
DeepPath: A Reinforcement Learning Method for Knowledge Graph Reasoning | http://arxiv.org/abs/1707.06690v3 |
DeepSeqSLAM: A Trainable CNN+RNN for Joint Global Description and Sequence-based Place Recognition | http://arxiv.org/abs/2011.08518v1 |
Defense Through Diverse Directions | http://arxiv.org/abs/2003.10602v1 |
Defining Benchmarks for Continual Few-Shot Learning | http://arxiv.org/abs/2004.11967v1 |
Defining and Evaluating Fair Natural Language Generation | http://arxiv.org/abs/2008.01548v1 |
Defoiling Foiled Image Captions | http://arxiv.org/abs/1805.06549v1 |
Delete, Retrieve, Generate: A Simple Approach to Sentiment and Style Transfer | http://arxiv.org/abs/1804.06437v1 |
DeltaGrad: Rapid retraining of machine learning models | http://arxiv.org/abs/2006.14755v2 |
Demand-Weighted Completeness Prediction for a Knowledge Base | http://arxiv.org/abs/1804.11109v1 |
Demographic Dialectal Variation in Social Media: A Case Study of African-American English | http://arxiv.org/abs/1608.08868v1 |
Demographics Should Not Be the Reason of Toxicity: Mitigating Discrimination in Text Classifications with Instance Weighting | http://arxiv.org/abs/2004.14088v3 |
Demoting Racial Bias in Hate Speech Detection | http://arxiv.org/abs/2005.12246v1 |
Denoising Relation Extraction from Document-level Distant Supervision | http://arxiv.org/abs/2011.03888v1 |
Dense Passage Retrieval for Open-Domain Question Answering | http://arxiv.org/abs/2004.04906v3 |
Dense-Caption Matching and Frame-Selection Gating for Temporal Localization in VideoQA | http://arxiv.org/abs/2005.06409v1 |
Densely Connected Graph Convolutional Networks for Graph-to-Sequence Learning | http://arxiv.org/abs/1908.05957v2 |
Density Deconvolution with Normalizing Flows | http://arxiv.org/abs/2006.09396v2 |
Density Matching for Bilingual Word Embedding | http://arxiv.org/abs/1904.02343v3 |
Deontological Ethics By Monotonicity Shape Constraints | http://arxiv.org/abs/2001.11990v2 |
Dependency-based Hybrid Trees for Semantic Parsing | http://arxiv.org/abs/1809.00107v1 |
Dependent randomized rounding for clustering and partition systems with knapsack constraints | http://arxiv.org/abs/1709.06995v9 |
Depth Completion via Deep Basis Fitting | http://arxiv.org/abs/1912.10336v1 |
Depth Uncertainty in Neural Networks | http://arxiv.org/abs/2006.08437v3 |
DepthNet Nano: A Highly Compact Self-Normalizing Neural Network for Monocular Depth Estimation | http://arxiv.org/abs/2004.08008v1 |
Deriving Machine Attention from Human Rationales | http://arxiv.org/abs/1808.09367v1 |
Description Based Text Classification with Reinforcement Learning | http://arxiv.org/abs/2002.03067v3 |
Design Challenges in Low-resource Cross-lingual Entity Linking | http://arxiv.org/abs/2005.00692v2 |
Designing Differentially Private Estimators in High Dimensions | http://arxiv.org/abs/2006.01944v3 |
Designing Precise and Robust Dialogue Response Evaluators | http://arxiv.org/abs/2004.04908v2 |
Detecting Attackable Sentences in Arguments | http://arxiv.org/abs/2010.02660v1 |
Detecting Cross-Modal Inconsistency to Defend Against Neural Fake News | http://arxiv.org/abs/2009.07698v5 |
Detecting East Asian Prejudice on Social Media | http://arxiv.org/abs/2005.03909v1 |
Detecting Egregious Conversations between Customers and Virtual Agents | http://arxiv.org/abs/1711.05780v2 |
Detecting Fine-Grained Cross-Lingual Semantic Divergences without Supervision by Learning to Rank | http://arxiv.org/abs/2010.03662v1 |
Detecting Gang-Involved Escalation on Social Media Using Context | http://arxiv.org/abs/1809.03632v1 |
Detecting Perceived Emotions in Hurricane Disasters | http://arxiv.org/abs/2004.14299v1 |
Detecting Word Sense Disambiguation Biases in Machine Translation for Model-Agnostic Adversarial Attacks | http://arxiv.org/abs/2011.01846v1 |
Detecting dementia in Mandarin Chinese using transfer learning from a parallel corpus | http://arxiv.org/abs/1903.00933v2 |
Determining Semantic Textual Similarity using Natural Deduction Proofs | http://arxiv.org/abs/1707.08713v1 |
Deterministic Decoding for Discrete Data in Variational Autoencoders | http://arxiv.org/abs/2003.02174v1 |
Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement | http://arxiv.org/abs/1802.06901v3 |
Dexterous Robotic Grasping with Object-Centric Visual Affordances | http://arxiv.org/abs/2009.01439v1 |
Dialogue Coherence Assessment Without Explicit Dialogue Act Labels | http://arxiv.org/abs/1908.08486v2 |
Dialogue Distillation: Open-Domain Dialogue Augmentation Using Unpaired Data | http://arxiv.org/abs/2009.09427v2 |
Dialogue Response Ranking Training with Large-Scale Human Feedback Data | http://arxiv.org/abs/2009.06978v1 |
Diameter-based Interactive Structure Discovery | http://arxiv.org/abs/1906.02101v2 |
Dice Loss for Data-imbalanced NLP Tasks | http://arxiv.org/abs/1911.02855v3 |
Did You Ask a Good Question? A Cross-Domain Question Intention Classification Benchmark for Text-to-SQL | http://arxiv.org/abs/2010.12634v1 |
Did the Model Understand the Question? | http://arxiv.org/abs/1805.05492v1 |
Differentiable Causal Backdoor Discovery | http://arxiv.org/abs/2003.01461v1 |
Differentiable Graph Module (DGM) for Graph Convolutional Networks | http://arxiv.org/abs/2002.04999v3 |
Differentiable Likelihoods for Fast Inversion of 'Likelihood-Free' Dynamical Systems | http://arxiv.org/abs/2002.09301v2 |
Differentiable Sampling with Flexible Reference Word Order for Neural Machine Translation | http://arxiv.org/abs/1904.04079v2 |
Differentiable Window for Dynamic Local Attention | http://arxiv.org/abs/2006.13561v1 |
Differential Evolution for Neural Architecture Search | http://arxiv.org/abs/2012.06400v1 |
Differentially Private Language Models Benefit from Public Pre-training | http://arxiv.org/abs/2009.05886v2 |
Differentially Private Set Union | http://arxiv.org/abs/2002.09745v1 |
Differentially Private Stochastic Coordinate Descent | http://arxiv.org/abs/2006.07272v3 |
Differentially private cross-silo federated learning | http://arxiv.org/abs/2007.05553v1 |
Differentiating through the Fréchet Mean | http://arxiv.org/abs/2003.00335v3 |
Digital Voicing of Silent Speech | http://arxiv.org/abs/2010.02960v1 |
Dilated Convolutional Attention Network for Medical Code Assignment from Clinical Text | http://arxiv.org/abs/2009.14578v1 |
Diptychs of human and machine perceptions | http://arxiv.org/abs/2010.13864v1 |
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction | http://arxiv.org/abs/2003.07305v1 |
Discern: Discourse-Aware Entailment Reasoning Network for Conversational Machine Reading | http://arxiv.org/abs/2010.01838v3 |
DiscoFuse: A Large-Scale Dataset for Discourse-Based Sentence Fusion | http://arxiv.org/abs/1902.10526v3 |
Discontinuous Constituency Parsing with a Stack-Free Transition System and a Dynamic Oracle | http://arxiv.org/abs/1904.00615v1 |
Discontinuous Constituent Parsing as Sequence Labeling | http://arxiv.org/abs/2010.00633v1 |
Discount Factor as a Regularizer in Reinforcement Learning | http://arxiv.org/abs/2007.02040v1 |
Discourse Marker Augmented Network with Reinforcement Learning for Natural Language Inference | http://arxiv.org/abs/1907.09692v1 |
Discourse structure interacts with reference but not syntax in neural language models | http://arxiv.org/abs/2010.04887v1 |
Discourse-Aware Neural Extractive Text Summarization | http://arxiv.org/abs/1910.14142v2 |
Discovering and interpreting transcriptomic drivers of imaging traits using neural networks | http://arxiv.org/abs/1912.05071v1 |
Discrete Action On-Policy Learning with Action-Value Critic | http://arxiv.org/abs/2002.03534v2 |
Discrete Latent Variable Representations for Low-Resource Text Classification | http://arxiv.org/abs/2006.06226v1 |
Discrete Optimization for Unsupervised Sentence Summarization with Word-Level Extraction | http://arxiv.org/abs/2005.01791v1 |
Discriminative Adversarial Search for Abstractive Summarization | http://arxiv.org/abs/2002.10375v2 |
Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions | http://arxiv.org/abs/2007.13481v1 |
Discriminative Nearest Neighbor Few-Shot Intent Detection by Transferring Natural Language Inference | http://arxiv.org/abs/2010.13009v1 |
Discriminative Neural Sentence Modeling by Tree-Based Convolution | http://arxiv.org/abs/1504.01106v5 |
Discriminatively-Tuned Generative Classifiers for Robust Natural Language Inference | http://arxiv.org/abs/2010.03760v1 |
Disentangle-based Continual Graph Representation Learning | http://arxiv.org/abs/2010.02565v4 |
Disentangled Planning and Control in Vision Based Robotics via Reward Machines | http://arxiv.org/abs/2012.14464v1 |
Disentangling Language and Knowledge in Task-Oriented Dialogs | http://arxiv.org/abs/1805.01216v3 |
Disentangling Trainability and Generalization in Deep Neural Networks | http://arxiv.org/abs/1912.13053v2 |
Dispersed Exponential Family Mixture VAEs for Interpretable Text Generation | http://arxiv.org/abs/1906.06719v4 |
Dissecting Lottery Ticket Transformers: Structural and Behavioral Study of Sparse Neural Machine Translation | http://arxiv.org/abs/2009.13270v2 |
Dissecting Non-Vacuous Generalization Bounds based on the Mean-Field Approximation | http://arxiv.org/abs/1909.03009v2 |
Dissecting Span Identification Tasks with Performance Prediction | http://arxiv.org/abs/2010.02587v1 |
Dissipative SymODEN: Encoding Hamiltonian Dynamics with Dissipation and Control into Deep Learning | http://arxiv.org/abs/2002.08860v3 |
Distant Supervision and Noisy Label Learning for Low Resource Named Entity Recognition: A Study on Hausa and Yorùbá | http://arxiv.org/abs/2003.08370v2 |
Distant Supervision from Disparate Sources for Low-Resource Part-of-Speech Tagging | http://arxiv.org/abs/1808.09733v1 |
Distill, Adapt, Distill: Training Small, In-Domain Models for Neural Machine Translation | http://arxiv.org/abs/2003.02877v3 |
Distilling Knowledge Learned in BERT for Text Generation | http://arxiv.org/abs/1911.03829v3 |
Distilling Knowledge for Search-based Structured Prediction | http://arxiv.org/abs/1805.11224v1 |
Distilling Neural Networks for Greener and Faster Dependency Parsing | http://arxiv.org/abs/2006.00844v1 |
Distinguish Confusing Law Articles for Legal Judgment Prediction | http://arxiv.org/abs/2004.02557v3 |
Distributed Differentially Private Averaging with Improved Utility and Robustness to Malicious Parties | http://arxiv.org/abs/2006.07218v1 |
Distributed Learning: Sequential Decision Making in Resource-Constrained Environments | http://arxiv.org/abs/2004.06171v1 |
Distributed, partially collapsed MCMC for Bayesian Nonparametrics | http://arxiv.org/abs/2001.05591v3 |
Distributionally Robust Bayesian Optimization | http://arxiv.org/abs/2002.09038v3 |
Distributionally Robust Bayesian Quadrature Optimization | http://arxiv.org/abs/2001.06814v1 |
Distributionally Robust Formulation and Model Selection for the Graphical Lasso | http://arxiv.org/abs/1905.08975v2 |
Diverse Exploration via InfoMax Options | http://arxiv.org/abs/2010.02756v1 |
Diverse, Controllable, and Keyphrase-Aware: A Corpus and Method for News Multi-Headline Generation | http://arxiv.org/abs/2004.03875v2 |
Diversifying Dialogue Generation with Non-Conversational Text | http://arxiv.org/abs/2005.04346v2 |
Diversifying Reply Suggestions using a Matching-Conditional Variational Autoencoder | http://arxiv.org/abs/1903.10630v1 |
Diversity driven Attention Model for Query-based Abstractive Summarization | http://arxiv.org/abs/1704.08300v2 |
Divide, Conquer, and Combine: a New Inference Strategy for Probabilistic Programs with Stochastic Support | http://arxiv.org/abs/1910.13324v3 |
Diving Deep into Context-Aware Neural Machine Translation | http://arxiv.org/abs/2010.09482v1 |
Do Explicit Alignments Robustly Improve Multilingual Encoders? | http://arxiv.org/abs/2010.02537v1 |
Do Multi-Sense Embeddings Improve Natural Language Understanding? | http://arxiv.org/abs/1506.01070v3 |
Do Neural Dialog Systems Use the Conversation History Effectively? An Empirical Study | http://arxiv.org/abs/1906.01603v2 |
Do Neural Language Models Show Preferences for Syntactic Formalisms? | http://arxiv.org/abs/2004.14096v1 |
Do Neural Models Learn Systematicity of Monotonicity Inference in Natural Language? | http://arxiv.org/abs/2004.14839v2 |
Do Neural Network Cross-Modal Mappings Really Bridge Modalities? | http://arxiv.org/abs/1805.07616v2 |
Do RNN and LSTM have Long Memory? | http://arxiv.org/abs/2006.03860v2 |
Do We Need Zero Training Loss After Achieving Zero Training Error? | http://arxiv.org/abs/2002.08709v1 |
Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation | http://arxiv.org/abs/2002.08546v5 |
Do You Have the Right Scissors? Tailoring Pre-trained Language Models via Monte-Carlo Methods | http://arxiv.org/abs/2007.06162v1 |
Do You See What I Mean? Visual Resolution of Linguistic Ambiguities | http://arxiv.org/abs/1603.08079v1 |
Do latent tree learning models identify meaningful structure in sentences? | http://arxiv.org/abs/1709.01121v2 |
Do sequence-to-sequence VAEs learn global features of sentences? | http://arxiv.org/abs/2004.07683v1 |
Document Context Neural Machine Translation with Memory Networks | http://arxiv.org/abs/1711.03688v2 |
Document Modeling with Graph Attention Networks for Multi-grained Machine Reading Comprehension | http://arxiv.org/abs/2005.05806v2 |
Document-Level Event Role Filler Extraction using Multi-Granularity Contextualized Encoding | http://arxiv.org/abs/2005.06579v1 |
Document-aligned Japanese-English Conversation Parallel Corpus | http://arxiv.org/abs/2012.06143v1 |
Does Multi-Encoder Help? A Case Study on Context-Aware Neural Machine Translation | http://arxiv.org/abs/2005.03393v2 |
Does label smoothing mitigate label noise? | http://arxiv.org/abs/2003.02819v1 |
Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks | http://arxiv.org/abs/2001.03632v1 |
Does the Markov Decision Process Fit the Data: Testing for the Markov Property in Sequential Decision Making | http://arxiv.org/abs/2002.01751v1 |
Does the Objective Matter? Comparing Training Objectives for Pronoun Resolution | http://arxiv.org/abs/2010.02570v1 |
Domain Adaptation with Adversarial Training and Graph Embeddings | http://arxiv.org/abs/1805.05151v1 |
Domain Adaptive Dialog Generation via Meta Learning | http://arxiv.org/abs/1906.03520v2 |
Domain Adaptive Imitation Learning | http://arxiv.org/abs/1910.00105v2 |
Domain Adaptive Inference for Neural Machine Translation | http://arxiv.org/abs/1906.00408v1 |
Domain Aggregation Networks for Multi-Source Domain Adaptation | http://arxiv.org/abs/1909.05352v2 |
Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction | http://arxiv.org/abs/2009.07373v2 |
Domain Knowledge Integration By Gradient Matching For Sample-Efficient Reinforcement Learning | http://arxiv.org/abs/2005.13778v1 |
Domain-Liftability of Relational Marginal Polytopes | http://arxiv.org/abs/2001.05198v1 |
Domain-Specific Lexical Grounding in Noisy Visual-Textual Documents | http://arxiv.org/abs/2010.16363v1 |
Don't Neglect the Obvious: On the Role of Unambiguous Words in Word Sense Disambiguation | http://arxiv.org/abs/2004.14325v3 |
Don't Read Too Much into It: Adaptive Computation for Open-Domain Question Answering | http://arxiv.org/abs/2011.05435v1 |
Don't Use English Dev: On the Zero-Shot Cross-Lingual Evaluation of Contextual Embeddings | http://arxiv.org/abs/2004.15001v2 |
Double Graph Based Reasoning for Document-level Relation Extraction | http://arxiv.org/abs/2009.13752v1 |
Double-Hard Debias: Tailoring Word Embeddings for Gender Bias Mitigation | http://arxiv.org/abs/2005.00965v1 |
Double-Loop Unadjusted Langevin Algorithm | http://arxiv.org/abs/2007.01147v1 |
Doubly Sparse Variational Gaussian Processes | http://arxiv.org/abs/2001.05363v1 |
Doubly Stochastic Variational Inference for Neural Processes with Hierarchical Latent Variables | http://arxiv.org/abs/2008.09469v2 |
Doubly robust off-policy evaluation with shrinkage | http://arxiv.org/abs/1907.09623v2 |
Dream and Search to Control: Latent Space Planning for Continuous Control | http://arxiv.org/abs/2010.09832v1 |
Driving Behavior Explanation with Multi-level Fusion | http://arxiv.org/abs/2012.04983v1 |
Dual Mirror Descent for Online Allocation Problems | http://arxiv.org/abs/2002.10421v4 |
DualTKB: A Dual Learning Bridge between Text and Knowledge Base | http://arxiv.org/abs/2010.14660v1 |
DyERNIE: Dynamic Evolution of Riemannian Manifold Embeddings for Temporal Knowledge Graph Completion | http://arxiv.org/abs/2011.03984v2 |
Dyna-AIL : Adversarial Imitation Learning by Planning | http://arxiv.org/abs/1903.03234v1 |
Dynamic Anticipation and Completion for Multi-Hop Reasoning over Sparse Knowledge Graph | http://arxiv.org/abs/2010.01899v1 |
Dynamic Context Selection for Document-level Neural Machine Translation via Reinforcement Learning | http://arxiv.org/abs/2010.04314v1 |
Dynamic Data Selection and Weighting for Iterative Back-Translation | http://arxiv.org/abs/2004.03672v2 |
Dynamic Fusion Network for Multi-Domain End-to-end Task-Oriented Dialog | http://arxiv.org/abs/2004.11019v3 |
Dynamic Knapsack Optimization Towards Efficient Multi-Channel Sequential Advertising | http://arxiv.org/abs/2006.16312v1 |
Dynamic Memory Induction Networks for Few-Shot Text Classification | http://arxiv.org/abs/2005.05727v1 |
Dynamic Oracles for Top-Down and In-Order Shift-Reduce Constituent Parsing | http://arxiv.org/abs/1810.10882v1 |
Dynamic Programming Encoding for Subword Segmentation in Neural Machine Translation | http://arxiv.org/abs/2005.06606v2 |
Dynamic Regions Graph Neural Networks for Spatio-Temporal Reasoning | http://arxiv.org/abs/2009.08427v1 |
Dynamical systems theory for causal inference with application to synthetic control methods | http://arxiv.org/abs/1808.08778v3 |
Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change | http://arxiv.org/abs/2005.02008v1 |
ELI5: Long Form Question Answering | http://arxiv.org/abs/1907.09190v1 |
ELITR Non-Native Speech Translation at IWSLT 2020 | http://arxiv.org/abs/2006.03331v1 |
EM Converges for a Mixture of Many Linear Regressions | http://arxiv.org/abs/1905.12106v2 |
ENGINE: Energy-Based Inference Networks for Non-Autoregressive Machine Translation | http://arxiv.org/abs/2005.00850v2 |
ENT-DESC: Entity Description Generation by Exploring Knowledge Graph | http://arxiv.org/abs/2004.14813v2 |
ERASER: A Benchmark to Evaluate Rationalized NLP Models | http://arxiv.org/abs/1911.03429v2 |
ESPRIT: Explaining Solutions to Physical Reasoning Tasks | http://arxiv.org/abs/2005.00730v2 |
ESPnet-ST: All-in-One Speech Translation Toolkit | http://arxiv.org/abs/2004.10234v2 |
ETC: Encoding Long and Structured Inputs in Transformers | http://arxiv.org/abs/2004.08483v5 |
EXP4-DFDC: A Non-Stochastic Multi-Armed Bandit for Cache Replacement | http://arxiv.org/abs/2009.11330v2 |
Early Disease Diagnosis for Rice Crop | http://arxiv.org/abs/2004.04775v1 |
Easy-First Dependency Parsing with Hierarchical Tree LSTMs | http://arxiv.org/abs/1603.00375v2 |
Ecological Semantics: Programming Environments for Situated Language Understanding | http://arxiv.org/abs/2003.04567v2 |
EditNTS: An Neural Programmer-Interpreter Model for Sentence Simplification through Explicit Editing | http://arxiv.org/abs/1906.08104v1 |
Educating Text Autoencoders: Latent Representation Guidance via Denoising | http://arxiv.org/abs/1905.12777v3 |
Effective Approaches to Attention-based Neural Machine Translation | http://arxiv.org/abs/1508.04025v5 |
Effective Estimation of Deep Generative Language Models | http://arxiv.org/abs/1904.08194v3 |
Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models | http://arxiv.org/abs/2010.01739v1 |
Effectiveness of MPC-friendly Softmax Replacement | http://arxiv.org/abs/2011.11202v1 |
Efficient Competitive Self-Play Policy Optimization | http://arxiv.org/abs/2009.06086v1 |
Efficient Constituency Parsing by Pointing | http://arxiv.org/abs/2006.13557v1 |
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling | http://arxiv.org/abs/1804.07827v2 |
Efficient Continuous Pareto Exploration in Multi-Task Learning | http://arxiv.org/abs/2006.16434v2 |
Efficient Deployment of Conversational Natural Language Interfaces over Databases | http://arxiv.org/abs/2006.00591v2 |
Efficient Dialogue State Tracking by Selectively Overwriting Memory | http://arxiv.org/abs/1911.03906v2 |
Efficient Distributed Hessian Free Algorithm for Large-scale Empirical Risk Minimization via Accumulating Sample Strategy | http://arxiv.org/abs/1810.11507v2 |
Efficient Domain Generalization via Common-Specific Low-Rank Decomposition | http://arxiv.org/abs/2003.12815v2 |
Efficient EUD Parsing | http://arxiv.org/abs/2006.00838v1 |
Efficient Estimation of Influence of a Training Instance | http://arxiv.org/abs/2012.04207v1 |
Efficient Inference For Neural Machine Translation | http://arxiv.org/abs/2010.02416v2 |
Efficient Intent Detection with Dual Sentence Encoders | http://arxiv.org/abs/2003.04807v1 |
Efficient Intervention Design for Causal Discovery with Latents | http://arxiv.org/abs/2005.11736v2 |
Efficient Low-rank Multimodal Fusion with Modality-Specific Factors | http://arxiv.org/abs/1806.00064v1 |
Efficient Meta Lifelong-Learning with Limited Memory | http://arxiv.org/abs/2010.02500v1 |
Efficient One-Pass End-to-End Entity Linking for Questions | http://arxiv.org/abs/2010.02413v1 |
Efficient Online Scalar Annotation with Bounded Support | http://arxiv.org/abs/1806.01170v1 |
Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation | http://arxiv.org/abs/2007.06482v1 |
Efficient Parameter Estimation of Truncated Boolean Product Distributions | http://arxiv.org/abs/2007.02392v1 |
Efficient Planning under Partial Observability with Unnormalized Q Functions and Spectral Learning | http://arxiv.org/abs/1911.05010v2 |
Efficient Privacy-Preserving Stochastic Nonconvex Optimization | http://arxiv.org/abs/1910.13659v2 |
Efficient Proximal Mapping of the 1-path-norm of Shallow Networks | http://arxiv.org/abs/2007.01003v2 |
Efficient Reservoir Management through Deep Reinforcement Learning | http://arxiv.org/abs/2012.03822v1 |
Efficient Robustness Certificates for Discrete Data: Sparsity-Aware Randomized Smoothing for Graphs, Images and More | http://arxiv.org/abs/2008.12952v1 |
Efficient Second-Order TreeCRF for Neural Dependency Parsing | http://arxiv.org/abs/2005.00975v2 |
Efficient allocation of law enforcement resources using predictive police patrolling | http://arxiv.org/abs/1811.12880v1 |
Efficient and Robust Algorithms for Adversarial Linear Contextual Bandits | http://arxiv.org/abs/2002.00287v2 |
Efficient and Scalable Bayesian Neural Nets with Rank-1 Factors | http://arxiv.org/abs/2005.07186v2 |
Efficient improper learning for online logistic regression | http://arxiv.org/abs/2003.08109v3 |
Efficient non-conjugate Gaussian process factor models for spike count data using polynomial approximations | http://arxiv.org/abs/1906.03318v2 |
Efficient strategies for hierarchical text classification: External knowledge and auxiliary tasks | http://arxiv.org/abs/2005.02473v2 |
Efficient, Noise-Tolerant, and Private Learning via Boosting | http://arxiv.org/abs/2002.01100v1 |
Efficiently Learning Adversarially Robust Halfspaces with Noise | http://arxiv.org/abs/2005.07652v1 |
Efficiently Sampling Functions from Gaussian Process Posteriors | http://arxiv.org/abs/2002.09309v4 |
Efficiently Solving MDPs with Stochastic Mirror Descent | http://arxiv.org/abs/2008.12776v1 |
Egoshots, an ego-vision life-logging dataset and semantic fidelity metric to evaluate diversity in image captioning models | http://arxiv.org/abs/2003.11743v2 |
Einsum Networks: Fast and Scalable Learning of Tractable Probabilistic Circuits | http://arxiv.org/abs/2004.06231v1 |
Embarrassingly Simple Unsupervised Aspect Extraction | http://arxiv.org/abs/2004.13580v1 |
Embedding Multimodal Relational Data for Knowledge Base Completion | http://arxiv.org/abs/1809.01341v2 |
Embedding Words in Non-Vector Space with Unsupervised Graph Learning | http://arxiv.org/abs/2010.02598v1 |
Embedding time expressions for deep temporal ordering models | http://arxiv.org/abs/1906.08287v1 |
Embedding-based Scientific Literature Discovery in a Text Editor Application | http://arxiv.org/abs/2005.04961v1 |
Embeddings of Label Components for Sequence Labeling: A Case Study of Fine-grained Named Entity Recognition | http://arxiv.org/abs/2006.01372v2 |
Emergence of Syntax Needs Minimal Supervision | http://arxiv.org/abs/2005.01119v1 |
Emergent Road Rules In Multi-Agent Driving Environments | http://arxiv.org/abs/2011.10753v1 |
Emerging Cross-lingual Structure in Pretrained Language Models | http://arxiv.org/abs/1911.01464v3 |
Empirical Study of the Benefits of Overparameterization in Learning Latent Variable Models | http://arxiv.org/abs/1907.00030v3 |
Empower Entity Set Expansion via Language Model Probing | http://arxiv.org/abs/2004.13897v2 |
Empowering Active Learning to Jointly Optimize System and User Demands | http://arxiv.org/abs/2005.04470v2 |
Enabling Language Models to Fill in the Blanks | http://arxiv.org/abs/2005.05339v2 |
Encoder-Decoder Models Can Benefit from Pre-trained Masked Language Models in Grammatical Error Correction | http://arxiv.org/abs/2005.00987v2 |
Encoder-decoder neural network for solving the nonlinear Fokker-Planck-Landau collision operator in XGC | http://arxiv.org/abs/2009.06534v2 |
Encoding Musical Style with Transformer Autoencoders | http://arxiv.org/abs/1912.05537v2 |
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling | http://arxiv.org/abs/1703.04826v4 |
Encoding Source Language with Convolutional Neural Network for Machine Translation | http://arxiv.org/abs/1503.01838v5 |
Encodings of Source Syntax: Similarities in NMT Representations Across Target Languages | http://arxiv.org/abs/2005.08177v1 |
End to End Binarized Neural Networks for Text Classification | http://arxiv.org/abs/2010.05223v1 |
End-to-End Bias Mitigation by Modelling Biases in Corpora | http://arxiv.org/abs/1909.06321v3 |
End-to-End Neural Word Alignment Outperforms GIZA++ | http://arxiv.org/abs/2004.14675v1 |
End-to-End Slot Alignment and Recognition for Cross-Lingual NLU | http://arxiv.org/abs/2004.14353v2 |
End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems | http://arxiv.org/abs/2010.06028v1 |
End-to-end ASR: from Supervised to Semi-Supervised Learning with Modern Architectures | http://arxiv.org/abs/1911.08460v3 |
End-to-end Graph-based TAG Parsing with Neural Networks | http://arxiv.org/abs/1804.06610v3 |
End-to-end Neural Coreference Resolution | http://arxiv.org/abs/1707.07045v2 |
Energy and Policy Considerations for Deep Learning in NLP | http://arxiv.org/abs/1906.02243v1 |
Energy-Based Continuous Inverse Optimal Control | http://arxiv.org/abs/1904.05453v4 |
Energy-Based Processes for Exchangeable Data | http://arxiv.org/abs/2003.07521v2 |
Energy-based Surprise Minimization for Multi-Agent Value Factorization | http://arxiv.org/abs/2009.09842v3 |
Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions | http://arxiv.org/abs/2003.08536v2 |
Enhanced Universal Dependency Parsing with Second-Order Inference and Mixture of Training Data | http://arxiv.org/abs/2006.01414v2 |
Enhancing Answer Boundary Detection for Multilingual Machine Reading Comprehension | http://arxiv.org/abs/2004.14069v2 |
Enhancing Drug-Drug Interaction Extraction from Texts by Molecular Structure Information | http://arxiv.org/abs/1805.05593v1 |
Enhancing Machine Translation with Dependency-Aware Self-Attention | http://arxiv.org/abs/1909.03149v3 |
Enhancing Pre-trained Chinese Character Representation with Word-aligned Attention | http://arxiv.org/abs/1911.02821v2 |
Enhancing Simple Models by Exploiting What They Already Know | http://arxiv.org/abs/1905.13565v3 |
Enhancing Stratospheric Weather Analyses and Forecasts by Deploying Sensors from a Weather Balloon | http://arxiv.org/abs/1912.02276v1 |
Enhancing Word Embeddings with Knowledge Extracted from Lexical Resources | http://arxiv.org/abs/2005.10048v1 |
Enriched In-Order Linearization for Faster Sequence-to-Sequence Constituent Parsing | http://arxiv.org/abs/2005.13334v1 |
Enriching Word Embeddings with Temporal and Spatial Information | http://arxiv.org/abs/2010.00761v1 |
Enriching Word Vectors with Subword Information | http://arxiv.org/abs/1607.04606v2 |
Entities as Experts: Sparse Memory Access with Entity Supervision | http://arxiv.org/abs/2004.07202v2 |
Entity Commonsense Representation for Neural Abstractive Summarization | http://arxiv.org/abs/1806.05504v1 |
Entity Linking for Queries by Searching Wikipedia Sentences | http://arxiv.org/abs/1704.02788v3 |
Entity Linking in 100 Languages | http://arxiv.org/abs/2011.02690v1 |
Entity Recognition at First Sight: Improving NER with Eye Movement Information | http://arxiv.org/abs/1902.10068v2 |
Entity-Enriched Neural Models for Clinical Question Answering | http://arxiv.org/abs/2005.06587v1 |
Entropy Minimization In Emergent Languages | http://arxiv.org/abs/1905.13687v3 |
Episodic Memory Reader: Learning What to Remember for Question Answering from Streaming Data | http://arxiv.org/abs/1903.06164v3 |
Equalized odds postprocessing under imperfect group information | http://arxiv.org/abs/1906.03284v3 |
Equivariant Hamiltonian Flows | http://arxiv.org/abs/1909.13739v1 |
Equivariant Neural Rendering | http://arxiv.org/abs/2006.07630v2 |
Error Estimation for Sketched SVD via the Bootstrap | http://arxiv.org/abs/2003.04937v1 |
Error bounds in estimating the out-of-sample prediction error using leave-one-out cross validation in high-dimensions | http://arxiv.org/abs/2003.01770v1 |
Error-Bounded Correction of Noisy Labels | http://arxiv.org/abs/2011.10077v1 |
Estimating Grape Yield on the Vine from Multiple Images | http://arxiv.org/abs/2004.04278v1 |
Estimating Principal Components under Adversarial Perturbations | http://arxiv.org/abs/2006.00602v2 |
Estimating Q(s,s') with Deep Deterministic Dynamics Gradients | http://arxiv.org/abs/2002.09505v2 |
Estimating localized complexity of white-matter wiring with GANs | http://arxiv.org/abs/1910.04868v2 |
Estimating predictive uncertainty for rumour verification models | http://arxiv.org/abs/2005.07174v1 |
Estimating the number and effect sizes of non-null hypotheses | http://arxiv.org/abs/2002.07297v2 |
Estimation and Inference with Trees and Forests in High Dimensions | http://arxiv.org/abs/2007.03210v2 |
Estimation of Bounds on Potential Outcomes For Decision Making | http://arxiv.org/abs/1910.04817v4 |
Evaluating Agents without Rewards | http://arxiv.org/abs/2012.11538v1 |
Evaluating Amharic Machine Translation | http://arxiv.org/abs/2003.14386v1 |
Evaluating Attribution Methods using White-Box LSTMs | http://arxiv.org/abs/2010.08606v1 |
Evaluating Dialogue Generation Systems via Response Selection | http://arxiv.org/abs/2004.14302v1 |
Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior? | http://arxiv.org/abs/2005.01831v1 |
Evaluating Explanation Methods for Neural Machine Translation | http://arxiv.org/abs/2005.01672v1 |
Evaluating Gender Bias in Machine Translation | http://arxiv.org/abs/1906.00591v1 |
Evaluating Logical Generalization in Graph Neural Networks | http://arxiv.org/abs/2003.06560v1 |
Evaluating Lossy Compression Rates of Deep Generative Models | http://arxiv.org/abs/2008.06653v1 |
Evaluating Neural Morphological Taggers for Sanskrit | http://arxiv.org/abs/2005.10893v1 |
Evaluating Robustness to Input Perturbations for Neural Machine Translation | http://arxiv.org/abs/2005.00580v1 |
Evaluating Theory of Mind in Question Answering | http://arxiv.org/abs/1808.09352v1 |
Evaluating and Characterizing Human Rationales | http://arxiv.org/abs/2010.04736v1 |
Evaluating the Calibration of Knowledge Graph Embeddings for Trustworthy Link Prediction | http://arxiv.org/abs/2004.01168v3 |
Evaluating the Factual Consistency of Abstractive Text Summarization | http://arxiv.org/abs/1910.12840v1 |
Evaluating the Utility of Hand-crafted Features in Sequence Labelling | http://arxiv.org/abs/1808.09075v1 |
Evaluation of Model Selection for Kernel Fragment Recognition in Corn Silage | http://arxiv.org/abs/2004.00292v1 |
Event Extraction by Answering (Almost) Natural Questions | http://arxiv.org/abs/2004.13625v1 |
Every Document Owns Its Structure: Inductive Text Classification via Graph Neural Networks | http://arxiv.org/abs/2004.13826v2 |
Evidence-Aware Inferential Text Generation with Vector Quantised Variational AutoEncoder | http://arxiv.org/abs/2006.08101v1 |
Evolution-based Fine-tuning of CNNs for Prostate Cancer Detection | http://arxiv.org/abs/1911.01477v1 |
EvolveGraph: Multi-Agent Trajectory Prediction with Dynamic Relational Reasoning | http://arxiv.org/abs/2003.13924v4 |
Evolving Reinforcement Learning Algorithms | http://arxiv.org/abs/2101.03958v1 |
Examination and Extension of Strategies for Improving Personalized Language Modeling via Interpolation | http://arxiv.org/abs/2006.05469v1 |
Examining Citations of Natural Language Processing Literature | http://arxiv.org/abs/2005.00912v1 |
Examining the State-of-the-Art in News Timeline Summarization | http://arxiv.org/abs/2005.10107v1 |
Exclusive Hierarchical Decoding for Deep Keyphrase Generation | http://arxiv.org/abs/2004.08511v1 |
ExpBERT: Representation Engineering with Natural Language Explanations | http://arxiv.org/abs/2005.01932v1 |
Experience Grounds Language | http://arxiv.org/abs/2004.10151v3 |
Experimental Evaluation and Development of a Silver-Standard for the MIMIC-III Clinical Coding Dataset | http://arxiv.org/abs/2006.07332v1 |
Expertise Style Transfer: A New Task Towards Better Communication between Experts and Laymen | http://arxiv.org/abs/2005.00701v1 |
Explainable Automated Fact-Checking for Public Health Claims | http://arxiv.org/abs/2010.09926v1 |
Explainable and Discourse Topic-aware Neural Language Understanding | http://arxiv.org/abs/2006.10632v2 |
Explaining Black Box Predictions and Unveiling Data Artifacts through Influence Functions | http://arxiv.org/abs/2005.06676v1 |
Explaining Character-Aware Neural Networks for Word-Level Prediction: Do They Discover Linguistic Rules? | http://arxiv.org/abs/1808.09551v1 |
Explaining Groups of Points in Low-Dimensional Representations | http://arxiv.org/abs/2003.01640v3 |
Explaining the Explainer: A First Theoretical Analysis of LIME | http://arxiv.org/abs/2001.03447v2 |
Explanation Augmented Feedback in Human-in-the-Loop Reinforcement Learning | http://arxiv.org/abs/2006.14804v3 |
Explicit Mean-Square Error Bounds for Monte-Carlo and Linear Stochastic Approximation | http://arxiv.org/abs/2002.02584v1 |
Explicit Memory Tracker with Coarse-to-Fine Reasoning for Conversational Machine Reading | http://arxiv.org/abs/2005.12484v2 |
Exploiting Categorical Structure Using Tree-Based Methods | http://arxiv.org/abs/2004.07383v1 |
Exploiting Cross-Sentence Context for Neural Machine Translation | http://arxiv.org/abs/1704.04347v3 |
Exploiting Deep Representations for Neural Machine Translation | http://arxiv.org/abs/1810.10181v1 |
Exploiting Domain Knowledge via Grouped Weight Sharing with Application to Text Categorization | http://arxiv.org/abs/1702.02535v3 |
Exploiting Explicit Paths for Multi-hop Reading Comprehension | http://arxiv.org/abs/1811.01127v2 |
Exploiting Rich Syntactic Information for Semantic Parsing with Graph-to-Sequence Model | http://arxiv.org/abs/1808.07624v1 |
Exploiting Sentence Order in Document Alignment | http://arxiv.org/abs/2004.14523v2 |
Exploiting Structured Knowledge in Text via Graph-Guided Representation Learning | http://arxiv.org/abs/2004.14224v1 |
Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach | http://arxiv.org/abs/2005.05864v1 |
Exploration by Optimisation in Partial Monitoring | http://arxiv.org/abs/1907.05772v3 |
Exploratory Analysis of COVID-19 Related Tweets in North America to Inform Public Health Institutes | http://arxiv.org/abs/2007.02452v1 |
Explore, Discover and Learn: Unsupervised Discovery of State-Covering Skills | http://arxiv.org/abs/2002.03647v4 |
Explore, Propose, and Assemble: An Interpretable Model for Multi-Hop Reading Comprehension | http://arxiv.org/abs/1906.05210v1 |
Exploring Author Context for Detecting Intended vs Perceived Sarcasm | http://arxiv.org/abs/1910.11932v1 |
Exploring Content Selection in Summarization of Novel Chapters | http://arxiv.org/abs/2005.01840v2 |
Exploring Contextual Word-level Style Relevance for Unsupervised Style Transfer | http://arxiv.org/abs/2005.02049v1 |
Exploring Contextualized Neural Language Models for Temporal Dependency Parsing | http://arxiv.org/abs/2004.14577v2 |
Exploring Exploration: Comparing Children with RL Agents in Unified Environments | http://arxiv.org/abs/2005.02880v2 |
Exploring Phoneme-Level Speech Representations for End-to-End Speech Translation | http://arxiv.org/abs/1906.01199v1 |
Exploring Recombination for Efficient Decoding of Neural Machine Translation | http://arxiv.org/abs/1808.08482v2 |
Exploring Semantic Capacity of Terms | http://arxiv.org/abs/2010.01898v1 |
Exploring Weaknesses of VQA Models through Attribution Driven Insights | http://arxiv.org/abs/2006.06637v2 |
Exploring and Predicting Transferability across NLP Tasks | http://arxiv.org/abs/2005.00770v2 |
Exploring aspects of similarity between spoken personal narratives by disentangling them into narrative clause types | http://arxiv.org/abs/2005.12762v2 |
Exploring the Linear Subspace Hypothesis in Gender Bias Mitigation | http://arxiv.org/abs/2009.09435v2 |
Exploring the Role of Argument Structure in Online Debate Persuasion | http://arxiv.org/abs/2010.03538v1 |
Exploring the Role of Prior Beliefs for Argument Persuasion | http://arxiv.org/abs/1906.11301v1 |
Exposing Shallow Heuristics of Relation Extraction Models with Challenge Data | http://arxiv.org/abs/2010.03656v1 |
Expressing Visual Relationships via Language | http://arxiv.org/abs/1906.07689v2 |
Expressive Interviewing: A Conversational System for Coping with COVID-19 | http://arxiv.org/abs/2007.03819v1 |
Expressiveness and Learning of Hidden Quantum Markov Models | http://arxiv.org/abs/1912.02098v1 |
Extending Implicit Discourse Relation Recognition to the PDTB-3 | http://arxiv.org/abs/2010.06294v1 |
Extending a Parser to Distant Domains Using a Few Dozen Partially Annotated Examples | http://arxiv.org/abs/1805.06556v1 |
Extensively Matching for Few-shot Learning Event Detection | http://arxiv.org/abs/2006.10093v1 |
Extract and Edit: An Alternative to Back-Translation for Unsupervised Neural Machine Translation | http://arxiv.org/abs/1904.02331v1 |
Extracting Headless MWEs from Dependency Parse Trees: Parsing, Tagging, and Joint Modeling Approaches | http://arxiv.org/abs/2005.03035v1 |
Extracting Implicitly Asserted Propositions in Argumentation | http://arxiv.org/abs/2010.02654v1 |
Extracting Symptoms and their Status from Clinical Conversations | http://arxiv.org/abs/1906.02239v1 |
Extractive Summarization as Text Matching | http://arxiv.org/abs/2004.08795v1 |
Extragradient with player sampling for faster Nash equilibrium finding | http://arxiv.org/abs/1905.12363v5 |
Extrapolating the profile of a finite population | http://arxiv.org/abs/2005.10561v1 |
Extreme Multi-label Classification from Aggregated Labels | http://arxiv.org/abs/2004.00198v1 |
FEQA: A Question Answering Evaluation Framework for Faithfulness Assessment in Abstractive Summarization | http://arxiv.org/abs/2005.03754v1 |
FFR V1.0: Fon-French Neural Machine Translation | http://arxiv.org/abs/2003.12111v1 |
FFR v1.1: Fon-French Neural Machine Translation | http://arxiv.org/abs/2006.09217v1 |
FIESTA: Fast IdEntification of State-of-The-Art models using adaptive bandit algorithms | http://arxiv.org/abs/1906.12230v1 |
FLAT: Chinese NER Using Flat-Lattice Transformer | http://arxiv.org/abs/2004.11795v2 |
F^2-Softmax: Diversifying Neural Text Generation via Frequency Factorized Softmax | http://arxiv.org/abs/2009.09417v2 |
Facebook AI's WMT20 News Translation Task Submission | http://arxiv.org/abs/2011.08298v1 |
Facet-Aware Evaluation for Extractive Summarization | http://arxiv.org/abs/1908.10383v2 |
Facilitating the Communication of Politeness through Fine-Grained Paraphrasing | http://arxiv.org/abs/2012.00012v1 |
Fact or Fiction: Verifying Scientific Claims | http://arxiv.org/abs/2004.14974v6 |
Fact-based Text Editing | http://arxiv.org/abs/2007.00916v1 |
Factorising AMR generation through syntax | http://arxiv.org/abs/1804.07707v2 |
Factual Error Correction for Abstractive Summarization Models | http://arxiv.org/abs/2010.08712v1 |
Fair Bayesian Optimization | http://arxiv.org/abs/2006.05109v1 |
Fair Correlation Clustering | http://arxiv.org/abs/2002.02274v2 |
Fair Decisions Despite Imperfect Predictions | http://arxiv.org/abs/1902.02979v4 |
Fair Embedding Engine: A Library for Analyzing and Mitigating Gender Bias in Word Embeddings | http://arxiv.org/abs/2010.13168v1 |
Fair Generative Modeling via Weak Supervision | http://arxiv.org/abs/1910.12008v2 |
Fair Learning with Private Demographic Data | http://arxiv.org/abs/2002.11651v2 |
Fairness in the Eyes of the Data: Certifying Machine-Learning Models | http://arxiv.org/abs/2009.01534v1 |
Fairwashing Explanations with Off-Manifold Detergent | http://arxiv.org/abs/2007.09969v1 |
Familywise Error Rate Control by Interactive Unmasking | http://arxiv.org/abs/2002.08545v3 |
Fast Adaptation via Policy-Dynamics Value Functions | http://arxiv.org/abs/2007.02879v1 |
Fast Algorithms for Computational Optimal Transport and Wasserstein Barycenter | http://arxiv.org/abs/1905.09952v4 |
Fast Differentiable Sorting and Ranking | http://arxiv.org/abs/2002.08871v2 |
Fast Interleaved Bidirectional Sequence Generation | http://arxiv.org/abs/2010.14481v1 |
Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case | http://arxiv.org/abs/2006.14117v1 |
Fast Linear Convergence of Randomized BFGS | http://arxiv.org/abs/2002.11337v3 |
Fast Markov Chain Monte Carlo Algorithms via Lie Groups | http://arxiv.org/abs/1901.08606v2 |
Fast OSCAR and OWL Regression via Safe Screening Rules | http://arxiv.org/abs/2006.16433v1 |
Fast Physical Activity Suggestions: Efficient Hyperparameter Learning in Mobile Health | http://arxiv.org/abs/2012.11646v1 |
Fast Rates for Online Prediction with Abstention | http://arxiv.org/abs/2001.10623v2 |
Fast and Accurate Deep Bidirectional Language Representations for Unsupervised Learning | http://arxiv.org/abs/2004.08097v1 |
Fast and Furious Convergence: Stochastic Second Order Methods under Interpolation | http://arxiv.org/abs/1910.04920v2 |
Fast and Scalable Expansion of Natural Language Understanding Functionality for Intelligent Agents | http://arxiv.org/abs/1805.01542v1 |
Fast semantic parsing with well-typedness guarantees | http://arxiv.org/abs/2009.07365v2 |
Fast(er) Exact Decoding and Global Training for Transition-Based Dependency Parsing via a Minimal Feature Set | http://arxiv.org/abs/1708.09403v1 |
Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees | http://arxiv.org/abs/1608.04465v1 |
FastBERT: a Self-distilling BERT with Adaptive Inference Time | http://arxiv.org/abs/2004.02178v2 |
FastFormers: Highly Efficient Transformer Models for Natural Language Understanding | http://arxiv.org/abs/2010.13382v1 |
Faster Graph Embeddings via Coarsening | http://arxiv.org/abs/2007.02817v3 |
Faster Projection-free Online Learning | http://arxiv.org/abs/2001.11568v2 |
Feature Adaptation of Pre-Trained Language Models across Languages and Domains with Robust Self-Training | http://arxiv.org/abs/2009.11538v3 |
Feature Noise Induces Loss Discrepancy Across Groups | http://arxiv.org/abs/1911.09876v2 |
Feature Quantization Improves GAN Training | http://arxiv.org/abs/2004.02088v2 |
Feature Selection using Stochastic Gates | http://arxiv.org/abs/1810.04247v7 |
Feature relevance quantification in explainable AI: A causal problem | http://arxiv.org/abs/1910.13413v2 |
Feature-map-level Online Adversarial Knowledge Distillation | http://arxiv.org/abs/2002.01775v3 |
FedPAQ: A Communication-Efficient Federated Learning Method with Periodic Averaging and Quantization | http://arxiv.org/abs/1909.13014v4 |
Federated Heavy Hitters Discovery with Differential Privacy | http://arxiv.org/abs/1902.08534v4 |
Federated Learning with Only Positive Labels | http://arxiv.org/abs/2004.10342v1 |
Fenchel Lifted Networks: A Lagrange Relaxation of Neural Network Training | http://arxiv.org/abs/1811.08039v3 |
FetchSGD: Communication-Efficient Federated Learning with Sketching | http://arxiv.org/abs/2007.07682v2 |
Few-Shot Complex Knowledge Base Question Answering via Meta Reinforcement Learning | http://arxiv.org/abs/2010.15877v1 |
Few-Shot Learning for Opinion Summarization | http://arxiv.org/abs/2004.14884v3 |
Few-Shot NLG with Pre-Trained Language Model | http://arxiv.org/abs/1904.09521v3 |
Few-shot Domain Adaptation by Causal Mechanism Transfer | http://arxiv.org/abs/2002.03497v2 |
Few-shot Relation Extraction via Bayesian Meta-learning on Relation Graphs | http://arxiv.org/abs/2007.02387v1 |
Few-shot Slot Tagging with Collapsed Dependency Transfer and Label-enhanced Task-adaptive Projection Network | http://arxiv.org/abs/2006.05702v1 |
Few-shot link prediction via graph neural networks for Covid-19 drug-repurposing | http://arxiv.org/abs/2007.10261v1 |
FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation | http://arxiv.org/abs/1810.10147v2 |
Fiduciary Bandits | http://arxiv.org/abs/1905.07043v3 |
Fiedler Regularization: Learning Neural Networks with Graph Sparsity | http://arxiv.org/abs/2003.00992v3 |
Field-Level Crop Type Classification with k Nearest Neighbors: A Baseline for a New Kenya Smallholder Dataset | http://arxiv.org/abs/2004.03023v1 |
Fill in the BLANC: Human-free quality estimation of document summaries | http://arxiv.org/abs/2002.09836v2 |
Filling Missing Paths: Modeling Co-occurrences of Word Pairs and Dependency Paths for Recognizing Lexical Semantic Relations | http://arxiv.org/abs/1809.03411v1 |
Filtering Noisy Dialogue Corpora by Connectivity and Content Relatedness | http://arxiv.org/abs/2004.14008v2 |
FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance | http://arxiv.org/abs/2011.09607v1 |
Finding Convincing Arguments Using Scalable Bayesian Preference Learning | http://arxiv.org/abs/1806.02418v1 |
Finding Syntax in Human Encephalography with Beam Search | http://arxiv.org/abs/1806.04127v1 |
Finding Universal Grammatical Relations in Multilingual BERT | http://arxiv.org/abs/2005.04511v2 |
Finding Your Voice: The Linguistic Development of Mental Health Counselors | http://arxiv.org/abs/1906.07194v1 |
Finding trainable sparse networks through Neural Tangent Transfer | http://arxiv.org/abs/2006.08228v2 |
Fine Grained Citation Span for References in Wikipedia | http://arxiv.org/abs/1707.07278v1 |
Fine-Grained Analysis of Cross-Linguistic Syntactic Divergences | http://arxiv.org/abs/2005.03436v2 |
Fine-Grained Prediction of Syntactic Typology: Discovering Latent Structure with Supervised Learning | http://arxiv.org/abs/1710.03877v1 |
Fine-Grained Temporal Relation Extraction | http://arxiv.org/abs/1902.01390v2 |
Fine-grained Fact Verification with Kernel Graph Attention Network | http://arxiv.org/abs/1910.09796v3 |
Fine-grained linguistic evaluation for state-of-the-art Machine Translation | http://arxiv.org/abs/2010.06359v2 |
Finite Regret and Cycles with Fixed Step-Size via Alternating Gradient Descent-Ascent | http://arxiv.org/abs/1907.04392v1 |
Finite Time Analysis of Linear Two-timescale Stochastic Approximation with Markovian Noise | http://arxiv.org/abs/2002.01268v1 |
Finite-Sample Analysis of Decentralized Temporal-Difference Learning with Linear Function Approximation | http://arxiv.org/abs/1911.00934v2 |
Finite-Time Analysis of Asynchronous Stochastic Approximation and | http://arxiv.org/abs/2002.00260v1 |
Finite-Time Last-Iterate Convergence for Multi-Agent Learning in Games | http://arxiv.org/abs/2002.09806v4 |
Fixed-Confidence Guarantees for Bayesian Best-Arm Identification | http://arxiv.org/abs/1910.10945v3 |
Flexible and Efficient Long-Range Planning Through Curious Exploration | http://arxiv.org/abs/2004.10876v2 |
Flexible retrieval with NMSLIB and FlexNeuART | http://arxiv.org/abs/2010.14848v2 |
Flow Models for Arbitrary Conditional Likelihoods | http://arxiv.org/abs/1909.06319v2 |
Fluent Response Generation for Conversational Question Answering | http://arxiv.org/abs/2005.10464v2 |
Forecasting Sequential Data using Consistent Koopman Autoencoders | http://arxiv.org/abs/2003.02236v2 |
Formal Limitations on the Measurement of Mutual Information | http://arxiv.org/abs/1811.04251v4 |
Fortification of Neural Morphological Segmentation Models for Polysynthetic Minimal-Resource Languages | http://arxiv.org/abs/1804.06024v1 |
Fortifying Toxic Speech Detectors Against Veiled Toxicity | http://arxiv.org/abs/2010.03154v1 |
Fractal Gaussian Networks: A sparse random graph model based on Gaussian Multiplicative Chaos | http://arxiv.org/abs/2008.03038v1 |
Fractional Underdamped Langevin Dynamics: Retargeting SGD with Momentum under Heavy-Tailed Gradient Noise | http://arxiv.org/abs/2002.05685v2 |
Free Energy Wells and Overlap Gap Property in Sparse PCA | http://arxiv.org/abs/2006.10689v1 |
Frequency Bias in Neural Networks for Input of Non-Uniform Density | http://arxiv.org/abs/2003.04560v1 |
Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions | http://arxiv.org/abs/2006.13707v2 |
Friendships, Rivalries, and Trysts: Characterizing Relations between Ideas in Texts | http://arxiv.org/abs/1704.07828v2 |
From Arguments to Key Points: Towards Automatic Argument Summarization | http://arxiv.org/abs/2005.01619v2 |
From Data to Decisions: Distributionally Robust Optimization is Optimal | http://arxiv.org/abs/1704.04118v3 |
From Dataset Recycling to Multi-Property Extraction and Beyond | http://arxiv.org/abs/2011.03228v1 |
From English to Code-Switching: Transfer Learning with Strong Morphological Clues | http://arxiv.org/abs/1909.05158v3 |
From ImageNet to Image Classification: Contextualizing Progress on Benchmarks | http://arxiv.org/abs/2005.11295v1 |
From Importance Sampling to Doubly Robust Policy Gradient | http://arxiv.org/abs/1910.09066v3 |
From Language to Programs: Bridging Reinforcement Learning and Maximum Marginal Likelihood | http://arxiv.org/abs/1704.07926v1 |
From Machine Reading Comprehension to Dialogue State Tracking: Bridging the Gap | http://arxiv.org/abs/2004.05827v1 |
From Nesterov's Estimate Sequence to Riemannian Acceleration | http://arxiv.org/abs/2001.08876v1 |
From PAC to Instance-Optimal Sample Complexity in the Plackett-Luce Model | http://arxiv.org/abs/1903.00558v2 |
From Paraphrase Database to Compositional Paraphrase Model and Back | http://arxiv.org/abs/1506.03487v2 |
From Predictions to Decisions: Using Lookahead Regularization | http://arxiv.org/abs/2006.11638v2 |
From Speech-to-Speech Translation to Automatic Dubbing | http://arxiv.org/abs/2001.06785v3 |
From tree matching to sparse graph alignment | http://arxiv.org/abs/2002.01258v2 |
Frowning Frodo, Wincing Leia, and a Seriously Great Friendship: Learning to Classify Emotional Relationships of Fictional Characters | http://arxiv.org/abs/1903.12453v2 |
Frustratingly Hard Evidence Retrieval for QA Over Books | http://arxiv.org/abs/2007.09878v1 |
Frustratingly Simple Few-Shot Object Detection | http://arxiv.org/abs/2003.06957v1 |
Fully Character-Level Neural Machine Translation without Explicit Segmentation | http://arxiv.org/abs/1610.03017v3 |
Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs | http://arxiv.org/abs/1901.08460v4 |
Fully Parallel Hyperparameter Search: Reshaped Space-Filling | http://arxiv.org/abs/1910.08406v2 |
Fully reversible neural networks for large-scale surface and sub-surface characterization via remote sensing | http://arxiv.org/abs/2003.07474v1 |
Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations | http://arxiv.org/abs/2002.04599v2 |
GANterpretations | http://arxiv.org/abs/2011.05158v1 |
GCAN: Graph-aware Co-Attention Networks for Explainable Fake News Detection on Social Media | http://arxiv.org/abs/2004.11648v1 |
GEAR: Graph-based Evidence Aggregating and Reasoning for Fact Verification | http://arxiv.org/abs/1908.01843v1 |
GNN-FiLM: Graph Neural Networks with Feature-wise Linear Modulation | http://arxiv.org/abs/1906.12192v5 |
GP-VAE: Deep Probabilistic Time Series Imputation | http://arxiv.org/abs/1907.04155v5 |
GRADE: Automatic Graph-Enhanced Coherence Metric for Evaluating Open-Domain Dialogue Systems | http://arxiv.org/abs/2010.03994v1 |
Gaining Insight into SARS-CoV-2 Infection and COVID-19 Severity Using Self-supervised Edge Features and Graph Neural Networks | http://arxiv.org/abs/2006.12971v2 |
Games for Fairness and Interpretability | http://arxiv.org/abs/2004.09551v1 |
Gamification of Pure Exploration for Linear Bandits | http://arxiv.org/abs/2007.00953v1 |
Gated Convolutional Bidirectional Attention-based Model for Off-topic Spoken Response Detection | http://arxiv.org/abs/2004.09036v4 |
Gaussian Mixture Latent Vector Grammars | http://arxiv.org/abs/1805.04688v1 |
Gaussian Sketching yields a J-L Lemma in RKHS | http://arxiv.org/abs/1908.05818v2 |
Gaussianization Flows | http://arxiv.org/abs/2003.01941v1 |
GenAug: Data Augmentation for Finetuning Text Generators | http://arxiv.org/abs/2010.01794v2 |
Gender Bias in Contextualized Word Embeddings | http://arxiv.org/abs/1904.03310v1 |
Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer | http://arxiv.org/abs/2005.00699v1 |
Gender Coreference and Bias Evaluation at WMT 2020 | http://arxiv.org/abs/2010.06018v1 |
Gender Gap in Natural Language Processing Research: Disparities in Authorship and Citations | http://arxiv.org/abs/2005.00962v2 |
Gender in Danger? Evaluating Speech Translation Technology on the MuST-SHE Corpus | http://arxiv.org/abs/2006.05754v1 |
Gender-preserving Debiasing for Pre-trained Word Embeddings | http://arxiv.org/abs/1906.00742v1 |
General Identification of Dynamic Treatment Regimes Under Interference | http://arxiv.org/abs/2004.01218v1 |
Generalisation error in learning with random features and the hidden manifold model | http://arxiv.org/abs/2002.09339v2 |
Generalization Error of Generalized Linear Models in High Dimensions | http://arxiv.org/abs/2005.00180v1 |
Generalization Guarantees for Sparse Kernel Approximation with Entropic Optimal Features | http://arxiv.org/abs/2002.04195v1 |
Generalization and Representational Limits of Graph Neural Networks | http://arxiv.org/abs/2002.06157v1 |
Generalization to New Actions in Reinforcement Learning | http://arxiv.org/abs/2011.01928v1 |
Generalized Data Augmentation for Low-Resource Translation | http://arxiv.org/abs/1906.03785v1 |
Generalized and Scalable Optimal Sparse Decision Trees | http://arxiv.org/abs/2006.08690v3 |
Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data | http://arxiv.org/abs/2002.12880v3 |
Generalizing Natural Language Analysis through Span-relation Representations | http://arxiv.org/abs/1911.03822v2 |
Generalizing Word Embeddings using Bag of Subwords | http://arxiv.org/abs/1809.04259v1 |
Generalizing and Hybridizing Count-based and Neural Language Models | http://arxiv.org/abs/1606.00499v2 |
Generate, Delete and Rewrite: A Three-Stage Framework for Improving Persona Consistency of Dialogue Generation | http://arxiv.org/abs/2004.07672v4 |
Generating Automatic Curricula via Self-Supervised Active Domain Randomization | http://arxiv.org/abs/2002.07911v2 |
Generating Counter Narratives against Online Hate Speech: Data and Strategies | http://arxiv.org/abs/2004.04216v1 |
Generating Dialogue Responses from a Semantic Latent Space | http://arxiv.org/abs/2010.01658v1 |
Generating Diverse Translation from Model Distribution with Dropout | http://arxiv.org/abs/2010.08178v1 |
Generating Diverse and Consistent QA pairs from Contexts with Information-Maximizing Hierarchical Conditional VAEs | http://arxiv.org/abs/2005.13837v5 |
Generating Fact Checking Briefs | http://arxiv.org/abs/2011.05448v1 |
Generating Fact Checking Explanations | http://arxiv.org/abs/2004.05773v1 |
Generating Fine-Grained Open Vocabulary Entity Type Descriptions | http://arxiv.org/abs/1805.10564v1 |
Generating Hierarchical Explanations on Text Classification via Feature Interaction Detection | http://arxiv.org/abs/2004.02015v3 |
Generating Image Descriptions via Sequential Cross-Modal Alignment Guided by Human Gaze | http://arxiv.org/abs/2011.04592v1 |
Generating Label Cohesive and Well-Formed Adversarial Claims | http://arxiv.org/abs/2009.08205v1 |
Generating Logical Forms from Graph Representations of Text and Entities | http://arxiv.org/abs/1905.08407v3 |
Generating Narrative Text in a Switching Dynamical System | http://arxiv.org/abs/2004.03762v1 |
Generating Negative Commonsense Knowledge | http://arxiv.org/abs/2011.07497v1 |
Generating Novel Glyph without Human Data by Learning to Communicate | http://arxiv.org/abs/2010.04402v2 |
Generating Question Relevant Captions to Aid Visual Question Answering | http://arxiv.org/abs/1906.00513v3 |
Generating Radiology Reports via Memory-driven Transformer | http://arxiv.org/abs/2010.16056v1 |
Generating Sentences by Editing Prototypes | http://arxiv.org/abs/1709.08878v2 |
Generating Summaries with Topic Templates and Structured Convolutional Decoders | http://arxiv.org/abs/1906.04687v1 |
Generating and Exploiting Large-scale Pseudo Training Data for Zero Pronoun Resolution | http://arxiv.org/abs/1606.01603v3 |
Generative Adversarial Imitation from Observation | http://arxiv.org/abs/1807.06158v4 |
Generative Adversarial User Privacy in Lossy Single-Server Information Retrieval | http://arxiv.org/abs/2012.03902v1 |
Generative Flows with Matrix Exponential | http://arxiv.org/abs/2007.09651v1 |
Generative ODE Modeling with Known Unknowns | http://arxiv.org/abs/2003.10775v1 |
Generative Semantic Hashing Enhanced via Boltzmann Machines | http://arxiv.org/abs/2006.08858v1 |
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data | http://arxiv.org/abs/1912.07768v1 |
Geometric Dataset Distances via Optimal Transport | http://arxiv.org/abs/2002.02923v1 |
Geometry-aware Domain Adaptation for Unsupervised Alignment of Word Embeddings | http://arxiv.org/abs/2004.08243v2 |
Geoopt: Riemannian Optimization in PyTorch | http://arxiv.org/abs/2005.02819v5 |
Getting a CLUE: A Method for Explaining Uncertainty Estimates | http://arxiv.org/abs/2006.06848v1 |
Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis? | http://arxiv.org/abs/2005.13213v1 |
Giving Attention to the Unexpected: Using Prosody Innovations in Disfluency Detection | http://arxiv.org/abs/1904.04388v1 |
Global Neural CCG Parsing with Optimality Guarantees | http://arxiv.org/abs/1607.01432v2 |
Global-to-Local Neural Networks for Document-Level Relation Extraction | http://arxiv.org/abs/2009.10359v1 |
Globally Normalized Reader | http://arxiv.org/abs/1709.02828v1 |
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks | http://arxiv.org/abs/2007.00811v2 |
Good Subnetworks Provably Exist: Pruning via Greedy Forward Selection | http://arxiv.org/abs/2003.01794v3 |
Good-Enough Compositional Data Augmentation | http://arxiv.org/abs/1904.09545v4 |
GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing | http://arxiv.org/abs/2009.13845v1 |
Gradient Based Memory Editing for Task-Free Continual Learning | http://arxiv.org/abs/2006.15294v1 |
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks | http://arxiv.org/abs/1903.11680v3 |
Gradient Temporal-Difference Learning with Regularized Corrections | http://arxiv.org/abs/2007.00611v4 |
Gradient descent algorithms for Bures-Wasserstein barycenters | http://arxiv.org/abs/2001.01700v2 |
Gradient descent follows the regularization path for general losses | http://arxiv.org/abs/2006.11226v1 |
Gradient-free Online Learning in Games with Delayed Rewards | http://arxiv.org/abs/2006.10911v1 |
GradientDICE: Rethinking Generalized Offline Estimation of Stationary Values | http://arxiv.org/abs/2001.11113v7 |
Grammatical Error Correction in Low Error Density Domains: A New Benchmark and Analyses | http://arxiv.org/abs/2010.07574v1 |
Graph Clustering with Graph Neural Networks | http://arxiv.org/abs/2006.16904v1 |
Graph Coarsening with Preserved Spectral Properties | http://arxiv.org/abs/1802.04447v2 |
Graph Convolutional Gaussian Processes For Link Prediction | http://arxiv.org/abs/2002.04337v1 |
Graph Convolutional Network for Recommendation with Low-pass Collaborative Filters | http://arxiv.org/abs/2006.15516v2 |
Graph Convolutions over Constituent Trees for Syntax-Aware Semantic Role Labeling | http://arxiv.org/abs/1909.09814v3 |
Graph DNA: Deep Neighborhood Aware Graph Encoding for Collaborative Filtering | http://arxiv.org/abs/1905.12217v1 |
Graph Filtration Learning | http://arxiv.org/abs/1905.10996v2 |
Graph Homomorphism Convolution | http://arxiv.org/abs/2005.01214v2 |
Graph Learning for Inverse Landscape Genetics | http://arxiv.org/abs/2006.12334v2 |
Graph Neural Networks for Massive MIMO Detection | http://arxiv.org/abs/2007.05703v1 |
Graph Neural Networks for the Prediction of Substrate-Specific Organic Reaction Conditions | http://arxiv.org/abs/2007.04275v2 |
Graph Neural Networks in TensorFlow and Keras with Spektral | http://arxiv.org/abs/2006.12138v1 |
Graph Optimal Transport for Cross-Domain Alignment | http://arxiv.org/abs/2006.14744v3 |
Graph Pattern Entity Ranking Model for Knowledge Graph Completion | http://arxiv.org/abs/1904.02856v1 |
Graph Random Neural Features for Distance-Preserving Graph Representations | http://arxiv.org/abs/1909.03790v3 |
Graph Structure of Neural Networks | http://arxiv.org/abs/2007.06559v2 |
Graph based Neural Networks for Event Factuality Prediction using Syntactic and Semantic Structures | http://arxiv.org/abs/1907.03227v1 |
Graph neural induction of value iteration | http://arxiv.org/abs/2009.12604v1 |
Graph-based Nearest Neighbor Search: From Practice to Theory | http://arxiv.org/abs/1907.00845v4 |
Graph-based, Self-Supervised Program Repair from Diagnostic Feedback | http://arxiv.org/abs/2005.10636v2 |
GraphDialog: Integrating Graph Knowledge into End-to-End Task-Oriented Dialogue Systems | http://arxiv.org/abs/2010.01447v1 |
GraphOpt: Learning Optimization Models of Graph Formation | http://arxiv.org/abs/2007.03619v1 |
Graphs, Entities, and Step Mixture | http://arxiv.org/abs/2005.08485v2 |
Grasping the Finer Point: A Supervised Similarity Network for Metaphor Detection | http://arxiv.org/abs/1709.00575v1 |
Greedy Search with Probabilistic N-gram Matching for Neural Machine Translation | http://arxiv.org/abs/1809.03132v1 |
Gromov-Wasserstein Alignment of Word Embedding Spaces | http://arxiv.org/abs/1809.00013v1 |
Grounded Adaptation for Zero-shot Executable Semantic Parsing | http://arxiv.org/abs/2009.07396v2 |
Grounded Compositional Outputs for Adaptive Language Modeling | http://arxiv.org/abs/2009.11523v2 |
Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs | http://arxiv.org/abs/1911.02707v3 |
Grounding Conversations with Improvised Dialogues | http://arxiv.org/abs/2004.09544v2 |
Group Equivariant Deep Reinforcement Learning | http://arxiv.org/abs/2007.03437v1 |
Growing Action Spaces | http://arxiv.org/abs/1906.12266v1 |
Growing Together: Modeling Human Language Learning With n-Best Multi-Checkpoint Machine Translation | http://arxiv.org/abs/2006.04050v1 |
Guaranteed Validity for Empirical Approaches to Adaptive Data Analysis | http://arxiv.org/abs/1906.09231v2 |
Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization | http://arxiv.org/abs/2006.16840v1 |
Guiding Attention for Self-Supervised Learning with Transformers | http://arxiv.org/abs/2010.02399v1 |
Guiding Variational Response Generator to Exploit Persona | http://arxiv.org/abs/1911.02390v2 |
HABERTOR: An Efficient and Effective Deep Hatespeech Detector | http://arxiv.org/abs/2010.08865v1 |
HAT: Hardware-Aware Transformers for Efficient Natural Language Processing | http://arxiv.org/abs/2005.14187v1 |
HEAD-QA: A Healthcare Dataset for Complex Reasoning | http://arxiv.org/abs/1906.04701v1 |
HENIN: Learning Heterogeneous Neural Interaction Networks for Explainable Cyberbullying Detection on Social Media | http://arxiv.org/abs/2010.04576v1 |
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training | http://arxiv.org/abs/2005.00200v2 |
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization | http://arxiv.org/abs/1905.06566v1 |
HNHN: Hypergraph Networks with Hyperedge Neurons | http://arxiv.org/abs/2006.12278v1 |
Haar Graph Pooling | http://arxiv.org/abs/1909.11580v3 |
Haar Wavelet based Block Autoregressive Flows for Trajectories | http://arxiv.org/abs/2009.09878v1 |
Hallucinative Topological Memory for Zero-Shot Visual Planning | http://arxiv.org/abs/2002.12336v1 |
Halpern Iteration for Near-Optimal and Parameter-Free Monotone Inclusion and Strong Solutions to Variational Inequalities | http://arxiv.org/abs/2002.08872v3 |
Hamiltonian Graph Networks with ODE Integrators | http://arxiv.org/abs/1909.12790v1 |
Hamiltonian Monte Carlo Swindles | http://arxiv.org/abs/2001.05033v2 |
Handling Divergent Reference Texts when Evaluating Table-to-Text Generation | http://arxiv.org/abs/1906.01081v1 |
Handling Noisy Labels for Robustly Learning from Self-Training Data for Low-Resource Sequence Labeling | http://arxiv.org/abs/1903.12008v1 |
Handling the Positive-Definite Constraint in the Bayesian Learning Rule | http://arxiv.org/abs/2002.10060v13 |
Hard-Coded Gaussian Attention for Neural Machine Translation | http://arxiv.org/abs/2005.00742v1 |
Hardness of Identity Testing for Restricted Boltzmann Machines and Potts models | http://arxiv.org/abs/2004.10805v1 |
Harmonic Decompositions of Convolutional Networks | http://arxiv.org/abs/2003.12756v2 |
Harnessing Distribution Ratio Estimators for Learning Agents with Quality and Diversity | http://arxiv.org/abs/2011.02614v1 |
Harnessing the linguistic signal to predict scalar inferences | http://arxiv.org/abs/1910.14254v2 |
Harry Potter and the Action Prediction Challenge from Natural Language | http://arxiv.org/abs/1905.11037v1 |
Harvesting Paragraph-Level Question-Answer Pairs from Wikipedia | http://arxiv.org/abs/1805.05942v1 |
Harvesting and Refining Question-Answer Pairs for Unsupervised QA | http://arxiv.org/abs/2005.02925v1 |
Has Machine Translation Achieved Human Parity? A Case for Document-level Evaluation | http://arxiv.org/abs/1808.07048v1 |
Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora | http://arxiv.org/abs/1806.03191v1 |
Helping Reduce Environmental Impact of Aviation with Machine Learning | http://arxiv.org/abs/2012.09433v1 |
Hermitian matrices for clustering directed graphs: insights and applications | http://arxiv.org/abs/1908.02096v1 |
Heterogeneous Graph Neural Networks for Extractive Document Summarization | http://arxiv.org/abs/2004.12393v1 |
Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach | http://arxiv.org/abs/1707.00166v2 |
Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization | http://arxiv.org/abs/2006.15766v1 |
Hiding Among the Clones: A Simple and Nearly Optimal Analysis of Privacy Amplification by Shuffling | http://arxiv.org/abs/2012.12803v2 |
Hierarchical Clustering: a 0.585 Revenue Approximation | http://arxiv.org/abs/2006.01933v1 |
Hierarchical Entity Typing via Multi-level Learning to Rank | http://arxiv.org/abs/2004.02286v1 |
Hierarchical Evidence Set Modeling for Automated Fact Extraction and Verification | http://arxiv.org/abs/2010.05111v1 |
Hierarchical Generation of Molecular Graphs using Structural Motifs | http://arxiv.org/abs/2002.03230v2 |
Hierarchical Graph Network for Multi-hop Question Answering | http://arxiv.org/abs/1911.03631v4 |
Hierarchical Inter-Message Passing for Learning on Molecular Graphs | http://arxiv.org/abs/2006.12179v1 |
Hierarchical Losses and New Resources for Fine-grained Entity Typing and Linking | http://arxiv.org/abs/1807.05127v1 |
Hierarchical Neural Networks for Sequential Sentence Classification in Medical Scientific Abstracts | http://arxiv.org/abs/1808.06161v1 |
Hierarchical Neural Story Generation | http://arxiv.org/abs/1805.04833v1 |
Hierarchical Protein Function Prediction with Tail-GNNs | http://arxiv.org/abs/2007.12804v1 |
Hierarchical Quantized Representations for Script Generation | http://arxiv.org/abs/1808.09542v1 |
Hierarchical Structured Model for Fine-to-coarse Manifesto Text Analysis | http://arxiv.org/abs/1805.02823v1 |
Hierarchical Transformers for Multi-Document Summarization | http://arxiv.org/abs/1905.13164v1 |
Hierarchical Verification for Adversarial Robustness | http://arxiv.org/abs/2007.11826v1 |
Hierarchically Decoupled Imitation for Morphological Transfer | http://arxiv.org/abs/2003.01709v2 |
High Dimensional Robust Sparse Regression | http://arxiv.org/abs/1805.11643v3 |
High Resolution Medical Image Analysis with Spatial Partitioning | http://arxiv.org/abs/1909.03108v3 |
High-Dimensional Robust Mean Estimation via Gradient Descent | http://arxiv.org/abs/2005.01378v1 |
HighRES: Highlight-based Reference-less Evaluation of Summarization | http://arxiv.org/abs/1906.01361v1 |
Higher-order Coreference Resolution with Coarse-to-fine Inference | http://arxiv.org/abs/1804.05392v1 |
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks | http://arxiv.org/abs/2004.08178v5 |
History for Visual Dialog: Do we really need it? | http://arxiv.org/abs/2005.07493v1 |
History-Gradient Aided Batch Size Adaptation for Variance Reduced Algorithms | http://arxiv.org/abs/1910.09670v4 |
Hooks in the Headline: Learning to Generate Headlines with Controlled Styles | http://arxiv.org/abs/2004.01980v3 |
HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering | http://arxiv.org/abs/1809.09600v1 |
How Can We Accelerate Progress Towards Human-like Linguistic Generalization? | http://arxiv.org/abs/2005.00955v1 |
How Does NLP Benefit Legal System: A Summary of Legal Artificial Intelligence | http://arxiv.org/abs/2004.12158v5 |
How Does Selective Mechanism Improve Self-Attention Networks? | http://arxiv.org/abs/2005.00979v1 |
How Furiously Can Colourless Green Ideas Sleep? Sentence Acceptability in Context | http://arxiv.org/abs/2004.00881v1 |
How Good is the Bayes Posterior in Deep Neural Networks Really? | http://arxiv.org/abs/2002.02405v2 |
How Large a Vocabulary Does Text Classification Need? A Variational Approach to Vocabulary Selection | http://arxiv.org/abs/1902.10339v4 |
How Much Knowledge Can You Pack Into the Parameters of a Language Model? | http://arxiv.org/abs/2002.08910v4 |
How Much Reading Does Reading Comprehension Require? A Critical Investigation of Popular Benchmarks | http://arxiv.org/abs/1808.04926v2 |
How To Backdoor Federated Learning | http://arxiv.org/abs/1807.00459v3 |
How agents see things: On visual representations in an emergent language game | http://arxiv.org/abs/1808.10696v2 |
How do Decisions Emerge across Layers in Neural Models? Interpretation with Differentiable Masking | http://arxiv.org/abs/2004.14992v2 |
How much complexity does an RNN architecture need to learn syntax-sensitive dependencies? | http://arxiv.org/abs/2005.08199v2 |
How multilingual is Multilingual BERT? | http://arxiv.org/abs/1906.01502v1 |
How recurrent networks implement contextual processing in sentiment analysis | http://arxiv.org/abs/2004.08013v1 |
How to Grow a (Product) Tree: Personalized Category Suggestions for eCommerce Type-Ahead | http://arxiv.org/abs/2005.12781v1 |
How to Make Deep RL Work in Practice | http://arxiv.org/abs/2010.13083v2 |
How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation | http://arxiv.org/abs/2006.09109v2 |
How to trap a gradient flow | http://arxiv.org/abs/2001.02968v3 |
How well does surprisal explain N400 amplitude under different experimental conditions? | http://arxiv.org/abs/2010.04844v1 |
Howl: A Deployed, Open-Source Wake Word Detection System | http://arxiv.org/abs/2008.09606v1 |
Human computation requires and enables a new approach to ethical review | http://arxiv.org/abs/2011.10754v1 |
Human-Like Active Learning: Machines Simulating the Human Learning Process | http://arxiv.org/abs/2011.03733v1 |
Human-Paraphrased References Improve Neural Machine Translation | http://arxiv.org/abs/2010.10245v1 |
Human-centric Dialog Training via Offline Reinforcement Learning | http://arxiv.org/abs/2010.05848v1 |
Hybrid Code Networks: practical and efficient end-to-end dialog control with supervised and reinforcement learning | http://arxiv.org/abs/1702.03274v2 |
Hybrid Session-based News Recommendation using Recurrent Neural Networks | http://arxiv.org/abs/2006.13063v1 |
Hybrid Stochastic-Deterministic Minibatch Proximal Gradient: Less-Than-Single-Pass Optimization with Nearly Optimal Generalization | http://arxiv.org/abs/2009.09835v1 |
HydroNets: Leveraging River Structure for Hydrologic Modeling | http://arxiv.org/abs/2007.00595v1 |
Hyper-spectral NIR and MIR data and optimal wavebands for detection of apple tree diseases | http://arxiv.org/abs/2004.02325v3 |
Hyperbolic Manifold Regression | http://arxiv.org/abs/2005.13885v1 |
Hypernetwork approach to generating point clouds | http://arxiv.org/abs/2003.00802v2 |
Hyperparameter Auto-tuning in Self-Supervised Robotic Learning | http://arxiv.org/abs/2010.08252v3 |
Hypothesis Testing Interpretations and Renyi Differential Privacy | http://arxiv.org/abs/1905.09982v2 |
ID3 Learns Juntas for Smoothed Product Distributions | http://arxiv.org/abs/1906.08654v1 |
IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation | http://arxiv.org/abs/2011.05744v1 |
IIRC: A Dataset of Incomplete Information Reading Comprehension Questions | http://arxiv.org/abs/2011.07127v1 |
IMHO Fine-Tuning Improves Claim Detection | http://arxiv.org/abs/1905.07000v1 |
IMoJIE: Iterative Memory-Based Joint Open Information Extraction | http://arxiv.org/abs/2005.08178v1 |
INFOTABS: Inference on Tables as Semi-structured Data | http://arxiv.org/abs/2005.06117v1 |
INSET: Sentence Infilling with INter-SEntential Transformer | http://arxiv.org/abs/1911.03892v2 |
INSPIRED: Toward Sociable Recommendation Dialog Systems | http://arxiv.org/abs/2009.14306v2 |
IROF: a low resource evaluation metric for explanation methods | http://arxiv.org/abs/2003.08747v1 |
IV-Posterior: Inverse Value Estimation for Interpretable Policy Certificates | http://arxiv.org/abs/2012.01925v1 |
Identifying Semantic Divergences in Parallel Text without Annotations | http://arxiv.org/abs/1803.11112v1 |
Identifying and Correcting Label Bias in Machine Learning | http://arxiv.org/abs/1901.04966v1 |
Identifying and Reducing Gender Bias in Word-Level Language Models | http://arxiv.org/abs/1904.03035v1 |
Identifying civilians killed by police with distantly supervised entity-event extraction | http://arxiv.org/abs/1707.07086v1 |
If MaxEnt RL is the Answer, What is the Question? | http://arxiv.org/abs/1910.01913v1 |
If beam search is the answer, what was the question? | http://arxiv.org/abs/2010.02650v1 |
Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels | http://arxiv.org/abs/2004.13649v3 |
Image Generation With Neural Cellular Automatas | http://arxiv.org/abs/2010.04949v2 |
Image Pivoting for Learning Multilingual Multimodal Representations | http://arxiv.org/abs/1707.07601v1 |
Image-based phenotyping of diverse Rice (Oryza Sativa L.) Genotypes | http://arxiv.org/abs/2004.02498v1 |
Imitation Attacks and Defenses for Black-box Machine Translation Systems | http://arxiv.org/abs/2004.15015v3 |
Imitation Learning Approach for AI Driving Olympics Trained on Real-world and Simulation Data Simultaneously | http://arxiv.org/abs/2007.03514v1 |
Imitation Learning for Neural Morphological String Transduction | http://arxiv.org/abs/1808.10701v1 |
Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss | http://arxiv.org/abs/2002.04486v4 |
Implicit Class-Conditioned Domain Alignment for Unsupervised Domain Adaptation | http://arxiv.org/abs/2006.04996v1 |
Implicit Generative Modeling for Efficient Exploration | http://arxiv.org/abs/1911.08017v3 |
Implicit Geometric Regularization for Learning Shapes | http://arxiv.org/abs/2002.10099v2 |
Implicit Regularization of Random Feature Models | http://arxiv.org/abs/2002.08404v2 |
Implicit competitive regularization in GANs | http://arxiv.org/abs/1910.05852v4 |
Implicit regularization and solution uniqueness in over-parameterized matrix sensing | http://arxiv.org/abs/1806.02046v2 |
Implicit regularization for deep neural networks driven by an Ornstein-Uhlenbeck like process | http://arxiv.org/abs/1904.09080v2 |
Improper Learning for Non-Stochastic Control | http://arxiv.org/abs/2001.09254v3 |
Improved Natural Language Generation via Loss Truncation | http://arxiv.org/abs/2004.14589v2 |
Improved Neural Relation Detection for Knowledge Base Question Answering | http://arxiv.org/abs/1704.06194v2 |
Improved Optimistic Algorithms for Logistic Bandits | http://arxiv.org/abs/2002.07530v2 |
Improved Regret Bounds for Projection-free Bandit Convex Optimization | http://arxiv.org/abs/1910.03374v1 |
Improved Relation Extraction with Feature-Rich Compositional Embedding Models | http://arxiv.org/abs/1505.02419v3 |
Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks | http://arxiv.org/abs/1503.00075v3 |
Improved Semantic-Aware Network Embedding with Fine-Grained Word Alignment | http://arxiv.org/abs/1808.09633v1 |
Improved Sentiment Detection via Label Transfer from Monolingual to Synthetic Code-Switched Text | http://arxiv.org/abs/1906.05725v1 |
Improved Speech Representations with Multi-Target Autoregressive Predictive Coding | http://arxiv.org/abs/2004.05274v1 |
Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs | http://arxiv.org/abs/1508.00657v2 |
Improving AMR Parsing with Sequence-to-Sequence Pre-training | http://arxiv.org/abs/2010.01771v1 |
Improving Abstraction in Text Summarization | http://arxiv.org/abs/1808.07913v1 |
Improving Adversarial Text Generation by Modeling the Distant Future | http://arxiv.org/abs/2005.01279v1 |
Improving Candidate Generation for Low-resource Cross-lingual Entity Linking | http://arxiv.org/abs/2003.01343v1 |
Improving Character-based Decoding Using Target-Side Morphological Information for Neural Machine Translation | http://arxiv.org/abs/1804.06506v1 |
Improving Dialog Evaluation with a Multi-reference Adversarial Dataset and Large Scale Pretraining | http://arxiv.org/abs/2009.11321v1 |
Improving Dialogue State Tracking by Discerning the Relevant Context | http://arxiv.org/abs/1904.02800v1 |
Improving Disentangled Text Representation Learning with Information-Theoretic Guidance | http://arxiv.org/abs/2006.00693v2 |
Improving Disfluency Detection by Self-Training a Self-Attentive Model | http://arxiv.org/abs/2004.05323v2 |
Improving Domain Adaptation Translation with Domain Invariant and Specific Information | http://arxiv.org/abs/1904.03879v2 |
Improving Generalization by Controlling Label-Noise Information in Neural Network Weights | http://arxiv.org/abs/2002.07933v2 |
Improving Generative Imagination in Object-Centric World Models | http://arxiv.org/abs/2010.02054v1 |
Improving Grammatical Error Correction via Pre-Training a Copy-Augmented Architecture with Unlabeled Data | http://arxiv.org/abs/1903.00138v3 |
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting | http://arxiv.org/abs/2006.09252v2 |
Improving Human Text Comprehension through Semi-Markov CRF-based Neural Section Title Generation | http://arxiv.org/abs/1904.07142v1 |
Improving Information Extraction by Acquiring External Evidence with Reinforcement Learning | http://arxiv.org/abs/1603.07954v3 |
Improving Knowledge Graph Embedding Using Simple Constraints | http://arxiv.org/abs/1805.02408v2 |
Improving Lemmatization of Non-Standard Languages with Joint Learning | http://arxiv.org/abs/1903.06939v1 |
Improving Lexical Choice in Neural Machine Translation | http://arxiv.org/abs/1710.01329v3 |
Improving Machine Reading Comprehension with General Reading Strategies | http://arxiv.org/abs/1810.13441v2 |
Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation | http://arxiv.org/abs/2004.11867v1 |
Improving Maximum Likelihood Training for Text Generation with Density Ratio Estimation | http://arxiv.org/abs/2007.06018v1 |
Improving Molecular Design by Stochastic Iterative Target Augmentation | http://arxiv.org/abs/2002.04720v2 |
Improving Multi-turn Dialogue Modelling with Utterance ReWriter | http://arxiv.org/abs/1906.07004v1 |
Improving Multilingual Models with Language-Clustered Vocabularies | http://arxiv.org/abs/2010.12777v1 |
Improving Multilingual Named Entity Recognition with Wikipedia Entity Type Mapping | http://arxiv.org/abs/1707.02459v1 |
Improving Neural Conversational Models with Entropy-Based Data Filtering | http://arxiv.org/abs/1905.05471v3 |
Improving Neural Parsing by Disentangling Model Combination and Reranking Effects | http://arxiv.org/abs/1707.03058v1 |
Improving Non-autoregressive Neural Machine Translation with Monolingual Data | http://arxiv.org/abs/2005.00932v3 |
Improving Question Answering over Incomplete KBs with Knowledge-Aware Reader | http://arxiv.org/abs/1905.07098v2 |
Improving Robustness of Deep-Learning-Based Image Reconstruction | http://arxiv.org/abs/2002.11821v1 |
Improving Segmentation for Technical Support Problems | http://arxiv.org/abs/2005.11055v1 |
Improving Slot Filling by Utilizing Contextual Information | http://arxiv.org/abs/1911.01680v2 |
Improving Text Generation Evaluation with Batch Centering and Tempered Word Mover Distance | http://arxiv.org/abs/2010.06150v1 |
Improving Text Generation with Student-Forcing Optimal Transport | http://arxiv.org/abs/2010.05994v1 |
Improving Topic Models with Latent Feature Word Representations | http://arxiv.org/abs/1810.06306v1 |
Improving Transformer Models by Reordering their Sublayers | http://arxiv.org/abs/1911.03864v2 |
Improving Truthfulness of Headline Generation | http://arxiv.org/abs/2005.00882v2 |
Improving Unsupervised Word-by-Word Translation with Language Model and Denoising Autoencoder | http://arxiv.org/abs/1901.01590v1 |
Improving Yorùbá Diacritic Restoration | http://arxiv.org/abs/2003.10564v1 |
Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback | http://arxiv.org/abs/1805.01252v2 |
Improving fairness in machine learning systems: What do industry practitioners need? | http://arxiv.org/abs/1812.05239v2 |
Improving robustness against common corruptions by covariate shift adaptation | http://arxiv.org/abs/2006.16971v2 |
Improving the Efficiency of Grammatical Error Correction with Erroneous Span Detection and Correction | http://arxiv.org/abs/2010.03260v1 |
Improving the Gating Mechanism of Recurrent Neural Networks | http://arxiv.org/abs/1910.09890v2 |
Improving the Similarity Measure of Determinantal Point Processes for Extractive Multi-Document Summarization | http://arxiv.org/abs/1906.00072v1 |
Imputation estimators for unnormalized models with missing data | http://arxiv.org/abs/1903.03630v2 |
Imputer: Sequence Modelling via Imputation and Dynamic Programming | http://arxiv.org/abs/2002.08926v2 |
In search of isoglosses: continuous and discrete language embeddings in Slavic historical phonology | http://arxiv.org/abs/2005.13575v1 |
In-domain representation learning for remote sensing | http://arxiv.org/abs/1911.06721v1 |
Incentive-Compatible Forecasting Competitions | http://arxiv.org/abs/2101.01816v1 |
Incentives for Federated Learning: a Hypothesis Elicitation Approach | http://arxiv.org/abs/2007.10596v1 |
Incidence Networks for Geometric Deep Learning | http://arxiv.org/abs/1905.11460v4 |
Incomplete Utterance Rewriting as Semantic Segmentation | http://arxiv.org/abs/2009.13166v1 |
Incorporate Semantic Structures into Machine Translation Evaluation via UCCA | http://arxiv.org/abs/2010.08728v2 |
Incorporating Behavioral Hypotheses for Query Generation | http://arxiv.org/abs/2010.02667v1 |
Incorporating External Knowledge through Pre-training for Natural Language to Code Generation | http://arxiv.org/abs/2004.09015v1 |
Incorporating Subword Information into Matrix Factorization Word Embeddings | http://arxiv.org/abs/1805.03710v1 |
Incorporating Terminology Constraints in Automatic Post-Editing | http://arxiv.org/abs/2010.09608v1 |
Incorporating Uncertain Segmentation Information into Chinese NER for Social Media Text | http://arxiv.org/abs/2004.06384v2 |
Incorporating a Local Translation Mechanism into Non-autoregressive Translation | http://arxiv.org/abs/2011.06132v1 |
Increasing performance of electric vehicles in ride-hailing services using deep reinforcement learning | http://arxiv.org/abs/1912.03408v1 |
Incremental Neural Coreference Resolution in Constant Memory | http://arxiv.org/abs/2005.00128v2 |
Incremental Processing in the Age of Non-Incremental Encoders: An Empirical Assessment of Bidirectional Models for Incremental NLU | http://arxiv.org/abs/2010.05330v1 |
Incremental Sampling Without Replacement for Sequence Models | http://arxiv.org/abs/2002.09067v1 |
Incremental Transformer with Deliberation Decoder for Document Grounded Conversations | http://arxiv.org/abs/1907.08854v3 |
Independent Subspace Analysis for Unsupervised Learning of Disentangled Representations | http://arxiv.org/abs/1909.05063v1 |
Individual Calibration with Randomized Forecasting | http://arxiv.org/abs/2006.10288v3 |
Induced Inflection-Set Keyword Search in Speech | http://arxiv.org/abs/1910.12299v2 |
Inductive Relation Prediction by Subgraph Reasoning | http://arxiv.org/abs/1911.06962v2 |
Inertial Block Proximal Methods for Non-Convex Non-Smooth Optimization | http://arxiv.org/abs/1903.01818v3 |
Inexact Tensor Methods with Dynamic Accuracies | http://arxiv.org/abs/2002.09403v2 |
Inference Strategies for Machine Translation with Conditional Masking | http://arxiv.org/abs/2010.02352v2 |
Inference of Dynamic Graph Changes for Functional Connectome | http://arxiv.org/abs/1905.09993v2 |
Inferring Which Medical Treatments Work from Reports of Clinical Trials | http://arxiv.org/abs/1904.01606v2 |
Inferring astrophysical X-ray polarization with deep learning | http://arxiv.org/abs/2005.08126v1 |
Infinite attention: NNGP and NTK for deep attention networks | http://arxiv.org/abs/2006.10540v1 |
Influence Paths for Characterizing Subject-Verb Number Agreement in LSTM Language Models | http://arxiv.org/abs/2005.01190v1 |
Information Aggregation for Multi-Head Attention with Routing-by-Agreement | http://arxiv.org/abs/1904.03100v1 |
Information Directed Sampling for Linear Partial Monitoring | http://arxiv.org/abs/2002.11182v1 |
Information Extraction from Swedish Medical Prescriptions with Sig-Transformer Encoder | http://arxiv.org/abs/2010.04897v1 |
Information Seeking in the Spirit of Learning: a Dataset for Conversational Curiosity | http://arxiv.org/abs/2005.00172v2 |
Information Theoretic Optimal Learning of Gaussian Graphical Models | http://arxiv.org/abs/1703.04886v3 |
Information-Theoretic Local Minima Characterization and Regularization | http://arxiv.org/abs/1911.08192v2 |
Information-Theoretic Probing for Linguistic Structure | http://arxiv.org/abs/2004.03061v2 |
Information-Theoretic Probing with Minimum Description Length | http://arxiv.org/abs/2003.12298v1 |
Informative Dropout for Robust Representation Learning: A Shape-bias Perspective | http://arxiv.org/abs/2008.04254v1 |
Infusing Disease Knowledge into BERT for Health Question Answering, Medical Inference and Disease Name Recognition | http://arxiv.org/abs/2010.03746v1 |
Injecting Numerical Reasoning Skills into Language Models | http://arxiv.org/abs/2004.04487v1 |
Inoculation by Fine-Tuning: A Method for Analyzing Challenge Datasets | http://arxiv.org/abs/1904.02668v4 |
Input-Sparsity Low Rank Approximation in Schatten Norm | http://arxiv.org/abs/2004.12646v3 |
Inquisitive Question Generation for High Level Text Comprehension | http://arxiv.org/abs/2010.01657v1 |
Insights into Fairness through Trust: Multi-scale Trust Quantification for Financial Deep Learning | http://arxiv.org/abs/2011.01961v1 |
InstaHide: Instance-hiding Schemes for Private Distributed Learning | http://arxiv.org/abs/2010.02772v1 |
Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition | http://arxiv.org/abs/2004.14514v1 |
Instance-wise Depth and Motion Learning from Monocular Videos | http://arxiv.org/abs/1912.09351v2 |
Integrals over Gaussians under Linear Domain Constraints | http://arxiv.org/abs/1910.09328v2 |
Integrating Multimodal Information in Large Pretrained Transformers | http://arxiv.org/abs/1908.05787v3 |
Integrating Semantic Knowledge to Tackle Zero-shot Text Classification | http://arxiv.org/abs/1903.12626v1 |
Integrating Semantic and Structural Information with Graph Convolutional Network for Controversy Detection | http://arxiv.org/abs/2005.07886v1 |
Integrating Transformer and Paraphrase Rules for Sentence Simplification | http://arxiv.org/abs/1810.11193v1 |
Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation | http://arxiv.org/abs/1810.02614v1 |
Inter-Level Cooperation in Hierarchical Reinforcement Learning | http://arxiv.org/abs/1912.02368v2 |
Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network | http://arxiv.org/abs/1906.04684v1 |
Interactive Classification by Asking Informative Questions | http://arxiv.org/abs/1911.03598v2 |
Interactive Extractive Search over Biomedical Corpora | http://arxiv.org/abs/2006.04148v1 |
Interactive Fiction Game Playing as Multi-Paragraph Reading Comprehension with Reinforcement Learning | http://arxiv.org/abs/2010.02386v1 |
Interactive Machine Comprehension with Information Seeking Agents | http://arxiv.org/abs/1908.10449v3 |
Interactive Refinement of Cross-Lingual Word Embeddings | http://arxiv.org/abs/1911.03070v3 |
Interactive Text Ranking with Bayesian Optimisation: A Case Study on Community QA and Summarisation | http://arxiv.org/abs/1911.10183v3 |
Interactive Visualization for Debugging RL | http://arxiv.org/abs/2008.07331v2 |
Interconnected Question Generation with Coreference Alignment and Conversation Flow Modeling | http://arxiv.org/abs/1906.06893v1 |
Interference and Generalization in Temporal Difference Learning | http://arxiv.org/abs/2003.06350v1 |
Interpolation between Residual and Non-Residual Networks | http://arxiv.org/abs/2006.05749v4 |
Interpretable Charge Predictions for Criminal Cases: Learning to Generate Court Views from Fact Descriptions | http://arxiv.org/abs/1802.08504v1 |
Interpretable Companions for Black-Box Models | http://arxiv.org/abs/2002.03494v2 |
Interpretable Multi-dataset Evaluation for Named Entity Recognition | http://arxiv.org/abs/2011.06854v2 |
Interpretable Off-Policy Evaluation in Reinforcement Learning by Highlighting Influential Transitions | http://arxiv.org/abs/2002.03478v3 |
Interpretable Question Answering on Knowledge Bases and Text | http://arxiv.org/abs/1906.10924v1 |
Interpretable and Compositional Relation Learning by Joint Training with an Autoencoder | http://arxiv.org/abs/1805.09547v1 |
Interpretable deep Gaussian processes with moments | http://arxiv.org/abs/1905.10963v3 |
Interpretation of NLP models through input marginalization | http://arxiv.org/abs/2010.13984v1 |
Interpretations are useful: penalizing explanations to align neural networks with prior knowledge | http://arxiv.org/abs/1909.13584v4 |
Interpreting Attention Models with Human Visual Attention in Machine Reading Comprehension | http://arxiv.org/abs/2010.06396v2 |
Intrinsic Probing through Dimension Selection | http://arxiv.org/abs/2010.02812v1 |
Intrinsic Reward Driven Imitation Learning via Generative Model | http://arxiv.org/abs/2006.15061v4 |
Introducing Syntactic Structures into Target Opinion Word Extraction with Deep Learning | http://arxiv.org/abs/2010.13378v1 |
Invariant Causal Prediction for Block MDPs | http://arxiv.org/abs/2003.06016v2 |
Invariant Risk Minimization Games | http://arxiv.org/abs/2002.04692v2 |
Inverse Active Sensing: Modeling and Understanding Timely Decision-Making | http://arxiv.org/abs/2006.14141v1 |
Invertible Generative Modeling using Linear Rational Splines | http://arxiv.org/abs/2001.05168v4 |
Invertible generative models for inverse problems: mitigating representation error and dataset bias | http://arxiv.org/abs/1905.11672v4 |
Investigating African-American Vernacular English in Transformer-Based Text Generation | http://arxiv.org/abs/2010.02510v2 |
Investigating Capsule Networks with Dynamic Routing for Text Classification | http://arxiv.org/abs/1804.00538v4 |
Investigating Cross-Linguistic Adjective Ordering Tendencies with a Latent-Variable Model | http://arxiv.org/abs/2010.04755v1 |
Investigating Prior Knowledge for Challenging Chinese Machine Reading Comprehension | http://arxiv.org/abs/1904.09679v3 |
Investigating representations of verb bias in neural language models | http://arxiv.org/abs/2010.02375v2 |
Investigating the Effect of Sensor Modalities in Multi-Sensor Detection-Prediction Models | http://arxiv.org/abs/2101.03279v1 |
Involutive MCMC: a Unifying Framework | http://arxiv.org/abs/2006.16653v1 |
Is 42 the Answer to Everything in Subtitling-oriented Speech Translation? | http://arxiv.org/abs/2006.01080v1 |
Is Graph Structure Necessary for Multi-hop Question Answering? | http://arxiv.org/abs/2004.03096v2 |
Is Local SGD Better than Minibatch SGD? | http://arxiv.org/abs/2002.07839v2 |
Is There a Trade-Off Between Fairness and Accuracy? A Perspective Using Mismatched Hypothesis Testing | http://arxiv.org/abs/1910.07870v2 |
Is Your Classifier Actually Biased? Measuring Fairness under Uncertainty with Bernstein Bounds | http://arxiv.org/abs/2004.12332v1 |
Is the Best Better? Bayesian Statistical Model Comparison for Natural Language Processing | http://arxiv.org/abs/2010.03088v1 |
It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information | http://arxiv.org/abs/2005.02354v2 |
It's Not What Machines Can Learn, It's What We Cannot Teach | http://arxiv.org/abs/2002.09398v2 |
Iterative Edit-Based Unsupervised Sentence Simplification | http://arxiv.org/abs/2006.09639v1 |
Iterative Refinement in the Continuous Space for Non-Autoregressive Neural Machine Translation | http://arxiv.org/abs/2009.07177v1 |
Ivy: Instrumental Variable Synthesis for Causal Inference | http://arxiv.org/abs/2004.05316v1 |
Job Recommendation through Progression of Job Selection | http://arxiv.org/abs/1905.13136v2 |
Joint Bootstrapping Machines for High Confidence Relation Extraction | http://arxiv.org/abs/1805.00254v1 |
Joint Constrained Learning for Event-Event Relation Extraction | http://arxiv.org/abs/2010.06727v1 |
Joint Detection and Location of English Puns | http://arxiv.org/abs/1909.00175v1 |
Joint Diacritization, Lemmatization, Normalization, and Fine-Grained Morphological Tagging | http://arxiv.org/abs/1910.02267v1 |
Joint Effects of Context and User History for Predicting Online Conversation Re-entries | http://arxiv.org/abs/1906.01185v1 |
Joint Entity Extraction and Assertion Detection for Clinical Text | http://arxiv.org/abs/1812.05270v5 |
Joint Extraction of Entities and Relations Based on a Novel Tagging Scheme | http://arxiv.org/abs/1706.05075v1 |
Joint Learning of Pre-Trained and Random Units for Domain Adaptation in Part-of-Speech Tagging | http://arxiv.org/abs/1904.03595v1 |
Joint Modeling of Content and Discourse Relations in Dialogues | http://arxiv.org/abs/1705.05039v1 |
Joint Modeling of Topics, Citations, and Topical Authority in Academic Corpora | http://arxiv.org/abs/1706.00593v1 |
Joint Modelling of Emotion and Abusive Language Detection | http://arxiv.org/abs/2005.14028v1 |
Joint Multilingual Supervision for Cross-lingual Entity Linking | http://arxiv.org/abs/1809.07657v1 |
Joint Multitask Learning for Community Question Answering Using Task-Specific Embeddings | http://arxiv.org/abs/1809.08928v1 |
Joint Reasoning for Temporal and Causal Relations | http://arxiv.org/abs/1906.04941v1 |
Joint Semantic Synthesis and Morphological Analysis of the Derived Word | http://arxiv.org/abs/1701.00946v3 |
Joint translation and unit conversion for end-to-end localization | http://arxiv.org/abs/2004.05219v1 |
Jointly Multiple Events Extraction via Attention-based Graph Information Aggregation | http://arxiv.org/abs/1809.09078v2 |
Jointly Optimizing Diversity and Relevance in Neural Response Generation | http://arxiv.org/abs/1902.11205v3 |
Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling | http://arxiv.org/abs/1805.04787v2 |
KLEJ: Comprehensive Benchmark for Polish Language Understanding | http://arxiv.org/abs/2005.00630v1 |
KdConv: A Chinese Multi-domain Dialogue Dataset Towards Multi-turn Knowledge-driven Conversation | http://arxiv.org/abs/2004.04100v1 |
Keep CALM and Explore: Language Models for Action Generation in Text-based Games | http://arxiv.org/abs/2010.02903v1 |
Keeping Up Appearances: Computational Modeling of Face Acts in Persuasion Oriented Discussions | http://arxiv.org/abs/2009.10815v2 |
Kernel Conditional Density Operators | http://arxiv.org/abs/1905.11255v2 |
Kernel and Rich Regimes in Overparametrized Models | http://arxiv.org/abs/1906.05827v3 |
Kernel interpolation with continuous volume sampling | http://arxiv.org/abs/2002.09677v1 |
Kernels over Sets of Finite Sets using RKHS Embeddings, with Application to Bayesian (Combinatorial) Optimization | http://arxiv.org/abs/1910.04086v2 |
Key-Value Memory Networks for Directly Reading Documents | http://arxiv.org/abs/1606.03126v2 |
Keyphrase Generation: A Text Summarization Struggle | http://arxiv.org/abs/1904.00110v2 |
KinGDOM: Knowledge-Guided DOMain adaptation for sentiment analysis | http://arxiv.org/abs/2005.00791v2 |
Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning | http://arxiv.org/abs/1911.05815v1 |
Knowing The What But Not The Where in Bayesian Optimization | http://arxiv.org/abs/1905.02685v5 |
Knowledge Association with Hyperbolic Knowledge Graph Embeddings | http://arxiv.org/abs/2010.02162v1 |
Knowledge Completion for Generics using Guided Tensor Factorization | http://arxiv.org/abs/1612.03871v3 |
Knowledge Distillation for Multilingual Unsupervised Neural Machine Translation | http://arxiv.org/abs/2004.10171v1 |
Knowledge Graph-Augmented Abstractive Summarization with Semantic-Driven Cloze Reward | http://arxiv.org/abs/2005.01159v1 |
Knowledge-Grounded Dialogue Generation with Pre-trained Language Models | http://arxiv.org/abs/2010.08824v1 |
Knowledge-aware Pronoun Coreference Resolution | http://arxiv.org/abs/1907.03663v1 |
Knowledge-guided Open Attribute Value Extraction with Reinforcement Learning | http://arxiv.org/abs/2010.09189v1 |
Knowledgeable Reader: Enhancing Cloze-Style Reading Comprehension with External Commonsense Knowledge | http://arxiv.org/abs/1805.07858v1 |
KutralNet: A Portable Deep Learning Model for Fire Recognition | http://arxiv.org/abs/2008.06866v1 |
Køpsala: Transition-Based Graph Parsing via Efficient Training and Effective Encoding | http://arxiv.org/abs/2005.12094v2 |
LAReQA: Language-agnostic answer retrieval from a multilingual pool | http://arxiv.org/abs/2004.05484v1 |
LEAN-LIFE: A Label-Efficient Annotation Framework Towards Learning from Explanation | http://arxiv.org/abs/2004.07499v1 |
LEEP: A New Measure to Evaluate Transferability of Learned Representations | http://arxiv.org/abs/2002.12462v2 |
LIBRE: Learning Interpretable Boolean Rule Ensembles | http://arxiv.org/abs/1911.06537v1 |
LINSPECTOR: Multilingual Probing Tasks for Word Representations | http://arxiv.org/abs/1903.09442v2 |
LOGAN: Local Group Bias Detection by Clustering | http://arxiv.org/abs/2010.02867v1 |
LP-SparseMAP: Differentiable Relaxed Optimization for Sparse Structured Prediction | http://arxiv.org/abs/2001.04437v3 |
LRTA: A Transparent Neural-Symbolic Reasoning Framework with Modular Supervision for Visual Question Answering | http://arxiv.org/abs/2011.10731v1 |
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention | http://arxiv.org/abs/2010.01057v1 |
Label-aware Double Transfer Learning for Cross-Specialty Medical Named Entity Recognition | http://arxiv.org/abs/1804.09021v2 |
Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks | http://arxiv.org/abs/1912.10095v2 |
Langevin Monte Carlo without smoothness | http://arxiv.org/abs/1905.13285v3 |
Language (Re)modelling: Towards Embodied Language Understanding | http://arxiv.org/abs/2005.00311v2 |
Language (Technology) is Power: A Critical Survey of "Bias" in NLP | http://arxiv.org/abs/2005.14050v2 |
Language Generation with Multi-Hop Reasoning on Commonsense Knowledge Graph | http://arxiv.org/abs/2009.11692v1 |
Language Model Prior for Low-Resource Neural Machine Translation | http://arxiv.org/abs/2004.14928v3 |
Language Modelling Makes Sense: Propagating Representations through WordNet for Full-Coverage Word Sense Disambiguation | http://arxiv.org/abs/1906.10007v1 |
Language Models as Fact Checkers? | http://arxiv.org/abs/2006.04102v2 |
Language Models as an Alternative Evaluator of Word Order Hypotheses: A Case Study in Japanese | http://arxiv.org/abs/2005.00842v1 |
Language Models not just for Pre-training: Fast Online Neural Noisy Channel Modeling | http://arxiv.org/abs/2011.07164v1 |
Language Understanding for Text-based Games Using Deep Reinforcement Learning | http://arxiv.org/abs/1506.08941v2 |
Language as a Latent Variable: Discrete Generative Models for Sentence Compression | http://arxiv.org/abs/1609.07317v2 |
Large Margin Neural Language Model | http://arxiv.org/abs/1808.08987v1 |
Large Product Key Memory for Pretrained Language Models | http://arxiv.org/abs/2010.03881v1 |
Large Scale Multi-Actor Generative Dialog Modeling | http://arxiv.org/abs/2005.06114v1 |
Large-Scale Multi-Domain Belief Tracking with Knowledge Sharing | http://arxiv.org/abs/1807.06517v1 |
Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health | http://arxiv.org/abs/1605.04462v3 |
Large-scale Cloze Test Dataset Created by Teachers | http://arxiv.org/abs/1711.03225v3 |
Last Iterate is Slower than Averaged Iterate in Smooth Convex-Concave Saddle Point Problems | http://arxiv.org/abs/2002.00057v2 |
Latent Alignment of Procedural Concepts in Multimodal Recipes | http://arxiv.org/abs/2101.04727v1 |
Latent Space Factorisation and Manipulation via Matrix Subspace Projection | http://arxiv.org/abs/1907.12385v3 |
Latent Space Oddity: Exploring Latent Spaces to Design Guitar Timbres | http://arxiv.org/abs/2010.15989v2 |
Latent Variable Modelling with Hyperbolic Normalizing Flows | http://arxiv.org/abs/2002.06336v4 |
Latent-CF: A Simple Baseline for Reverse Counterfactual Explanations | http://arxiv.org/abs/2012.09301v1 |
Layered Sampling for Robust Optimization Problems | http://arxiv.org/abs/2002.11904v1 |
LazyIter: A Fast Algorithm for Counting Markov Equivalent DAGs and Designing Experiments | http://arxiv.org/abs/2006.09670v1 |
LdSM: Logarithm-depth Streaming Multi-label Decision Trees | http://arxiv.org/abs/1905.10428v5 |
Learnable Bernoulli Dropout for Bayesian Deep Learning | http://arxiv.org/abs/2002.05155v1 |
Learning Accurate Long-term Dynamics for Model-based Reinforcement Learning | http://arxiv.org/abs/2012.09156v1 |
Learning Adaptive Language Interfaces through Decomposition | http://arxiv.org/abs/2010.05190v1 |
Learning Adversarially Robust Representations via Worst-Case Mutual Information Maximization | http://arxiv.org/abs/2002.11798v2 |
Learning Algebraic Multigrid Using Graph Neural Networks | http://arxiv.org/abs/2003.05744v2 |
Learning Architectures from an Extended Search Space for Language Modeling | http://arxiv.org/abs/2005.02593v2 |
Learning Autoencoders with Relational Regularization | http://arxiv.org/abs/2002.02913v4 |
Learning Canonical Transformations | http://arxiv.org/abs/2011.08822v1 |
Learning Collaborative Agents with Rule Guidance for Knowledge Graph Reasoning | http://arxiv.org/abs/2005.00571v2 |
Learning Compressed Sentence Representations for On-Device Text Processing | http://arxiv.org/abs/1906.08340v1 |
Learning Constraints for Structured Prediction Using Rectifier Networks | http://arxiv.org/abs/2006.01209v1 |
Learning Context-Free Languages with Nondeterministic Stack RNNs | http://arxiv.org/abs/2010.04674v1 |
Learning Context-Sensitive Convolutional Filters for Text Processing | http://arxiv.org/abs/1709.08294v3 |
Learning Contextualized Knowledge Structures for Commonsense Reasoning | http://arxiv.org/abs/2010.12873v2 |
Learning Cross-lingual Distributed Logical Representations for Semantic Parsing | http://arxiv.org/abs/1806.05461v1 |
Learning Crosslingual Word Embeddings without Bilingual Corpora | http://arxiv.org/abs/1606.09403v1 |
Learning De-biased Representations with Biased Representations | http://arxiv.org/abs/1910.02806v3 |
Learning Deep Transformer Models for Machine Translation | http://arxiv.org/abs/1906.01787v1 |
Learning Dialog Policies from Weak Demonstrations | http://arxiv.org/abs/2004.11054v2 |
Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders | http://arxiv.org/abs/1703.10960v3 |
Learning Discrete Structured Representations by Adversarially Maximizing Mutual Information | http://arxiv.org/abs/2004.03991v2 |
Learning Dynamic Feature Selection for Fast Sequential Prediction | http://arxiv.org/abs/1505.06169v1 |
Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes | http://arxiv.org/abs/2001.02585v2 |
Learning Efficient Multi-agent Communication: An Information Bottleneck Approach | http://arxiv.org/abs/1911.06992v2 |
Learning End-to-End Goal-Oriented Dialog with Maximal User Task Success and Minimal Human Agent Use | http://arxiv.org/abs/1907.07638v1 |
Learning Entangled Single-Sample Gaussians in the Subset-of-Signals Model | http://arxiv.org/abs/2007.05557v1 |
Learning Fair Policies in Multiobjective (Deep) Reinforcement Learning with Average and Discounted Rewards | http://arxiv.org/abs/2008.07773v1 |
Learning Fair Representations for Kernel Models | http://arxiv.org/abs/1906.11813v2 |
Learning Flat Latent Manifolds with VAEs | http://arxiv.org/abs/2002.04881v3 |
Learning Functionally Decomposed Hierarchies for Continuous Control Tasks with Path Planning | http://arxiv.org/abs/2002.05954v3 |
Learning Gaussian Graphical Models via Multiplicative Weights | http://arxiv.org/abs/2002.08663v2 |
Learning Generic Sentence Representations Using Convolutional Neural Networks | http://arxiv.org/abs/1611.07897v2 |
Learning Geometric Word Meta-Embeddings | http://arxiv.org/abs/2004.09219v1 |
Learning Graph Models for Template-Free Retrosynthesis | http://arxiv.org/abs/2006.07038v1 |
Learning Graph Structure With A Finite-State Automaton Layer | http://arxiv.org/abs/2007.04929v2 |
Learning Group Structure and Disentangled Representations of Dynamical Environments | http://arxiv.org/abs/2002.06991v2 |
Learning Halfspaces with Massart Noise Under Structured Distributions | http://arxiv.org/abs/2002.05632v1 |
Learning Hierarchical Interactions at Scale: A Convex Optimization Approach | http://arxiv.org/abs/1902.01542v5 |
Learning High-dimensional Gaussian Graphical Models under Total Positivity without Adjustment of Tuning Parameters | http://arxiv.org/abs/1906.05159v4 |
Learning Human Objectives by Evaluating Hypothetical Behavior | http://arxiv.org/abs/1912.05652v1 |
Learning Hyperbolic Representations for Unsupervised 3D Segmentation | http://arxiv.org/abs/2012.01644v2 |
Learning Implicit Text Generation via Feature Matching | http://arxiv.org/abs/2005.03588v2 |
Learning Implicitly with Noisy Data in Linear Arithmetic | http://arxiv.org/abs/2010.12619v1 |
Learning Informative Representations of Biomedical Relations with Latent Variable Models | http://arxiv.org/abs/2011.10285v1 |
Learning Intrinsic Symbolic Rewards in Reinforcement Learning | http://arxiv.org/abs/2010.03694v2 |
Learning Invariant Representations for Reinforcement Learning without Reconstruction | http://arxiv.org/abs/2006.10742v1 |
Learning Joint Semantic Parsers from Disjoint Data | http://arxiv.org/abs/1804.05990v1 |
Learning Lexico-Functional Patterns for First-Person Affect | http://arxiv.org/abs/1708.09789v1 |
Learning Long-term Visual Dynamics with Region Proposal Interaction Networks | http://arxiv.org/abs/2008.02265v1 |
Learning Matching Models with Weak Supervision for Response Selection in Retrieval-based Chatbots | http://arxiv.org/abs/1805.02333v2 |
Learning Mixtures of Graphs from Epidemic Cascades | http://arxiv.org/abs/1906.06057v2 |
Learning Multilingual Word Embeddings in Latent Metric Space: A Geometric Approach | http://arxiv.org/abs/1808.08773v3 |
Learning Music Helps You Read: Using Transfer to Study Linguistic Structure in Language Models | http://arxiv.org/abs/2004.14601v3 |
Learning Near Optimal Policies with Low Inherent Bellman Error | http://arxiv.org/abs/2003.00153v3 |
Learning Neural Sequence-to-Sequence Models from Weak Feedback with Bipolar Ramp Loss | http://arxiv.org/abs/1907.03748v1 |
Learning Neural Templates for Text Generation | http://arxiv.org/abs/1808.10122v3 |
Learning Object-Centric Video Models by Contrasting Sets | http://arxiv.org/abs/2011.10287v1 |
Learning Optimal Tree Models Under Beam Search | http://arxiv.org/abs/2006.15408v1 |
Learning Outside the Box: Discourse-level Features Improve Metaphor Identification | http://arxiv.org/abs/1904.02246v2 |
Learning Overlapping Representations for the Estimation of Individualized Treatment Effects | http://arxiv.org/abs/2001.04754v3 |
Learning Portable Representations for High-Level Planning | http://arxiv.org/abs/1905.12006v1 |
Learning Probabilistic Sentence Representations from Paraphrases | http://arxiv.org/abs/2005.08105v1 |
Learning Quadratic Games on Networks | http://arxiv.org/abs/1811.08790v3 |
Learning Reasoning Strategies in End-to-End Differentiable Proving | http://arxiv.org/abs/2007.06477v3 |
Learning Representations that Support Extrapolation | http://arxiv.org/abs/2007.05059v2 |
Learning Robot Skills with Temporal Variational Inference | http://arxiv.org/abs/2006.16232v1 |
Learning Robust Models for e-Commerce Product Search | http://arxiv.org/abs/2005.03624v1 |
Learning Sequence Encoders for Temporal Knowledge Graph Completion | http://arxiv.org/abs/1809.03202v1 |
Learning Similarity Metrics for Numerical Simulations | http://arxiv.org/abs/2002.07863v2 |
Learning Source Phrase Representations for Neural Machine Translation | http://arxiv.org/abs/2006.14405v1 |
Learning Sparse Nonparametric DAGs | http://arxiv.org/abs/1909.13189v2 |
Learning Spoken Language Representations with Neural Lattice Language Modeling | http://arxiv.org/abs/2007.02629v2 |
Learning Structural Kernels for Natural Language Processing | http://arxiv.org/abs/1508.02131v1 |
Learning Structured Representations of Entity Names using Active Learning and Weak Supervision | http://arxiv.org/abs/2011.00105v1 |
Learning Symbolic Physics with Graph Networks | http://arxiv.org/abs/1909.05862v2 |
Learning To Navigate The Synthetically Accessible Chemical Space Using Reinforcement Learning | http://arxiv.org/abs/2004.12485v2 |
Learning To Solve Differential Equations Across Initial Conditions | http://arxiv.org/abs/2003.12159v2 |
Learning Variational Word Masks to Improve the Interpretability of Neural Text Classifiers | http://arxiv.org/abs/2010.00667v3 |
Learning Visually Grounded Sentence Representations | http://arxiv.org/abs/1707.06320v2 |
Learning What to Defer for Maximum Independent Sets | http://arxiv.org/abs/2006.09607v2 |
Learning Word-Like Units from Joint Audio-Visual Analysis | http://arxiv.org/abs/1701.07481v3 |
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium | http://arxiv.org/abs/2002.07066v3 |
Learning a Cost-Effective Annotation Policy for Question Answering | http://arxiv.org/abs/2010.03476v2 |
Learning a Multi-Domain Curriculum for Neural Machine Translation | http://arxiv.org/abs/1908.10940v2 |
Learning a Neural Semantic Parser from User Feedback | http://arxiv.org/abs/1704.08760v1 |
Learning a Policy for Opportunistic Active Learning | http://arxiv.org/abs/1808.10009v1 |
Learning a Simple and Effective Model for Multi-turn Response Generation with Auxiliary Tasks | http://arxiv.org/abs/2004.01972v2 |
Learning a Single Neuron with Gradient Methods | http://arxiv.org/abs/2001.05205v2 |
Learning an Unreferenced Metric for Online Dialogue Evaluation | http://arxiv.org/abs/2005.00583v1 |
Learning and Evaluating Contextual Embedding of Source Code | http://arxiv.org/abs/2001.00059v3 |
Learning and Evaluating Emotion Lexicons for 91 Languages | http://arxiv.org/abs/2005.05672v1 |
Learning and Sampling of Atomic Interventions from Observations | http://arxiv.org/abs/2002.04232v2 |
Learning beyond datasets: Knowledge Graph Augmented Neural Networks for Natural language Processing | http://arxiv.org/abs/1802.05930v2 |
Learning distributed representations of graphs with Geo2DR | http://arxiv.org/abs/2003.05926v3 |
Learning for Dose Allocation in Adaptive Clinical Trials with Safety Constraints | http://arxiv.org/abs/2006.05026v2 |
Learning from Context or Names? An Empirical Study on Neural Relation Extraction | http://arxiv.org/abs/2010.01923v2 |
Learning from Irregularly-Sampled Time Series: A Missing Data Perspective | http://arxiv.org/abs/2008.07599v1 |
Learning from Task Descriptions | http://arxiv.org/abs/2011.08115v1 |
Learning how to Active Learn: A Deep Reinforcement Learning Approach | http://arxiv.org/abs/1708.02383v1 |
Learning in Gated Neural Networks | http://arxiv.org/abs/1906.02777v2 |
Learning piecewise Lipschitz functions in changing environments | http://arxiv.org/abs/1907.09137v4 |
Learning robust visual representations using data augmentation invariance | http://arxiv.org/abs/1906.04547v1 |
Learning spectrograms with convolutional spectral kernels | http://arxiv.org/abs/1905.09917v2 |
Learning the piece-wise constant graph structure of a varying Ising model | http://arxiv.org/abs/1910.08512v2 |
Learning to Ask Good Questions: Ranking Clarification Questions using Neural Expected Value of Perfect Information | http://arxiv.org/abs/1805.04655v2 |
Learning to Ask Questions in Open-domain Conversational Systems with Typed Decoders | http://arxiv.org/abs/1805.04843v1 |
Learning to Ask Unanswerable Questions for Machine Reading Comprehension | http://arxiv.org/abs/1906.06045v1 |
Learning to Branch for Multi-Task Learning | http://arxiv.org/abs/2006.01895v2 |
Learning to Classify Intents and Slot Labels Given a Handful of Examples | http://arxiv.org/abs/2004.10793v1 |
Learning to Combine Top-Down and Bottom-Up Signals in Recurrent Neural Networks with Attention over Modules | http://arxiv.org/abs/2006.16981v3 |
Learning to Contextually Aggregate Multi-Source Supervision for Sequence Labeling | http://arxiv.org/abs/1910.04289v2 |
Learning to Continually Learn | http://arxiv.org/abs/2002.09571v2 |
Learning to Customize Model Structures for Few-shot Dialogue Generation Tasks | http://arxiv.org/abs/1910.14326v2 |
Learning to Deceive with Attention-Based Explanations | http://arxiv.org/abs/1909.07913v2 |
Learning to Decipher Hate Symbols | http://arxiv.org/abs/1904.02418v1 |
Learning to Encode Position for Transformer with Continuous Dynamical Model | http://arxiv.org/abs/2003.09229v1 |
Learning to Evaluate Translation Beyond English: BLEURT Submissions to the WMT Metrics 2020 Shared Task | http://arxiv.org/abs/2010.04297v3 |
Learning to Faithfully Rationalize by Construction | http://arxiv.org/abs/2005.00115v1 |
Learning to Fuse Sentences with Transformers for Summarization | http://arxiv.org/abs/2010.03726v1 |
Learning to Generate Compositional Color Descriptions | http://arxiv.org/abs/1606.03821v2 |
Learning to Generate Multiple Style Transfer Outputs for an Input Sentence | http://arxiv.org/abs/2002.06525v1 |
Learning to Ignore: Long Document Coreference with Bounded Memory Neural Networks | http://arxiv.org/abs/2010.02807v3 |
Learning to Learn Kernels with Variational Random Features | http://arxiv.org/abs/2006.06707v2 |
Learning to Map Context-Dependent Sentences to Executable Formal Queries | http://arxiv.org/abs/1804.06868v2 |
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout | http://arxiv.org/abs/1904.04195v1 |
Learning to Parse and Translate Improves Neural Machine Translation | http://arxiv.org/abs/1702.03525v2 |
Learning to Prune Deep Neural Networks via Reinforcement Learning | http://arxiv.org/abs/2007.04756v1 |
Learning to Rank Learning Curves | http://arxiv.org/abs/2006.03361v1 |
Learning to Reach Goals via Iterated Supervised Learning | http://arxiv.org/abs/1912.06088v4 |
Learning to Recognize Discontiguous Entities | http://arxiv.org/abs/1810.08579v3 |
Learning to Recover from Multi-Modality Errors for Non-Autoregressive Neural Machine Translation | http://arxiv.org/abs/2006.05165v1 |
Learning to Represent Action Values as a Hypergraph on the Action Vertices | http://arxiv.org/abs/2010.14680v1 |
Learning to Sample with Local and Global Contexts in Experience Replay Buffer | http://arxiv.org/abs/2007.07358v1 |
Learning to Score Behaviors for Guided Policy Optimization | http://arxiv.org/abs/1906.04349v4 |
Learning to Segment Actions from Observation and Narration | http://arxiv.org/abs/2005.03684v2 |
Learning to Simulate Complex Physics with Graph Networks | http://arxiv.org/abs/2002.09405v2 |
Learning to Stop While Learning to Predict | http://arxiv.org/abs/2006.05082v1 |
Learning to Understand Child-directed and Adult-directed Speech | http://arxiv.org/abs/2005.02721v3 |
Learning to Update Natural Language Comments Based on Code Changes | http://arxiv.org/abs/2004.12169v2 |
Learning to simulate and design for structural engineering | http://arxiv.org/abs/2003.09103v3 |
Learning with Bounded Instance- and Label-dependent Label Noise | http://arxiv.org/abs/1709.03768v3 |
Learning with Good Feature Representations in Bandits and in RL with a Generative Model | http://arxiv.org/abs/1911.07676v2 |
Learning with Multiple Complementary Labels | http://arxiv.org/abs/1912.12927v3 |
Leave-One-Out Cross-Validation for Bayesian Model Comparison in Large Data | http://arxiv.org/abs/2001.00980v1 |
Lessons from Archives: Strategies for Collecting Sociocultural Data in Machine Learning | http://arxiv.org/abs/1912.10389v1 |
Lessons from the Bible on Modern Topics: Low-Resource Multilingual Topic Model Evaluation | http://arxiv.org/abs/1804.10184v1 |
Let Me Choose: From Verbal Context to Font Selection | http://arxiv.org/abs/2005.01151v1 |
Let's Agree to Agree: Neural Networks Share Classification Order on Real Datasets | http://arxiv.org/abs/1905.10854v7 |
Levels of Analysis for Machine Learning | http://arxiv.org/abs/2004.05107v1 |
Leveraging Declarative Knowledge in Text and First-Order Logic for Fine-Grained Propaganda Detection | http://arxiv.org/abs/2004.14201v2 |
Leveraging Frequency Analysis for Deep Fake Image Recognition | http://arxiv.org/abs/2003.08685v3 |
Leveraging Graph to Improve Abstractive Multi-Document Summarization | http://arxiv.org/abs/2005.10043v1 |
Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation | http://arxiv.org/abs/2005.04816v1 |
Leveraging Multimodal Behavioral Analytics for Automated Job Interview Performance Assessment and Feedback | http://arxiv.org/abs/2006.07909v2 |
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks | http://arxiv.org/abs/1907.12461v2 |
Leveraging Procedural Generation to Benchmark Reinforcement Learning | http://arxiv.org/abs/1912.01588v2 |
Leveraging Sentence Similarity in Natural Language Generation: Improving Beam Search using Range Voting | http://arxiv.org/abs/1908.06288v2 |
Lexical Features in Coreference Resolution: To be Used With Caution | http://arxiv.org/abs/1704.06779v1 |
Lexically Constrained Neural Machine Translation with Levenshtein Transformer | http://arxiv.org/abs/2004.12681v1 |
Lexicosyntactic Inference in Neural Models | http://arxiv.org/abs/1808.06232v1 |
Lifelong Language Knowledge Distillation | http://arxiv.org/abs/2010.02123v1 |
Lifelong Learning CRF for Supervised Aspect Extraction | http://arxiv.org/abs/1705.00251v1 |
Lifted Disjoint Paths with Application in Multiple Object Tracking | http://arxiv.org/abs/2006.14550v1 |
Lifted Rule Injection for Relation Embeddings | http://arxiv.org/abs/1606.08359v2 |
Lightweight, Dynamic Graph Convolutional Networks for AMR-to-Text Generation | http://arxiv.org/abs/2010.04383v1 |
Like a Baby: Visually Situated Neural Language Acquisition | http://arxiv.org/abs/1805.11546v2 |
Like hiking? You probably enjoy nature: Persona-grounded Dialog with Commonsense Expansions | http://arxiv.org/abs/2010.03205v1 |
Likelihood Regret: An Out-of-Distribution Detection Score For Variational Auto-encoder | http://arxiv.org/abs/2003.02977v3 |
Linear Bandits with Stochastic Delayed Feedback | http://arxiv.org/abs/1807.02089v3 |
Linear Convergence of Adaptive Stochastic Gradient Descent | http://arxiv.org/abs/1908.10525v2 |
Linear Convergence of Randomized Primal-Dual Coordinate Method for Large-scale Linear Constrained Convex Programming | http://arxiv.org/abs/2008.12946v1 |
Linear Dynamics: Clustering without identification | http://arxiv.org/abs/1908.01039v3 |
Linear Lower Bounds and Conditioning of Differentiable Games | http://arxiv.org/abs/1906.07300v3 |
Linear Mode Connectivity and the Lottery Ticket Hypothesis | http://arxiv.org/abs/1912.05671v4 |
Linear-Time Constituency Parsing with RNNs and Dynamic Programming | http://arxiv.org/abs/1805.06995v2 |
Linearly Convergent Frank-Wolfe with Backtracking Line-Search | http://arxiv.org/abs/1806.05123v4 |
Linguistic Features for Readability Assessment | http://arxiv.org/abs/2006.00377v1 |
Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game | http://arxiv.org/abs/1506.04744v1 |
Linguistic Knowledge and Transferability of Contextual Representations | http://arxiv.org/abs/1903.08855v5 |
Lipschitz Constrained Parameter Initialization for Deep Transformers | http://arxiv.org/abs/1911.03179v2 |
Lipschitz and Comparator-Norm Adaptivity in Online Learning | http://arxiv.org/abs/2002.12242v2 |
Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them | http://arxiv.org/abs/1903.03862v2 |
List Decodable Subspace Recovery | http://arxiv.org/abs/2002.03004v1 |
Lite Training Strategies for Portuguese-English and English-Portuguese Translation | http://arxiv.org/abs/2008.08769v1 |
Local Differentially Private Regret Minimization in Reinforcement Learning | http://arxiv.org/abs/2010.07778v1 |
Localizing Moments in Video with Temporal Language | http://arxiv.org/abs/1809.01337v1 |
Locally Accelerated Conditional Gradients | http://arxiv.org/abs/1906.07867v2 |
Locally Private Hypothesis Selection | http://arxiv.org/abs/2002.09465v2 |
Location Attention for Extrapolation to Longer Sequences | http://arxiv.org/abs/1911.03872v2 |
Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently | http://arxiv.org/abs/2002.08095v2 |
Logarithmic Regret for Online Control | http://arxiv.org/abs/1909.05062v1 |
Logic-Guided Data Augmentation and Regularization for Consistent Question Answering | http://arxiv.org/abs/2004.10157v2 |
Logical Inferences with Comparatives and Generalized Quantifiers | http://arxiv.org/abs/2005.07954v1 |
Logical Natural Language Generation from Open-Domain Tables | http://arxiv.org/abs/2004.10404v2 |
LogicalFactChecker: Leveraging Logical Operations for Fact Checking with Graph Module Network | http://arxiv.org/abs/2004.13659v1 |
Logistic Regression for Massive Data with Rare Events | http://arxiv.org/abs/2006.00683v1 |
Long Short-Term Memory as a Dynamically Computed Element-wise Weighted Sum | http://arxiv.org/abs/1805.03716v1 |
Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors | http://arxiv.org/abs/2006.13205v2 |
Long-Short Term Masking Transformer: A Simple but Effective Baseline for Document-level Neural Machine Translation | http://arxiv.org/abs/2009.09127v1 |
Long-tail Relation Extraction via Knowledge Graph Embeddings and Graph Convolution Networks | http://arxiv.org/abs/1903.01306v1 |
Look It Up: Bilingual and Monolingual Dictionaries Improve Neural Machine Translation | http://arxiv.org/abs/2010.05997v1 |
Look at the First Sentence: Position Bias in Question Answering | http://arxiv.org/abs/2004.14602v3 |
Lookahead-Bounded Q-Learning | http://arxiv.org/abs/2006.15690v1 |
Loss Function Search for Face Recognition | http://arxiv.org/abs/2007.06542v1 |
Lossless Compression of Deep Neural Networks | http://arxiv.org/abs/2001.00218v3 |
Low Rank Fusion based Transformers for Multimodal Sequences | http://arxiv.org/abs/2007.02038v1 |
Low Resource Neural Machine Translation: A Benchmark for Five African Languages | http://arxiv.org/abs/2003.14402v1 |
Low Shot Learning with Untrained Neural Networks for Imaging Inverse Problems | http://arxiv.org/abs/1910.10797v1 |
Low-Dimensional Hyperbolic Knowledge Graph Embeddings | http://arxiv.org/abs/2005.00545v1 |
Low-Rank Bottleneck in Multi-head Attention Models | http://arxiv.org/abs/2002.07028v1 |
Low-Resource Domain Adaptation for Compositional Task-Oriented Semantic Parsing | http://arxiv.org/abs/2010.03546v1 |
Low-Variance and Zero-Variance Baselines for Extensive-Form Games | http://arxiv.org/abs/1907.09633v1 |
Low-loss connection of weight vectors: distribution-based approaches | http://arxiv.org/abs/2008.00741v1 |
Low-resource Deep Entity Resolution with Transfer and Active Learning | http://arxiv.org/abs/1906.08042v1 |
LowFER: Low-rank Bilinear Pooling for Link Prediction | http://arxiv.org/abs/2008.10858v1 |
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer | http://arxiv.org/abs/2005.00052v3 |
MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding | http://arxiv.org/abs/2010.05379v1 |
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning | http://arxiv.org/abs/2005.05402v1 |
MAST: Multimodal Abstractive Summarization with Trimodal Hierarchical Attention | http://arxiv.org/abs/2010.08021v1 |
MATINF: A Jointly Labeled Large-Scale Dataset for Classification, Question Answering and Summarization | http://arxiv.org/abs/2004.12302v2 |
MAVEN: A Massive General Domain Event Detection Dataset | http://arxiv.org/abs/2004.13590v2 |
MCMH: Learning Multi-Chain Multi-Hop Rules for Knowledge Graph Reasoning | http://arxiv.org/abs/2010.01735v1 |
MEGA RST Discourse Treebanks with Structure and Nuclearity from Scalable Distant Sentiment Supervision | http://arxiv.org/abs/2011.03017v1 |
MEGATRON-CNTRL: Controllable Story Generation with External Knowledge Using Large-Scale Language Models | http://arxiv.org/abs/2010.00840v1 |
MGHRL: Meta Goal-generation for Hierarchical Reinforcement Learning | http://arxiv.org/abs/1909.13607v4 |
MIME: MIMicking Emotions for Empathetic Response Generation | http://arxiv.org/abs/2010.01454v1 |
MLSUM: The Multilingual Summarization Corpus | http://arxiv.org/abs/2004.14900v1 |
MOCHA: A Dataset for Training and Evaluating Generative Reading Comprehension Metrics | http://arxiv.org/abs/2010.03636v2 |
MOPO: Model-based Offline Policy Optimization | http://arxiv.org/abs/2005.13239v6 |
MORSE: Semantic-ally Drive-n MORpheme SEgment-er | http://arxiv.org/abs/1702.02212v3 |
MPC-guided Imitation Learning of Neural Network Policies for the Artificial Pancreas | http://arxiv.org/abs/2003.01283v1 |
MTL2L: A Context Aware Neural Optimiser | http://arxiv.org/abs/2007.09343v1 |
MULTIPOLAR: Multi-Source Policy Aggregation for Transfer Reinforcement Learning between Diverse Environmental Dynamics | http://arxiv.org/abs/1909.13111v2 |
Machine Learning in Population and Public Health | http://arxiv.org/abs/2008.07278v1 |
Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation | http://arxiv.org/abs/2004.09813v2 |
Mapping Natural Language Instructions to Mobile UI Action Sequences | http://arxiv.org/abs/2005.03776v2 |
Mapping Natural-language Problems to Formal-language Solutions Using Structured Neural Representations | http://arxiv.org/abs/1910.02339v3 |
Mapping to Declarative Knowledge for Word Problem Solving | http://arxiv.org/abs/1712.09391v1 |
Marrying up Regular Expressions with Neural Networks: A Case Study for Spoken Language Understanding | http://arxiv.org/abs/1805.05588v1 |
Masked Language Model Scoring | http://arxiv.org/abs/1910.14659v3 |
Masking as an Efficient Alternative to Finetuning for Pretrained Language Models | http://arxiv.org/abs/2004.12406v2 |
Massively Multilingual Adversarial Speech Recognition | http://arxiv.org/abs/1904.02210v1 |
Massively Multilingual Transfer for NER | http://arxiv.org/abs/1902.00193v4 |
Matching the Blanks: Distributional Similarity for Relation Learning | http://arxiv.org/abs/1906.03158v1 |
Maximum Entropy Gain Exploration for Long Horizon Multi-goal Reinforcement Learning | http://arxiv.org/abs/2007.02832v1 |
Maximum Likelihood with Bias-Corrected Calibration is Hard-To-Beat at Label Shift Adaptation | http://arxiv.org/abs/1901.06852v5 |
Maximum Mutation Reinforcement Learning for Scalable Control | http://arxiv.org/abs/2007.13690v6 |
Maximum Reward Formulation In Reinforcement Learning | http://arxiv.org/abs/2010.03744v1 |
MeDAL: Medical Abbreviation Disambiguation Dataset for Natural Language Understanding Pretraining | http://arxiv.org/abs/2012.13978v1 |
Meaning to Form: Measuring Systematicity as Information | http://arxiv.org/abs/1906.05906v2 |
Measuring Emotions in the COVID-19 Real World Worry Dataset | http://arxiv.org/abs/2004.04225v2 |
Measuring Forecasting Skill from Text | http://arxiv.org/abs/2006.07425v2 |
Measuring Impact of Climate Change on Tree Species: analysis of JSDM on FIA data | http://arxiv.org/abs/1910.04932v1 |
Measuring Information Propagation in Literary Social Networks | http://arxiv.org/abs/2004.13980v2 |
Measuring Non-Expert Comprehension of Machine Learning Fairness Metrics | http://arxiv.org/abs/2001.00089v3 |
Measuring Thematic Fit with Distributional Feature Overlap | http://arxiv.org/abs/1707.05967v2 |
Measuring Visual Generalization in Continuous Control from Pixels | http://arxiv.org/abs/2010.06740v2 |
Median Matrix Completion: from Embarrassment to Optimality | http://arxiv.org/abs/2006.10400v1 |
Memory-enhanced Decoder for Neural Machine Translation | http://arxiv.org/abs/1606.02003v1 |
Mention Extraction and Linking for SQL Query Generation | http://arxiv.org/abs/2012.10074v1 |
Merge and Label: A novel neural network architecture for nested NER | http://arxiv.org/abs/1907.00464v1 |
Message Passing Query Embedding | http://arxiv.org/abs/2002.02406v2 |
Message Passing for Hyper-Relational Knowledge Graphs | http://arxiv.org/abs/2009.10847v1 |
Meta Fine-Tuning Neural Language Models for Multi-Domain Text Mining | http://arxiv.org/abs/2003.13003v2 |
Meta Learning Deep Visual Words for Fast Video Object Segmentation | http://arxiv.org/abs/1812.01397v3 |
Meta-Learning for Few-Shot NMT Adaptation | http://arxiv.org/abs/2004.02745v1 |
Meta-Learning with Shared Amortized Variational Inference | http://arxiv.org/abs/2008.12037v1 |
Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling | http://arxiv.org/abs/2006.07178v2 |
Meta-Reinforcement Learning for Robotic Industrial Insertion Tasks | http://arxiv.org/abs/2004.14404v2 |
Meta-SAC: Auto-tune the Entropy Temperature of Soft Actor-Critic via Metagradient | http://arxiv.org/abs/2007.01932v2 |
Meta-Transfer Learning for Code-Switched Speech Recognition | http://arxiv.org/abs/2004.14228v1 |
Meta-learning with Stochastic Linear Bandits | http://arxiv.org/abs/2005.08531v1 |
MetaFun: Meta-Learning with Iterative Functional Updates | http://arxiv.org/abs/1912.02738v4 |
Microblog Hashtag Generation via Encoding Conversation Contexts | http://arxiv.org/abs/1905.07584v1 |
Mimicking Word Embeddings using Subword RNNs | http://arxiv.org/abs/1707.06961v1 |
MinTL: Minimalist Transfer Learning for Task-Oriented Dialogue Systems | http://arxiv.org/abs/2009.12005v2 |
Mind the Trade-off: Debiasing NLU Models without Degrading the In-distribution Performance | http://arxiv.org/abs/2005.00315v1 |
Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack | http://arxiv.org/abs/1907.02044v2 |
Minimax Pareto Fairness: A Multi Objective Perspective | http://arxiv.org/abs/2011.01821v1 |
Minimax Testing of Identity to a Reference Ergodic Markov Chain | http://arxiv.org/abs/1902.00080v3 |
Minimax Weight and Q-Function Learning for Off-Policy Evaluation | http://arxiv.org/abs/1910.12809v4 |
Minimizing Dynamic Regret and Adaptive Regret Simultaneously | http://arxiv.org/abs/2002.02085v1 |
Minimizing Interference and Selection Bias in Network Experiment Design | http://arxiv.org/abs/2004.07225v1 |
Mining Discourse Markers for Unsupervised Sentence Representation Learning | http://arxiv.org/abs/1903.11850v1 |
Mining Documentation to Extract Hyperparameter Schemas | http://arxiv.org/abs/2006.16984v2 |
Mirror Descent Policy Optimization | http://arxiv.org/abs/2005.09814v3 |
Missing Data Imputation using Optimal Transport | http://arxiv.org/abs/2002.03860v3 |
Mitigating Gender Bias Amplification in Distribution by Posterior Regularization | http://arxiv.org/abs/2005.06251v1 |
Mitigating Gender Bias for Neural Dialogue Generation with Adversarial Learning | http://arxiv.org/abs/2009.13028v2 |
Mitigating Gender Bias in Machine Translation with Target Gender Annotations | http://arxiv.org/abs/2010.06203v2 |
Mitigating Gender Bias in Natural Language Processing: Literature Review | http://arxiv.org/abs/1906.08976v1 |
Mitigating Leakage in Federated Learning with Trusted Hardware | http://arxiv.org/abs/2011.04948v3 |
Mitigating Manipulation in Peer Review via Randomized Reviewer Assignments | http://arxiv.org/abs/2006.16437v2 |
Mitigating Overfitting in Supervised Classification from Two Unlabeled Datasets: A Consistent Risk Correction Approach | http://arxiv.org/abs/1910.08974v4 |
Mitigating Uncertainty in Document Classification | http://arxiv.org/abs/1907.07590v1 |
MixText: Linguistically-Informed Interpolation of Hidden Space for Semi-Supervised Text Classification | http://arxiv.org/abs/2004.12239v1 |
Mixed Strategies for Robust Optimization of Unknown Objectives | http://arxiv.org/abs/2002.12613v2 |
MixingBoard: a Knowledgeable Stylized Integrated Text Generation Platform | http://arxiv.org/abs/2005.08365v2 |
MoNet3D: Towards Accurate Monocular 3D Object Localization in Real Time | http://arxiv.org/abs/2006.16007v1 |
Mobile-Based Deep Learning Models for Banana Diseases Detection | http://arxiv.org/abs/2004.03718v1 |
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices | http://arxiv.org/abs/2004.02984v2 |
Model Fusion with Kullback–Leibler Divergence | http://arxiv.org/abs/2007.06168v1 |
Model selection for contextual bandits | http://arxiv.org/abs/1906.00531v3 |
Model-Agnostic Counterfactual Explanations for Consequential Decisions | http://arxiv.org/abs/1905.11190v5 |
Model-Based Reinforcement Learning with a Generative Model is Minimax Optimal | http://arxiv.org/abs/1906.03804v3 |
Model-Based Visual Planning with Self-Supervised Functional Distances | http://arxiv.org/abs/2012.15373v1 |
Modeling Cloud Reflectance Fields using Conditional Generative Adversarial Networks | http://arxiv.org/abs/2002.07579v2 |
Modeling Continuous Stochastic Processes with Dynamic Normalizing Flows | http://arxiv.org/abs/2002.10516v3 |
Modeling Discourse Structure for Document-level Neural Machine Translation | http://arxiv.org/abs/2006.04721v1 |
Modeling Empathy and Distress in Reaction to News Stories | http://arxiv.org/abs/1808.10399v1 |
Modeling Global and Local Node Contexts for Text Generation from Knowledge Graphs | http://arxiv.org/abs/2001.11003v2 |
Modeling Label Semantics for Predicting Emotional Reactions | http://arxiv.org/abs/2006.05489v2 |
Modeling Long Context for Task-Oriented Dialogue State Generation | http://arxiv.org/abs/2004.14080v1 |
Modeling Naive Psychology of Characters in Simple Commonsense Stories | http://arxiv.org/abs/1805.06533v1 |
Modeling Protagonist Emotions for Emotion-Aware Storytelling | http://arxiv.org/abs/2010.06822v2 |
Modeling Recurrence for Transformer | http://arxiv.org/abs/1904.03092v1 |
Modeling Semantic Compositionality with Sememe Knowledge | http://arxiv.org/abs/1907.04744v1 |
Modeling Semantic Expectation: Using Script Knowledge for Referent Prediction | http://arxiv.org/abs/1702.03121v1 |
Modeling Semantic Plausibility by Injecting World Knowledge | http://arxiv.org/abs/1804.00619v3 |
Modeling Source Syntax for Neural Machine Translation | http://arxiv.org/abs/1705.01020v1 |
Modeling Subjective Assessments of Guilt in Newspaper Crime Narratives | http://arxiv.org/abs/2006.09589v2 |
Modeling the Music Genre Perception across Language-Bound Cultures | http://arxiv.org/abs/2010.06325v2 |
Modeling, Visualization, and Analysis of African Innovation Performance | http://arxiv.org/abs/2008.07882v1 |
Modelling Lexical Ambiguity with Density Matrices | http://arxiv.org/abs/2010.05670v1 |
Modelling Suspense in Short Stories as Uncertainty Reduction over Neural Representation | http://arxiv.org/abs/2004.14905v1 |
Modular Block-diagonal Curvature Approximations for Feedforward Architectures | http://arxiv.org/abs/1902.01813v3 |
Modularized Transfomer-based Ranking Framework | http://arxiv.org/abs/2004.13313v3 |
Modulated Fusion using Transformer for Linguistic-Acoustic Emotion Recognition | http://arxiv.org/abs/2010.02057v1 |
Modulating Surrogates for Bayesian Optimization | http://arxiv.org/abs/1906.11152v4 |
MojiTalk: Generating Emotional Responses at Scale | http://arxiv.org/abs/1711.04090v2 |
Molecule Edit Graph Attention Network: Modeling Chemical Reactions as Sequences of Graph Edits | http://arxiv.org/abs/2006.15426v1 |
Momentum Improves Normalized SGD | http://arxiv.org/abs/2002.03305v2 |
Momentum in Reinforcement Learning | http://arxiv.org/abs/1910.09322v2 |
Moniqua: Modulo Quantized Communication in Decentralized SGD | http://arxiv.org/abs/2002.11787v3 |
Monitoring and explainability of models in production | http://arxiv.org/abs/2007.06299v1 |
More Data Can Expand the Generalization Gap Between Adversarially Robust and Standard Models | http://arxiv.org/abs/2002.04725v3 |
More Information Supervised Probabilistic Deep Face Embedding Learning | http://arxiv.org/abs/2006.04518v2 |
More Powerful Selective Kernel Tests for Feature Selection | http://arxiv.org/abs/1910.06134v2 |
Morph-fitting: Fine-Tuning Word Vector Spaces with Simple Language-Specific Rules | http://arxiv.org/abs/1706.00377v1 |
Morphological Irregularity Correlates with Frequency | http://arxiv.org/abs/1906.11483v1 |
Morphological Segmentation Inside-Out | http://arxiv.org/abs/1911.04916v1 |
MuTual: A Dataset for Multi-Turn Dialogue Reasoning | http://arxiv.org/abs/2004.04494v1 |
Multi-Agent Determinantal Q-Learning | http://arxiv.org/abs/2006.01482v4 |
Multi-Agent Task-Oriented Dialog Policy Learning with Role-Aware Reward Decomposition | http://arxiv.org/abs/2004.03809v2 |
Multi-Attribute Bayesian Optimization With Interactive Preference Learning | http://arxiv.org/abs/1911.05934v2 |
Multi-Dimensional Gender Bias Classification | http://arxiv.org/abs/2005.00614v1 |
Multi-Domain Dialogue Acts and Response Co-Generation | http://arxiv.org/abs/2004.12363v1 |
Multi-Domain Neural Machine Translation with Word-Level Adaptive Layer-wise Domain Mixing | http://arxiv.org/abs/1911.02692v2 |
Multi-Fact Correction in Abstractive Text Summarization | http://arxiv.org/abs/2010.02443v1 |
Multi-Hop Knowledge Graph Reasoning with Reward Shaping | http://arxiv.org/abs/1808.10568v2 |
Multi-Instance Multi-Label Learning Networks for Aspect-Category Sentiment Analysis | http://arxiv.org/abs/2010.02656v1 |
Multi-Level Matching and Aggregation Network for Few-Shot Relation Classification | http://arxiv.org/abs/1906.06678v1 |
Multi-Modal Generative Adversarial Network for Short Product Title Generation in Mobile E-Commerce | http://arxiv.org/abs/1904.01735v1 |
Multi-News: a Large-Scale Multi-Document Summarization Dataset and Abstractive Hierarchical Model | http://arxiv.org/abs/1906.01749v3 |
Multi-Objective Molecule Generation using Interpretable Substructures | http://arxiv.org/abs/2002.03244v3 |
Multi-Passage Machine Reading Comprehension with Cross-Passage Answer Verification | http://arxiv.org/abs/1805.02220v2 |
Multi-Principal Assistance Games | http://arxiv.org/abs/2007.09540v1 |
Multi-Reference Training with Pseudo-References for Neural Translation and Text Generation | http://arxiv.org/abs/1808.09564v1 |
Multi-Relational Question Answering from Narratives: Machine Reading and Reasoning in Simulated Worlds | http://arxiv.org/abs/1902.09093v1 |
Multi-Sentence Argument Linking | http://arxiv.org/abs/1911.03766v3 |
Multi-Source Unsupervised Hyperparameter Optimization | http://arxiv.org/abs/2006.10600v1 |
Multi-Step Inference for Reasoning Over Paragraphs | http://arxiv.org/abs/2004.02995v1 |
Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction | http://arxiv.org/abs/1808.09602v1 |
Multi-Task Learning in Histo-pathology for Widely Generalizable Model | http://arxiv.org/abs/2005.08645v1 |
Multi-Task Networks With Universe, Group, and Task Feature Learning | http://arxiv.org/abs/1907.01791v1 |
Multi-Task Ordinal Regression for Jointly Predicting the Trustworthiness and the Leading Political Ideology of News Media | http://arxiv.org/abs/1904.00542v1 |
Multi-Task Reinforcement Learning with Soft Modularization | http://arxiv.org/abs/2003.13661v2 |
Multi-Task Video Captioning with Video and Entailment Generation | http://arxiv.org/abs/1704.07489v2 |
Multi-Unit Transformers for Neural Machine Translation | http://arxiv.org/abs/2010.10743v2 |
Multi-View Sequence-to-Sequence Models with Conversational Structure for Abstractive Dialogue Summarization | http://arxiv.org/abs/2010.01672v1 |
Multi-XScience: A Large-scale Dataset for Extreme Multi-document Summarization of Scientific Articles | http://arxiv.org/abs/2010.14235v1 |
Multi-agent Communication meets Natural Language: Synergies between Functional and Structural Language Learning | http://arxiv.org/abs/2005.07064v1 |
Multi-document Summarization with Maximal Marginal Relevance-guided Reinforcement Learning | http://arxiv.org/abs/2010.00117v1 |
Multi-hop Inference for Question-driven Summarization | http://arxiv.org/abs/2010.03738v1 |
Multi-hop Reading Comprehension across Multiple Documents by Reasoning over Heterogeneous Graphs | http://arxiv.org/abs/1905.07374v2 |
Multi-label Few/Zero-shot Learning with Knowledge Aggregated from Multiple Label Graphs | http://arxiv.org/abs/2010.07459v1 |
Multi-lingual neural title generation for e-Commerce browse pages | http://arxiv.org/abs/1804.01041v1 |
Multi-objective Bayesian Optimization using Pareto-frontier Entropy | http://arxiv.org/abs/1906.00127v2 |
Multi-space Variational Encoder-Decoders for Semi-supervised Labeled Sequence Transduction | http://arxiv.org/abs/1704.01691v2 |
Multi-step Greedy Reinforcement Learning Algorithms | http://arxiv.org/abs/1910.02919v3 |
Multi-task Learning for Multilingual Neural Machine Translation | http://arxiv.org/abs/2010.02523v1 |
Multi-task Learning of Pairwise Sequence Classification Tasks Over Disparate Label Spaces | http://arxiv.org/abs/1802.09913v2 |
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension | http://arxiv.org/abs/1809.06963v3 |
Multi-task Reinforcement Learning with a Planning Quasi-Metric | http://arxiv.org/abs/2002.03240v3 |
Multi-turn Response Selection using Dialogue Dependency Relations | http://arxiv.org/abs/2010.01502v1 |
Multi-view Story Characterization from Movie Plot Synopses and Reviews | http://arxiv.org/abs/1908.09083v2 |
MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale | http://arxiv.org/abs/2010.00980v1 |
MultiQA: An Empirical Investigation of Generalization and Transfer in Reading Comprehension | http://arxiv.org/abs/1905.13453v1 |
MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech | http://arxiv.org/abs/2005.00812v2 |
MultiWOZ 2.2 : A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines | http://arxiv.org/abs/2007.12720v1 |
Multidimensional Persistence Module Classification via Lattice-Theoretic Convolutions | http://arxiv.org/abs/2011.14057v1 |
Multidirectional Associative Optimization of Function-Specific Word Representations | http://arxiv.org/abs/2005.05264v1 |
Multigrid Neural Memory | http://arxiv.org/abs/1906.05948v4 |
Multilevel Text Alignment with Cross-Document Attention | http://arxiv.org/abs/2010.01263v1 |
Multilinear Latent Conditioning for Generating Unseen Attribute Combinations | http://arxiv.org/abs/2009.04075v1 |
Multilingual AMR-to-Text Generation | http://arxiv.org/abs/2011.05443v1 |
Multilingual Constituency Parsing with Self-Attention and Pre-Training | http://arxiv.org/abs/1812.11760v2 |
Multilingual Denoising Pre-training for Neural Machine Translation | http://arxiv.org/abs/2001.08210v2 |
Multilingual Factor Analysis | http://arxiv.org/abs/1905.05547v2 |
Multilingual Jointly Trained Acoustic and Written Word Embeddings | http://arxiv.org/abs/2006.14007v1 |
Multilingual Offensive Language Identification with Cross-lingual Embeddings | http://arxiv.org/abs/2010.05324v1 |
Multilingual Universal Sentence Encoder for Semantic Retrieval | http://arxiv.org/abs/1907.04307v1 |
Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment | http://arxiv.org/abs/1805.08660v1 |
Multimodal Emoji Prediction | http://arxiv.org/abs/1803.02392v2 |
Multimodal Joint Attribute Prediction and Value Extraction for E-commerce Product | http://arxiv.org/abs/2009.07162v1 |
Multimodal Language Analysis with Recurrent Multistage Fusion | http://arxiv.org/abs/1808.03920v1 |
Multimodal Machine Translation with Embedding Prediction | http://arxiv.org/abs/1904.00639v1 |
Multimodal Routing: Improving Local and Global Interpretability of Multimodal Language Analysis | http://arxiv.org/abs/2004.14198v2 |
Multimodal Self-Supervised Learning for Medical Image Analysis | http://arxiv.org/abs/1912.05396v2 |
Multimodal Transformer Networks for End-to-End Video-Grounded Dialogue Systems | http://arxiv.org/abs/1907.01166v1 |
Multimodal and Multi-view Models for Emotion Recognition | http://arxiv.org/abs/1906.10198v1 |
Multinomial Logit Bandit with Low Switching Cost | http://arxiv.org/abs/2007.04876v1 |
Multiple Instance Learning Networks for Fine-Grained Sentiment Analysis | http://arxiv.org/abs/1711.09645v2 |
Multiresolution Tensor Learning for Efficient and Interpretable Spatial Analysis | http://arxiv.org/abs/2002.05578v5 |
Multiscale Collaborative Deep Models for Neural Machine Translation | http://arxiv.org/abs/2004.14021v3 |
Musical Word Embedding: Bridging the Gap between Listening Contexts and Music | http://arxiv.org/abs/2008.01190v1 |
Mutual Information Maximization for Simple and Accurate Part-Of-Speech Induction | http://arxiv.org/abs/1804.07849v4 |
My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits | http://arxiv.org/abs/2002.09808v4 |
NADS: Neural Architecture Distribution Search for Uncertainty Awareness | http://arxiv.org/abs/2006.06646v1 |
NARMADA: Need and Available Resource Managing Assistant for Disasters and Adversities | http://arxiv.org/abs/2005.13524v1 |
NASH: Toward End-to-End Neural Architecture for Generative Semantic Hashing | http://arxiv.org/abs/1805.05361v1 |
NAT: Noise-Aware Training for Robust Neural Sequence Labeling | http://arxiv.org/abs/2005.07162v1 |
NEXUS Network: Connecting the Preceding and the Following in Dialogue Generation | http://arxiv.org/abs/1810.00671v2 |
NGBoost: Natural Gradient Boosting for Probabilistic Prediction | http://arxiv.org/abs/1910.03225v4 |
NILE : Natural Language Inference with Faithful Natural Language Explanations | http://arxiv.org/abs/2005.12116v1 |
NLP Scholar: An Interactive Visual Explorer for Natural Language Processing Literature | http://arxiv.org/abs/2006.01131v1 |
NSTM: Real-Time Query-Driven News Overview Composition at Bloomberg | http://arxiv.org/abs/2006.01117v1 |
Naive Exploration is Optimal for Online LQR | http://arxiv.org/abs/2001.09576v2 |
Naive Feature Selection: Sparsity in Naive Bayes | http://arxiv.org/abs/1905.09884v2 |
Nakdan: Professional Hebrew Diacritizer | http://arxiv.org/abs/2005.03312v1 |
Named Entity Recognition Only from Word Embeddings | http://arxiv.org/abs/1909.00164v2 |
Named Entity Recognition as Dependency Parsing | http://arxiv.org/abs/2005.07150v3 |
Named Entity Recognition for Social Media Texts with Semantic Augmentation | http://arxiv.org/abs/2010.15458v1 |
Named Entity Recognition without Labelled Data: A Weak Supervision Approach | http://arxiv.org/abs/2004.14723v1 |
Native Language Cognate Effects on Second Language Lexical Choice | http://arxiv.org/abs/1805.09590v1 |
Natural Language Comprehension with the EpiReader | http://arxiv.org/abs/1606.02270v2 |
Natural Language Processing with Small Feed-Forward Networks | http://arxiv.org/abs/1708.00214v1 |
Natural language processing for achieving sustainable development: the case of neural labelling to enhance community profiling | http://arxiv.org/abs/2004.12935v2 |
Naturalizing a Programming Language via Interactive Learning | http://arxiv.org/abs/1704.06956v1 |
Navigating the Dynamics of Financial Embeddings over Time | http://arxiv.org/abs/2007.00591v1 |
Near Human-Level Performance in Grammatical Error Correction with Hybrid Machine Translation | http://arxiv.org/abs/1804.05945v1 |
Near Input Sparsity Time Kernel Embeddings via Adaptive Sampling | http://arxiv.org/abs/2007.03927v2 |
Near-Optimal Algorithms for Minimax Optimization | http://arxiv.org/abs/2002.02417v5 |
Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond | http://arxiv.org/abs/1906.11985v1 |
Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding | http://arxiv.org/abs/2010.00677v1 |
Near-linear Time Gaussian Process Optimization with Adaptive Batching and Resparsification | http://arxiv.org/abs/2002.09954v2 |
Near-optimal Regret Bounds for Stochastic Shortest Path | http://arxiv.org/abs/2002.09869v1 |
Nearly Linear Row Sampling Algorithm for Quantile Regression | http://arxiv.org/abs/2006.08397v1 |
Necessary and Sufficient Geometries for Gradient Methods | http://arxiv.org/abs/1909.10455v2 |
Negated and Misprimed Probes for Pretrained Language Models: Birds Can Talk, But Cannot Fly | http://arxiv.org/abs/1911.03343v3 |
Negative Training for Neural Dialogue Response Generation | http://arxiv.org/abs/1903.02134v5 |
Negative sampling in semi-supervised learning | http://arxiv.org/abs/1911.05166v2 |
Neighborhood Growth Determines Geometric Priors for Relational Representation Learning | http://arxiv.org/abs/1910.05565v1 |
Neighborhood Matching Network for Entity Alignment | http://arxiv.org/abs/2005.05607v1 |
Nested Named Entity Recognition via Second-best Sequence Learning and Decoding | http://arxiv.org/abs/1909.02250v3 |
Nested Reasoning About Autonomous Agents Using Probabilistic Programs | http://arxiv.org/abs/1812.01569v2 |
Nested Subspace Arrangement for Representation of Relational Data | http://arxiv.org/abs/2007.02007v1 |
Neural AMR: Sequence-to-Sequence Models for Parsing and Generation | http://arxiv.org/abs/1704.08381v3 |
Neural Abstract Reasoner | http://arxiv.org/abs/2011.09860v1 |
Neural Argument Generation Augmented with Externally Retrieved Evidence | http://arxiv.org/abs/1805.10254v1 |
Neural Bipartite Matching | http://arxiv.org/abs/2005.11304v3 |
Neural CRF Model for Sentence Alignment in Text Simplification | http://arxiv.org/abs/2005.02324v3 |
Neural CRF Parsing | http://arxiv.org/abs/1507.03641v1 |
Neural Clustering Processes | http://arxiv.org/abs/1901.00409v4 |
Neural Contextual Bandits with UCB-based Exploration | http://arxiv.org/abs/1911.04462v3 |
Neural Cross-Lingual Coreference Resolution and its Application to Entity Linking | http://arxiv.org/abs/1806.10201v1 |
Neural Data-to-Text Generation via Jointly Learning the Segmentation and Correspondence | http://arxiv.org/abs/2005.01096v1 |
Neural Decomposition: Functional ANOVA with Variational Autoencoders | http://arxiv.org/abs/2006.14293v2 |
Neural Deepfake Detection with Factual Structure of Text | http://arxiv.org/abs/2010.07475v1 |
Neural Differential Equations for Single Image Super-resolution | http://arxiv.org/abs/2005.00865v1 |
Neural Discourse Structure for Text Categorization | http://arxiv.org/abs/1702.01829v2 |
Neural Dynamic Policies for End-to-End Sensorimotor Learning | http://arxiv.org/abs/2012.02788v1 |
Neural End-to-End Learning for Computational Argumentation Mining | http://arxiv.org/abs/1704.06104v2 |
Neural Fine-Grained Entity Type Classification with Hierarchy-Aware Loss | http://arxiv.org/abs/1803.03378v2 |
Neural Generation of Dialogue Response Timings | http://arxiv.org/abs/2005.09128v1 |
Neural Grammatical Error Correction with Finite State Transducers | http://arxiv.org/abs/1903.10625v2 |
Neural Kernels Without Tangents | http://arxiv.org/abs/2003.02237v2 |
Neural Language Models as Psycholinguistic Subjects: Representations of Syntactic State | http://arxiv.org/abs/1903.03260v1 |
Neural Latent Relational Analysis to Capture Lexical Semantic Relations in a Vector Space | http://arxiv.org/abs/1809.03401v1 |
Neural Legal Judgment Prediction in English | http://arxiv.org/abs/1906.02059v1 |
Neural Machine Translation of Text from Non-Native Speakers | http://arxiv.org/abs/1808.06267v2 |
Neural Machine Translation via Binary Code Prediction | http://arxiv.org/abs/1704.06918v1 |
Neural Machine Translation with Source-Side Latent Graph Parsing | http://arxiv.org/abs/1702.02265v4 |
Neural Manifold Ordinary Differential Equations | http://arxiv.org/abs/2006.10254v1 |
Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation | http://arxiv.org/abs/2010.02705v1 |
Neural Metaphor Detection in Context | http://arxiv.org/abs/1808.09653v1 |
Neural Models for Documents with Metadata | http://arxiv.org/abs/1705.09296v2 |
Neural Open Information Extraction | http://arxiv.org/abs/1805.04270v1 |
Neural Operator: Graph Kernel Network for Partial Differential Equations | http://arxiv.org/abs/2003.03485v1 |
Neural Ordinary Differential Equations on Manifolds | http://arxiv.org/abs/2006.06663v1 |
Neural Proof Nets | http://arxiv.org/abs/2009.12702v1 |
Neural Related Work Summarization with a Joint Context-driven Attention Mechanism | http://arxiv.org/abs/1901.09492v1 |
Neural Responding Machine for Short-Text Conversation | http://arxiv.org/abs/1503.02364v2 |
Neural Segmental Hypergraphs for Overlapping Mention Recognition | http://arxiv.org/abs/1810.01817v1 |
Neural Simultaneous Speech Translation Using Alignment-Based Chunking | http://arxiv.org/abs/2005.14489v1 |
Neural Symbolic Machines: Learning Semantic Parsers on Freebase with Weak Supervision | http://arxiv.org/abs/1611.00020v4 |
Neural Syntactic Preordering for Controlled Paraphrase Generation | http://arxiv.org/abs/2005.02013v1 |
Neural Temporal Opinion Modelling for Opinion Prediction on Twitter | http://arxiv.org/abs/2005.13486v1 |
Neural Text Generation from Structured Data with Application to the Biography Domain | http://arxiv.org/abs/1603.07771v3 |
Neural Topic Modeling by Incorporating Document Relationship Graph | http://arxiv.org/abs/2009.13972v1 |
Neural Topic Modeling with Bidirectional Adversarial Training | http://arxiv.org/abs/2004.12331v1 |
Neural Topic Modeling with Continual Lifelong Learning | http://arxiv.org/abs/2006.10909v1 |
Neural Topic Modeling with Cycle-Consistent Adversarial Training | http://arxiv.org/abs/2009.13971v1 |
Neural Transductive Learning and Beyond: Morphological Generation in the Minimal-Resource Setting | http://arxiv.org/abs/1809.08733v2 |
Neural Word Segmentation with Rich Pretraining | http://arxiv.org/abs/1704.08960v1 |
Neural models of factuality | http://arxiv.org/abs/1804.02472v1 |
Neural reparameterization improves structural optimization | http://arxiv.org/abs/1909.04240v2 |
Neural versus Phrase-Based Machine Translation Quality: a Case Study | http://arxiv.org/abs/1608.04631v2 |
NeuralREG: An end-to-end approach to referring expression generation | http://arxiv.org/abs/1805.08093v1 |
Neurals Networks for Projecting Named Entities from English to Ewondo | http://arxiv.org/abs/2004.13841v1 |
Neuro-Symbolic Visual Reasoning: Disentangling "Visual" from "Reasoning" | http://arxiv.org/abs/2006.11524v3 |
New Oracle-Efficient Algorithms for Private Synthetic Data Release | http://arxiv.org/abs/2007.05453v1 |
New Potential-Based Bounds for Prediction with Expert Advice | http://arxiv.org/abs/1911.01641v3 |
New Protocols and Negative Results for Textual Entailment Data Collection | http://arxiv.org/abs/2004.11997v2 |
Newsroom: A Dataset of 1.3 Million Summaries with Diverse Extractive Strategies | http://arxiv.org/abs/1804.11283v2 |
No Metrics Are Perfect: Adversarial Reward Learning for Visual Storytelling | http://arxiv.org/abs/1804.09160v2 |
No Permanent Friends or Enemies: Tracking Relationships between Nations from News | http://arxiv.org/abs/1904.08950v1 |
No-Regret Prediction in Marginally Stable Systems | http://arxiv.org/abs/2002.02064v3 |
Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency | http://arxiv.org/abs/1809.01812v1 |
Noise-tolerant, Reliable Active Classification with Comparison Queries | http://arxiv.org/abs/2001.05497v1 |
Noisy-Input Entropy Search for Efficient Robust Bayesian Optimization | http://arxiv.org/abs/2002.02820v1 |
Non-Autoregressive Machine Translation with Latent Alignments | http://arxiv.org/abs/2004.07437v3 |
Non-Parametric Calibration for Classification | http://arxiv.org/abs/1906.04933v3 |
Non-Projective Dependency Parsing with Non-Local Transitions | http://arxiv.org/abs/1710.09340v3 |
Non-convex Learning via Replica Exchange Stochastic Gradient MCMC | http://arxiv.org/abs/2008.05367v2 |
Non-exchangeable feature allocation models with sublinear growth of the feature sizes | http://arxiv.org/abs/2003.13491v1 |
Non-linear interlinkages and key objectives amongst the Paris Agreement and the Sustainable Development Goals | http://arxiv.org/abs/2004.09318v1 |
Nonmyopic Gaussian Process Optimization with Macro-Actions | http://arxiv.org/abs/2002.09670v1 |
Nonparametric Estimation in the Dynamic Bradley-Terry Model | http://arxiv.org/abs/2003.00083v1 |
Nonparametric Score Estimators | http://arxiv.org/abs/2005.10099v2 |
Norm-Based Curriculum Learning for Neural Machine Translation | http://arxiv.org/abs/2006.02014v1 |
Normalized Flat Minima: Exploring Scale Invariant Definition of Flat Minima for Neural Networks using PAC-Bayesian Analysis | http://arxiv.org/abs/1901.04653v2 |
Normalized Loss Functions for Deep Learning with Noisy Labels | http://arxiv.org/abs/2006.13554v1 |
Normalizing Flows Across Dimensions | http://arxiv.org/abs/2006.13070v1 |
Normalizing Flows on Tori and Spheres | http://arxiv.org/abs/2002.02428v2 |
Normalizing Flows with Multi-Scale Autoregressive Priors | http://arxiv.org/abs/2004.03891v1 |
Not All Claims are Created Equal: Choosing the Right Statistical Approach to Assess Hypotheses | http://arxiv.org/abs/1911.03850v3 |
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation | http://arxiv.org/abs/2009.09359v2 |
Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection | http://arxiv.org/abs/2004.07667v2 |
Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers | http://arxiv.org/abs/1805.08154v1 |
Nurse is Closer to Woman than Surgeon? Mitigating Gender-Biased Proximities in Word Embeddings | http://arxiv.org/abs/2006.01938v1 |
NwQM: A neural quality assessment framework for Wikipedia | http://arxiv.org/abs/2010.06969v1 |
OBJ2TEXT: Generating Visually Descriptive Language from Object Layouts | http://arxiv.org/abs/1707.07102v1 |
ON-TRAC Consortium for End-to-End and Simultaneous Speech Translation Challenge Tasks at IWSLT 2020 | http://arxiv.org/abs/2005.11861v1 |
OPAL: Offline Primitive Discovery for Accelerating Offline Reinforcement Learning | http://arxiv.org/abs/2010.13611v2 |
OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits | http://arxiv.org/abs/1905.10040v4 |
Obfuscation for Privacy-preserving Syntactic Parsing | http://arxiv.org/abs/1904.09585v2 |
Obfuscation via Information Density Estimation | http://arxiv.org/abs/1910.08109v1 |
Object Ordering with Bidirectional Matchings for Visual Reasoning | http://arxiv.org/abs/1804.06870v2 |
Observing Dialogue in Therapy: Categorizing and Forecasting Behavioral Codes | http://arxiv.org/abs/1907.00326v1 |
Obtaining Adjustable Regularization for Free via Iterate Averaging | http://arxiv.org/abs/2008.06736v1 |
Obtaining Faithful Interpretations from Compositional Neural Networks | http://arxiv.org/abs/2005.00724v2 |
Off-Dynamics Reinforcement Learning: Training for Transfer with Domain Classifiers | http://arxiv.org/abs/2006.13916v1 |
Off-Policy Actor-Critic with Shared Experience Replay | http://arxiv.org/abs/1909.11583v2 |
Offline Meta-Reinforcement Learning with Advantage Weighting | http://arxiv.org/abs/2008.06043v2 |
Old Dog Learns New Tricks: Randomized UCB for Bandit Problems | http://arxiv.org/abs/1910.04928v2 |
On Contrastive Learning for Likelihood-free Inference | http://arxiv.org/abs/2002.03712v2 |
On Convergence-Diagnostic based Step Sizes for Stochastic Gradient Descent | http://arxiv.org/abs/2007.00534v1 |
On Coresets For Regularized Regression | http://arxiv.org/abs/2006.05440v3 |
On Cross-Dataset Generalization in Automatic Detection of Online Abuse | http://arxiv.org/abs/2010.07414v2 |
On Detecting Data Pollution Attacks On Recommender Systems Using Sequential GANs | http://arxiv.org/abs/2012.02509v1 |
On Differentially Private Stochastic Convex Optimization with Heavy-tailed Data | http://arxiv.org/abs/2010.11082v1 |
On Dimensional Linguistic Properties of the Word Embedding Space | http://arxiv.org/abs/1910.02211v2 |
On Effective Parallelization of Monte Carlo Tree Search | http://arxiv.org/abs/2006.08785v2 |
On Efficient Constructions of Checkpoints | http://arxiv.org/abs/2009.13003v1 |
On Efficient Low Distortion Ultrametric Embedding | http://arxiv.org/abs/2008.06700v1 |
On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models | http://arxiv.org/abs/1903.06620v2 |
On Exposure Bias, Hallucination and Domain Shift in Neural Machine Translation | http://arxiv.org/abs/2005.03642v1 |
On Extractive and Abstractive Neural Document Summarization with Transformer Language Models | http://arxiv.org/abs/1909.03186v2 |
On Faithfulness and Factuality in Abstractive Summarization | http://arxiv.org/abs/2005.00661v1 |
On Generalization Bounds of a Family of Recurrent Neural Networks | http://arxiv.org/abs/1910.12947v2 |
On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems | http://arxiv.org/abs/1906.00331v6 |
On Graph Classification Networks, Datasets and Baselines | http://arxiv.org/abs/1905.04682v1 |
On Incorporating Structural Information to improve Dialogue Response Generation | http://arxiv.org/abs/2005.14315v1 |
On Iterative Neural Network Pruning, Reinitialization, and the Similarity of Masks | http://arxiv.org/abs/2001.05050v1 |
On Layer Normalization in the Transformer Architecture | http://arxiv.org/abs/2002.04745v2 |
On Learning Language-Invariant Representations for Universal Machine Translation | http://arxiv.org/abs/2008.04510v1 |
On Learning Sets of Symmetric Elements | http://arxiv.org/abs/2002.08599v4 |
On Linear Stochastic Approximation: Fine-grained Polyak-Ruppert and Non-Asymptotic Concentration | http://arxiv.org/abs/2004.04719v1 |
On Losses for Modern Language Models | http://arxiv.org/abs/2010.01694v1 |
On Maximization of Weakly Modular Functions: Guarantees of Multi-stage Algorithms, Tractability, and Hardness | http://arxiv.org/abs/1805.11251v5 |
On Measuring Social Biases in Sentence Encoders | http://arxiv.org/abs/1903.10561v1 |
On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment | http://arxiv.org/abs/2010.03017v1 |
On Optimal Transformer Depth for Low-Resource Language Translation | http://arxiv.org/abs/2004.04418v2 |
On Polynomial Approximations for Privacy-Preserving and Verifiable ReLU Networks | http://arxiv.org/abs/2011.05530v1 |
On Primes, Log-Loss Scores and (No) Privacy | http://arxiv.org/abs/2009.08559v1 |
On Random Subsampling of Gaussian Process Regression: A Graphon-Based Analysis | http://arxiv.org/abs/1901.09541v1 |
On Second-Order Group Influence Functions for Black-Box Predictions | http://arxiv.org/abs/1911.00418v2 |
On Suboptimality of Least Squares with Application to Estimation of Convex Bodies | http://arxiv.org/abs/2006.04046v1 |
On The Evaluation of Machine Translation Systems Trained With Back-Translation | http://arxiv.org/abs/1908.05204v2 |
On Thompson Sampling for Smoother-than-Lipschitz Bandits | http://arxiv.org/abs/2001.02323v2 |
On Unbalanced Optimal Transport: An Analysis of Sinkhorn Algorithm | http://arxiv.org/abs/2002.03293v2 |
On Using Very Large Target Vocabulary for Neural Machine Translation | http://arxiv.org/abs/1412.2007v2 |
On Variational Learning of Controllable Representations for Text without Supervision | http://arxiv.org/abs/1905.11975v4 |
On conditional versus marginal bias in multi-armed bandits | http://arxiv.org/abs/2002.08422v2 |
On the Benefits of Models with Perceptually-Aligned Gradients | http://arxiv.org/abs/2005.01499v1 |
On the Choice of Auxiliary Languages for Improved Sequence Tagging | http://arxiv.org/abs/2005.09389v1 |
On the Complementary Nature of Knowledge Graph Embedding, Fine Grain Entity Types, and Language Modeling | http://arxiv.org/abs/2010.05732v1 |
On the Computational Power of Transformers and its Implications in Sequence Modeling | http://arxiv.org/abs/2006.09286v3 |
On the Consistency of Top-k Surrogate Losses | http://arxiv.org/abs/1901.11141v2 |
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization | http://arxiv.org/abs/1808.05671v3 |
On the Convergence of Continuous Constrained Optimization for Structure Learning | http://arxiv.org/abs/2011.11150v2 |
On the Convergence of Nesterov's Accelerated Gradient Method in Stochastic Settings | http://arxiv.org/abs/2002.12414v2 |
On the Convergence of SARAH and Beyond | http://arxiv.org/abs/1906.02351v2 |
On the Convergence of Stochastic Gradient Descent with Low-Rank Projections for Convex Low-Rank Matrix Problems | http://arxiv.org/abs/2001.11668v2 |
On the Cross-lingual Transferability of Monolingual Representations | http://arxiv.org/abs/1910.11856v3 |
On the Encoder-Decoder Incompatibility in Variational Text Modeling and Beyond | http://arxiv.org/abs/2004.09189v1 |
On the Expressivity of Neural Networks for Deep Reinforcement Learning | http://arxiv.org/abs/1910.05927v3 |
On the Frailty of Universal POS Tags for Neural UD Parsers | http://arxiv.org/abs/2010.01830v3 |
On the Generalization Benefit of Noise in Stochastic Gradient Descent | http://arxiv.org/abs/2006.15081v1 |
On the Global Convergence Rates of Softmax Policy Gradient Methods | http://arxiv.org/abs/2005.06392v2 |
On the Idiosyncrasies of the Mandarin Chinese Classifier System | http://arxiv.org/abs/1902.10193v3 |
On the Inference Calibration of Neural Machine Translation | http://arxiv.org/abs/2005.00963v1 |
On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation | http://arxiv.org/abs/2005.01196v3 |
On the Limitations of Unsupervised Bilingual Dictionary Induction | http://arxiv.org/abs/1805.03620v1 |
On the Linguistic Representational Power of Neural Machine Translation Models | http://arxiv.org/abs/1911.00317v1 |
On the Multiple Descent of Minimum-Norm Interpolants and Restricted Lower Isometry of Kernels | http://arxiv.org/abs/1908.10292v2 |
On the Noisy Gradient Descent that Generalizes as SGD | http://arxiv.org/abs/1906.07405v3 |
On the Number of Linear Regions of Convolutional Neural Networks | http://arxiv.org/abs/2006.00978v2 |
On the Practical Computational Power of Finite Precision RNNs for Language Recognition | http://arxiv.org/abs/1805.04908v1 |
On the Relation between Quality-Diversity Evaluation and Distribution-Fitting Goal in Text Generation | http://arxiv.org/abs/2007.01488v2 |
On the Robustness of Language Encoders against Grammatical Errors | http://arxiv.org/abs/2005.05683v1 |
On the Role of Supervision in Unsupervised Constituency Parsing | http://arxiv.org/abs/2010.02423v2 |
On the Sample Complexity of Adversarial Multi-Source PAC Learning | http://arxiv.org/abs/2002.10384v2 |
On the Sample Complexity of Learning Sum-Product Networks | http://arxiv.org/abs/1912.02765v2 |
On the Sentence Embeddings from Pre-trained Language Models | http://arxiv.org/abs/2011.05864v1 |
On the Sparsity of Neural Machine Translation Models | http://arxiv.org/abs/2010.02646v1 |
On the Spontaneous Emergence of Discrete and Compositional Signals | http://arxiv.org/abs/2005.00110v1 |
On the Theoretical Properties of the Network Jackknife | http://arxiv.org/abs/2004.08935v2 |
On the Unreasonable Effectiveness of the Greedy Algorithm: Greedy Adapts to Sharpness | http://arxiv.org/abs/2002.04063v1 |
On the diminishing return of labeling clinical reports | http://arxiv.org/abs/2010.14587v1 |
On the importance of pre-training data volume for compact language models | http://arxiv.org/abs/2010.03813v2 |
On the interplay between noise and curvature and its effect on optimization and generalization | http://arxiv.org/abs/1906.07774v2 |
On the optimality of kernels for high-dimensional clustering | http://arxiv.org/abs/1912.00458v1 |
On the space-time expressivity of ResNets | http://arxiv.org/abs/1910.09599v4 |
On-The-Fly Information Retrieval Augmentation for Language Models | http://arxiv.org/abs/2007.01528v1 |
One Policy to Control Them All: Shared Modular Policies for Agent-Agnostic Control | http://arxiv.org/abs/2007.04976v1 |
One Size Does Not Fit All: Generating and Evaluating Variable Number of Keyphrases | http://arxiv.org/abs/1810.05241v4 |
One Size Fits All: Can We Train One Denoiser for All Noise Levels? | http://arxiv.org/abs/2005.09627v3 |
One Solution is Not All You Need: Few-Shot Extrapolation via Structured MaxEnt RL | http://arxiv.org/abs/2010.14484v2 |
Online Continuous DR-Submodular Maximization with Long-Term Budget Constraints | http://arxiv.org/abs/1907.00316v1 |
Online Conversation Disentanglement with Pointer Networks | http://arxiv.org/abs/2010.11080v1 |
Online Dense Subgraph Discovery via Blurred-Graph Feedback | http://arxiv.org/abs/2006.13642v1 |
Online Forecasting of Total-Variation-bounded Sequences | http://arxiv.org/abs/1906.03364v2 |
Online Hyper-parameter Tuning in Off-policy Learning via Evolutionary Strategies | http://arxiv.org/abs/2006.07554v1 |
Online Learning Using Only Peer Prediction | http://arxiv.org/abs/1910.04382v2 |
Online Learning for Active Cache Synchronization | http://arxiv.org/abs/2002.12014v2 |
Online Learning with Continuous Variations: Dynamic Regret and Reductions | http://arxiv.org/abs/1902.07286v3 |
Online Learning with Imperfect Hints | http://arxiv.org/abs/2002.04726v2 |
Online Pricing with Offline Data: Phase Transition and Inverse Square Law | http://arxiv.org/abs/1910.08693v6 |
Online Safety Assurance for Deep Reinforcement Learning | http://arxiv.org/abs/2010.03625v1 |
Online Segment to Segment Neural Transduction | http://arxiv.org/abs/1609.08194v1 |
Online metric algorithms with untrusted predictions | http://arxiv.org/abs/2003.02144v2 |
Online mirror descent and dual averaging: keeping pace in the dynamic case | http://arxiv.org/abs/2006.02585v2 |
Open Domain Event Extraction Using Neural Latent Variable Models | http://arxiv.org/abs/1906.06947v1 |
Open Domain Question Answering Using Early Fusion of Knowledge Bases and Text | http://arxiv.org/abs/1809.00782v1 |
Open Korean Corpora: A Practical Report | http://arxiv.org/abs/2012.15621v1 |
Operation-Aware Soft Channel Pruning using Differentiable Masks | http://arxiv.org/abs/2007.03938v2 |
OpinionDigest: A Simple Framework for Opinion Summarization | http://arxiv.org/abs/2005.01901v1 |
Opportunistic Decoding with Timely Correction for Simultaneous Translation | http://arxiv.org/abs/2005.00675v1 |
Optimal Client Sampling for Federated Learning | http://arxiv.org/abs/2010.13723v1 |
Optimal Continual Learning has Perfect Memory and is NP-hard | http://arxiv.org/abs/2006.05188v1 |
Optimal Randomized First-Order Methods for Least-Squares Problems | http://arxiv.org/abs/2002.09488v2 |
Optimal Robust Learning of Discrete Distributions from Batches | http://arxiv.org/abs/1911.08532v2 |
Optimal Transport-based Alignment of Learned Character Representations for String Similarity | http://arxiv.org/abs/1907.10165v1 |
Optimal approximation for unconstrained non-submodular minimization | http://arxiv.org/abs/1905.12145v3 |
Optimal group testing | http://arxiv.org/abs/1911.02287v3 |
Optimal transport mapping via input convex neural networks | http://arxiv.org/abs/1908.10962v2 |
Optimistic Policy Optimization with Bandit Feedback | http://arxiv.org/abs/2002.08243v2 |
Optimistic bounds for multi-output prediction | http://arxiv.org/abs/2002.09769v1 |
Optimization Theory for ReLU Neural Networks Trained with Normalization Layers | http://arxiv.org/abs/2006.06878v1 |
Optimization from Structured Samples for Coverage Functions | http://arxiv.org/abs/2007.02738v1 |
Optimization of Graph Total Variation via Active-Set-based Combinatorial Reconditioning | http://arxiv.org/abs/2002.12236v1 |
Optimized Score Transformation for Fair Classification | http://arxiv.org/abs/1906.00066v2 |
Optimizer Benchmarking Needs to Account for Hyperparameter Tuning | http://arxiv.org/abs/1910.11758v4 |
Optimizing Black-box Metrics with Adaptive Surrogates | http://arxiv.org/abs/2002.08605v1 |
Optimizing Data Usage via Differentiable Rewards | http://arxiv.org/abs/1911.10088v2 |
Optimizing Long-term Social Welfare in Recommender Systems: A Constrained Matching Approach | http://arxiv.org/abs/2008.00104v2 |
Optimizing Memory Placement using Evolutionary Graph Reinforcement Learning | http://arxiv.org/abs/2007.07298v2 |
Optimizing Millions of Hyperparameters by Implicit Differentiation | http://arxiv.org/abs/1911.02590v1 |
Optimizing the Factual Correctness of a Summary: A Study of Summarizing Radiology Reports | http://arxiv.org/abs/1911.02541v3 |
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space | http://arxiv.org/abs/2004.04092v4 |
Option Discovery in the Absence of Rewards with Manifold Analysis | http://arxiv.org/abs/2003.05878v2 |
Oracle Efficient Private Non-Convex Optimization | http://arxiv.org/abs/1909.01783v3 |
Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization | http://arxiv.org/abs/1907.04371v5 |
Ordinal Non-negative Matrix Factorization for Recommendation | http://arxiv.org/abs/2006.01034v4 |
Orthogonal Gradient Descent for Continual Learning | http://arxiv.org/abs/1910.07104v1 |
Orthogonal Relation Transforms with Graph Context Modeling for Knowledge Graph Embedding | http://arxiv.org/abs/1911.04910v3 |
Orthogonalized SGD and Nested Architectures for Anytime Neural Networks | http://arxiv.org/abs/2008.06635v1 |
Out of the Echo Chamber: Detecting Countering Debate Speeches | http://arxiv.org/abs/2005.01157v1 |
Overcoming Language Variation in Sentiment Analysis with Social Attention | http://arxiv.org/abs/1511.06052v4 |
Overfitting in adversarially robust deep learning | http://arxiv.org/abs/2002.11569v2 |
P-SIF: Document Embeddings Using Partition Averaging | http://arxiv.org/abs/2005.09069v1 |
PAC Bounds for Imitation and Model-based Batch Learning of Contextual Markov Decision Processes | http://arxiv.org/abs/2006.06352v2 |
PAC learning with stable and private predictions | http://arxiv.org/abs/1911.10541v2 |
PACRR: A Position-Aware Neural IR Model for Relevance Matching | http://arxiv.org/abs/1704.03940v3 |
PAGE: A Simple and Optimal Probabilistic Gradient Estimator for Nonconvex Optimization | http://arxiv.org/abs/2008.10898v2 |
PAIR: Planning and Iterative Refinement in Pre-trained Transformers for Long Text Generation | http://arxiv.org/abs/2010.02301v1 |
PALM: Pre-training an Autoencoding&Autoregressive Language Model for Context-conditioned Generation | http://arxiv.org/abs/2004.07159v2 |
PAN: Path Integral Based Convolution for Deep Graph Neural Networks | http://arxiv.org/abs/1904.10996v1 |
PARADE: A New Dataset for Paraphrase Identification Requiring Computer Science Domain Knowledge | http://arxiv.org/abs/2010.03725v1 |
PDO-eConvs: Partial Differential Operator Based Equivariant Convolutions | http://arxiv.org/abs/2007.10408v2 |
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization | http://arxiv.org/abs/1912.08777v3 |
PENNI: Pruned Kernel Sharing for Efficient CNN Inference | http://arxiv.org/abs/2005.07133v2 |
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models | http://arxiv.org/abs/2006.09075v1 |
PHICON: Improving Generalization of Clinical Text De-identification Models via Data Augmentation | http://arxiv.org/abs/2010.05143v1 |
PLAS: Latent Action Space for Offline Reinforcement Learning | http://arxiv.org/abs/2011.07213v1 |
PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable | http://arxiv.org/abs/1910.07931v3 |
POPCORN: Partially Observed Prediction COnstrained ReiNforcement Learning | http://arxiv.org/abs/2001.04032v2 |
POSEIDON: Privacy-Preserving Federated Neural Network Learning | http://arxiv.org/abs/2009.00349v3 |
PRover: Proof Generation for Interpretable Reasoning over Rules | http://arxiv.org/abs/2010.02830v1 |
PackIt: A Virtual Environment for Geometric Planning | http://arxiv.org/abs/2007.11121v1 |
Pan-Private Uniformity Testing | http://arxiv.org/abs/1911.01452v3 |
Parallel Algorithm for Non-Monotone DR-Submodular Maximization | http://arxiv.org/abs/1905.13272v1 |
Parallel Corpus Filtering via Pre-trained Language Models | http://arxiv.org/abs/2005.06166v1 |
Parallel Data Augmentation for Formality Style Transfer | http://arxiv.org/abs/2005.07522v1 |
Parallel Interactive Networks for Multi-Domain Dialogue State Generation | http://arxiv.org/abs/2009.07616v3 |
Parallels Between Phase Transitions and Circuit Complexity? | http://arxiv.org/abs/1904.05483v2 |
Parameters Estimation from the 21 cm signal using Variational Inference | http://arxiv.org/abs/2005.02299v1 |
Parametric Gaussian Process Regressors | http://arxiv.org/abs/1910.07123v3 |
Paraphrase Augmented Task-Oriented Dialog Generation | http://arxiv.org/abs/2004.07462v2 |
Paraphrase Generation as Zero-Shot Multilingual Translation: Disentangling Semantic Similarity from Lexical and Syntactic Diversity | http://arxiv.org/abs/2008.04935v2 |
Paraphrase to Explicate: Revealing Implicit Noun-Compound Relations | http://arxiv.org/abs/1805.02442v1 |
PareCO: Pareto-aware Channel Optimization for Slimmable Neural Networks | http://arxiv.org/abs/2007.11752v2 |
Pareto Probing: Trading Off Accuracy for Complexity | http://arxiv.org/abs/2010.02180v2 |
Parrot: Data-Driven Behavioral Priors for Reinforcement Learning | http://arxiv.org/abs/2011.10024v1 |
Parsing Speech: A Neural Approach to Integrating Lexical and Acoustic-Prosodic Information | http://arxiv.org/abs/1704.07287v2 |
Parsing as Reduction | http://arxiv.org/abs/1503.00030v1 |
Partial Trace Regression and Low-Rank Kraus Decomposition | http://arxiv.org/abs/2007.00935v2 |
Partially-Aligned Data-to-Text Generation with Distant Supervision | http://arxiv.org/abs/2010.01268v1 |
Past, Present, Future: A Computational Investigation of the Typology of Tense in 1000 Languages | http://arxiv.org/abs/1704.08914v2 |
Pathologies of Neural Models Make Interpretations Difficult | http://arxiv.org/abs/1804.07781v3 |
Patient-Specific Effects of Medication Using Latent Force Models with Gaussian Processes | http://arxiv.org/abs/1906.00226v1 |
PePScenes: A Novel Dataset and Baseline for Pedestrian Action Prediction in 3D | http://arxiv.org/abs/2012.07773v1 |
PeTra: A Sparsely Supervised Memory Model for People Tracking | http://arxiv.org/abs/2005.02990v1 |
Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates | http://arxiv.org/abs/1910.03231v7 |
Perceptual Generative Autoencoders | http://arxiv.org/abs/1906.10335v2 |
Performative Prediction | http://arxiv.org/abs/2002.06673v3 |
Permutation Invariant Graph Generation via Score-Based Generative Modeling | http://arxiv.org/abs/2003.00638v1 |
Permutation invariant networks to learn Wasserstein metrics | http://arxiv.org/abs/2010.05820v3 |
PersLay: A Neural Network Layer for Persistence Diagrams and New Graph Topological Signatures | http://arxiv.org/abs/1904.09378v4 |
Personality Trait Detection Using Bagged SVM over BERT Word Embedding Ensembles | http://arxiv.org/abs/2010.01309v1 |
Personalized Language Model for Query Auto-Completion | http://arxiv.org/abs/1804.09661v1 |
Personalized Neural Embeddings for Collaborative Filtering with Text | http://arxiv.org/abs/1903.07860v1 |
Personalized neural language models for real-world query auto completion | http://arxiv.org/abs/1804.06439v3 |
Personalizing Dialogue Agents: I have a dog, do you have pets too? | http://arxiv.org/abs/1801.07243v5 |
Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good | http://arxiv.org/abs/1906.06725v2 |
Perturbed Masking: Parameter-free Probing for Analyzing and Interpreting BERT | http://arxiv.org/abs/2004.14786v2 |
Pessimism About Unknown Unknowns Inspires Conservatism | http://arxiv.org/abs/2006.08753v1 |
Phone Features Improve Speech Translation | http://arxiv.org/abs/2005.13681v1 |
Phonetic and Visual Priors for Decipherment of Informal Romanization | http://arxiv.org/abs/2005.02517v1 |
Phonotactic Complexity and its Trade-offs | http://arxiv.org/abs/2005.03774v1 |
Phrase-Based & Neural Unsupervised Machine Translation | http://arxiv.org/abs/1804.07755v2 |
Phrase-Indexed Question Answering: A New Challenge for Scalable Document Comprehension | http://arxiv.org/abs/1804.07726v2 |
Pieces of Eight: 8-bit Neural Machine Translation | http://arxiv.org/abs/1804.05038v1 |
Piecewise Linear Regression via a Difference of Convex Functions | http://arxiv.org/abs/2007.02422v3 |
Planning from Pixels using Inverse Dynamics Models | http://arxiv.org/abs/2012.02419v1 |
Planning to Explore via Self-Supervised World Models | http://arxiv.org/abs/2005.05960v2 |
Playing 20 Question Game with Policy-Based Reinforcement Learning | http://arxiv.org/abs/1808.07645v3 |
Playing Text-Adventure Games with Graph-Based Deep Reinforcement Learning | http://arxiv.org/abs/1812.01628v2 |
Please Mind the Root: Decoding Arborescences for Dependency Parsing | http://arxiv.org/abs/2010.02550v2 |
PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking | http://arxiv.org/abs/2004.14967v2 |
Plug and Play Autoencoders for Conditional Text Generation | http://arxiv.org/abs/2010.02983v2 |
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination | http://arxiv.org/abs/2001.08950v5 |
Pointer Graph Networks | http://arxiv.org/abs/2006.06380v2 |
Pointwise HSIC: A Linear-Time Kernelized Co-occurrence Norm for Sparse Linguistic Expressions | http://arxiv.org/abs/1809.00800v1 |
Pointwise Paraphrase Appraisal is Potentially Problematic | http://arxiv.org/abs/2005.11996v2 |
Poisson Learning: Graph Based Semi-Supervised Learning At Very Low Label Rates | http://arxiv.org/abs/2006.11184v2 |
Policy Gradient as a Proxy for Dynamic Oracles in Constituency Parsing | http://arxiv.org/abs/1806.03290v1 |
Policy Learning Using Weak Supervision | http://arxiv.org/abs/2010.01748v2 |
Policy Shaping and Generalized Update Equations for Semantic Parsing from Denotations | http://arxiv.org/abs/1809.01299v1 |
Policy Teaching via Environment Poisoning: Training-time Adversarial Attacks against Reinforcement Learning | http://arxiv.org/abs/2003.12909v2 |
Politeness Transfer: A Tag and Generate Approach | http://arxiv.org/abs/2004.14257v2 |
Political Advertising Dataset: the use case of the Polish 2020 Presidential Elections | http://arxiv.org/abs/2006.10207v1 |
PolyGen: An Autoregressive Generative Model of 3D Meshes | http://arxiv.org/abs/2002.10880v1 |
Polyglot Semantic Parsing in APIs | http://arxiv.org/abs/1803.06966v2 |
Polyglot Semantic Role Labeling | http://arxiv.org/abs/1805.11598v1 |
Population Mapping in Informal Settlements with High-Resolution Satellite Imagery and Equitable Ground-Truth | http://arxiv.org/abs/2009.08410v1 |
Population-Based Black-Box Optimization for Biological Sequence Design | http://arxiv.org/abs/2006.03227v2 |
Position-Aware Tagging for Aspect Sentiment Triplet Extraction | http://arxiv.org/abs/2010.02609v2 |
Post-Estimation Smoothing: A Simple Baseline for Learning with Side Information | http://arxiv.org/abs/2003.05955v1 |
Posterior Calibrated Training on Sentence Classification Tasks | http://arxiv.org/abs/2004.14500v2 |
Posterior Control of Blackbox Generation | http://arxiv.org/abs/2005.04560v1 |
PowerNorm: Rethinking Batch Normalization in Transformers | http://arxiv.org/abs/2003.07845v2 |
PowerTransformer: Unsupervised Controllable Revision for Biased Language Correction | http://arxiv.org/abs/2010.13816v1 |
Pragmatically Informative Image Captioning with Character-Level Inference | http://arxiv.org/abs/1804.05417v2 |
Pragmatically Informative Text Generation | http://arxiv.org/abs/1904.01301v2 |
Pre-Learning Environment Representations for Data-Efficient Neural Instruction Following | http://arxiv.org/abs/1907.09671v1 |
Pre-Training Transformers as Energy-Based Cloze Models | http://arxiv.org/abs/2012.08561v1 |
Pre-train and Plug-in: Flexible Conditional Text Generation with Variational Auto-Encoders | http://arxiv.org/abs/1911.03882v4 |
Pre-training Is (Almost) All You Need: An Application to Commonsense Reasoning | http://arxiv.org/abs/2004.14074v1 |
Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information | http://arxiv.org/abs/2010.03142v2 |
Pre-training for Abstractive Document Summarization by Reinstating Source Text | http://arxiv.org/abs/2004.01853v4 |
Pre-training on high-resource speech recognition improves low-resource speech-to-text translation | http://arxiv.org/abs/1809.01431v2 |
PreCo: A Large-scale Dataset in Preschool Vocabulary for Coreference Resolution | http://arxiv.org/abs/1810.09807v1 |
Precise Task Formalization Matters in Winograd Schema Evaluations | http://arxiv.org/abs/2010.04043v1 |
Precise Tradeoffs in Adversarial Training for Linear Regression | http://arxiv.org/abs/2002.10477v1 |
Predicting Choice with Set-Dependent Aggregation | http://arxiv.org/abs/1906.06365v2 |
Predicting Clinical Trial Results by Implicit Evidence Integration | http://arxiv.org/abs/2010.05639v1 |
Predicting Declension Class from Form and Meaning | http://arxiv.org/abs/2005.00626v2 |
Predicting In-game Actions from Interviews of NBA Players | http://arxiv.org/abs/1910.11292v3 |
Predicting Native Language from Gaze | http://arxiv.org/abs/1704.07398v2 |
Predicting Performance for Natural Language Processing Tasks | http://arxiv.org/abs/2005.00870v1 |
Predicting Semantic Relations using Global Graph Properties | http://arxiv.org/abs/1808.08644v1 |
Predicting Unplanned Readmissions with Highly Unstructured Data | http://arxiv.org/abs/2003.11622v2 |
Predicting and Analyzing Law-Making in Kenya | http://arxiv.org/abs/2006.05493v1 |
Prediction Focused Topic Models via Feature Selection | http://arxiv.org/abs/1910.05495v2 |
Prediction of Bayesian Intervals for Tropical Storms | http://arxiv.org/abs/2003.05024v1 |
Prediction of neonatal mortality in Sub-Saharan African countries using data-level linkage of multiple surveys | http://arxiv.org/abs/2011.12707v1 |
Predictive Biases in Natural Language Processing Models: A Conceptual Framework and Overview | http://arxiv.org/abs/1912.11078v2 |
Predictive Coding for Locally-Linear Control | http://arxiv.org/abs/2003.01086v1 |
Predictive Multiplicity in Classification | http://arxiv.org/abs/1909.06677v4 |
Predictive PER: Balancing Priority and Diversity towards Stable Deep Reinforcement Learning | http://arxiv.org/abs/2011.13093v1 |
Predictive Sampling with Forecasting Autoregressive Models | http://arxiv.org/abs/2002.09928v2 |
Pretrained Language Model Embryology: The Birth of ALBERT | http://arxiv.org/abs/2010.02480v2 |
Pretrained Transformers Improve Out-of-Distribution Robustness | http://arxiv.org/abs/2004.06100v2 |
Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models | http://arxiv.org/abs/2005.10389v1 |
Principal Neighbourhood Aggregation for Graph Nets | http://arxiv.org/abs/2004.05718v5 |
Principled learning method for Wasserstein distributionally robust optimization with local perturbations | http://arxiv.org/abs/2006.03333v2 |
Privacy Amplification by Decentralization | http://arxiv.org/abs/2012.05326v1 |
Privacy-Preserving XGBoost Inference | http://arxiv.org/abs/2011.04789v4 |
Privacy-preserving Neural Representations of Text | http://arxiv.org/abs/1808.09408v1 |
Privacy-preserving collaborative machine learning on genomic data using TensorFlow | http://arxiv.org/abs/2002.04344v2 |
Private Outsourced Bayesian Optimization | http://arxiv.org/abs/2010.12799v1 |
Private Query Release Assisted by Public Data | http://arxiv.org/abs/2004.10941v1 |
Private Reinforcement Learning with PAC and Regret Guarantees | http://arxiv.org/abs/2009.09052v1 |
Private Stochastic Convex Optimization: Optimal Rates in Linear Time | http://arxiv.org/abs/2005.04763v1 |
Privately Learning Markov Random Fields | http://arxiv.org/abs/2002.09463v2 |
Privately Learning Thresholds: Closing the Exponential Gap | http://arxiv.org/abs/1911.10137v1 |
Privately detecting changes in unknown distributions | http://arxiv.org/abs/1910.01327v2 |
Probabilistic Assumptions Matter: Improved Models for Distantly-Supervised Document-Level Question Answering | http://arxiv.org/abs/2005.01898v1 |
Probabilistic FastText for Multi-Sense Word Embeddings | http://arxiv.org/abs/1806.02901v1 |
Probabilistic Frame Induction | http://arxiv.org/abs/1302.4813v1 |
Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling | http://arxiv.org/abs/2009.03954v1 |
Probabilistic Typology: Deep Generative Models of Vowel Inventories | http://arxiv.org/abs/1705.01684v1 |
Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order | http://arxiv.org/abs/2004.11579v1 |
Probing Emergent Semantics in Predictive Agents via Question Answering | http://arxiv.org/abs/2006.01016v1 |
Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction | http://arxiv.org/abs/2004.08134v1 |
Probing Linguistic Systematicity | http://arxiv.org/abs/2005.04315v2 |
Probing Neural Dialog Models for Conversational Understanding | http://arxiv.org/abs/2006.08331v1 |
Probing Pretrained Language Models for Lexical Semantics | http://arxiv.org/abs/2010.05731v1 |
Probing Task-Oriented Dialogue Representation from Language Models | http://arxiv.org/abs/2010.13912v1 |
Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings | http://arxiv.org/abs/1906.03608v1 |
Probing the Need for Visual Context in Multimodal Machine Translation | http://arxiv.org/abs/1903.08678v2 |
Problems with Shapley-value-based explanations as feature importance measures | http://arxiv.org/abs/2002.11097v2 |
Profile Consistency Identification for Open-domain Dialogue Agents | http://arxiv.org/abs/2009.09680v3 |
Program Enhanced Fact Verification with Verbalization and Graph Attention Network | http://arxiv.org/abs/2010.03084v5 |
Progressive Graph Learning for Open-Set Domain Adaptation | http://arxiv.org/abs/2006.12087v2 |
Progressive Growing of Neural ODEs | http://arxiv.org/abs/2003.03695v1 |
Progressive Identification of True Labels for Partial-Label Learning | http://arxiv.org/abs/2002.08053v3 |
Progressive growing of self-organized hierarchical representations for exploration | http://arxiv.org/abs/2005.06369v1 |
Projective Preferential Bayesian Optimization | http://arxiv.org/abs/2002.03113v4 |
Pronoun-Targeted Fine-tuning for NMT with Hybrid Losses | http://arxiv.org/abs/2010.07638v1 |
Proper Learning, Helly Number, and an Optimal SVM Bound | http://arxiv.org/abs/2005.11818v1 |
Proper Network Interpretability Helps Adversarial Robustness in Classification | http://arxiv.org/abs/2006.14748v2 |
Prophets, Secretaries, and Maximizing the Probability of Choosing the Best | http://arxiv.org/abs/1910.03798v1 |
ProtoQA: A Question Answering Dataset for Prototypical Common-Sense Reasoning | http://arxiv.org/abs/2005.00771v3 |
Provable Representation Learning for Imitation Learning via Bi-level Optimization | http://arxiv.org/abs/2002.10544v1 |
Provable Self-Play Algorithms for Competitive Reinforcement Learning | http://arxiv.org/abs/2002.04017v3 |
Provable Smoothness Guarantees for Black-Box Variational Inference | http://arxiv.org/abs/1901.08431v4 |
Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation | http://arxiv.org/abs/1911.04384v9 |
Provably Efficient Exploration in Policy Optimization | http://arxiv.org/abs/1912.05830v3 |
Provably Efficient Model-based Policy Adaptation | http://arxiv.org/abs/2006.08051v1 |
Provably Efficient Reinforcement Learning with Linear Function Approximation | http://arxiv.org/abs/1907.05388v2 |
Proving the Lottery Ticket Hypothesis: Pruning is All You Need | http://arxiv.org/abs/2002.00585v1 |
Prta: A System to Support the Analysis of Propaganda Techniques in the News | http://arxiv.org/abs/2005.05854v1 |
Psycholinguistics meets Continual Learning: Measuring Catastrophic Forgetting in Visual Question Answering | http://arxiv.org/abs/1906.04229v1 |
Pun Generation with Surprise | http://arxiv.org/abs/1904.06828v1 |
Putting An End to End-to-End: Gradient-Isolated Learning of Representations | http://arxiv.org/abs/1905.11786v3 |
PuzzLing Machines: A Challenge on Learning From Small Data | http://arxiv.org/abs/2004.13161v1 |
Puzzle Mix: Exploiting Saliency and Local Statistics for Optimal Mixup | http://arxiv.org/abs/2009.06962v2 |
PyHessian: Neural Networks Through the Lens of the Hessian | http://arxiv.org/abs/1912.07145v3 |
PyMT5: multi-mode translation of natural language and Python code with transformers | http://arxiv.org/abs/2010.03150v1 |
PySBD: Pragmatic Sentence Boundary Disambiguation | http://arxiv.org/abs/2010.09657v1 |
Pyramid Convolutional RNN for MRI Reconstruction | http://arxiv.org/abs/1912.00543v5 |
Q-learning with Language Model for Edit-based Unsupervised Summarization | http://arxiv.org/abs/2010.04379v1 |
Q-value Path Decomposition for Deep Multiagent Reinforcement Learning | http://arxiv.org/abs/2002.03950v1 |
QA2Explanation: Generating and Evaluating Explanations for Question Answering Systems over Knowledge Graph | http://arxiv.org/abs/2010.08323v1 |
QuASE: Question-Answer Driven Sentence Encoding | http://arxiv.org/abs/1909.00333v3 |
Quantifying Attention Flow in Transformers | http://arxiv.org/abs/2005.00928v2 |
Quantifying Differences in Reward Functions | http://arxiv.org/abs/2006.13900v2 |
Quantifying Intimacy in Language | http://arxiv.org/abs/2011.03020v1 |
Quantifying Privacy Leakage in Graph Embedding | http://arxiv.org/abs/2010.00906v1 |
Quantifying Similarity between Relations with Fact Distribution | http://arxiv.org/abs/1907.08937v1 |
Quantifying the Effects of COVID-19 on Mental Health Support Forums | http://arxiv.org/abs/2009.04008v1 |
Quantitative Argument Summarization and Beyond: Cross-Domain Key Point Analysis | http://arxiv.org/abs/2010.05369v1 |
Quantitative stability of optimal transport maps and linearization of the 2-Wasserstein space | http://arxiv.org/abs/1910.05954v1 |
Quantized Decentralized Stochastic Learning over Directed Graphs | http://arxiv.org/abs/2002.09964v5 |
Quantized Frank-Wolfe: Faster Optimization, Lower Communication, and Projection Free | http://arxiv.org/abs/1902.06332v3 |
Quantum Boosting | http://arxiv.org/abs/2002.05056v2 |
Quantum Expectation-Maximization for Gaussian Mixture Models | http://arxiv.org/abs/1908.06657v2 |
Quaternion Graph Neural Networks | http://arxiv.org/abs/2008.05089v3 |
Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation | http://arxiv.org/abs/1911.03842v2 |
Question Directed Graph Attention Network for Numerical Reasoning over Text | http://arxiv.org/abs/2009.07448v1 |
R2-B2: Recursive Reasoning-Based Bayesian Optimization for No-Regret Learning in Games | http://arxiv.org/abs/2006.16679v1 |
R4C: A Benchmark for Evaluating RC Systems to Get the Right Answer for the Right Reason | http://arxiv.org/abs/1910.04601v2 |
RAMP-CNN: A Novel Neural Network for Enhanced Automotive Radar Object Recognition | http://arxiv.org/abs/2011.08981v1 |
RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers | http://arxiv.org/abs/1911.04942v4 |
RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization | http://arxiv.org/abs/1908.08200v3 |
RESIDE: Improving Distantly-Supervised Neural Relation Extraction using Side Information | http://arxiv.org/abs/1812.04361v2 |
RIFLE: Backpropagation in Depth for Deep Transfer Learning through Re-Initializing the Fully-connected LayEr | http://arxiv.org/abs/2007.03349v1 |
RL agents Implicitly Learning Human Preferences | http://arxiv.org/abs/2002.06137v1 |
RNNs can generate bounded hierarchical languages with optimal memory | http://arxiv.org/abs/2010.07515v1 |
ROMA: Multi-Agent Reinforcement Learning with Emergent Roles | http://arxiv.org/abs/2003.08039v3 |
RPD: A Distance Function Between Word Embeddings | http://arxiv.org/abs/2005.08113v1 |
Radial Bayesian Neural Networks: Beyond Discrete Support In Large-Scale Bayesian Deep Learning | http://arxiv.org/abs/1907.00865v3 |
Radioactive data: tracing through training | http://arxiv.org/abs/2002.00937v1 |
Random Hypervolume Scalarizations for Provable Multi-Objective Black Box Optimization | http://arxiv.org/abs/2006.04655v2 |
Random Matrix Theory Proves that Deep Learning Representations of GAN-data Behave as Gaussian Mixtures | http://arxiv.org/abs/2001.08370v1 |
Random Search and Reproducibility for Neural Architecture Search | http://arxiv.org/abs/1902.07638v3 |
Random extrapolation for primal-dual coordinate descent | http://arxiv.org/abs/2007.06528v1 |
Randomized Block-Diagonal Preconditioning for Parallel Learning | http://arxiv.org/abs/2006.13591v2 |
Randomized Exploration in Generalized Linear Bandits | http://arxiv.org/abs/1906.08947v2 |
Randomized Smoothing of All Shapes and Sizes | http://arxiv.org/abs/2002.08118v5 |
Randomly Projected Additive Gaussian Processes for Regression | http://arxiv.org/abs/1912.12834v1 |
Rank and run-time aware compression of NLP Applications | http://arxiv.org/abs/2010.03193v1 |
Ranking Paragraphs for Improving Answer Recall in Open-Domain Question Answering | http://arxiv.org/abs/1810.00494v1 |
Ranking and Selecting Multi-Hop Knowledge Paths to Better Predict Human Needs | http://arxiv.org/abs/1904.00676v1 |
Rapid Adaptation of Neural Machine Translation to New Languages | http://arxiv.org/abs/1808.04189v1 |
Rapidly Deploying a Neural Search Engine for the COVID-19 Open Research Dataset: Preliminary Thoughts and Lessons Learned | http://arxiv.org/abs/2004.05125v1 |
Rate-Distortion Optimization Guided Autoencoder for Isometric Embedding in Euclidean Latent Space | http://arxiv.org/abs/1910.04329v4 |
Rational Recurrences | http://arxiv.org/abs/1808.09357v1 |
Rationalizing Medical Relation Prediction from Corpus-level Statistics | http://arxiv.org/abs/2005.00889v1 |
Rationalizing Neural Predictions | http://arxiv.org/abs/1606.04155v2 |
Rationalizing Text Matching: Learning Sparse Alignments via Optimal Transport | http://arxiv.org/abs/2005.13111v1 |
Re-evaluating Evaluation in Text Summarization | http://arxiv.org/abs/2010.07100v1 |
Re-translation versus Streaming for Simultaneous Translation | http://arxiv.org/abs/2004.03643v3 |
ReLU Code Space: A Basis for Rating Network Quality Besides Accuracy | http://arxiv.org/abs/2005.09903v1 |
Reactive Supervision: A New Method for Collecting Sarcasm Data | http://arxiv.org/abs/2009.13080v1 |
Reading Between the Lines: Exploring Infilling in Visual Narratives | http://arxiv.org/abs/2010.13944v1 |
Ready Policy One: World Building Through Active Learning | http://arxiv.org/abs/2002.02693v1 |
Real-Time Open-Domain Question Answering with Dense-Sparse Phrase Index | http://arxiv.org/abs/1906.05807v2 |
Real-Time Optimisation for Online Learning in Auctions | http://arxiv.org/abs/2010.10070v1 |
Real-time Classification from Short Event-Camera Streams using Input-filtering Neural ODEs | http://arxiv.org/abs/2004.03156v1 |
Reasoning About Generalization via Conditional Mutual Information | http://arxiv.org/abs/2001.09122v3 |
Reasoning About Pragmatics with Neural Listeners and Speakers | http://arxiv.org/abs/1604.00562v2 |
Reasoning Over History: Context Aware Visual Dialog | http://arxiv.org/abs/2011.00669v1 |
Reasoning Over Semantic-Level Graph for Fact Checking | http://arxiv.org/abs/1909.03745v3 |
Reasoning about Actions and State Changes by Injecting Commonsense Knowledge | http://arxiv.org/abs/1808.10012v1 |
Reasoning about Goals, Steps, and Temporal Ordering with WikiHow | http://arxiv.org/abs/2009.07690v2 |
Reasoning with Latent Structure Refinement for Document-Level Relation Extraction | http://arxiv.org/abs/2005.06312v3 |
Reasoning with Sarcasm by Reading In-between | http://arxiv.org/abs/1805.02856v1 |
Recall and Learn: Fine-tuning Deep Pretrained Language Models with Less Forgetting | http://arxiv.org/abs/2004.12651v1 |
RecipeQA: A Challenge Dataset for Multimodal Comprehension of Cooking Recipes | http://arxiv.org/abs/1809.00812v1 |
Recipes for building an open-domain chatbot | http://arxiv.org/abs/2004.13637v2 |
Recognizing Implicit Discourse Relations via Repeated Reading: Neural Networks with Multi-Level Attention | http://arxiv.org/abs/1609.06380v1 |
Recovery of Sparse Signals from a Mixture of Linear Samples | http://arxiv.org/abs/2006.16406v2 |
Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension | http://arxiv.org/abs/2005.08056v2 |
Recurrent Event Network: Autoregressive Structure Inference over Temporal Knowledge Graphs | http://arxiv.org/abs/1904.05530v4 |
Recurrent Hierarchical Topic-Guided RNN for Language Generation | http://arxiv.org/abs/1912.10337v2 |
Recurrent Interaction Network for Jointly Extracting Entities and Classifying Relations | http://arxiv.org/abs/2005.00162v2 |
Recurrent Neural Network Language Models Always Learn English-Like Relative Clause Attachment | http://arxiv.org/abs/2005.00165v3 |
Recurrent Neural Networks as Weighted Language Recognizers | http://arxiv.org/abs/1711.05408v2 |
Recurrent Neural Networks in Linguistic Theory: Revisiting Pinker and Prince (1988) and the Past Tense Debate | http://arxiv.org/abs/1807.04783v2 |
Recurrent babbling: evaluating the acquisition of grammar from limited input data | http://arxiv.org/abs/2010.04637v1 |
Recursive Subtree Composition in LSTM-Based Dependency Parsing | http://arxiv.org/abs/1902.09781v1 |
Reducibility and Statistical-Computational Gaps from Secret Leakage | http://arxiv.org/abs/2005.08099v2 |
Reducing Gender Bias in Abusive Language Detection | http://arxiv.org/abs/1808.07231v1 |
Reducing Gender Bias in Neural Machine Translation as a Domain Adaptation Problem | http://arxiv.org/abs/2004.04498v3 |
Reducing Sampling Error in Batch Temporal Difference Learning | http://arxiv.org/abs/2008.06738v1 |
Refer, Reuse, Reduce: Generating Subsequent References in Visual and Conversational Contexts | http://arxiv.org/abs/2011.04554v1 |
Refined bounds for algorithm configuration: The knife-edge of dual class approximability | http://arxiv.org/abs/2006.11827v2 |
Reflection-based Word Attribute Transfer | http://arxiv.org/abs/2007.02598v2 |
Reformulating Unsupervised Style Transfer as Paraphrase Generation | http://arxiv.org/abs/2010.05700v1 |
Regression Networks for Meta-Learning Few-Shot Classification | http://arxiv.org/abs/1905.13613v2 |
Regularity as Regularization: Smooth and Strongly Convex Brenier Potentials in Optimal Transport | http://arxiv.org/abs/1905.10812v5 |
Regularization via Structural Label Smoothing | http://arxiv.org/abs/2001.01900v2 |
Regularized Autoencoders via Relaxed Injective Probability Flow | http://arxiv.org/abs/2002.08927v1 |
Regularized Context Gates on Transformer for Machine Translation | http://arxiv.org/abs/1908.11020v2 |
Regularized Inverse Reinforcement Learning | http://arxiv.org/abs/2010.03691v2 |
Regularized Optimal Transport is Ground Cost Adversarial | http://arxiv.org/abs/2002.03967v3 |
Regularizing Dialogue Generation by Imitating Implicit Scenarios | http://arxiv.org/abs/2010.01893v2 |
Reinforcement Learning Based Text Style Transfer without Parallel Training Corpus | http://arxiv.org/abs/1903.10671v2 |
Reinforcement Learning Generalization with Surprise Minimization | http://arxiv.org/abs/2004.12399v2 |
Reinforcement Learning based Curriculum Optimization for Neural Machine Translation | http://arxiv.org/abs/1903.00041v1 |
Reinforcement Learning for Integer Programming: Learning to Cut | http://arxiv.org/abs/1906.04859v3 |
Reinforcement Learning for Molecular Design Guided by Quantum Mechanics | http://arxiv.org/abs/2002.07717v2 |
Reinforcement Learning in Feature Space: Matrix Bandit, Kernels, and Regret Bound | http://arxiv.org/abs/1905.10389v2 |
Reinforcement Learning through Active Inference | http://arxiv.org/abs/2002.12636v1 |
Reinforcement Learning with Chromatic Networks for Compact Architecture Search | http://arxiv.org/abs/1907.06511v3 |
Reinforcement Learning with Latent Flow | http://arxiv.org/abs/2101.01857v1 |
Relabel the Noise: Joint Extraction of Entities and Relations via Cooperative Multiagents | http://arxiv.org/abs/2004.09930v1 |
Relating Simple Sentence Representations in Deep Neural Networks and the Brain | http://arxiv.org/abs/1906.11861v1 |
Relation Embedding with Dihedral Group in Knowledge Graph | http://arxiv.org/abs/1906.00687v1 |
Relation Extraction with Explanation | http://arxiv.org/abs/2005.14271v1 |
Relational Graph Attention Network for Aspect-based Sentiment Analysis | http://arxiv.org/abs/2004.12362v1 |
Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment | http://arxiv.org/abs/1605.05433v2 |
Relative gradient optimization of the Jacobian term in unsupervised deep learning | http://arxiv.org/abs/2006.15090v2 |
Relaxing Bijectivity Constraints with Continuously Indexed Normalising Flows | http://arxiv.org/abs/1909.13833v4 |
Relevance of Rotationally Equivariant Convolutions for Predicting Molecular Properties | http://arxiv.org/abs/2008.08461v4 |
Reliable Fidelity and Diversity Metrics for Generative Models | http://arxiv.org/abs/2002.09797v2 |
Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks | http://arxiv.org/abs/2003.01690v2 |
Rep the Set: Neural Networks for Learning Set Representations | http://arxiv.org/abs/1904.01962v2 |
Replicability Analysis for Natural Language Processing: Testing Significance with Multiple Datasets | http://arxiv.org/abs/1709.09500v1 |
Representation Learning for Discovering Phonemic Tone Contours | http://arxiv.org/abs/1910.08987v2 |
Representation Learning for Grounded Spatial Reasoning | http://arxiv.org/abs/1707.03938v2 |
Representations of language in a model of visually grounded speech signal | http://arxiv.org/abs/1702.01991v3 |
Representing Unordered Data Using Complex-Weighted Multiset Automata | http://arxiv.org/abs/2001.00610v3 |
Representing and Denoising Wearable ECG Recordings | http://arxiv.org/abs/2012.00110v1 |
Repulsive Attention: Rethinking Multi-head Attention as Bayesian Inference | http://arxiv.org/abs/2009.09364v2 |
Repurposing Entailment for Multi-Hop Question Answering Tasks | http://arxiv.org/abs/1904.09380v1 |
Reserve Pricing in Repeated Second-Price Auctions with Strategic Bidders | http://arxiv.org/abs/1906.09331v1 |
Reset-Free Lifelong Learning with Skill-Space Planning | http://arxiv.org/abs/2012.03548v2 |
Resolution Dependent GAN Interpolation for Controllable Image Synthesis Between Domains | http://arxiv.org/abs/2010.05334v3 |
Resolving Spurious Correlations in Causal Models of Environments via Interventions | http://arxiv.org/abs/2002.05217v2 |
Response Selection for Multi-Party Conversations with Dynamic Topic Tracking | http://arxiv.org/abs/2010.07785v1 |
Response-Anticipated Memory for On-Demand Knowledge Integration in Response Generation | http://arxiv.org/abs/2005.06128v1 |
RethinkCWS: Is Chinese Word Segmentation a Solved Task? | http://arxiv.org/abs/2011.06858v2 |
Rethinking Action Spaces for Reinforcement Learning in End-to-end Dialog Agents with Latent Variable Models | http://arxiv.org/abs/1902.08858v2 |
Rethinking Dialogue State Tracking with Reasoning | http://arxiv.org/abs/2005.13129v2 |
Retrieval-Based Neural Code Generation | http://arxiv.org/abs/1808.10025v1 |
Retrofitting Structure-aware Transformer Language Model for End Tasks | http://arxiv.org/abs/2009.07408v1 |
Reusability and Transferability of Macro Actions for Reinforcement Learning | http://arxiv.org/abs/1908.01478v2 |
Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT | http://arxiv.org/abs/2009.07610v3 |
Reverse Engineering Configurations of Neural Text Generation Models | http://arxiv.org/abs/2004.06201v1 |
Reverse-Engineering Deep ReLU Networks | http://arxiv.org/abs/1910.00744v2 |
Review-based Question Generation with Adaptive Instance Transfer and Augmentation | http://arxiv.org/abs/1911.01556v2 |
Revisiting Character-Based Neural Machine Translation with Capacity and Compression | http://arxiv.org/abs/1808.09943v1 |
Revisiting Ensembles in an Adversarial Context: Improving Natural Accuracy | http://arxiv.org/abs/2002.11572v1 |
Revisiting Fundamentals of Experience Replay | http://arxiv.org/abs/2007.06700v1 |
Revisiting Joint Modeling of Cross-document Entity and Event Coreference Resolution | http://arxiv.org/abs/1906.01753v1 |
Revisiting Low-Resource Neural Machine Translation: A Case Study | http://arxiv.org/abs/1905.11901v1 |
Revisiting Modularized Multilingual NMT to Meet Industrial Demands | http://arxiv.org/abs/2010.09402v1 |
Revisiting Rainbow: Promoting more insightful and inclusive deep reinforcement learning research | http://arxiv.org/abs/2011.14826v1 |
Revisiting Stochastic Extragradient | http://arxiv.org/abs/1905.11373v2 |
Revisiting Unsupervised Relation Extraction | http://arxiv.org/abs/2005.00087v1 |
Revisiting the Context Window for Cross-lingual Word Embeddings | http://arxiv.org/abs/2004.10813v1 |
Revisiting the Importance of Encoding Logic Rules in Sentiment Classification | http://arxiv.org/abs/1808.07733v1 |
RiSAWOZ: A Large-Scale Multi-Domain Wizard-of-Oz Dataset with Rich Semantic Annotations for Task-Oriented Dialogue Modeling | http://arxiv.org/abs/2010.08738v1 |
Rigging the Lottery: Making All Tickets Winners | http://arxiv.org/abs/1911.11134v2 |
Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference | http://arxiv.org/abs/1902.01007v4 |
Rigid Formats Controlled Text Generation | http://arxiv.org/abs/2004.08022v1 |
RikiNet: Reading Wikipedia Pages for Natural Question Answering | http://arxiv.org/abs/2004.14560v1 |
Risk Assessment for Machine Learning Models | http://arxiv.org/abs/2011.04328v1 |
Risk Bounds for Learning Multiple Components with Permutation-Invariant Losses | http://arxiv.org/abs/1904.07594v2 |
Rk-means: Fast Clustering for Relational Data | http://arxiv.org/abs/1910.04939v1 |
Robust Bayesian Classification Using an Optimistic Score Ratio | http://arxiv.org/abs/2007.04458v1 |
Robust Cross-lingual Hypernymy Detection using Dependency Context | http://arxiv.org/abs/1803.11291v1 |
Robust Distant Supervision Relation Extraction via Deep Reinforcement Learning | http://arxiv.org/abs/1805.09927v1 |
Robust Domain Randomised Reinforcement Learning through Peer-to-Peer Distillation | http://arxiv.org/abs/2012.04839v1 |
Robust Encodings: A Framework for Combating Adversarial Typos | http://arxiv.org/abs/2005.01229v1 |
Robust Learning from Discriminative Feature Feedback | http://arxiv.org/abs/2003.03946v1 |
Robust Optimisation Monte Carlo | http://arxiv.org/abs/1904.00670v3 |
Robust Outlier Arm Identification | http://arxiv.org/abs/2009.09988v1 |
Robust Prediction of Punctuation and Truecasing for Medical ASR | http://arxiv.org/abs/2007.02025v2 |
Robust Reinforcement Learning using Adversarial Populations | http://arxiv.org/abs/2008.01825v2 |
Robust Variational Autoencoders for Outlier Detection and Repair of Mixed-Type Data | http://arxiv.org/abs/1907.06671v2 |
Robust Visual Domain Randomization for Reinforcement Learning | http://arxiv.org/abs/1910.10537v2 |
Robust and Private Learning of Halfspaces | http://arxiv.org/abs/2011.14580v1 |
Robust and Stable Black Box Explanations | http://arxiv.org/abs/2011.06169v1 |
Robust model training and generalisation with Studentising flows | http://arxiv.org/abs/2006.06599v2 |
Robust posterior inference when statistically emulating forward simulations | http://arxiv.org/abs/2004.11929v1 |
Robustifying Sequential Neural Processes | http://arxiv.org/abs/2006.15987v1 |
Robustness for Non-Parametric Classification: A Generic Attack and Defense | http://arxiv.org/abs/1906.03310v2 |
Robustness to Programmable String Transformations via Augmented Abstract Training | http://arxiv.org/abs/2002.09579v4 |
Robustness to Spurious Correlations via Human Annotations | http://arxiv.org/abs/2007.06661v2 |
Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding | http://arxiv.org/abs/2010.07954v1 |
RussianSuperGLUE: A Russian Language Understanding Evaluation Benchmark | http://arxiv.org/abs/2010.15925v2 |
S-MART: Novel Tree-based Structured Learning Algorithms Applied to Tweet Entity Linking | http://arxiv.org/abs/1609.08075v1 |
S2ORC: The Semantic Scholar Open Research Corpus | http://arxiv.org/abs/1911.02782v3 |
S2RMs: Spatially Structured Recurrent Modules | http://arxiv.org/abs/2007.06533v1 |
SAFENet: Self-Supervised Monocular Depth Estimation with Semantic-Aware Feature Extraction | http://arxiv.org/abs/2010.02893v3 |
SAFER: A Structure-free Approach for Certified Robustness to Adversarial Word Substitutions | http://arxiv.org/abs/2005.14424v1 |
SCAFFOLD: Stochastic Controlled Averaging for Federated Learning | http://arxiv.org/abs/1910.06378v3 |
SCDE: Sentence Cloze Dataset with High Quality Distractors From Examinations | http://arxiv.org/abs/2004.12934v1 |
SCDV : Sparse Composite Document Vectors using soft clustering over distributional representations | http://arxiv.org/abs/1612.06778v3 |
SDE-Net: Equipping Deep Neural Networks with Uncertainty Estimates | http://arxiv.org/abs/2008.10546v1 |
SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks | http://arxiv.org/abs/2006.10503v3 |
SECTOR: A Neural Model for Coherent Topic Segmentation and Classification | http://arxiv.org/abs/1902.04793v1 |
SGD Learns One-Layer Networks in WGANs | http://arxiv.org/abs/1910.07030v2 |
SHAPED: Shared-Private Encoder-Decoder for Text Style Adaptation | http://arxiv.org/abs/1804.04093v1 |
SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection | http://arxiv.org/abs/2006.11572v2 |
SIGN: Scalable Inception Graph Neural Networks | http://arxiv.org/abs/2004.11198v3 |
SIGTYP 2020 Shared Task: Prediction of Typological Features | http://arxiv.org/abs/2010.08246v2 |
SIGUA: Forgetting May Make Learning with Noisy Labels More Robust | http://arxiv.org/abs/1809.11008v3 |
SJTU-NICT's Supervised and Unsupervised Neural Machine Translation Systems for the WMT20 News Translation Task | http://arxiv.org/abs/2010.05122v1 |
SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis | http://arxiv.org/abs/2005.05635v2 |
SLEDGE-Z: A Zero-Shot Baseline for COVID-19 Literature Search | http://arxiv.org/abs/2010.05987v1 |
SLM: Learning a Discourse Language Representation with Sentence Unshuffling | http://arxiv.org/abs/2010.16249v1 |
SLURP: A Spoken Language Understanding Resource Package | http://arxiv.org/abs/2011.13205v1 |
SMART: Robust and Efficient Fine-Tuning for Pre-trained Natural Language Models through Principled Regularized Optimization | http://arxiv.org/abs/1911.03437v2 |
SMArtCast: Predicting soil moisture interpolations into the future using Earth observation data in a deep learning framework | http://arxiv.org/abs/2003.10823v2 |
SOTERIA: In Search of Efficient Neural Networks for Private Inference | http://arxiv.org/abs/2007.12934v1 |
SOrT-ing VQA Models : Contrastive Gradient Learning for Improved Consistency | http://arxiv.org/abs/2010.10038v2 |
SQuAD: 100,000+ Questions for Machine Comprehension of Text | http://arxiv.org/abs/1606.05250v3 |
SRLGRN: Semantic Role Labeling Graph Reasoning Network | http://arxiv.org/abs/2010.03604v2 |
SSCR: Iterative Language-Based Image Editing via Self-Supervised Counterfactual Reasoning | http://arxiv.org/abs/2009.09566v2 |
SSMBA: Self-Supervised Manifold Based Data Augmentation for Improving Out-of-Domain Robustness | http://arxiv.org/abs/2009.10195v2 |
STARC: Structured Annotations for Reading Comprehension | http://arxiv.org/abs/2004.14797v1 |
STORIUM: A Dataset and Evaluation Platform for Machine-in-the-Loop Story Generation | http://arxiv.org/abs/2010.01717v1 |
SUPERT: Towards New Frontiers in Unsupervised Evaluation Metrics for Multi-Document Summarization | http://arxiv.org/abs/2005.03724v1 |
SUPP.AI: Finding Evidence for Supplement-Drug Interactions | http://arxiv.org/abs/1909.08135v3 |
SWAG: A Large-Scale Adversarial Dataset for Grounded Commonsense Inference | http://arxiv.org/abs/1808.05326v1 |
SacreROUGE: An Open-Source Library for Using and Developing Summarization Evaluation Metrics | http://arxiv.org/abs/2007.05374v1 |
Safe Imitation Learning via Fast Bayesian Reward Inference from Preferences | http://arxiv.org/abs/2002.09089v4 |
Safe Reinforcement Learning in Constrained Markov Decision Processes | http://arxiv.org/abs/2008.06626v1 |
Safe Reinforcement Learning with Natural Language Constraints | http://arxiv.org/abs/2010.05150v1 |
SafeCity: Understanding Diverse Forms of Sexual Harassment Personal Stories | http://arxiv.org/abs/1809.04739v2 |
Saliency Learning: Teaching the Model Where to Pay Attention | http://arxiv.org/abs/1902.08649v3 |
SalsaNext: Fast, Uncertainty-aware Semantic Segmentation of LiDAR Point Clouds for Autonomous Driving | http://arxiv.org/abs/2003.03653v3 |
Sample Amplification: Increasing Dataset Size even when Learning is Impossible | http://arxiv.org/abs/1904.12053v2 |
Sample Complexity Bounds for 1-bit Compressive Sensing and Binary Stable Embeddings with Generative Priors | http://arxiv.org/abs/2002.01697v3 |
Sample Complexity of Estimating the Policy Gradient for Nearly Deterministic Dynamical Systems | http://arxiv.org/abs/1901.08562v2 |
Sample Complexity of Reinforcement Learning using Linearly Combined Model Ensembles | http://arxiv.org/abs/1910.10597v1 |
Sample Efficient Training in Multi-Agent Adversarial Games with Limited Teammate Communication | http://arxiv.org/abs/2011.00424v1 |
Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning | http://arxiv.org/abs/2006.11751v2 |
Sample-efficient proper PAC learning with approximate differential privacy | http://arxiv.org/abs/2012.03893v1 |
Sarcasm Detection in Tweets with BERT and GloVe Embeddings | http://arxiv.org/abs/2006.11512v1 |
Sarcasm Detection using Context Separators in Online Discourse | http://arxiv.org/abs/2006.00850v1 |
Satellite-based Prediction of Forage Conditions for Livestock in Northern Kenya | http://arxiv.org/abs/2004.04081v2 |
Satirical News Detection and Analysis using Attention Mechanism and Linguistic Features | http://arxiv.org/abs/1709.01189v1 |
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling | http://arxiv.org/abs/1611.08034v2 |
Scalable Deep Generative Modeling for Sparse Graphs | http://arxiv.org/abs/2006.15502v1 |
Scalable Differentiable Physics for Learning and Control | http://arxiv.org/abs/2007.02168v1 |
Scalable Differential Privacy with Certified Robustness in Adversarial Learning | http://arxiv.org/abs/1903.09822v5 |
Scalable Exact Inference in Multi-Output Gaussian Processes | http://arxiv.org/abs/1911.06287v3 |
Scalable Gaussian Process Regression for Kernels with a Non-Stationary Phase | http://arxiv.org/abs/1912.11713v1 |
Scalable Gradients for Stochastic Differential Equations | http://arxiv.org/abs/2001.01328v6 |
Scalable Identification of Partially Observed Systems with Certainty-Equivalent EM | http://arxiv.org/abs/2006.11615v1 |
Scalable Multi-Hop Relational Reasoning for Knowledge-Aware Question Answering | http://arxiv.org/abs/2005.00646v2 |
Scalable Nearest Neighbor Search for Optimal Transport | http://arxiv.org/abs/1910.04126v4 |
Scalable Syntax-Aware Language Models Using Knowledge Distillation | http://arxiv.org/abs/1906.06438v1 |
Scalable Zero-shot Entity Linking with Dense Entity Retrieval | http://arxiv.org/abs/1911.03814v3 |
Scalable and Efficient Comparison-based Search without Features | http://arxiv.org/abs/1905.05049v3 |
Scaling Hidden Markov Language Models | http://arxiv.org/abs/2011.04640v1 |
Scaling up Hybrid Probabilistic Inference with Logical and Arithmetic Constraints via Message Passing | http://arxiv.org/abs/2003.00126v2 |
Scattering GCN: Overcoming Oversmoothness in Graph Convolutional Networks | http://arxiv.org/abs/2003.08414v2 |
Scene Graph Parsing as Dependency Parsing | http://arxiv.org/abs/1803.09189v1 |
Scene Graph Reasoning for Visual Question Answering | http://arxiv.org/abs/2007.01072v1 |
Schatten Norms in Matrix Streams: Hello Sparsity, Goodbye Dimension | http://arxiv.org/abs/1907.05457v2 |
SciDTB: Discourse Dependency TreeBank for Scientific Abstracts | http://arxiv.org/abs/1806.03653v1 |
SciREX: A Challenge Dataset for Document-Level Information Extraction | http://arxiv.org/abs/2005.00512v1 |
Score Combination for Improved Parallel Corpus Filtering for Low Resource Conditions | http://arxiv.org/abs/2011.07933v1 |
Scoring Lexical Entailment with a Supervised Directional Similarity Network | http://arxiv.org/abs/1805.09355v1 |
Screening Data Points in Empirical Risk Minimization via Ellipsoidal Regions and Safe Loss Functions | http://arxiv.org/abs/1912.02566v3 |
Screenplay Quality Assessment: Can We Predict Who Gets Nominated? | http://arxiv.org/abs/2005.06123v1 |
Screenplay Summarization Using Latent Narrative Structure | http://arxiv.org/abs/2004.12727v1 |
ScriptWriter: Narrative-Guided Script Generation | http://arxiv.org/abs/2005.10331v2 |
Secure Medical Image Analysis with CrypTFlow | http://arxiv.org/abs/2012.05064v1 |
Selecting Backtranslated Data from Multiple Sources for Improved Neural Machine Translation | http://arxiv.org/abs/2005.00308v1 |
Selecting Machine-Translated Data for Quick Bootstrapping of a Natural Language Understanding System | http://arxiv.org/abs/1805.09119v1 |
Selection Bias Explorations and Debias Methods for Natural Language Sentence Matching Datasets | http://arxiv.org/abs/1905.06221v4 |
Selective Attention for Context-aware Neural Machine Translation | http://arxiv.org/abs/1903.08788v2 |
Selective Dyna-style Planning Under Limited Model Capacity | http://arxiv.org/abs/2007.02418v2 |
Selective Encoding for Abstractive Sentence Summarization | http://arxiv.org/abs/1704.07073v1 |
Selective Question Answering under Domain Shift | http://arxiv.org/abs/2006.09462v1 |
Self-Attentive Associative Memory | http://arxiv.org/abs/2002.03519v3 |
Self-Induced Curriculum Learning in Self-Supervised Neural Machine Translation | http://arxiv.org/abs/2004.03151v2 |
Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training | http://arxiv.org/abs/2006.11280v1 |
Self-Supervised Meta-Learning for Few-Shot Natural Language Classification Tasks | http://arxiv.org/abs/2009.08445v2 |
Self-Supervised Policy Adaptation during Deployment | http://arxiv.org/abs/2007.04309v2 |
Self-Training for Unsupervised Parsing with PRPN | http://arxiv.org/abs/2005.13455v1 |
Self-supervised Knowledge Triplet Learning for Zero-shot Question Answering | http://arxiv.org/abs/2005.00316v2 |
Self-supervised Label Augmentation via Input Transformations | http://arxiv.org/abs/1910.05872v2 |
SelfORE: Self-supervised Relational Feature Learning for Open Relation Extraction | http://arxiv.org/abs/2004.02438v2 |
Selfish Robustness and Equilibria in Multi-Player Bandits | http://arxiv.org/abs/2002.01197v2 |
Semantic Annotation for Microblog Topics Using Wikipedia Temporal Information | http://arxiv.org/abs/1701.03939v1 |
Semantic Drift in Multilingual Representations | http://arxiv.org/abs/1904.10820v4 |
Semantic Enrichment of Nigerian Pidgin English for Contextual Sentiment Classification | http://arxiv.org/abs/2003.12450v1 |
Semantic Graphs for Generating Deep Questions | http://arxiv.org/abs/2004.12704v1 |
Semantic Label Smoothing for Sequence to Sequence Problems | http://arxiv.org/abs/2010.07447v1 |
Semantic Parsing for Task Oriented Dialog using Hierarchical Representations | http://arxiv.org/abs/1810.07942v1 |
Semantic Parsing to Probabilistic Programs for Situated Question Answering | http://arxiv.org/abs/1606.07046v2 |
Semantic Parsing with Dual Learning | http://arxiv.org/abs/1907.05343v2 |
Semantic Parsing with Semi-Supervised Sequential Autoencoders | http://arxiv.org/abs/1609.09315v1 |
Semantic Role Labeling Guided Multi-turn Dialogue ReWriter | http://arxiv.org/abs/2010.01417v1 |
Semantic Role Labeling as Syntactic Dependency Parsing | http://arxiv.org/abs/2010.11170v1 |
Semantic Role Labeling for Learner Chinese: the Importance of Syntactic Parsing and L2-L1 Parallel Data | http://arxiv.org/abs/1808.09409v2 |
Semantic Scaffolds for Pseudocode-to-Code Generation | http://arxiv.org/abs/2005.05927v1 |
Semantic Structural Evaluation for Text Simplification | http://arxiv.org/abs/1810.05022v1 |
Semantic expressive capacity with bounded memory | http://arxiv.org/abs/1906.11752v1 |
Semantically Conditioned LSTM-based Natural Language Generation for Spoken Dialogue Systems | http://arxiv.org/abs/1508.01745v2 |
Semantically-Aligned Equation Generation for Solving and Reasoning Math Word Problems | http://arxiv.org/abs/1811.00720v2 |
Semantically-Aligned Universal Tree-Structured Solver for Math Word Problems | http://arxiv.org/abs/2010.06823v1 |
Semi-Modular Inference: enhanced learning in multi-modular models by tempering the influence of components | http://arxiv.org/abs/2003.06804v1 |
Semi-Supervised Bilingual Lexicon Induction with Two-way Interaction | http://arxiv.org/abs/2010.07101v1 |
Semi-Supervised Dialogue Policy Learning via Stochastic Reward Estimation | http://arxiv.org/abs/2005.04379v1 |
Semi-Supervised Learning with Normalizing Flows | http://arxiv.org/abs/1912.13025v1 |
Semi-Supervised QA with Generative Domain-Adaptive Nets | http://arxiv.org/abs/1702.02206v2 |
Semi-Supervised StyleGAN for Disentanglement Learning | http://arxiv.org/abs/2003.03461v3 |
Semi-supervised User Geolocation via Graph Convolutional Networks | http://arxiv.org/abs/1804.08049v4 |
Semiparametric Nonlinear Bipartite Graph Representation Learning with Provable Guarantees | http://arxiv.org/abs/2003.01013v1 |
SenseBERT: Driving Some Sense into BERT | http://arxiv.org/abs/1908.05646v2 |
Sentence Meta-Embeddings for Unsupervised Semantic Textual Similarity | http://arxiv.org/abs/1911.03700v3 |
Sentence Simplification with Deep Reinforcement Learning | http://arxiv.org/abs/1703.10931v2 |
Sentences with Gapping: Parsing and Reconstructing Elided Predicates | http://arxiv.org/abs/1804.06922v1 |
SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics | http://arxiv.org/abs/2005.04114v4 |
SentiLARE: Sentiment-Aware Language Representation Learning with Linguistic Knowledge | http://arxiv.org/abs/1911.02493v3 |
Seq2Edits: Sequence Transduction Using Span-level Edit Operations | http://arxiv.org/abs/2009.11136v1 |
SeqMix: Augmenting Active Sequence Labeling via Sequence Mixup | http://arxiv.org/abs/2010.02322v1 |
Sequence Tagging with Contextual and Non-Contextual Subword Representations: A Multilingual Evaluation | http://arxiv.org/abs/1906.01569v1 |
Sequence-Level Knowledge Distillation | http://arxiv.org/abs/1606.07947v4 |
Sequence-Level Mixed Sample Data Augmentation | http://arxiv.org/abs/2011.09039v1 |
Sequence-to-Action: End-to-End Semantic Graph Generation for Semantic Parsing | http://arxiv.org/abs/1809.00773v1 |
Sequential Cooperative Bayesian Inference | http://arxiv.org/abs/2002.05706v3 |
Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-based Chatbots | http://arxiv.org/abs/1612.01627v2 |
Sequential Transfer in Reinforcement Learning with a Generative Model | http://arxiv.org/abs/2007.00722v1 |
Serverless inferencing on Kubernetes | http://arxiv.org/abs/2007.07366v2 |
Set Functions for Time Series | http://arxiv.org/abs/1909.12064v3 |
Severing the Edge Between Before and After: Neural Architectures for Temporal Ordering of Events | http://arxiv.org/abs/2004.04295v1 |
Shape of synth to come: Why we should use synthetic data for English surface realization | http://arxiv.org/abs/2005.02693v1 |
Shaping Visual Representations with Language for Few-shot Classification | http://arxiv.org/abs/1911.02683v2 |
Shared-Private Bilingual Word Embeddings for Neural Machine Translation | http://arxiv.org/abs/1906.03100v1 |
Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models | http://arxiv.org/abs/1902.00194v3 |
Sharp Composition Bounds for Gaussian Differential Privacy via Edgeworth Expansion | http://arxiv.org/abs/2003.04493v2 |
Sharp Models on Dull Hardware: Fast and Accurate Neural Machine Translation Decoding on the CPU | http://arxiv.org/abs/1705.01991v1 |
Sharper bounds for uniformly stable algorithms | http://arxiv.org/abs/1910.07833v2 |
Sheaf Neural Networks | http://arxiv.org/abs/2012.06333v1 |
SherLIiC: A Typed Event-Focused Lexical Inference Benchmark for Evaluating Natural Language Inference | http://arxiv.org/abs/1906.01393v1 |
Short-Term Meaning Shift: A Distributional Exploration | http://arxiv.org/abs/1809.03169v3 |
Should All Cross-Lingual Embeddings Speak English? | http://arxiv.org/abs/1911.03058v2 |
Showing Your Work Doesn't Always Work | http://arxiv.org/abs/2004.13705v1 |
SimGANs: Simulator-Based Generative Adversarial Networks for ECG Synthesis to Improve Deep ECG Classification | http://arxiv.org/abs/2006.15353v1 |
SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity | http://arxiv.org/abs/1608.00869v4 |
Similarity Analysis of Contextual Word Representation Models | http://arxiv.org/abs/2005.01172v1 |
Simple Unsupervised Summarization by Contextual Matching | http://arxiv.org/abs/1907.13337v1 |
Simple and Deep Graph Convolutional Networks | http://arxiv.org/abs/2007.02133v1 |
Simple and Effective Curriculum Pointer-Generator Networks for Reading Comprehension over Long Narratives | http://arxiv.org/abs/1905.10847v1 |
Simple and Effective Multi-Paragraph Reading Comprehension | http://arxiv.org/abs/1710.10723v2 |
Simple and Effective Text Simplification Using Semantic and Neural Methods | http://arxiv.org/abs/1810.05104v1 |
Simple and sharp analysis of k-means\|\| | |
SimpleQuestions Nearly Solved: A New Upperbound and Baseline Approach | http://arxiv.org/abs/1804.08798v1 |
Simpler but More Accurate Semantic Dependency Parsing | http://arxiv.org/abs/1807.01396v1 |
Simplify the Usage of Lexicon in Chinese NER | http://arxiv.org/abs/1908.05969v2 |
Simplifying Neural Machine Translation with Addition-Subtraction Twin-Gated Recurrent Networks | http://arxiv.org/abs/1810.12546v1 |
Simulator Calibration under Covariate Shift with Kernels | http://arxiv.org/abs/1809.08159v4 |
Simultaneous Inference for Massive Data: Distributed Bootstrap | http://arxiv.org/abs/2002.08443v1 |
Simultaneous Machine Translation with Visual Context | http://arxiv.org/abs/2009.07310v3 |
Simultaneous Translation Policies: From Fixed to Adaptive | http://arxiv.org/abs/2004.13169v2 |
Simultaneous Translation with Flexible Policy via Restricted Imitation Learning | http://arxiv.org/abs/1906.01135v2 |
Simultaneous paraphrasing and translation by fine-tuning Transformer models | http://arxiv.org/abs/2005.05570v1 |
Single Model Ensemble using Pseudo-Tags and Distinct Vectors | http://arxiv.org/abs/2005.00879v1 |
Single Point Transductive Prediction | http://arxiv.org/abs/1908.02341v4 |
Single Shot Multitask Pedestrian Detection and Behavior Prediction | http://arxiv.org/abs/2101.02232v1 |
Single-/Multi-Source Cross-Lingual NER via Teacher-Student Learning on Unlabeled Data in Target Language | http://arxiv.org/abs/2004.12440v2 |
Situated Mapping of Sequential Instructions to Actions with Single-step Reward Observation | http://arxiv.org/abs/1805.10209v2 |
Skeleton-to-Response: Dialogue Generation Guided by Retrieval Memory | http://arxiv.org/abs/1809.05296v5 |
Sketch-Driven Regular Expression Generation from Natural Language and Examples | http://arxiv.org/abs/1908.05848v2 |
Sketching Transformed Matrices with Applications to Natural Language Processing | http://arxiv.org/abs/2002.09812v1 |
Skill Transfer via Partially Amortized Hierarchical Planning | http://arxiv.org/abs/2011.13897v1 |
SlotRefine: A Fast Non-Autoregressive Model for Joint Intent Detection and Slot Filling | http://arxiv.org/abs/2010.02693v2 |
Small Data, Big Decisions: Model Selection in the Small-Data Regime | http://arxiv.org/abs/2009.12583v1 |
Small-GAN: Speeding Up GAN Training Using Core-sets | http://arxiv.org/abs/1910.13540v1 |
Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes | http://arxiv.org/abs/1909.02553v4 |
Social Bias Frames: Reasoning about Social and Power Implications of Language | http://arxiv.org/abs/1911.03891v3 |
Social Biases in NLP Models as Barriers for Persons with Disabilities | http://arxiv.org/abs/2005.00813v1 |
Social Chemistry 101: Learning to Reason about Social and Moral Norms | http://arxiv.org/abs/2011.00620v1 |
Social Media Attributions in the Context of Water Crisis | http://arxiv.org/abs/2001.01697v1 |
Soft Gazetteers for Low-Resource Named Entity Recognition | http://arxiv.org/abs/2005.01866v1 |
Soft Threshold Weight Reparameterization for Learnable Sparsity | http://arxiv.org/abs/2002.03231v9 |
SoftSort: A Continuous Relaxation for the argsort Operator | http://arxiv.org/abs/2006.16038v1 |
Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs | http://arxiv.org/abs/2007.01231v2 |
Solving Constrained CASH Problems with ADMM | http://arxiv.org/abs/2006.09635v2 |
Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity | http://arxiv.org/abs/1908.11071v1 |
Solving General Arithmetic Word Problems | http://arxiv.org/abs/1608.01413v2 |
Solving Physics Puzzles by Reasoning about Paths | http://arxiv.org/abs/2011.07357v1 |
Source Separation with Deep Generative Priors | http://arxiv.org/abs/2002.07942v2 |
Sources of Transfer in Multilingual Named Entity Recognition | http://arxiv.org/abs/2005.00847v1 |
Span Selection Pre-training for Question Answering | http://arxiv.org/abs/1909.04120v2 |
Span-Based Constituency Parsing with a Structure-Label System and Provably Optimal Dynamic Oracles | http://arxiv.org/abs/1612.06475v1 |
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations | http://arxiv.org/abs/2005.08866v2 |
Span-based Localizing Network for Natural Language Video Localization | http://arxiv.org/abs/2004.13931v2 |
Span-based discontinuous constituency parsing: a family of exact chart-based algorithms with time complexities from O(n^6) down to O(n^3) | http://arxiv.org/abs/2003.13785v1 |
SpanBERT: Improving Pre-training by Representing and Predicting Spans | http://arxiv.org/abs/1907.10529v3 |
Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling | http://arxiv.org/abs/1612.07130v1 |
Sparse Gaussian Processes with Spherical Harmonic Features | http://arxiv.org/abs/2006.16649v1 |
Sparse Orthogonal Variational Inference for Gaussian Processes | http://arxiv.org/abs/1910.10596v3 |
Sparse Overcomplete Word Vector Representations | http://arxiv.org/abs/1506.02004v1 |
Sparse Parallel Training of Hierarchical Dirichlet Process Topic Models | http://arxiv.org/abs/1906.02416v2 |
Sparse Sinkhorn Attention | http://arxiv.org/abs/2002.11296v1 |
Sparse Text Generation | http://arxiv.org/abs/2004.02644v3 |
Sparse and Constrained Attention for Neural Machine Translation | http://arxiv.org/abs/1805.08241v1 |
Sparse and Low-rank Tensor Estimation via Cubic Sketchings | http://arxiv.org/abs/1801.09326v4 |
Sparsified Linear Programming for Zero-Sum Equilibrium Finding | http://arxiv.org/abs/2006.03451v2 |
SpatialSim: Recognizing Spatial Configurations of Objects with Graph Neural Networks | http://arxiv.org/abs/2004.04546v2 |
Speak to your Parser: Interactive Text-to-SQL with Natural Language Feedback | http://arxiv.org/abs/2005.02539v2 |
Speaker Sensitive Response Evaluation Model | http://arxiv.org/abs/2006.07015v1 |
Speakers Fill Lexical Semantic Gaps with Context | http://arxiv.org/abs/2010.02172v2 |
Specialising Word Vectors for Lexical Entailment | http://arxiv.org/abs/1710.06371v2 |
Spectral Clustering with Graph Neural Networks for Graph Pooling | http://arxiv.org/abs/1907.00481v6 |
Spectral Frank-Wolfe Algorithm: Strict Complementarity and Linear Convergence | http://arxiv.org/abs/2006.01719v4 |
Spectral Subsampling MCMC for Stationary Time Series | http://arxiv.org/abs/1910.13627v2 |
Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks | http://arxiv.org/abs/2002.02561v6 |
Speech Translation and the End-to-End Promise: Taking Stock of Where We Are | http://arxiv.org/abs/2004.06358v1 |
Speeding Up Neural Machine Translation Decoding by Cube Pruning | http://arxiv.org/abs/1809.02992v1 |
SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check | http://arxiv.org/abs/2004.14166v2 |
Spelling Error Correction with Soft-Masked BERT | http://arxiv.org/abs/2005.07421v1 |
Split and Rephrase | http://arxiv.org/abs/1707.06971v1 |
Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems | http://arxiv.org/abs/2010.02140v1 |
Spying on your neighbors: Fine-grained probing of contextual embeddings for information about surrounding words | http://arxiv.org/abs/2005.01810v1 |
SqueezeBERT: What can computer vision teach NLP about efficient neural networks? | http://arxiv.org/abs/2006.11316v1 |
Stabilizing Bi-Level Hyperparameter Optimization using Moreau-Yosida Regularization | http://arxiv.org/abs/2007.13322v1 |
Stabilizing Differentiable Architecture Search via Perturbation-based Regularization | http://arxiv.org/abs/2002.05283v3 |
Stabilizing Transformers for Reinforcement Learning | http://arxiv.org/abs/1910.06764v1 |
Stack-Pointer Networks for Dependency Parsing | http://arxiv.org/abs/1805.01087v1 |
Stance Prediction and Claim Verification: An Arabic Perspective | http://arxiv.org/abs/2005.10410v1 |
Stance Prediction for Contemporary Issues: Data and Experiments | http://arxiv.org/abs/2006.00052v1 |
Stanza: A Python Natural Language Processing Toolkit for Many Human Languages | http://arxiv.org/abs/2003.07082v2 |
State Space Expectation Propagation: Efficient Inference Schemes for Temporal Gaussian Processes | http://arxiv.org/abs/2007.05994v1 |
Statistical Machine Translation Features with Multitask Tensor Networks | http://arxiv.org/abs/1506.00698v1 |
Statistically Efficient Off-Policy Policy Gradients | http://arxiv.org/abs/2002.04014v2 |
Statistically Preconditioned Accelerated Gradient Method for Distributed Optimization | http://arxiv.org/abs/2002.10726v1 |
Stay on the Path: Instruction Fidelity in Vision-and-Language Navigation | http://arxiv.org/abs/1905.12255v3 |
Staying True to Your Word: (How) Can Attention Become Explanation? | http://arxiv.org/abs/2005.09379v1 |
Stepwise Extractive Summarization and Planning with Structured Transformers | http://arxiv.org/abs/2010.02744v1 |
Stepwise Model Selection for Sequence Prediction via Deep Kernel Learning | http://arxiv.org/abs/2001.03898v3 |
Stereo Endoscopic Image Super-Resolution Using Disparity-Constrained Parallel Attention | http://arxiv.org/abs/2003.08539v1 |
Stimulating Creativity with FunLines: A Case Study of Humor Generation in Headlines | http://arxiv.org/abs/2002.02031v1 |
Stochastic Coordinate Minimization with Progressive Precision for Stochastic Convex Optimization | http://arxiv.org/abs/2003.05482v1 |
Stochastic Differential Equations with Variational Wishart Diffusions | http://arxiv.org/abs/2006.14895v1 |
Stochastic Flows and Geometric Optimization on the Orthogonal Group | http://arxiv.org/abs/2003.13563v1 |
Stochastic Frank-Wolfe for Constrained Finite-Sum Minimization | http://arxiv.org/abs/2002.11860v5 |
Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization | http://arxiv.org/abs/2002.07290v2 |
Stochastic Gradient and Langevin Processes | http://arxiv.org/abs/1907.03215v7 |
Stochastic Hamiltonian Gradient Methods for Smooth Games | http://arxiv.org/abs/2007.04202v1 |
Stochastic Latent Residual Video Prediction | http://arxiv.org/abs/2002.09219v4 |
Stochastic Linear Contextual Bandits with Diverse Contexts | http://arxiv.org/abs/2003.02681v1 |
Stochastic Neural Network with Kronecker Flow | http://arxiv.org/abs/1906.04282v2 |
Stochastic Normalizing Flows | http://arxiv.org/abs/2002.09547v2 |
Stochastic Optimization for Regularized Wasserstein Estimators | http://arxiv.org/abs/2002.08695v1 |
Stochastic Particle-Optimization Sampling and the Non-Asymptotic Convergence Theory | http://arxiv.org/abs/1809.01293v5 |
Stochastic Recursive Variance-Reduced Cubic Regularization Methods | http://arxiv.org/abs/1901.11518v2 |
Stochastic Regret Minimization in Extensive-Form Games | http://arxiv.org/abs/2002.08493v1 |
Stochastic Subspace Cubic Newton Method | http://arxiv.org/abs/2002.09526v1 |
Stochastic Top-k ListNet | http://arxiv.org/abs/1511.00271v1 |
Stochastic Wasserstein Autoencoder for Probabilistic Sentence Generation | http://arxiv.org/abs/1806.08462v2 |
Stochastic bandits with arm-dependent delays | http://arxiv.org/abs/2006.10459v1 |
Stochastic-YOLO: Efficient Probabilistic Object Detection under Dataset Shifts | http://arxiv.org/abs/2009.02967v2 |
Stochastically Dominant Distributional Reinforcement Learning | http://arxiv.org/abs/1905.07318v4 |
Stochasticity in Neural ODEs: An Empirical Study | http://arxiv.org/abs/2002.09779v2 |
Stolen Probability: A Structural Weakness of Neural Language Models | http://arxiv.org/abs/2005.02433v1 |
Stopping criterion for active learning based on deterministic generalization bounds | http://arxiv.org/abs/2005.07402v1 |
Straight to the Tree: Constituency Parsing with Neural Syntactic Distance | http://arxiv.org/abs/1806.04168v1 |
Strategic Classification is Causal Modeling in Disguise | http://arxiv.org/abs/1910.10362v3 |
Strategies for Structuring Story Generation | http://arxiv.org/abs/1902.01109v2 |
Strategizing against No-regret Learners | http://arxiv.org/abs/1909.13861v1 |
Streamlining Tensor and Network Pruning in PyTorch | http://arxiv.org/abs/2004.13770v1 |
Strength from Weakness: Fast Learning Using Weak Supervision | http://arxiv.org/abs/2002.08483v1 |
Stretching the Effectiveness of MLE from Accuracy to Bias for Pairwise Comparisons | http://arxiv.org/abs/1906.04066v1 |
Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling | http://arxiv.org/abs/1910.02208v4 |
Strong Baselines for Neural Semi-supervised Learning under Domain Shift | http://arxiv.org/abs/1804.09530v1 |
Strong Baselines for Simple Question Answering over Knowledge Graphs with and without Neural Networks | http://arxiv.org/abs/1712.01969v2 |
Strong and Simple Baselines for Multimodal Utterance Embeddings | http://arxiv.org/abs/1906.02125v2 |
Stronger and Faster Wasserstein Adversarial Attacks | http://arxiv.org/abs/2008.02883v1 |
StructVAE: Tree-structured Latent Variable Models for Semi-supervised Semantic Parsing | http://arxiv.org/abs/1806.07832v1 |
Structural Language Models of Code | http://arxiv.org/abs/1910.00577v4 |
Structural Neural Encoders for AMR-to-text Generation | http://arxiv.org/abs/1903.11410v2 |
Structural Scaffolds for Citation Intent Classification in Scientific Publications | http://arxiv.org/abs/1904.01608v2 |
Structural Supervision Improves Few-Shot Learning and Syntactic Generalization in Neural Language Models | http://arxiv.org/abs/2010.05725v1 |
Structure Adaptive Algorithms for Stochastic Bandits | http://arxiv.org/abs/2007.00969v1 |
Structure Aware Negative Sampling in Knowledge Graphs | http://arxiv.org/abs/2009.11355v2 |
Structure Mapping for Transferability of Causal Models | http://arxiv.org/abs/2007.09445v1 |
Structure-Level Knowledge Distillation For Multilingual Sequence Labeling | http://arxiv.org/abs/2004.03846v3 |
Structured Attention for Unsupervised Dialogue Structure Induction | http://arxiv.org/abs/2009.08552v2 |
Structured Linear Contextual Bandits: A Sharp and Geometric Smoothed Analysis | http://arxiv.org/abs/2002.11332v1 |
Structured Minimally Supervised Learning for Neural Relation Extraction | http://arxiv.org/abs/1904.00118v5 |
Structured Multi-Label Biomedical Text Tagging via Attentive Neural Tree Decoding | http://arxiv.org/abs/1810.01468v1 |
Structured Policy Iteration for Linear Quadratic Regulator | http://arxiv.org/abs/2007.06202v1 |
Structured Prediction with Partial Labelling through the Infimum Loss | http://arxiv.org/abs/2003.00920v2 |
Structured Pruning of Large Language Models | http://arxiv.org/abs/1910.04732v1 |
Structured Training for Neural Network Transition-Based Parsing | http://arxiv.org/abs/1506.06158v1 |
Structured Tuning for Semantic Role Labeling | http://arxiv.org/abs/2005.00496v2 |
Student-Teacher Curriculum Learning via Reinforcement Learning: Predicting Hospital Inpatient Admission Location | http://arxiv.org/abs/2007.01135v1 |
Studying the Inductive Biases of RNNs with Synthetic Variations of Natural Languages | http://arxiv.org/abs/1903.06400v2 |
Style Transfer Through Back-Translation | http://arxiv.org/abs/1804.09000v3 |
Sub-Instruction Aware Vision-and-Language Navigation | http://arxiv.org/abs/2004.02707v2 |
Subgoal Discovery for Hierarchical Dialogue Policy Learning | http://arxiv.org/abs/1804.07855v3 |
SubjQA: A Dataset for Subjectivity and Review Comprehension | http://arxiv.org/abs/2004.14283v3 |
Sublinear Optimal Policy Value Estimation in Contextual Bandits | http://arxiv.org/abs/1912.06111v2 |
Substance over Style: Document-Level Targeted Content Transfer | http://arxiv.org/abs/2010.08618v1 |
Subword Regularization: Improving Neural Network Translation Models with Multiple Subword Candidates | http://arxiv.org/abs/1804.10959v1 |
Subword-Level Language Identification for Intra-Word Code-Switching | http://arxiv.org/abs/1904.01989v1 |
Successfully Applying the Stabilized Lottery Ticket Hypothesis to the Transformer Architecture | http://arxiv.org/abs/2005.03454v2 |
Summarizing Opinions: Aspect Extraction Meets Sentiment Prediction and They Are Both Weakly Supervised | http://arxiv.org/abs/1808.08858v1 |
Summarizing Text on Any Aspects: A Knowledge-Informed Weakly-Supervised Approach | http://arxiv.org/abs/2010.06792v2 |
Super-efficiency of automatic differentiation for functions defined as a minimum | http://arxiv.org/abs/2002.03722v1 |
Supermasks in Superposition | http://arxiv.org/abs/2006.14769v3 |
Supertagging Combinatory Categorial Grammar with Attentive Graph Convolutional Networks | http://arxiv.org/abs/2010.06115v2 |
Supervised Attentions for Neural Machine Translation | http://arxiv.org/abs/1608.00112v1 |
Supervised Domain Enablement Attention for Personalized Domain Classification | http://arxiv.org/abs/1812.07546v1 |
Supervised Grapheme-to-Phoneme Conversion of Orthographic Schwas in Hindi and Punjabi | http://arxiv.org/abs/2004.10353v2 |
Supervised Learning of Universal Sentence Representations from Natural Language Inference Data | http://arxiv.org/abs/1705.02364v5 |
Supervised Learning: No Loss No Cry | http://arxiv.org/abs/2002.03555v1 |
Supervised Seeded Iterated Learning for Interactive Language Learning | http://arxiv.org/abs/2010.02975v1 |
Support recovery and sup-norm convergence rates for sparse pivotal estimation | http://arxiv.org/abs/2001.05401v3 |
Surrogate sea ice model enables efficient tuning | http://arxiv.org/abs/2006.12977v1 |
SwitchOut: an Efficient Data Augmentation Algorithm for Neural Machine Translation | http://arxiv.org/abs/1808.07512v2 |
Symbolic Network: Generalized Neural Policies for Relational MDPs | http://arxiv.org/abs/2002.07375v2 |
Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation | http://arxiv.org/abs/2004.08694v3 |
SynSetExpan: An Iterative Framework for Joint Entity Set Expansion and Synonym Discovery | http://arxiv.org/abs/2009.13827v1 |
Synchronous Bidirectional Neural Machine Translation | http://arxiv.org/abs/1905.04847v1 |
Syntactic Data Augmentation Increases Robustness to Inference Heuristics | http://arxiv.org/abs/2004.11999v1 |
Syntactic Scaffolds for Semantic Structures | http://arxiv.org/abs/1808.10485v1 |
Syntactic Search by Example | http://arxiv.org/abs/2006.03010v1 |
Syntactic Structure Distillation Pretraining For Bidirectional Encoders | http://arxiv.org/abs/2005.13482v1 |
Syntax-Enhanced Neural Machine Translation with Syntax-Aware Word Representations | http://arxiv.org/abs/1905.02878v1 |
Syntax-guided Controlled Generation of Paraphrases | http://arxiv.org/abs/2005.08417v1 |
T-Basis: a Compact Representation for Neural Networks | http://arxiv.org/abs/2007.06631v1 |
T-GD: Transferable GAN-generated Images Detection Framework | http://arxiv.org/abs/2008.04115v1 |
T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack | http://arxiv.org/abs/1912.10375v2 |
TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task | http://arxiv.org/abs/2004.14855v1 |
TAG: Type Auxiliary Guiding for Code Comment Generation | http://arxiv.org/abs/2005.02835v1 |
TAPAS: Weakly Supervised Table Parsing via Pre-training | http://arxiv.org/abs/2004.02349v2 |
TOD-BERT: Pre-trained Natural Language Understanding for Task-Oriented Dialogue | http://arxiv.org/abs/2004.06871v3 |
TORQUE: A Reading Comprehension Dataset of Temporal Ordering Questions | http://arxiv.org/abs/2005.00242v2 |
TUDataset: A collection of benchmark datasets for learning with graphs | http://arxiv.org/abs/2007.08663v1 |
TUNIZI: a Tunisian Arabizi sentiment analysis Dataset | http://arxiv.org/abs/2004.14303v1 |
TVQA+: Spatio-Temporal Grounding for Video Question Answering | http://arxiv.org/abs/1904.11574v2 |
TVQA: Localized, Compositional Video Question Answering | http://arxiv.org/abs/1809.01696v2 |
TWEETQA: A Social Media Focused Question Answering Dataset | http://arxiv.org/abs/1907.06292v1 |
TXtract: Taxonomy-Aware Knowledge Extraction for Thousands of Product Categories | http://arxiv.org/abs/2004.13852v2 |
TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data | http://arxiv.org/abs/2005.08314v1 |
Tabula nearly rasa: Probing the Linguistic Knowledge of Character-Level Neural Language Models Trained on Unsegmented Text | http://arxiv.org/abs/1906.07285v1 |
Tackling the Low-resource Challenge for Canonical Segmentation | http://arxiv.org/abs/2010.02804v1 |
Tailoring: encoding inductive biases by optimizing unsupervised objectives at prediction time | http://arxiv.org/abs/2009.10623v2 |
Tails of Lipschitz Triangular Flows | http://arxiv.org/abs/1907.04481v3 |
Taking a hint: How to leverage loss predictors in contextual bandits? | http://arxiv.org/abs/2003.01922v2 |
Talk to Papers: Bringing Neural Question Answering to Academic Search | http://arxiv.org/abs/2004.02002v3 |
Talking to the crowd: What do people react to in online discussions? | http://arxiv.org/abs/1507.02205v2 |
Tangled up in BLEU: Reevaluating the Evaluation of Automatic Machine Translation Evaluation Metrics | http://arxiv.org/abs/2006.06264v2 |
Target Conditioned Sampling: Optimizing Data Selection for Multilingual Neural Machine Translation | http://arxiv.org/abs/1905.08212v1 |
Target-Guided Open-Domain Conversation | http://arxiv.org/abs/1905.11553v2 |
Targeted Syntactic Evaluation of Language Models | http://arxiv.org/abs/1808.09031v1 |
Task-Oriented Dialogue as Dataflow Synthesis | http://arxiv.org/abs/2009.11423v2 |
Task-Oriented Query Reformulation with Reinforcement Learning | http://arxiv.org/abs/1704.04572v4 |
TaskNorm: Rethinking Batch Normalization for Meta-Learning | http://arxiv.org/abs/2003.03284v2 |
Tasty Burgers, Soggy Fries: Probing Aspect Robustness in Aspect-Based Sentiment Analysis | http://arxiv.org/abs/2009.07964v4 |
TaxiNLI: Taking a Ride up the NLU Hill | http://arxiv.org/abs/2009.14505v3 |
Taxonomy of Dual Block-Coordinate Ascent Methods for Discrete Energy Minimization | http://arxiv.org/abs/2004.07715v1 |
Taylor Expansion Policy Optimization | http://arxiv.org/abs/2003.06259v1 |
TeMP: Temporal Message Passing for Temporal Knowledge Graph Completion | http://arxiv.org/abs/2010.03526v1 |
TeaForN: Teacher-Forcing with N-grams | http://arxiv.org/abs/2010.03494v2 |
Teacher-Student Domain Adaptation for Biosensor Models | http://arxiv.org/abs/2003.07896v2 |
Teacher-Student chain for efficient semi-supervised histology image classification | http://arxiv.org/abs/2003.08797v2 |
Technology Readiness Levels for Machine Learning Systems | http://arxiv.org/abs/2101.03989v1 |
Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space | http://arxiv.org/abs/2010.01475v1 |
Tell-and-Answer: Towards Explainable Visual Question Answering using Attributes and Captions | http://arxiv.org/abs/1801.09041v1 |
Template-Based Question Generation from Retrieved Sentences for Improved Unsupervised Question Answering | http://arxiv.org/abs/2004.11892v1 |
Temporal Common Sense Acquisition with Minimal Supervision | http://arxiv.org/abs/2005.04304v1 |
Temporal Information Extraction by Predicting Relative Time-lines | http://arxiv.org/abs/1808.09401v1 |
Temporal Mental Health Dynamics on Social Media | http://arxiv.org/abs/2008.13121v3 |
Temporal Phenotyping using Deep Predictive Clustering of Disease Progression | http://arxiv.org/abs/2006.08600v1 |
Temporally-Continuous Probabilistic Prediction using Polynomial Trajectory Parameterization | http://arxiv.org/abs/2011.00399v1 |
TenIPS: Inverse Propensity Sampling for Tensor Completion | http://arxiv.org/abs/2101.00323v1 |
Tensor Fusion Network for Multimodal Sentiment Analysis | http://arxiv.org/abs/1707.07250v1 |
Tensor denoising and completion based on ordinal observations | http://arxiv.org/abs/2002.06524v3 |
Tensors over Semirings for Latent-Variable Weighted Logic Programs | http://arxiv.org/abs/2006.04232v1 |
TernaryBERT: Distillation-aware Ultra-low Bit BERT | http://arxiv.org/abs/2009.12812v3 |
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts | http://arxiv.org/abs/1909.13231v3 |
Tetra-Tagging: Word-Synchronous Parsing with Linear-Time Inference | http://arxiv.org/abs/1904.09745v2 |
Text Classification Using Label Names Only: A Language Model Self-Training Approach | http://arxiv.org/abs/2010.07245v1 |
Text Classification with Few Examples using Controlled Generalization | http://arxiv.org/abs/2005.08469v1 |
Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems | http://arxiv.org/abs/1903.11508v2 |
Text and Causal Inference: A Review of Using Text to Remove Confounding from Causal Estimates | http://arxiv.org/abs/2005.00649v1 |
Text to 3D Scene Generation with Rich Lexical Grounding | http://arxiv.org/abs/1505.06289v2 |
Text-Based Ideal Points | http://arxiv.org/abs/2005.04232v2 |
TextAttack: Lessons learned in designing Python frameworks for NLP | http://arxiv.org/abs/2010.01724v1 |
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing | http://arxiv.org/abs/2002.12620v2 |
TextHide: Tackling Data Privacy in Language Understanding Tasks | http://arxiv.org/abs/2010.06053v1 |
That is a Known Lie: Detecting Previously Fact-Checked Claims | http://arxiv.org/abs/2005.06058v1 |
The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers | http://arxiv.org/abs/1905.12676v2 |
The ADAPT Enhanced Dependency Parser at the IWPT 2020 Shared Task | http://arxiv.org/abs/2009.01712v1 |
The Area of the Convex Hull of Sampled Curves: a Robust Functional Statistical Depth Measure | http://arxiv.org/abs/1910.04085v2 |
The Argument Reasoning Comprehension Task: Identification and Reconstruction of Implicit Warrants | http://arxiv.org/abs/1708.01425v4 |
The Boomerang Sampler | http://arxiv.org/abs/2006.13777v2 |
The Cascade Transformer: an Application for Efficient Answer Sentence Selection | http://arxiv.org/abs/2005.02534v2 |
The Complexity of Finding Stationary Points with Stochastic Gradient Descent | http://arxiv.org/abs/1910.01845v2 |
The Curse of Performance Instability in Analysis Datasets: Consequences, Source, and Suggestions | http://arxiv.org/abs/2004.13606v2 |
The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents | http://arxiv.org/abs/1911.03768v2 |
The EOS Decision and Length Extrapolation | http://arxiv.org/abs/2010.07174v1 |
The Effect of Natural Distribution Shift on Question Answering Models | http://arxiv.org/abs/2004.14444v1 |
The Expressive Power of a Class of Normalizing Flow Models | http://arxiv.org/abs/2006.00392v1 |
The FAST Algorithm for Submodular Maximization | http://arxiv.org/abs/1907.06173v1 |
The Fast Loaded Dice Roller: A Near-Optimal Exact Sampler for Discrete Probability Distributions | http://arxiv.org/abs/2003.03830v2 |
The Galactic Dependencies Treebanks: Getting More Data by Synthesizing New Languages | http://arxiv.org/abs/1710.03838v1 |
The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits | http://arxiv.org/abs/2001.05452v3 |
The Grammar of Emergent Languages | http://arxiv.org/abs/2010.02069v2 |
The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent | http://arxiv.org/abs/1904.06963v5 |
The Implicit Regularization of Ordinary Least Squares Ensembles | http://arxiv.org/abs/1910.04743v2 |
The Implicit Regularization of Stochastic Gradient Flow for Least Squares | http://arxiv.org/abs/2003.07802v2 |
The Implicit and Explicit Regularization Effects of Dropout | http://arxiv.org/abs/2002.12915v3 |
The Importance of Being Recurrent for Modeling Hierarchical Structure | http://arxiv.org/abs/1803.03585v2 |
The Importance of Category Labels in Grammar Induction with Child-directed Utterances | http://arxiv.org/abs/2006.11646v1 |
The Influence of Shape Constraints on the Thresholding Bandit Problem | http://arxiv.org/abs/2006.10006v2 |
The Interplay between Lexical Resources and Natural Language Processing | http://arxiv.org/abs/1807.00571v1 |
The Intrinsic Robustness of Stochastic Bandits to Strategic Manipulation | http://arxiv.org/abs/1906.01528v2 |
The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task | http://arxiv.org/abs/2010.13192v1 |
The Language of Legal and Illegal Activity on the Darknet | http://arxiv.org/abs/1905.05543v2 |
The Lipschitz Constant of Self-Attention | http://arxiv.org/abs/2006.04710v1 |
The Lower The Simpler: Simplifying Hierarchical Recurrent Models | http://arxiv.org/abs/1809.02790v4 |
The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning | http://arxiv.org/abs/1808.00023v2 |
The Microsoft Toolkit of Multi-Task Deep Neural Networks for Natural Language Understanding | http://arxiv.org/abs/2002.07972v2 |
The Multilingual Amazon Reviews Corpus | http://arxiv.org/abs/2010.02573v1 |
The NarrativeQA Reading Comprehension Challenge | http://arxiv.org/abs/1712.07040v1 |
The NetHack Learning Environment | http://arxiv.org/abs/2006.13760v2 |
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization | http://arxiv.org/abs/2008.06786v1 |
The Non-IID Data Quagmire of Decentralized Machine Learning | http://arxiv.org/abs/1910.00189v2 |
The Paradigm Discovery Problem | http://arxiv.org/abs/2005.01630v1 |
The PhotoBook Dataset: Building Common Ground through Visually-Grounded Dialogue | http://arxiv.org/abs/1906.01530v2 |
The Power Spherical distribution | http://arxiv.org/abs/2006.04437v2 |
The Power of Batching in Multiple Hypothesis Testing | http://arxiv.org/abs/1910.04968v2 |
The Referential Reader: A Recurrent Entity Network for Anaphora Resolution | http://arxiv.org/abs/1902.01541v2 |
The Return of Lexical Dependencies: Neural Lexicalized PCFGs | http://arxiv.org/abs/2007.15135v1 |
The Right Tool for the Job: Matching Model and Instance Complexities | http://arxiv.org/abs/2004.07453v2 |
The SIGMORPHON 2020 Shared Task on Unsupervised Morphological Paradigm Completion | http://arxiv.org/abs/2005.13756v1 |
The SOFC-Exp Corpus and Neural Approaches to Information Extraction in the Materials Science Domain | http://arxiv.org/abs/2006.03039v1 |
The Secret is in the Spectra: Predicting Cross-lingual Task Performance with Spectral Similarity Measures | http://arxiv.org/abs/2001.11136v2 |
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations | http://arxiv.org/abs/2005.01348v2 |
The State and Fate of Linguistic Diversity and Inclusion in the NLP World | http://arxiv.org/abs/2004.09095v2 |
The Sylvester Graphical Lasso (SyGlasso) | http://arxiv.org/abs/2002.00288v1 |
The TechQA Dataset | http://arxiv.org/abs/1911.02984v1 |
The Tree Ensemble Layer: Differentiability meets Conditional Computation | http://arxiv.org/abs/2002.07772v2 |
The True Sample Complexity of Identifying Good Arms | http://arxiv.org/abs/1906.06594v1 |
The Unreasonable Volatility of Neural Machine Translation Models | http://arxiv.org/abs/2005.12398v1 |
The Unstoppable Rise of Computational Linguistics in Deep Learning | http://arxiv.org/abs/2005.06420v3 |
The Usual Suspects? Reassessing Blame for VAE Posterior Collapse | http://arxiv.org/abs/1912.10702v1 |
The Volctrans Machine Translation System for WMT20 | http://arxiv.org/abs/2010.14806v2 |
The Web as a Knowledge-base for Answering Complex Questions | http://arxiv.org/abs/1803.06643v1 |
The World is Not Binary: Learning to Rank with Grayscale Data for Dialogue Response Selection | http://arxiv.org/abs/2004.02421v4 |
The continuous categorical: a novel simplex-valued exponential family | http://arxiv.org/abs/2002.08563v2 |
The cost-free nature of optimally tuning Tikhonov regularizers and other ordered smoothers | http://arxiv.org/abs/1905.12517v1 |
The elephant in the interpretability room: Why use attention as explanation when we have saliency methods? | http://arxiv.org/abs/2010.05607v1 |
The emergence of number and syntax units in LSTM language models | http://arxiv.org/abs/1903.07435v2 |
The equivalence between Stein variational gradient descent and black-box variational inference | http://arxiv.org/abs/2004.01822v1 |
The importance of fillers for text representations of speech transcripts | http://arxiv.org/abs/2009.11340v2 |
The k-tied Normal Distribution: A Compact Parameterization of Gaussian Mean Field Posteriors in Bayesian Neural Networks | http://arxiv.org/abs/2002.02655v2 |
The many Shapley values for model explanation | http://arxiv.org/abs/1908.08474v2 |
The perceptual boost of visual attention is task-dependent in naturalistic settings | http://arxiv.org/abs/2003.00882v2 |
The role of context in neural pitch accent detection in English | http://arxiv.org/abs/2004.14846v2 |
The role of regularization in classification of high-dimensional noisy Gaussian mixture | http://arxiv.org/abs/2002.11544v1 |
The unreasonable effectiveness of Batch-Norm statistics in addressing catastrophic forgetting across medical institutions | http://arxiv.org/abs/2011.08096v1 |
Theoretical Limitations of Self-Attention in Neural Sequence Models | http://arxiv.org/abs/1906.06755v2 |
Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning | http://arxiv.org/abs/2009.07445v1 |
Thermodynamic Consistent Neural Networks for Learning Material Interfacial Mechanics | http://arxiv.org/abs/2011.14172v1 |
Thompson Sampling Algorithms for Mean-Variance Bandits | http://arxiv.org/abs/2002.00232v3 |
Thompson Sampling for Linearly Constrained Bandits | http://arxiv.org/abs/2004.09258v2 |
Thompson Sampling via Local Uncertainty | http://arxiv.org/abs/1910.13673v3 |
Thresholding Bandit Problem with Both Duels and Pulls | http://arxiv.org/abs/1910.06368v2 |
Thresholding Graph Bandits with GrAPL | http://arxiv.org/abs/1905.09190v3 |
Tied Multitask Learning for Neural Speech Translation | http://arxiv.org/abs/1802.06655v2 |
Tight Differential Privacy for Discrete-Valued Mechanisms and for the Subsampled Gaussian Mechanism Using FFT | http://arxiv.org/abs/2006.07134v2 |
Tight Lower Bounds for Combinatorial Multi-Armed Bandits | http://arxiv.org/abs/2002.05392v3 |
Tightening Exploration in Upper Confidence Reinforcement Learning | http://arxiv.org/abs/2004.09656v2 |
Tigrinya Neural Machine Translation with Transfer Learning for Humanitarian Response | http://arxiv.org/abs/2003.11523v1 |
Tilde at WMT 2020: News Task Systems | http://arxiv.org/abs/2010.15423v1 |
Time Adaptive Reinforcement Learning | http://arxiv.org/abs/2004.08600v1 |
Time Dependence in Non-Autonomous Neural ODEs | http://arxiv.org/abs/2005.01906v2 |
Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders | http://arxiv.org/abs/1902.00450v4 |
Time Series Source Separation with Slow Flows | http://arxiv.org/abs/2007.10182v1 |
Time-aware Large Kernel Convolutions | http://arxiv.org/abs/2002.03184v2 |
Tiny Video Networks | http://arxiv.org/abs/1910.06961v1 |
To BERT or Not to BERT: Comparing Task-specific and Task-agnostic Semi-Supervised Approaches for Sequence Tagging | http://arxiv.org/abs/2010.14042v1 |
To Schedule or not to Schedule: Extracting Task Specific Temporal Entities and Associated Negation Constraints | http://arxiv.org/abs/2012.02594v1 |
To Test Machine Comprehension, Start by Defining Comprehension | http://arxiv.org/abs/2005.01525v2 |
ToTTo: A Controlled Table-To-Text Generation Dataset | http://arxiv.org/abs/2004.14373v3 |
Token-level and sequence-level loss smoothing for RNN language models | http://arxiv.org/abs/1805.05062v1 |
Top-Rank-Focused Adaptive Vote Collection for the Evaluation of Domain-Specific Semantic Models | http://arxiv.org/abs/2010.04486v1 |
Topic Memory Networks for Short Text Classification | http://arxiv.org/abs/1809.03664v1 |
Topic Modeling in Embedding Spaces | http://arxiv.org/abs/1907.04907v1 |
Topic Modeling via Full Dependence Mixtures | http://arxiv.org/abs/1906.06181v3 |
Topic Sensitive Attention on Generic Corpora Corrects Sense Bias in Pretrained Embeddings | http://arxiv.org/abs/1906.02688v2 |
Topically Driven Neural Language Model | http://arxiv.org/abs/1704.08012v2 |
Topological Autoencoders | http://arxiv.org/abs/1906.00722v4 |
Topological Sort for Sentence Ordering | http://arxiv.org/abs/2005.00432v1 |
Topologically Densified Distributions | http://arxiv.org/abs/2002.04805v1 |
Torch-Struct: Deep Structured Prediction Library | http://arxiv.org/abs/2002.00876v1 |
Toward A Neuro-inspired Creative Decoder | http://arxiv.org/abs/1902.02399v4 |
Toward Better Storylines with Sentence-Level Language Models | http://arxiv.org/abs/2005.05255v1 |
Toward Fast and Accurate Neural Discourse Segmentation | http://arxiv.org/abs/1808.09147v1 |
Toward Gender-Inclusive Coreference Resolution | http://arxiv.org/abs/1910.13913v4 |
Toward Micro-Dialect Identification in Diaglossic and Code-Switched Environments | http://arxiv.org/abs/2010.04900v2 |
Towards A Sign Language Gloss Representation Of Modern Standard Arabic | http://arxiv.org/abs/2005.01497v1 |
Towards Accurate and Reliable Energy Measurement of NLP Models | http://arxiv.org/abs/2010.05248v1 |
Towards Content Transfer through Grounded Text Generation | http://arxiv.org/abs/1905.05293v1 |
Towards Conversational Recommendation over Multi-Type Dialogs | http://arxiv.org/abs/2005.03954v3 |
Towards Debiasing NLU Models from Unknown Biases | http://arxiv.org/abs/2009.12303v4 |
Towards Debiasing Sentence Representations | http://arxiv.org/abs/2007.08100v1 |
Towards Dynamic Computation Graphs via Sparse Latent Structure | http://arxiv.org/abs/1809.00653v1 |
Towards Effective Context for Meta-Reinforcement Learning: an Approach based on Contrastive Learning | http://arxiv.org/abs/2009.13891v3 |
Towards End-to-End In-Image Neural Machine Translation | http://arxiv.org/abs/2010.10648v1 |
Towards End-to-End Reinforcement Learning of Dialogue Agents for Information Access | http://arxiv.org/abs/1609.00777v3 |
Towards Explainable Graph Representations in Digital Pathology | http://arxiv.org/abs/2007.00311v1 |
Towards Exploiting Background Knowledge for Building Conversation Systems | http://arxiv.org/abs/1809.08205v1 |
Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints | http://arxiv.org/abs/2005.00969v1 |
Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness? | http://arxiv.org/abs/2004.03685v3 |
Towards Induction of Structured Phoneme Inventories | http://arxiv.org/abs/2010.05959v1 |
Towards Interpretable Reasoning over Paragraph Effects in Situation | http://arxiv.org/abs/2010.01272v1 |
Towards Interpreting BERT for Reading Comprehension Based QA | http://arxiv.org/abs/2010.08983v1 |
Towards Map-Based Validation of Semantic Segmentation Masks | http://arxiv.org/abs/2011.08008v2 |
Towards Multimodal Simultaneous Neural Machine Translation | http://arxiv.org/abs/2004.03180v2 |
Towards Near-imperceptible Steganographic Text | http://arxiv.org/abs/1907.06679v2 |
Towards Neural Machine Translation for Edoid Languages | http://arxiv.org/abs/2003.10704v1 |
Towards Open Domain Event Trigger Identification using Adversarial Domain Adaptation | http://arxiv.org/abs/2005.11355v1 |
Towards Persona-Based Empathetic Conversational Models | http://arxiv.org/abs/2004.12316v7 |
Towards Physics-informed Deep Learning for Turbulent Flow Prediction | http://arxiv.org/abs/1911.08655v4 |
Towards Reasonably-Sized Character-Level Transformer NMT by Finetuning Subword Systems | http://arxiv.org/abs/2004.14280v2 |
Towards Robustifying NLI Models Against Lexical Dataset Biases | http://arxiv.org/abs/2005.04732v2 |
Towards String-to-Tree Neural Machine Translation | http://arxiv.org/abs/1704.04743v3 |
Towards Supervised and Unsupervised Neural Machine Translation Baselines for Nigerian Pidgin | http://arxiv.org/abs/2003.12660v1 |
Towards Transparent and Explainable Attention Models | http://arxiv.org/abs/2004.14243v1 |
Towards Understanding Gender Bias in Relation Extraction | http://arxiv.org/abs/1911.03642v3 |
Towards Understanding the Dynamics of the First-Order Adversaries | http://arxiv.org/abs/2010.10650v1 |
Towards Understanding the Regularization of Adversarial Robustness on Neural Networks | http://arxiv.org/abs/2011.07478v1 |
Towards Universal Dialogue State Tracking | http://arxiv.org/abs/1810.09587v1 |
Towards Unsupervised Language Understanding and Generation by Joint Dual Learning | http://arxiv.org/abs/2004.14710v1 |
Towards a General Theory of Infinite-Width Limits of Neural Classifiers | http://arxiv.org/abs/2003.05884v3 |
Towards a predictive spatio-temporal representation of brain data | http://arxiv.org/abs/2003.03290v1 |
Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses | http://arxiv.org/abs/1708.07149v2 |
Towards classification parity across cohorts | http://arxiv.org/abs/2005.08033v1 |
Towards intervention-centric causal reasoning in learning agents | http://arxiv.org/abs/2005.12968v1 |
Toxicity Detection: Does Context Really Matter? | http://arxiv.org/abs/2006.00998v1 |
Train No Evil: Selective Masking for Task-Guided Pre-Training | http://arxiv.org/abs/2004.09733v2 |
Trainable Greedy Decoding for Neural Machine Translation | http://arxiv.org/abs/1702.02429v1 |
Training Binary Neural Networks through Learning with Noisy Supervision | http://arxiv.org/abs/2010.04871v1 |
Training Binary Neural Networks using the Bayesian Learning Rule | http://arxiv.org/abs/2002.10778v4 |
Training Classifiers with Natural Language Explanations | http://arxiv.org/abs/1805.03818v4 |
Training Deep Energy-Based Models with f-Divergence Minimization | http://arxiv.org/abs/2003.03463v2 |
Training Linear Neural Networks: Non-Local Convergence and Complexity Results | http://arxiv.org/abs/2002.09852v3 |
Training Millions of Personalized Dialogue Agents | http://arxiv.org/abs/1809.01984v1 |
Training Neural Networks for and by Interpolation | http://arxiv.org/abs/1906.05661v2 |
Training Production Language Models without Memorizing User Data | http://arxiv.org/abs/2009.10031v1 |
Training Question Answering Models From Synthetic Data | http://arxiv.org/abs/2002.09599v1 |
Trajectory of Alternating Direction Method of Multipliers and Adaptive Acceleration | http://arxiv.org/abs/1906.10114v2 |
TrajectoryNet: A Dynamic Optimal Transport Network for Modeling Cellular Dynamics | http://arxiv.org/abs/2002.04461v2 |
TransQuest at WMT2020: Sentence-Level Direct Assessment | http://arxiv.org/abs/2010.05318v1 |
Transfer Learning and Distant Supervision for Multilingual Transformer Models: A Study on African Languages | http://arxiv.org/abs/2010.03179v1 |
Transfer Learning of Photometric Phenotypes in Agriculture Using Metadata | http://arxiv.org/abs/2004.00303v1 |
Transfer Learning without Knowing: Reprogramming Black-box Machine Learning Models with Scarce Data and Limited Resources | http://arxiv.org/abs/2007.08714v2 |
Transfer NAS: Knowledge Transfer between Search Spaces with Transformer Agents | http://arxiv.org/abs/1906.08102v1 |
Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems | http://arxiv.org/abs/1905.08743v2 |
Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya | http://arxiv.org/abs/2006.07698v2 |
Transform the Set: Memory Attentive Generation of Guided and Unguided Image Collages | http://arxiv.org/abs/1910.07236v2 |
Transformation Importance with Applications to Cosmology | http://arxiv.org/abs/2003.01926v1 |
Transformation Networks for Target-Oriented Sentiment Classification | http://arxiv.org/abs/1805.01086v1 |
Transformer Based Multi-Source Domain Adaptation | http://arxiv.org/abs/2009.07806v1 |
Transformer Hawkes Process | http://arxiv.org/abs/2002.09291v4 |
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context | http://arxiv.org/abs/1901.02860v3 |
Transformer-based Context-aware Sarcasm Detection in Conversation Threads from Social Media | http://arxiv.org/abs/2005.11424v1 |
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention | http://arxiv.org/abs/2006.16236v3 |
Transformers to Learn Hierarchical Contexts in Multiparty Dialogue for Span-based Question Answering | http://arxiv.org/abs/2004.03561v2 |
Transformers without Tears: Improving the Normalization of Self-Attention | http://arxiv.org/abs/1910.05895v2 |
Transforming Complex Sentences into a Semantic Hierarchy | http://arxiv.org/abs/1906.01038v1 |
Transition-Based Dependency Parsing with Stack Long Short-Term Memory | http://arxiv.org/abs/1505.08075v1 |
Transition-based Semantic Dependency Parsing with Pointer Networks | http://arxiv.org/abs/2005.13344v2 |
Translating Natural Language Instructions for Behavioral Robot Navigation with a Multi-Head Attention Mechanism | http://arxiv.org/abs/2006.00697v3 |
Translating Neuralese | http://arxiv.org/abs/1704.06960v5 |
Translating Similar Languages: Role of Mutual Intelligibility in Multilingual Transformers | http://arxiv.org/abs/2011.05037v1 |
Translation Artifacts in Cross-lingual Transfer Learning | http://arxiv.org/abs/2004.04721v4 |
Translationese as a Language in "Multilingual" NMT | http://arxiv.org/abs/1911.03823v2 |
Traversing Knowledge Graphs in Vector Space | http://arxiv.org/abs/1506.01094v2 |
Tree-Projected Gradient Descent for Estimating Gradient-Sparse Parameters on Graphs | http://arxiv.org/abs/2006.01662v1 |
Treebank Embedding Vectors for Out-of-domain Dependency Parsing | http://arxiv.org/abs/2005.00800v1 |
Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time | http://arxiv.org/abs/2005.10865v1 |
Triangular Architecture for Rare Language Translation | http://arxiv.org/abs/1805.04813v2 |
TriggerNER: Learning with Entity Triggers as Explanations for Named Entity Recognition | http://arxiv.org/abs/2004.07493v4 |
Trying AGAIN instead of Trying Longer: Prior Learning for Automatic Curriculum Learning | http://arxiv.org/abs/2004.03168v1 |
Tuning-free Plug-and-Play Proximal Algorithm for Inverse Imaging Problems | http://arxiv.org/abs/2002.09611v2 |
Two Birds, One Stone: A Simple, Unified Model for Text Generation from Structured and Unstructured Data | http://arxiv.org/abs/1909.10158v2 |
Two Routes to Scalable Credit Assignment without Weight Symmetry | http://arxiv.org/abs/2003.01513v2 |
Two are Better than One: Joint Entity and Relation Extraction with Table-Sequence Encoders | http://arxiv.org/abs/2010.03851v1 |
Two-sample Testing Using Deep Learning | http://arxiv.org/abs/1910.06239v2 |
TwoWingOS: A Two-Wing Optimization Strategy for Evidential Claim Verification | http://arxiv.org/abs/1808.03465v2 |
TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages | http://arxiv.org/abs/2003.05002v1 |
Type B Reflexivization as an Unambiguous Testbed for Multilingual Multi-Task Gender Bias | http://arxiv.org/abs/2009.11982v2 |
UDapter: Language Adaptation for Truly Universal Dependency Parsing | http://arxiv.org/abs/2004.14327v2 |
UNION: An Unreferenced Metric for Evaluating Open-ended Story Generation | http://arxiv.org/abs/2009.07602v1 |
USR: An Unsupervised and Reference Free Evaluation Metric for Dialog Generation | http://arxiv.org/abs/2005.00456v1 |
Ultra-Fine Entity Typing | http://arxiv.org/abs/1807.04905v1 |
Unbiased Risk Estimators Can Mislead: A Case Study of Learning with Complementary Labels | http://arxiv.org/abs/2007.02235v3 |
Uncertain Natural Language Inference | http://arxiv.org/abs/1909.03042v2 |
Uncertainty Estimation Using a Single Deep Deterministic Neural Network | http://arxiv.org/abs/2003.02037v2 |
Uncertainty Estimation in Cancer Survival Prediction | http://arxiv.org/abs/2003.08573v2 |
Uncertainty Quantification for Deep Context-Aware Mobile Activity Recognition and Unknown Context Discovery | http://arxiv.org/abs/2003.01753v1 |
Uncertainty Quantification for Sparse Deep Learning | http://arxiv.org/abs/2002.11815v2 |
Uncertainty in Neural Networks: Approximately Bayesian Ensembling | http://arxiv.org/abs/1810.05546v5 |
Uncertainty in Neural Relational Inference Trajectory Reconstruction | http://arxiv.org/abs/2006.13666v2 |
Uncertainty quantification using martingales for misspecified Gaussian processes | http://arxiv.org/abs/2006.07368v1 |
Uncertainty-Aware Label Refinement for Sequence Labeling | http://arxiv.org/abs/2012.10608v1 |
Uncertainty-Aware Semantic Augmentation for Neural Machine Translation | http://arxiv.org/abs/2010.04411v1 |
Uncertainty-Aware Vehicle Orientation Estimation for Joint Detection-Prediction Models | http://arxiv.org/abs/2011.03114v1 |
Uncovering the Folding Landscape of RNA Secondary Structure with Deep Graph Embeddings | http://arxiv.org/abs/2006.06885v2 |
Understanding Climate Impacts on Vegetation with Gaussian Processes in Granger Causality | http://arxiv.org/abs/2012.03338v1 |
Understanding Dataset Design Choices for Multi-hop Reasoning | http://arxiv.org/abs/1904.12106v1 |
Understanding Deep Learning Performance through an Examination of Test Set Difficulty: A Psychometric Case Study | http://arxiv.org/abs/1702.04811v3 |
Understanding Generalization in Deep Learning via Tensor Methods | http://arxiv.org/abs/2001.05070v2 |
Understanding Learned Reward Functions | http://arxiv.org/abs/2012.05862v1 |
Understanding Neural Abstractive Summarization Models via Uncertainty | http://arxiv.org/abs/2010.07882v1 |
Understanding Points of Correspondence between Sentences for Abstractive Summarization | http://arxiv.org/abs/2006.05621v1 |
Understanding Self-Attention of Self-Supervised Audio Transformers | http://arxiv.org/abs/2006.03265v2 |
Understanding Self-Training for Gradual Domain Adaptation | http://arxiv.org/abs/2002.11361v1 |
Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection | http://arxiv.org/abs/1704.05753v2 |
Understanding Undesirable Word Embedding Associations | http://arxiv.org/abs/1908.06361v1 |
Understanding Unintended Memorization in Federated Learning | http://arxiv.org/abs/2006.07490v1 |
Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View | http://arxiv.org/abs/1906.02762v1 |
Understanding and Mitigating the Tradeoff Between Robustness and Accuracy | http://arxiv.org/abs/2002.10716v2 |
Understanding language-elicited EEG data by predicting it from a fine-tuned language model | http://arxiv.org/abs/1904.01548v1 |
Understanding the Curse of Horizon in Off-Policy Evaluation via Conditional Importance Sampling | http://arxiv.org/abs/1910.06508v2 |
Understanding the Difficulty of Training Transformers | http://arxiv.org/abs/2004.08249v2 |
Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle | http://arxiv.org/abs/2007.03509v1 |
Understanding the Intrinsic Robustness of Image Distributions using Conditional Generative Models | http://arxiv.org/abs/2003.00378v1 |
Understanding the Mechanics of SPIGOT: Surrogate Gradients for Latent Structure Learning | http://arxiv.org/abs/2010.02357v1 |
Understanding the robustness of deep neural network classifiers for breast cancer screening | http://arxiv.org/abs/2003.10041v1 |
Undirected Graphical Models as Approximate Posteriors | http://arxiv.org/abs/1901.03440v2 |
Unfolding and Shrinking Neural Machine Translation Ensembles | http://arxiv.org/abs/1704.03279v2 |
UniConv: A Unified Conversational Neural Architecture for Multi-domain Task-oriented Dialogues | http://arxiv.org/abs/2004.14307v2 |
Unified Pragmatic Models for Generating and Following Instructions | http://arxiv.org/abs/1711.04987v3 |
Unifying Human and Statistical Evaluation for Natural Language Generation | http://arxiv.org/abs/1904.02792v1 |
Universal Approximation Property of Neural Ordinary Differential Equations | http://arxiv.org/abs/2012.02414v1 |
Universal Approximation with Deep Narrow Networks | http://arxiv.org/abs/1905.08539v2 |
Universal Average-Case Optimality of Polyak Momentum | http://arxiv.org/abs/2002.04664v3 |
Universal Decompositional Semantic Parsing | http://arxiv.org/abs/1910.10138v3 |
Universal Equivariant Multilayer Perceptrons | http://arxiv.org/abs/2002.02912v2 |
Universal Natural Language Processing with Limited Annotations: Try Few-shot Textual Entailment as a Start | http://arxiv.org/abs/2010.02584v1 |
Universal Neural Machine Translation for Extremely Low Resource Languages | http://arxiv.org/abs/1802.05368v2 |
Universal Semantic Parsing | http://arxiv.org/abs/1702.03196v4 |
Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift | http://arxiv.org/abs/2006.14988v1 |
Unlocking the Potential of Deep Counterfactual Value Networks | http://arxiv.org/abs/2007.10442v1 |
Unnatural Language Processing: Bridging the Gap Between Synthetic and Natural Language Data | http://arxiv.org/abs/2004.13645v1 |
Unpaired Sentiment-to-Sentiment Translation: A Cycled Reinforcement Learning Approach | http://arxiv.org/abs/1805.05181v2 |
Unraveling Meta-Learning: Understanding Feature Representations for Few-Shot Tasks | http://arxiv.org/abs/2002.06753v3 |
Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering | http://arxiv.org/abs/2005.01218v1 |
Unsupervised Commonsense Question Answering with Self-Talk | http://arxiv.org/abs/2004.05483v2 |
Unsupervised Cross-lingual Transfer of Word Embedding Spaces | http://arxiv.org/abs/1809.03633v1 |
Unsupervised Discovery of Implicit Gender Bias | http://arxiv.org/abs/2004.08361v2 |
Unsupervised Discovery of Interpretable Directions in the GAN Latent Space | http://arxiv.org/abs/2002.03754v3 |
Unsupervised Discrete Sentence Representation Learning for Interpretable Neural Dialog Generation | http://arxiv.org/abs/1804.08069v1 |
Unsupervised Domain Adaptation for Visual Navigation | http://arxiv.org/abs/2010.14543v2 |
Unsupervised Domain Clusters in Pretrained Language Models | http://arxiv.org/abs/2004.02105v2 |
Unsupervised Dual Paraphrasing for Two-stage Semantic Parsing | http://arxiv.org/abs/2005.13485v3 |
Unsupervised Grammar Induction with Depth-bounded PCFG | http://arxiv.org/abs/1802.08545v2 |
Unsupervised Hierarchy Matching with Optimal Transport over Hyperbolic Spaces | http://arxiv.org/abs/1911.02536v2 |
Unsupervised Identification of Translationese | http://arxiv.org/abs/1609.03205v1 |
Unsupervised Induction of Semantic Roles within a Reconstruction-Error Minimization Framework | http://arxiv.org/abs/1412.2812v1 |
Unsupervised Learning of Morphological Forests | http://arxiv.org/abs/1702.07015v1 |
Unsupervised Learning of Syntactic Structure with Invertible Neural Projections | http://arxiv.org/abs/1808.09111v1 |
Unsupervised Morphological Paradigm Completion | http://arxiv.org/abs/2005.00970v2 |
Unsupervised Multimodal Neural Machine Translation with Pseudo Visual Pivoting | http://arxiv.org/abs/2005.03119v1 |
Unsupervised Natural Language Inference via Decoupled Multimodal Contrastive Learning | http://arxiv.org/abs/2010.08200v1 |
Unsupervised Neural Machine Translation with Weight Sharing | http://arxiv.org/abs/1804.09057v1 |
Unsupervised Online Grounding of Natural Language during Human-Robot Interactions | http://arxiv.org/abs/2007.04304v1 |
Unsupervised Opinion Summarization as Copycat-Review Generation | http://arxiv.org/abs/1911.02247v2 |
Unsupervised Opinion Summarization with Noising and Denoising | http://arxiv.org/abs/2004.10150v1 |
Unsupervised Paraphrasing by Simulated Annealing | http://arxiv.org/abs/1909.03588v2 |
Unsupervised Parsing via Constituency Tests | http://arxiv.org/abs/2010.03146v1 |
Unsupervised Pidgin Text Generation By Pivoting English Data and Self-Training | http://arxiv.org/abs/2003.08272v1 |
Unsupervised Pivot Translation for Distant Languages | http://arxiv.org/abs/1906.02461v3 |
Unsupervised Pre-training of Bidirectional Speech Encoders via Masked Reconstruction | http://arxiv.org/abs/2001.10603v2 |
Unsupervised Quality Estimation for Neural Machine Translation | http://arxiv.org/abs/2005.10608v2 |
Unsupervised Question Answering by Cloze Translation | http://arxiv.org/abs/1906.04980v2 |
Unsupervised Question Decomposition for Question Answering | http://arxiv.org/abs/2002.09758v3 |
Unsupervised Recurrent Neural Network Grammars | http://arxiv.org/abs/1904.03746v6 |
Unsupervised Reference-Free Summary Quality Evaluation via Contrastive Learning | http://arxiv.org/abs/2010.01781v1 |
Unsupervised Speech Decomposition via Triple Information Bottleneck | http://arxiv.org/abs/2004.11284v5 |
Unsupervised Statistical Machine Translation | http://arxiv.org/abs/1809.01272v1 |
Unsupervised Text Style Transfer with Padded Masked Language Models | http://arxiv.org/abs/2010.01054v1 |
Unsupervised Transfer Learning for Spatiotemporal Predictive Networks | http://arxiv.org/abs/2009.11763v1 |
Unsupervised deep clustering for predictive texture pattern discovery in medical images | http://arxiv.org/abs/2002.03721v1 |
Up or Down? Adaptive Rounding for Post-Training Quantization | http://arxiv.org/abs/2004.10568v2 |
Urban Driving with Conditional Imitation Learning | http://arxiv.org/abs/1912.00177v2 |
Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection | http://arxiv.org/abs/1906.06703v1 |
Using Context in Neural Machine Translation Training Objectives | http://arxiv.org/abs/2005.01483v1 |
Using Convolutional Variational Autoencoders to Predict Post-Trauma Health Outcomes from Actigraphy Data | http://arxiv.org/abs/2011.07406v2 |
Using Large Pretrained Language Models for Answering User Queries from Product Specifications | http://arxiv.org/abs/2005.14613v1 |
Using Linguistic Features to Improve the Generalization Capability of Neural Coreference Resolvers | http://arxiv.org/abs/1708.00160v2 |
Using Natural Language Relations between Answer Choices for Machine Comprehension | http://arxiv.org/abs/2012.15837v1 |
Using Punkt for Sentence Segmentation in non-Latin Scripts: Experiments on Kurdish (Sorani) Texts | http://arxiv.org/abs/2004.14134v2 |
Using Type Information to Improve Entity Coreference Resolution | http://arxiv.org/abs/2010.05738v1 |
Using competency questions to select optimal clustering structures for residential energy consumption patterns | http://arxiv.org/abs/2006.00934v1 |
Using millions of emoji occurrences to learn any-domain representations for detecting sentiment, emotion and sarcasm | http://arxiv.org/abs/1708.00524v2 |
Utility is in the Eye of the User: A Critique of NLP Leaderboards | http://arxiv.org/abs/2009.13888v3 |
Utility/Privacy Trade-off through the lens of Optimal Transport | http://arxiv.org/abs/1905.11148v3 |
VCDM: Leveraging Variational Bi-encoding and Deep Contextualized Word Representations for Improved Definition Modeling | http://arxiv.org/abs/2010.03124v1 |
VD-BERT: A Unified Vision and Dialog Transformer with BERT | http://arxiv.org/abs/2004.13278v3 |
VFlow: More Expressive Generative Flows with Variational Data Augmentation | http://arxiv.org/abs/2002.09741v2 |
Validated Variational Inference via Practical Posterior Error Bounds | http://arxiv.org/abs/1910.04102v4 |
Validation of Approximate Likelihood and Emulator Models for Computationally Intensive Simulations | http://arxiv.org/abs/1905.11505v2 |
Variable Skipping for Autoregressive Range Density Estimation | http://arxiv.org/abs/2007.05572v1 |
Variance Reduced Coordinate Descent with Acceleration: New Method With a Surprising Application to Finite-Sum Problems | http://arxiv.org/abs/2002.04670v1 |
Variance Reduction for Matrix Games | http://arxiv.org/abs/1907.02056v2 |
Variance Reduction in Stochastic Particle-Optimization Sampling | http://arxiv.org/abs/1811.08052v1 |
Variational Autoencoders and Nonlinear ICA: A Unifying Framework | http://arxiv.org/abs/1907.04809v4 |
Variational Autoencoders for Sparse and Overdispersed Discrete Data | http://arxiv.org/abs/1905.00616v2 |
Variational Autoencoders with Riemannian Brownian Motion Priors | http://arxiv.org/abs/2002.05227v3 |
Variational Bayesian Quantization | http://arxiv.org/abs/2002.08158v2 |
Variational Depth Search in ResNets | http://arxiv.org/abs/2002.02797v4 |
Variational Inference for Learning Representations of Natural Language Edits | http://arxiv.org/abs/2004.09143v4 |
Variational Inference with Continuously-Indexed Normalizing Flows | http://arxiv.org/abs/2007.05426v1 |
Variational Knowledge Graph Reasoning | http://arxiv.org/abs/1803.06581v3 |
Variational Neural Machine Translation with Normalizing Flows | http://arxiv.org/abs/2005.13978v1 |
Variational Optimization on Lie Groups, with Examples of Leading (Generalized) Eigenvalue Problems | http://arxiv.org/abs/2001.10006v1 |
Variational Pretraining for Semi-supervised Text Classification | http://arxiv.org/abs/1906.02242v1 |
Variational Sequential Labelers for Semi-Supervised Learning | http://arxiv.org/abs/1906.09535v1 |
Vector-Vector-Matrix Architecture: A Novel Hardware-Aware Framework for Low-Latency Inference in NLP Applications | http://arxiv.org/abs/2010.08412v1 |
Vehicle Trajectory Prediction by Transfer Learning of Semi-Supervised Models | http://arxiv.org/abs/2007.06781v2 |
Verb Physics: Relative Physical Knowledge of Actions and Objects | http://arxiv.org/abs/1706.03799v2 |
Video Prediction via Example Guidance | http://arxiv.org/abs/2007.01738v1 |
Video-Grounded Dialogues with Pretrained Generation Language Models | http://arxiv.org/abs/2006.15319v1 |
Video2Commonsense: Generating Commonsense Descriptions to Enrich Video Captioning | http://arxiv.org/abs/2003.05162v3 |
Visual Grounding of Learned Physical Models | http://arxiv.org/abs/2004.13664v2 |
Visually Grounded Continual Learning of Compositional Phrases | http://arxiv.org/abs/2005.00785v5 |
Visually Grounded Neural Syntax Acquisition | http://arxiv.org/abs/1906.02890v2 |
Voice Separation with an Unknown Number of Multiple Speakers | http://arxiv.org/abs/2003.01531v4 |
Volctrans Parallel Corpus Filtering System for WMT 2020 | http://arxiv.org/abs/2010.14029v1 |
Wandering Within a World: Online Contextualized Few-Shot Learning | http://arxiv.org/abs/2007.04546v2 |
Wasserstein Control of Mirror Langevin Monte Carlo | http://arxiv.org/abs/2002.04363v1 |
Wasserstein Distance Regularized Sequence Representation for Text Matching in Asymmetrical Domains | http://arxiv.org/abs/2010.07717v2 |
Wasserstein Smoothing: Certified Robustness against Wasserstein Adversarial Attacks | http://arxiv.org/abs/1910.10783v1 |
Wasserstein Style Transfer | http://arxiv.org/abs/1905.12828v1 |
WaveFlow: A Compact Flow-based Model for Raw Audio | http://arxiv.org/abs/1912.01219v4 |
WaveNODE: A Continuous Normalizing Flow for Speech Synthesis | http://arxiv.org/abs/2006.04598v4 |
We Can Detect Your Bias: Predicting the Political Ideology of News Articles | http://arxiv.org/abs/2010.05338v1 |
WeChat Neural Machine Translation Systems for WMT20 | http://arxiv.org/abs/2010.00247v2 |
Weakly Supervised Context Encoder using DICOM metadata in Ultrasound Imaging | http://arxiv.org/abs/2003.09070v1 |
Weakly Supervised Learning of Nuanced Frames for Analyzing Polarization in News Media | http://arxiv.org/abs/2009.09609v1 |
Weakly Supervised Medication Regimen Extraction from Medical Conversations | http://arxiv.org/abs/2010.05317v1 |
Weakly-Supervised Aspect-Based Sentiment Analysis via Joint Aspect-Sentiment Topic Embedding | http://arxiv.org/abs/2010.06705v1 |
Weakly-Supervised Disentanglement Without Compromises | http://arxiv.org/abs/2002.02886v4 |
Weakly-Supervised Spatio-Temporally Grounding Natural Sentence in Video | http://arxiv.org/abs/1906.02549v1 |
WeatherBench: A benchmark dataset for data-driven weather forecasting | http://arxiv.org/abs/2002.00469v3 |
Weight Poisoning Attacks on Pre-trained Models | http://arxiv.org/abs/2004.06660v1 |
Weird AI Yankovic: Generating Parody Lyrics | http://arxiv.org/abs/2009.12240v1 |
Weisfeiler and Leman go sparse: Towards scalable higher-order graph embeddings | http://arxiv.org/abs/1904.01543v3 |
What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models | http://arxiv.org/abs/1907.13528v2 |
What Can Learned Intrinsic Rewards Capture? | http://arxiv.org/abs/1912.05500v3 |
What Can We Learn from Collective Human Opinions on Natural Language Inference Data? | http://arxiv.org/abs/2010.03532v2 |
What Did You Think Would Happen? Explaining Agent Behaviour Through Intended Outcomes | http://arxiv.org/abs/2011.05064v1 |
What Do Position Embeddings Learn? An Empirical Study of Pre-Trained Language Model Positional Encoding | http://arxiv.org/abs/2010.04903v1 |
What Does My QA Model Know? Devising Controlled Probes using Expert Knowledge | http://arxiv.org/abs/1912.13337v2 |
What Gives the Answer Away? Question Answering Bias Analysis on Video QA Datasets | http://arxiv.org/abs/2007.03626v1 |
What Happens To BERT Embeddings During Fine-tuning? | http://arxiv.org/abs/2004.14448v1 |
What Have We Achieved on Text Summarization? | http://arxiv.org/abs/2010.04529v1 |
What Kind of Language Is Hard to Language-Model? | http://arxiv.org/abs/1906.04726v2 |
What Makes Reading Comprehension Questions Easier? | http://arxiv.org/abs/1808.09384v1 |
What Question Answering can Learn from Trivia Nerds | http://arxiv.org/abs/1910.14464v3 |
What Was Written vs. Who Read It: News Media Profiling Using Text Analysis and Social Media Context | http://arxiv.org/abs/2005.04518v1 |
What You Say and How You Say it: Joint Modeling of Topics and Discourse in Microblog Conversations | http://arxiv.org/abs/1903.07319v1 |
What Your Username Says About You | http://arxiv.org/abs/1507.02045v2 |
What are the Goals of Distributional Semantics? | http://arxiv.org/abs/2005.02982v1 |
What are the Statistical Limits of Offline RL with Linear Function Approximation? | http://arxiv.org/abs/2010.11895v1 |
What do Models Learn from Question Answering Datasets? | http://arxiv.org/abs/2004.03490v2 |
What do Neural Machine Translation Models Learn about Morphology? | http://arxiv.org/abs/1704.03471v3 |
What is Learned in Visually Grounded Neural Syntax Acquisition | http://arxiv.org/abs/2005.01678v2 |
What is Local Optimality in Nonconvex-Nonconcave Minimax Optimization? | http://arxiv.org/abs/1902.00618v3 |
What is More Likely to Happen Next? Video-and-Language Future Event Prediction | http://arxiv.org/abs/2010.07999v1 |
What makes a good conversation? How controllable attributes affect human judgments | http://arxiv.org/abs/1902.08654v2 |
What's in a Name? Are BERT Named Entity Representations just as Good for any other Name? | http://arxiv.org/abs/2007.06897v1 |
When Are Tree Structures Necessary for Deep Learning of Representations? | http://arxiv.org/abs/1503.00185v5 |
When BERT Plays the Lottery, All Tickets Are Winning | http://arxiv.org/abs/2005.00561v2 |
When Does Self-Supervision Help Graph Convolutional Networks? | http://arxiv.org/abs/2006.09136v4 |
When Does Unsupervised Machine Translation Work? | http://arxiv.org/abs/2004.05516v3 |
When Explanations Lie: Why Many Modified BP Attributions Fail | http://arxiv.org/abs/1912.09818v6 |
When Hearst Is not Enough: Improving Hypernymy Detection from Corpus with Distributional Models | http://arxiv.org/abs/2010.04941v1 |
When and Why is Unsupervised Neural Machine Translation Useless? | http://arxiv.org/abs/2004.10581v1 |
When deep denoising meets iterative phase retrieval | http://arxiv.org/abs/2003.01792v1 |
When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People? | http://arxiv.org/abs/2004.12043v1 |
Where Are You? Localization from Embodied Dialog | http://arxiv.org/abs/2011.08277v1 |
Where Are the Facts? Searching for Fact-checked Information to Alleviate the Spread of Fake News | http://arxiv.org/abs/2010.03159v1 |
Where's the Question? A Multi-channel Deep Convolutional Neural Network for Question Identification in Textual Data | http://arxiv.org/abs/2010.07816v1 |
Which Tasks Should Be Learned Together in Multi-task Learning? | http://arxiv.org/abs/1905.07553v4 |
Who did What: A Large-Scale Person-Centered Cloze Dataset | http://arxiv.org/abs/1608.05457v1 |
Whodunnit? Crime Drama as a Case for Natural Language Understanding | http://arxiv.org/abs/1710.11601v1 |
Why Non-myopic Bayesian Optimization is Promising and How Far Should We Look-ahead? A Study via Rollout | http://arxiv.org/abs/1911.01004v2 |
Why Normalizing Flows Fail to Detect Out-of-Distribution Data | http://arxiv.org/abs/2006.08545v1 |
Why Overfitting Isn't Always Bad: Retrofitting Cross-Lingual Word Embeddings to Dictionaries | http://arxiv.org/abs/2005.00524v1 |
Why Self-Attention? A Targeted Evaluation of Neural Machine Translation Architectures | http://arxiv.org/abs/1808.08946v3 |
Why Skip If You Can Combine: A Simple Knowledge Distillation Technique for Intermediate Layers | http://arxiv.org/abs/2010.03034v1 |
Why bigger is not always better: on finite and infinite neural networks | http://arxiv.org/abs/1910.08013v3 |
Why is unsupervised alignment of English embeddings from different algorithms so hard? | http://arxiv.org/abs/1809.00150v1 |
Widget Captioning: Generating Natural Language Description for Mobile User Interface Elements | http://arxiv.org/abs/2010.04295v1 |
Wiki-CS: A Wikipedia-Based Benchmark for Graph Neural Networks | http://arxiv.org/abs/2007.02901v1 |
WikiConv: A Corpus of the Complete Conversational History of a Large Online Collaborative Community | http://arxiv.org/abs/1810.13181v1 |
Will I Sound Like Me? Improving Persona Consistency in Dialogues through Pragmatic Self-Consciousness | http://arxiv.org/abs/2004.05816v2 |
Will-They-Won't-They: A Very Large Dataset for Stance Detection on Twitter | http://arxiv.org/abs/2005.00388v1 |
Winning on the Merits: The Joint Effects of Content and Style on Debate Outcomes | http://arxiv.org/abs/1705.05040v1 |
WinoWhy: A Deep Diagnosis of Essential Commonsense Knowledge for Answering Winograd Schema Challenge | http://arxiv.org/abs/2005.05763v1 |
With Little Power Comes Great Responsibility | http://arxiv.org/abs/2010.06595v1 |
Woodbury Transformations for Deep Generative Flows | http://arxiv.org/abs/2002.12229v3 |
Word Embeddings for Chemical Patent Natural Language Processing | http://arxiv.org/abs/2010.12912v1 |
Word Frequency Does Not Predict Grammatical Knowledge in Language Models | http://arxiv.org/abs/2010.13870v1 |
Word Ordering Without Syntax | http://arxiv.org/abs/1604.08633v2 |
Word Rotator's Distance | http://arxiv.org/abs/2004.15003v3 |
Word class flexibility: A deep contextualized approach | http://arxiv.org/abs/2009.09241v1 |
Word-level Speech Recognition with a Letter to Word Encoder | http://arxiv.org/abs/1906.04323v2 |
Word-level Textual Adversarial Attacking as Combinatorial Optimization | http://arxiv.org/abs/1910.12196v4 |
Word-order biases in deep-agent emergent communication | http://arxiv.org/abs/1905.12330v3 |
Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions | http://arxiv.org/abs/2005.01655v1 |
Working Memory Networks: Augmenting Memory Networks with a Relational Reasoning Module | http://arxiv.org/abs/1805.09354v1 |
World Model as a Graph: Learning Latent Landmarks for Planning | http://arxiv.org/abs/2011.12491v1 |
Worse WER, but Better BLEU? Leveraging Word Embedding as Intermediate in Multitask End-to-End Speech Translation | http://arxiv.org/abs/2005.10678v2 |
X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models | http://arxiv.org/abs/2010.06189v3 |
X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers | http://arxiv.org/abs/2009.11278v1 |
XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation | http://arxiv.org/abs/2004.01401v3 |
XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization | http://arxiv.org/abs/2010.06478v1 |
XLNet: Generalized Autoregressive Pretraining for Language Understanding | http://arxiv.org/abs/1906.08237v2 |
XLVIN: eXecuted Latent Value Iteration Nets | http://arxiv.org/abs/2010.13146v2 |
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization | http://arxiv.org/abs/2003.11080v5 |
Xiaomingbot: A Multilingual Robot News Reporter | http://arxiv.org/abs/2007.08005v1 |
XtarNet: Learning to Extract Task-Adaptive Representation for Incremental Few-Shot Learning | http://arxiv.org/abs/2003.08561v2 |
XtremeDistil: Multi-stage Distillation for Massive Multilingual Models | http://arxiv.org/abs/2004.05686v2 |
YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design | http://arxiv.org/abs/2009.05697v2 |
You Impress Me: Dialogue Generation via Mutual Persona Perception | http://arxiv.org/abs/2004.05388v1 |
Zeno++: Robust Fully Asynchronous SGD | http://arxiv.org/abs/1903.07020v4 |
Zero-Resource Translation with Multi-Lingual Neural Machine Translation | http://arxiv.org/abs/1606.04164v1 |
Zero-Shot Cross-Lingual Opinion Target Extraction | http://arxiv.org/abs/1904.09122v1 |
Zero-Shot Stance Detection: A Dataset and Model using Generalized Topic Representations | http://arxiv.org/abs/2010.03640v1 |
Zero-Shot Transfer Learning for Event Extraction | http://arxiv.org/abs/1707.01066v1 |
Zero-Shot Transfer Learning with Synthesized Data for Multi-Domain Dialogue State Tracking | http://arxiv.org/abs/2005.00891v1 |
Zero-Shot Translation Quality Estimation with Explicit Cross-Lingual Patterns | http://arxiv.org/abs/2010.04989v1 |
Zero-shot Sequence Labeling: Transferring Knowledge from Sentences to Tokens | http://arxiv.org/abs/1805.02214v1 |
Zero-shot User Intent Detection via Capsule Neural Networks | http://arxiv.org/abs/1809.00385v1 |
ZeroShotCeres: Zero-Shot Relation Extraction from Semi-Structured Webpages | http://arxiv.org/abs/2005.07105v1 |
doc2dial: A Goal-Oriented Document-Grounded Dialogue Dataset | http://arxiv.org/abs/2011.06623v2 |
emrQA: A Large Corpus for Question Answering on Electronic Medical Records | http://arxiv.org/abs/1809.00732v1 |
giotto-tda: A Topological Data Analysis Toolkit for Machine Learning and Data Exploration | http://arxiv.org/abs/2004.02551v1 |
i-RIM applied to the fastMRI challenge | http://arxiv.org/abs/1910.08952v1 |
iNLTK: Natural Language Toolkit for Indic Languages | http://arxiv.org/abs/2009.12534v2 |
iSarcasm: A Dataset of Intended Sarcasm | http://arxiv.org/abs/1911.03123v2 |
jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models | http://arxiv.org/abs/2003.02249v2 |
k-simplex2vec: a simplicial extension of node2vec | http://arxiv.org/abs/2010.05636v2 |
pyBART: Evidence-based Syntactic Transformations for IE | http://arxiv.org/abs/2005.01306v2 |
scGNN: scRNA-seq Dropout Imputation via Induced Hierarchical Cell Similarity Graph | http://arxiv.org/abs/2008.03322v1 |
schuBERT: Optimizing Elements of BERT | http://arxiv.org/abs/2005.06628v1 |
simNet: Stepwise Image-Topic Merging Network for Generating Detailed and Comprehensive Image Captions | http://arxiv.org/abs/1808.08732v1 |