id: work_oqwzwsqahzdmxjlup3t334s4ei
author: Ellie Pavlick
title: Inherent Disagreements in Human Textual Inferences
date: 2019
pages: 18
extension: .pdf
mime: application/pdf
words: 11324
sentences: 883
flesch: 58
summary: Inherent Disagreements in Human Textual Inferences to current work on learning common sense human inferences from hundreds of thousands of train models to make the inferences that a human multiple human raters to label pairs of sentences, should explicitly incentivize models to predict distributions over human judgments. not licensed in a specific context, instead advocating that annotation tasks should be "natural" for goal in NLP is to train models that reverse-engineer the inferences a human would make humans use to draw inferences from natural language, but merely: Left to their own devices, To perform our analysis, we collect NLI judgments at 50× redundancy for sentence pairs drawn that, for many sentence pairs, human judgments Figure 4: Examples of sentence pairs with bi-modal human judgment distributions. the human labels, in one case because the model do not learn human-like models of uncertainty large annotated corpus for learning natural language inference.
cache: ./cache/work_oqwzwsqahzdmxjlup3t334s4ei.pdf
txt: ./txt/work_oqwzwsqahzdmxjlup3t334s4ei.txt