id author title date pages extension mime words sentences flesch summary cache txt lesk-fragility-2020 lesk-fragility-2020 .docx application/vnd.openxmlformats-officedocument.wordprocessingml.document 4868 364 64 Fragility errors here can arise from many sources; for example, the training data may not be representative of the real problem (if you train a machine translation program solely on engineering documents, do not expect it to do well on theater reviews). Similarly, the New York Times discussed how groups of primarily young white men build systems that focus on their own data and give wrong or discriminatory answers in more general situations (Tugend 2019). Instead of trying to learn more about the characteristics of the system being modeled, the effort is driven by the dictum, "more data beats better algorithms." In a review of the history of speech recognition, Xuedong Huang, James Baker, and Raj Reddy write, "The power of these systems arises mainly from their ability to collect, process, and learn from very large datasets. ./cache/lesk-fragility-2020.docx ./txt/lesk-fragility-2020.txt