id: cord-131667-zl5txjqx
author: Liu, Junhua
title: EPIC30M: An Epidemics Corpus Of Over 30 Million Relevant Tweets
date: 2020-06-09
pages:
extension: .txt
mime: text/plain
words: 4079
sentences: 251
flesch: 48
summary: In this paper, we present EPIC30M, a large-scale epidemic corpus that contains 30 million micro-blog posts, i.e., tweets crawled from Twitter, from 2006 to 2020. Through time-series line plots, we observe that some of the epidemics, namely 2010 Haiti Cholera and 2018 Kivu Ebola, show a surge in tweets before the respective official start dates of the outbreaks, which signifies the importance of leveraging social media for early signal detection. While early detection and warning systems for crisis events may reduce overall damage and negative impacts [31], EPIC30M provides high-volume and timely information that facilitates trend analysis and pattern recognition tasks for epidemic events.
cache: ./cache/cord-131667-zl5txjqx.txt
txt: ./txt/cord-131667-zl5txjqx.txt