id: cord-306375-cs4s2o8y
author: Costa-Santos, C.
title: COVID-19 surveillance - a descriptive study on data quality issues
date: 2020-11-05
pages:
extension: .txt
mime: text/plain
words: 5151
sentences: 252
flesch: 52
summary: Nevertheless, to our knowledge, no study has performed a structured assessment of data quality issues in the datasets provided by National Surveillance Systems for research purposes during the COVID-19 pandemic. This updated database had an inconsistent manifest: some variables were presented in a different format (for example, instead of a single variable with the patient's outcome, the second dataset presented two dates, the death date and the recovery date), and others had different definitions (for example, the variable age was defined as age at the time of COVID-19 onset in the first dataset and as age at the time of COVID-19 notification in the second). This raised concerns about the use of these data for valid research and about replication of the analysis made using the first version of the data. The DGSAugust dataset included 38,520 COVID-19 cases diagnosed between March and June, 4,003 fewer cases (9%) than the daily public report provided by the Portuguese Directorate-General of Health.
cache: ./cache/cord-306375-cs4s2o8y.txt
txt: ./txt/cord-306375-cs4s2o8y.txt