title: Methodology over metrics: current scientific standards are a disservice to patients and society
authors: Van Calster, Ben; Wynants, Laure; Riley, Richard D; van Smeden, Maarten; Collins, Gary S
journal: J Clin Epidemiol
date: 2021-05-30
DOI: 10.1016/j.jclinepi.2021.05.018

Covid-19 research made it painfully clear that the scandal of poor medical research, as denounced by Altman in 1994, persists today. The overall quality of medical research remains poor, despite longstanding criticisms. The problems are well known, but the research community fails to properly address them. We suggest that most problems stem from an underlying paradox: although methodology is undeniably the backbone of high-quality and responsible research, science consistently undervalues methodology. The focus remains more on the destination (research claims and metrics) than on the journey. Nevertheless, research should serve society more than the reputation of those involved. While we notice that many initiatives are being established to improve components of the research cycle, these initiatives are too disjointed. The overall system is monolithic and slow to adapt. We assert that top-down action is needed from journals, universities, funders and governments to break the cycle and put methodology first. These actions should involve the widespread adoption of registered reports, balanced research funding between innovative, incremental and methodological research projects, full recognition and demystification of peer review, improved methodological review of reports, adherence to reporting guidelines, and investment in methodological education and research. Currently, the scientific enterprise is doing a major disservice to patients and society.

The academic world quickly responded to the covid-19 situation and produced a staggering amount of research publications. Whilst nobody would disagree that we need organized collaborative research efforts to study disease prevention, management and treatment, the reality is that large swathes of covid-19 research (including pre-prints and peer-reviewed articles) are of poor quality, and this mirrors the quality of medical research in general [1,2]. For example, more than 230 prediction models have been published for the diagnosis of covid-19 infection or for predicting prognosis in infected patients [3]. A systematic review and critical appraisal of these models found that nearly all were at high risk of bias due to shortcomings in design, analysis and reporting. It was therefore not possible to judge whether the authors' conclusions on performance were trustworthy, casting doubt on whether the models are safe to use [3]. Another example involves research on the treatment effect of hydroxychloroquine for covid-19 patients: early reports claiming positive effects were severely criticized for multiple serious methodological flaws [4]. We describe additional examples in the Supplementary Material.

Poor quality research can result from poor design, conduct, or reporting, and leads to 'research waste': it has little value for patients and society, and can even be harmful if it forms the basis for making decisions [5]. The flawed research on hydroxychloroquine at the beginning of the pandemic affected policy, jeopardized access for patients with indicated uses for the product, and hampered recruitment of patients in subsequent research [6]. Research waste is not confined to covid-19, but has been steadily accumulating for decades. In 1994, Doug Altman wrote the provocatively titled article 'The scandal of poor medical research' [7]. This paper could have been written today, without changing a single word. Despite being repeatedly denounced by multiple scientists [8-17], research waste remains a persistent, structural and costly problem resulting from how academia works.
We argue that the core problem is a paradox: methodology, the very backbone of science, remains trivialized by the scientific community that funds, undertakes and reports (pre)clinical research. This paradox is endemic and needs to be eradicated. Systemic changes to improve science can only be effective if they are enforced top-down and grounded in the recognition that this paradox is unacceptable.

The current organization of the scientific enterprise is business-like, with a strong focus on procedures and box-ticking to ensure that the system remains operational. This has unfortunate but well known consequences, of which we describe six in Table 1: (1) research incentives focus on quantity rather than methodological quality, (2) funders and journals prioritize novelty over incremental and replication research, (3) researchers' agendas are dictated by short-term deadlines, (4) peer review remains unacknowledged, (5) methodological illiteracy is still accepted, and (6) transparent and complete reporting remains rare. This situation maintains and reinforces dubious methodological practices, including poor design and preparation, manipulation of data and analysis procedures, incomplete and selective reporting, HARKing (hypothesizing after the results are known), spin, publication bias, salami-slicing, and reluctance to take corrective action after publication (Table 2). As a result, incorrect findings may be presented as novel insights, leading to poorly founded opinions that require significant effort to be debunked.

Table 1. Consequences of the current, business-like organization of the scientific enterprise

Research incentives focus on quantity rather than methodological quality. Scientists are rewarded for rapidly churning out publications that are often poorly designed or use poor-quality data [8-12]. Research evaluations also focus on journal prestige (e.g., the impact factor), the number of citations (e.g., the H index), and the amount of attention a publication receives (e.g., the Altmetric score). Unfortunately, these metrics have only a modest and inconsistent association with quality [18,19].

Funders and journals prioritize novelty over incremental and replication research. Funding calls often focus on innovative (though high-risk) ideas, sometimes requiring a guarantee that the project will succeed. Such a guarantee typically comes from (often unfunded) preliminary results. These requirements encourage researchers to run before they can walk. Funders and journals often do not prioritize incremental and replication research because of a perceived lack of novelty. Yet incremental and replication research is essential to confirm, expand, or refute reported breakthroughs [14,20,21].

Researchers' agendas are dictated by short-term deadlines. Researchers are confronted with numerous deadlines related to grant proposals, conference submissions, training requirements, and doctoral dissertations. For all of these deadlines, it is commonplace to present some study findings. To fulfill this demand, methodological quality is often compromised. Examples include premature termination of patient recruitment, unplanned interim analyses, use of poorly cleaned data, and small and poorly conceived studies. Such shortcuts lead to the dissemination of misleading or premature results [12].

Peer review remains unacknowledged. Peer review is one of the few stages in the scientific process where the quality of research plans and findings can be evaluated in detail [23]. In reality, peer review is largely carried out without recognition, and the quality of peer review reports varies considerably [24,25]. The popularity of the pre-print approach, in which study reports are disseminated prior to being peer reviewed for the sake of openness, is therefore likely to backfire, in particular given the recent and concerning trend of accompanying such reports with a press release.

Methodological illiteracy is still accepted. It is a persistent problem that many researchers know too little about methodology, and many studies are conducted with little or no involvement of methodologists/statisticians adequately trained for the research at hand [26].

Transparent and complete reporting remains rare. While such reporting is vital for understanding and reproducibility, systematic reviews repeatedly indicate that reporting remains incomplete [27]. Journals play a role as well, for example by enforcing strict word limits, encouraging 'brief reports', discouraging supplementary material, or applying charges per page.

Table 2. Practices resulting from prioritizing publication appearance over publication quality

Poor study preparation and design. Many studies are poorly designed and ill-prepared, with an insufficiently detailed or inaccessible research protocol (if one exists at all) [10,28]. While intervention studies in humans more often have a protocol than other studies, the mere presence of a protocol does not automatically imply that all research team members adhere to it, or that the study is well designed. Design problems include an inappropriate control group, selection bias, small sample sizes, and failure to use appropriate statistical tools.

Data or analysis tweaking (e.g., p-hacking). Many publications contain results that are not fully honest, obtained by tweaking the data or the analysis procedures, or even by fabricating data [13,29]. A particular phenomenon is p-hacking, where researchers experiment with statistical approaches and inclusions/exclusions of participants until a statistically significant result is obtained [30].

Incomplete reporting. Key information needed to understand how a study was carried out and what was found is often simply not mentioned in publications [27]. Poor reporting can make results unusable or uninterpretable, which subverts the hard work of setting up and conducting the study.

Selective reporting. Many publications suffer from selective reporting by focusing on the most interesting or surprising results [31]. For example, in publications from clinical trials, endpoints that were not prespecified are often added, and endpoints that were prespecified are left out without justification [32].

Spin. The interpretation and conclusions of study results are often too strong even after peer review, a phenomenon called 'spin' [33]. Spin is also seen in the tendency to use more positive words in abstracts, and to use exaggerated claims when disseminating research results to (social) media [34]. This can lead to overinterpretation and the spread of exaggerated beliefs that take much more time to debunk.

Publication bias. Manuscripts that report studies with less appealing or 'negative' results have historically been less likely to be submitted for publication and accepted by journals than other manuscripts. This is the well-known and long-standing problem of publication bias [35]. It is a major ethical problem, because it seriously distorts the evidence base and hence our knowledge on the effectiveness of interventions. In addition, participants in unpublished trials (tens of thousands of patients) have been exposed to risk and inconvenience for no good reason [36]. Alongside publication bias, studies with positive results also tend to be cited more frequently ('citation bias'), which may further distort the evidence base [37].

HARKing (hypothesizing after the results are known). HARKing means that parts of a publication (such as the introduction and the hypothesis) are written to accommodate the final results [38].

Salami-slicing. The data resulting from a study are often presented in multiple publications that are highly similar. The study results are split into 'minimal publishable units' beyond what is reasonable. For example, researchers may write several papers by simply changing the outcomes or variables of interest for each paper.

Reluctance to take corrective action post hoc. Published papers frequently contain errors, yet journals are not always eager to take corrective action when errors are highlighted [32]. Incorrect or flawed research is often not even highlighted: letters to the editor are not very common and often have strict word limits, and author replies to such letters are typically defensive and dismissive.
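The false-positive inflation behind the p-hacking entry in Table 2 is easy to demonstrate with a small simulation. The sketch below is illustrative only and is not part of the original article; it assumes Python with numpy and scipy, and the group sizes and number of outcomes are hypothetical. It mimics a researcher who, on data containing no true effects, tests ten outcomes per study and reports a finding whenever any single test reaches p < 0.05.

```python
# Minimal sketch (not from the paper): how testing many outcomes and reporting
# only the "significant" one inflates the false-positive rate well beyond 5%.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2021)
n_simulations = 5000   # number of simulated studies (hypothetical)
n_per_group = 30       # participants per arm (hypothetical)
n_outcomes = 10        # outcomes tried per study (hypothetical)

false_positive_studies = 0
for _ in range(n_simulations):
    # Under the null hypothesis: no true treatment effect for any outcome.
    treatment = rng.normal(size=(n_per_group, n_outcomes))
    control = rng.normal(size=(n_per_group, n_outcomes))
    p_values = [stats.ttest_ind(treatment[:, j], control[:, j]).pvalue
                for j in range(n_outcomes)]
    # "p-hacked" study: report a positive finding if any outcome is significant.
    if min(p_values) < 0.05:
        false_positive_studies += 1

print("Nominal type I error per test: 0.05")
print(f"Proportion of null studies reporting a 'significant' outcome: "
      f"{false_positive_studies / n_simulations:.2f}")
# With 10 independent outcomes this is roughly 1 - 0.95**10, i.e. about 0.40.
```

Run as-is, the observed rate is around 40%, eight times the nominal 5% level, which is why undisclosed analytic flexibility distorts the literature even when no data are fabricated.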
In 1949, Luykx wrote that "whenever quantitative data play a part in a piece of research, the experimental design as well as the statistical analysis cannot receive too much emphasis, before, during and on completion of the project" [26]. Indeed, to find trustworthy answers to research questions, robust methodology plays a fundamental role from study planning to study reporting. We argue that the persisting problem of research waste follows from science paradoxically undervaluing its own backbone [39]. As long as methodological quality is not needed to publish papers, get promoted, or acquire funding, it remains an easy target for negligence [39]. The acceptance of low-quality research and academia's focus on output quantity may even lead to an adverse selection, in which researchers who adhere to high methodological standards (and hence favor quality over quantity) experience negative effects on their career opportunities [8].

We posit that health researchers should have at least a rudimentary understanding of research methodology and statistics, but should often not conduct these aspects of a study by themselves. Rather, it should be commonplace that the methodological aspects of a research study are led by researchers with dedicated training and experience for the type of research at hand. It has long been argued that quantitative research should involve methodologists or statisticians from conception to reporting, yet this remains too uncommon [15,26,40,41].
Likewise, applied statisticians should involve clinical experts, acquire sufficient knowledge of the clinical problem they are addressing, and have command of the methodology required to address the research question, because each problem has peculiarities that can affect study design and subsequent analysis. A stronger focus on methodology also implies that statisticians and non-statisticians alike should be educated in statistical thinking (and critical appraisal), not just in the mechanics, or even mindless rituals, behind statistical calculations [16]. Statistics and methodology training should discuss how studies are designed, and how research questions are translated into study procedures, data collection processes, and analysis tools.

Failure to uphold methodological standards leads to genuine ethical problems. It is unethical to expose humans or animals to any risk or inconvenience in research that is methodologically unsound [36,42]. There are many examples of how poor methodology may lead to exaggerated and even false claims [36,43,44]. Poorly conducted studies, including retrospective studies that re-use existing data, tarnish the literature with untrustworthy knowledge that may eventually harm patients, and are a misuse of public funding. All researchers should undertake efforts to improve medical science. For example, appropriate mentorship of young researchers is an important factor in fostering research integrity, setting standards, and encouraging accountability [29,45].

Over time, many scientists have established dedicated initiatives to improve the methodological quality of research (see the list of examples in Table 3 and other literature [17,24]). We highly welcome and value reproducibility networks, the EQUATOR Network focusing on transparent reporting, the Center for Open Science and its activities such as the promotion of the registered report system, the DORA statement and Hong Kong principles for research(er) evaluation, the STRATOS initiative providing evidence-based guidance on methodological practices for observational research, and the FAIR principles for data sharing. Such initiatives are invaluable to increase the sense of urgency among all stakeholders, and examples of their impact are available [24]. While important, these initiatives are disjointed and constitute bottom-up change, typically requiring researchers to stumble across an initiative before they can embed it in their own research practices. Achieving change in this way is difficult, as each individual is part of the scientific environment with all its interrelations and interests. This environment is slow to adapt.

We therefore believe a paradigm shift is needed in which all aspects of trustworthy research are broadly taught, valued, enforced, and carried out. This shift should be advanced by top-down action from governments (which set subsidy rules for the institutions under their wing), universities, funders, and journals, to break the cycle and condemn poor methodology. We are aware that the activities of stakeholders are inextricably linked, such that major process changes immediately affect other parts of the scientific environment. We believe that the following actions, if enforced top-down, would positively impact the quality of medical research.
The Declaration to Improve Biomedical and Health Research recently called for three measures: mandatory registration of interests, uptake of registered reports by all journals and funders, and pre-registration and publication of all publicly funded research on a WHO-affiliated research registry [52]. The registered report scheme is indeed a valuable approach (Table 3): studies are evaluated based on the research question and the proposed methodology to address this question [2,49]. Registered reports can be linked with journals, but also with funders in a reproducible research grant model. The funding body then has transparency regarding the specific research that is funded, and a near-guarantee of publication (https://www.cos.io/initiatives/registered-reports). This will compel investigators to include research methodologists and statisticians in their projects from the start. The format is also ideal for replication studies, where the study design is largely determined by the original study.

Research funding should also move away from short-termism and hype, and should have robust scientific advancement in mind. There needs to be greater balance in the funding of all types of research, including incremental and replication research [14,22]. It is a crucial aspect of responsible science that novel claims are corroborated in new data. In addition, by focusing on the (high-risk) novelty of the research question, methodology will often play a minor role in the decision to allocate funding, and perhaps also during eventual study conduct. We therefore contend that methodological quality deserves a more prominent role in funding decisions. It should become standard to allocate funding for methodological and statistical support of clinical research, as well as for research focused on applied methodology and medical statistics.

Performing high-quality peer review is as important as conducting and publishing studies. Peer review should therefore have defined and accepted quality standards, be addressed in the education of researchers, be a full part of researchers' job descriptions, and be appropriately recognized by academic institutions.

Table 3. Examples of initiatives to improve the methodological quality of research

Lancet series on research waste (2014). Seventeen recommendations were provided for researchers, academic institutions, scientific journals, funding agencies and science regulators [46]. In 2016, it was noted that the series had had an impact, but only hesitantly [46]. For example, with respect to being fully transparent during every stage of research, researchers mentioned issues such as lack of time, lack of benefit, and fear of being scooped.

Hong Kong principles for research assessment. The Hong Kong principles focus on responsible research practices, transparent reporting, open science, valuing a diversity of research, and recognizing all contributions to research and scholarly activity [24]. Examples of specific initiatives consistent with each principle are provided. These principles build on earlier efforts such as DORA (www.sfdora.org), which has been signed by about 2,000 organizations and more than 15,000 individuals, indicating widespread support among academics.

EQUATOR Network. The EQUATOR Network (www.equator-network.org) hosts a library of reporting guidelines for a wide range of study designs and clinical research objectives, as well as for preparing study protocols [47]. These guidelines are continuously updated and amended where necessary. There is no excuse for not following the most relevant guideline(s) when preparing a manuscript.

STRengthening Analytical Thinking for Observational Studies (STRATOS). The STRATOS initiative unites methodological experts to prepare guidance documents on the design and analysis of observational studies (www.stratos-initiative.org). Guidance documents are prepared at different levels, in order to reach non-statisticians as well as practicing statisticians.

Center for Open Science (COS). COS is a center whose mission is to 'increase openness, integrity, and reproducibility' of research (cos.io) [48]. COS aims to achieve this through meta-research (studying and tracking the state of science), infrastructure (see, e.g., the Open Science Framework, osf.io), training, incentives, and collaboration/connectivity. They have referred to their vision as scientific utopia.

Study registries. Study registries make study information publicly available at the start of the study, to improve transparency and completeness and to allow comparison with resulting publications (e.g., clinicaltrials.gov, crd.york.ac.uk/prospero). Registration is widely established for interventional studies and is slowly getting more attention for observational studies. Recently, initiatives for animal studies have been started (https://preclinicaltrials.eu/, http://animalresearchregistry.org/).

Registered reports. COS has introduced the registered reports system (https://www.cos.io/our-services/registered-reports): papers undergo peer review before data collection, based on the research questions and the proposed methodology [49]. If the study is considered to be of high methodological quality, it is provisionally accepted for publication, provided the authors adhere to the methodology as registered. Currently 244 journals, including medical journals, accept this publishing format.

Transparency and Openness Promotion (TOP) guidelines. TOP, also under the umbrella of COS, provides guidelines to support journals' policies for the publication of papers (https://www.cos.io/our-services/top-guidelines) [48].

Findability, Accessibility, Interoperability, and Reusability (FAIR) principles. FAIR provides guiding principles for data sharing, which is important for the transparency and utility of research projects [50]. To date, journals and researchers still show considerable reserve towards data sharing [51]. As long as academia emphasizes quantity rather than quality, there will be concern that others will take advantage of the effort to collect (high-quality) data [46]. Furthermore, privacy and intellectual property issues are important additional bottlenecks.

Methodological/statistical reviewing. Several medical journals recognize the importance of methodological review (e.g., by statisticians or information specialists/librarians), although the implementation varies widely. Some journals decide on an ad hoc basis when statistical input is required, although this decision may itself require statistical input. Some journals include statisticians on the editorial board, whilst others hire a team of statisticians and methodologists.

Reviewer recognition (e.g., Publons). Initiatives such as Publons (www.publons.com) aim to increase recognition for doing peer review. Such initiatives are a good start, although the question remains what peer reviewers really get out of it.

Replication funding. The Dutch Research Council (www.nwo.nl) offers grants for replication studies of 'cornerstone research' (https://www.nwo.nl/onderzoeksprogrammas/replicatiestudies).

All mentioned URLs were accessed on May 23rd, 2021.
Applied journals should attach more importance to methodological review of submitted manuscripts, as study findings are largely irrelevant if the study is flawed [2,39]. For example, journals may employ a team of qualified people with different methodological expertise (e.g., statisticians, epidemiologists, information specialists/librarians, systematic reviewers). One may think of a staged process, in which detailed methodological/statistical review is performed once the clinical value of the paper has been confirmed. When the editorial board includes a statistician, this person may further select manuscripts that require detailed methodological and statistical peer review.

Using an appropriate reporting guideline should be mandatory, and adherence should be monitored by journals. As a minimum, journals should ensure that all accepted articles have an accompanying completed reporting guideline checklist (corresponding to the accepted article) that is checked for completeness and accuracy prior to publication. We also strongly discourage the policy, adopted by several journals, of putting the methods section of a publication in a smaller font at the end, because this wrongly suggests that results matter and methods are uninteresting. We reiterate that study findings based on flawed design, methodology and analysis are largely meaningless; understanding the methods before reading the findings is therefore important in the flow of reading a published article.

All quantitative studies should (ideally) involve a methodologist or statistician. It has been claimed that there is a lack of qualified statisticians as well as a lack of access to them [53]. Investment is therefore needed to support study programs and research projects in the field of research methods and meta-research [2]. Another route to address a shortage of qualified methodologists/statisticians would simply be to conduct fewer studies: this would provide more breathing space to ensure the methodological quality of the studies that are conducted.

Since Altman's 1994 paper, the problem of poor research has persisted, and arguably deteriorated further. It is our view that research quality is not taken seriously enough, damaging the scientific reputation of medical research. Science should not be a game in which we collect credits to reach the next level of our career. We know that research waste is a multi-stakeholder problem involving researchers, institutions, governments, journals, and funding agencies [14,46]. Recommendations for stakeholders have been issued repeatedly, but change is modest and slow [14,17,24,46]. In this way, despite being strongly sponsored by public money, the scientific enterprise is doing a major disservice to patients and society. Rigorous methodology is critical, and this needs to be imposed top-down without compromise.

Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.jclinepi.2021.05.018.

References

[1] Waste in covid-19 research
[2] Science after Covid-19: faster, better, stronger?
[3] Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal
[4] COVID-19 coronavirus research has overall low methodological quality thus far: case in point for chloroquine/hydroxychloroquine
[5] Avoidable waste in the production and reporting of research evidence
[6] Rise and fall: hydroxychloroquine and COVID-19 global trends: interest, political influence, and potential implications
[7] The scandal of poor medical research
[8] Current incentives for scientists lead to underpowered studies with erroneous conclusions
[9] Academic criteria for promotion and tenure in biomedical sciences faculties: cross sectional analysis of international sample of universities
[10] Increasing value and reducing waste in research design, conduct, and analysis
[11] Reducing waste from incomplete or unusable reports of biomedical research
[12] Ten steps towards improving prognosis research
[13] Researcher requests for inappropriate analysis and reporting: a U.S. survey of consulting biostatisticians
[14] Reproducibility in science: improving the standard for basic and preclinical research
[15] How statistical expertise is used in medical research
[16] We need statistical thinking, not statistical rituals
[17] Research waste is still a scandal: an essay by Paul Glasziou and Iain Chalmers
[18] Journal impact factor, trial effect size, and methodological quality appear scantly related: a systematic review and meta-analysis
[19] Some limitations of the H index: a commentary on Ruscio and colleagues' analysis of bibliometric indices
[20] Rewarding replications: a sure and simple way to improve psychological science
[21] Empirical evaluation of very large treatment effects of medical interventions
[22] Universal funder responsibilities that advance social value
[23] The troubles with peer review for allocating research funding
[24] The Hong Kong Principles for assessing researchers: fostering research integrity
[25] How often do leading biomedical journals use statistical experts to evaluate statistical methods? The results of a survey
[26] Progress without statistics
[27] Using the CONSORT statement to evaluate the completeness of reporting of addiction randomised trials: a cross-sectional review
[28] Improving the transparency of prognosis research: the role of reporting, data sharing, registration, and protocols
[29] Misconduct policies, academic culture and career stage, not gender or pressures to publish, affect scientific integrity
[30] The extent and consequences of p-hacking in science
[31] Evidence for the selective reporting of analyses and discrepancies in clinical trials: a systematic review of cohort studies of clinical trials
[32] COMPare: a prospective cohort study correcting and monitoring 58 misreported trials in real time
[33] Spin in scientific publications: a frequent detrimental research practice
[34] The association between exaggeration in health related science news and academic press releases: retrospective observational study
[35] Fate of clinical research studies after ethical approval: follow-up of study protocols until publication
[36] Harms from uninformative clinical trials
[37] Scientific citations favor positive results: a systematic review and meta-analysis
[38] When does HARKing hurt? Identifying when different types of undisclosed post hoc hypothesizing harm scientific progress
[39] Avoidable waste of research related to inadequate methods in clinical trials
[40] Bridging clinical investigators and statisticians: writing the statistical methodology for a research proposal
[41] Participation of epidemiologists and/or biostatisticians and methodological quality of published controlled clinical trials
[42] What makes clinical research ethical
[43] Reporting of a publicly accessible protocol and its association with positive study findings in cardiovascular trials (from the Epidemiological Study of Randomized Trials [ESORT])
[44] Influence of reported study design characteristics on intervention effect estimates from randomized, controlled trials
[45] The ego has landed! What can be done about research misconduct, scandals, and spins?
[46] Increasing value and reducing waste in biomedical research: who's listening?
[47] EQUATOR: reporting guidelines for health research
[48] Scientific standards: promoting an open research culture
[49] What's next for registered reports?
[50] The FAIR Guiding Principles for scientific data management and stewardship
[51] Data-sharing recommendations in biomedical journals and randomized controlled trials: an audit of journals following the ICMJE recommendations
[52] Reducing bias and improving transparency in medical research: a critical overview of the problems, progress so far and suggested next steps
[53] Quality research in healthcare: are researchers getting enough statistical support?