Evidence Based Library and Information Practice

The Intrinsic Uncertainty of Research Integrity

The year 2012 was a good year for research fraud, or at least a good year for illustrating what the eventual outcomes of research fraud can be. In February, anesthesiologist Yoshitaka Fujii was dismissed from Toho University for having fabricated data for at least 172 research articles. In November, the University of Kentucky and the U.S. federal government brought to a close the case against Eric Smart, a diabetes and cardiovascular disease specialist, who had fabricated or falsified data in 21 articles and other research documents. Also in November, the final report appeared in the case of Diederik Stapel, a social psychologist at Tilburg University, who had published 55 fraudulent articles and infected the dissertations of many of his doctoral students, who had based their work in part on his fabricated and falsified data sets. Less than two months before the appearance of the Stapel report, another investigatory committee had submitted its follow-up report on the affair around the internationally respected cardiologist Don Poldermans, whom the Erasmus Medical Center had fired in late 2011 for research misconduct.

There was a time when researchers as a matter of course upheld the “pretense that research misconduct is too rare to matter” (Macilwain, 2012a, p. 1417). Yet many have now come to suspect, even openly to proclaim, that those cases of research misconduct which are in fact eventually exposed probably amount to only the tip of the iceberg, and that the actual detrimental effects of that misconduct on science and scholarship are already quite substantial. Such suspicions and pronouncements are not based solely on subjective impression or anecdotal evidence. There now exists a small but growing body of research into the extent and diversity of the problem. Meta-studies carried out in recent years have indicated that some measure of intentional research misconduct (in the widely accepted sense of fabrication, falsification, or plagiarism in the proposing, performing, or reporting of research) is at work within perhaps 10% to 20% of all research, while additional questionable research practices (QRPs) bring to over 50%, at minimum, the share of the research effort likely to be producing misleading, erroneous, or altogether worthless results (Fanelli, 2009). Frequently, it is the highly prestigious peer-reviewed journals that publish these results. Fang, Steen, and Casadevall (2012) found that well over half of the articles indexed in PubMed as retracted had been retracted on grounds of actual or suspected research misconduct. The study by John, Loewenstein, and Prelec (2012) yielded, for psychology, a misconduct and/or QRP rate higher than 90%. Their findings suggest that certain QRPs “may constitute the de facto scientific norm” (p. 524). Many researchers apparently consider such practices a necessity in order for them to survive in their work (Martinson as cited in Bonetta, 2006, p. 875). An extra complication for the scholarly enterprise is that the discovery of misconduct is not in itself enough. As Hernon and Altman (1999) write, “we know that only a few studies are discredited; an overwhelming majority remain in the literature untainted, even though their falsity has been ascertained; and that many continue to be cited for years after the misconduct has been exposed” (p. 402).

A Growing Concern

Those who have studied questions of research integrity are fond of pointing out that misconduct by no means remains confined to those fields in which its exposure has happened to receive the greatest publicity, or in which it has aroused the greatest public interest or concern. It can, and presumably does, occur in every field – certainly in any field where empirical research plays an important role. Furthermore, to a certain extent it is by nature self-perpetuating. According to John, Loewenstein, and Prelec (2012), the unrealistically elegant results achieved through research misconduct and other QRPs “can lead to a ‘race to the bottom,’ with questionable research begetting even more questionable research” (p. 531). But is it in fact the case that fraudulent practices are on the increase? Fang, Steen, and Casadevall (2012), at any rate, speak of an “ongoing retraction epidemic”, and state that the “percentage of scientific articles retracted because of fraud has increased ∼10-fold since 1975” (p. 17028). Such a finding is indeed very much in line with the widespread opinion among researchers and other stakeholders, in various fields, that research misconduct is not only on the rise but also becoming easier to commit successfully (according to Stapel himself, too easy), while “the probability of being found out is minimal” (Stroebe, Postmes, & Spears, 2012, p. 682). The director of the U.S. Office of Research Integrity openly admits that “there are also more and more ways for people who want to cheat to do so” (Wright as cited in Macilwain, 2012a, p. 1419). Something, then, has got to be done. It has gradually become clear that science and scholarship are not self-correcting, at least not sufficiently. Macilwain (2012b) speaks of “a generation of denial” which has come to an end, now that the worldwide research community is finally taking research misconduct seriously and has put the development and implementation of countermeasures firmly on the agenda (Heijden et al., 2012; InterAcademy Council & IAP, 2012; Panel on Responsible Conduct of Research, 2011; Second World Conference on Research Integrity, 2010).

The Case of Library and Information Research

Nonetheless, there are disciplines, certainly in the social sciences and humanities, which have never concerned themselves much, if at all, with the question of research misconduct within their ranks, and continue largely to ignore it as even a potential problem. One of these is – ironically, one might well think – library and information studies (LIS). In a search of this discipline’s literature, I could locate only two publications (Burke et al., 1996; Curry, 2005) that touch more than perfunctorily on the question. The former characterizes research misconduct as not demonstrably an issue in LIS and unlikely to become one; the latter suggests that, though it most likely should be an issue, LIS professionals will probably remain collectively unwilling to treat it as one. The modern era of publicity and public concern regarding serious violations of research integrity began in the early 1980s, at the time of the Darsee affair. It took another fifteen years before the LIS literature produced its first publication broaching the subject of possible research misconduct in its own field. That publication, a speculative but noncommittal editorial, did at least assert an ambition to “elicit further conversation”, as well as possibly “a review article which would inform us all in more depth on this important topic” (Burke et al., 1996, p. 200). The further conversation seems never to have materialized, and that review article has still to be written. Hernon and Calvert (1997) assumed in passing that there was “probably not” a “serious problem” (p. 88) in our field, but wondered whether it wouldn’t at least be a good idea to conduct an up-to-date review of relevant standards, policies, and procedures – a review which to my knowledge also never took place. Even with Curry’s subsequent contribution, our knowledge remains no deeper than it was in 1996, and since 2005 there has again been nothing but silence on the topic.

Given the entire absence of any research into, or even of any informed speculation on, the extent and nature of possible research misconduct in LIS, we can only speculate concerning the actual situation. Clearly, fraud and other forms of research misbehaviour are a proven and acknowledged factor in the worlds of medical, psychological, biological, and physics research, to name but a few obvious examples. That they should then somehow be absent from the world of LIS research seems improbable in the extreme. But do we at least have reasons to believe that they are probably less prevalent in LIS than in, for example, the fields just named above? Yes, we do. Do we have reasons to believe that they may be more prevalent in LIS than in those and other fields? Yes, we have those as well. There are good arguments which one could advance in support of either view, based on all we have learned from the many published descriptions, investigations, and analyses of known cases of fraudulent researchers in numerous disciplines. Or is it perhaps better, at least until further notice, simply to operate on the working assumption that the LIS research world is a more or less normal research world, and thus provisionally to infer that at least one in every ten LIS research studies may well to some degree be fraudulent, while at least half of them will have incorporated one or more questionable research practices? Pending the kind of deeper understanding which Burke et al. (1996) had hoped would be forthcoming, but has not been, such a working assumption and such an inference would indeed not appear to be an irresponsible choice.

But do we not in fact owe it to the profession to go further than that? It is now seventeen years since the entire editorial board of Library & Information Science Research identified the issue of fraud in LIS research as an “important topic” about which more should be known, yet the profession’s reaction has remained one perhaps best described by library/information school professor William Fisher (1999) when he wrote, “we are fortunate these practices do not seem to be a major problem for the LIS literature, so we will not dwell on them” (p. 66). But ought we really to just go on cheerfully about our business while contenting ourselves with the conclusion that research misconduct does not seem to be a major problem in our neck of the woods? Fisher cites no evidence and adduces no arguments that might serve to justify such a relatively unconcerned attitude. If it is justified, then we should at least be able to point out how we know that it is. If it is not justified, then the sooner we know that the better. The sooner we are in a position to estimate the extent and to begin to describe the nature of the problem in our field, the better off we, and the field, will be. The same goes for the detection and the investigation of specific cases – not so much out of a desire to stigmatize or to penalize wayward colleagues, as out of a sense of obligation to cleanse and correct the research record where appropriate. And let us not overlook a consideration of equal or in fact even greater importance. As Bosch (2012) points out, “The details of such cases also highlight what future action is needed to prevent similar misconduct” (p. 1680).

No Evidence without Integrity

If research misconduct is in principle an “important topic” for the LIS research community at large, one would think it ought to be a matter of particular concern to anyone with even a casual interest in evidence based practice (EBP), to say nothing of committed EBP advocates or practitioners. Yet up to now, quite remarkably it seems to me, there has been no indication, indeed hardly the slightest hint, that such is the case. To what extent, and how, do fraud and other QRPs actually impact upon the evidentiary value of the research literature – in LIS or for that matter anywhere else? Here again, we cannot but resort to speculation. To my knowledge, there exist no more than two publications (Lelgemann & Sauerland, 2010; Neugebauer, Becker, Sauerland, & Laubenthal, 2009) which have addressed the relationship between research misbehaviour and evidence based practice. Both deal explicitly with the establishment of specialized clinical guidelines, and have only limited relevance for the LIS domain.

Given the situation sketched above, LIS professionals set on founding their practice upon the best available evidence from research may be tempted to respond by arguing that the factor of potentially fraudulent research may indeed render the task confronting us a bit more complicated and challenging than we had previously imagined but that, even so, we as EBPers already have an instrument capable of effectively dealing with that task. If only we persist in our commitment to a rigorous and systematic habit of critical appraisal of all potentially pertinent evidence, there should be little reason for us to fear any contaminating influence of research misconduct on the decisions that we take. Comforting as this reassurance may at first sound, its validity is unfortunately open to serious doubt. As already noted, the record of success in detecting probable scientific misconduct, to say nothing of conclusively proving such misconduct, has been decidedly poor. As Trikalinos, Evangelou, and Ioannidis (2008) point out, “There are no strong alert signs to hint that a paper is fraudulent. ... Overall, a fraudulent article looks much the same as a nonfraudulent one. ... Even blatant papers of falsification may require careful scrutiny to be revealed” (p. 469). And no wonder. Those who have ultimately been exposed as, or who have eventually confessed to being, perpetrators of fraud have tended to be highly competent or even unusually talented researchers, not infrequently the holders of important positions within prestigious institutions. Such wrongdoers can be very adept at masking their own violations of research integrity, and have at their disposal the facilities and influence which support them in doing so. This has long been known. Marathe (1989) painted an insightful portrait of the typical dishonest researcher as a highly intelligent person with a good reputation, who is “much aware of his [or her] competence,” “used to success,” and “does not intend to be caught” (p. 259). Under these circumstances, we can hardly expect that standard critical appraisal routines will normally be able to lay bare the unethical practices behind the publications of intentionally fraudulent researchers. We should likewise not assume that our traditional information literacy and critical thinking skills are well fitted to this task, or that the intermediation of what Eldredge (2012) has termed the new evidence “Translator” will offer much practical relief in this context (p. 141). Of the various techniques customarily suggested for identifying instances of suspected research misconduct, some (e.g., peer review, editorial control, co-author alertness) have repeatedly shown themselves incapable of actually doing so (Relman, 1983; Stroebe, Postmes, & Spears, 2012). Others (e.g., research auditing, replication, whistle blowing) have proven to some degree effective especially in the exact and life sciences but are, for differing reasons, much less suitable within an area such as LIS.

Indeed, we would probably be well advised to pin few hopes on our prospects of ever becoming very successful at the detection of fraud. If the record of success has been decidedly disappointing in the “harder” sciences, the odds against achieving significant successes in LIS would seem to be extraordinarily large. A more promising approach is likely to be one oriented less toward the detection, and much more toward the prevention, of fraud and QRPs. But how can one best go about anticipating and forestalling violations of research integrity from the outset? Observers have not been at a loss for ideas and recommendations, such as: a fundamental overhaul of the system of incentives, rewards, and academic/professional recognition such that the quality, conclusiveness, and transparency of research and its reporting become more decisive than quantity and speed of publication; agreements that journals will henceforth devote more space to the publication of negative or “null” results and to the reporting of replicative research, and that universities and other relevant organizations will institutionalize stronger incentives and recognition for researchers producing such publications; far less emphasis on the attainment of “mediagenic” research results; measures aimed at mitigating the increasingly fierce competition for (diminishing) research funding; the mandatory archiving and long-term unhindered accessibility of all raw research data, protocols, and analysis codes; the requirement that each named author formally accept full co-responsibility for the entirety of a published research report; reduction or elimination of “honorary” authorships; increasing the likelihood that a fraudulent researcher will be caught and penalized, for example by encouraging whistleblowers through guarantees of anonymity or career protection. Often heard is the suggestion that the best means of systematically reducing the occurrence of research misbehaviour in the long run is to ensure that the training of future researchers includes a comprehensive and mandatory research integrity component. Anderson et al. (2007), however, have shown that the positive effect of integrity training and mentoring has possibly been greatly overestimated, and that in some respects the effect may even be negative. Anderson herself (2007) proposes that institutionalized group mentoring might yield positive results, but it is her further notion of a culture of “collective openness” which most appeals to this commentator. She sees that as “a mechanism for sustaining attention to the responsible conduct of research on an everyday basis” in an environment in which “everyone is not only encouraged but expected to question each others’ decisions and work, so that mistakes and oversights, as well as misbehavior, will be noticed and corrected”, where there is “a collective sense of responsibility for the integrity of the work” and where “not challenging questionable behavior or decisions is unacceptable” (p. 392). Sobel (2012), too, strongly emphasizes structural openness as “paramount” in any effort to curb misconduct before it occurs. Openness after the fact is also crucial: if, and as soon as, misconduct is discovered, the last thing that should happen is that it be swept under the carpet, for example, as a face-saving tactic for the institution or journal involved (Bosch, 2012).

Coping with Research Misconduct

There is certainly something to be said for each of the suggested measures listed above. Taken together, they could possibly even make a quite significant difference for the good. Still, it would be an illusion to believe that they can free LIS or any other field altogether of fraud and other QRPs. That problem will always be with us. Any LIS professional who values our research literature as a source for evidence of potential utility in the improvement of professional practice should keep in mind that there is not one but two inherent potential threats to such utility. Though they differ in two important respects, their effect is more or less the same. As Burke et al. (1996) have put it, “Given the nature of LIS research, fraud and falsification would be difficult to spot. Poorly conducted and poorly reported research, however, is a well-identified problem, and largely has much the same result as fraud – misleading research findings” (p. 206). One important difference, then, is that while we have, as EBP aficionados, developed fairly effective ways of recognizing and dealing with the hazards of sloppy research, we have apparently not even begun to think about ways of dealing with the hazards of fraudulent research, hazards which in any event are much trickier to localize. That’s the bad news. The good news, and the other important difference, is that genuinely fraudulent research in LIS is almost certainly far less prevalent than sloppy research in LIS. That circumstance can give us hope that by consciously adopting preventive measures we can eventually reduce its frequency to an even less problematical level. It should then be possible to move even further toward the objective of neutralizing whatever bogus evidence continues to exist. In this connection, too, it is fortunate that “[t]he number of systematic reviews published in LIS each year seems to slowly be growing” (Koufogiannakis, 2012, p. 93), for it is by basing our decisions where possible on the syntheses of research evidence provided by well-executed systematic reviews, rather than on the findings of individual studies or fortuitous collections of articles, that we can most effectively evade any potential contaminating influence attributable to undetected fraudulent or otherwise questionable research practices infecting the literature of LIS and other disciplines.

References

Anderson, M. S. (2007). Collective openness and other recommendations for the promotion of research integrity. Science and Engineering Ethics, 13(4), 387-394. doi:10.1007/s11948-007-9047-0

Anderson, M. S., Horn, A. S., Risbey, K. R., Ronning, E. A., De Vries, R., & Martinson, B. C. (2007). What do mentoring and training in the responsible conduct of research have to do with scientists' misbehavior? Findings from a national survey of NIH-funded scientists. Academic Medicine, 82(9), 853-860. doi:10.1097/ACM.0b013e31812f764c

Bonetta, L. (2006). The aftermath of scientific fraud. Cell, 124(5), 873-875. doi:10.1016/j.cell.2006.02.032

Bosch, X. (2012). Research integrity on the horizon. The Lancet, 379(9827), 1679-1680. doi:10.1016/S0140-6736(12)60317-1

Burke, M., Chang, M., Davis, C., Hernon, P., Nicholls, P., Schwartz, C., Shaw, D., Smith, A., & Wiberley, S. (1996). Editorial: Fraud and misconduct in library and information science research. Library & Information Science Research, 18(3), 199-206. doi:10.1016/S0740-8188(96)90040-7

Curry, A. (2005). Unreliable research: Are librarians liable? IFLA Journal, 31(1), 28-34. doi:10.1177/0340035205052640

Eldredge, J. D. (2012). The evolution of evidence based library and information practice, part I: Defining EBLIP. Evidence Based Library and Information Practice, 7(4), 139-145. Retrieved from http://ejournals.library.ualberta.ca/index.php/EBLIP/article/view/18572/14514

Fanelli, D. (2009). How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLoS One, 4(5), e5738. doi:10.1371/journal.pone.0005738

Fang, F. C., Steen, R. G., & Casadevall, A. (2012). Misconduct accounts for the majority of retracted scientific publications. Proceedings of the National Academy of Sciences of the United States of America, 109(42), 17028-17033. doi:10.1073/pnas.1212247109

Fisher, W. (1999). When write is wrong: Is all our professional literature on the same page? Library Collections, Acquisitions, and Technical Services, 23(1), 61-72. doi:10.1016/S1464-9055(98)00126-2

Heijden, P. F. v. d., Fokkema, J., Lamberts, S. W. J., Mols, G. P. M. F., Hartogh, G. A. d., Stouthard, M. E. A., & Post, A. A. (2012). De Nederlandse gedragscode wetenschapsbeoefening: Principes van goed wetenschappelijk onderwijs en onderzoek, revised version. Den Haag: Vereniging van Universiteiten VSNU. Retrieved from http://www.vsnu.nl/files/documenten/Domeinen/Onderzoek/Code_wetenschapsbeoefening_2004_%282012%29.pdf (English translation: Retrieved from http://www.vsnu.nl/files/documenten/Feiten_en_Cijfers/The_Netherlands_Code_of_Conduct_for_Scientific_Practice_2012.pdf).

Hernon, P., & Altman, E. (1999). Misconduct: Infecting the literature, but do we really care? The Journal of Academic Librarianship, 25(5), 402-404. doi:10.1016/S0099-1333(99)80061-5

Hernon, P., & Calvert, P. J. (1997). Research misconduct as viewed from multiple perspectives. In E. Altman & P. Hernon (Eds.), Research misconduct: Issues, implications, and strategies (pp. 71-89). Greenwich, CT [etc.]: Ablex.

InterAcademy Council, & IAP (2012). Responsible conduct in the global research enterprise: A policy report. Amsterdam: IAC Secretariat; Trieste: IAP Secretariat. Retrieved from http://www.interacademies.net/File.aspx?id=19789

John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524-532. doi:10.1177/0956797611430953

Koufogiannakis, D. (2012). The state of systematic reviews in library and information studies. Evidence Based Library and Information Practice, 7(2), 91-95. Retrieved from http://ejournals.library.ualberta.ca/index.php/EBLIP/article/view/17089/14045

Lelgemann, M., & Sauerland, S. (2010). Gefälschte Studien und nicht publizierte Daten: Auswirkung auf die Erarbeitung von Leitlinien und evidenzbasierten Empfehlungen. Zeitschrift für Evidenz, Fortbildung und Qualität im Gesundheitswesen, 104(4), 284-291. doi:10.1016/j.zefq.2010.03.035

Macilwain, C. (2012a). Scientific misconduct: More cops, more robbers? Cell, 149(7), 1417-1419. doi:10.1016/j.cell.2012.06.001

Macilwain, C. (2012b). The time is right to confront misconduct. Nature, 488(7409), 7. doi:10.1038/488007a

Marathe, S. (1989). Scientific fraud. Nature, 340(6231), 259. doi:10.1038/340259c0

Neugebauer, E. A. M., Becker, M., Sauerland, S., & Laubenthal, H. (2009). Wissenschaftsbetrug/Gefälschte Studien: Auswirkungen auf die S3-Leitlinie? Deutsches Ärzteblatt, 106(15), A703. Retrieved from http://www.aerzteblatt.de/archiv/64093/Wissenschaftsbetrug-Gefaelschte-Studien-Auswirkungen-auf-die-S3-Leitlinie

Panel on Responsible Conduct of Research (2011). The Tri-Agency framework: Responsible conduct of research. Ottawa: Secretariat on Responsible Conduct of Research. Retrieved from http://www.rcr.ethics.gc.ca/eng/policy-politique/framework-cadre/

Relman, A. S. (1983). Lessons from the Darsee affair. The New England Journal of Medicine, 308(23), 1415-1417. doi:10.1056/NEJM198306093082311

Second World Conference on Research Integrity (2010). Singapore statement on research integrity. Retrieved from http://www.singaporestatement.org/downloads/singpore%20statement_A4size.pdf

Sobel, B. E. (2012). On thwarting the seeds of scientific fraud. Coronary Artery Disease, 23(8), 560-562. doi:10.1097/MCA.0b013e32835a05e9

Stroebe, W., Postmes, T., & Spears, R. (2012). Scientific misconduct and the myth of self-correction in science. Perspectives on Psychological Science, 7(6), 670-688. doi:10.1177/1745691612460687

Trikalinos, N. A., Evangelou, E., & Ioannidis, J. P. A. (2008). Falsified papers in high-impact journals were slow to retract and indistinguishable from nonfraudulent papers. Journal of Clinical Epidemiology, 61(5), 464-470. doi:10.1016/j.jclinepi.2007.11.019