title: The citation behaviours and the academic performance of first-year Political Science students
authors: Atkinson, Douglas; Thornton, Stephen
date: 2021-05-10
journal: Eur Polit Sci
DOI: 10.1057/s41304-021-00333-x

This research utilises citation analysis to explore the information behaviours of a cohort of first-year Political Science students at a university in the UK. Using a dataset of the citation behaviour of 262 students, we find that students who locate and cite particular sources of information receive better grades than those who do not. These findings suggest that students who know how to locate and subsequently cite these sources—which tend to be those regarded as more reliable and of higher quality—will achieve higher grades on their coursework. This might sound obvious, but such assumptions are rarely checked; furthermore, such findings might convince doubtful students—and staff—to take information literacy more seriously.

It has long been argued that students of Political Science require sophisticated levels of information literacy to make sense of the world and to do well in their assignments (Marfleet and Dille 2005; Thornton 2006; Stevens and Campbell 2007). Various studies have suggested that achieving these high levels of competence is difficult and becoming ever more so as increasing amounts of information are made available daily, largely in the form of online materials. Furthermore, institutions of learning such as universities are struggling to adapt their teaching models to support their students in their regular fight not to become overwhelmed by a tsunami of information. First-year university students are seen as particularly vulnerable. As Shannon and Shannon (2016) note, they face a double problem: relatively low levels of information skills and experience, allied to an over-confidence in their own abilities. "[T]hey do not know what they do not know" (2016: 458).

This article will further investigate the information behaviours of first-year Political Science students, at least those based at a particular university in the UK. We will demonstrate a statistically significant relationship between these students' ability to locate and cite particular types of sources and the overall quality of the students' academic work (as reflected in the grades the work receives).

Most analyses of students' information behaviours have three features in common. First, most tend to be written from the perspective of the library world. Second, the majority of studies explore students based in the US. Third, qualitative methods such as surveys and/or focus groups are the most familiar means of obtaining insight into students' information behaviours (Carlson 2006). This study breaks free of those traditional characteristics by being conducted by two political scientists (albeit with considerable assistance from information professionals), by exploring the situation amongst a cohort of first-year students based at a British university, and by utilising a quantitative approach to the investigation.

This research will be conducted using a technique known as citation analysis. Analysis of citations and bibliographies is, amongst library and information professionals at least, a popular research method, albeit one primarily used for library collection management purposes.
As Hovde notes, evaluation of the sources that students cite in their work provides quantifiable information "that keeps future plans and adjustments out of the category of random guesswork" (Hovde 2000: 3). However, these analyses are also used to provide a measure of students' information literacy proficiency, often to demonstrate the effectiveness of particular learning experiences provided by librarians (Reinsfelder 2012). Citation analysis has been a tool used by librarians for nearly a century (Middleton 2005: 7).

Influential studies include that by Hovde (2000), in which a simple assessment framework was developed to analyse, in the first instance, the bibliographies of 109 first-year English students. The bibliographies, Hovde argues, provide "a flexible, non-invasive, time efficient assessment forum for the documentation of student library use" (2000: 5). This analysis explored the types of sources used (broken down into the following categories: books; journals and magazines; newspapers; electronic sources; other) and the index source of the articles used (which, in those days, included standalone CD-ROMs), and provided a means of measuring changes to student information behaviour that could be easily replicated. Other studies of a similar type include Hinchcliffe et al. (2003), Carlson (2006), Datig (2016) and Lantz et al. (2016).

Of the studies that are concerned with information literacy competence, the assumption throughout is that certain sources, because of attributes such as perceived reliability, academic rigour and author transparency, are regarded as being of higher quality than others, with peer-reviewed articles from recognized journals, seen as the "gold standard", at one end, and information from websites of dubious authenticity at the other (Lantz et al. 2016: 261). Though our work largely follows this convention, it is worth noting, as Lantz et al. acknowledge, that the information contained in many peer-reviewed research articles can be too complex for many first-year assessments. In short, information generally deemed "higher quality" might not always be the most appropriate for a particular task.

Most citation analyses explore students from one discipline (often English), from one institution and from one year of study (usually the first year). There are, inevitably, far fewer studies that have explored the citation behaviours of Political Science students as a distinct group and the effect that the ability to distinguish effectively the credibility of sources has on student success. This small group includes a study by Hendley (2012), which investigated the citation behaviours of those studying History, Sociology and Political Science at the State University of New York, College at Oneonta. This analysis looked at the proportion of different types of sources but added investigation of three further areas: the proportion of student citations of specific website domains; the identification of the most cited academic journal titles for each discipline; and the prevalence of interdisciplinary journal usage (Hendley 2012: 99-100). Regarding the citation behaviour of Political Science students, Hendley notes that of the 689 citations recorded in the Political Science papers, 42% were of websites, 23% were of scholarly/academic journals, and 19% were of books (Hendley 2012: 103). A further 8% was made up of "newspaper, magazine articles, and other periodical articles", and "other sources" comprised the remaining 6%.
In total, 56% of all the citations came from the "non-traditional sources combined (websites, magazine and newspaper articles, and other sources)" (Hendley 2012: 103). Political Science papers included proportionally considerably more non-traditional sources than either History or Sociology papers. Though acknowledging the limitations of this single case study, Hendley is correct to state that such findings do provide "useful, preliminary information concerning students' citation patterns" and do prompt many questions, including "are the students citing resources that are relevant to their assignment?" (Hendley 2012: 110).

This article will borrow this valuable technique from the world of information professionals to provide a portrait of information behaviours amongst students of Political Science. Rather than use citation behaviour analysis to inform collection management, or to explicitly measure the worth of a particular information experience provided by the library, the primary purpose here is to provide a snapshot of students' information skills in their first semester at university and to see how this affects their performance in coursework.

As noted, those reading for a degree in Political Science must choose their information sources with care. In the UK, this concern is reflected in the most recent Quality Assurance Agency (QAA) benchmark statements for the discipline (known more often in the UK as "Politics and International Relations"). To be regarded as a successful graduate of Politics/International Relations requires a variety of information literacy competences. These include the abilities to "gather, organise and deploy evidence, data and information from a variety of sources", to "assess their ethical implications", to "synthesise relevant information and exercise critical judgement", to "use communication and information technologies for the retrieval, analysis and presentation of information", and to "critique and synthesise information" (QAA 2015: 18-19). However, to date, the link between information literacy competencies and a student's success in a Political Science classroom in the UK has not been systematically tested.

Using citation analysis of students' work can provide a useful "ground zero" from which to base future learning activities to encourage students to form a critical and thoughtful relationship with information. A student's ability to forge this relationship should have an effect on their success in coursework. According to the QAA benchmark statements, students who have these competencies will perform significantly better in their coursework than those who do not. Understanding the performance gap will allow for informed decisions about the types of interventions and information that might be used to improve students' understanding of information literacy. To this end, evaluating whether, and to what degree, the employment of these skills affects the success of Political Science students is of the utmost importance.

This study is the latest in a series of explorations of information behaviours from students studying the same module at the same institution (Cardiff University, UK) over a period that stretches back over a decade. It is worth briefly considering these earlier findings as they provide useful context for this fresh study.
As explained in more detail in a recent article by one of this study's authors (Thornton 2019), every year from 2009 to 2017 students on the same foundational comparative politics module completed a survey during their first weeks at university. In 2009, 166 students took this module; by 2017 this number had risen to 274. The survey asked questions such as: "Have you received any training in locating information?"; "When preparing for writing essays or other assignments, which types of information have you used?"; and "What, if any, criteria do you use to assess whether a website contains information reliable enough to use in your assessed work?" (Thornton 2019: 95-99).

By examining student responses to the same questions across the years, it became apparent that students were entering university with no higher levels of information literacy in 2017 than they had displayed in 2009, despite information literacy itself having become a more established concept in education throughout the world. For example, responses to the question about quality control techniques deployed when using websites were remarkably similar despite the near-decade temporal distance between the cohorts surveyed. Consideration of the credibility of the author(s) and the evaluation of the reputation of the website were the most popular techniques identified each time. Author credibility was named by 35% of both the 2009 cohort and the 2017 cohort, and website reputation was named by 34% of the 2009 intake and by a very similar 36% eight years later. It also appeared that in 2017, just as much as in 2009, the responses of only a minority of students suggested the capability to deliver a coherent web evaluation strategy such as that influentially recommended by Kapoun at the end of the last century (1998; see also Cornell University 2020). In addition, responses to other questions suggested that the students surveyed were no bolder about their information choices in 2017 than in 2009; indeed, the willingness and ability of many students to interact with a variety of electronic repositories of information, bar the inevitable Google, appeared to have diminished.

Aware of the limitations of this type of research (Carlson 2006: 14), we felt it important to examine the information skills that the students actually used as part of their everyday academic work, and the impact these had on their success, rather than merely solicit their views through a survey. Consequently, in the 2019-2020 academic year we conducted a citation analysis of first-year student essays. We anticipated that students who employ information literacy skills to select particular sources for citation would receive better marks than those who employ a less critical approach. This would be in line with the Cardiff University Politics and International Relations undergraduate marking criteria for traditional essays. Of the four main criteria categories, two explicitly test information literacy skills: "range and relevance of content" and quality of "references and bibliography" (the other categories are "academic quality" and "style, length and presentation"). To give one example, to reach the relevant criterion under the "range and relevance of content" category at the 2.1 level (60-69%), an essay would need to demonstrate "good use of a range of relevant sources, demonstrating clear understanding of main ideas".
Yet, just because an attribute is mentioned in the marking criteria does not guarantee that all essays will be judged according to it, so it is worth checking. Moreover, the phrase "a range of relevant sources" is vague, for staff as well as students, so it is useful to explore precisely which sources carry particular weight in demonstrating achievement of the marking criterion. Ultimately, we have the citation data and the score assigned to each essay written by the 262 students taking the module in 2019-2020. Our hypothesis is that the more frequently certain types of resource (those regarded as "higher quality") are cited in an essay, the higher its mark will be.

To assess our hypothesis, we used a dataset comprising the essays submitted as part of the Introduction to Government module held during the first semester of the academic year 2019/2020 (before the COVID-19 pandemic). This module is intended to be taken by first-year Political Science students at Cardiff University and, as such, is one of the first courses taken as part of their degree. As part of the students' overall assessment, they were assigned an essay. The students were asked to respond to one of three essay prompts: (1) Is direct democracy superior to representative democracy? (2) What is the role of the state in the twenty-first century? (3) When, if ever, are constitutions effective? The students were given a strict word limit of 1000 words. Additionally, each student was made aware of the essay at the same point in the semester and given a strict deadline by which they needed to submit it. Although the students were assigned a list of required and suggested readings, they could cite any source they felt was needed to support their argument. Some weeks before submission, the students were invited to attend a study skills lecture which included advice about assessing the likely quality of information, touching on subjects such as bias, academic rigour and reputability. Approximately 50% of the students attended. Once submitted, the essays were graded by four different markers following an initial calibration exercise, with the process overseen by a single moderator.

We acknowledge that employing quantitative analysis to assess student performance simplifies reality and cannot possibly tell the complete story. This is especially the case as we looked at only one piece of assessed work. However, such an analysis still provides a considerable amount of value, as it can help us understand the average effect of information literacy and citation behaviour on a student's performance. Strong findings from these models will also suggest that more work, such as research employing experimental designs, needs to be done to fully understand this relationship.

Our dependent variable was the mark assigned to each essay. During the assessment period, the students' names and other identifying information were not known to the faculty member assigning the mark to the essay. The scores range from 15 to 72 with a mean of 53.85 and a standard deviation of 8.43. Figure 1 shows the distribution of the marks. Our independent variables were counts of the number of times a student cited a specific type of source, as well as the total number of citations. Ideally, we would have controlled for individual-level factors that might have an impact on the student's essay score, such as their past performance and demographic information, but logistical constraints meant we were unable to include this information in our analysis.
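For readers who wish to reproduce this kind of descriptive summary of the dependent variable, a minimal sketch is given below. The file name and column names (essay_citations.csv, mark) are illustrative assumptions, not the layout of the data actually used in this study.

```python
# Minimal sketch: summarising the dependent variable (essay marks).
# Assumes a CSV with one row per essay and a numeric "mark" column;
# the file and column names are hypothetical, not the study's real data.
import pandas as pd
import matplotlib.pyplot as plt

essays = pd.read_csv("essay_citations.csv")   # hypothetical file

# Count, mean (reported as ~53.85), std (~8.43), min and max of the marks
print(essays["mark"].describe())

# Histogram of marks, analogous to the distribution shown in Fig. 1
essays["mark"].plot(kind="hist", bins=20, edgecolor="black")
plt.xlabel("Essay mark")
plt.ylabel("Number of essays")
plt.title("Distribution of essay marks")
plt.show()
```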
We collected data on the number of times a student cited a textbook. We considered a book a textbook if it was marketed by a known publisher as such. It should be noted that students used both assigned and unassigned textbooks. The most commonly cited textbooks were Heywood (2019), which was assigned as mandatory reading for the class, and Hague et al. (2019), which was recommended reading for the class. This variable also includes instances where students cited a chapter in an edited textbook volume, such as the Oxford Handbook of International Relations. This variable ranged from 0 to 19 citations, with the average student citing a textbook 2.03 times with a standard deviation of 2.83.

We considered a source a monograph if it was a sustained treatment of one subject or if it was marketed by the publisher as such. The vast majority of the monographs cited by students came from academic publishers such as Oxford University Press, with a smaller number coming from non-academic presses. This variable ranged from 0 to 21 citations, with a mean of 1.98 and a standard deviation of 2.94.

We considered a source to be a journal article if it was marketed as such by its publisher. In this category we also included chapters of edited specialist volumes, as they are similar in quality and intent to a journal article. Commonly cited journals include the American Political Science Review, Political Research Quarterly, Politics, the American Economic Review and the British Journal of Political Science. This variable ranged from 0 to 13 with a mean of 2.11 and a standard deviation of 2.38.

As with the other sources, we defined a source as a newspaper if the creators of the source identified and marketed it as such. A source was considered a newspaper regardless of whether the information was accessed online or from a hard copy of the newspaper. Commonly cited newspapers were the New York Times, the Guardian, and the Times. This variable had a minimum value of 0 and a maximum value of 9, with a mean of 0.56 and a standard deviation of 1.22.

Once again, we identified a source as a periodical if it was marketed as such by its producers. As we did with newspapers, we identified a source as coming from a periodical regardless of whether the information was gathered from the publisher's website or in its physical format. Examples of periodicals cited by students included the New Statesman, the Economist, Time, and The Atlantic. This variable ranged from 0 to 9 with a mean of 0.56 and a standard deviation of 1.21.

We identified a source as a website if it was accessed on the internet and was not a newspaper, periodical, information from a think tank, or a government statistical site. Students cited a wide variety of websites, with few individual websites standing out as being cited more often than others. This variable had a minimum value of 0 and a maximum value of 11 citations, with a mean of 1.16 citations and a standard deviation of 1.92.

A source was identified as coming from a government or think tank if the information could be traced back to one of these organizations, either through the student identifying the website from which they obtained the information or from the name of the author. For organizations of which we were unsure, we looked up the source to determine how it marketed itself. Commonly cited think tanks include the Hoover Institution, the American Enterprise Institute, and Chatham House. The governments from which students most often cited data were the United States, the United Kingdom, and Switzerland. This variable ranged from 0 to 6 citations with a mean of 0.67 citations and a standard deviation of 1.12.
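The per-essay count variables described above could, for example, be built from a hand-coded citation list along the following lines. The long-format layout (one row per coded citation, with essay_id and source_type columns) and the file name are our illustrative assumptions; the classification of each citation into a source type was a judgement made by the researchers, not something done automatically.

```python
# Sketch: turning a hand-coded citation list (one row per citation) into
# per-essay counts for each source type. File and column names are assumed.
import pandas as pd

citations = pd.read_csv("coded_citations.csv")   # columns: essay_id, source_type

# One row per essay, one column per source category, filled with citation counts
counts = pd.crosstab(citations["essay_id"], citations["source_type"])

# Descriptive statistics comparable to those reported in the text
# (e.g. textbooks: mean ~2.03, sd ~2.83; journal articles: mean ~2.11, sd ~2.38)
print(counts.agg(["mean", "std", "min", "max"]).T)
```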
In order to account for the number of citations that a student used in their paper, we employed a categorical variable that placed each student into one of four categories. We chose to use a categorical variable based on evenly distributed quantiles because the raw count measure was heavily skewed to the right. We assigned a student a 1 if they were in the bottom quartile of citations, that is, they included between 0 and 6 citations. A student was assigned a 2 if they were in the second quartile, between 7 and 9 citations. A student was assigned a 3 if they fell within the third quartile, between 10 and 12 citations. A student was assigned a 4 if they were in the top quartile, 13 or more citations. This variable ranged from 1 to 4 with a mean of 2.406 and a standard deviation of 1.128.

Whether a particular resource is "higher quality" than another is, of course, to some extent a subjective judgement, often dependent on the task to be performed. Moreover, despite the dire warnings occasionally heard in study skills sessions, websites often contain excellent information and some journal articles have proved to be woefully inaccurate. Nevertheless, for the purposes of this research, it was necessary to distinguish some types of information as being likely to be more or less appropriate for the task of completing a first-year Political Science essay. As such, we considered information contained within textbooks, monographs, book chapters, information from governments and/or think tanks, and, despite the concerns of Lantz et al. (2016), journal articles to be more likely to be appropriate for this particular task than the information found in websites, newspapers, and periodicals. References to the former group of sources are those we call, for shorthand, "quality citations". On balance, it was felt the latter group were more problematic in relation to such matters as reliability, rigour and transparent authorship.

Which side of the line information from governments and, in particular, think tanks fell on was a particularly difficult judgement call. Past studies have suggested some students regard information presented by any suitably professional-appearing think tank as largely factual and apolitical. For example, one student reported the libertarian think tank the Cato Institute to be politically neutral, perhaps reading too much into the claim on the organization's website that the body is independent and nonpartisan (Thornton 2012: 214). Nevertheless, it was felt that the potential relevance of the information contained outweighed the possible issues regarding reliability and bias, at least for the essay topics under consideration.

To construct the proportion, we divided the number of "quality citations" by the total number of citations. This variable ranged from 0 to 1 with a mean of 0.769 and a standard deviation of 0.281. In order to ensure that we assessed information literacy skills as they are employed in independent research, we also used a measure of the proportion of quality citations excluding textbook citations. This variable also ranged from 0 to 1 with a mean of 0.522 and a standard deviation of 0.298 (Table 1).
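A minimal sketch of how the two constructed measures described above (the quartile-based citation-count category and the proportions of quality citations) might be computed is shown below. The cut points follow the text; the file and column names, and the exact handling of the textbook-excluding proportion, are assumptions made for illustration.

```python
# Sketch: constructing the citation-count quartile category and the
# quality-citation proportions described in the text. Column names are assumed.
import numpy as np
import pandas as pd

def citation_quartile(total):
    """Map a raw citation count onto the four categories used in the article:
    1 = 0-6 citations, 2 = 7-9, 3 = 10-12, 4 = 13 or more."""
    if total <= 6:
        return 1
    elif total <= 9:
        return 2
    elif total <= 12:
        return 3
    return 4

quality_cols = ["textbook", "monograph", "journal", "gov_thinktank"]
other_cols = ["newspaper", "periodical", "website"]

counts = pd.read_csv("essay_citation_counts.csv")   # hypothetical wide table

totals = counts[quality_cols + other_cols].sum(axis=1)
counts["total_citations"] = totals
counts["citation_quartile"] = totals.apply(citation_quartile)

# Defensive choice: avoid dividing by zero for any essay with no citations
safe_totals = totals.replace(0, np.nan)

counts["quality_ratio"] = counts[quality_cols].sum(axis=1) / safe_totals

# One plausible construction of the "excluding textbooks" measure; the article
# does not spell out whether textbooks are also removed from the denominator.
counts["quality_ratio_no_textbook"] = (
    counts[["monograph", "journal", "gov_thinktank"]].sum(axis=1) / safe_totals
)
```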
To assess our hypothesis, we used an ordinary least squares model. The results of this model can be found in Table 2. Because we employed a simple linear model, the coefficients can be interpreted directly: for every additional citation of a given type, there is an accompanying change in the student's mark equal to the coefficient reported in the table. As can be seen in Table 2, the coefficients for the textbook, journal article, monograph, and government and think tank statistics variables were all positive and statistically significant at the 99% level of confidence. These findings were largely in line with our expectations. Specifically, when students referenced more quality citations, they achieved higher scores.

In more detail, we found that for every textbook citation, holding all other variables at their means, a student received an additional 0.613 points on their score. Similarly, for every additional journal article the student cited there was an associated increase of 0.977 in the score the essay received. The same pattern holds for monographs, where, for every additional citation, there was an accompanying 0.663 increase in the mark the essay received. To the surprise of the researchers, a citation from a government/think tank source proved to have the largest effect on the mark the student received for their essay. For each citation of think tank/government statistics there was an associated increase of 1.864, nearly 2 points, in the score the student received. As noted earlier, of all the "quality" sources, these were considered the most problematic. However, the evidence here suggests that a student was more likely to be rewarded for providing distinctive evidence to support a point than penalised for using a potentially unreliable source. That said, a different set of essay questions may have led unwary students towards more treacherous sources. To further illustrate the size of these effects, we provide graphs of predicted values across the entire range of our independent variables.

We also controlled for the number of citations that a student uses. As can be seen, the coefficient for this variable, while positive, does not achieve conventional levels of statistical significance. We found that, although newspapers, periodicals, and websites have positive coefficients, suggesting that citations of these sources do have a positive effect on the mark the student receives for their essay, these findings did not meet standard levels of statistical significance. In other words, we were unable to distinguish our findings from zero. These findings are in line with our expectations. This does not mean that we anticipate that citing these sources will necessarily negatively impact a student's mark, but any positive impact is less manifest than for the sources already discussed.
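The specification reported in Table 2, and the predicted-value calculations discussed next, could be reproduced roughly as follows. The variable names and the exact model formula are our reconstruction from the text rather than the authors' own code, and statsmodels is used purely for illustration.

```python
# Sketch: an OLS model of essay marks on citation-type counts, in the spirit of
# Table 2, plus predicted marks across one variable's range (cf. Fig. 2).
# File name, variable names and the exact specification are assumptions.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

counts = pd.read_csv("essay_citation_counts.csv")   # hypothetical wide table

formula = (
    "mark ~ textbook + monograph + journal + newspaper + periodical"
    " + website + gov_thinktank + C(citation_quartile)"
)
model = smf.ols(formula, data=counts).fit()
print(model.summary())

# Predicted marks across the observed range of the government/think tank
# variable, holding the other counts at their means and the citation-count
# category at its modal value.
gov_values = np.arange(counts["gov_thinktank"].max() + 1)
grid = pd.DataFrame({"gov_thinktank": gov_values})
for col in ["textbook", "monograph", "journal", "newspaper", "periodical", "website"]:
    grid[col] = counts[col].mean()
grid["citation_quartile"] = counts["citation_quartile"].mode()[0]

print(grid.assign(predicted_mark=model.predict(grid)))
```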
We will now turn to a discussion of the predicted values graphs found in Fig. 2. In Fig. 2, we present the predicted marks across the range of our main independent variables, from their minimum to maximum values. As can be seen, at 0 citations the predicted mark is around the mean, 53.6, for each of our independent variables. Additionally, at the maximum number of citations for each variable the predicted mark is slightly above or below 65. This provides strong evidence for our argument. The difference in the rate of change between the predicted values of the government statistics variable and the other variables suggests the need for further discussion. These findings do not suggest that this is necessarily the best source for a student to cite, relative to the others. The relatively low mean value of this variable suggests that something else is going on. Specifically, it suggests that there are a large number of students who did not cite a single statistic to back up their arguments. A quick look at the cross-tabulations demonstrates that this is the case. There were 172 students who cited government or think tank statistical data zero times and 51 who cited it once. What we do demonstrate is that students who think to use data to back up their claims, and know where to look for it, will be judged to have performed better on their essay.

To further assess the effect of quality citations on the mark that the essay receives, in models 1 and 2 of Table 3 we introduce our proportion of quality citations variables. In model 1, we introduce our quality ratio variable. As can be seen, the quality ratio variable is positive and statistically significant. As mentioned before, because the coefficient is the result of an ordinary least squares model, it is directly interpretable. For every 1-unit increase in the quality ratio there is a 6.382-point increase in the student's mark. Because the quality ratio variable is bounded by 0 and 1, this means that from the minimum to the maximum value of the quality ratio variable there is a change of 6.382 points. This finding is in line with our expectations. It suggests that the more a student relies upon sources perceived as higher quality to craft and support their argument, the better they will have been judged to have performed. We also controlled for the number of sources that a student used. We find that the coefficient for this variable is positive and statistically significant. As we mentioned before, this is a categorical variable; the size of the coefficient suggests that as a student moves from one citation quartile to the next, there is a corresponding increase in their mark.

To ensure that we were capturing a student's information literacy skills, we also employed an independent variable that considers quality sources while excluding textbook citations in model 2. As can be seen, the coefficient for this variable is positive and statistically significant. The size of this coefficient, 7.008, suggests that from the lowest proportion of quality citations, 0, to the highest, 1, a student's mark will improve by 7.008 points. This finding adds even further evidence for our argument, namely that students able to identify and rely more heavily upon sources of information perceived as higher quality are judged to have performed better, on average, than their peers.
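Models 1 and 2 of Table 3 swap the individual count variables for the two proportion measures; a sketch of that specification, again using assumed file and variable names, is given below. Because each ratio is bounded by 0 and 1, its coefficient can be read directly as the predicted change in mark from the lowest to the highest value of the proportion.

```python
# Sketch: the quality-ratio specifications of Table 3 (assumed names).
import pandas as pd
import statsmodels.formula.api as smf

counts = pd.read_csv("essay_citation_counts.csv")   # hypothetical wide table

# Model 1: proportion of quality citations; Model 2: the same proportion
# excluding textbooks. Both control for the citation-count quartile category.
model_1 = smf.ols("mark ~ quality_ratio + C(citation_quartile)", data=counts).fit()
model_2 = smf.ols("mark ~ quality_ratio_no_textbook + C(citation_quartile)",
                  data=counts).fit()

# With a 0-1 bounded regressor, the coefficient equals the predicted difference
# in mark between an essay with no quality citations and one with only quality
# citations (reported in the article as roughly 6.4 and 7.0 points).
for name, model in [("quality_ratio", model_1),
                    ("quality_ratio_no_textbook", model_2)]:
    print(name, round(model.params[name], 3))
```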
This research adds an important element to the literature that suggests an essential feature of the effective study of Political Science is sufficient information literacy. We demonstrate that the ability to seek out sources generally recognized as being of higher quality does have a significant positive impact on grades awarded, and that using certain types of information, such as government or think tank statistical data, can provide a particularly positive step towards the recognition of academic merit in a student's work. Thus, this work demonstrates what is generally just assumed, namely that one of the key determinants of student success is the ability to select carefully the sources from which they draw their information. This provides further motivation for students and faculty alike to take information literacy seriously, not least as the information source that surprisingly proved to be the most effective for improving grades, think tank data, is amongst the most dangerous for the uncritical user. Further research remains to make these findings yet more robust and to find out what strategies for stimulating Political Science students' information literacy work best.

Acknowledgements The authors would like to thank the library service at Cardiff University, in particular Luisa Tramontini and Rebecca Mogg, the students and staff involved on the module Introduction to Government, and the excellent editorial team and anonymous reviewers at European Political Science.

Appendix

A. To demonstrate that including the continuous number of citations variable introduces multi-collinearity, we present the results of a variance inflation factor test, the most commonly employed test for multi-collinearity, for a model specified to include this variable. The major problem introduced by collinearity is that it makes the regression estimates unreliable and unstable (Hair et al. 2009). As can be seen, the model that employs this variable badly fails this test: a VIF value of over 10 for any variable suggests very high and problematic levels of collinearity. Ultimately, the results from this regression are not stable and the coefficients generated by this model should not be interpreted (Hair et al. 2009) (Table 4). As can be seen in Table 5, by measuring the total number of sources as a categorical variable divided into quartiles, this problem is avoided (a sketch of this check appears at the end of this document).

B. To demonstrate that our findings are robust to our measurement decision, we present a model that includes a control for the number of sources where we have broken up the number of sources into evenly distributed quintiles. As can be seen in Table 6, these results are largely consistent with what we present in the main document.

References
An Examination of Undergraduate Citation Behavior
Evaluating Web Pages
Citation Behavior of Advanced Undergraduate Students in the Social Sciences: A Mixed-Method Approach
Comparative Government and Politics
Multivariate Data Analysis: A Global Perspective
Citation Behavior of Undergraduate Students: A Study of History
What Students Really Cite: Findings from a Content Analysis of First-Year Student Bibliographies
Check the Citation: Library Instruction and Student Paper Bibliographies
Teaching undergraduates WEB evaluation: A guide for library instruction
Student bibliographies: Charting Research Skills Over Time
Information Literacy and the Undergraduate Research Methods Curriculum
An Attempt to Quantify the Quality of Student Bibliographies
Subject Benchmark Statements: Politics and International Relations
Citation Analysis as a Tool to Measure the Impact of Individual Research Consultations
Librarians in the Midst: Improving Student Research Through Collaborative Instruction
The Politics of Information Literacy: Integrating Information Literacy into the Political Science Curriculum
Information Literacy and the Teaching of Politics. LATISS - Learning and Teaching in the
Trying To Learn (Politics in a Data-Drenched Society: Can Information Literacy Save Us?
A Longitudinal Comparison of Information Literacy in Students Starting Politics Degrees

Publisher's Note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
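The variance inflation factor check described in Appendix A can be carried out with standard tools. The sketch below uses assumed file and column names, includes the raw total-citation count alongside the category counts (the combination the appendix reports as badly collinear), and flags any predictor whose VIF exceeds the conventional threshold of 10 mentioned in the text.

```python
# Sketch: variance inflation factors for a specification that includes the
# continuous total-citation count (assumed column names; threshold of 10).
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

counts = pd.read_csv("essay_citation_counts.csv")   # hypothetical wide table

predictors = ["textbook", "monograph", "journal", "newspaper", "periodical",
              "website", "gov_thinktank", "total_citations"]
X = sm.add_constant(counts[predictors].astype(float))

vifs = pd.Series(
    [variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
    index=X.columns,
).drop("const")   # the intercept's VIF is not informative here

print(vifs)
print("Problematic predictors (VIF > 10):", list(vifs[vifs > 10].index))
```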