Issues in Science and Technology Librarianship
Spring 2013
DOI:10.5062/F4T151M8

[Refereed]

Data Sharing Interviews with Crop Sciences Faculty: Why They Share Data and How the Library Can Help

Sarah C. Williams
Life Sciences Data Services Librarian
Funk ACES Library
University of Illinois
Urbana, Illinois
scwillms@illinois.edu

Copyright 2013, Sarah C. Williams. Used with permission.

Abstract

This study was designed to generate a deeper understanding of data sharing by targeting faculty members who had already made data publicly available. During interviews, crop scientists at the University of Illinois at Urbana-Champaign were asked why they decided to share data, why they chose a data sharing method (e.g., supplementary file, repository), what the benefits and drawbacks of the method(s) used were, and what role they saw for the library to help facilitate data sharing. This article summarizes the participants' reasons for making data publicly available but also describes the challenges that they faced when sharing data. Most participants had not previously thought of the library for assistance with their data, but when asked how the library could help facilitate data sharing, they had a variety of ideas, which are presented in this article.

Introduction

Sharing research data is becoming increasingly important, especially as a result of funding agency policies. In 2011, the National Science Foundation (NSF) began requiring each grant proposal to include a data management plan, which describes how the proposal will conform to the NSF data sharing policy. The National Institutes of Health (NIH) also requires a data sharing plan to be submitted with grant applications seeking $500,000 or more in direct costs in any single year.

Most funding agencies do not specify how data must be shared. Data can be deposited in and made available through disciplinary data repositories, institutional repositories and web-based laboratory databases. Supplementary files published with scholarly articles are another method of data sharing. Data can also be shared privately, such as via e-mail or CD-ROM.

Due to the growing importance of research data, many academic libraries are exploring and implementing services to support research data on their campuses (e.g., Peters & Dryden 2011; Westra 2010). In order to develop effective data services that support research processes in their institutions, librarians must identify and be aware of disciplinary data practices (Cragin et al. 2010).

Data practices in the life sciences have been addressed in both large-scale (Key Perspectives 2010; RIN 2009; Tenopir et al. 2011) and smaller-scale studies (Cragin et al. 2010; Diekmann 2012). These studies gathered limited information from researchers who had publicly shared data, as public data sharing is not yet common practice across disciplines.

The goal of this study was to generate a deeper understanding of data sharing by targeting faculty who had publicly shared data, whether raw data submitted to a repository or processed data published as a supplementary file. Interviews with crop sciences faculty at the University of Illinois at Urbana-Champaign focused on their data sharing experiences and thoughts. To inform future data services, the author also asked the participants what role they saw for the library to help facilitate data sharing.

Literature Review

Several large surveys and studies have investigated the data sharing practices of researchers, including life scientists. Tenopir et al. (2011) conducted an international survey of data sharing practices and perceptions of data sharing barriers and enablers. Over 1,300 researchers from a variety of science and social science disciplines responded to the survey, which was open from October 2009 through July 2010. When asked if they made their data available to others, 46% of the respondents answered they do not make their data electronically available to others, but nearly one third of the survey participants did not answer the question at all, which led the authors to speculate that data sharing was even less common. About one third of the respondents agreed their data could be easily accessed by others. The most cited reasons for not making data electronically available to others included limited time, lack of funding, no rights to make data publicly available, and no system for sharing data.

The "Data Dimensions" report (Key Perspectives 2010) summarized the Digital Curation Centre's SCARP Project, which explored attitudes and approaches to data deposit, sharing, reuse, curation and preservation across disciplines. The report emphasized that data practices and attitudes differed by discipline and even by sub-discipline. The report noted that life scientists were often perceived as willing to share data but mentioned this was not the case in all sub-disciplines. Genomics research was cited as an example in which "data sharing tends to be the norm" (Key Perspectives 2010, p. 12). This report built on the earlier work by Swan and Brown (2008), which involved interviews with over one hundred researchers in the UK. The "Patterns of Information Use and Exchange" report (RIN 2009) also concluded that data practices varied widely in the life sciences. Another key finding was the gap between the practices of researchers and the policies and strategies of funding agencies, which was further articulated by Pryor (2009).

The data practices of scientists have also been researched in smaller-scale studies, with a limited number of participants who had publicly shared data. Diekmann (2012) interviewed fourteen agricultural scientists in 2008-2009 and concluded that data sharing was a rare practice. The researchers hesitated to deposit or share raw data; instead they typically shared their research results and analyzed data via publications, presentations and web sites. Data quality was one concern about openly sharing data, especially field data, which is subject to biological and environmental variation.

Cragin et al. (2010) explored the data sharing practices of life scientists and physical scientists conducting "small-science" research, which they described as hypothesis-driven research that results in heterogeneous data and that has limited data standards and disciplinary repositories. The twenty researchers interviewed had an interest in data management or data sharing. While the participants were generally positive about and open to data sharing, "Actual dissemination of data to researchers beyond known colleagues or peers was limited to seemingly small numbers based on specific requests, and these were generally handled via e-mail or a posted CD-ROM" (Cragin et al. 2010, p. 4031). Use of a repository was limited to five participants who had deposited genomics data in GenBank. The decision to make data publicly available depended on a variety of factors, including the time required to prepare the data and the risk of misuse. The focus of the Cragin et al. study was to inform the future development of institutional repositories, but they concluded, "the findings highlight the need for additional investigation of use and non-use of other types of repositories" (Cragin et al. 2010, p. 4036).

Peters and Dryden (2011) interviewed thirteen researchers working on NSF and NIH funded projects. When asked about sharing raw data outside of their research groups, most participants responded that data was only shared upon request and usually via e-mail. The most common reason for not sharing was that the data was confidential, proprietary or classified. Other reasons included concerns about intellectual property, the risk of misinterpretation, and the time and effort required to prepare the data.

Methods

This study targeted faculty who had publicly shared data. To identify potential participants, the author used information gathered in an earlier study of selected publications by assistant, associate, and full professors affiliated with the Department of Crop Sciences at the University of Illinois at Urbana-Champaign (Williams 2012). Of the 62 faculty included in the earlier study, 20 had publicly shared data, whether via repositories, as supplementary files, or on a university web site, and they were the potential participants for the study described in this article.

Several resources informed the design of this interview-based study. The Data Curation Profiles Toolkit recommended selecting a specific dataset "to be the subject of the interview before the discussion about the data begins" (Carlson 2012, p. 13). For this study, the author used the data sharing articles identified in the earlier bibliographic study to suggest data on which to focus. These resources also supplied ideas for interview questions. The "Conducting a Data Interview" poster (Witt & Carlson 2007) recommended beginning an interview by asking the researcher about the story of the data, in order to allow the interviewee to talk freely. The first question in this study asked participants to provide a brief overview of the research and data related to the selected article. A question adapted from the Data Curation Profiles Toolkit (2013) asked faculty how they imagined people would find their shared data.

Other interview questions for the faculty members included why they decided to share data, why they chose a certain data sharing method (e.g., supplementary file, repository), what the benefits and drawbacks of the method(s) used were, and whether they had ever tried to reuse data shared that way. If data were shared as a supplementary file, participants were asked whether they knew of a disciplinary repository for that type of data. Participants were also asked if they saw a role for the library to facilitate data sharing. The complete interview introduction and list of questions are in the Appendix.

After Institutional Review Board (IRB) approval was received in June 2012, the 20 faculty members were contacted via e-mail to request their participation in this study. Seven faculty members agreed to participate, and to accommodate their schedules, the interviews ran from August through November 2012. When scheduling the interviews, the author and each faculty member also selected the article and related data on which to focus the interview.

The interviews, which lasted 35 to 75 minutes, were conducted on a one-on-one basis in the faculty members' offices. In keeping with IRB protocol, each participant signed a consent form at the beginning. The author took written notes during all of the interviews. To supplement the written notes, the author requested permission to audio record the interviews, and three participants agreed to have the interviews audio recorded. Quotations used in this article came from the written notes and audio recordings and were normalized for readability.

Results

The seven faculty participants included three assistant professors, one associate professor and three full professors. Of the seven interviews conducted, four focused primarily on genetic data, one focused on phylogenetic data, one focused on field data, and one focused on a combination of genetic and field data. Table 1 lists the data types and data sharing methods discussed in each interview.

Table 1: Summary of Data, Data Sharing Methods, and Reasons for Sharing Data

Interview Number | Types of Data Discussed | Data Sharing Methods Discussed | Reasons for Sharing Data
1 | Field | Supplementary file | Prove confidence in research results; Help the research community
2 | Phylogenetic | Supplementary file; Laboratory database; Disciplinary repository | Meet expectations in this area of research; Help manage and distribute data within the laboratory
3 | Genetic | Supplementary file | Help others, especially when funding received for research
4 | Genetic | Supplementary file; Disciplinary repository | Meet expectations in this area of research
5 | Genetic | Disciplinary repository | Meet funding agency requirements
6 | Genetic and field | Supplementary file; Disciplinary repository; Institutional repository | Make data more accessible; Meet funding agency requirements
7 | Genetic | Disciplinary repository | Meet expectations in this area of research; Help build a resource for the research community; Meet funding agency requirements

Reasons for Sharing Data

There were some similarities in the faculty members' reasons for sharing data. Table 1 includes a summary of responses. For the participants conducting genetic research, data sharing was essentially expected and accepted. One faculty member said data sharing is "standard practice" and noted that sequence data is amenable to depositing in a repository. Another said data sharing is a "general trend of the field," but this faculty member also emphasized the importance of sharing data "with other scientists to help develop a community resource." She stated, "Without [sharing data], the contribution of this work would be limited."

Funding agency requirements also provided motivation for several participants. Two faculty members had been sharing data for many years because they received funding from the NSF Plant Genome Research Program, which required data management plans and data sharing prior to the broader NSF data sharing policy. One participant said that he was very interested in making data publicly available, but mainly when he received research funding. He was less likely to share data if he was working on his own time or with his own money.

A faculty member who conducted field research stressed that experimental fieldwork requires many replications, which produce more data than can be published in an article. She shared the additional data in supplementary files to prove confidence in her research results, and said, "If you're confident [in the data], why not share it? You spend a lot of time and effort to generate this data, so it would be nice if it could help the community to learn something additional." Another participant was responsible for long-term data produced by institutional field research; he regularly received requests for these data, so he deposited the data in the University of Illinois institutional repository, IDEALS, to make the data more accessible to others.

One faculty member had a web-based laboratory database. This database did make the laboratory data publicly available, but he noted, "We did it for us more than anything." The database helped manage and distribute data within the laboratory.

Supplementary Files

During several interviews, participants discussed the benefits of submitting supplementary files along with scholarly articles. As one faculty member said, "You cannot publish everything in the short format of a paper," and supplementary files help overcome these limitations. Supplementary files provided a way to highlight variations in results (e.g., due to environmental conditions), even if the overall trend was the same as summarized in the article. The participants also noted that supplementary files provided a way to share value-added data that might be helpful to others.

None of the participants voiced drawbacks about supplementary files as a data sharing method. During preparation for one interview, the author discovered that the supplementary files for one article were no longer available on the publisher's web site, so the author specifically asked this participant if he had any concerns about publishers being able to maintain and preserve supplementary files. He acknowledged that he had not given this much previous consideration, and while he said, "You make some good points," he did not appear overly concerned.

When asked how other researchers might find the data shared in their supplementary files, the faculty members' responses were limited. One said, "It's not a problem." Another said that researchers are more likely to find data attached to an article than data available in an isolated repository or web site that few use.

All of the participants who had shared data in supplementary files had also reused or tried to reuse data from supplementary files published by other researchers. Examples included modifying experiments based on information in supplementary files, reusing marker data published in spreadsheets, and combining existing data from supplementary files with new data from the laboratory to write an article. One participant commented that it was sometimes difficult to reuse data from supplementary files because the data were not always well described or did not follow well-established standards.

In four of the five interviews that dealt with supplementary files, a question explored whether supplementary files were used instead of repositories (i.e., non-use of repositories). Two participants explained that they use supplementary files to share analyzed data, which had added value but could not be shared via a disciplinary repository that only accepted raw data. In one interview, a faculty member talked more generally about making a conscious decision with each paper about using supplementary files, a laboratory database or a disciplinary repository to share data. Another participant was not sure if a repository existed for the data published in supplementary files.

While no questions asked specifically about the peer-review process, the subject was mentioned in a few interviews. One faculty member explained that supplementary files go through the same peer-review process as the article, noting that she has had reviewers suggest revisions to her supplementary files. She said that if supplementary files are readable and understandable, researchers will find and use them. Another faculty member said that supplementary files facilitate the review process, because they can help support an argument and "save doing analysis from scratch." He noted that some top journals are specifically reviewing how well supplementary files are organized.

Disciplinary Repositories

Several participants had used disciplinary repositories, and they noted many benefits. Faculty members relied on repositories to provide permanent access to their data. In most cases, the repository served as a backup for local data, but one faculty member said that depositing data in a repository freed her and her computer resources from maintaining old data. Repositories also facilitated data distribution and access. Time was saved, because data were deposited once and accessed by multiple researchers. And compared to requesting data directly from scientists, "Accessing the data becomes much easier for general researchers." The participants who deposited data in National Center for Biotechnology Information (NCBI) resources also emphasized the value of being able to search across compatible repositories to retrieve a variety of information. Another benefit of a centralized, disciplinary repository was that it saved duplicating efforts to develop separate systems for similar data.

The most commonly cited drawback of disciplinary repositories was the amount of time and effort required to deposit the data. As one participant said, "It takes time. That is one huge disadvantage for the researcher who actually generated the data." One faculty member also described complications of depositing data; in one case, a repository system and its naming convention were constantly evolving because the organism's genome was still being completed. In another case, some research data were produced in a "boutique" format that had to be cross-referenced with a more standard format before the data could be deposited.

A few participants talked specifically about the process of depositing data in NCBI resources. One believed that while the instructions might seem daunting, most people conducting genetics research would be able to figure out how to deposit data into NCBI resources. Another faculty member who infrequently deposited data into NCBI resources said that she needed to relearn the process each time, which was time-consuming and inefficient. When asked about the necessity of the information required to submit data to NCBI, she responded that the metadata requested was essential.

The faculty members had brief responses when asked how other researchers might discover the data they deposited in a repository. Most thought that other researchers would see data accession numbers included in journal articles or use a repository's search functionality.

All of the participants who had deposited data in a disciplinary repository had also reused data from repositories. Most commonly, they compared data generated in their research to other data available in repositories.

Challenges of Sharing Data

Some participants described general challenges of sharing data, regardless of the method. One faculty member emphasized the amount of time required to prepare data to be shared, saying, "It's one thing to [use data] in-house, do the analysis and that's it; it's another thing to prepare it for sharing." He also spoke candidly about the challenges of keeping track of what data had been deposited, whether in a laboratory database or disciplinary repository. He described a situation in which he asked a student to deposit data in a disciplinary repository; the student did most of the preparation but left for postdoctoral work before depositing the data. "That's reality," he said. Another faculty member talked about the difficulty of organizing data sharing tasks around grant deadlines, academic calendars, and student transitions. Both faculty members noted that it is challenging to balance data sharing with grant and article writing, since the latter are currently more rewarded in the academic environment. With that in mind, one said, "Spending too much time for [data sharing] feels ridiculous."

Even selecting the data sharing method can be challenging. When asked about the decision to share data via supplementary files, a laboratory database or a disciplinary repository, a faculty member said, "It's a call every time we write a paper. It's not that there is one way to [share data]." He also reminisced about how "many years ago, writing a paper was simple." In the past, authors would describe their work and include two to four images or tables, but in a recent experience with an electronic, open access publisher, he almost reached the limit on the number of supplementary files, and the publisher still did not have the entire dataset.

Role of the Library

Most participants said they had not previously considered the library for assistance with data, but when asked how the library could help facilitate data sharing, they had a variety of ideas. Some of the ideas were related to technological solutions. A few faculty members suggested the library should provide a system that would allow researchers to share data for which no disciplinary repository exists. Another faculty member wondered what would happen if publishers stopped accepting supplementary files but researchers were still expected to make data publicly available; she suggested that the library could have a system to make these files available, especially for researchers who do not have a method (e.g., a web site) to disseminate the data.

One participant was very interested in assistance with depositing data in a disciplinary repository, whether the assistance came from the library or another group on campus. She deposited data infrequently, requiring that she relearn the process each time, which was inefficient and took time away from writing grants and articles. "If there's someone in the institute who can [deposit data], instead of individual researchers, that would save lots of our time and [we could] be more productive," she said. She also believed this assistance would provide a "significant advantage" when mentioned in grant proposals, because funding agencies might be more confident that the data would be deposited. Another idea was that this group could check whether researchers had actually deposited data, if data sharing had been proposed or promised. She thought this oversight might encourage data sharing in disciplines less likely to share data.

Other ideas focused on data education and training. One participant suggested the library could help researchers learn about repositories for acquiring and depositing data. She also mentioned referring researchers to the library for data assistance, especially graduate students who receive limited guidance from faculty advisors. One participant was interested in the library offering workshops about database design, semantics and data management, removing the burden from individual laboratories to do this training. He said it takes a significant amount of time to manage data, but ultimately, he was responsible for it because graduate students and postdoctoral researchers are transitory. To help ease the burden, the faculty member said he currently expects postdoctoral researchers to have some previous data management experience.

One faculty member described an experience several years prior that had prompted his thinking about the library's role in preserving research data. He was approached about taking responsibility for a valuable database, because the researcher who developed it was retiring. He said, "At that time, I thought, this is a problem," because there was no group automatically responsible for the research databases of retiring faculty. To help illustrate the number of research databases and the magnitude of the problem, he highlighted the special issues of Nucleic Acids Research that include articles describing research databases. Some databases become invaluable to a community and receive funding, but many other small, but important, databases could be lost when researchers retire. He felt the library could be an ideal organization to take responsibility for preserving faculty research data, so the data would be available for future research.

Impact of Funding Agency Requirements

No participants felt their data sharing practices would change as more funding agencies require data to be shared. One faculty member said, "I don't think so. Well, it's more a problem for us, because now we have to write these two extra pages," referring to the data management plans required by NSF. Another participant noted that data sharing requirements might change the mindset of researchers in disciplines traditionally less willing to share data, and lead to more opportunities for collaboration.

Discussion

Since this study asked faculty members why they shared their data publicly, it provides a unique perspective, differing from other studies that summarized reasons for not sharing data (Diekmann 2012; Peters & Dryden 2011; Tenopir et al. 2011). The most common reasons for sharing data publicly were to meet the data sharing expectations of the research field, to help the research community, and to meet funding agency requirements. The benefits of publicly sharing data included overcoming limitations of scholarly articles, providing value-added data, having a backup of local data, and facilitating data distribution and access. The participants conducting genetic research made it very clear that data sharing was "standard practice." This echoes the "Data Dimensions" report, which cited genomics as a life sciences discipline in which data sharing was common (Key Perspectives 2010). In the findings reported by Cragin et al. (2010), the participants who had deposited genomics data in GenBank were the only ones who had used a repository. Conversely, Tenopir et al. (2011) cited the Campbell et al. (2002) study in which 35% of geneticists felt data sharing had decreased in the previous decade and 14% felt sharing had increased. However, as noted in the Campbell et al. (2002) article, after request volume and other variables were controlled for, geneticists were found to be no more likely than other life scientists to deny data requests from others or to have their data requests denied.

One of the interview questions in this study was intended to explore non-use of repositories, which Cragin et al. (2010) identified as an area of future research. A bibliographic study by Williams (2012) found that supplementary files were the most common data sharing method, which raised the questions: Did disciplinary repositories exist for data submitted as supplementary files, and if so, why were the repositories not used? In this study, only one participant was not sure if a repository existed for the data published in supplementary files. Three other participants explained that they carefully consider how to share their data. In particular, two of them specifically said that they use supplementary files to share value-added data, which could not be shared via a disciplinary repository that only accepted raw data. It was clear from this small sample of faculty members that some are making very deliberate decisions about the best way to share their data.

The faculty members in this study had all shared data publicly, whether raw data submitted to a repository or processed data published as supplementary files. In some cases, they had overcome the barriers that have prevented other researchers from sharing data, but the participants still experienced challenges. The time and effort required to prepare data for sharing, especially via disciplinary repositories, was the most frequently mentioned challenge. Similarly, the Tenopir et al. (2011) survey found that insufficient time was the most common reason for not sharing data electronically, and Diekmann (2012) noted the significant amount of time and effort required to prepare data for sharing beyond direct collaborators.

The participants also described challenges of juggling and balancing priorities. Examples included keeping track of what data had been deposited and organizing data sharing activities around grant deadlines, academic calendars, and student transitions. Balancing data sharing with grant and article writing was another major challenge, since grants and publications are currently more rewarded in the academic environment. This directly relates to one of the key findings of the "Patterns of Information Use and Exchange" report (RIN 2009): that incentives are weak for sharing research results other than through formal publication. Recognition of data sharing is gradually increasing, though. For example, the 2013 update to the NSF Grant Proposal Guide changed the "Publications" section of the biographical sketch to "Products" so that datasets could be included along with publications and other research products (National Science Foundation 2013).

A few faculty members talked specifically about the challenges of depositing data to NCBI resources. One participant acknowledged that the instructions can be daunting, although they can be mastered by researchers conducting this type of research. Another faculty member found it challenging to deposit data because she did so infrequently and therefore had to relearn the process each time. Despite the difficulty, she did feel that the metadata requested was essential. These comments are similar to findings in the report by Swan and Brown (2008). They noted that GenBank's metadata requirements are very demanding but the metadata also provide valuable contextual information about the datasets.

After discussing data sharing challenges, the participants were asked what role they saw for the library to help facilitate data sharing. Consistent with another study that found faculty members "do not perceive libraries as a source of data management expertise or as the best place to store academic research data" (Scaramozzino et al. 2012, p. 362), most participants had not previously thought of the library for assistance with data. Yet, when asked how the library could help facilitate data sharing, they had a variety of ideas. The ideas included technological solutions, assistance with depositing data, and data education and training. In some cases, the library or university already had services or systems to meet the suggestions. Regarding ideas for data education and training, librarians were already offering data management and database design sessions through the library's workshop series. Faculty could also refer researchers to the library for data assistance, either to their subject specialist librarian or to the library's Scholarly Commons, which specifically offers assistance with research data. Another request was for assistance with depositing data into disciplinary repositories; while the library did not offer that service, the faculty member was referred to a new high-performance computing group on campus that specialized in services and infrastructure for life sciences research. The faculty member was not previously aware of this group, and by exploring the web site, she identified other ways that the group might advance her research. When wondering what would happen if publishers stopped accepting supplementary files, another participant suggested that the library could provide a system for researchers to make these files available, and the institutional repository was one existing library system that could provide that service. The lack of awareness of existing campus services and systems highlights the need to expand and improve their promotion, so the author plans to try new methods to make more researchers aware of valuable research data support available.

One faculty member had been thinking for several years about the library's role in preserving research data. He was very concerned about the long-term availability of the ever-growing number of research databases, most of which do not receive continuous funding. This concern is supported by the "Data Dimensions" report (Key Perspectives 2010), which stated that funding was available for developing databases but that funding to maintain existing databases was more limited. The report by Swan and Brown (2008) noted that there are numerous specialized public databases for genomics research on the web, but because the projects have ended and the staff are gone, the databases remain static and lack a curation plan. It would require a substantial commitment for the library to become involved in preservation of research databases, and some of the other faculty suggestions would also be significant projects, such as providing a system to share data for which no disciplinary repository exists. The author plans to share these requests with library and campus data committees to help inform their discussions of major research data support initiatives.

In general, the interviews were a good opportunity to promote a broader or fresh view of the library. One faculty member's questions about the current state of the library, and about what the library was doing beyond housing the physical collection, provided an opportunity to mention services related to data and scholarly communications. Her questions about access to theses and dissertations prompted a discussion about the institutional repository. Additionally, her comment that the information describing the data generated in her laboratory was very different from the information describing items in the library catalog provided an opportunity to highlight the diverse metadata expertise of librarians. Toward the end of another interview, a faculty member said, "[Our conversation] was very interesting. It brought me a new thought and a new way of seeing the library's function." She also commented, "It sounds like the library's function is expanding significantly." Conversations and comments like these open the door to new ways of working with faculty.

One major challenge is informing more faculty about the library's expanded role. These interviews reached seven faculty members, most of whom had not previously considered the library for help with research data, but this leaves many other faculty members who may also benefit from an updated view of the library. In addition to the author's continued efforts to promote the library's data services, the research data initiatives being discussed at the library and campus level should greatly increase the visibility of existing research data support services.

Conclusion

Unlike previous studies that focused on reasons for non-sharing of data, this study provides a unique perspective in that it targeted faculty members who had publicly shared data and explored their reasons for doing so. Common reasons for sharing data publicly were to meet data sharing expectations of the research field, to help the research community, and to meet funding agency requirements. This study also investigated non-use of repositories and found that, at least in this small sample of faculty members, some are making very conscious decisions about the best way to share their data.

While the faculty members in this study overcame barriers that have prevented other researchers from sharing their data publicly, the participants still experienced challenges. Frequently mentioned challenges included the time and effort required to prepare data to share or to deposit data in a repository, the struggle to balance data sharing with grant and article writing, and complicated instructions and processes for depositing data. Nevertheless, these challenges provide opportunities for librarians to assist faculty with their research by promoting existing library services and systems, referring researchers to other groups on campus, and creating new data services.

Acknowledgements

The author thanks Peg Burnette, Anita Foster and Jean McDonald for providing comments on the final draft of this article.

References

Campbell, E.G., Clarridge, B.R., Gokhale, M., Birenbaum, L., Hilgartner, S., Holtzman, N., and Blumenthal, D. 2002. Data withholding in academic genetics: Evidence from a national survey. Journal of the American Medical Association 287(4): 473-480.

Carlson, J. 2012. Demystifying the data interview: Developing a foundation for reference librarians to talk with researchers about their data. Reference Services Review 40(1): 7-23.

Cragin, M.H., Palmer, C.L., Carlson, J.R., and Witt, M. 2010. Data sharing, small science and institutional repositories. Philosophical Transactions of the Royal Society A 368: 4023-4038.

Data Curation Profiles Toolkit. 2013. [Internet]. [Cited 2013 Jan 24]. Available from: http://datacurationprofiles.org/

Diekmann, F. 2012. Data practices of agricultural scientists: Results from an exploratory study. Journal of Agricultural & Food Information 13(1): 14-34.

Key Perspectives. 2010. Data Dimensions: Disciplinary Differences in Research Data Sharing, Reuse and Long Term Viability. [Internet]. [Cited 2012 Dec 18]. Available from: http://www.dcc.ac.uk/projects/scarp

National Science Foundation. 2013. GPG Summary of Significant Changes. [Internet]. [Cited 2013 Mar 7]. Available from: http://nsf.gov/pubs/policydocs/pappguide/nsf13001/gpg_sigchanges.jsp

Peters, C., and Dryden, A.R. 2011. Assessing the academic library's role in campus-wide research data management: A first step at the University of Houston. Science & Technology Libraries 30: 387-403.

Pryor, G. 2009. Multi-scale data sharing in the life sciences: Some lessons for policy makers. The International Journal of Digital Curation [Internet]. [Cited 2012 Dec 18]; 3(4). Available from: http://www.ijdc.net/index.php/ijdc/article/view/135

Research Information Network. 2009. Patterns of Information Use and Exchange: Case Studies of Researchers in the Life Sciences. [Internet]. [Cited 2012 Dec 18]. Available from: http://www.rin.ac.uk/our-work/using-and-accessing-information-resources/patterns-information-use-and-exchange-case-studie

Scaramozzino, J.M., Ramírez, M.L., and McGaughey, K.J. 2012. A study of faculty data curation behaviors and attitudes at a teaching-centered university. College & Research Libraries 73(4): 349-365.

Swan, A. and Brown, S. 2008. To Share or not to Share: Publication and Quality Assurance of Research Data Outputs. [Internet]. [Cited 2012 Dec 18]. Available from: http://www.rin.ac.uk/our-work/data-management-and-curation/share-or-not-share-research-data-outputs

Tenopir, C., Allard, S., Douglass, K., Aydinoglu, A.U., Wu, L., Read, E., Manoff, M., and Frame, M. 2011. Data sharing by scientists: Practices and perceptions. PLoS ONE [Internet]. [Cited 2012 Dec 18]. 6(6). Available from: http://www.plosone.org/article/info:doi/10.1371/journal.pone.0021101

Westra, B. 2010. Data services for the sciences: A needs assessment. Ariadne [Internet]. [Cited 2013 Jan 17]. 30(64). Available from: http://www.ariadne.ac.uk/issue64/westra

Williams, S.C. 2012. Data practices in the crop sciences: A review of selected faculty publications. Journal of Agricultural & Food Information 13(4): 308-325.

Witt, M., and Carlson, J.R. 2007. Conducting a data interview. Libraries Research Publications [Internet]. [Cited 2012 Dec 18]. Paper 81. Available from: http://docs.lib.purdue.edu/lib_research/81/

Appendix
Interview Introduction and Questions

Hello, I am the Life Sciences Data Services Librarian at University of Illinois. I really appreciate you taking the time for this interview, so that I can learn more about data sharing in the crop sciences and hopefully you can learn more about the library's data services.

I have a series of questions about your data sharing experiences and thoughts and about whether you see a role for the library in facilitating data sharing. I am interested in your honest experiences and thoughts, so there are no right or wrong answers. To give our conversation direction, we'll try to focus on data sharing related to one particular article, as we discussed when scheduling this interview. I will be writing some notes, but to avoid stifling our conversation, the interview will also be audio recorded to aid my note taking.

  1. Could you provide a brief overview of the study and data related to this article?
  2. Why did you (and your co-authors) decide to share additional data beyond what was published in the article?
  3. How did you share the additional data (e.g., supplementary files, repository, university website)?
  4. Why was this method/were these methods chosen?
  5. If data was shared as a supplementary file, do you know if a disciplinary repository existed for this type of data?
  6. What are the benefits of the method(s)?
  7. What are the drawbacks of the method(s)?
  8. How do you imagine people will find this shared data?
  9. Have you ever used or tried to use data shared this way?
  10. What are the benefits and challenges of sharing data?
  11. Do you see a role for the library to facilitate data sharing?
  12. Do you anticipate your data sharing practices will change as more funding agencies expect data to be shared?
  13. Would you like to discuss anything else related to data sharing in general?
