Issues in Science and Technology Librarianship
Fall 2016
DOI:10.5062/F43X84NJ

URLs in this document have been updated. Links enclosed in {curly brackets} have been changed. If a replacement link was located, the new URL was added and the link is active; if a new site could not be identified, the broken link was removed.

[Refereed]

Data Management Practices and Perspectives of Atmospheric Scientists and Engineering Faculty

Christie Wiley
Engineering Research Data Services Librarian
cawiley@illinois.edu

William H. Mischo
Berthold Family Professor of Information Access and Discovery
Grainger Engineering Library Information Center
w-mischo@illinois.edu

University of Illinois at Urbana-Champaign
Urbana, Illinois

Abstract

This article analyzes 21 in-depth interviews of engineering and atmospheric science faculty at the University of Illinois Urbana-Champaign (UIUC) to determine faculty data management practices and needs within the context of their research activities. A detailed literature review of previous large-scale and institutional surveys and interviews revealed that researchers have a broad awareness of data-sharing mandates of federal agencies and journal publishers and a growing acceptance, with some concerns, of the value of data-sharing. However, the disciplinary differences in data management needs are significant and represent a set of challenges for libraries in setting up consistent and successful services. In addition, faculty have not yet significantly changed their data management practices to conform with the mandates. The interviews focused on current research projects and funding sources, data types and format, the use of disciplinary and institutional repositories, data-sharing, their awareness of university library data management and preservation services, funding agency review panel experiences, and struggles or challenges with managing research data. In general, the interviews corroborated the trends identified in the literature. One clear observation from the interviews was that scientists and engineers take a holistic view of the research lifecycle and treat data as one of many elements in the scholarly communication workflow. Data generation, usage, storage, and sharing are an integrated aspect of a larger scholarly workflow, and are not necessarily treated as a separate entity. Acknowledging this will allow libraries to develop programs that better integrate data management support into scholarly communication instruction and training.

Introduction

The growth of data-intensive science and the introduction of federal grant agency data management and sharing plans have changed the way science is conducted (Tenopir et al. 2014). This has led academic libraries to establish programs to assist researchers in the management of research data, including providing critical infrastructure, storage resources, and support in the areas of data management plans, curation, and stewardship (Heidorn 2011; Tenopir 2011; Borgman et al. 2015).

The University of Illinois at Urbana-Champaign Library is actively engaged in efforts to assess researcher needs regarding data management across disciplines. During the fall of 2015, the lead author conducted informal conversations with atmospheric science and engineering faculty to learn more about their research data needs and their awareness of campus data management and preservation services. The responses indicated that faculty received numerous communications from campus and departmental units regarding data management, yet they were often unsure of what services were offered or how to take advantage of them. Additionally, department heads were unsure of how their faculty addressed data management practices and needs, as faculty have varied research interests and data requirements. These discussions serve as the basis of this analysis.

One goal of this study was to corroborate or better understand the data management behaviors identified in the literature, and to use this information to identify needs and gaps in library services and mechanisms that can help meet researchers' data management needs. The UIUC faculty interviews were designed to complement and extend the results of previous surveys of research faculty. The authors will present the results of the interviews and provide some insights and comments regarding the role of data within the research process. We are particularly interested in the role that data plays as faculty groups engage in the scholarly communication process.

Literature Review

There is a great deal of literature on researcher data-sharing practices and attitudes and the role of the library in meeting researcher data management needs. Much of this work has been centered on data management and data-sharing mandates of grant funding agencies (Holdren 2013; Tenopir 2016). In addition to federal agency requirements, a number of publishers have begun to mandate that data be made available for all articles published in specific journals (Van Tuyl and Whitmire 2016). While surveys suggest that faculty are generally aware of agency sharing and re-use mandates, it does not appear they have significantly changed their data management practices because of the mandates (Diekema et al. 2014; Whitmire et al. 2015).

The rapid growth and complexity of e-research technologies and practices have spawned an evolution in scholarly communications models and library support services (Carlson et al. 2011). The rapid transformation of knowledge creation methods presents libraries with an opportunity to be more integrally involved in the end-to-end aspects of knowledge creation (Borgman et al. 2015). Libraries are still defining their roles and relationships in the management, curation, preservation, and dissemination of research data (Carlson et al. 2011; Rolando et al. 2013). In a review paper, MacMillan (2014) presented an overview of data deposition and sharing practices within the scholarly communication workflow and of what librarians need to know about the data life cycle. Johnston and Jeffryes (2014) interviewed graduate students and a faculty member in a civil engineering research group and, from these interviews and observations, developed a blueprint for an instructional program in the library to support engineering data management practices.

A number of surveys and interviews have been conducted with scientists and researchers regarding their data-sharing and data management practices and needs. Several studies have indicated that there is a growing acceptance of data sharing by researchers. Tenopir et al. (2011; 2015) report on two surveys of scientists conducted by the DataONE Assessment group over a four-year period, in 2009-2010 (1,329 responses) and 2013-2014 (1,015 responses). This longitudinal study shows a growing willingness among researchers to engage in data-sharing activities and an increase in actual data-sharing activities. However, the survey also revealed an increase in perceived risk in sharing data. Zinner et al. (2014) surveyed a total of over 3,000 life scientists in two surveys -- in 2000 and in 2013 -- and reported a dramatic shift in the type of data sharing being practiced, from a focus on a peer-to-peer data-sharing environment to a sharing model focused on disciplinary and institutional repositories. A follow-up to the 2013 survey reported by Zinner found that recently enacted data-sharing policies and newly introduced sharing infrastructure and tools have had a significant effect on encouraging data-sharing. In particular, life scientists identified the data-sharing requirements of funding agencies, particularly the NIH and the National Human Genome Research Institute, as mechanisms that promote data sharing (Pham-Kanter et al. 2014).

Van Tuyl and Michalek (2015) surveyed and interviewed Carnegie Mellon faculty and found that 64 percent of all the faculty and 95 percent of the engineering faculty were aware of U.S. grant agency data management plan requirements, but their research practices were not always consistent with best practices. Akers and Doty (2013) surveyed Emory University faculty and found that over 80 percent of the basic science faculty were somewhat or very aware of federal agency data management requirements. Tenopir et al. (2015) identified clear disciplinary differences in data sharing and reuse across scientific subject areas. Weller and Monroe-Gulick (2014) examined the differences in research methodologies, data storage practices, and research challenges across subject disciplines, and recommended that libraries develop services targeted to different disciplines. Federer et al. (2015) also found that a number of significant differences exist between the attitudes and practices of clinical and basic science researchers, including their motivations for sharing, their reasons for not sharing, and the amount of work required to prepare their data. The authors concluded that addressing the unique concerns of diverse research communities is important to encouraging researchers to share and reuse data.

In the same survey of Emory University researchers, Akers and Doty (2013) found important disciplinary distinctions in data management actions and attitudes as well as in the level of interest in support services. The authors encourage academic librarians to develop a range of data management services that can be tailored to unique disciplinary needs.

Kim and Stanton (2016) conducted a survey of 1,317 scientists to examine the extent to which institutional and individual factors influenced data-sharing behaviors. The authors found that different disciplines may have different data-sharing requirements and expectations and that it is important to investigate how both disciplinary and individual factors influence data practices. They also found, in opposition to several earlier studies, that there was no significant correlation between regulative pressure by funding agencies and scientists' data-sharing behaviors. Whitmire et al. (2015) found that Oregon State University (OSU) researchers generate a wide variety of data types, and that practices vary widely between different disciplines and colleges. The authors discovered that faculty are not utilizing the campus-wide storage infrastructure, but are maintaining their own storage servers in surprising numbers. They also found that graduate students and research assistants perform the majority of data-related tasks, except for actual data-sharing procedures, which are primarily the domain of faculty.

Other institutional surveys of researchers reveal the same issues (McClure et al. 2014; Parham et al. 2012; Peters and Dryden 2011; Carlson et al. 2011; Gu and Averkamp 2012). These studies reveal a broad awareness of data-sharing mandates and requirements and a growing acceptance, with some reservations and caveats, of the value of data sharing. However, data management needs differ greatly by discipline and these disciplinary differences are significant and represent a set of key challenges for libraries in pulling together consistent and successful data management services. The complexities and heterogeneity of data and the role data plays in the scientific workflow and the scholarly communication lifecycle complicate the issues. One of our goals in interviewing our group of researchers was to ascertain where data use and management fit into their overarching scholarly workflow. It was clear that the researchers we talked to take a holistic view of the role of data in their work and data management is often not a critical concern, but rather a tool they use to bring their research to a point of publication and dissemination. As Borgman (2015) noted in a letter to Nature, "Because so much about research data is open to personal interpretation, the information can be difficult to describe, represent and manage -- and to share or reuse. The failure to understand these complexities leads to misguided policies for data management and to a lack of investment in both the workforce and the infrastructure for data curation. Ultimately, it can mean that no data survive for research."

This study looks at the data life cycle in the context of the overarching scholarly communication process and seeks to identify data management services that can better meet local and disciplinary needs of engineering and atmospheric scientists.

Methods

The lead author created the interview questions by drawing on the Data Curation Profiles Toolkit (http://datacurationprofiles.org/) and supplementing it with additional questions. The information was gathered by interviewing faculty to discover how data management practices apply to their current research, as well as their perspectives on managing data. The author recruited participants through the UIUC department web sites and by asking engineering research support staff affiliated with the College of Engineering to suggest potential participants. The participants were all active researchers with external support.

The topic areas and questions were designed to use and extend the results of previous surveys of the data practices of researchers. The topics covered during interviews were current research projects, funding sources, data types, format, description, disciplinary repository use, data-sharing, discovery mechanisms, the role of data in the scholarly communication process, awareness of university library provision of data management and preservation services, campus repository usage, agency review panel experiences, and struggles or challenges with managing research data (interview questions are provided in the Appendix). Participants were also asked about their perspectives on who was interested in their data, how long the data were useful, and what parts of the data would be important to preserve over time. If time permitted, participants were asked about the number of graduate students they oversaw and how they stay current on the literature in their field. All interviews lasted approximately 45 minutes.

The author interviewed participants in their campus offices. In addition to being a comfortable environment for the participants, the location provided faculty with easy access to their research laboratories and supporting materials and notebooks. The interviews were recorded and later transcribed verbatim into Microsoft Word documents and imported into NVivo for analysis. The author also analyzed the data to identify main ideas or themes in the areas of awareness of university support, backup, challenges, data management planning, scholarly workflow, grant and data preservation, duration, target audience and external funding to gain additional understanding regarding commonalities and uniqueness among the interviews.

Results and Discussion

Twenty-one individuals (19 male, 2 female) participated in the study. Participants included assistant professors, associate professors, full professors, and a lecturer. Those interviewed represented the following departments within the College of Engineering: Materials Sciences (9), Atmospheric Science (4), Civil Engineering (4), Aerospace Engineering (2), Computer Science (1), and Mechanical Sciences (1).

Data Management Plans (DMP)

The second question asked if any of the funding sources required a data management plan. Participants indicated that the requirement was dependent upon the funding source. Participants were also asked whether funding sources required them to share data with other researchers or preserve data beyond the life of the funding, and whether the dataset was bound by privacy and confidentiality. Eighteen participants (95%) indicated the funding source required them to share data and preserve it beyond the life of the funding. One participant indicated there was no plan to share the data or preserve it beyond the life of the funding. Sixteen participants (84%) indicated the datasets produced were not bound by privacy and confidentiality. Three participants indicated the datasets they were producing were bound by privacy and confidentiality. These participants worked with health and industry collaborators.
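The percentages reported above appear to be computed against 19 respondents rather than all 21 participants, since both pairs of counts (18 + 1 and 16 + 3) sum to 19. The article does not state this denominator explicitly, so the following is only a quick arithmetic check under that assumption; the variable names are illustrative.

```python
def percent(count, total):
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)

# Counts reported in the text.
required_to_share = 18
not_bound_by_privacy = 16

# Assumed denominator: 19 respondents (18 + 1 and 16 + 3 both sum to 19).
assumed_respondents = 19
all_participants = 21

share_pct = percent(required_to_share, assumed_respondents)
privacy_pct = percent(not_bound_by_privacy, assumed_respondents)

# For comparison: the same count against all 21 participants.
share_pct_all = percent(required_to_share, all_participants)

print(share_pct, privacy_pct, share_pct_all)
```

Under this assumption the reported 95% and 84% are reproduced, whereas a denominator of 21 would give a different figure for the first count.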

Funding Sources

The primary funding agencies are listed in Figure 1. Participants also listed other funding sources, including the United States Geological Survey, Department of Defense, United States Department of Agriculture, Illinois Tollway, American Chemical Society, United States Environmental Protection Agency, Federal Railroad Administration, Federal Transit Administration, City of Chicago, and small gifts from private industry. Many of these agencies do not have stringent policies regarding data management.

Figure 1: Funding Sources

Participants were asked if they had experience writing data management plans (DMPs) and if any templates or web resources were used. All participants had experience writing data management plans. Two respondents in this study indicated they had used templates provided by the Grainger Engineering Library. Other responses ranged from seeking assistance from libraries in their former positions to searching on the Internet. One participant specifically stated, "We use boilerplate language in the data management plan because we don't have a problem with data management." While the literature provides many examples of DMP support services and analyses of their content, there is little objective data on the effects that these services have on the content and quality of the DMPs created by researchers (Johnson and Knuth 2016).

Data Type and Description

The third question asked researchers about the type of data they collect or produce. All of the participants in this study produce quantitative data, and they were asked to explain briefly how the datasets are organized or described. Four participants mentioned specific workflows for organizing research data and implementing those workflows with graduate students in research laboratories. One researcher discussed creating a schema developed within their research group to describe their data. One researcher discussed creating column headers (i.e., when using CSV, Excel, or other spreadsheet-type applications). No researchers used any standardized form of description or metadata to describe the data they produce. Many of the interviewees noted that, for the most part, the data generated and processed as part of their projects were a means to a scholarly end, usually the publication of an article or conference paper. While there is interest in the concept of data as a publication, there are no clear guidelines about what a data publication should be or how it should be peer-reviewed (Kratz and Strasser 2015). Data publication has not gained much traction in the scientific community, and the faculty in this study are not currently thinking of data in this way.
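To make the contrast between ad hoc column headers and a documented description concrete, the sketch below writes a small CSV file alongside a "sidecar" record that describes each column. It is an illustration only: the sample values, column names, and metadata fields are hypothetical, they do not follow any formal metadata standard, and nothing like this was reported by the interviewees.

```python
import csv
import io
import json

# Hypothetical measurements; column names and units are illustrative only.
rows = [
    {"sample_id": "S1", "temperature_k": 293.15, "strain": 0.0012},
    {"sample_id": "S2", "temperature_k": 313.15, "strain": 0.0015},
]

# Write the data as CSV with an explicit header row.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
writer.writeheader()
writer.writerows(rows)
csv_text = buffer.getvalue()

# A sidecar record describing the file; the field names are assumptions,
# not a formal metadata standard.
metadata = {
    "title": "Example strain measurements",
    "creator": "Example Research Group",
    "columns": {
        "sample_id": "Specimen identifier",
        "temperature_k": "Test temperature in kelvin",
        "strain": "Measured strain (dimensionless)",
    },
}
metadata_text = json.dumps(metadata, indent=2)

# Check that every CSV column is described in the sidecar.
header = csv_text.splitlines()[0].split(",")
undocumented = [name for name in header if name not in metadata["columns"]]
print(undocumented)
```

A check of this kind could be run by a research group before files are shared or deposited, so that a student who inherits the project can interpret each column without the original author.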

Participants were asked about the number of post-doctoral researchers and graduate students they oversee in conducting research projects. Responses ranged from one to 13 graduate students and post-doctoral researchers. Yet researchers also noted that their ability to provide individual instruction on best data management practices is limited. Half of the participants stated they would be interested in having a librarian provide instruction on best practices for describing data, managing data, and backing up project data.

Data Format and Size

Researchers were asked about the format of the data and for an estimate of file size. ASCII data files were the typical outputs created in a project that produced mechanical, acoustic emission, temperature-dependent, and x-ray data. Text and comma-separated values (CSV) files are used in a computational materials science project. Binary files and images are the output produced in stem cell and tissue engineering. Several projects produce XML files. Atmospheric faculty produce radar, airborne, spaceborne, and satellite data using Network Common Data Form (NetCDF) and Hierarchical Data Format (HDF) files. Atmospheric faculty indicated they produce terabytes of data. Other respondents estimated that they produce thousands of files within the size range of 1-100 gigabytes.

Commercial Data and Storage

Four participants indicated they use commercial data in the context of government web sites and databases. Researchers reported analyzing data produced by others to verify data or test results from journal publications. Researchers discussed collecting and generating data as it related to research projects. Researchers discussed using Box, Gitbucket and GitHub. All 21 researchers indicated that they are allotted various amounts of disk storage space in their departmental cluster and in all cases have dedicated laboratory spaces. Participants did not list reasons they chose a specific storage option, nor did they indicate if they chose more than one. Some participants mentioned using external hard drives and various web resources. The University Library offers up to two terabytes of space per principal investigator through the public data repository, the Illinois Data Bank (http://databank.illinois.edu), which is operated by the Research Data Service.

Researchers mentioned they often back up their program files but rely on students to back up data files. Unfortunately, when students graduate or leave the program, their research data may be lost, either because the student did not provide documentation for the project, or because the student retained the data, did not leave a back-up copy, or refused to provide a copy of the data. The former problem means students who are new to the project have no information regarding previous work. In one specific case a student refused to provide access to the data and University Counsel, the campus legal office, has been contacted to gain access to the data.

Preservation and Value

Engineering and atmospheric researchers indicated that other researchers in the same field of study are the primary users of their data. When asked what data would be the most important to preserve over time, participants responded that raw data were important because they allowed them to retest original theories. Two participants indicated post-processed data were equally important. Although researchers differed regarding the amount of time data are useful, the average amount of time indicated was five years.

Data-Sharing

Engineering and atmospheric researchers' perspectives on sharing suggested they rarely received requests from other researchers interested in their data. All respondents share data internally, among graduate students and within research groups. The tools used to share research include local web sites, Box, IDEALS (the UIUC institutional repository), departmental clusters, and archiving and open source packages. As noted in the literature review, a number of studies have identified different data life cycle needs within different research methodologies and subject disciplines.

Discovery and Repository

While they assigned high importance to the ability of other researchers in their field to find and read their work, interviewees indicated that it is less important that researchers in other disciplines be able to do so. Two subjects felt that their data had some level of importance for the general public (specifically awareness of research related to emergency weather and environmental information). The majority agreed they would like to have their data discovered within a search engine, although one was opposed to this path because of the lack of subsequent control and concern for the possibility of misrepresentation.

Atmospheric faculty mentioned using the repository affiliated with the National Center for Atmospheric Research (NCAR). Engineering researchers indicated either that they deposited data with the National Center for Supercomputing Applications (NCSA), the National Institute of Standards and Technology (NIST), or the National Center for Biotechnology Information (NCBI), or that they did not use a repository at all.

Researchers are interested in the discoverability of data but prefer internal mechanisms (e.g., web sites, campus clusters) or journal publications. Most faculty in this study (84%) were unaware of the campus repository. This lack of awareness is consistent with the research dataset records and individual files deposited within IDEALS. A recent study to identify the number of datasets, types of files deposited, research methodologies, and research discipline or research community within the UIUC institutional repository found that discipline-associated datasets accounted for only 18 percent of dataset records (Wiley 2015). This indicates that discipline-based campus repository usage is low compared to rare books, special collections and datasets associated with farming communities.

Experiences as a DMP Reviewer

Borgman et al. (2015) have stated that "data management is the general rubric of ensuring the integrity, access, and usability throughout the research process and beyond." However, when UIUC researchers were asked about their experiences as DMP reviewers, these comments illustrate their perspective on data management plans:

"People tend to not consider it as much as they should. They take it for granted and it is becoming an issue because the volume of data is getting big."
"Few cases where the researchers were concerned. In the case of computational and experimental engineering collaborators, no clear infrastructure, no outline of how they would make their two different types of data work together."

Faculty members in this study are aware of funding agency requirements and acknowledge complying with them. However, their statements suggest the data management plan is an afterthought or not taken as seriously as it should be. Funding agency requirements and journal article submission policies have made data management a higher priority in the research process. Significant challenges to implementing data management arise from the lack of clear guidelines within the DMP, lack of infrastructure support, large volumes of data, changing accessibility formats, differing views of sharing, and costs of long-term preservation.

Challenges

Researchers indicated they were struggling with data issues concerning the volume of data, long-term preservation, making big data user friendly, backing up and disseminating data, and organization of research workflow. An atmospheric researcher expressed a similar experience within the American Meteorological Society:

"One of the things we have been looking at is how the AMS will be looking at data storage, stewardship and data availability for publishing journals. The society itself does not have any capability to store it. It does not have the money or facilities to do that. We also cannot act as police or as enforcers. If an author states the data is stored somewhere or publicly available then we have to trust that it is. We have recommended authors use a data citation so they cite where the data is located. If an author says that he/she will make the data accessible we will have to trust that they will."

Atmospheric faculty expressed concerns about the volumes of data they generate. In some cases graduate students don't have enough space to store data for research projects. Another concern is the ability to maintain this data for the long-term. One person stated that they keep data for a year after publication, but if there are no subsequent questions or disputes, they discard it.

Implications and Conclusion

The primary goal of this study was to examine the data-sharing practices and attitudes of UIUC faculty in engineering and atmospheric sciences with respect to the researcher behaviors identified in the literature. These behaviors include: an acceptance of the value of data-sharing but a concern, in some instances, about the danger or inadvisability of doing so; an awareness of federal and journal mandates on data-sharing but a lack of skills and known mechanisms to comply properly; and the existence of clearly identified disciplinary differences in data lifecycle methodologies and practices. For the most part, all of these trends were verified.

A second goal was to inform the creation of data services that can better meet faculty and student needs. Conversations with the researchers also revealed that there is a feeling that a sea change has occurred in the way research is conducted: that e-journals and e-books, grid and cloud computing, simulation software, and data analytics tools have produced a collaborative and distributed knowledge creation environment in which they all now work.

Researchers are aware that the library provides data preservation and management services, thanks to the various outreach events provided by the Library's Research Data Service and subject and departmental librarians. Despite this awareness, none of the participants in this study had contacted the library or used the preservation and data management services offered by the university library, although three did use an online template offered by the engineering library. This highlights a disconnect between awareness of services and their actual use.

Scientists and engineers take a holistic view of the research lifecycle and treat data as one of many elements in the scholarly communication workflow, where data management is not necessarily considered as a separate entity. To address this, libraries should develop programs that better integrate data management support into scholarly communication instruction and training.

Although atmospheric researchers reported the use of a disciplinary repository (NCAR), researchers were unaware that the campus institutional repository was available for depositing research data. College of Engineering faculty only mentioned depositing data using infrastructures supported by the National Center for Supercomputing Applications (NCSA) at UIUC. They did not indicate use of any specific engineering-related repository. Most researchers discussed publishing journal articles and associated data files with journal publications. They felt that journal publishers should also bear the responsibility of preservation or maintenance of the data. These indicators highlight the uncertainty around the long-term preservation of data.

In general, researchers are submitting vague data management plans in their proposals to funding agencies. This is somewhat surprising given that most of the researchers offered that a key goal of their research was to enable it to be verified, compared and reproduced by other researchers in the same discipline.

To date, universities and libraries have introduced services designed to make researchers aware of funding agency mandates, assist with data management plan analysis, and provide data storage options. While this study indicated that researchers were, to some extent, aware of funding mandates, none of the faculty had actually contacted the library to obtain assistance with data management and only three had used an online data management plan template. It is clear that many researchers view data management planning as a formality and that funding agencies are not providing clear expectations for data management plans. This should change as funding agencies are now providing more detailed guidelines for both open data and open access requirements. Libraries need to stay abreast of these developments and devise services that take into account researchers' lack of skills and expertise in meeting these requirements.

The subjects in this study noted that they need assistance with storage, back-up, long-term preservation and archiving of data. Although researchers are currently complying with funding agency mandates, they are also in need of solutions for the volumes of data produced, the ability of colleagues to easily and effectively access the data, and for the implementation of better workflow mechanisms within their distributed research groups. Their holistic view of the overarching research lifecycle and the role of data as one of many elements in the scholarly communication workflow should be taken into account when designing these support services.

This study confirmed that both engineering and atmospheric science faculty rely heavily on their graduate students for awareness of current literature, the organization of data storage requirements, the creation of metadata, and for carrying out the experiments that actually generate the data. While faculty typically set up the beginning of the publication cycle, they may be unsure of how graduate students follow best practices for describing and organizing data. Work is currently underway to study this process with graduate students and postdoctoral researchers working on projects funded by the National Institutes of Health (NIH).

Finally, the interviews helped to identify other questions that might have been missed using another method. The interview format also provided the opportunity to create connections with faculty, share knowledge, and learn more about their research. Since completing the interviews, the authors have expanded these conversations to other engineering departments and are creating focused instruction modules on data management planning and on best practices for the organization, back-up, and effective access of data. Future work will leverage the knowledge gleaned from the study to explore more tailored and effective data management services.

References

Akers, K.G. & Doty, J. 2013. Disciplinary differences in faculty research data management practices and perspectives. International Journal of Digital Curation 8(2): 5-26. doi:10.2218/ijdc.v8i2.263

Borgman, C., Darch, P.T., Sands, A.E., Pasquetto, I.V., Golshan, M.S., Wallis, J.S. & Traweek, S. 2015. Knowledge infrastructures in science: data, diversity, and digital libraries. International Journal on Digital Libraries 16:207-227. doi:10.1007/s00799-015-0157-z

Borgman, C. 2015. Data management: One scientist's data as another's noise. Nature 520:157. doi:10.1038/520157d

Carlson, J., Fosmire, M., Miller, C.C., & Nelson, M.S. 2011. Determining data information literacy needs: a study of students and research faculty. portal: Libraries and the Academy 11(2):629-657. doi:10.1353/pla.2011.0022

Data Curation Profiles Toolkit. 2013. [Internet]. Available from: http://datacurationprofiles.org/

Diekema, A.R., Wesolek, A., & Walter, C.D. 2014. The NSF/NIH Effect: Surveying the Effect of Data Management Requirements on Faculty, Sponsored Programs, and Institutional Repositories. Journal of Academic Librarianship 40:322-331. doi:10.1016/j.acalib.2014.04.010

Federer, L.M., Lu, Y. L., Joubert, D.J., Welsh, J. & Brandys, B. 2015. Biomedical data sharing and reuse: Attitudes and practices of clinical and scientific research staff. PLoS ONE 10 (6):e0129506. doi: 10.1371/journal.pone.0129506

Gu, X. & Averkamp, S. 2012. Report on the University of Iowa Libraries' data management needs survey. [Internet]. Available from: http://blog.lib.umn.edu/lmcguire/hslm/Data_Management_at_UIowa_SurveyReport_20121121.pdf

Heidorn, P. Bryan. 2011. The emerging role of libraries in data curation and e-science. Journal of Library Administration 51:662-672. doi: 10.1080/01930826.2011.601269

Holdren, J. P. 2013. Increasing access to the results of federally funded scientific research. Washington, D.C.: Office of Science and Technology Policy. Available from {https://web.archive.org/web/20161020065541/https://www.whitehouse.gov/sites/default/files/microsites/ostp/ostp_public_access_memo_2013.pdf}

Johnson, A.M. & Knuth, S. 2016. Data Management Plan Requirements for Campus Grant Competitions: Opportunities for Research Data Services Assessment and Outreach. Journal of eScience Librarianship 5(1):e1089. doi:10.7191/jeslib.2016.1089

Johnston, L. & Jeffryes, J. 2014. Data management skills needed by structural engineering students: Case study at the University of Minnesota. Journal of Professional Issues in Engineering Education and Practice 140 (2):05013002. doi:10.1061/(ASCE)EI.1943-5541.0000154

Kim, Y., & Stanton, J. 2016. Institutional and individual factors affecting scientists' data sharing behaviors: A multilevel analysis. Journal of the Association for Information Science and Technology 67(4):776-799. doi:10.1002/asi.23424

Kratz, E. & Strasser, C. 2015. Researcher perspectives on publication and peer review of data. PLOS One 10(4): e0122337. doi:10.1371/journal.pone.0117619

MacMillan, D. 2014. Data sharing and discovery: What librarians need to know. Journal of Academic Librarianship 40(5):541-549. doi:10.1016/j.acalib.2014.06.011

McClure, M., Level, A. V., Cranston, C. L., Oehlerts, B. & Culbertson, M. 2014. Data curation: A study of researcher practices and needs. portal: Libraries and the Academy 14(2):139-164. doi:10.1353/pla.2014.0009

Parham, S. W., Bodnar, J. & Fuchs, S. 2012. Supporting tomorrow's research: Assessing faculty data curation needs at Georgia Tech. College & Research Libraries News 73(1):10-13. Available from: http://crln.acrl.org/content/73/1/10.short

Peters, C. & Dryden, A.R. 2011. Assessing the academic library's role in campus-wide research data management: A first step at the University of Houston. Science & Technology Libraries 30(4):387-403. doi:10.1080/0194262X.2011.626340

Pham-Kanter, G., Zinner, D.E. & Campbell, E.G. 2014. Codifying collegiality: Recent developments in data sharing policy in the life sciences. PLoS ONE 9 (9):e108451. doi:10.1371/journal.pone.0108451

Rolando, L., Doty, C., Hagenmaier, W., Valk, A. & Parham, S.W. 2013. Institutional readiness for data stewardship: findings and recommendations from the research data assessment. Technical Report, Georgia Institute of Technology, Atlanta. Available from https://smartech.gatech.edu/bitstream/handle/1853/48188/Research+Data+Assessment+Final+Report.pdf;jsessionid=DE3DBA9225CB17572E1EDEBB6097D992.smart2?sequence=4

Tenopir, C., Allard, S., Douglass, K., Aydinoglu, A.U., Wu, L., Read, E., Manoff, M. & Frame, M. 2011. Data sharing by scientists: practices and perceptions. PLoS ONE 6(6):e21101. doi:10.1371/journal.pone.0021101

Tenopir, C., Hughes, D., Allard, S., Frame, M., Birch, B., Baird, B., Sandusky, R., Langseth, M. & Lundeen, A. 2015. Research Data Services in Academic Libraries: Data Intensive roles for the future. Journal of eScience Librarianship 4(2):e1085. doi:10.7191/jeslib.2015.1085

Van Tuyl, S.V. & Michalek, G. 2015. Assessing research data management practices of faculty at Carnegie Mellon University. Journal of Librarianship and Scholarly Communication 3(3). doi:10.7710/2162-3309.1258

Van Tuyl, S.V. & Whitmire, A. L. 2016. Water, Water, Everywhere: Defining and Assessing Data Sharing in Academia. PLoS ONE 11(2):1-16. doi:10.1371/journal.pone.0147942

Whitmire, A. L., Boock, M. & Sutton, S. C. 2015. Variability in academic research data management practices: implications for data services development from a faculty survey. Program: Electronic library and information systems 49(4):382-407. doi:10.1108/PROG-02-2015-0017

Wiley, C.A. 2015. An Analysis of Datasets within Illinois Digital Environment for Access to Learning and Scholarship (IDEALS), the University of Illinois Urbana-Champaign Repository. Journal of eScience Librarianship 4(2): e1081. doi:10.7191/jeslib.2015.1081

Zinner D., Pham-Kanter G. & Campbell, E.G. 2016. The Changing Nature of Scientific Sharing and Withholding in Academic Life Sciences Research: Trends from National Surveys in 2000 and 2013. Academic Medicine 91(3): 433-440. doi:10.1097/ACM.0000000000001028

Appendix

Interview Questions

1. Please describe your current research project(s).
(a) How long has this project been going on?
(b) Are there any external institutions involved outside UIUC?
(c) Is there any physical infrastructure, such as deployed sensors or laboratory spaces?
 
2. What are your funding sources for this research?
(a) Do any of your funding sources require a data management plan?
(b) Do they require you to share your data with others, publish data, or deposit data into a repository?
(c) Do they require you to preserve your data beyond the life of the funding?
(d) Is this dataset bound by privacy or confidentiality?
 
3. What type of quantitative data or qualitative data is collected or produced? (e.g., for your most recent paper or research project)
 
4. Thinking of your research project: Do you use commercial data? Did you analyze data produced by others or did you collect or generate data?
 
5. What format is the data in?
(a) How many data files exist?
 
6. Please explain briefly how your datasets are organized. How are they described? (e.g., detailed annotations, a code book, a data dictionary, column headings in a spreadsheet)
 
7. Do you have any standardized forms of description or metadata? If so, please identify the standards.
 
8. What would be the most important part of the data to preserve or maintain over time?
 
9. How long would the dataset be useful or have value to you or others?
 
10. Who would you imagine would be interested in this data? (e.g., other researchers within this field, researchers in other disciplines, policy makers)
 
11. How would you imagine this data being used by the group or groups of people you listed?
 
12. Would you place any conditions on these people using this data? If so, what would they be?
 
13. Have you used a disciplinary repository in the past?
(a) Would you consider using one?
(b) Do you support a peer using a disciplinary repository?
 
14. Still thinking of the research project: Will a journal article or report be produced from the data within this project?
 
15. How do you stay aware of current literature in your field?
 
16. Please rank each of the following as most important, least important, or not important at all:
(a) The ability for researchers within my discipline to easily find my dataset
(b) The ability for researchers from outside of my discipline to easily find this dataset
(c) The ability of the general public to easily find this dataset
(d) The ability for people to easily discover this dataset using an Internet search engine
 
17. Have you been part of an agency review panel or process that looked at data management plans submitted with grant applications? If so, please tell me about your experience.
 
18. Data can be shared internally within a project, with your future graduate students, and outside of your research group with other researchers. How do you currently share your data?
 
19. Where do you store data for your current research projects?
 
20. Are you aware that the university library offers data management and preservation services?
 
21. Are you aware the university library has a campus repository?
 
22. Would you be interested in having a librarian provide instruction on best practices for describing data, managing data, and backing up data for projects?
 
23. What areas are you struggling with in regards to your data? Is there an area of data management that you feel you have no good or efficient solutions for at this moment?
 


This work is licensed under a Creative Commons Attribution 4.0 International License.