Building Research Data Management Infrastructure in Canada from the Bottom-up
Charles (Chuck) Humphrey
University of Alberta

Where’s Canada today?

The following three statements best summarize the situation in Canada today around research data management and preservation.

1. A strategic shift has occurred over the past decade from building a national data preservation institution to building national research data management infrastructure.
2. Building this national research data infrastructure is taking place from the bottom-up.
3. Building this infrastructure from the bottom-up requires intentional, collaborative actions. The driving principle is one of cooperation, not control.

1. Shift from institution to infrastructure

- 2001-2002: National Data Archive Consultation
- 2003: UNESCO Charter on Preservation of Digital Heritage
- 2004: OECD Access to Publicly Funded Research Data
- 2005: Consultation on Access to Scientific Research Data
- 2006-2007: Canadian Digital Information Strategy
- 2007: International Data Forum

Understanding infrastructure

“[C]yberinfrastructure is the set of organizational practices, technical infrastructure, and social norms that collectively provide for the smooth operation of scientific work at a distance” (p. 6).

Source: Understanding Infrastructure: Dynamics, Tensions, and Design. P. Edwards, S. Jackson, G. Bowker and C. Knobel. January 2007.

Research data management infrastructure

- RDMI is the configuration of staff, services, and tools assembled to support data management across the research lifecycle and, more specifically, to provide comprehensive coverage of the stages making up the data lifecycle. It can be organized locally and/or globally to support research data activities across the research lifecycle.

Source: Capitalizing on Big Data: Toward a Policy Framework for Advancing Digital Scholarship in Canada. Appendix 4: Definitions.

2. Bottom-up development

- The Brewster Kahle principle: just build it
  - November 12, 2009 CARL Directors’ meeting in Ottawa
- Levels of data stewardship responsibilities
  - The research project level
  - The local institutional level
  - The wider stakeholder level
    - Across regions, Canada, and the globe
    - Across domains and research programs
    - Across sectors

Stewardship levels and the research lifecycle

Institutional Research Lifecycle

Source: Paul Jeffreys. Data Management at Oxford. March 2012.

Policy at the institutional level

At the wider research stakeholder level

A few examples:

- Individual institutions across sectors
  - Canada’s International Polar Year and the development of the IPY Data Assembly Centre Network and its transformation into the Canadian Polar Data Network
- Consortia within a region
  - OCUL/SP cloud storage project
  - OCUL/SP Dataverse Network
- Regional consortia
  - The Canadian Social Science Research Data Private LOCKSS Network
  - Shared functionality enhancements to Archivematica for research data
- National membership
  - The CARL Research Data Management Institute

3. Successful bottom-up characteristics

- Capitalize on the energy driving the sense of urgency around sharing and preserving research data, which is bringing forward potential partners across sectors and institutions.
- As we identify potential partners, we have begun to change the metaphors that we use to describe the organizational representations of research data infrastructure.
  We have gone from “data landscapes” to “data ecosystems.”

The data landscape

[Figure: the data landscape. Access function (immediate access, short to mid-term access, long-term access) plotted against orientation (individual centric, domain centric, institutional centric), with a sustainability axis. Elements shown: websites, FTP sites, domain web portals, data centres, domain archives, data libraries, staging repositories, institutional repositories.]

Research data ecosystem

Building of a successful collaboration

- Take steps to build trust among partners. Trust doesn’t always come from the tops of organizations, so let those who are passionate within organizations find solutions by working together with their counterparts across organizations.
- Prepare a charter that expresses the norms for working together and the common commitment to the task of research data preservation. A good example is the Charter of the Canadian Polar Data Network (see http://polardatanetwork.ca/wp-content/uploads/CPDN_Governance.pdf).
- Develop a set of policies to serve as a foundation for the shared research data management infrastructure.

Data policy document framework

Building of a successful collaboration

- Develop blueprints for new research data management infrastructure in teams across institutions, and do what is possible now while laying the groundwork for what can be incorporated in the future. The CARL Canadian National Collaborative Data Infrastructure proposal serves as one such blueprint.
- Pool resources to get infrastructure in place.
  - The CWAP project is an example of this approach. It is an initiative jointly funded by the University of Alberta, UBC, and SFU to add research data preservation functionality to Archivematica, a tool developed by Artefactual Inc in Vancouver that produces high-quality archival information packages. More recently, OCUL/SP has expressed interest in contributing to this development, and the University of Saskatchewan has a project to extend functionality between Islandora and Archivematica.
- Integrate and be open to new partners, including a variety of designated user communities.

“By what authority …”

- Build trust among the communities being served.
- Demonstrate competencies in delivering services.
- Develop a positive reputation around trust and competencies.
- Operate from a data culture that incorporates norms of best practice and in which rewards are only part of the reason for engaging.

An RDMI agenda for the Canadian Research Data Management Network

- RDM policy and resource coordination
  - Develop, promote, interpret, and review RDM policies
  - Collaboratively raise resources for joint RDM projects
- Services
  - Coordinate service delivery across the data lifecycle: planning, managing, sharing, discovering, repurposing, and preserving research data
- Tools and technology
  - Identify, evaluate, and develop tools supporting DM across the research lifecycle
  - Identify, evaluate, and develop preservation tools and technology
- Expertise
  - Train stakeholders and upgrade their DM skills across the lifecycle
  - Develop and advance RDM specializations
- Locally or globally