ZfBB 61 (2014) 4–5

Libraries as e-infrastructure

Wolfram Horstmann, Wouter Schallier, Jarkko Siren, Carlos Morais-Pires

Libraries have served as education and research infrastructures for centuries. In this paper, we describe major opportunities and future challenges in the context of digital research and the »e-infrastructures« that are required for e-science. We provide examples of current involvements and focus on the importance of cooperation at local, international (specifically European) and global scale.

Libraries as infrastructures

For centuries, libraries were a major, if not the main, research infrastructure of academic institutions. They started off by holding the manuscripts and prints of researchers working at the institution, in times when reproduction of scholarly work was the exception and scholars had to travel around the world to gain insights into the works of other scholars. When the reproduction of scholarly works became easier, libraries were able to collect a large segment of the world's knowledge and make it accessible to researchers and students. Library buildings were usually established at the heart of the campus to perform their organizational function for the circulation of knowledge and to serve as a sanctuary for study, where researchers and students could be among themselves and could receive advice from librarians, in many cases scholars themselves.

In parallel, experimental research became a major paradigm, and laboratories containing scientific apparatus became a major part of the institutional infrastructure. Laboratories started to produce knowledge resources that were usually not kept in libraries, namely research data and artifacts that underpin research findings. Libraries and laboratories – text and data – coexisted in a highly entangled form as research infrastructure partners for more than 300 years.

Today, three rapid and radical developments bring libraries as infrastructures to a whole new level. First, digital knowledge resources are largely location independent. Second, and relatedly, research has become collaborative and distributed. Third, and most significantly for our question about the role of libraries as research infrastructures, data and the software used to process it – forming compound objects representing virtualized experimental artifacts – have become primary research outputs themselves. As text, data and software become more and more integrated, the resulting challenge for research infrastructure is how to sustain these new research objects.

Text expressing the researcher's narrative of ideas and methods was for a long time the sole authoritative record of research. Libraries have been keeping this record and providing access to it for society, independently of changing publishing mechanisms. Since there are now new forms of compound knowledge objects that need to be kept as authoritative records of research, the question is how libraries, laboratories and computing centers can work together to maintain a record of research that can be reliably accessed now and by future generations.

Libraries and e-science

The rules of the scientific and educational system have changed tremendously with the use of information and communication technologies.
Huge amounts of data are produced and can be immediately made available, interpreted, processed, enriched, stored and preserved. The old paradigm that access to research output must be slow, difficult and expensive in order to be of high quality is no longer valid. Traditional mechanisms for guaranteeing quality, such as peer review, have proven not to be 100 % reliable and seriously slow down the review process. Expensive licenses have made access to research output hardly affordable for most research institutes. [1] Furthermore, copyright systems lack the flexibility to allow for text and data mining. [2]

As an answer to the need for quick, easy, affordable and permanent access to research output, libraries have built digital repositories. A repository brings together all scientific output of an institution or a project. Libraries are widely recognized as a superior source of quality content, but they need to make more effort to increase the visibility of the content stored in these repositories. [3] According to several studies, a large share of the papers published in academic journals (10–90 %, depending on the field) remains uncited. Libraries can contribute to a more efficient and transparent scientific ecosystem in the e-science age. Interoperability standards, metadata enrichment, linked data, and convergence of metadata schemes will give high-quality scientific output more visibility.
Libraries also need to aim at a full integration of formal publications (books, papers) with other content types such as grey literature, research data, software, audio, video, learning objects, etc. Finally, repositories give governments, funding agencies, and research institutes insight into the impact of the research that they support.

Since preservation of research output is no longer limited to institutional and format-related boundaries, preservation becomes more complex. On the other hand, it is also an opportunity for libraries to organize preservation as a collaborative, global effort. The care for educational and scientific information as a public good also poses challenges for governments and policymakers. The emerging compound knowledge objects produced in collaborative research activities require a diverse set of services beyond the basic remit of storage; they should include easy-to-use services for deposit, registration, quality control, discovery, and access. These are supplemented with information-age infrastructure elements, such as semantic standards, specialist query and visualization tools, preservation services and elements which sustain critical characteristics of the repository materials: their integrity, authenticity, usability, and their ability to be understood and discovered.

Libraries in e-infrastructures

To derive the greatest benefit from research data and any other form of research output, it is fundamental that library services for e-science are connected to state-of-the-art information and communication infrastructures, also termed e-infrastructures. These infrastructures include high-performance computing resources, fast networks, as well as information storage, access and management structures. Thanks to a long history of co-operation, libraries are well suited to develop digital information infrastructures as a collaborative effort.
Recent examples of such innovative efforts involving big consortia include OpenAIRE (Europe) [4], SHARE (USA) [5] and COAR (global) [6]. Examples of services include coordinated advocacy and support (e. g. OpenAIRE National Open Access Desks or NOADs) [7], information aggregation services building on institutional repositories, reporting services for research funders and institutions, and integration of all research outputs in »enhanced publications«, »executable papers« and finally through researcher workflows. [8] The concept of the Virtual Research Environment has been proposed as a working environment – for all sorts of scientific disciplines – that integrates all these elements and connects them with the underlying e-infrastructure. [9]

Research libraries and data centers are both immersed in the transition imposed on them by the adoption of e-science practices by the communities they serve. [10] The complementary roles they are adopting as providers of e-infrastructure services were described by the ODE project. [11] The services provided by libraries and data centers must necessarily be aligned to provide the integrated data and text products as well as the comprehensive workflows that can best support e-science research practices. [12] The study on Authentication and Authorisation Infrastructures (AAI) in research conducted jointly by LIBER (Association of European Research Libraries) and TERENA (Trans-European Research and Education Networking Association) is an example of how libraries and data centers can collaborate in developing common services to support e-science.
It includes case studies that show how inter-institutional collaboration can be improved through the libraries' involvement in e-infrastructures. [13] More generally, libraries have significant potential to provide information services for collaborative science. [14] The DataONE [15] project and infrastructure also illustrate how libraries collaborate and provide services in linking data with publications as well as support for research data management. The FORCE11 community initiative [16] involves many librarians in its efforts to improve scholarly communication, including enhanced publications as well as citation of research data. In this context, libraries carry out the institutional implementation of global approaches for providing unambiguous research information, e. g. ORCID [17] or FundRef [18] for authors and academic institutions. And, of course, libraries are getting heavily involved in research data management. [19]

It is crucial for libraries to be involved in the development of infrastructures that ensure new ways of using scientific information, [20] a task that may require new partnerships. In e-science this includes the creation of machine-readable scientific records and text and data mining tools. Libraries need to participate in the current debate on legal reforms relating to these technologies (a summary of the discussions as well as proposals for reform are described in the report of the Text and Data Mining Expert Group [21]). [22]

Libraries as sustainable hosts

In today's quickly evolving research world, libraries provide a sustainable framework for specialized services. arXiv [23], probably the world's most renowned open access repository, is operated by Cornell University Library. PubMed [24], the world's most authoritative bibliography in the medical and life sciences, is operated by the National Library of Medicine.
And DataCite [25], the world's most important service for providing persistent addresses for research data on the Internet, is managed by the Technical Information Library of Germany in collaboration with many libraries, such as the British Library and the California Digital Library, as well as many research institutions around the world.

New skills for librarians

Libraries have transformed their skillset over the last decades. In order to adapt to researchers' new requirements, libraries had to hire business analysts and staff with an academic background for digital scholarship support. Digital library systems require highly trained administrators and developers, and repositories require metadata as well as copyright specialists.

Since research data and software have become primary research assets that often require guarantees for permanent access, libraries can provide a safe harbor for digital research objects in a dynamic environment of mobile researchers, volatile repository content, transient products and short-lived standards. Libraries now need to tackle the challenge of making data and software reliably accessible and re-usable. This requires a transformational approach to library services and the development of new skills. Tasks such as the curation and stewardship of new research objects – data and software – will imply a profound revision of library and information science curricula, certificates and trainings, direct involvement in research projects, as well as learning on the job. Librarians will not become experts in data analytics, which is evolving as its own discipline. But they can become stewards who provide a sustainable basis for data scientists to work on.
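
One concrete stewardship task behind "reliably accessible" data is fixity checking: recording a checksum for every file at deposit time and later verifying that nothing has silently changed. The following is a minimal sketch only, not a description of any particular repository platform; the function names (`sha256_of`, `build_manifest`, `changed_files`) and the manifest layout (relative path mapped to SHA-256 digest) are assumptions chosen for illustration.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file below root (by relative POSIX path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def changed_files(root: Path, manifest: dict[str, str]) -> list[str]:
    """List files that were added, removed, or altered since the manifest."""
    current = build_manifest(root)
    return sorted(
        name
        for name in manifest.keys() | current.keys()
        if manifest.get(name) != current.get(name)
    )
```

In such a workflow the manifest would be stored alongside the deposited dataset, and `changed_files` would be run periodically; an empty list suggests that the deposited files are still bit-for-bit identical to what was received.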

Local cooperation

The new roles of libraries in e-infrastructures have significant implications for cooperation across the campus or the research institution. New forms of cooperation with researchers are emerging: one-to-one support and copyright advice for depositing publications, but also data consultancy in information-intensive research projects. Interfaces to the financial and administrative systems of the research institutes need to be built in order to reliably link publications and data to research projects. And in all instances libraries need to closely align their activities with the computing services of the research institute to enable a seamless operation of services. Thus, virtual teams across libraries, computing services and research offices are being set up to tackle new challenges such as Open Access publishing and Research Data Management.

Global cooperation

The grand challenges of the 21st century transcend borders, and science will be increasingly global. Data-driven science will require extensive global collaborations, and researchers on each continent are striving for a leading role in the world's production of knowledge. Research data itself is global and the key issues to consider are: [26]

1. How data can be networked
2. How to envision and set up data governance on a global scale
3. How the EU can play a leading role in helping start and steer this global trend.

An international group of research funders has been supporting the set-up of the Research Data Alliance (RDA) to enable data exchange on a global scale.
The initial phase of RDA has been supported by the European Commission, the US National Science Foundation and National Institute of Standards and Technology, and the Australian Ministry of Research, with research funders from other countries becoming actively involved. [27] RDA is being set up to bring a diversity of stakeholders together and improve interactions between users and technology and service providers. RDA is a bottom-up, community-led initiative to foster global interoperability across geographic and disciplinary boundaries. RDA is open: those who want to participate in RDA and shape the way the global data infrastructure operates are invited to join and take the lead on concrete initiatives. It is focused on the real needs of the research communities and will seek links with industry. It aims at being the place where practitioners stop discussing the ideal solution and/or the complete set of standards and start implementing practical solutions for data sharing and related issues. Libraries are already active, and even in leading roles, in several RDA working groups. [28]

Global initiatives such as COAR (Confederation of Open Access Repositories) [29] bring together several major regional repository networks from Australia, Canada, China, Europe, Latin America and the United States. COAR's ambitions are to develop sustainable repository networks all over the globe, align these networks and make them fully interoperable, increase the impact of repository content, and provide training and support. Organisations like EIFL [30], World Bank [31], UNESCO [32], ECLAC [33] and others actively promote open access to knowledge as a motor for socio-economic development.

Future Horizons

The new European research funding framework reflects the challenges for the next years.
Similar examples can be found in other research funding programs in different parts of the world. Horizon 2020, the EU Framework Programme for Research and Innovation, was adopted in December 2013. A quote from the regulation starts: »Horizon 2020 should support the achievement and functioning of the European Research Area in which researchers, scientific knowledge and technology circulate freely, by strengthening cooperation both between the Union and the Member States, and among the Member States, […]«. [34]

Horizon 2020 is also open to the participation of non-European countries. The funding scheme includes support for international partnerships, e. g. in the domain of scientific information, data and computing-intensive science (areas relevant for COAR, RDA, etc.). Horizon 2020 covers the period 2014–2020 with a budget of approximately 80 billion Euros. Its macro structure is based on three interrelated pillars: »Horizon 2020 pursues three priorities, namely generating excellent science (›Excellent science‹), creating industrial leadership (›Industrial leadership‹) and tackling societal challenges (›Societal challenges‹). Those priorities should be implemented by a specific programme consisting of three Parts on indirect actions and one Part on the direct actions of the Joint Research Centre (JRC).« [35]

The research infrastructures (RI) priority is part of the »Excellent science« pillar and includes e-infrastructures, a. k. a. Information and Communication Technologies Infrastructures, offering services for high-speed connectivity, high-performance computing and research data management. It aims at developing a strong European research capacity in terms of instruments, installations and equipment to cope with the most demanding requirements for pushing forward the frontiers of scientific knowledge.
The actions on e-infrastructures, as recently published in the Horizon 2020 Work Programme 2014–15, cover data-intensive science and engineering, high-performance computational infrastructure, research and education networks, virtual research environments, and e-science software environments. These actions provide opportunities for partnerships of scholarly communication and data management experts from libraries and scientific communities with e-infrastructure service providers capable of exploring the technologies and know-how for data management supported by high-bandwidth communication, high-performance computing, open scientific software, and virtual research environments.

Conclusions

Libraries' support for research is evolving. The key competency of information provision stays, albeit in an increasingly digital form. But the more tacit role of the library as a service organization that can provide sustainable support for knowledge resources is pushed to the foreground. Text-based resources are complemented by research data. Involvement in digital research methods and operation of software resources becomes a must. Libraries build virtual teams with research offices and computing centers on both a local and a global level and become an integral part of a global e-infrastructure for research.

Acknowledgements

The authors would like to thank their colleagues for rich and fruitful discussions.

[1] See COM(2012) 0401 final of 17. 7. 2012. Available at: http://eur-lex.europa.eu/legal-content/EN/TXT/?qid=1398254671867&uri=CELEX:52012DC0401
[2] See http://ec.europa.eu/research/innovation-union/pdf/TDM-report_from_the_expert_group-042014.pdf
[3] Kenning Arlitsch, Patrick S. O'Brien (2012). Invisible institutional repositories: Addressing the low indexing ratios of IRs in Google Scholar, Library Hi Tech, Vol. 30, Iss. 1, pp. 60–81.
Available at: http://dx.doi.org/10.1108/07378831211213210
[4] https://www.openaire.eu
[5] www.arl.org/news/arl-news/2773-shared-access-research-ecosystem-proposed-by-aau-aplu-arl
[6] https://www.coar-repositories.org
[7] Birgit Schmidt, Iryna Kuchma (2012). Implementing Open Access Mandates in Europe. Available at: www.univerlag.uni-goettingen.de/content/list.php?notback=1&details=isbn-978-3-86395-095-8
[8] See e. g. the LIBER Quarterly special issue Vol 23, No 4 (2014), http://liber.library.uu.nl/index.php/lq/issue/view/528
[9] Leonardo Candela, Donatella Castelli, Pasquale Pagano (2013). Virtual Research Environments: an Overview and a Research Agenda. Available at: http://dx.doi.org/10.2481/dsj.GRDI-013
[10] See Carlos Morais Pires, Jean-Claude Guédon, Alan Blatecky. Scientific Data Infrastructures: Transforming Science, Education, and Society; Zeitschrift für Bibliothekswesen und Bibliographie 60 (2013).
[11] Susan Reilly, Wouter Schallier, Sabine Schrimpf, Eefke Smit & Max Wilkinson (2011). Report on integration of data and publications, Opportunities for Data Exchange project (ODE). Available at: www.alliancepermanentaccess.org/wp-content/uploads/downloads/2011/11/ODE-ReportOnIntegrationOfDataAndPublications-1_1.pdf
[12] See Carole Goble, David de Roure. The Impact of Workflow Tools on Data-centric Research (in Tony Hey et al (ed.). The Fourth Paradigm: Data-Intensive Scientific Discovery). Available also at http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_part3_goble_deroure.pdf
[13] See Advancing Technologies and Federating Communities, A Study on Authentication and Authorisation Platforms for Scientific Resources in Europe: Final Report. Available at http://bookshop.europa.eu/en/advancing-technologies-and-federating-communities-pbKK0413180/
[14] See Ellen Collins, Michael Jubb. Information Handling in Collaborative Research. LIBER Quarterly, [S. l.], v. 22, n. 4, p. 331–344, feb. 2013. ISSN 2213-056X.
Available at: http://liber.library.uu.nl/index.php/lq/article/view/URN%3ANBN%3ANL%3AUI%3A10-1-114291/8709. Date accessed: 24 Apr. 2014.
[15] www.dataone.org/for-librarians
[16] https://www.force11.org/
[17] www.orcid.org/
[18] www.crossref.org/fundref/
[19] See Christopher J. Shaffer (2013). The Role of the Library in the Research Enterprise, in: Journal of eScience Librarianship 2(1): Article 4. http://dx.doi.org/10.7191/jeslib.2013.1043
[20] See Herbert van de Sompel, Carl Lagoze. All Aboard: Toward a Machine-Friendly Scholarly Communication System, in: Tony Hey et al (ed.). The Fourth Paradigm: Data-Intensive Scientific Discovery. Available at: http://research.microsoft.com/en-us/collaboration/fourthparadigm/4th_paradigm_book_part4_sompel_lagoze.pdf
[21] See http://ec.europa.eu/research/innovation-union/pdf/TDM-report_from_the_expert_group-042014.pdf
[22] See Lucie Guibault, Andreas Wiebe (Eds.) (2013). Safe to be Open: Study on the Protection of Research Data and Recommendation for Access and Usage. Available at: www.univerlag.uni-goettingen.de/content/list.php?details=isbn-978-3-86395-147-4
[23] www.arxiv.org
[24] www.ncbi.nlm.nih.gov/pubmed
[25] https://www.datacite.org
[26] See F. Friend, H. Van de Sompel, J-C. Guédon. Beyond Sharing and Re-using: Toward Global Data Networking. Accessed at: https://europe.rd-alliance.org/Repository/document/Publications%20and%20Reports/Toward-Global-Data-Networking.pdf
[27] See https://www.rd-alliance.org/; Fran Berman.
Building Global Infrastructure for Data Sharing and Exchange Through the Research Data Alliance, www.dlib.org/dlib/january14/01guest_editorial.html, doi:10.1045/january2014-berman
[28] Beth Plale, »Synthesis of Working Group and Interest Group Activity One Year into the Research Data Alliance«, www.dlib.org/dlib/january14/plale/01plale.html, doi:10.1045/january2014-plale
[29] www.coar-repositories.org/
[30] www.eifl.net/
[31] https://openknowledge.worldbank.org/
[32] http://en.unesco.org/themes/building-knowledge-societies
[33] http://repositorio.cepal.org
[34] REGULATION (EU) No 1290/2013 OF THE EUROPEAN PARLIAMENT AND OF THE COUNCIL of 11 December 2013 laying down the rules for participation and dissemination in »Horizon 2020 – The Framework Programme for Research and Innovation (2014–2020)« and Repealing Regulation (EC) No 1906/2006, Official Journal of the European Union L 347/81, 20.12.2013.
[35] Proposal for a COUNCIL DECISION establishing the Specific Programme Implementing Horizon 2020 – The Framework Programme for Research and Innovation (2014–2020), COM/2011/0811 final - 2011/0402 (CNS).

The Authors

Dr. Wolfram Horstmann, Associate Director, Bodleian Libraries, University of Oxford, Broad Street, Oxford OX1 3BG, UK (now University Librarian, Göttingen University, Tel.: 0551-39-5210, E-Mail: horstmann@sub.uni-goettingen.de)

Wouter Schallier, Chief, Hernán Santa Cruz Library, United Nations, Economic Commission for Latin America and the Caribbean (ECLAC), Av. Dag Hammarskjöld 3477, Vitacura, Santiago, Chile, E-Mail: wouter.schallier@eclac.org
[The author has co-written this article on his personal behalf. Views expressed in this article are not necessarily shared by his organisation]

Jarkko Siren, Technical and Scientific Project Officer, European Commission DG CONNECT, Avenue de Beaulieu 33, 1160 Brussels, Belgium, E-Mail: Jarkko.SIREN@ec.europa.eu
[The views of the author are his and do not commit the European Commission]

Dr.
Carlos Morais Pires, Programme Officer, Coordinator for Scientific Data e-Infrastructures, European Commission, Avenue de Beaulieu 33, 1160 Brussels, Belgium, E-Mail: carlos.morais-pires@ec.europa.eu
[The views of the author are his and do not commit the European Commission]