City, University of London Institutional Repository

Citation: Robinson, L. & Bawden, D. (2017). 'The story of data': a socio-technical approach to education for the data librarian role in the CityLIS library school at City, University of London. Library Management, doi: 10.1108/LM-01-2017-0009

This is the accepted version of the paper. This version of the publication may differ from the final published version.

Permanent repository link: http://openaccess.city.ac.uk/17311/

Link to published version: http://dx.doi.org/10.1108/LM-01-2017-0009

Copyright and reuse: City Research Online aims to make research outputs of City, University of London available to a wider audience. Copyright and Moral Rights remain with the author(s) and/or copyright holders. URLs from City Research Online may be freely distributed and linked to.

City Research Online: http://openaccess.city.ac.uk/ publications@city.ac.uk

Accepted for publication in Library Management

'The story of data': a socio-technical approach to education for the data librarian role in the CityLIS library school at City, University of London

Lyn Robinson and David Bawden

Accepted for publication in Library Management, 25 April 2017
DOI 10.1108/LM-01-2017-0009

Abstract

Purpose
This paper describes a new approach to education for library/information students in data literacy - the principles and practice of data collection, manipulation and management - as a part of the Masters programme in library and information science (CityLIS) at City, University of London.

Design/methodology/approach
The course takes a socio-technical approach, integrating, and giving equal importance to, technical and social/ethical aspects.
Topics covered include: the relation between data, information and documents; representation of digital data; network technologies; information architecture; metadata; data structuring; search engines, databases and specialised retrieval tools; text and data mining, web scraping; data cleaning, manipulation, analysis and visualization; coding; data metrics and analytics; artificial intelligence; data management and data curation; data literacy and data ethics; and constructing data narratives.

Findings
The course, which was well received by students in its first iteration, gives a basic grounding in data literacy, to be extended by further study, professional practice, and lifelong learning.

Originality/value
This is one of the first accounts of an introductory course to equip all new entrants to the library/information professions with the understanding and skills to take on roles in data librarianship and data management.

Introduction

A role for librarians, and other information professionals, which is of considerable and increasing importance is the handling of data resources, both on behalf of their users and for their own purposes. This role, or perhaps it is better to say spectrum of roles, parallels that in the more traditional world of text and image resources. In supporting users, this ranges from a concern with the overall institutional, or even wider, policies for the management and curation of datasets of all kinds, to assisting an individual user with the detail of small-scale data handling and analysis. It also includes the collection, analysis, management, and use of data relating to library operations, and their use as metrics for service evaluation and improvement; an extension of the well-established 'library statistics'.
The recent great expansion in the amount of available data, and in public and institutional awareness of its importance, lends an urgency to the need for library/information specialists to be fully aware of the new 'data dimension' to their work, and this certainly amounts to a new role for librarians, in line with the theme of this Special Issue. As Ekstrøm et al. (2016) write:

"Imagine a librarian armed with the digital tools to automate literature reviews for any discipline, by reducing thousands of articles' ideas into memes and then applying network analysis to visualise trends in emerging lines of research. What if your research librarian could then dig deeper and use [a digital tool] to map in which sections of articles your key research terms appear? Imagine the results confirmed that your favourite research term almost never appears in the results sections, but cluster only around introductions and perspectives? And what if the librarian did not stop there, but zoomed into the cloud of data with savvy statistics, applying the latest text and data mining techniques to satisfy even the most scrutinising scientific mind, before formulating an innovative research question?"

Not all librarians, even in academic and research settings, will become data specialists to this extent, although many certainly will. But all library and information professionals, in all sectors, will need to gain at least a basic appreciation of the issues around data, both technical and socio-ethical. This role certainly exists now, but will become of greater significance and ubiquity in future years. As Kirkwood (2016, p.275) puts it:

"Data are nothing without analysis, and many librarians currently lack the data fluency to work confidently in a world of dynamic content creation ... Librarians need both to re-skill and to change their self-identification and the philosophy that underlies it, if they are to achieve confident data fluency."
This need for many, if not all, librarians to become more confident in dealing with data, a role which only a few years ago would have been relevant to very few within the profession, is a vital one. The issue is not merely one of technical competence, important though that is, but of a confident appreciation of all the issues surrounding the good use of data, including the legal and ethical; much as librarians have traditionally had a confident appreciation of text-based publications.

If librarians - in general, and beyond a few specialists and enthusiasts - are to be effective in this new role, professional education will have to adapt accordingly; see, for example, the surveys of data-focused provision in courses in the US (Tang and Sae-Lim 2016) and in China (Si, Zhuang, Xing and Guo 2013). In the US, courses focusing on aspects of data science, data handling and data management are offered within most educational programmes for library/information specialists, particularly, though not exclusively, in the iSchools.

One response is to provide programmes which specifically prepare students for the new data-centric roles, such as data librarian, data steward, data curator, research data manager and data archivist. Such programmes necessarily focus strongly on the development of technical and managerial skills of data handling, and are intended for students aiming at a clearly data-focused career within the library/information sector. Examples of these are the programmes offered by the iSchools at the University of Pittsburgh (Lyon, Mattern, Acker and Langmead 2015), and at the University of Sheffield (University of Sheffield 2017).
Another response, which is the rationale for the course described in this article, is to adapt curricula to ensure that all new entrants to the library profession are given at least a basic foundational understanding of both the technology of data handling and management, and its social and ethical implications. The two aspects are of equal importance, and cannot sensibly be separated. Without a detailed and practical appreciation of the technical issues, consideration of social and ethical matters will necessarily be ungrounded and general; while without a socio-ethical appreciation it will be difficult for students to understand how technical skills should best be applied. For library/information professionals dealing with data in any respect, while technical competence is a necessity, it must be framed within an understanding of the social and ethical - and indeed the wider cultural and political - environment.

This paper describes an initiative, following the latter approach, within the library/information science Masters programme at City, University of London (CityLIS). This involves the repositioning of an introductory information technology (IT) course within the programme as a course dealing with data in all its aspects of relevance to the library/information professions, and from a socio-technical and ethical perspective.

The data challenge for librarians

Of the many changes and challenges impacting on the work of the library and information professional, the 'data deluge' is certainly among the most significant. The greatly increased amount and diversity of data available is one of the most important changes in the information landscape. This applies both to the very large and heterogeneous datasets which tend to be termed 'Big Data', and to the smaller, but no less important, bodies of data collected for specific purposes (Sugimoto, Ekbia and Mattioli 2016; Borgman 2015). The significance of data in the library/information context is two-fold.
First, information professionals may need to become involved in data support, research data management, data curation, data governance, data quality evaluation, data citation, data literacy training, and similar activities, as a part, or all, of their professional remit (Koltay 2015, 2016; Rice and Southall 2016). This may involve, at its most formal: assisting with, or managing, research data management policies and plans (Briney 2015); developing and managing data repositories; overseeing a data curation programme (Nielsen and Hjørland 2014; Oliver and Harvey 2016); designing training programmes for data literacy and associated skills, including basic coding, in environments including university, school and public libraries (MacMillan 2015; Carlson, Nelson, Johnson and Koshoffer 2015; Crystle 2017); or dealing with data within an overall framework of digital scholarship (Borgman 2015; Mackenzie and Martin 2016). Or it may, in a less formal way, involve giving advice to individual users on how best to deal with their data, in the way that librarians have always advised on dealing with bibliographic references. Becoming, in part or in whole, a data librarian, in Rice and Southall's terminology, is simply a new extension of the information provision/information management function, albeit that it may be a new role description.

Second, it is important for information professionals, even if they have no special role in helping their users deal with data, to be able to handle data of all kinds confidently for their own purposes; to use data analytics to improve their library services, for example (Farmer and Safer 2016; Kirkwood 2016; Showers 2015).

When these two developments are considered together, it is clear that new entrants to the information professions must be equipped to deal as confidently with data, in its variety of forms, as they have traditionally dealt with text information.
Achieving such data confidence means having a conceptual understanding of data, and the issues around it, plus the technical capabilities of 'data scraping' and 'data wrangling': the abilities to find, extract, collect, clean, organise, analyse, and present data.

Furthermore, there are two inter-related aspects to the kind of data fluency that the new environment demands of information professionals: the technical, and the social and ethical. There is little point in a librarian being able to code, to scrape data from websites, to clean and analyse datasets, and to produce metrics on demand, if they are unfamiliar with the legal requirements of, and ethical considerations implicit in, what they are doing. But equally, there is little point in such a person being able to fluently debate the social and ethical niceties, if they are unable to get the data they need, in the form they need it in, and to draw from it the meaningful information that is of use. The two go hand in hand, and the understanding of data that the library/information professional must possess must be a socio-technical understanding, enabling them to deal with data with technical competence and with ethical confidence. There is, of course, also a legal dimension to the proper use of data; this is mentioned where necessary in the course described here, but a full treatment of legal issues comes in courses elsewhere in the City programme, dealing with information law.

The importance of these issues has been emphasised repeatedly, as may be shown by the following examples. The sheer volume of data to be dealt with is illustrated by the general acceptance that we have entered the 'zettabyte era', in which annual data traffic on global networks exceeds the zettabyte level (Cisco 2016, Floridi 2014).
In response to this, the UK government has explicitly recognised the importance of data literacy as a way of helping non-data specialists make the most of data science (Parkes 2016), while the US National Information Standards Organization (NISO) is planning training webinars for 2017 putting data literacy on a par with digital literacy (NISO 2017). In the library sector, a bibliography on research data curation noted 560 items published between 2009 and 2016 (Bailey 2016).

'Dealing with data' was named as one of '5 technical skills that information professionals should learn', according to an entry on the CILIP (Chartered Institute of Library and Information Professionals) blog in March 2016 (Pennington 2016). This emphasised the need to deal with four distinct types of data: structured (e.g. spreadsheets and relational databases); semi-structured (e.g. files of metadata records); unstructured (without any table or field structure and encompassing big data); and linked data. Similarly, 'Using social media analytics' was named as one of the 'top five library technology topics' by the Techsoup for libraries blog in December 2016 (Gilbert-Knight 2016).

Training for librarians has begun to develop to match these perceived needs.
To give three examples: the library of North Carolina State University hosts a week-long 'Data Science and Visualization Institute for Librarians' (North Carolina State University 2017); the Library of Congress held a conference on 'Collections as Data' in October 2016, with the two main themes that digital collections are composed of data that can be acquired, processed and displayed in many ways, and that we should always remember that data is derived from, and manipulated by, people (Ashenfelder 2016); and the American Library Association and Google, through their Libraries Ready to Code project, are seeking to equip librarians to teach coding and data handling in public and school libraries (American Library Association 2017).

These kinds of developments support the need for all librarians to have a solid socio-technical grounding in data issues.

IT teaching at CityLIS

An introductory information technology course has been offered as a compulsory part of the library/information programmes at City since Masters level teaching in the subject was established in its current structure in the late 1980s (Robinson and Bawden 2010). This course has always been seen as an introduction to basic concepts, and a preparation for more specialist courses. [Note that in this paper we use the term 'programme' for the whole Masters scheme of study, and 'course' for this specific part.]

This course was initially called 'Computers and communications technology', and the very broad syllabus was:

Information systems and technology. An introduction to computers, hardware, software, operating systems, programming languages, software packages, databases, word processing, spreadsheets. Terminology and basic concepts of telecommunications. Telecommunications-based systems, including telex, fax, electronic mail, teleconferencing, videotex, electronic journals, document delivery systems, office automation.
Hard copy techniques, including copying, duplicating, printing, graphic design and composition, desktop publishing. Microforms and their applications. Introduction to systems analysis.

In 1996, the Masters programme was restructured on a modular basis, and the course renamed 'Information technology', with a greater digital emphasis. By 2003-04, the course was named 'Data Representation and Management', and by then focused entirely on digital systems. Its emphasis was on software systems for handling various kinds of information: text handling and word processing systems, spreadsheets, web authoring, databases, etc.

In 2008, the course was renamed 'Data and Information Technology and Architecture' and shortly afterwards 'Digital Information Technologies and Architecture'; de-emphasising data handling and taking a wider perspective. Its aim was to "provide the technical background required to store, structure, manage and share information effectively". It still included material on specific kinds of software, but was increasingly focused on web-based systems, search engines, blogs and wikis, semantic web, information retrieval, etc., and on information architecture, and issues such as open access and repositories.

In academic year 2016-17, this course was given a major overhaul. It was realised that the introductory material on software use was no longer necessary, while the detailed material on web-based systems, retrieval and information architecture was better left to later specialist and elective courses. Eliminating this material allowed for a new focus on the handling of data in all its aspects, as the essential background preparation for the new data roles mentioned above; a return to the data focus of earlier years, but with a very different treatment appropriate to the new environment.
It was also felt essential to introduce a strong flavour of ethics, and social implications, hitherto missing in what was very much a technical course. The revised course, with its strongly socio-technical perspective, was renamed 'Digital Information Technologies and Applications', to indicate that information architecture was no longer such a central point. It took the strapline 'The Story of Data', to match another part of the programme called 'The Story of Documents'.

The story of data

The stated aim of the restructured course is to "provide the technical and philosophical background required to collect, store, describe, structure, manage and share information effectively in the digital society", by engaging with the deluge of digital data, and distilling information from it. The theme "Finding the I in data" is emphasised, with a double meaning: finding meaningful information (I) in data, and also considering how data represents or misrepresents us as individuals (I). There is also a strong focus on implications for library/information applications and issues, to ensure that the course does not become a generic 'data science lite'.

In drawing up the syllabus, we were particularly influenced by North Carolina's 'Data Science and Visualization Institute for Librarians' mentioned above, and by modules in the Oxford Internet Institute's Masters programme in 'Social Science of the Internet' (Oxford Internet Institute 2017). We drew from these programmes ideas for both the balance of technical and conceptual material, and the balance of practical activities with consideration of conceptual and managerial aspects, as well as the general 'flow' of the course. More specifically, they influenced our decisions to use the Python language to illustrate the value of coding, and to use examples of scraping data from the Web whenever possible.
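To give a flavour of what a short Python web-scraping exercise of this kind might look like, the following is a minimal sketch only; it is not taken from the course materials. It assumes a small made-up HTML fragment (SAMPLE_HTML) standing in for a fetched web page, and a hypothetical helper class (LinkCollector), and uses only the Python standard library to pull out each link's address and text.

```python
from html.parser import HTMLParser

# Made-up HTML fragment standing in for a downloaded web page (an assumption
# for illustration; a real exercise would fetch a live page).
SAMPLE_HTML = """
<ul>
  <li><a href="http://openaccess.city.ac.uk/">City Research Online</a></li>
  <li><a href="http://dx.doi.org/10.1108/LM-01-2017-0009">Published version</a></li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects (url, link text) pairs from <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []     # finished (url, text) pairs
        self._href = None   # href of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an open link.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (url, link text) pairs found in the HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    for url, text in extract_links(SAMPLE_HTML):
        print(text, "->", url)
```

The point of such a sketch is less the code itself than the idea that a web page is structured data which a few lines of code can turn into a list ready for cleaning and analysis.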
Although there is no single recommended text for the course - the material is too broad and diverse - the technical content is roughly matched by Herzog (2015) and the socio-ethical content by Floridi (2014). For the central concept - data itself - we follow Floridi's definition: data is any discernible difference, or lack of uniformity; information is well-formed, meaningful and truthful data (Floridi 2010).

The course is organised in ten sections: their titles are stated here to show the trajectory of the story, and discussed below:

The story of data
1 Finding the 'I' in data
2 You will be assimilated
3 Data about data
4 Taming of the data
5 Searching for the data
6 Working with the data
7 Counting the data
8 The meaning in the data
9 AI: the data will replace you
10 Making data work

Each section includes two class sessions - presentations, demonstrations and practical work - plus significant independent student work; the whole course (a 15 UK credit module) accounting for a nominal 150 hours of student work. This is sufficient to ensure that all students have the opportunity to gain an appreciation of each topic, conceptually and practically, and to be in a position to learn more, either during their studies or in the workplace. For some sections, guest lecturers from institutions such as the UK Digital Curation Centre, Altmetric, and CILIP offer the viewpoint from the world of practice.

Considering each section in turn, we now briefly outline its content.

1 Finding the 'I' in data

This introductory section considers the modern phenomenon of the data deluge, and its implications for the individual.
It considers: the relation between data, information and documents (Floridi 2010); the historical development of computer systems, and the ways in which computers represent and handle data - Turing and von Neumann architectures, bits and bytes, and coding systems (Ince 2011); and socio-technical issues, particularly for the library/information profession. This section establishes the conceptual framework for the course, and provides the understanding of basic issues needed by any librarian dealing with data.

2 You will be assimilated

This section introduces networks and digital network technologies, specifically the internet and the web, and the standards and protocols which underlie them, most notably TCP/IP and HTML. The concepts of the web and web pages are used to introduce some basic ideas of information architecture (Rosenfeld, Morville and Arango 2015). Some social and ethical implications of data transfer and sharing - including individual presence and privacy online, digital divide, net neutrality, and the implications of the design of network infrastructures - are considered. This establishes an understanding of the web environment in which virtually all data in the library context resides.

3 Data about data

This section considers the ways in which data forms documents (Furner 2016), and how different kinds of documents are defined, described and organised, leading to an introduction to metadata standards and applications. Following the approach of Pomerantz (2015), this treats metadata very broadly, giving some attention to bibliographic and web resource metadata, but focusing equally on metadata for datasets. This provides a link between the metadata concepts familiar to librarians and their application in the less-familiar dataset context.
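The link between bibliographic and dataset metadata can be illustrated with a minimal Python sketch. This is an illustration under stated assumptions rather than course material: the element names follow the Dublin Core element set, which is chosen here purely for illustration, and both records, their identifiers and the helper functions are invented.

```python
import json

# Element names follow the Dublin Core element set (an assumption made for
# illustration; no specific standard is named in the text above).
# Both records and their identifiers are invented examples.
article_record = {
    "title": "An example journal article",
    "creator": ["Author, A."],
    "date": "2017",
    "type": "Text",
    "format": "application/pdf",
    "identifier": "doi:10.0000/example-article",
}

dataset_record = {
    "title": "An example survey dataset",
    "creator": ["Author, A."],
    "date": "2017",
    "type": "Dataset",
    "format": "text/csv",
    "identifier": "doi:10.0000/example-dataset",
}


def shared_elements(a, b):
    """Return the sorted element names used by both records."""
    return sorted(set(a) & set(b))


def differing_values(a, b):
    """Return {element: (value in a, value in b)} where the values differ."""
    return {k: (a[k], b[k]) for k in shared_elements(a, b) if a[k] != b[k]}


if __name__ == "__main__":
    # The dataset record serialises to JSON exactly as the article record would.
    print(json.dumps(dataset_record, indent=2))
    print(shared_elements(article_record, dataset_record))
    print(differing_values(article_record, dataset_record))
```

In this sketch the same descriptive elements serve for both an article and a dataset; only the values (for example the resource type and file format) change.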
4 Taming of the data

This section considers the structuring of data into organised data files of various kinds: flat files, CSV files, database structures including relational, and standards, including XML, RDF and linked data. This leads to a discussion of the processes of data management, for research and for other purposes, and of data curation (Briney 2015; Oliver and Harvey 2016). A conceptual understanding of, and an ability to work with, data files of these kinds is fundamental to the success of librarians in confidently dealing with data collections.

5 Searching for the data

This section considers how to find data of various forms, building on earlier discussions of data structure. It covers the range of search tools for various forms of data collection: search engines, relational database systems and SQL, full text bibliographic search systems, and other specialised retrieval tools (Carlson, Nelson, Johnson and Koshoffer 2015). It subsumes the text retrieval and bibliographical retrieval systems familiar to most librarians within the broader framework of systems which retrieve data of all kinds.

6 Working with the data

This section focuses on the ways data can be collected from web services and APIs, such as Twitter, and then cleaned, manipulated and analysed; what is sometimes termed 'data scraping' and 'data wrangling'. Software tools such as Hawksey's Tagsexplorer and OpenRefine (Groves 2016) are used to illustrate collection, summarisation and visualisation. A facility with this kind of process will be particularly valuable to librarians seeking to become experts in helping their users deal with data issues, as it is becoming a widespread form of data usage.

7 Counting the data

This section examines data metrics, introducing basic analytics, basic bibliometrics (as an introduction to the study of bibliometric laws and applications later in the programme), and altmetrics (Tattersall 2016).
While counting data is now technically quite straightforward, we ask: what are we measuring when we measure data, and what does it mean? Again, this is an extension of issues familiar to librarians - the bibliometrics of conventional publication - into the less-familiar data realm.

8 The meaning in the data

This section examines tools for exploring data to find meaning in it, including tools for text and data mining, and for visualization. Standard packages - Wordle, Tagxedo and Voyant Tools (Megan 2014, Moorfield-Lang 2010) - are used for collection and analysis of both structured and unstructured data from the web. There is a basic introduction to coding in the Python language, including use of general and specialised subroutine libraries, web scraping via API wrapper, and regular expressions. The aim is to illustrate the purpose of coding, and where it offers advantages over the standard packages, with examples of library/information applications. The ability to undertake basic coding is now a valuable skill in many library contexts, including modifying bibliographic records, enriching metadata, converting record formats, customising interfaces, and linking systems. This section also considers the discipline of digital humanities, which has provided many of these tools, and its relationship to LIS (Svensson and Goldberg 2015; Robinson 2016).

9 AI: the data will replace you

This section examines artificial intelligence (AI), from popular visions and historical developments to current practice, and implications for the information professions. Topics include machine learning, automatic indexing, tagging, classification and categorisation, artificial agents, web bots in general and chatbots in particular, and robots.
Issues include whether librarians will really be replaced by robots, what the likely balance of the human and the digital will be, and what some of the ethical implications are, following the approaches of Boden (2016) and Floridi (2016, 2017). Some understanding of these issues is essential for new entrants to the library profession, as the impact of AI on all sectors will be significant.

10 Making data work

This section, in a sense, circles back to the first section, considering the importance of data handling and management for the future library/information professional. How can they best contribute to managing the data deluge, and how can data be used to improve, justify and show the impact of, library/information services? No attempt is made to give definitive answers to these questions; rather this section opens a discussion, to be continued throughout the CityLIS Masters programme.

All aspects of the learning context have been changed to emphasise the integration of the technical and social/ethical treatment of data issues. Previously the course had been run by formal lectures followed by practical classes in a computer room. The computer room classes have been abandoned in favour of using a seminar room for all sessions, and encouraging students to bring and use their own devices (laptops, tablets, smartphones) for short practical in-class exercises, which can be naturally integrated with presentation, and which encourage discussion and peer support. Practical exercises have been adjusted so as to be doable without any special hardware or software, by using standard web-based systems: Voyant Tools, Wordle, Tagxedo, Tagsexplorer, OpenRefine, etc. For the introduction to coding, which uses the Python language, we are able to recommend a choice of online tutorials for practice, including one which requires only a web browser, rather than any special software.
Those students with a strong interest can, of course, take things further by using special purpose hardware and software available at the university.

The purpose of including the coding component is not to train the class to become efficient coders: that would be neither desirable nor feasible in the time available. It is not necessary that all librarians be coders, but it is necessary that they understand the nature and purpose of coding, and when and why writing code may be preferable to using prepackaged software. In order to do this, it is necessary to have some practical experience of coding. This course provides this, in the context of data collecting and processing, for those students who have not encountered coding before. For those who have, it provides an introduction to a language, Python, with a rich provision of libraries and subroutines for accessing and manipulating data of the kinds of most interest to library/information practitioners. The aim is not to try to develop professional programming skills, but to show coding as a tool for creative exploration of data, following the approach espoused by Montford (2016). Students needing a more in-depth treatment of programming can find it elsewhere in the programme, especially by following technology-oriented electives, and by participating in 'out of hours' optional technology training. An example of the latter is CityLIS's hosting of the first Library Carpentry software training (Playforth 2015).

Background reading and resources for each section are designed to cover three perspectives: the technical, the socio-ethical, and the professional, the last outlining the implications for library/information professionals. While the sections are distinguished mainly by their technical content, the social and ethical concerns tend to overlap, since their principles are applicable in many aspects of information and data management (Floridi 2013, Floridi and Taddeo 2016).
The assessment for the course is an essay or report on a topic chosen by the student, but which must incorporate both technical and socio-ethical aspects. Students are also required to set up a blog, if they do not already have one, and use it to reflect on their learning as the course progresses, and are also encouraged to use other forms of social media such as Twitter, so as to ensure that all are comfortable with communicating via digital media.

Conclusions

At the time of writing, the course had been given for the first time. Reaction from students, and from the expert practitioners acting as guest lecturers, suggests that this is an engaging and effective way of introducing students to the role of library/information professionals in managing data, understanding both the technology and its social and ethical dimensions. A more thorough and formal evaluation at the end of the academic year will influence the future direction of the course. The fact that it is compulsory for all library/information students, and indeed is the first course they encounter in their studies, helps emphasise the importance of understanding data and its implications in all library/information contexts.

Data issues are clearly here to stay as a significant aspect of the work of all librarians, and other information professionals, and all entrants to the profession need a good socio-technical grounding as a basis for professional practice, and - vitally - continuing learning throughout professional life. We hope that this new CityLIS offering, which will be further developed over future years, will serve this purpose for our students, and may be a useful example to others.
References

American Library Association (2017), Equipping librarians to code: ALA, Google launch ready to code university pilot programme, [blog post], available at http://www.ala.org/news/press-releases/2017/01/equipping-librarians-code-ala-google-launch-ready-code-university-pilot, accessed 20 January 2017.
Ashenfelder, M. (2016), Data and humanism shape Library of Congress conference, [blog post], available at http://blogs.loc.gov/thesignal/2016/10/data-and-humanism-shape-library-of-congress-conference/?loclr=eadpb, accessed 17 January 2017.
Bailey, C.W. (2016), Research data curation bibliography (version 6), Houston TX: Digital Scholarship, available at http://digital-scholarship.org/rdcb/rdcb.htm, accessed 16 January 2017.
Boden, M.A. (2016), AI: its nature and future, Oxford: Oxford University Press.
Borgman, C.L. (2015), Big Data, Little Data, No Data: Scholarship in the Networked World, Cambridge MA: MIT Press.
Briney, K. (2015), Data management for researchers: organize, maintain, and share your data, Exeter: Pelagic Publishing.
Carlson, J., Nelson, M.S., Johnson, L.R. and Koshoffer, A. (2015), Developing data literacy programs: working with faculty, graduate students and undergraduates, Bulletin of the Association for Information Science and Technology, 41(6), 14-17.
Cisco (2016), The Zettabyte era - Trends and Analysis, [online], available at http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/vni-hyperconnectivity-wp.html, accessed 20 January 2017.
Crystle, M. (2017), Libraries and facilitators of Coding for All, Knowledge Quest, 45(3), 46-53.
Ekstrøm, J., Elbaek, M., Erdmann, C. and Grogorov, I. (2016), The research librarian of the future: data scientist and co-investigator, LSE Impact of Social Sciences blog, December 14 2016, available at http://blogs.lse.ac.uk/impactofsocialsciences/2016/12/14/the-research-librarian-of-the-future-data-scientist-and-co-investigator/, accessed 26 March 2017.
Farmer, L.S.J. and Safer, A.M. (2016), Library Improvement through data analytics, London: Facet Publishing.
Floridi, L. (2017), Charting our AI future, Project Syndicate, [online], available at https://www.project-syndicate.org/commentary/human-implications-of-artificial-intelligence-by-luciano-floridi-2017-01, accessed 17 January 2017.
Floridi, L. (2016), Should we be afraid of AI?, Aeon Essays, [online], available at https://aeon.co/essays/true-ai-is-both-logically-possible-and-utterly-implausible, accessed 17 January 2017.
Floridi, L. (2014), The fourth revolution: how the infosphere is reshaping human reality, Oxford: Oxford University Press.
Floridi, L. (2013), The ethics of information, Oxford: Oxford University Press.
Floridi, L. (2010), Information: a very short introduction, Oxford: Oxford University Press.
Floridi, L. and Taddeo, M. (2016), What is data ethics?, Philosophical Transactions of the Royal Society A, 374: 20160360, http://dx.doi.org/10.1098/rsta.2016.0360
Furner, J. (2016), "Data": The data, in Kelly, M. and Bielby, J. (eds), Information cultures in the digital age, Wiesbaden: Springer VS, pp 287-306.
Gilbert-Knight, A. (2016), Your top 5 library technology topics, Techsoup for libraries blog (9 December 2016), available at http://techsoupforlibraries.org/blog/your-top-5-library-technology-topics, accessed 16 January 2016.
Groves, A. (2016), Beyond Excel: how to start cleaning data with OpenRefine, Multimedia Information and Technology, 42(2), 18-22.
Herzog, D. (2015), Data literacy: a user's guide, London: Sage.
Ince, D. (2011), The computer: a very short introduction, Oxford: Oxford University Press.
Kirkwood, R.J. (2016), Collection development or data-driven content curation? Library Management, 37(4/5), 275-284.
Koltay, T. (2015), Data literacy: in search of a name and identity, Journal of Documentation, 71(2), 401-415.
Koltay, T. (2016), Data governance, data literacy and the management of data quality, IFLA Journal, 42(4), 303-312.
Lyon, L., Mattern, E., Acker, A. and Langmead, A. (2015), Applying translational principles to data science curriculum development, in iPres 2015, November 2-6 2015, Chapel Hill, North Carolina, available at http://d-scholarship.pitt.edu/27159/, accessed 17 January 2017.
MacMillan, D. (2015), Developing data literacy competencies to enhance faculty collaborations, Liber Quarterly, 24(3), 140-160.
Mackenzie, A. and Martin, L. (eds.) (2016), Developing digital scholarship: emerging practices in academic libraries, London: Facet Publishing.
Megan, W.E. (2014), Review of Voyant Tools, Collaborative Librarianship, 6(2), 96-97.
Montford, N. (2016), Exploratory programming for the arts and humanities, Cambridge MA: MIT Press.
Moorfield-Lang, H. (2010), Infographics: information gets visual, Information Searcher, 19(3), 15-16.
Nielsen, H.J. and Hjørland, B. (2014), Curating research data: the potential roles of libraries and information professionals, Journal of Documentation, 70(2), 221-240.
NISO (2017), NISO two-part webinar: Digital and data literacy, available at http://www.niso.org/news/events/2017/webinars/sept13_webinar, accessed 20 January 2017.
North Carolina State University (2017), Data science and Visualization Institute for Librarians, [online], available at https://www.lib.ncsu.edu/datavizinstitute, accessed 20 January 2017.
Oliver, G. and Harvey, R. (2016), Digital Curation, London: Facet Publishing.
Oxford Internet Institute (2017), MSc Social Science of the Internet, [online], available at https://www.oii.ox.ac.uk/study/msc, accessed 20 January 2017.
Parkes, E. (2016), Data literacy: helping non-data specialists make the most of data science. Government Digital Service blog post, available at https://gds.blog.gov.uk/2016/04/27/data-literacy-helping-non-data-specialists-make-the-most-of-data-science, accessed 20 January 2017.
Pennington, D. (2016), 5 technical skills information professionals should learn. CILIP blog (22 March 2016), available at http://www.cilip.org.uk/blog/5-technical-skills-information-professionals-should-learn, accessed 16 January 2016.
Playforth, C. (2015), Why the information profession needs Library Carpentry [blog post], available at https://blogs.city.ac.uk/citylis/2015/12/07/why-information-profession-needs-library-carpentry, accessed 20 January 2017.
Pomerantz, J. (2015), Metadata, Cambridge MA: MIT Press.
Rice, R. and Southall, J. (2016), The Data Librarian's Handbook, London: Facet Publishing.
Robinson, L. (2016), Are the digital humanities and library and information science the same thing? [blog post], available at https://thelynxiblog.com/2015/06/29/are-the-digital-humanities-and-library-information-science-the-same-thing/, accessed 17 January 2017.
Robinson, L. and Bawden, D. (2010), Information (and library) science at City University London: 50 years on educational development, Journal of Information Science, 36(5), 631-654.
Rosenfeld, L., Morville, P. and Arango, J. (2015), Information architecture for the web and beyond (4th edn.), Sebastopol CA: O'Reilly Media.
Showers, B. (2015), Library Analytics and Metrics, London: Facet Publishing.
Si, L., Zhuang, X., Xing, W. and Guo, W. (2013), The cultivation of scientific data specialists: Development of LIS education oriented to e-science service requirements, Library Hi Tech, 31(4), 700-724.
Sugimoto, C.R., Ekbia, H.R. and Mattiolli, M. (2016), Big data is not a monolith, Cambridge MA: MIT Press.
Svensson, P. and Goldberg, D.T. (2015), Between humanities and the digital, Cambridge MA: MIT Press.
Tang, R. and Sae-Lim, W. (2016), Data Science Programs in U.S. Higher Education: An Exploratory Content Analysis of Program Description, Curriculum Structure, and Course Focus, Education for Information, 32(3), 269-290.
Tattersall, A. (ed.) (2016), Altmetrics: a practical guide for librarians, researchers and academics, London: Facet Publishing.
University of Sheffield (2017), MSc Data Science, [online], available at http://www.shef.ac.uk/is/pgt/courses/ds#tab01, accessed 20 January 2017.