Preprint. 
Accepted for publishing in College & Research Libraries April 30th 2020 
To be published early 2021. 

Using Personas to Visualize the Need for Data Stewardship. 
Live Kvale, Oslo Metropolitan University, Norway
 

Abstract  
There is a current discussion in universities regarding the need for dedicated research data stewards.
This article presents a set of fictional personas for research data support based on the experience and
requests of experts in different areas of data management. In a modified Delphi study, twenty-four
participants from different stakeholder groups contributed to identifying the skills and backgrounds
necessary to fulfill the needs for data stewardship. Inspired by user experience (UX) methodology,
different data personas are developed to illustrate the range of skills required to support data
management within universities. Further, the development of a research data support center is
proposed as a competency hub for data stewards.

Introduction  
Data are the entities researchers draw conclusions from, and they are essential for fellow researchers to
examine and criticize results. Transparency and access to data, the analysis applied, and the
conclusions drawn are part of what defines research.1 Data sharing and data archiving are expected to
resolve the reproducibility crisis in research and provide new insight.2 Consequently, academic
journals and research funders are increasingly requiring research data to be made available.3 Along
with requirements for sharing data in academic research, there has been a growing need for new
skills for data managers, data stewards, data librarians, and data scientists.4 These new roles are
filled by professionals who assist researchers in managing research data, avoiding data loss during the
research process, and preparing the data for archiving and public access. Digital research data are
easily lost, and steps to preserve data must be taken in all stages of the research process.5 
Consequently, skills to maintain and curate data are required, but which skills are needed? And 
where in the universities should curation services be offered? These questions are currently being 
explored6 and debated in libraries and among infrastructure providers.7 This paper draws on a study 
of stakeholders involved in research data management in Norway involving policy makersi, national 
infrastructure providersii8  and researchers and research support staffiii from the four oldest 
universities in Norway. By using persona templates adapted from user experience (UX) methodology,9
this paper explores how data stewards are described by different stakeholders. The aim of
creating the personas has been to visualize how a data steward team could cover the various
competencies and skills needed for data management support.

Internationally, “data steward” is one of several terms used in the literature and among practitioners 
to describe a person working with research data management (RDM). “Data librarian”, “data 
manager”, and “data curator” are examples of other titles with somewhat overlapping 
responsibilities.10 The term data steward is used in this article, as it is less domain specific than 
“librarian,” “curator,” or “scientist.” The usage of “data steward” is intended to include all the 
different requirements for data management.
The research question investigated is: 

 
i Representatives from the Norwegian Ministry of Education and Research, the Research Council of Norway, and
the rectorate of one of the included universities.
ii In Europe and Norway there is a strong tradition for national custodians of research data.  
iii University IT, library and research office 



Who are the data stewards in the universities?
a. What roles should data stewards play?
b. What services should data stewards provide as part of these roles?
c. What skills do data stewards need to carry out these services?

 
By developing a set of data personas, it becomes possible to illustrate and exemplify one possible
response to each research question. The set is not to be interpreted as a universal solution, but rather as
an example of how roles, skills, communication, and services for data management may be 
organized. The findings also focus on potential obstacles and what to be aware of when developing 
data steward services.  

Literature Review 
A broad range of literature on RDM skills was identified through searches for “data steward”, “data
librarian”, “data manager”, and “data curator” in Web of Science and Scopus. These articles were
supplemented by searching relevant journals that are not indexed in these databases, such as JeSLIB
and the International Journal of Digital Curation, and adding other relevant documents. The different
articles highlight the skills required in data management and the different roles of data professionals. 
The articles were grouped into three categories according to how the data steward was described: 1. 
new responsibilities of the librarian, 2. the embedded data steward in the research environment, and 
3. other approaches to data management services. In addition, the literature review contains a 
section on the usage of personas related to data management services. 

In the library and information science literature a majority of articles on data stewardship aim at 
clarifying which skills are needed for the data professional librarian offering data management 
support to researchers at the university.11 Both Brown and Federer emphasize that “support for 
researchers’ data needs is a moving target”12 that needs to be supported by a skills development 
program in libraries.13 The most important skills identified by Federer relate to communication, 
presentation, relationship with researchers, teamwork, and one-to-one training.14 This argument is 
supported by Kennan, who finds that communication skills in many forms were the most in demand 
for RDM positions; she further emphasizes the need for “boundless curiosity”, including both the 
willingness and ability to learn new things.15 Kennan identifies four different roles that ensure data 
management in the different stages of the data life cycle: “the data librarian/data manager”, “the 
data IT and systems experts,” “the data scientist”, and “the data creator”.16 Cox and Corrall illustrate 
the role of the “research data manager”17 in the breach between the faculty and the academic 
library, connecting the institutional repository manager role in the library with the research produced 
by the faculty. The data librarians described can either be skilled generalists in data management or 
be specialized in a particular discipline. Disciplinary specialization can be achieved through 
engagement with subject specialists and researchers.18 

A data steward working with data management in a research group or similar research environment
is here referred to as an embedded data steward. These domain-specific data stewards are primarily
used in data-intensive research within health sciences19 and natural sciences20 and specialize in data 
management in a single discipline. An editorial from Nature Genetics21 starts with a clear statement 
regarding data stewardship, asserting that, “professional data stewards be trained and employed in 
all data-rich research projects, [which] raises the exciting prospect they will conduct research on 
data-intensive research itself.”22 Some articles describe solutions for data management within 
national research institutes23 or data centers.24 The articles on embedded data stewards are 
discipline specific and involve a high degree of specialization with a focus on the development of best
practices and domain-specific standards.25 This illustrates how an embedded data steward needs to
understand the methods and data they are working with in addition to preservation and metadata.
None of the articles describing data stewards in research environments are from the humanities or
the social sciences. These disciplines have traditionally been less data intensive, and research is
often conducted without data sharing among collaborating researchers. Also, the humanities and
social sciences cater to these needs differently, for example through the trend of digital scholarship
centers run by university libraries that explicitly serve the field of the digital humanities.26 These are
some possible explanations for why experiences with embedded data stewards in the humanities and
social sciences are fewer and newer, which in turn could explain why examples of embedded data
scientists in the humanities and social sciences have not yet reached the literature.

While embedded and library-centric were the two large categories to be found in the literature, 
there are other approaches to data management services. One example is the one-stop research 
support described by Clements27 where someone can find answers to all questions regarding 
research data in one place, possibly a web portal. Another approach by Delft University in the 
Netherlands places domain-specialized data stewards within the faculty departments.28 The service is 
coordinated by the library but aims to integrate the services of the data steward in each faculty. Still, 
their goal is to provide “more granular disciplinary experts.”29 The report from Research Libraries UK 
and Matt Greenhall exploring digital scholarship in UK libraries argues for a “mixed economy of 
digital scholarship support”30 whereby the library partner supplies other research support facilities at 
the universities with complementary expertise in data management. In the literature on data scientists,
the term “data unicorn”31 is used to mean an unrealistic skill set for one person. Kennan
transfers this to the idea of the data steward.32

What the literature on data stewards has in common is the exploration of professional domains and
services new to librarianship. The primary challenges described include finding the right balance
between specialization and general knowledge of data management, and ensuring communication and
collaboration between the different levels within the organization.33

In the context of RDM, there are three examples of the usage of personas34 within the literature. 
Lage builds on the usage of personas to improve institutional repositories for publication.35 Crowston 
includes “Abby the Science data librarian”36 in the group of users of a research data repository. Both
focus on the researcher, presenting five37 and eight38 researcher personas with different needs in 
regard to data management and different interests in using institutional archives for research data. A 
recent report on education for data stewards from Denmark39 presents the use of personas to 
illustrate the different needs and skill sets requested for data stewards in both the corporate and 
research sectors.  

Methodology  
RDM is a rapidly developing domain. In order to grasp some of the changes and developments, a 
Delphi study with an expert group and multiple rounds of data collection was found suitable.40  

The expert group of participants in a Delphi study made it possible to bring preliminary
findings back for discussion, which contributed to the understanding of the perception of roles through
negotiation, testing, and learning. A group of twenty-four stakeholders participated in the study
(Table 1). The group contained representatives from policymakers and national infrastructure 
providers in addition to researchers and research support staff from four universities in Norway. 
Participants were recruited from different stakeholder groups in order to include different aspects of the
development of the sociotechnical infrastructure for research data and potentially uncover gaps or
disagreements. The data steward was one element highlighted by multiple stakeholders as a gap
or missing link. The four universities are the oldest in Norway, are all multidisciplinary, and have
well-established collaborations on administrative and technical infrastructure. From the policy makers,
rectors of research at the four universities were invited to participate,iv in addition to representatives
from the Norwegian Ministry of Education and Research and the Research Council of Norway. The
infrastructure providers represent different organizations that offer data archiving services to
universities in Norway.v The researchers were invited based on the receipt of European Union (EU)
funding, as the EU requires data management plans from the projects it funds. Researchers were
identified through the CORDIS web page;vi of 25 invited researchers, eight participated in the study.
This way of identifying and recruiting researchers was done to avoid potential biases related to 
engagement with data management as a topic. It also gave a pool of researchers with different 
disciplinary backgrounds (biology, musicology, science studies, economics, neuroscience, psychology, 
philosophy, gender studies). The grouping of researchers as working either individually (RI) or 
collaboratively (RG) was done during the analysis of data from the first round of collection, as the 
needs described corresponded with how the researchers collaborated with other researchers on 
data, rather than with disciplinary backgrounds. The research support staff were recruited with the 
focus of including representatives from three types of research support services (library, IT, and the 
research office). Five of the research support participants also had previous experience as 
researchers, and two provided IT services that were offered both locally and nationally. In some
cases, the participants did not want their statements to be identified with them; these have been
marked as “off-record”.

Table 1. The participants organized according to role

Role/stakeholder category            Group code   Individual participant codes
Researchers working individually     RI           RIZ, RIJ, RIL, RIB
Researchers working in groups        RG           RGV, RGD, RGA, RGW
Policymakers                         PO           POU, POS, POK
Infrastructure service providers     IN           INH, INO, INR
Research support, IT                 IT           ITE, ITY, ITI
Research support, research office    RO           ROC, ROX, ROT
Research support, library            L            LM, LP, LG, LN
 

Within UX methodology, personas are commonly used to describe users of computer systems.41 The 
development of personas builds on data collected through interviews or surveys, with the aim of 
creating fictional characters, either based on the participants or to fill the roles they describe. By 
creating personas, system developers flip the focus from the system to the user, aiming to create a 
product that fits well for some users, rather than merely adequately for everyone.42 With the 
ongoing changes in the data management landscape and current infrastructure development, the 
data stewards will be central users. Still, who will fill these roles is not clear.  

By using the expert group of the Delphi study to develop data stewardship personas, this study is not
claiming to offer a universal solution but provides an illustration of how the different roles could be
distributed. Data steward personas can be useful both to system developers and to the universities
that employ data stewards and develop data management services.

iv Unfortunately, only one of the four invited rectors agreed to participate.
v These three are all publicly funded and offer different archive services.
vi https://cordis.europa.eu/en

Figure 1 illustrates the different phases of the study. In the exploration phase (January 2018), open 
interviews approximately one hour long were conducted with the participants. Data stewardship and 
skills for data management were but two of the several themes brought up in the interviews.  

 
Figure 1 Delphi-inspired multiphase method43

To further explore the expectations the different stakeholders interviewed held regarding data 
stewards, the second round of data collection (September 2018) had a section dedicated to the data 
steward (Appendix 1) inspired by UX-persona design. All answers were given in free text and were 
optional. 

In the concluding phase (March 2019), a first draft for three personas was developed based on the 
preliminary findings. This draft was presented and discussed with the participants in open interviews 
lasting about 30 minutes. The persona drafts were shared with participants prior to interviews as 
part of the interview guide. The findings presented in this article are from all three rounds of data 
collection and include an integrated analysis44 of the results. Quotes presented in this paper are 
marked with a participant code and 1 or 2, referring to the first or second interview, e.g., “RGA2”. Data
from the survey are not linked to participants.

The collected data were, for the most part, qualitatively coded and analyzed thematically,45 initially
using the software NVivo and later using XML for thematic coding and a Python script for extraction.
Some of the results from the surveys, such as background, education, and skills, were counted and 
treated quantitatively.  
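
The combination of XML-based thematic coding and script-based extraction can be illustrated with a minimal sketch. It is not the script used in the study. The markup is hypothetical: the "code" element, its "theme" attribute, the participant label "PX", and the sample sentences are assumptions made for illustration only and are not taken from the study's codebook or transcripts.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical coded transcript. The <code> element, the "theme" attribute,
# the participant label "PX", and all sentences are invented for illustration.
SAMPLE = """
<transcript participant="PX" round="1">
  <turn>We lose files when temporary staff leave.
    <code theme="data_loss">Nobody keeps   track of the old disks.</code>
  </turn>
  <turn>
    <code theme="collaboration">IT and the library should work together.</code>
    <code theme="data_loss">Backups are left to each researcher.</code>
  </turn>
</transcript>
"""


def extract_segments(xml_text):
    """Return {theme: [(source, quote), ...]} for every coded segment."""
    root = ET.fromstring(xml_text.strip())
    # Build a source label such as "PX1" from participant code and round.
    source = f"{root.get('participant')}{root.get('round')}"
    segments = defaultdict(list)
    for node in root.iter("code"):
        # Collect the segment text and normalize internal whitespace.
        quote = " ".join("".join(node.itertext()).split())
        segments[node.get("theme")].append((source, quote))
    return dict(segments)


segments = extract_segments(SAMPLE)

# Simple quantitative view: number of coded segments per theme.
theme_counts = {theme: len(quotes) for theme, quotes in segments.items()}

for theme in sorted(theme_counts):
    print(theme, theme_counts[theme])
```

Keeping the codes as markup inside the transcript means the thematic grouping and the counts can be regenerated from the same file whenever the coding changes.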

Most participants granted permission to share the whole or parts of the data with directly
identifiable information such as names removed. They all had the opportunity to review the data they
contributed ahead of publication and to indicate if there were parts they did not want published or
wanted attributed only “off the record.” The data, including the XML codebook, Python script, interview
guides, transcripts, survey, and consent forms, can be accessed through Zenodo.46



Findings 
The findings first present the need for data stewardship before exploring in greater detail the skills 
and background requested for data stewards, which are used in the development of the personas. 

The need for data stewards 
Several participants pointed towards a need for data stewardship. The vocabulary used to describe 
this need varied along with expectations as to how this role should be filled. The practical challenges 
of data planning, data management, and data curation were explored, along with collaborative skills 
between existing research support services, data stewards, and researchers.  

Data management does require knowledge of research. Respondent ITE (research support, university
IT; cf. Table 1) emphasized that “the researchers know their data so only they will be able to describe
their data, but they need help from the data curators” (ITE1). The data curator role described by ITE
is defined as working with the departments to create continuity and preserve valuable digital data.
ITE also believed in employing data curators to avoid data loss when temporary staff leave and
further described the need for data stewardship and data management as a consequence of
data-intensive research. Among the researchers, RIZ feared that the general data steward would not be
able to understand the context: “To do this type of job you must know the context, and to do this on 
an industrial scale might work in some cases but probably not in all” (RIZ1). As RIZ pointed out, some 
data types might be easy to structure and organize with a lower degree of specialized knowledge, 
whereas other types require a higher degree of specialization and domain-specific knowledge.  

Understanding when different types of knowledge are required is also an issue, as well as 
understanding what one can expect researchers to do themselves versus what they need additional 
expertise to do, such as making data interoperable and creating a data management plan (DMP). 
Participant ROC addressed the long-term perspective: “I don’t believe any researcher can have the 
responsibility to follow the data from collection […] until they are ready to be stored for maybe 1000 
years.” Making decisions on what should be selected for long-term storage itself requires expertise in 
addition to performing the actual data preservation. 

Two of the researchers working in larger collaborations have hired, or are in the process of hiring,
data stewards. RGA works on a multidisciplinary project that generates large volumes of data from
a variety of sources, while RGD collects social science data from previous projects for reuse in a new
context. Both agreed on the need for data management: “The largest need is for human resources to
manage data” (RGA1) and “For us, research data means how to integrate data from all these sites,
how to harmonize, standardize, and integrate them, and then how to analyze them in a way that 
something new comes out of that” (RGD1). RGD also described how several people are working with 
different aspects of data management, from data cleaning and access control to the re-collection of 
consent from participants.  

Collaboration was also raised as a challenge: “I believe it is important with such a holy trinity that
IT, library, and administration could become if they would work together” (RGA1). She pointed to the
need to combine different people with different skills and different backgrounds to solve complex 
issues and to create robust data management services. Also, one of the research office staff noted 
that collaboration is key: “There is not one such person, one that knows everything, it is more than a 
Kinder egg, more than three things, at least four or five things you need to have thorough knowledge 
of” (ROT1). Metaphors such as “Kinder egg” or “Holy Trinity” have similarities with the “data 
unicorn,”47 indicating that expectations for research support services for research data ought to be 
collaborative to deliver the complexity of skills required. In the survey, one participant from the
library explained that, rather than a single person, she saw a team of people with complementary
competencies as the best solution.

Researcher RGW explained that data management was the responsibility of the professor in charge 
of the lab: “In our group we don’t actually have a data manager, but it is mostly the job of the 
professor. The data type has been fixed a couple of years ago that the data should be analyzed in 
such and such a way, so it has the same data structure. The professor acts like a data manager also. 
But because we are temporary researchers, and we have our own style, [the] professor should 
decide the data structure” (RGW2). As the majority of the researchers in the lab are there 
temporarily, it is the responsibility of the lab, and the professor who decides the structure and 
formats of the data, to “act like a data manager.” Still, since RGW’s description pointed to a high level 
of awareness, there is likely a formal or informal protocol for data management in the lab, and the 
responsibility belongs to the principal investigator. Also, among research support staff, it is agreed 
that the researchers themselves should be responsible for knowing basic data management: "I think 
that as a researcher, I would not say you are obliged to, but you should know basic data 
management" (LM2). This does not, however, exclude the need for dedicated data managers and 
further highlights the need for available training.  

The participants described the following needs for data stewardship:  

• As research is becoming increasingly data intensive, larger research groups may need to hire 
data managers. 

• Data loss from PhDs, postdocs, and other temporary staff when leaving the university is a 
challenge.  

• To find the right balance between the generalist and the specialist is important in terms of 
playing the right data management support role.  

• A closer collaboration between IT, library, and research offices is needed. 
• Not all researchers can be expected to do data management on their own, yet it is the
responsibility of the researcher to ensure good data management in his or her research.
 

Collaboration and communication between the different support levels 
In the survey, there were 20 responses to the question regarding the workplace of the data steward, 
which showed a general agreement that closeness to the research environment is essential. In 
particular, the researchers emphasized that the employee should work in the research groups. The 
research groups and/or departments (10) were mentioned most frequently, but the research
administration (4) and the library (5) were also suggested as appropriate work environments for the 
data steward. One suggested the national research data infrastructures as the appropriate place to 
employ the data steward. In the interviews, several participants elaborated on this by emphasizing 
how collaboration and communication between different levels of support within the universities are 
crucial:  

You need to create a system where these people actually work together and are able to 
interact in a good way. […] There is a pulverization of responsibilities absolutely everywhere,
and with such a research data center it might be possible to avoid this, given the entry points 
and the information flow and such. (RGA2) 

Both responses show how the workplace of the data steward is one issue to consider, but many 
challenges are related to organizational culture, organization, and information flow. One of the 
researchers suggested coordination between the universities to ensure standardized and high-quality
services: “You need a way to assure that you even out the pressure from place to place, so you don’t
end up in one bubble, each with the development of strange subcultures; this is important to avoid, 
difficult to avoid but very important” (RIJ2). The workplaces of the data stewards need to be 
interconnected in networks of information and skills exchange locally and, perhaps, nationally and 
internationally. As RIJ notes, hiring data stewards without facilitating knowledge exchange can easily 
create dysfunctional subcultures rather than interoperable data.  

  
Speaking the same language  
The respondents (21) mentioned several educational backgrounds in different combinations; the
results have been split and grouped in Figure 2. Further, four mentioned the master’s level, and five
mentioned the PhD level as the appropriate educational level. Others suggested higher education 
without specifying the degree level. The respondents all suggested that the ideal candidate would be 
a highly educated person preferably with research experience, often in combination with a 
background in data stewardship, IT, or library and information science (LIS).  

 
Figure 2 Preferred background for data stewards 

As one of the staff members at the university explained, a PhD degree can be a gateway to 
communication with the researchers to “speak their language” and create trust:  

 
It helps if they all speak the same language. That is part of the success in my department, 
where half of the staff have a PhD, so we can communicate with the researchers. […] You first 
need some positive and some negative experiences in order to make the transition. Someone 
has done this internally […] others who have not committed huge mistakes yet, they just 
continue to build, data on data on data and more data, without any control. (ITI2)  

 
Research experience can be one way of creating trust; knowledge of the data
types and methods used in the field is another point of entry. However, ITI believed that the need for
data management must often be experienced by the researchers before it can be taken seriously. 



Similar backgrounds help in creating relational bonds and trust between researchers and data 
stewards: 
 

I think this is also a kind of confidentiality. There is something like a role you trust, like that 
person is to be really trusted, so I think it would be, I don’t know, but if I was a researcher I 
would be a bit, I don’t know maybe awkward to contact somebody who is just a data 
manager and is not related closely to my field. (RGD2) 

 
One of the researchers described a fear that the data stewards operate using their own agendas: 
 

If you enter on the side in that mean of adding an additional agenda beyond solidity and 
such, that sometimes that might be an advantage for some types of projects, for others it 
might be alarming, both economic and in terms of work environment. (RIJ2) 
 

Research experience among data stewards or similar disciplinary backgrounds are possible strategies
to create common ground between data stewards and researchers. These strategies might also help
to avoid additional agendas on the part of data stewards. Interest in research or the research topic
might help to assure the researchers that solidity and reproducibility are the stewards’ primary
motivations for data management. There are, however, already several agendas present in the field
of data management, such as economic interests48 and an interest in exploring existing data in new
ways through data science.49 For the researchers, on the other hand, the purpose of data
management is primarily to document and archive research data for their own reuse and for proof of 
reproducibility. One of the policymakers pointed to this conflict of interests and motivations: “A risk 
in this area, and what we have seen until now that the area suffers from, is that library and archive 
people, bureaucrats, and non-researchers have taken a strong role of leadership” (off record). When 
policymakers, archivists, IT developers, data scientists, and librarians all see different potential in 
research data, these interests might come to overshadow the core: the quality of research and the 
challenge in overcoming the reproducibility crisis.  
 
The question of what motivates data stewards in doing their job becomes important for building
relations with the researchers. Twelve participants answered this question. Ethical motivations
and genuine engagement in research were seen as the most important motivations: “Engagement 
both with good research and ethical data management,” “the enjoyment of assisting researchers in 
taking care for their data and sharing data in a safe way,” and “[Contributing] to making research 
transparent and verifiable.” Other responses described a methodical person with a genuine interest 
in research who can provide a valuable contribution by organizing, providing services, building 
something together as a team, and contributing to science.
 
Dividing tasks but maintaining responsibility  
When asked to write a short biography, nine participants responded. One of the descriptions given
was that of a “technical and tidy person,” and other characteristics included a good overall
understanding of research and of the research data life cycle: “The person must be mature or
experienced enough to understand the range of the field of data management and curation, and the
limitations for what should be shared and [to] understand the whole lifecycle of research data in
projects.” Another participant described “a service-minded person able to work closely with several
research teams.” Thus, both emphasize that technical and social skills are necessary, along with
experience, knowledge, and the ability to provide professional guidance. One participant gave a
longer description of a researcher who wants to work in depth with data and who enjoys both the
service and the problem-solving aspects of data stewardship.

Balancing the interest in the research with motivations to keep the data structured and documented
to enhance the quality of the research results without adding additional agendas is important. Still,
the involvement of the data stewards must be balanced in such a way that the responsibility of the 
research data is not completely transferred away from the researchers:  
 

You hope that data management should become embedded in normal research practice, for 
much of this can, with fairly simple means, become part of existing routines. […] Because the 
problem, if you get a data manager in the group, is that the others might not take as much 
responsibility for the data management (LM2) 

 
The interviewed researchers shared the concern of LM. A data steward must provide support 
without creating an excuse to transfer the responsibility from the researchers; when data are 
deposited in an archive, a transfer of responsibility can take place: 
 

The researchers themselves must be responsible […] I realize this myself in part of these 
discussions, that one thinks ‘yes, we create this role, and then everything is solved,’ but it is
not at all in that way. Because the researcher sits with the data set and needs to make sure 
this is in order, and then you need a curation function, but that again depends on the data set 
and where you are in your research process. […] But from the moment we have a publication, 
with a corresponding data set, made available, then the data set will still need curation, but 
then you are more on the library side. First the researcher needs to sign off the responsibility, 
and then others take it on. (RGA2) 

 
Another option proposed by RIZ is not to create data stewardship positions, but to distribute 
responsibility among existing researchers in a group: 
 

I would say that the competency should be in the group, and not in an extra position; I believe 
there are other positions more important to prioritize, so I guess I am against all these, but 
the nearer the better. (RIZ2) 

 
RGA and RIZ work in extremely different research environments: while RGA works in a collaborative 
and data-intensive environment, which employs its own data managers, RIZ is a theorist who
collaborates with other researchers on publications. RIZ’s point of assigning responsibility for
ensuring data quality to the researcher is representative of the view of many researchers, in 
particular those working independently. She argued that the quality of your data is the quality of 
your research, and your responsibility as a researcher. 
 
Fifteen participants listed different skills as being necessary for data stewardship; some skills were 
mentioned several times. The skills mentioned are analyzed and grouped in Table 3. Different labels, 
such as personal skills, general skills, research skills, knowledge of law and policy, technical skills, and 
archiving skills, differentiate the variety of skills listed. The label “general skills” is used for skills that 
are found to apply to more than one of the other categories. Knowledge of metadata is the most
commonly mentioned skill. However, none of the researchers mentioned metadata explicitly. There is
one mention of “data management and storage for further use,” while another writes about “coding, 
systematization and law”; this is the response that best reflects the feedback from the researchers, 
along with responses that emphasize personal skills, such as creativity, punctuality, and good 
communication skills.  

 

Table 3. Data stewardship skills (times mentioned in parentheses)

Personal skills: Structured and organized (4); Accurate (5); Dialog with end user/communication (2);
Creative (1); Flexible (1); A problem solver able to think outside of the box (1); Good listener (1)

General skills: Knowledge of research (4); Research ethics (3); Knowledge of the FAIRvii principles (1);
Data management and storage for further use (1); Ability to work with guidelines and documentation (1)

Research skills: Knowledge of discipline specific terminology (2); Ability to understand discipline
specific needs (1); Statistics and methodology (1)

Law and policy: Understanding and interpretation of policies (3); Knowledge of law and juridical
aspects (2); Define policies (2); Personal privacy (1); IP-law (1); Familiar with DMP procedures (2)

Technical skills: Programming, Coding, Scripting (4); Technical aspect of data management (1);
Ability to work with large databases and LIMS (2); User interface (1); Data transformation (1)

Archiving skills: Metadata related (6) (here under: metadata demands, standards, documentation,
descriptive metadata); Familiarity with organizing and planning for different types of research
data (2); Systematization (2); Digitization (1); Search (1); Data archives (1); Archival standard for
curation and secure long-term archival storage (1)
 
The Personas 
Based on the analyses, the placement of a support service in the right context, and with appropriate
channels of communication and collaboration, appears to be one of the major challenges of
delivering appropriate services. As a workplace for two of the data steward personas, the Research
Data Service Center (RDSC) has been developed. The RDSC draws inspiration from the
development of digital scholarship centers but takes a multidisciplinary approach and places its
emphasis on strengthening collaboration between the different research support services within a
university. Several participants requested better collaboration in order to provide better data
management support; the suggested RDSC is one response to this. In the RDSC, the library, IT, and
the research administration are aligned in a partnership for coordinated research data support.
Further, three different data steward personas filling different roles and levels of support are
presented: the RDM service coordinator, the data curator, and the data manager. Again, it is
necessary to emphasize that personas are fictional entities, and real people could be filling these
roles. The number of data stewards will vary depending on institution size. The survey responses
gave a mix of male, female, and gender-neutral names, and the personas have been carefully
constructed to reflect this. The author selected illustration photos to give the personas more of an
identity by providing them with a face; care has been taken to avoid stereotyping. The names and
photos were presented to the participants in the final interview. None of the participants offered
any opinions on either; they focused instead on the roles and skills embedded in each persona while
referring to each by name.

vii Findable, Accessible, Interoperable and Reusable: FAIR Guiding Principles for scientific data management and
stewardship

 
The Research Data Service Center 
The RDSC is run collaboratively by IT, the library, and the research office at the university. The RDSC
has been established to solve issues of RDM support and training but also supports other related
research skills, such as data visualization, data analysis software, and statistics. The services it
offers are divided into core services provided by RDSC staff and coordinated services where the RDSC
is the host for related networks and courses. The RDSC is designed to be user-centered and
responsive to current needs among researchers, testing and offering the latest technologies for
research data. By having an approval function for data management plans and by coordinating
network meetings for data managers, the staff map and respond to the knowledge level and needs of
their local environment. The RDSC is thus up to date on challenges and needs in its community.
Further, the staff collaborate closely with different departments at the university to ensure
that data management training is offered to researchers and graduate students.

Core services 

• DMP review and consultancy 

• One-to-one data management support for PhDs and researchers 

• Courses in data management  

• Coordination of the “peer-support network” of data managers 

Coordinated services 

• Hosting courses focusing on skills for research (Python, poster design, R and other courses 
provided by the Carpentry community). 

• Hosting other peer support networks (Carpentry study group, R-ladies etc.) 

• FAIR training courses 

There are three groups of staff at the center: permanent staff, student staff and associated staff. In 
addition, they collaborate closely with the data protection officer and with a network of data 
managers hired by a research group. 

The permanent staff includes one RDM service coordinator, Kim, and data curators, of whom David is
one.

Based on requests, student staff are hired from a pool of data science students and PhD candidates. 
This offers students interested in data management an opportunity to practice and brings new
expertise into the center. Some of these students end up being hired as data managers in
data-intensive research groups upon graduation.

Associated staff work at the research office, library, and in IT but have some tasks at the RDSC. 
Typically, expertise on data analysis software and statistics is offered by IT staff, support on
writing DMPs and on grant fulfillment is offered by the research office, and support on metadata and
data archiving is offered by library staff. In addition, each individual brings their own skills, some with
graphic design, others with ontology building, artificial intelligence, interaction design, or semantic 
web technologies. This renders the center an interdisciplinary environment that focuses on 
collaboration and RDM, as well as the proliferation of skills for data-centered research. 

The RDM Service Coordinator – Kim Smith 
Kim Smith is the coordinator and communicator for the RDM service. She has a master’s degree in
LIS and several years of experience at the university library. Kim works as RDM Service Coordinator at 
the RDSC and is responsible for the data management services at the university. She has the 
overview and coordinates everyone involved at the RDSC. Kim enjoys teaching and presides over 
several of the RDM training courses offered at the university. Through a series of workshops held at 
the center, she has given several researchers and master’s students their first RDM course. She also 
advises on privacy and copyright issues, and while she does not have a background in law, experience 
has made her able to advise on many of the issues that occur. When in doubt, she consults the data 
protection officer. Kim is also responsible for the review and approval of DMPs. The workload is, 
however, shared, and the plans are reviewed collaboratively at the center. Through DMP reviews, 
Kim, David and other staff at the RDSC are able to identify potential challenges at an early stage and 
offer support. In addition, Kim is active in the international coordination work done with the
Research Data Alliance.

• Communication and interpretation 

• Policy expertise 

• Research ethics and personal privacy 

• Intellectual property law 

• Data management plans  

• Metadata 

Motivation: Contribute to making research transparent and verifiable and build new knowledge in
the organization.

Kim believes that proper data management can solve the reproducibility crisis and help rebuild trust in
research in society in general. With a background as a librarian, she is focused on data quality and
long-term curation. Kim is also concerned about maintaining the legacy of prominent researchers at
her university. Her colleagues describe her as structured and strategic. 

Photo 1 Kim Smith, Ill. from Colourbox 



The Data Curator – David Carpenter 
David holds a PhD in computational linguistics and has many years of experience with data-intensive
research. Recently, he has taken a course in data stewardship. David has a scientifically oriented, 
analytical mindset. He had been engaged for several years in data-driven research, but he became 
more interested in the challenges related to ontologies and metadata definitions, and less interested 
in scientific topics and final publications over time. David is good at convincing researchers that a by-
product of proper data management is an increased number of citations, leading to more 
accreditations. 

• Systematization 

• Making data FAIR 

• Metadata, documentation, and provenance 

• Data archives and archiving 

• Coding 

• Data mining 

• Formatting and data transformation   

Motivation: David enjoys translating between disciplines, understanding researchers’ needs, and 
solving problems. 

David loves research and the university as a work environment, but he prefers working with the data 
rather than publishing. He is described as accurate and systematic.

Photo 2 David Carpenter, Ill. from Colourbox

The Data Manager – Kari Anderson
Data manager Kari Anderson is the disciplinary specialist, while the staff at the research support
center are the generalists. She is one of the data managers working in the data-intensive research
groups at the university. The data managers meet monthly at the peer support network at the RDSC
to exchange experiences and solve concrete problems. Kari makes sure there is an agreement on
standards and protocols for data management within the research group. When new staff are hired or
students are participating, she makes sure they are briefed on data management before touching
anything. Kari identifies with the other researchers in the group. She is good at picking up on
potential issues at an early stage, and if someone has problems with conversions, transfer, or the
merging of data, she loves the challenge. She also focuses on deleting what is obsolete, rather
than keeping every version of everything.

Kari has a PhD in neuroscience and is fascinated by classification. Through statistical classification,
she has developed an interest in AI. She worked closely with a research group during her
master’s and was later hired as a PhD candidate. During her PhD period, her role gradually became
more that of a data manager, and when a new center for brain research was established, she was
hired as a data steward. She is also taking some extra courses within data science to work with still
more methods and disciplines as a data manager/data scientist.

Through the RDM network at the university, she learned of the Research Data Alliance and is now
engaged in the health data interest group, where she keeps up to date. Still, her heart is most at
home in the R-ladies network.

• Documentation  

• Working with large databases 

• Coding 

• Systematization 

• Data transformation 

• Metadata standards 

• Interoperability  

Motivation: She loves working in the creative environment of research while still clocking office 
hours. 

At the lab, she is described as the right hand of the professor, the go-to person for the people 
working there, and a creative and hard-working part of the team. 

Persona summary 
By creating the personas Kim Smith, David Carpenter, and Kari Anderson, the aim has been to 
visualize and concretize one example of how both a team providing general support and a data 
steward working within a research group can function. What is crucial is that the data stewards have
a genuine interest in contributing to research and a combination of the right soft skills and
knowledge of research along with technical, law and policy, or archival skills. The personas can be
applied both in the development of software solutions and as inspiration when creating better 
research data support at the institutions. 

Conclusion 
The findings from this study show that outreach, education, and problem-solving are only some of 
the keys to the creation of a functional service for data management. There are several concerns that 
must be taken into account as a service is developed.  

Four primary challenges for providing data stewardship at universities are identified: 

1. Placement of responsibility: Researchers must retain their responsibility for data throughout
the research cycle. When depositing to a data archive, responsibility can be transferred if the
selected archive offers curation services.

2. Communication: Lines of communication between support levels must be established to
avoid closed subcultures and to exchange best practices between domains.

3. Knowledge of data and methods: There is a need for local and specialized expertise within an
increasing number of domains. It is necessary to find the appropriate degree of disciplinary
knowledge to provide support. Knowledge of research is essential; however, the researchers
are responsible for data management in their projects.

4. Joint research support effort: Research data management requires several different types of
expertise that traditionally are spread among different research support departments at
universities. The creation of a general research data support team or center with connection
to the research office, IT, and the library is crucial to cover all aspects of data management.

 
Photo 3 Kari Anderson, Ill. from Colourbox

One solution can never fit all and, while a general team will be able to solve and support a wide
range of issues, many larger research communities need dedicated staff with specific knowledge of
the issues and concerns that are relevant for their research data. While data management is 
gradually becoming current practice within several data-intensive communities, it is also needed 
among researchers producing and collecting small heterogeneous datasets, referred to as the long 
tail of research data;50 a research data support center is an attempt to resolve this. A general team 
will function as a professional network for discipline-specific research data staff and could potentially 
assist research groups in recruitment and transfer of skills and knowledge across disciplinary 
boundaries. Motivated by contributing to research, data stewards can be recruited among both 
graduate students and researchers; however, understanding of research and research methods is 
important.  
 
 
References 
 

1 Robert King Merton, “The Sociology of Science: Theoretical and Empirical Investigations” 
(Chicago: University of Chicago Press, 1973). 
2 Christine L. Borgman, Big Data, Little Data, No Data: Scholarship in the Networked World
(Cambridge, MA: MIT Press, 2015); Peter T. Darch, “Limits to the Pursuit of Reproducibility: 
Emergent Data-Scarce Domains of Science,” in Transforming Digital Worlds, ed. Gobinda 
Chowdhury et al., vol. 10766 (Cham: Springer International Publishing, 2018), 164–74, 
doi:10.1007/978-3-319-78105-1_21. 
3 Rob Kitchin, The Data Revolution: Big Data, Open Data, Data Infrastructures & Their 
Consequences (Los Angeles, Calif: SAGE Publishing, 2014). 
4 Alma Swan and Sheridan Brown, “The Skills, Role and Career Structure of Data Scientists 
and Curators: An Assessment of Current Practice and Future Needs,” Report to the JISC (Key 
Perspectives Ltd, 2008); Robin Rice, The Data Librarian’s Handbook (London: Facet, 2016). 
5 Kitchin, The Data Revolution. 
6 Michael J Scroggins et al., “Thorny Problems in Data (-Intensive) Science,” Publications (Los 
Angeles, California: UCLA: Center for Knowledge Infrastructures, 2019), 
https://escholarship.org/uc/item/31b1z69c; Marta Teperek et al., “Data Stewardship
Addressing Disciplinary Data Management Needs,” International Journal of Digital Curation 
13, no. 1 (December 27, 2018): 141–49, doi:10.2218/ijdc.v13i1.604; German Council for 
Scientific Information Infrastructures (RfII), “Digital Competencies – Urgently Needed! 
Recommendations on Career and Training Prospects for the Scientific Labour Market” 
(Göttingen: German Council for Scientific Information Infrastructures (RfII), 2019), 
http://www.rfii.de/?p=4015. 
7 German Council for Scientific Information Infrastructures (RfII), “Digital Competencies – 
Urgently Needed!”; Philipp Conzett and Lene Østvand, “Støttetenester for 
forskingsdatahandtering på UiT Noregs arktiske universitet – erfaringar og forslag til beste 
praksis,” Nordic Journal of Information Literacy in Higher Education 10, no. 1 (May 31, 2018): 
65–80, doi:10.15845/noril.v10i1.283. 
8 Kristin R. Eschenfelder and Kalpana Shankar, “Of Seamlessness and Frictions: Transborder 
Data Flows of European and US Social Science Data,” in Sustainable Digital Communities: 
15th International Conference, IConference 2020, Boras, Sweden, March 23–26, 2020, 
Proceedings, ed. Anneli Sundqvist et al., vol. 12051, Lecture Notes in Computer Science 
(Cham: Springer International Publishing, 2020), 695–702, doi:10.1007/978-3-030-43687-2. 
 


 
9 C. Lewis and J. Contrino, “Making the Invisible Visible: Personas and Mental Models of 
Distance Education Library Users,” Journal of Library and Information Services in Distance 
Learning 10, no. 1–2 (2016): 15–29, doi:10.1080/1533290X.2016.1218813. 
10 Mark D. Wilkinson et al., “The FAIR Guiding Principles for Scientific Data Management and 
Stewardship,” Scientific Data 3 (March 15, 2016): 160018, doi:10.1038/sdata.2016.18; Sara 
Rosenbaum, “Data Governance and Stewardship: Designing Data Stewardship Entities and 
Advancing Data Access: Data Governance and Stewardship,” Health Services Research 45, no. 
5p2 (2010): 1442–55, doi:10.1111/j.1475-6773.2010.01140.x; Swan and Brown, “The Skills, 
Role and Career Structure of Data Scientists and Curators”; Jingfeng Xia and Minglu Wang, 
“Competencies and Responsibilities of Social Science Data Librarians: An Analysis of Job 
Descriptions,” College & Research Libraries 75, no. 3 (May 1, 2014): 362–88, 
doi:10.5860/crl13-435; Anna Clements, “Research Information Meets Research Data 
Management … in the Library?,” Insights: The UKSG Journal 26, no. 3 (November 1, 2013): 
298–304, doi:10.1629/2048-7754.99. 
11 Xia and Wang, “Competencies and Responsibilities of Social Science Data Librarians”; 
Rebecca A. Brown, Malcolm Wolski, and Joanna Richardson, “Developing New Skills for 
Research Support Librarians,” Australian Library Journal 64, no. 3 (July 3, 2015): 224–34, 
doi:10.1080/00049670.2015.1041215; Lisa Federer, “Defining Data Librarianship: A Survey of 
Competencies, Skills, and Training,” Journal of the Medical Library Association 106, no. 3 
(July 2018): 294–303, doi:10.5195/jmla.2018.306; Lyn Robinson and David Bawden, “‘The 
Story of Data’: A Socio-Technical Approach to Education for the Data Librarian Role in the 
CityLIS Library School at City, University of London,” Library Management 38, no. 6/7 
(August 15, 2017): 312–22, doi:10.1108/LM-01-2017-0009; Mary Anne Kennan, “‘In the Eye 
of the Beholder’: Knowledge and Skills Requirements for Data Professionals,” Information 
Research-an International Electronic Journal 22, no. 4 (December 2017): 1601; Fabian 
Cremer, Claudia Engelhardt, and Heike Neuroth, “Embedded Data Manager - Embedded 
Research Data Management: Experiences, Perspectives and Potentials,” Bibliothek 
Forschung Und Praxis 39, no. 1 (April 2015): 13–31, doi:10.1515/bfp-2015-0006; Andrew M. 
Cox and Sheila Corrall, “Evolving Academic Library Specialties,” Journal of the American 
Society for Information Science and Technology 64, no. 8 (August 2013): 1526–42, 
doi:10.1002/asi.22847. 
12 Federer, “Defining Data Librarianship.” 
13 Brown, Wolski, and Richardson, “Developing New Skills for Research Support Librarians.” 
14 Federer, “Defining Data Librarianship.” 
15 Kennan, “‘In the Eye of the Beholder.’” 
16 Ibid. 
17 Cox and Corrall, “Evolving Academic Library Specialties.” 
18 Minglu Wang, “Supporting the Research Process through Expanded Library Data Services,” 
Program-Electronic Library and Information Systems 47, no. 3 (2013): 282–303, 
doi:10.1108/PROG-04-2012-0010; Ricardo L. Punzalan and Adam Kriesberg, “Library-
Mediated Collaborations: Data Curation at the National Agricultural Library,” Library Trends 
65, no. 3 (Winter 2017): 429–47, doi:10.1353/lib.2017.0010; T.P. Bardyn, T. Resnick, and S.K.
Camina, “Translational Researchers’ Perceptions of Data Management Practices and Data 
Curation Needs: Findings from a Focus Group in an Academic Health Sciences Library,” 
Journal of Web Librarianship 6, no. 4 (2012): 274–87, doi:10.1080/19322909.2012.730375. 
 


 
19 Meredith N. Zozus et al., “Analysis of Professional Competencies for the Clinical Research 
Data Management Profession: Implications for Training and Professional Certification,” 
Journal of the American Medical Informatics Association 24, no. 4 (July 2017): 737–45, 
doi:10.1093/jamia/ocw179; Gabriele Schnapper et al., “Data Managers: A Survey of the 
European Society of Breast Cancer Specialists in Certified Multi-Disciplinary Breast Centers,” 
Breast Journal 24, no. 5 (October 2018): 811–15, doi:10.1111/tbj.13043; Martin Dugas and 
Susanne Dugas-Breit, “Integrated Data Management for Clinical Studies: Automatic 
Transformation of Data Models with Semantic Annotations for Principal Investigators, Data 
Managers and Statisticians,” Plos One 9, no. 2 (February 28, 2014): e90492, 
doi:10.1371/journal.pone.0090492; R. Esser, “Biostatistics and Data Management in Global 
Drug Development,” Drug Information Journal 35, no. 3 (September 2001): 643–53, 
doi:10.1177/009286150103500302; George Hripcsak et al., “Health Data Use, Stewardship, 
and Governance: Ongoing Gaps and Challenges: A Report from AMIA’s 2012 Health Policy 
Meeting,” Journal of the American Medical Informatics Association 21, no. 2 (March 2014): 
204–11, doi:10.1136/amiajnl-2013-002117; Hong Huang et al., “Prioritization of Data Quality 
Dimensions and Skills Requirements in Genome Annotation Work,” Journal of the American 
Society for Information Science and Technology 63, no. 1 (January 2012): 195–207, 
doi:10.1002/asi.21652; Crystal Kallem, “Data Stewardship,” Journal of the American Health 
Information Management Association 79, no. 9 (2008): 58-59;63. 
20 John Cartwright, Jesse Varner, and Susan McLean, “Data Stewardship: How NOAA Delivers 
Environmental Information for Today and Tomorrow,” Marine Technology Society Journal 
49, no. 2 (April 2015): 107–11, doi:10.4031/MTSJ.49.2.11; T. A. Boden, M. Krassovski, and B. 
Yang, “The AmeriFlux Data Activity and Data System: An Evolving Collection of Data 
Management Techniques, Tools, Products and Services,” Geoscientific Instrumentation 
Methods and Data Systems 2, no. 1 (2013): 165–76, doi:10.5194/gi-2-165-2013; Aj Barrett, 
“Socioeconomic Aspects of Materials Data - Serving the User,” Journal of Chemical 
Information and Computer Sciences 33, no. 1 (February 1993): 22–26, 
doi:10.1021/ci00011a004; Kenneth R. Knapp, “Scientific Data Stewardship of International 
Satellite Cloud Climatology Project B1 Global Geostationary Observations,” Journal of 
Applied Remote Sensing 2 (2008): 023548, doi:10.1117/1.3043461; Kenneth R. Knapp, John J. 
Bates, and Bruce Barkstrom, “Scientific Data Stewardship - Lessons Learned from a Satellite-
Data Rescue Effort,” Bulletin of the American Meteorological Society 88, no. 9 (September 
2007): 1359–61, doi:10.1175/BAMS-88-9-1359; Xin Li et al., “Toward an Improved Data 
Stewardship and Service for Environmental and Ecological Science Data in West China,” 
International Journal of Digital Earth 4, no. 4 (2011): 347–59, 
doi:10.1080/17538947.2011.558123; R.R. Downs and R.S. Chen, “Designing Submission and 
Workflow Services for Preserving Interdisciplinary Scientific Data,” Earth Science Informatics 
3, no. 1 (2010): 101–10, doi:10.1007/s12145-010-0051-6; R.R. Downs et al., “Data 
Stewardship in the Earth Sciences,” D-Lib Magazine 21, no. 7–8 (2015), 
doi:10.1045/july2015-downs; Helena Karasti et al., “Knowledge Infrastructures: Part I (Guest 
Editorial),” Science & Technology Studies 29, no. 1 (2016), 
http://ojs.tsv.fi/index.php/sts/article/download/55406/pdf_1; E.A. Kihn and C.G. Fox, 
“Geophysical Data Stewardship in the 21st Century at the National Geophysical Data Center 
(NGDC),” Data Science Journal 12 (2013): WDS193–96, doi:10.2481/dsj.WDS-033; T.P. 
Lauriault, P.L. Pulsifer, and D.R.F. Taylor, “The Preservation and Archiving of Geospatial
Digital Data: Challenges and Opportunities for Cartographers,” Lecture Notes in
Geoinformation and Cartography, no. 9783642127328 (2010): 25–55, doi:10.1007/978-3-
642-12733-5_2; D.J. Lowe, “The Geological Data Manager: An Expanding Role to Fill a 
Rapidly Growing Need,” Geological Society Special Publication 97 (1995): 81–90, 
doi:10.1144/GSL.SP.1995.097.01.10; R.D. McDowall, “Understanding Data Governance, Part 
I,” Spectroscopy (Santa Monica) 32, no. 2 (2017): 32–38; C.J. Moore and R.E. Habermann, 
“Core Data Stewardship: A Long-Term Perspective,” Geological Society Special Publication 
267 (2006): 241–51, doi:10.1144/GSL.SP.2006.267.01.18; T. Nadim, “Data Labours: How the 
Sequence Databases GenBank and EMBL-Bank Make Data,” Science as Culture 25, no. 4 
(2016): 496–519, doi:10.1080/09505431.2016.1189894. 
21 “European Open Science Cloud,” Nature Genetics 48, no. 8 (2016): 821–821, 
doi:10.1038/ng.3642. 
22 Ibid. 
23 Cartwright, Varner, and McLean, “Data Stewardship.” 
24 Boden, Krassovski, and Yang, “The AmeriFlux Data Activity and Data System”; Li et al., 
“Toward an Improved Data Stewardship and Service for Environmental and Ecological 
Science Data in West China”; Kihn and Fox, “Geophysical Data Stewardship in the 21st 
Century at the National Geophysical Data Center (NGDC).” 
25 Cartwright, Varner, and McLean, “Data Stewardship”; Knapp, Bates, and Barkstrom, 
“Scientific Data Stewardship - Lessons Learned from a Satellite-Data Rescue Effort”; Huang et 
al., “Prioritization of Data Quality Dimensions and Skills Requirements in Genome 
Annotation Work.” 
26 Rikk Mulligan, SPEC Kit 350: Supporting Digital Scholarship (May 2016), SPEC Kit 
(Association of Research Libraries, 2016), doi:10.29242/spec.350. 
27 Clements, “Research Information Meets Research Data Management … in the Library?” 
28 Teperek et al., “Data Stewardship Addressing Disciplinary Data Management Needs.” 
29 Ibid. 
30 Matt Greenhall, “Digital Scholarship and the Role of the Research Library,” The result of 
the RLUK digital scholarship survey (London: RLUK, 2019),
https://www.rluk.ac.uk/wp-content/uploads/2019/07/RLUK-Digital-Scholarship-report-July-2019.pdf.
31 Saša Baškarada and Andy Koronios, “Unicorn Data Scientist: The Rarest of Breeds,” 
Program 51, no. 1 (January 1, 2017): 65–74, doi:10.1108/PROG-07-2016-0053; Kennan, “‘In 
the Eye of the Beholder.’” 
32 Kennan, “‘In the Eye of the Beholder.’” 
33 Teperek et al., “Data Stewardship Addressing Disciplinary Data Management Needs.” 
34 Kathryn Lage, Barbara Losoff, and Jack Maness, “Receptivity to Library Involvement in 
Scientific Data Curation: A Case Study at the University of Colorado Boulder,” Portal: 
Libraries and the Academy 11, no. 4 (2011): 915–937; Kevin Crowston, “User Personas” 
(DataOne - Data Observation Network for Earth, 2015), 
https://www.dataone.org/personas/abby-science-data-librarian; Lorna Wildgaard et al., 
“National Coordination of Data Steward Education in Denmark: Final Report to the National 
Forum for Research Data Management (DM Forum)” (National Forum for Research Data 
Management (DM Forum), 2020), https://doi.org/10.5281/zenodo.3609516. 
 


 
35 Jack M. Maness, Tomasz Miaskiewicz, and Tamara Sumner, “Using Personas to 
Understand the Needs and Goals of Institutional Repository Users,” D-Lib Magazine 14, no. 
9/10 (2008): 15. 
36 Crowston, “User Personas.” 
37 Ibid. 
38 Lage, Losoff, and Maness, “Receptivity to Library Involvement in Scientific Data Curation.” 
39 Wildgaard et al., “National Coordination of Data Steward Education in Denmark.” 
40 Erio Ziglio, “The Delphi Method and Its Contribution to Decision-Making,” in Gazing into 
the Oracle - The Delphi Method and Its Application to Social Policy and Public Health, ed. 
Michael Adler and Erio Ziglio (London: Jessica Kingsley, 1996), 3–33. 
41 H. Rex Hartson and Pardha S. Pyla, The UX Book: Process and Guidelines for Ensuring a 
Quality User Experience (Amsterdam; Boston: Elsevier, 2012). 
42 Ibid., 266. 
43 Live Kvale and Nils Pharo, “Understanding the Data Management Plan as a Boundary 
Object through a Multi-Stakeholder Perspective,” submitted for publication, 
https://doi.org/10.2218/ijdc.v15i1.729. 
44 John W. Creswell and Vicki L. Plano Clark, Designing and Conducting Mixed Methods 
Research, 3rd ed. (SAGE Publishing, 2018), 80. 
45 Johnny Saldaña, The Coding Manual for Qualitative Researchers, 3rd ed. (London: SAGE 
Publishing, 2016). 
46 Data from a three-phase Delphi study used to investigate Knowledge Infrastructure for 
Research Data in Norway, KIRDN_Data, 2020, http://doi.org/10.5281/zenodo.3673053. 
47 Kennan, “‘In the Eye of the Beholder’”; Baškarada and Koronios, “Unicorn Data Scientist.” 
48 OECD, ed., Data-Driven Innovation: Big Data for Growth and Well-Being (Paris: OECD 
Publishing, 2015), http://dx.doi.org/10.1787/9789264229358-en. 
49 Chris Anderson, “The End of Theory: The Data Deluge Makes the Scientific Method 
Obsolete,” Wired, June 2008, https://www.wired.com/2008/06/pb-theory/. 
50 P. Bryan Heidorn, “Shedding Light on the Dark Data in the Long Tail of Science,” Library 
Trends 57, no. 2 (2008): 280–99, doi:10.1353/lib.0.0036. 
 


 
Appendix 1.  

Questions describing the data steward in the survey. 

Your ideal data person 

In several interviews the need for a data person of some kind (Data Steward, Data 
Curator, Data Scientist, Data Librarian, “Datarøkter”) was mentioned. In order to get a 
better understanding of who this is or could be, I would like you to spend some minutes 
creating an image of an ideal person. 
 
I am here asking you to create an imaginary character so please use your imagination. 

a. Position/job title  
If you do not see the need for such a position, please give a short explanation on 
why there is no need for this. 

b. Name 

c. Workplace - Where does this person work and who are they employed by?  

d. Background – brief description of work experience and educational background 

e. Bio - Please provide a short description of who this person is. 

f. Skills - please add at least three words that describe what this person is 
particularly good at. 

g. Motivations – please describe what makes this person enjoy their work 

h. Other things - Feel free to add additional information about this person