Zhenjia Fan*
Context-Based Roles and Competencies of Data
Curators in Supporting Research Data Lifecycle
Management: Multi-Case Study in China
https://doi.org/10.1515/libri-2018-0065
Received May 31, 2018; accepted August 06, 2018
Abstract: Focusing on the main research question of what
the critical roles and competencies of data curation are in
supporting research data life cycle management, this
paper adopts a multi-case study method, with data gov-
ernance frameworks, to analyze stakeholders and data
curators, and their competencies, based on different con-
texts from cases from enterprises and academic libraries
in mainland China. Via the context and business analysis
on different cases, critical roles such as data supervisor,
data steward, and data custodian in guaranteeing data
quality and efficiency of data reuse are put forward.
Based on the general factor framework summarized via
existing literature, suggestions for empowering data cura-
tors’ competencies are raised according to the cases. The
findings of this paper are as follows: besides digital
archiving and preservation, more emphasis should be
placed on data governance in the field of data curation.
Data curators are closely related but not equivalent to
stakeholders of data governance. The different roles of
data curators would play their own part in the process of
data curation and can be specified as data supervisor,
data steward, and data custodian according to given
contexts. The roles, competencies, and empowerment
strategies presented in this paper might have both theo-
retical and practical significance for the fields of both
data curation and data governance.
Keywords: data curation, data governance, data life cycle,
data curator, research data management
Introduction
As the data deluge (Poole 2016) progresses, issues
about big data easily become hot topics. However,
data curation appeared much earlier than big data and has
long been practiced in librarianship and information
resource management, even in the era of “small data”.
It is vital for the information professions to recognize data
curators and the competencies they require in given
contexts, whether in a “small” or a “big” data
era. In January 2018, the General Office of the State
Council of the People’s Republic of China promulgated
the “Measures on Scientific Data Management” (State
Council Document No. 17 2018), which clearly points to
the corresponding duties of government departments,
research institutes, universities, and scientific data
centers concerning the process of scientific data
management. The document also sets out the processes of
scientific data collection, submission, storage, sharing, use,
and security. Some R&D institutes, universities, and
enterprises in China had in fact undertaken extensive
data management practices before the document was
issued. As one of the key activities in innovative
enterprises, research institutes, and universities, data
curation has improved the efficiency and quality of R&D
data management in the era of data-intensive research.
Different agents such as university libraries, enterprises,
data centers, and other institutions based on specific
business situations have accumulated many valuable
cases which might have significance for data curation
in both theory and practice.
The majority of studies related to data curation
concentrate on the field of Library and Information
Science, especially focusing on the topic of librarianship.
However, data curation is a business issue that reaches
beyond librarianship. Data curation can be considered as
one of the critical components in R&D management and
it is reflected in different duties such as research man-
agement, information management, data management,
librarianship, and archives (Noonan and Chute 2014).
Although facing a similar practical issue in curation,
data curators will appear as holding different roles with
respect to the context: data librarian in a library; chief
data officer and data manager in an enterprise; data
*Corresponding author: Zhenjia Fan, Department of Information
Resources Management, Business School, Nankai University,
94 Weijin Road, Nankai District, Tianjin, China,
E-mail: fanzhenjia@nankai.edu.cn
LIBRI 2019; 69(2): 127–137
Brought to you by | Universitat de Barcelona
Authenticated
Download Date | 7/22/19 9:35 AM
steward and data scientist in a research institute etc.
There is no doubt that data curator, data steward, and
data custodian are sometimes used interchangeably in
both literature and practices, while it is also difficult to
make a clear distinction among them in the Chinese
context. However, different practical fields use different
terms to describe the data curator. For
example, in many Chinese enterprises, data steward
refers to data curators who mainly focus on data assets,
while data custodian usually refers to data curators who
offer technical support in the process of data
curation. The competencies required also vary in different
contexts; for example, between the field of digital huma-
nities on the one hand and engineering on the other.
A basic consensus is that the purpose of data curation
is to guarantee the quality of research data, improve the
efficiency of data reuse, and serve the research and
development process. Therefore, data curation is not simply
a technical job; it needs a holistic approach involving
governance by multiple stakeholders. With multiple
stakeholders involved, a realistic problem becomes
how to identify common roles in the process of research
data management and to seek common discourses
among the diverse identifications. It is essential to
summarize the roles and competencies from the different
contexts of real data curation practices.
Data literacy is a much debated topic related to roles
and competencies in data curation, and the literacy
frameworks of librarians are usually taken into
consideration first (e.g. Carlson and Johnston 2016; Koltay 2015,
2017). Moreover, data governance (Otto 2011; Soares
2014), with its strong theoretical capacity to incorporate
human, technological, and organizational factors into
one framework to understand the problem, is often
used as the perspective to analyze the stakeholders and
propose effective suggestions. Tools such as databases
and cloud platforms (Soares 2014; Witt 2008) have also
received increasing attention as solutions
for data curation in recent years. Softer factors, such as
regulations and rules (Pryor 2009; Smith 2014) for the
process of data curation, have also attracted several
researchers as a means of guaranteeing a system for the
implementation of data curation.
We must acknowledge that research achievements in
data curation have provided us with many essential
theoretical aids. However, it is also clear that some
basic questions still lack a settled
answer. For example, what is the objective of data curation:
the data set as a result, or the data stream as a business
process? If the answer is the former, another question
arises concerning how to distinguish data curation from
digital archiving; if we prefer the latter,
how do we solve the data governance
issue when it is difficult to consider data curation as an
independent process in a real business context? This
brings practical challenges for data collection
and quality control in data curation, involving intellectual
property, business secrets, state security, etc. It is
necessary to organize and curate data across the data life
cycle from the perspective of data governance.
In addition, it is an oversimplification to treat data
curation as an independent process within a given
organization. Although doing so helps to identify problems
at each step, data curation is often entangled
with other business issues in practice. This makes it
difficult to judge whether data curation is effective
when no specific context is taken into consideration,
which is why it is essential to focus on the business context
when researching data curation.
Against this background, this paper
summarizes the critical roles and competencies using a
multi-case study approach. It describes the status
quo of data curation among agents including enterprises
and libraries in China and discusses the roles of data
curation in such contexts. The cases in this paper cover
significant enterprises including Neusoft, one of the largest
IT corporations in China, and several academic and
university libraries in mainland China.
Literature Review
The process of data curation involves different stake-
holders who take corresponding roles. As discussed in
the introduction, it is essential to review topics such as
data governance, data curation, data literacy and more.
Data Governance and Stakeholders
As governance is mainly related to the interests of busi-
ness sectors, it is understandable that data governance is
seldom addressed by the LIS (Library and Information
Science) literature with few exceptions such as Krier
and Strasser (2014). Similar to corporate governance,
data governance enables better decision-making and pro-
tects the needs of stakeholders. Data governance can be
considered as a set of activities which include planning,
supervision, and enforcement that govern the process
and methodologies carried out to guarantee and improve
the quality of data. Otto (2011) points out that a data asset
is the object of data governance and data management,
with the common purpose to maximize the value of the
data. Data curation, as a data management practice, aims
to improve the level of data reuse and needs to follow
certain governance frameworks and guidance strategies.
Sarsfield (2009, 38) considers data governance as “guar-
antees that data can be trusted and that people can be
made accountable for any adverse event that happens
because of poor quality”.
In the LIS field of China, Gu (2016) emphasizes data
management from the perspective of data life cycle,
which consists of data acquisition, data sharing, data
reuse, and data appreciation. Huang and Lai (2016)
explored the library as the core stakeholder and sum-
marized library, data center, research institutions, gov-
ernment and public sectors, policymakers, funders and
data publishers as the main categories of data govern-
ance subjects.
Data Curation and Roles
The concept of “curation” originated from the field of
museum science and has since been applied in the field
of data management. “Data curation”, first used by
Gray and Szalay (2002), became a popular buzzword (Abrams
2014) and soon gained recognition and attention in
LIS disciplines. The objects of data management, including
data obtained through observation, calculation, experiment,
derivation, and other means, can be expressed as text, numbers,
images, video, audio, software, algorithms, reports, models,
and other forms. The Digital Curation Center (DCC) defines
data curation as all activities that maintain, preserve, and
add value during the life cycle of digital data. In addition
to the “preservation” of the data, the emphasis is on adding
value to data and on reuse. In this process, the collaboration
between data researchers, publishers, managers, and
users is highlighted (Heidorn et al. 2007).
With the continuous advancement in the field of data
management, agents who are engaged in data manage-
ment are also attracting the attention of academia. Swan
and Brown (2008) presented data creators, data scien-
tists, data managers, and data librarians as four profes-
sional roles in data management. Furthermore, the four
roles have relationships with each other and, notably,
data librarians are similar to data curators (Tammaro,
Ross, and Casarosa 2014). After analyzing the roles in
the context of enterprise research data management,
Fan (2017) points out that data curators can be further
subdivided into specific roles such as data supervisor, data
steward, and data custodian.
Lesk (2013) believes that data curators would be a
high-demand profession in the era of big data while
Walters (2011) discusses the new roles of libraries and
librarians, and further emphasizes the collaborative stra-
tegies of data management and long-term preservation.
Academic librarians are often integrated into the research
process, initially in the framework of research data ser-
vices (RDSs) (Tenopir et al. 2016). Henderson (2017) indi-
cates the roles of librarians in data management through
five aspects: collection (data documentation); storage
(data storage, archiving, and preservation); organization
(metadata creation and controlled vocabularies, file nam-
ing); retrieval (data sharing and reusing, data access);
and presentation/dissemination (data privacy, rights,
and publishing). In order to fulfil the requirements of
data curation, data librarians should understand meta-
data, information literacy, scholarly communication,
open access, and both collection and repository manage-
ment (Corrall, Kennan, and Afzal 2013).
Data Literacy and Competencies
Similar to the concept of information literacy, data
literacy is becoming a critical concept in LIS, and it is
relevant throughout the life cycle of
data management. Data literacy can be defined as a
specific skill set and knowledge base, which empowers
individuals to transform data into information and
actionable knowledge by enabling them to access,
interpret, critically assess, manage, and ethically use data
(Koltay 2015). According to Carlson and Johnston (2016),
data literacy can be summarized in three points: it is
embedded in the R&D data flow and related to the data life
cycle; R&D management and data utilization are its
main perspectives; and practical skills such
as data analysis, description, and tools are its
main focus.
Data literacy is closely related to research data ser-
vices that include research data management. In the
context of data governance, skills such as dealing with
licensing terms and agreements, as well as knowledge
about copyright, are already possessed by librarians as
the components of data literacy (Krier and Strasser 2014).
Prado and Marzal (2013) identify a number of abilities that
constitute data literacy, some of which clearly originate from
ALA and ACRL:
– determining when data is needed;
– accessing data sources appropriate to the information
needed;
– recognizing source data value, types, and formats;
– critically assessing data and its sources;
– knowing how to select and synthesize data and combine
it with other information sources and prior knowledge;
– using data ethically;
– applying results to learning, decision making, or pro-
blem solving.
Similar to the list above, a pilot data literacy program,
offered at Purdue University, was built around the follow-
ing skills (Carlson and Bracke 2013):
– planning;
– life cycle models;
– discovery and acquisition;
– description and metadata;
– security and storage;
– copyright and licensing;
– sharing;
– management and documentation;
– visualizations;
– repositories;
– preservation;
– publication and curation.
The ability to identify the context in which data are
produced and reused has also been emphasized.
An analysis of these definitions shows that data
literacy has the following characteristics: it is closely
connected with and embedded in the research business
workflow; it concerns both the data service supplier and the
data user; and it
emphasizes the practical skills of analyzing, displaying,
and using data management tools. The factors discussed
above can be summarized as the general factor
framework shown in Figure 1. The factors can be divided into
structural and agency factors: factors outside the data curators’
control are structural factors, while agency factors are mainly
related to data literacy. Both structural and agency factors
affect data curation, and data curation can lead to
achievements. Furthermore, achievements can loop back
to the context of the data curation. Data curation can therefore
be regarded as a bridge among different stakeholders.
Figure 1: General factors framework of data curators.
Research Design
Perspectives on data curation vary, and this variability
should be considered together with the
different contexts. In view of the existing literature, this
study adopts the analysis approach of
“context-business-role-competency” to summarize the roles from real practice.
The core research questions of this study are as follows:
– How does the context affect the process of data
curation?
– Who are the stakeholders and what are the key roles
in data curation?
– What are the competencies required for data curators
in different contexts?
Based on the above questions, this study adopts a multi-
case study method to summarize roles and corresponding
competencies.
A case study can be used to describe and analyze
a business issue in depth. In view of the lack of effective
localization of data curation in China, this study
attempts to derive a theoretical framework from
the practice of data curation in different contexts. The
Neusoft Corporation (hereinafter referred to as “Neusoft”)
and several academic libraries in China were selected
as the main cases. Participatory observation,
in-depth interviews, and document investigation were
adopted for data collection. The semi-structured
interview outline was developed from the prior
research literature, and data collection was conducted
from December 2016 to December 2017. Data were
collected at multiple times, through multiple channels, and
from multiple sources in order to
construct a complete chain of evidence. In addition,
open interviews and participatory observations were
used in fieldwork to facilitate a comprehensive
understanding of the cases. These measures help to ensure
the validity of the qualitative data. After the cases were
collected, critical points of the research life cycle were
coded according to files and interview records, after
which the key roles and relationships were analyzed.
Finally, the general competencies framework was
discussed based on the above tasks.
Case Studies and Discussion
As mentioned above, LIS has paid much attention to data
curation research; however, practices of data curation are
never limited to librarianship but also include many
other fields related to research data management. We
first introduce and analyze one case beyond the library
from the perspective of data governance, and then compare
its roles with cases from libraries to summarize the
competencies required in common.
Case Beyond Library
Founded in 1991, as one of the largest IT companies in
China with more than 16,000 employees at the end of
2017, Neusoft has established a two-level R&D system
including both enterprise and departmental levels, and
serves society with industry solutions, intelligent inter-
connection products, platform products, and cloud and
data services. Research data have been generated during
the R&D process, and increasingly more value has been
added via the reuse of data. Therefore, data curation is of
significance in R&D management.
Context and Business
Department Responsible for Research Data
Management
Based on an open innovation strategy, Neusoft has traditionally
paid significant attention to R&D and process management.
Figure 2 shows the stakeholder structure that results from
applying data governance theory to the research data
management of Neusoft. In
2008, Neusoft Research (NSR) was established on the basis of
the Department of Research of Neusoft, and it has since taken
responsibility for the R&D management of the whole
corporation. There are more than ten R&D managers
responsible for special areas in Neusoft.
Types of Data
According to the management files, types of data mana-
ged by NSR include scientific data, metadata, and related
management data.
Tools for Data Curation
NSR adopts SEAS (Super Electronic Archive System), a
platform developed by Neusoft, to collect and curate
data. It assigns different administrative permissions to the
corresponding roles in R&D management. Within their
respective permissions, R&D managers can complete
operations such as record creation, data upload, data
update, data deletion, data retrieval, and data
download. The information security manager of NSR is
responsible for the daily operation and maintenance of
the platform and for data monitoring.
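
The role-based arrangement described above can be illustrated with a minimal Python sketch. The six operations come from the text. The role names and the permission sets assigned to them are hypothetical, since the source does not specify how SEAS maps roles to operations, and nothing here reflects the actual SEAS implementation.

```python
from enum import Enum, auto


class Operation(Enum):
    """Operations named in the text for the curation platform."""
    CREATE_RECORD = auto()
    UPLOAD = auto()
    UPDATE = auto()
    DELETE = auto()
    RETRIEVE = auto()
    DOWNLOAD = auto()


# Hypothetical role-to-permission mapping (an assumption, not taken from SEAS).
PERMISSIONS = {
    "rd_manager": {
        Operation.CREATE_RECORD, Operation.UPLOAD, Operation.UPDATE,
        Operation.DELETE, Operation.RETRIEVE, Operation.DOWNLOAD,
    },
    "data_user": {Operation.RETRIEVE, Operation.DOWNLOAD},
}


def is_allowed(role: str, operation: Operation) -> bool:
    """Return True if the role's permission set includes the operation."""
    # Unknown roles get an empty permission set, so every check fails.
    return operation in PERMISSIONS.get(role, set())


if __name__ == "__main__":
    print(is_allowed("rd_manager", Operation.DELETE))
    print(is_allowed("data_user", Operation.DELETE))
```

Keeping the mapping in one table means that a change of authority for a role is a change to data rather than to program logic, which matches the idea of permissions being set per role.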
Life Cycle Management
As data collector, NSR completes the task of data
collection according to the R&D project life cycle; that is, from
the start through implementation to the end of research
projects. It should be noted that some research data might
be kept by their creators, who often consider them
private assets, especially in the context of some core
business. Therefore, data collection often requires the
coordination of higher level stakeholders (e.g. senior
vice presidents) during the process.
Policy and Institution
The system of data curation in Neusoft can be traced back
to the R&D archives management system. Policies cover,
amongst others, the topics of innovation platform
management, research achievements, intellectual property,
and research cooperation. Policy and institutional
documents specify the detailed business processes, roles,
responsibilities, relevant documents and templates, and examples
of application, etc.
Roles and Competencies
Roles
This case shows that a data governance framework needs
to establish a clear organizational structure and division
of responsibilities in the context of enterprise data
curation. According to the actual roles observed in data curation,
the three main roles are data supervisor, data custodian,
and data steward. Other roles include data creators
and data users, who can switch between the two roles. For the
basis of the relevant roles see Figure 3.
It is necessary to distinguish the responsibilities
of the data business process from those of the technical
management process in data curation. As shown in Figure 3,
the data steward and the data custodian correspond
to these two functional processes respectively. As
Rosenbaum (2010) indicates, the data steward focuses
mainly on taking care of data assets for others according
to policies and practices as determined through
data governance, while the data custodian is
responsible for the technical support of the data curation
process. Generally speaking, the data steward
deals with content and the data custodian deals with
technical support; however, they both share the name
data curator.
According to a coding analysis of the regulation files for
data curation in case 1, the core competencies of data
curators can be summarized as follows:
– Data planning: be familiar with data types,
volumes, formats, metadata, standards, preservation,
and access, etc.;
– Data creation and collection: be familiar with data
sources, tools, skills, and evaluation criteria;
Figure 2: Stakeholders of research data curation in Neusoft.
– Data processing: be familiar with the description,
cleaning, and cataloguing for data;
– Data analysis: be familiar with metadata and visualization
tools to support decision-making;
– Data preservation: be familiar with long-term preser-
vation, data security, and problem solving;
– Data sharing and reuse: be familiar with data sharing
platform, policy, and regulations;
– Common competency: including communication skills,
background knowledge, collaboration abilities etc.
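
The competency list above can be read as a checklist keyed by life-cycle stage. The following Python sketch is offered only as an illustration: it encodes an abbreviated version of that list and reports which knowledge areas a curator still lacks. The function name `competency_gaps` and the idea of representing a curator's knowledge as a set of strings are assumptions introduced here, not part of the case study.

```python
from __future__ import annotations

# Abbreviated transcription of the competency list, keyed by stage.
CORE_COMPETENCIES: dict[str, list[str]] = {
    "data planning": ["data types", "volumes", "formats", "metadata", "access"],
    "data creation and collection": [
        "data sources", "tools", "skills", "evaluation criteria",
    ],
    "data processing": ["description", "cleaning", "cataloguing"],
    "data analysis": ["metadata", "visual tools"],
    "data preservation": [
        "long-term preservation", "data security", "problem solving",
    ],
    "data sharing and reuse": ["data sharing platform", "policy", "regulations"],
    "common competency": [
        "communication skills", "background knowledge", "collaboration abilities",
    ],
}


def competency_gaps(known: set[str]) -> dict[str, list[str]]:
    """Return, per stage, the knowledge areas that are not in `known`.

    Stages with no missing areas are left out of the result.
    """
    gaps: dict[str, list[str]] = {}
    for stage, areas in CORE_COMPETENCIES.items():
        missing = [area for area in areas if area not in known]
        if missing:
            gaps[stage] = missing
    return gaps


if __name__ == "__main__":
    # A hypothetical curator who knows everything except metadata.
    everything = {a for areas in CORE_COMPETENCIES.values() for a in areas}
    print(competency_gaps(everything - {"metadata"}))
```

Because metadata appears under both data planning and data analysis, a single missing area can surface as a gap in more than one stage, which is one reason to organize the checklist by stage rather than as a flat list.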
Case from Academic Libraries
As an ideal center of data curation in colleges and
universities, the academic library has the opportunity to be
the key actor in data services for research. For librarianship,
data curation holds great appeal
in the data-intensive era, although few know
exactly where it will lead.
Founded on October 23, 2014, the China Academic
Library Research Data Management Implementation
Group (CALRDMIG) consists of nine libraries including
Peking University Library, Tsinghua University Library,
Fudan University Library, Zhejiang University Library,
Wuhan University Library, Beijing Institute of Technology
Library, Shanghai Jiaotong University Library, Tongji
University Library, and Shanghai International Studies
University Library, aiming to promote the development of
research data management between different academic
libraries in mainland China. The CALRDMIG sets several
working tasks:
– environmental scanning;
– research data management framework;
– research data management measures and policies;
– formulation and implementation of related standards;
– platform and tools for research data management;
– training courses of research data management;
– planning for research data management for academic
libraries and best practice cases collection.
In China, the majority of academic libraries are under the
direct supervision of the CPC committee and university
offices, which means that an academic library’s operating funds
and business content are directed by university administrative
departments, while the library acts as the service agent for
colleges and institutes at the same level as the library itself.
At the same time, the academic library is supervised
by a virtual agent, the Academic Libraries Consulting
Committee of the State Education Department, which simply
offers business direction but holds no funds. This
means that the data curation business undertaken by an academic
library always falls under the basic governance framework
shown in Figure 4.
Interviews were undertaken in the nine academic
libraries and some typical cases have also been collected.
Peking University Library, for example, provides
research data management services, including
data mining, classification, and archiving, to help
understand the value of data and promote data utilization
throughout the whole life cycle. At the same time,
it can effectively meet the new demands of researchers
for funding and data verification, and improve the
efficiency of research work. Structured, semi-structured,
and unstructured data are all included in the scope
of data curation. As another example,
Fudan University Library has set up the Department of
Data Management and Technology which is responsible
for frontier research, project planning, and construction
of the data library, as well as data management-related
projects of the library, colleges, and institution. It
Figure 3: Roles in data curation.
provides technical and platform support for both system
development and data analysis, and offers academic research,
technical services, external cooperation, and data
literacy training to researchers. The other seven libraries
also offer services related to research data curation.
There is no exact definition of data curation, although
there is no denying that it covers the whole life cycle of
data management. The general process of research data
curation can be described as shown in Figure 5.
The interview and survey data show
that few libraries have implemented independent
data curation; that is, data curation is usually
embedded in other library business such as the institutional
repository (IR). There is growing consensus that data
curation is one of the most
important trends in academic libraries.
However, practical experience in the field
varies widely. At present, the lack of a
clear top-level design for data management may be one
common reason for the status quo of data curation
in academic libraries, and the lack of a competency framework
may be another. Since data curation here is embedded in a context
different from that of enterprise, the roles and competencies
of data curators in academic libraries are discussed
in the corresponding context.
Roles and Competencies of Data
Curators in Academic Libraries
The case description above makes clear
that the context of academic libraries is markedly different
from that of enterprise. In enterprise, more
freedom appears to be given, as the whole enterprise shares a
common interest in pursuit of increased benefits. Data
curation can be considered as one of the components of
corporate governance. Different roles such as data supervisors,
stewards, and custodians can communicate with
data creators and users effectively. In the
context of academic libraries, a different picture
appears. Both the CPC committee and the president of the
university often consider the library as a supplemental
department to colleges and institutes, or even as simply a
document repository. Moreover, few researchers consider
the library to be a reliable collaborator. Therefore, much of
the effort made by librarians, including data curation, is not
recognized, which is a disappointing fact in many academic
libraries in China.
It is essential to borrow some experience beyond librar-
ianship, with perhaps data governance from enterprise
Figure 4: Basic governance framework of a typical library for data curation.
Figure 5: Life cycle of research data curation.
as one example. As governance is a concept mainly focused on
risk management and on legal and policy issues, data governance
can be regarded as covering attribution and citation,
archiving and preservation, discovery and provenance,
data schema and ontologies, and infrastructure (Smith
2014). From the perspective of data governance, as Pryor
(2012) indicates, the process by which stakeholders access,
preserve, and add value to data can be analyzed
across the data life cycle.
Though data curation in academic libraries involves
several stakeholders beyond libraries, academic libraries
still play a critical role. Poole (2016) indicates that digital/
data curation needs to be embedded in institutional and
scholarly infrastructure: archives, centers, libraries, and
institutional repositories (IRs) are key components of
such an infrastructure. IRs, which in China are usually associated
with and implemented in academic libraries, are taking on
data management and knowledge service roles in
the R&D support process. As Walters (2014) indicates,
data exchange and storage is a key function of
IRs, which can manage not only scholarship and data,
but also software, tools, and code (Cragin et al. 2010;
Walters 2014). To some extent, data curation has been
implemented on the basis of IRs, and data curators are
usually the librarians responsible for IRs.
Technical skills are essential to data curation; however,
technologies can never solve all the problems in data
curation. Both LIS and non-LIS domain skills (Vivarelli, Cassella,
and Valacchi 2013) and so-called soft skills, namely project
management, negotiation, team-building, and collaborative
problem solving (Oliver and Harvey 2016; Swan and Brown
2008), are required. According to the discussion
above, the common competencies of data curators in academic
libraries can be summarized as follows:
– General competencies: institution management, general
information technology;
– Specific competencies: data related technology, aca-
demic research;
– Competencies of LIS: resource construction, training,
and lifelong learning, knowledge acquisition and
consultation service, knowledge organization.
As the most used name for the role of data curation in
academic libraries, data librarian is “the professional all-
rounder like a Swiss army knife” (Osswald 2013). Beyond
being an information professional, the data librarian
should also be an effective communicator.
The “ownership” or “collections” paradigm of data curation,
which assumes that the academic library should and could collect all
data and therefore be able to adequately satisfy research
needs, meets too many obstacles in practice. For
example, the construction of an institutional repository is
the basis of data management; however, one of the biggest
obstacles is that researchers do not cooperate or are
unwilling to provide their data. To some extent, data are more
valuable than published papers and monographs,
and researchers will not submit them easily.
Few researchers are good at creating sharable metadata
and curating their data efficiently; unfortunately, even fewer
think of seeking help from libraries and librarians. Besides,
the stereotype that librarians’ competencies are lacking means
they will not be considered true collaborators by researchers (Wright
et al. 2014). One case comes from NSF data management
planning (Hswe and Holt 2011), where researchers’ failure
to consult librarians would reinforce this bad impression.
Competencies in collaborating with other departments
and in providing services embedded in research teams
should therefore also be considered in the context of the
academic library.
Competencies for data librarians include:
– Communication competencies: effective communication,
needs analysis, IT and project management
training, and cooperation with researchers;
– IT competencies: mastery of database design tools, web
development tools and content management systems,
server management, programming software, and
distributed collaboration tools;
– Interdisciplinary competencies: a background in, and
beyond, LIS; working experience in reference service,
collection development, information organization, etc.
Taking into consideration the data life cycle and data
governance framework, the general competencies framework of
data curators in the context of the academic library can be
summarized as shown in Figure 6.
Conclusion and Implications
In the context of research data curation both within and
beyond academic libraries, data supervisors and data
stewards often come from a background as data creators
and data users. The roles of data creators, data
managers, and data users are often relatively easy to
distinguish, but data stewards and data custodians
typically behave as research managers, and it is necessary
to further subdivide these two roles when defining
posts.
Considering the status of data curation in universities,
the problems faced by practitioners concerning data
literacy can be highlighted as: lack of policy, unclear responsibilities,
a mismatch between the quality of staff members’ data literacy and
the demand for it, lack of professional education, etc. To address
these problems, this study suggests the following
countermeasures. Firstly, clarify job responsibilities and
professional qualities, and accelerate standards setting and
policy construction for data curation in academic libraries.
Secondly, establish data literacy training systems and strengthen
data consciousness and data ethics education. Thirdly, promote
innovation in data curation and realize the mutual promotion of theory
and practice. Finally, cooperate with professional departments
in universities to establish relevant courses and cultivate
professionals in data curation.
Based on real business situations, this study draws on
the stakeholders involved in the data governance
framework to analyze the roles in scientific research data
management and to identify the roles of data
curation. The main conclusions are:
– stakeholders involved in the data governance framework
do not fully coincide with the actors of data curation;
– the framework of data curation needs to establish
a clear organizational structure and division of
responsibilities, and clarify the synergistic relationship
between different stakeholders;
– as data curation is a specific kind of business data management,
its core roles, including data supervisors, data
stewards, and data custodians, should be subdivided
when posts are set up.
Data governance is a service that enables the transparency of
data-related processes and effective use; it can also be
applied in the field of data curation in the academic library. It
is important for the library profession to take this challenge
seriously and acquire the competencies required to
provide effective data curation. Competency frameworks
should be developed according to the contexts in which
data curation is implemented.
Because of the inherent limitations of the case study method,
the roles identified in this study for the actors of data
management within the framework of data governance need
to be tested against more cases. In addition, based on
the data management framework, the data literacy framework
that each data management role should have, as well
as data continuity, master data maturity, data management
performance measurement, and the coordination of
different management roles, remain research issues
pending more in-depth study.
Acknowledgements: This study is supported by the
Fundamental Research Funds for the Central Universities
(Project: Innovation-Driven Enterprise Data Governance,
No. 63172077) at Nankai University. We offer sincere
thanks to the participants in the long-term field study and
to the manuscript reviewers for their useful suggestions.
Figure 6: General competencies framework in data life cycle management.
References
Abrams, S. 2014. “Curation: Buzzword or What?” Information
Outlook 18 (5): 25–27.
Carlson, J., and M. S. Bracke. 2013. “Data Management and Sharing
from the Perspective of Graduate Students: An Examination of
the Culture and Practice at the Water Quality Field Station.”
Portal: Libraries and the Academy 13 (4): 343–61.
Carlson, J., and L. Johnston. 2016. Data Information Literacy:
Librarians, Data, and the Education of a New Generation of
Researchers. West Lafayette: Purdue University Press.
Corrall, S., M. A. Kennan, and W. Afzal. 2013. “Bibliometrics and
Research Data Management Services: Emerging Trends in
Library Support for Research.” Library Trends 61 (3): 636–74.
Cragin, M., C. L. Palmer, J. R. Carlson, and M. Witt. 2010. “Data
Sharing, Small Science and Institutional Repositories.”
Philosophical Transactions of the Royal Society A 368 (1926): 4023–38.
Fan, Z. 2017. “Enterprise R&D Data Governance and Curators:
Case Study of NSR Practice.” Library and Information Service
61 (1): 56–63. (in Chinese).
Gray, J., and A. S. Szalay. 2002. “Online Scientific Data Curation,
Publication, and Archiving.” Proceedings of SPIE - the
International Society for Optical Engineering 4846: 103–07.
Gu, L. 2016. “Data Governance: Opportunity for the Library.” Journal
of Library Science in China 42 (9): 40–56. (in Chinese).
Heidorn, P. B., H. R. Tibbo, G. S. Choudhury, C. Greer, and R. Marciano.
2007. “Identifying Best Practices and Skills for Workforce
Development in Data Curation.” Proceedings of the American
Society for Information Science & Technology 44 (1): 1–3.
Henderson, M. E. 2017. Data Management: A Practical Guide for
Librarians. Lanham, Maryland: Rowman & Littlefield.
Hswe, P., and A. Holt. 2011. “Joining in the Enterprise of Response in
the Wake of the NSF Data Management Planning Requirement.”
Research Library Issues 274: 11–16.
Huang, R., and T. Lai. 2016. “Analysis of the Library’s Participation in
Scientific Data Management from the Perspective of
Stakeholders.” Library and Information Service 60 (3): 21–25,
89. (in Chinese).
Koltay, T. 2015. “Data Literacy: In Search of a Name and Identity.”
Journal of Documentation 72 (2): 401–15.
Koltay, T. 2017. “Data Literacy for Researchers and Data Librarians.”
Journal of Librarianship & Information Science 49 (1): 3–14.
Krier, L., and C. A. Strasser. 2014. Data Management for Libraries.
Chicago, IL: American Library Association.
Lesk, M. 2013. “Curators of the Future.” New Technology of Library
and Information Service 3: 1–7.
Noonan, D., and T. Chute. 2014. “Data Curation and the University
Archives.” American Archivist 77 (1): 201–40.
Oliver, G., and R. Harvey. 2016. Data Curation: A How-To Manual,
2nd ed. London: Facet.
Osswald, A. 2013. Skills for the Future: Educational Opportunities for
Digital Curation Professionals. Florence: European
Commission.
Otto, B. 2011. “Data Governance.” Business & Information Systems
Engineering 3 (4): 241–44.
Poole, A. H. 2016. “The Conceptual Landscape of Digital Curation.”
Journal of Documentation 72 (5): 961–86.
Prado, J. C., and A. Marzal. 2013. “Incorporating Data Literacy into
Information Literacy Programs: Core Competencies and
Contents.” Libri 63 (2): 123–34.
Pryor, G. 2009. “Multi-Scale Data Sharing in the Life Sciences: Some
Lessons for Policy Makers.” International Journal of Digital
Curation 4 (3): 71–82.
Pryor, G. 2012. “Why Manage Research Data?” In Managing
Research Data, edited by G. Pryor, 1–16. London: Facet.
Rosenbaum, S. 2010. “Data Governance and Stewardship: Designing
Data Stewardship Entities and Advancing Data Access.” Health
Services Research 45 (5p2): 1442–55.
Sarsfield, S. 2009. The Data Governance Imperative. Cambridge, UK:
IT Governance Publishing.
Smith, M. 2014. “Data Governance: Where Technology and Policy
Collide.” In Research Data Management: Practical Strategies
for Information Professionals, edited by J. Ray, 45–59. West
Lafayette, IN: Purdue University Press.
Soares, S. 2014. Data Governance Tools. Boise: MC Press Online, LLC.
Swan, A., and S. Brown. 2008. “The Skills, Role and Career Structure
of Data Scientists and Curators: An Assessment of Current
Practice and Future Needs.” Nieuwsbrief Spined 22 (7): 20–22.
Tammaro, A. M., S. Ross, and V. Casarosa. 2014. “Research Data
Curator: The Competencies Gap.” BOBCATSSS 2014
Proceedings 1: 95–100.
Tenopir, C., K. Levine, S. Allard, L. Christian, R. Volentine, R. Boehm,
F. Nichols, et al. 2016. “Trustworthiness and Authority of
Scholarly Information in a Digital Age: Results of an
International Questionnaire.” Journal of the Association for
Information Science and Technology 67 (10): 2344–61.
Vivarelli, M., M. Cassella, and F. Valacchi. 2013. The Digital Curator
between Continuity and Change: Developing a Training Course
at the University of Turin. Florence: European Union.
Walters, T. 2011. “New Roles for New Times: Digital Curation for
Preservation.” General Collection 9 (2): 76.
Walters, T. 2014. “Assimilating Digital Repositories into the Active
Research Process.” In Research Data Management: Practical
Strategies for Information Professionals, edited by J. Ray,
189–201. West Lafayette, IN: Purdue University Press.
Witt, M. 2008. “Institutional Repositories and Research Data Curation
in a Distributed Environment.” Library Trends 57 (2): 191–201.
Wright, S., A. Whitmire, L. Zilinski, and D. Minor. 2014.
“Collaboration and Tension between Institutions and Units
Providing Data Management Support.” Bulletin of the
American Society for Information Science and Technology
40 (6): 18–21.
  • 1. Zhenjia Fan* Context-Based Roles and Competencies of Data Curators in Supporting Research Data Lifecycle Management: Multi-Case Study in China https://doi.org/10.1515/libri-2018-0065 Received May 31, 2018; accepted August 06, 2018 Abstract: Focusing on the main research question of what the critical roles and competencies of data curation are in supporting research data life cycle management, this paper adopts a multi-case study method, with data gov- ernance frameworks, to analyze stakeholders and data curators, and their competencies, based on different con- texts from cases from enterprises and academic libraries in mainland China. Via the context and business analysis on different cases, critical roles such as data supervisor, data steward, and data custodian in guaranteeing data quality and efficiency of data reuse are put forward. Based on the general factor framework summarized via existing literature, suggestions for empowering data cura- tors’ competencies are raised according to the cases. The findings of this paper are as follows: besides digital archiving and preservation, more emphasis should be placed on data governance in the field of data curation. Data curators are closely related but not equivalent to stakeholders of data governance. The different roles of data curators would play their own part in the process of data curation and can be specified as data supervisor, data steward, and data custodian according to given contexts. The roles, competencies, and empowerment strategies presented in this paper might have both theo- retical and practical significance for the fields of both data curation and data governance. 
Keywords: data curation, data governance, data life cycle, data curator, research data management Introduction As the data deluge (Poole 2016) progresses, issues concerning big data readily become hot topics; however, we should never neglect that data curation appeared much earlier than big data and has been practiced in librarianship and information resource management even in the era of “small data”. It is vital for the information professions to recognize data curators and the competencies required of them in given contexts, whether in a “small” or a “big” data era. In January 2018, the General Office of the State Council of the People’s Republic of China promulgated the “Measures on Scientific Data Management” (State Council Document No. 17 2018), which clearly sets out the corresponding duties of government departments, research institutes, universities, and scientific data centers in the process of scientific data management. Furthermore, the process of scientific data collection, submitting, saving, sharing, use, and security is also specified in this legal document. In fact, some R&D institutes, universities, and enterprises in China had undertaken extensive practices in the field of data management before the document was issued. As one of the key activities in innovative enterprises, research institutes, and universities, data curation has improved the efficiency and quality of R&D data management in the era of data-intensive research. Different agents such as university libraries, enterprises, data centers, and other institutions have, based on their specific business situations, accumulated many valuable cases which might have significance for data curation in both theory and practice. The majority of studies related to data curation are concentrated in the field of Library and Information Science, especially on the topic of librarianship. 
However, data curation is a business issue that reaches beyond librarianship. Data curation can be considered one of the critical components of R&D management, and it is reflected in different duties such as research management, information management, data management, librarianship, and archives (Noonan and Chute 2014). Although they face similar practical issues in curation, data curators hold different roles depending on the context: data librarian in a library; chief data officer and data manager in an enterprise; data *Corresponding author: Zhenjia Fan, Department of Information Resources Management, Business School, Nankai University, 94 Weijin Road, Nankai District, Tianjin, China, E-mail: fanzhenjia@nankai.edu.cn LIBRI 2019; 69(2): 127–137 Brought to you by | Universitat de Barcelona Authenticated Download Date | 7/22/19 9:35 AM
  • 2. steward and data scientist in a research institute etc. There is no doubt that data curator, data steward, and data custodian are sometimes used interchangeably in both the literature and practice, and it is also difficult to make a clear distinction among them in the Chinese context. However, different practical fields have different naming conventions for data curators. For example, in many Chinese enterprises, data steward refers to data curators who mainly focus on data assets, while data custodian usually refers to data curators who offer technical support in the process of data curation. The competencies required also vary in different contexts; for example, between the field of digital humanities on the one hand and engineering on the other. One basic consensus on the purpose of data curation is that it should guarantee the quality of research data, improve the efficiency of data reuse, and serve the research and development process. Therefore, rather than being treated as simply a technical job, data curation needs a holistic approach involving the governance of multiple stakeholders. When facing multiple stakeholders in data curation, a realistic problem becomes how to identify common roles in the process of research data management and find a common discourse among the diverse role designations. It is essential to summarize the roles and competencies from different contexts of real data curation practices. Data literacy is a much debated topic related to roles and competencies in data curation, and the literacy frameworks of librarians are always taken into consideration first (e.g. Carlson and Johnston 2016; Koltay 2015, 2017). Moreover, data governance (Otto 2011; Soares 2014), with its strong theoretical capacity to incorporate human, technological, and organizational factors into one framework to understand the problem, is always used as the perspective to analyze the stakeholders and propose effective suggestions. 
Tools, such as databases and cloud platforms (Soares 2014; Witt 2008), have also received increasing attention as solutions for data curation in recent years. Softer factors than the technological ones, such as regulations and rules (Pryor 2009; Smith 2014) for the process of data curation, have also attracted several researchers seeking to guarantee a system for the implementation of data curation. We must acknowledge that research achievements in data curation have provided us with many essential theoretical aids. However, it is also obvious that some basic questions still lack a definitive answer. For example, what is the objective of data curation: data set as results, or data stream as a business process? If the answer is the former, another question concerning how to distinguish data curation and digital archiving will be raised; if we prefer the latter as the better answer, how do we solve the data governance issue when it is difficult to consider data curation as an independent process in a real business context? This brings practical challenges in data curation for data collection and quality control involving intellectual property, business secrets, and state security etc. It is necessary to organize and curate data across the data life cycle from the perspective of data governance. In addition, it is an oversimplification to consider data curation as one independent process in one given organization, although doing so is, indeed, beneficial for finding problems during each step, because data curation is often involved with other business issues in practice. This makes it difficult to answer whether data curation is effective when no specific context is taken into consideration, and it is why it is essential to focus on the business context when researching data curation. Based on the background to this problem, this paper summarizes the critical roles and competencies from a multi-case study approach. 
This essay describes the status quo of data curation in organizations including enterprises and libraries in China and discusses the roles of data curation in such contexts. The cases in this paper cover significant enterprises including Neusoft, one of the largest IT corporations in China, and several academic and university libraries in mainland China. Literature Review The process of data curation involves different stakeholders who take corresponding roles. As discussed in the introduction, it is essential to review topics such as data governance, data curation, data literacy and more. Data Governance and Stakeholders As governance is mainly related to the interests of business sectors, it is understandable that data governance is seldom addressed by the LIS (Library and Information Science) literature, with few exceptions such as Krier and Strasser (2014). Similar to corporate governance, data governance enables better decision-making and protects the needs of stakeholders. Data governance can be considered as a set of activities which include planning, supervision, and enforcement that govern the process
  • 3. and methodologies carried out to guarantee and improve the quality of data. Otto (2011) points out that a data asset is the object of data governance and data management, with the common purpose to maximize the value of the data. Data curation, as a data management practice, aims to improve the level of data reuse and needs to follow certain governance frameworks and guidance strategies. Sarsfield (2009, 38) considers data governance as “guarantees that data can be trusted and that people can be made accountable for any adverse event that happens because of poor quality”. In the LIS field in China, Gu (2016) emphasizes data management from the perspective of the data life cycle, which consists of data acquisition, data sharing, data reuse, and data appreciation. Huang and Lai (2016) explored the library as the core stakeholder and summarized library, data center, research institutions, government and public sectors, policymakers, funders and data publishers as the main categories of data governance subjects. Data Curation and Roles The concept of “curation” originated from the field of museum science and has since been applied in the field of data management. After “data curation”, first used by Gray and Szalay (2002), became a popular buzzword (Abrams 2014), it soon gained recognition and attention in the LIS disciplines. The objects of data management, including data obtained through observation, calculation, experiment, derivation, and other means, can be expressed as text, numbers, images, video, audio, software, algorithms, reports, models, and other forms. The Digital Curation Center (DCC) defines data curation as all activities that maintain, preserve, and add value during the life cycle of digital data. In addition to the “preservation” of the data, the emphasis is on adding value to data and on reuse. In this process, the collaboration between data researchers, publishers, managers, and users is highlighted (Heidorn et al. 2007). 
With the continuous advancement in the field of data management, agents who are engaged in data management are also attracting the attention of academia. Swan and Brown (2008) presented data creators, data scientists, data managers, and data librarians as four professional roles in data management. Furthermore, the four roles have relationships with each other and, notably, data librarians are similar to data curators (Tammaro, Ross, and Casarosa 2014). After resolving the roles in the context of enterprise research data management, Fan (2017) points out that data curators can be further subdivided into specific roles such as data supervisor, data steward, and data custodian. Lesk (2013) believes that data curators would be a high-demand profession in the era of big data while Walters (2011) discusses the new roles of libraries and librarians, and further emphasizes the collaborative strategies of data management and long-term preservation. Academic librarians are often integrated into the research process, initially in the framework of research data services (RDSs) (Tenopir et al. 2016). Henderson (2017) indicates the roles of librarians in data management through five aspects: collection (data documentation); storage (data storage, archiving, and preservation); organization (metadata creation and controlled vocabularies, file naming); retrieval (data sharing and reusing, data access); and presentation/dissemination (data privacy, rights, and publishing). In order to fulfil the requirements of data curation, data librarians should understand metadata, information literacy, scholarly communication, open access, and both collection and repository management (Corrall, Kennan, and Afzal 2013). Data Literacy and Competencies Similar to the concept of information literacy, data literacy is becoming a critical concept in LIS, and it is related to data management throughout the data life cycle. 
Data literacy can be defined as a specific skill set and knowledge base, which empowers individuals to transform data into information and actionable knowledge by enabling them to access, interpret, critically assess, manage, and ethically use data (Koltay 2015). According to Carlson and Johnston (2016), data literacy can be summarized as follows: it is embedded in the R&D data flow and related to the data life cycle; R&D management and data utilization are its main perspectives; and practical skills such as data analysis, description, and tools are its main focus. Data literacy is closely related to research data services that include research data management. In the context of data governance, skills such as dealing with licensing terms and agreements, as well as knowledge about copyright, are already possessed by librarians as the components of data literacy (Krier and Strasser 2014). Prado and Marzal (2013) identify a number of abilities that constitute data literacy, some of which clearly originate from ALA and ACRL:
  • 4. – determining when data is needed; – accessing data sources appropriate to the information needed; – recognizing source data value, types, and formats; – critically assessing data and its sources; – knowing how to select and synthesize data and combine it with other information sources and prior knowledge; – using data ethically; – applying results to learning, decision making, or problem solving. Similar to the list above, a pilot data literacy program, offered at Purdue University, was built around the following skills (Carlson and Bracke 2013): – planning; – life cycle models; – discovery and acquisition; – description and metadata; – security and storage; – copyright and licensing; – sharing; – management and documentation; – visualizations; – repositories; – preservation; – publication and curation. The ability to identify the context in which data are produced and reused has also been emphasized. Through analysis of the relevant connotation, data literacy has the following characteristics: it is closely connected with and embedded in the research business workflow; it is both for the data service supplier and data user; it emphasizes the practical skills of analyzing, displaying, and using data management tools. The factors discussed above can be summarized in the general factor framework shown in Figure 1. The factors can be divided into structural and agency factors; factors outside the data curators’ control are structural factors, and agency factors are mainly related to data literacy. Both structural and agency factors will affect data curation, and data curation can lead to achievements. Furthermore, achievements can loop back to the context of the data curation. So, data curation can be regarded as a bridge among different stakeholders. Figure 1: General factors framework of data curators.
  • 5. Research Design Perspectives on data curation vary, and this variability should be considered together with the different contexts in which curation takes place. In view of the existing literature, this study explores the analysis approach of “context-business-role-competency” to summarize the roles from actual practice. The core research questions of this study are as follows: – How does the context affect the process of data curation? – Who are the stakeholders and what are the key roles in data curation? – What are the competencies required for data curators in different contexts? Based on the above questions, this study adopts a multi-case study method to summarize roles and corresponding competencies. Case study can be used to describe and analyze the business issue in depth. In view of the lack of effective localization of data curation in China, this study attempts to summarize the theoretical framework from the practice of data curation in different contexts. The Neusoft Corporation (hereinafter referred to as “Neusoft”) and several academic libraries in China were selected as the main cases. Participatory observation, in-depth interview, and document investigation have been adopted for data collection. The semi-structured interview outline was summarized according to former research literature and data collection was conducted from December 2016 to December 2017. Data were collected at multiple times, through multiple channels, and from multiple sources in order to construct a complete evidence chain. In addition, open interviews and participatory observations were used in fieldwork to facilitate a comprehensive understanding of the cases. The above aspects can guarantee the validity of the qualitative data. After the cases were collected, critical points of the research life cycle were coded according to files and interview records, following which the key roles and relationships were analyzed. 
Finally, the general competencies framework was discussed based on the above tasks. Case Studies and Discussion As mentioned above, LIS has paid much attention to data curation research; however, practices of data curation are never limited to librarianship but also include many other fields related to research data management. We first introduce and analyze one case beyond the library from the perspective of data governance, and then compare the roles with cases from libraries to summarize the competencies required in common. Case Beyond Library Founded in 1991, as one of the largest IT companies in China with more than 16,000 employees at the end of 2017, Neusoft has established a two-level R&D system including both enterprise and departmental levels, and serves society with industry solutions, intelligent interconnection products, platform products, and cloud and data services. Research data have been generated during the R&D process, and increasingly more value has been added via the reuse of data. Therefore, data curation is of significance in R&D management. Context and Business Department Responsible for Research Data Management Based on open innovation strategy, Neusoft traditionally pays significant attention to R&D and process management. Analyzing the stakeholders of Neusoft’s research data management through data governance theory yields the stakeholder structure shown in Figure 2. In 2008 Neusoft Research (NSR) was established based on the Department of Research of Neusoft, and has taken the responsibilities of R&D management of the whole corporation. There are more than ten R&D managers responsible for special areas in Neusoft. Types of Data According to the management files, types of data managed by NSR include scientific data, metadata, and related management data. 
Tools for Data Curation NSR adopts SEAS (Super Electronic Archive System), a platform developed by Neusoft, to collect and curate
  • 6. data. It assigns different administrative permissions to the corresponding roles in R&D management. Within their respective permissions, R&D managers can complete operations such as record creation, data upload, data update, data deletion, data retrieval, and data download. The information security manager of NSR is responsible for the daily operation and maintenance of the platform and data monitoring. Life Cycle Management As data collector, NSR completes the task of data collection according to the R&D project life cycle; that is, from the start to implementation and to the end of research projects. We should note that some research data might be kept by the creators as they are always considered as private assets, especially in the context of some core business. Therefore, it is often an issue requiring the coordination of higher level stakeholders (e.g. super vice presidents) during the process. Policy and Institution The system of data curation in Neusoft can be traced back to the R&D archives management system. Policies cover, amongst others, the topics of innovation platform management, research achievements, intellectual property, and research cooperation. Files of policy and institution indicate the detailed business processes, roles, responsibilities, relevant documents and templates, and examples of application etc. Roles and Competencies Roles Based on this case, the data governance framework needs to establish a clear organizational structure and division of responsibilities in the context of enterprise data curation. According to the actual roles in the data curation, the three main roles are data supervisor, data custodian, and data steward. Other roles include data creators and data users, who can switch between the two roles. For the basis of the relevant roles see Figure 3. It is necessary to distinguish the responsibilities between data business process and technical management process in data curation. 
As shown in Figure 3, the data steward and data custodian are always corresponding roles for the two functional processes. As Rosenbaum (2010) indicates, the data steward focuses mainly on taking care of data assets for others according to policies and practices as determined through data governance, while the data custodian is always responsible for the technical support of the data curation process. Generally speaking, the data steward deals with content and the data custodian deals with technical support; however, they both share the name data curator. According to the coding analysis of the regulation files for data curation in case 1, the core competencies of data curators can be summarized as follows: – Data planning: be familiar with data types, volumes, formats, metadata, standards, preservation, and access etc.; – Data creation and collection: be familiar with data sources, tools, skills, and evaluation criteria; Figure 2: Stakeholders of research data curation in Neusoft.
  • 7. – Data processing: be familiar with the description, cleaning, and cataloguing of data; – Data analysis: be familiar with metadata and visual tools to serve decision-making; – Data preservation: be familiar with long-term preservation, data security, and problem solving; – Data sharing and reuse: be familiar with data sharing platforms, policy, and regulations; – Common competency: including communication skills, background knowledge, collaboration abilities etc. Case from Academic Libraries As an ideal center of data curation in colleges and universities, the academic library has the opportunity to be the key actor in data services for research. For librarianship, data curation holds a strong appeal in the data-intensive era, yet few know exactly where it will lead. Founded on October 23, 2014, the China Academic Library Research Data Management Implementation Group (CALRDMIG) consists of nine libraries including Peking University Library, Tsinghua University Library, Fudan University Library, Zhejiang University Library, Wuhan University Library, Beijing Institute of Technology Library, Shanghai Jiaotong University Library, Tongji University Library, and Shanghai International Studies University Library, aiming to promote the development of research data management between different academic libraries in mainland China. The CALRDMIG sets several working tasks: – environmental scanning; – research data management framework; – research data management measures and policies; – formulation and implementation of related standards; – platform and tools for research data management; – training courses of research data management; – planning for research data management for academic libraries and best practice cases collection. 
In China, the majority of academic libraries are under the direct supervision of the CPC committee and university offices, which means that academic library running funds and business content are directed by university administrative departments, with the library acting as the service agent for colleges and institutes at the same level as the library itself. At the same time, the academic library is supervised by a virtual agent, the Academic Libraries Consulting Committee of the State Education Department, which simply offers business direction but holds no funds. This means the data curation business undertaken by an academic library is always under the basic governance framework, as shown in Figure 4. Interviews were undertaken in the nine academic libraries and some typical cases have also been collected. Peking University Library, as an example, provides research data management services, including data mining, classification, and archiving etc., to help understand the data value and promote data utilization throughout the whole life cycle. At the same time, it can effectively meet the new demands of researchers for funding and data verification, and improve the efficiency of research work. All the structured, semi-structured, and unstructured data are included in the scope of data curation. Moving to another academic library, Fudan University Library has set up the Department of Data Management and Technology which is responsible for frontier research, project planning, and construction of the data library, as well as data management-related projects of the library, colleges, and institution. It Figure 3: Roles in data curation.
  • 8. provides technical and platform support in both system development and data analysis; academic research, technical services, external cooperation, and data literacy training to researchers. The other seven libraries also offer services related to research data curation. There is no exact definition of data curation although there is no denying that it covers the whole life cycle of data management. The general process of research data curation can be described as shown in Figure 5. From the interview and survey data, it can be observed that few libraries have implemented independent data curation; that is, data curation is always embedded in other library business such as the institute’s repository (IR). There is growing consensus that data curation is one of the most important trends in academic libraries. However, practical experience in the field varies widely. At present, if the lack of a clear top-level design for data management is one common reason for the status quo of data curation in academic libraries, the lack of a competency framework is another. Since data curation here is embedded in a context different from that of enterprise, the roles and competencies of data curators in academic libraries are discussed in the corresponding context. Roles and Competencies of Data Curators in Academic Libraries From the case description above, it is not difficult to find that the context of academic libraries is obviously different from that of enterprise. In enterprise, it appears more freedom is given as the whole enterprise would share a common interest in pursuit of increased benefits. Data curation can be considered as one of the components of corporate governance. Different roles such as data supervisors, stewards, and custodians can communicate with data creators and users effectively. When it comes to the context of academic libraries, a different perspective appears. 
Both the CPC committee and president of the university always consider the library as a supplemental department to colleges and institutes, even as simply a document repository. Moreover, few researchers consider the library to be a reliable collaborator. Therefore, much of the effort made by librarians, including data curation, is not recognized, which is a disappointing fact in many academic libraries in China. It is essential to borrow some experience from beyond librarianship, with perhaps data governance from enterprise Figure 4: Basic governance framework of a typical library for data curation. Figure 5: Life cycle of research data curation.
  • 9. one example. As governance is a concept mainly focused on risk management, legal, and policy issues, data governance can be regarded as issues about attribution and citation, archiving and preservation, discovery and provenance, data schema and ontologies, and infrastructure (Smith 2014). Under the perspective of data governance, as Pryor (2012) indicates, the process by which stakeholders approach, preserve, and add value to data can be analyzed in the data life cycle. Though data curation in academic libraries involves several stakeholders beyond libraries, academic libraries still play a critical role. Poole (2016) indicates that digital/data curation needs to be embedded in institutional and scholarly infrastructure: archives, centers, libraries, and institutional repositories (IRs) are key components of such an infrastructure. IRs, always associated with and implemented in academic libraries in China, are taking on data management and knowledge service roles in supporting the R&D process. As Walters (2014) indicates, data exchange and storage is a key function of IRs, which can manage not only scholarship and data, but also software, tools, and code (Cragin et al. 2010; Walters 2014). To some extent, data curation has been implemented on the basis of IRs, and data curators are usually the librarians responsible for IRs. Technical skills are essential to data curation; however, technologies can never solve all the problems in data curation. Both LIS and non-LIS domain skills (Vivarelli, Cassella, and Valacchi 2013) and so-called soft skills, namely project management, negotiation, team-building, and collaborative problem solving (Oliver and Harvey 2016; Swan and Brown 2008), would be required. 
According to the discussion above, common competencies of data curators in academic libraries can be summarized as follows: – General competencies: institution management, general information technology; – Specific competencies: data related technology, academic research; – Competencies of LIS: resource construction, training, and lifelong learning, knowledge acquisition and consultation service, knowledge organization. As the most used name for the role of data curation in academic libraries, data librarian is “the professional all-rounder like a Swiss army knife” (Osswald 2013). Beyond being an information professional, the data librarian should also be an effective communicator. The “ownership” or “collections” paradigm in data curation assumes that the academic library should and could collect all data and therefore be able to adequately satisfy research needs; in real practice, it meets many obstacles. For example, the construction of an institutional repository is the basis of data management; however, one of the biggest obstacles is that researchers do not cooperate or are unwilling to provide the data. Data are more valuable than published papers and monographs, to some extent, and researchers will not submit them easily. Few researchers are good at creating sharable metadata and curating their data efficiently; however, it is a pity that even fewer are aware of seeking help from libraries and librarians. Besides, the stereotype of librarians’ limited competencies means they will not be considered as true collaborators by researchers (Wright et al. 2014). One case is from NSF Data Management Planning (Hswe and Holt 2011), where researchers failing to consult librarians would reinforce this bad impression. Competencies about collaboration with other departments and being embedded in research team service would therefore also be considered in the context of the academic library. 
Competencies for data librarians include:
– Communication competencies: skills in effective communication, needs analysis, IT and project management training, and cooperation with researchers;
– IT competencies: mastery of database design tools, web development tools and content management systems, server management, programming software, and distributed collaboration tools;
– Interdisciplinary competencies: a background in, and beyond, LIS; working experience in reference service, collection development, information organization, etc.
Taking into consideration the data life cycle and the data governance framework, the general framework of data curators in the context of the academic library can be summarized as shown in Figure 6.
Conclusion and Implications
In the context of research data curation both within and beyond academic libraries, data supervisors and data stewards often come from a background as data creators and data users. The roles of data creators, data managers, and data users are often relatively easy to distinguish, but data stewards and data custodians typically behave as research managers, and it is necessary to further subdivide these two roles when defining the post.
Combined with the status of data curation in universities, certain problems faced by practitioners concerning data literacy are highlighted: lack of policy, unclear responsibilities, a mismatch between the quality of staff members’ data literacy and the demand for it, lack of professional education, etc. To address these problems, this study suggests corresponding countermeasures. Firstly, clarify job responsibilities and professional qualities, and accelerate standards setting and policy construction for data curation in academic libraries. Secondly, establish data literacy training systems and strengthen education in data consciousness and data ethics. Thirdly, promote innovation in data curation and realize the mutual promotion of theory and practice. Finally, cooperate with professional departments in the university to establish relevant courses and cultivate data curation professionals.
Based on real business situations, this study draws on the stakeholders involved in the data governance framework to resolve the roles in scientific research data management and to complete the role recognition of data curation. The main conclusions are:
– stakeholders involved in the data governance framework do not fully correspond to the actors of data curation;
– the framework of data curation needs to establish a clear organizational structure and division of responsibilities, and to clarify the synergistic relationship between different stakeholders;
– because data curation is a specific form of business data management, its core roles, including data supervisors, data stewards, and data custodians, should be subdivided when posts are defined.
Data governance is a service that enables transparency of data-related processes and effective use; it can also be applied in the field of data curation in the academic library.
It is important for the library profession to take this challenge seriously and acquire the competencies required to provide effective data curation. Competency frameworks should be developed according to the contexts in which data curation is implemented.
This study explores the roles of the actors of data management within the framework of data governance. Because of the inherent limitations of the case study method, its findings need to be tested in more case studies. In addition, based on the data management framework, different data management roles should each have a data literacy framework. Research issues such as data continuity, master data maturity, data management performance measurement, and how to coordinate different management roles await more in-depth study.
Figure 6: General competencies framework in data life cycle management.
Acknowledgements: This study is supported by the Fundamental Research Funds for the Central Universities (Project: Innovation-Driven Enterprise Data Governance, No. 63172077) in Nankai University. We offer sincere thanks to the participants in the long-term field study and to the manuscript reviewers for their useful suggestions.
References
Abrams, S. 2014. “Curation: Buzzword or What?” Information Outlook 18 (5): 25–27.
Carlson, J., and M. S. Bracke. 2013. “Data Management and Sharing from the Perspective of Graduate Students: An Examination of the Culture and Practice at the Water Quality Field Station.” Portal: Libraries and the Academy 13 (4): 343–61.
Carlson, J., and L. Johnston. 2016. Data Information Literacy: Librarians, Data, and the Education of a New Generation of Researchers. West Lafayette: Purdue University Press.
Corrall, S., M. A. Kennan, and W. Afzal. 2013. “Bibliometrics and Research Data Management Services: Emerging Trends in Library Support for Research.” Library Trends 61 (3): 636–74.
Cragin, M., C. L. Palmer, J. R. Carlson, and M. Witt. 2010. “Data Sharing, Small Science and Institutional Repositories.” Transactions of the Royal Society 368 (1926): 4023–38.
Fan, Z. 2017. “Enterprise R&D Data Governance and Curators: Case Study of NSR Practice.” Library and Information Service 61 (1): 56–63. (in Chinese).
Gray, J., and A. S. Szalay. 2002. “Online Scientific Data Curation, Publication, and Archiving.” Proceedings of SPIE - the International Society for Optical Engineering 4846: 103–07.
Gu, L. 2016. “Data Governance: Opportunity for the Library.” Journal of Library Science in China 42 (9): 40–56. (in Chinese).
Heidorn, P. B., H. R. Tibbo, G. S. Choudhury, C. Greer, and R. Marciano. 2007. “Identifying Best Practices and Skills for Workforce Development in Data Curation.” Proceedings of the American Society for Information Science & Technology 44 (1): 1–3.
Henderson, M. E. 2017. Data Management: A Practical Guide for Librarians. Lanham, Maryland: Rowman & Littlefield.
Hswe, P., and A. Holt. 2011. “Joining in the Enterprise of Response in the Wake of the NSF Data Management Planning Requirement.” Research Library Issues 274: 11–16.
Huang, R., and T. Lai. 2016. “Analysis of the Library’s Participation in Scientific Data Management from the Perspective of Stakeholders.” Library and Information Service 60 (3): 21–25, 89. (in Chinese).
Koltay, T. 2015. “Data Literacy: In Search of a Name and Identity.” Journal of Documentation 72 (2): 401–15.
Koltay, T. 2017. “Data Literacy for Researchers and Data Librarians.” Journal of Librarianship & Information Science 49 (1): 3–14.
Krier, L., and C. A. Strasser. 2014. Data Management for Libraries. Chicago, IL: American Library Association.
Lesk, M. 2013. “Curators of the Future.” New Technology of Library and Information Service 3: 1–7.
Noonan, D., and T. Chute. 2014. “Data Curation and the University Archives.” American Archivist 77 (1): 201–40.
Oliver, G., and R. Harvey. 2016. Data Curation: A How-To Manual, 2nd ed. London: Facet.
Osswald, A. 2013. Skills for the Future: Educational Opportunities for Digital Curation Professionals. Florence: European Commission.
Otto, B. 2011. “Data Governance.” Business & Information Systems Engineering 3 (4): 241–44.
Poole, A. H. 2016. “The Conceptual Landscape of Digital Curation.” Journal of Documentation 72 (5): 961–86.
Prado, J. C., and A. Marzal. 2013. “Incorporating Data Literacy into Information Literacy Programs: Core Competencies and Contents.” Libri 63 (2): 123–34.
Pryor, G. 2009. “Multi-Scale Data Sharing in the Life Sciences: Some Lessons for Policy Makers.” International Journal of Digital Curation 4 (3): 71–82.
Pryor, G. 2012. “Why Manage Research Data?” In Managing Research Data, edited by G. Pryor, 1–16. London: Facet.
Rosenbaum, S. 2010. “Data Governance and Stewardship: Designing Data Stewardship Entities and Advancing Data Access.” Health Services Research 45 (5p2): 1442–55.
Sarsfield, S. 2009. The Data Governance Imperative. Cambridge, UK: IT Governance Publishing.
Smith, M. 2014. “Data Governance: Where Technology and Policy Collide.” In Research Data Management: Practical Strategies for Information Professionals, edited by J. Ray, 45–59. West Lafayette, IN: Purdue University Press.
Soares, S. 2014. Data Governance Tools. Boise: MC Press Online, LLC.
Swan, A., and S. Brown. 2008. “The Skills, Role and Career Structure of Data Scientists and Curators: An Assessment of Current Practice and Future Needs.” Nieuwsbrief Spined 22 (7): 20–22.
Tammaro, A. M., S. Ross, and V. Casarosa. 2014. “Research Data Curator: The Competencies Gap.” BOBCATSSS 2014 Proceedings 1: 95–100.
Tenopir, C., K. Levine, S. Allard, L. Christian, R. Volentine, R. Boehm, F. Nichols, et al. 2016. “Trustworthiness and Authority of Scholarly Information in a Digital Age: Results of an International Questionnaire?” Journal of the Association for Information Science and Technology 67 (10): 2344–61.
Vivarelli, M., M. Cassella, and F. Valacchi. 2013. The Digital Curator between Continuity and Change: Developing a Training Course at the University of Turin. Florence: European Union.
Walters, T. 2011. “New Roles for New Times Digital Curation for Preservation.” General Collection 9 (2): 76.
Walters, T. 2014. “Assimilating Digital Repositories into the Active Research Process?” In Research Data Management: Practical Strategies for Information Professionals, edited by J. Ray, 189–201. West Lafayette, IN: Purdue University Press.
Witt, M. 2008. “Institutional Repositories and Research Data Curation in a Distributed Environment.” Library Trends 57 (2): 191–201.
Wright, S., A. Whitmire, L. Zilinski, and D. Minor. 2014. “Collaboration and Tension between Institutions and Units Providing Data Management Support.” Bulletin of the American Society for Information Science and Technology 40 (6): 18–21.