Guideline for Digital Curation for the Princess Maha Chakri Sirindhorn Anthropology Centre’s (SAC) Digital Repository: Preliminary Outcome
8th Dec., ICADL 2016, University of Tsukuba, Japan
1. GUIDELINE FOR DIGITAL CURATION FOR THE PRINCESS MAHA CHAKRI SIRINDHORN ANTHROPOLOGY CENTRE’S (SAC) DIGITAL REPOSITORY: PRELIMINARY OUTCOME
Sittisak Rungcharoensuksri
SAC, Thailand
3. 1.1 THE CHANGING ROLE OF THAI REPOSITORIES IN THE DIGITAL AGE
1. Time-consuming work process
2. Copyright management
3. Commitment
4. Digital preservation
Hub for learning, teaching and researching for global audience
Digital repository
Storage
1. INTRODUCTION
Inconclusive national standards and an effective practice guide for digital repositories (Klungthanaboon et al., 2012)
4. 1.2 THE CURRENT SITUATION OF THE SAC
Digital material management problem
(www.sac.or.th)
5. 1.3 RESEARCH OBJECTIVE, BENEFITS, METHODOLOGY
Research objective
• To develop the guideline for digital curation by demonstrating a case study from the digital repository of the SAC
Benefits
• The SAC can understand the situations and problems of digital repositories in the world context, in Thailand, and of itself.
• The SAC can apply the Digital Curation Lifecycle Model and further suggestions from this research to develop its own digital curation guideline.
Phase 1
• Literature review
• Online questionnaire
Phase 2
• Focus group discussion
• Analysis
6. 1.4 WHY DIGITAL CURATION?
Benefits of digital curation
1. Improve the quality of data access
2. Improve data quality
3. Encourage data sharing and reuse
4. Help protect data
The SAC aims to understand and to curate all the lifecycle stages of digital objects. But how can we get to know our lifecycle?
(Harvey, 2010, p.12)
7. 1.5 DIGITAL CURATION LIFECYCLE MODEL
Sequential Actions
• Conceptualise - to plan data creation procedures with the intended outcomes in mind
• Create or receive - to create data together with its description and representation information for curation, and to receive data from external sources
• Appraise and Select - to evaluate the data and select what to keep for the long term
• Ingest - to prepare data for addition to the digital archive
• Preservation action - to ensure long-term data preservation and data retention
• Store - to secure description and representation information in an appropriate way
• Access, Use, and Reuse - to make sure that data can be accessed by authorized users for use and later reuse
• Transform - to create new data from the original data, for example by generating a subset
(DCC)
8. 2.1 THE SURVEY REPORT OF THE SAC'S STAFFS DIGITAL WORK PROCESS
2. OUTPUT FROM PHASE 1
Online questionnaire
• Data Asset Framework (DAF)
• Digital Curation Lifecycle Model
Number and position of respondents (chart)
• Segment values: 12, 3, 2, 6, 1, 8, 1, 6
• Legend: Researchers, Coordinating staff, Database administrators, Librarians, Audiovisual staff, Programmers, Assistant researchers, Other staff
39 respondents from 58 staff
9. 2.2 SAMPLE QUESTIONS
What types of materials and file formats have you created?
How often do you back up your data and where do you store it?
How do you create and receive data?
How do you participate in the digital project during the Conceptualise stage?
Do you apply any Appraisal or Selection criteria to your data before ingesting it into long-term storage?
How do you acquire your data?
How does your work process relate to the SAC staff and other stakeholders?
What type/size of data and format do you create?
What kind of device or system do you use to back up your data?
How long will you keep your data on your device?
Has your data been reused and accessed by other staff?
What is the reason? Why can your data not be reused and accessed by other staff?
How do you communicate and share your data with other staff?
10. 2.3 SAMPLE ANSWERS
Data acquisition
• Collecting
• Composing and synthesizing
• Receiving
• Fieldwork
• Seminar and meeting
• Correspondence
• Others
Data inaccessibility
• Confidential data
• Sensitive cultural data
• No metadata
• Obsolescence of medium and format
• Others
Note: Respondents could choose more than one answer.
11. 2.4 INTEREST AND KNOWLEDGE OF THE SAC’S STAFF
Group 1: Interested and has some knowledge
Group 2: Interested but lacks knowledge
Group 3: Not interested
12. 2.5 THE RELATIONS BETWEEN THE SAC’S STAFF AND DIGITAL CURATION LIFECYCLE MODEL
Researchers
• Conceptualise the project
• Create and manage data
• Ingest and preserve data
• Access, Use, and Reuse data
Programmers
• Create, manage, and preserve data using IT tools
Database administrators
• Create and manage data
• Ingest and preserve data
Coordinating staff
• Create and manage data
Librarians
• Create and manage data
• Ingest and preserve data
Other staff
• Create and manage data
Audiovisual staff
• Create, manage, and preserve data using IT tools
Assistant researchers
• Create and manage data
13. 3.1 HOW DO WE APPLY DIGITAL CURATION TO THE SAC’S DIGITAL WORK PROCESS?
3. NEXT STEP FOR PHASE 2
1. How to create awareness and encourage knowledge about digital curation?
2. How to adapt the guideline in an appropriate way and establish effective communication?
3. How to secure support from the policy makers?
Phase 2
• Focus group discussion
• Analysis
15. REFERENCES
• R. Harvey, Digital Curation: A How-To-Do-It Manual, New York: Neal-Schuman Publishers, Inc., 2010.
• DCC, "What is digital curation?," [Online]. Available: http://www.dcc.ac.uk/resources/curation-lifecycle-model. [Accessed 1 April 2016].
• DAF, "Data Asset Framework: Implementation Guide," 2009.
• S. Jones, S. Ross and R. Ruusalepp, "Data Audit Framework Methodology," HATII, Glasgow, 2009.
• W. Klungthanaboon, T. Leelanupab and M. Moss, "Institutional Repositories for Scholarly Communities in Thailand," KMITL Information Technology Journal, vol. 1, no. 1, January - June 2012.
• S. Rungcharoensuksri, "The Survey Report of the SAC's Staffs Digital Work Process," The Princess Maha Chakri Sirindhorn Anthropology Centre, Bangkok, 2016. (Unpublished document)
Editor's Notes
Good afternoon. Today I will present the first outcome of my small project, “the digital curation guideline for the SAC’s digital repository”.
My presentation has three parts. First, I will explain the background and current situation of digital repositories in Thailand and at the SAC, then the aims, benefits, and methodology, and then the reasons why the digital curation concept was chosen as the model for this research. Second, I will show the results from the Survey of the SAC's Staffs Digital Work Process. Lastly, I will describe what I am going to do next. If you have any questions, please feel free to ask at the end of the presentation.
In the last ten years, academic institutions in Thailand have changed their role from storing documents to being digital repositories. They aim to be hubs of learning, teaching and research for a global audience through the Internet. However, many institutions have run into problems during the transformation. First, the work is time-consuming. For some authors, depositing their work and assigning metadata themselves is extra workload on top of their routine jobs. As a result, they are unlikely to upload their works to the repositories. Second, copyright management. Some printed materials belong to the publishers or the funding sponsors, so some authors are not sure whether to submit their research output to the system because they’re concerned about copyright infringement. Third, commitment. Continuous support from stakeholders such as authors, publishers, and university administrators is very important to the management and maintenance of digital repositories. Although the budget and man-hours needed to create a digital project might not be high in the beginning, these stakeholders have to commit money and time to sustain the system as a long-term service. Lastly, digital preservation. How can the institutions preserve the digital materials and ensure long-term access to them? Many institutions have designed and applied archival standards and strategic plans to preserve their digital items, but national standards in Thailand remain inconclusive, and in particular there is no effective practice guide for digital repositories.
The SAC was established in 1991 with the primary objective of being a bank of anthropological data and information in related fields. As a result, since 2000, the centre has been developing a series of searchable online databases such as Inscriptions, Ethnic Groups, and Local Museums. However, each database has run into problems with digital materials management during the work process, because the centre hasn’t provided any guideline for the staff.
Therefore, this research examines the development of a guideline for digital curation through a case study of the SAC’s digital repository. There are two benefits from this research. First, the SAC can understand the situations and problems of digital repositories in the world context, in Thailand, and at the SAC itself. Second, the SAC can apply the Digital Curation Lifecycle Model and further suggestions from this research to develop its own guideline. There are four procedures to investigate the questions for this research: literature review, online questionnaire, focus group discussion, and analysis. Today I will present the results from the first phase of the project, which is the online questionnaire.
Before we go to that part, I have to explain why I have applied the digital curation concept as the model for my research. After the literature review stage, I found that this project could not be completed unless I clarified the data lifecycle of the SAC’s digital work process. Although some staff are quite familiar with some stages of digital object management, they can’t picture the whole data lifecycle. Some staff also don’t understand why they should pay any attention to this process. As a result, I have chosen digital curation as the model for the SAC’s work process because this concept is more inclusive than either digital archiving or digital preservation. It addresses the whole range of processes applied to digital objects over their lifecycle. In addition, digital curation improves the data in four respects. First, it leads to continual, speedy, and reliable access to the data. Second, it helps to improve the quality and trustworthiness of the data, and helps certify its credibility as a formal record. Third, it encourages data sharing and reuse throughout the data's lifetime, because common standards and information about the context and provenance of the data are applied during the curation actions. Lastly, it helps to prevent technology obsolescence and data loss, because procedures for protecting and preserving data are applied to the data during digital curation.
Let’s take a look at the digital curation lifecycle model. Basically, this concept aims to address the whole lifecycle of data, and the lifecycle model is designed to explain the steps of curating digital materials. To work through the lifecycle model step by step, I decided to apply the Sequential Actions, which are located in the outer ring of the model, as the model for the SAC’s data management. This red loop represents the eight key actions for digital curation across the data lifecycle. For example, the Conceptualise stage aims to plan data creation procedures with the intended outcomes in mind, and the Preservation action stage aims to ensure long-term data preservation and data retention.
The questionnaire was designed by adapting the set of questions from the Data Asset Framework and aligning them with the eight stages of the Sequential Actions in the Digital Curation Lifecycle Model, with the aim of surveying the working behaviour of the SAC’s staff. The questionnaire was distributed to 58 staff from the Office of Academic Affairs and Information of the centre, and 39 of them responded. Twelve respondents are researchers and the rest are staff from the supporting divisions, such as programmers, coordinating staff, and librarians.
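As a quick check on these figures, the response rate and the researchers' share of the responses can be worked out as below. This is only a minimal sketch: the three totals come from the survey, the segment values are the eight numbers on the slide chart in the order they appear there (they are not matched to positions here), and the names `percentage`, `STAFF_INVITED`, `RESPONDENTS`, `RESEARCHERS` and `CHART_SEGMENTS` are illustrative, not part of the survey report.

```python
# Totals reported for the online questionnaire.
STAFF_INVITED = 58   # staff who received the questionnaire
RESPONDENTS = 39     # staff who answered it
RESEARCHERS = 12     # respondents who are researchers

# Segment values from the slide chart, in slide order.
# They are NOT mapped to positions here (assumption-free on purpose).
CHART_SEGMENTS = [12, 3, 2, 6, 1, 8, 1, 6]


def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


response_rate = percentage(RESPONDENTS, STAFF_INVITED)    # 67.2
researcher_share = percentage(RESEARCHERS, RESPONDENTS)   # 30.8
supporting_staff = RESPONDENTS - RESEARCHERS              # 27

print(f"Response rate: {response_rate}%")
print(f"Researchers among respondents: {researcher_share}%")
print(f"Supporting staff among respondents: {supporting_staff}")
print(f"Chart segments add up to: {sum(CHART_SEGMENTS)}")
```

The chart segments add up to the 39 respondents, so the slide and the reported total are consistent with each other.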
Here are some of the questions. For instance: What types of materials and file formats have they created? How often do they back up their data and where do they store it? How do they create and receive data? How do they participate in the digital project during the Conceptualise stage? Do they apply any Appraisal or Selection criteria to their data before ingesting it into storage?
Two answers from the questionnaire are worth investigating in detail. The first is data acquisition. As you can see, 60% of the SAC’s data is created by collecting from other sources such as books, donations, reports, and the Internet. The SAC’s researchers then use these materials as sources for composing and synthesising their work. However, how can we guarantee that there is no copyright infringement during this process? To prevent unexpected problems, some respondents suggested that “the centre should provide a plagiarism tool like ‘Turnitin’ for the researchers to check the similarity of their work before publication.”
The second is the reason for data inaccessibility. The lack of metadata description is the main reason why some respondents can’t access, use, and reuse data. It should be noted that the centre has never declared any policy on data management, in particular any regulation on submitting metadata descriptions for the SAC’s digital resources. For example, one respondent said he couldn’t use some images in the IR because they had no image description. Some respondents could not find the information they wanted because their colleagues hadn’t uploaded their works into the system. The remaining problems come from confidential data, the obsolescence of medium and format, and sensitive cultural data. These account for 33%. The last problem deserves comment. Apart from the data that the centre has been creating by itself, there’s also information which originates from the source communities or the cultural owners. This information is sensitive and shouldn’t be used by third parties without the owners' permission. So, some respondents who really care about this issue want the centre to set measures for using this sensitive information in an appropriate way.
According to the survey, there are two points that we should interpret in detail. The first point is the interest and knowledge of the respondents. Generally, we may divide the SAC’s staff into three groups by their interest and knowledge.
In the first group, most of the respondents are staff from the Database Division. They are familiar with the technical terms of data management and can explain the limitations of their current work process rather well, because digital object management is included in their job descriptions.
The second group are interested in the digital curation concept and would like to apply these processes to their projects. However, because they lack data management skills, they are at risk of data loss.
The last group aren’t interested in digital curation because this kind of work isn’t included in their job descriptions. However, if we look at their roles and activities in the data lifecycle model on the next slide, you will see how they relate to the digital curation activities in each stage.
The second point is the roles and activities of the respondents in each stage of the Lifecycle Model. Overall, all of the staff from the Office of Academic Affairs and Information are involved in digital curation activities, at different levels of intensity in each stage depending on their job descriptions. For instance, the man-hours of the database administrators and the librarians are concentrated in the Create or Receive, Appraise and Select, Ingest, and Preservation action stages of a project. On the other hand, the programmers and audiovisual staff take part in the Preservation action and Store stages by applying technology tools.
However, it’s interesting to highlight the role of researchers in the curation activities. Basically, the researchers are the managers of the SAC’s projects, who set the concept and workflow. They are involved in more than half of the actions in the lifecycle model. In short, they are the “key figures” who propel the digital work process of the SAC’s projects, because they have to manage and maintain both the data and the staff involved in the projects.
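The relation between positions and lifecycle actions described above can be sketched as a small lookup table. This is a hedged illustration only: the stages for database administrators, librarians, programmers and audiovisual staff follow the stages named in these notes, while the researchers' set is an assumption read off the slide labels ("Conceptualise", "Create and manage", "Ingest and preserve", "Access, Use, and Reuse"). The names `Action`, `ROLE_ACTIONS`, `coverage` and `roles_for` are hypothetical.

```python
from enum import Enum


class Action(Enum):
    """The eight Sequential Actions of the Digital Curation Lifecycle Model."""
    CONCEPTUALISE = "Conceptualise"
    CREATE_OR_RECEIVE = "Create or Receive"
    APPRAISE_AND_SELECT = "Appraise and Select"
    INGEST = "Ingest"
    PRESERVATION_ACTION = "Preservation action"
    STORE = "Store"
    ACCESS_USE_REUSE = "Access, Use, and Reuse"
    TRANSFORM = "Transform"


# Stages named in the notes for the supporting positions.
DATA_STAFF = {Action.CREATE_OR_RECEIVE, Action.APPRAISE_AND_SELECT,
              Action.INGEST, Action.PRESERVATION_ACTION}
TECH_STAFF = {Action.PRESERVATION_ACTION, Action.STORE}

ROLE_ACTIONS = {
    # ASSUMPTION: mapped from the slide labels, not a survey result.
    "Researchers": {Action.CONCEPTUALISE, Action.CREATE_OR_RECEIVE,
                    Action.INGEST, Action.PRESERVATION_ACTION,
                    Action.ACCESS_USE_REUSE},
    "Database administrators": set(DATA_STAFF),
    "Librarians": set(DATA_STAFF),
    "Programmers": set(TECH_STAFF),
    "Audiovisual staff": set(TECH_STAFF),
}


def coverage(role: str) -> float:
    """Fraction of the eight actions in which a position takes part."""
    return len(ROLE_ACTIONS[role]) / len(Action)


def roles_for(action: Action) -> list[str]:
    """Positions that take part in a given action, in alphabetical order."""
    return sorted(r for r, acts in ROLE_ACTIONS.items() if action in acts)


for role in ROLE_ACTIONS:
    print(f"{role}: {coverage(role):.1%} of the actions")
```

Under this assumed mapping the researchers take part in five of the eight actions, which matches the observation that they are involved in more than half of the lifecycle. A table like this could also be used in the focus group to ask each position which actions are missing from its row.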
These two points from the survey reveal how the SAC’s staff understand the idea of digital curation and how they relate to it. The next question is: How do we apply digital curation to the SAC’s digital work process?
Some respondents suggested that, before we reach that point, the centre should take three measures to make it possible to apply the digital curation concept to the SAC’s digital work process. First, the centre should demonstrate to the stakeholders the benefits of data management and the damage caused by the lack of best practice, especially to the researchers, who play the key role in the projects. After that, the centre should consider the staff’s positions and responsibilities and provide them with knowledge about digital curation at the appropriate levels, so that they become aware of and acknowledge the relations between their work process and each stage of the data lifecycle model.
Second, the centre should take the users' behaviours and the staff's level of knowledge about digital curation into consideration before applying this idea to the SAC guideline. The policy makers should communicate clearly and effectively with the stakeholders about the importance and benefits of data management processes that comply with the guideline. Moreover, the centre must include the digital work process in the staff's KPIs, since this might help persuade and encourage them to change their working routines to the best practice set out in the guideline.
Third, the role and support of the policy makers are the most important factors in incorporating the digital curation guideline into the work process of the centre. The policy makers should declare digital curation to be part of the SAC’s project objectives.
As a result, in phase 2 of the project, I will hold focus group discussions with the SAC staff, setting the issues for discussion according to these three suggestions.
Then, I will bring together all the results and comments to analyse them and develop the guideline for the SAC.
In conclusion, the results from the first phase of this project have revealed the current situation and problems of the SAC’s digital work process. Moreover, the SAC’s staff have started to realise the importance of best practice for data management. However, this project is only the starting point for dealing with the centre's data management problems, which have accumulated for nearly a decade. It still needs more hard work and cooperation from the stakeholders.