Dissemination Information Packages (DIPS) for Information Reuse
1. MIT Libraries Brown Bag
Dissemination Information Packages (DIPS) for Information Reuse (DIPIR)
DIPIR Principal Investigators: Ixchel M. Faniel, Ph.D., and Elizabeth Yakel, Ph.D.
Overview of DIPIR: Nancy Y. McGovern, Ph.D.
3. • IMLS-funded project led by Drs. Ixchel Faniel (PI) & Elizabeth Yakel (co-PI)
• 3-year project, October 2010 – September 2013
• Studying the intersection between data reuse and digital preservation in three academic disciplines to identify how contextual information about the data that supports reuse can best be created and preserved.
• Focuses on research data produced and used by quantitative social scientists, archaeologists, and zoologists.
• The intended audiences of this project are researchers who use secondary data and the digital curators, digital repository managers, data center staff, and others who collect, manage, and store digital information.
4. Motivation for the DIPIR Project
Two Major Goals
1. Bridge the gap between data reuse and digital curation research
2. Determine whether reuse and curation practices can be generalized across disciplines
[Venn diagram: data reuse research overlaps with digital curation research in the disciplines curating and reusing data. Our interest is in this overlap.]
5. The Research Team
Resources at dipir.org:
• Project Details
• People
• Sites
• Publications
• Bibliography
• Project Reports
• News
Team: Ixchel Faniel, OCLC Research (DIPIR PI); Elizabeth Yakel, University of Michigan (Co-PI); Nancy McGovern, ICPSR/MIT; William Fink, UM Museum of Zoology; Eric Kansa, Open Context
For more information, please visit http://www.dipir.org
6. Next Steps
• Interviews: social scientists, archaeologists, zoologists
• Survey: ICPSR data reusers
• Observations: UMMZ data reusers
• Web analytics: OpenContext.org transaction log analysis
• Map significant properties of data as representation information
(Faniel & Yakel, 2011)
7. Methods Overview
Phase 1: Project start up
• Interviews (staff): ICPSR – 10, Winter 2011; Open Context – 4, Winter 2011; UMMZ – 10, Spring 2011
Phase 2: Collecting and analyzing user data
• Interviews (data consumers): ICPSR – 43, Winter 2012; Open Context – 22, Winter 2012; UMMZ – 27, Fall 2012
• Survey (data consumers): ICPSR – 2,000, Summer 2012
• Web analytics (data consumers): Open Context – server logs, ongoing
• Observations (data consumers): UMMZ – 10, ongoing
Phase 3: Mapping significant properties as representation information
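The slides do not spell out how the OpenContext.org transaction log analysis is carried out, so the following is only a rough sketch of server-log summarization under common assumptions: an Apache-style access log, a placeholder file name, and successful GET requests as the unit of interest.

```python
import re
from collections import Counter

# Hypothetical web analytics: count requests per URL path in an access log.
# The Apache "combined" field layout and the file name are assumptions,
# not the actual OpenContext.org logging setup.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3})'
)

def summarize(log_path):
    """Return a Counter of successful GET requests keyed by URL path."""
    hits = Counter()
    with open(log_path) as fh:
        for line in fh:
            m = LOG_LINE.match(line)
            if m and m.group("method") == "GET" and m.group("status") == "200":
                hits[m.group("path")] += 1
    return hits

if __name__ == "__main__":
    for path, count in summarize("access.log").most_common(10):  # placeholder file
        print(f"{count:6d}  {path}")
```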
9. Survey of ICPSR Data Reusers – Part 1
Measuring Repository Success
What data quality indicators contribute to quantitative social scientists’ data reuse satisfaction?
10. ICPSR Survey of Data Reusers – Part 1
Data Quality Indicators
• Completeness – sufficiency, breadth, depth, and scope
• Relevancy – applicability and helpfulness of data for the task
• Accessibility – ease and speed with which data were retrieved
• Ease of Operation – ease with which data were managed and manipulated
• Credibility – correctness, reliability, and impartiality of data
(Wang and Strong, 1996; Lee et al., 2002)
Additional Indicators:
• Data Producer Reputation – regard for a data producer’s work
• Documentation Quality – sufficiency and ability to facilitate use
11. Survey Methodology
Data Collection
1,632 first authors of journal articles published 2008–2012 were surveyed
The Survey
Part 1: inquires about the data reuse experience
Part 2: inquires about experience using the ICPSR repository and intention to continue use
Preliminary Findings
• Tested measures of repository success
• Extended ideas about data quality beyond credibility and relevance of data
– Data reuse satisfaction requires data that are complete, accessible, and easy to operate
• Data producer reputation was not significant
• Documentation quality played a role in data reuse satisfaction
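The slides name the indicators that mattered but not the statistical test behind “tested measures of repository success.” As a purely illustrative sketch (not the DIPIR analysis), indicator ratings could be related to reported satisfaction with an ordinary least squares model; the column names and CSV file here are hypothetical.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical survey data: one row per respondent, each column a rating scale.
# The file name and variable names are placeholders, not the DIPIR instrument.
df = pd.read_csv("icpsr_survey.csv")

# Regress reuse satisfaction on the data quality indicators listed on the slide.
model = smf.ols(
    "satisfaction ~ completeness + relevancy + accessibility "
    "+ ease_of_operation + credibility + producer_reputation + documentation_quality",
    data=df,
).fit()
print(model.summary())  # coefficients suggest which indicators predict satisfaction
```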
12. The Study
Research Question
How do novice social science researchers make sense of social science data?
Data Collection
22 Interviews
Data Analysis
Code set developed and expanded from interview protocol
http://www.english.sxu.edu
13. Making sense of matching and merging capabilities across multiple datasets
• Combining longitudinal data
• “If they're not asking the same question over years,… [it’s] particularly difficult because if they’ve changed the question wording, are then people answering differently and so there were several discussions that I had with my dissertation advisor…” (CBU18).
• Merging data from different sources
• “…authors will create a variable, they’ll average across a four or five year period, and I’m trying to match that with a variable that was coded for a single year period. So making an argument…that these two things should be put together…, is something I always have to be wary of…So when dealing with that,…I’ll see if it’s been done by others” (CBU04).
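The second quote describes aligning a variable averaged over a multi-year window with one coded per single year. Below is a toy pandas sketch of one way to make that alignment explicit; the country, years, and values are invented for illustration.

```python
import pandas as pd

# Yearly measurements coded for single years (made-up values).
yearly = pd.DataFrame({
    "country": ["A"] * 5,
    "year": [2001, 2002, 2003, 2004, 2005],
    "turnout": [0.61, 0.58, 0.64, 0.60, 0.63],
})

# A second source reports an index averaged over a five-year window (2001-2005).
averaged = pd.DataFrame({"country": ["A"], "window": ["2001-2005"], "index": [4.2]})

# One defensible alignment: collapse the yearly data to the same window before
# merging, and keep the window label so the aggregation choice stays visible.
yearly["window"] = "2001-2005"
collapsed = yearly.groupby(["country", "window"], as_index=False)["turnout"].mean()
merged = collapsed.merge(averaged, on=["country", "window"])
print(merged)
```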
14. Preliminary Findings
Research Question
How do novice social science researchers make sense of social science data?
Data Collection
22 Interviews
Data Analysis
Code set developed and expanded from interview protocol
Preliminary Findings
Novices engaged in careful articulation of the data producer’s research process.
Novices relied on human scaffolding in the form of faculty advisors and instructors.
Human scaffolding also came from the community as represented in the literature.
15. Social Science Resource
Faniel, I. M., Kriesberg, A., & Yakel, E. (2012). Data Reuse and Sensemaking among Novice Social Scientists. Proceedings of the American Society for Information Science and Technology, 49. (Slides)
Full list: http://dipir.org/publications/
16. The Challenges of Digging Data: A Study of Context in Archaeological Data Reuse
Motivation
• Social and economic forces are pushing toward digital archaeological data publication
• No robust set of standards exists for field archaeology
• Data reuse studies can inform standards development, but there are few outside of science and engineering disciplines
http://opencontext.org/
17. Archaeology Resource
Faniel, I. M., Kansa, E., Kansa, S. W., Barrera-Gomez, J., & Yakel, E. (2013). The Challenges of Digging Data: A Study of Context in Archaeological Data Reuse. Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries. (Preprint, Abstract, view slides via SlideShare)
Full list: http://dipir.org/publications/
18. Archaeology Study
Research Questions
1. How does contextual information serve to preserve the meaning of and trust in archaeological field research over time?
2. How can existing cultural heritage standards be extended to incorporate these contextual elements?
Data Collection
22 interviews with archaeologists
Data Analysis
Code set developed and expanded from interview protocol
http://www.english.sxu.edu
19. Preliminary Findings
• The lack of context was a persistent problem.
• Data collection procedures were highly sought during data reuse.
• Additional context also played a role during data reuse.
• Researchers have an interest in the entire data lifecycle (from data collection and preparation through the repository).
• Need more studies involving data integration and reuse to help guide standards development (CIDOC-CRM is not sufficient).
21. A Snapshot of the 27 Data Reusers
• 96% reuse data from other repositories and websites
• 93% reuse data from museums and archives
• 63% study ecological trends
• 37% are systematists
• 26% reuse data from journal articles
• 26% reuse data from colleagues
22. Data Selection Criteria
Condition of specimen
Data coverage
Geographic precision
Results of pre-analysis
Identification or location errors
Matches another dataset
Availability of voucher specimen
Relevant taxonomically
Sequence has been published
Time period specimen collected
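These criteria are reported findings rather than an algorithm, but as a hypothetical illustration, a few of them could be expressed as filters over specimen records; the field names below are invented and do not reflect UMMZ’s actual schema.

```python
import pandas as pd

# Invented specimen records; field names are illustrative only.
records = pd.DataFrame({
    "specimen_id": ["Z1", "Z2", "Z3"],
    "condition": ["good", "damaged", "good"],
    "georef_uncertainty_km": [0.5, 12.0, 1.0],
    "has_voucher": [True, True, False],
    "year_collected": [1998, 2005, 1962],
})

# Apply a few of the selection criteria from the slide as filters:
# specimen condition, geographic precision, voucher availability, collection period.
usable = records[
    (records["condition"] == "good")
    & (records["georef_uncertainty_km"] <= 5.0)
    & records["has_voucher"]
    & records["year_collected"].between(1990, 2013)
]
print(usable)
```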
23. Trust in Repositories Resource
Yakel, E., Faniel, I., Kriesberg, A., & Yoon, A. (2013). Trust in Digital Repositories. International Journal of Digital Curation, 8(1), 143–156. doi:10.2218/ijdc.v8i1.251. (Awarded Best Conference Paper at the 8th International Digital Curation Conference (IDCC), Amsterdam, Netherlands.) (Article)
Full list: http://dipir.org/publications/
24. Stakeholder Trust
DIPIR is examining trust factors for reuse:
• Benevolence
– The organization demonstrates goodwill toward the customer
• Integrity
– The organization is honest and treats stakeholders with respect
• Identification
– Understanding and internalization of stakeholder interests by the organization
– ISO TRAC: understanding the designated community (pp. 25–26)
• Transparency
– Sharing trust-relevant information with stakeholders
– ISO TRAC: sharing audit results (p. 19)
(Pirson & Malhotra, 2011)
25. Theoretical Framework
DeLone and McLean Information Systems (IS) Success Model
[Model diagram: Information Quality, System Quality, and Service Quality feed into Intention to Use / Use and User Satisfaction, which in turn lead to Net Benefits.]
(DeLone & McLean, 2003)
26. DIPIR and TRAC
• DIPIR used TRAC requirements as a starting point for informing a survey of social scientists
• That process raised questions about what users of digital repositories might notice and/or rely upon
• Worthwhile to take a step back and consider how users might perceive our TRAC-related efforts
27. Perceptions of TRAC
Examples from TRAC requirements:
3.1.1. Mission statement reflects “commitment to the preservation of, long term retention of, management of, and access to digital information”
3.2. “sustained operation of the repository”
3.3.4. “commit to transparency and accountability in all actions”
How might users of repositories become aware of and respond to our efforts to be compliant?
Should we strive to encourage them to be aware? How?
How can/would we know if their interest in our practices increases or changes?
Who is our audience for demonstrating good practice?
29. How Often Interviewees Mentioned Trust Factors
[Table: counts of mentions by quantitative social scientists (44), archaeologists (22), and all interviewees (66) for each concept: Stakeholder Trust in the Organization (Benevolence, Identification, Integrity, Transparency); Social Factors (Colleagues); Structural Assurance (Guarantees: Preservation/Sustainability, Institutional Reputation, Third Party Endorsement).]
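The individual cell counts in this table are not legible in the transcript, but the tallying itself is straightforward. A minimal sketch of how mention frequencies per discipline could be computed from coded interview segments follows; the data structure is an assumption, not the DIPIR codebook.

```python
from collections import Counter

# Hypothetical coded interview segments: (discipline, trust-factor code) pairs.
coded_segments = [
    ("social science", "Integrity"),
    ("archaeology", "Transparency"),
    ("social science", "Colleagues"),
    ("archaeology", "Integrity"),
]

# Tally how often each trust factor was mentioned, by discipline and overall.
by_discipline = Counter(coded_segments)
overall = Counter(code for _, code in coded_segments)

for (discipline, code), n in sorted(by_discipline.items()):
    print(f"{code:15s} {discipline:15s} {n}")
print("All mentions:", dict(overall))
```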
30. Coming Up …
DIPIR Research Assistant Adam Kriesberg will present a paper on Nov. 4 at the 2013 Meeting of the Association for Information Science and Technology (ASIS&T). The paper is entitled “The Role of Data Reuse in the Apprenticeship Process” and features Rebecca Frank, Ixchel Faniel, and Elizabeth Yakel as co-authors.
http://dipir.org/news/
31. Acknowledgements
• Institute of Museum and Library Services, LG-06-10-0140-10
• Our co-authors: Sarah Whitcher Kansa, Ph.D., Julianna Barrera-Gomez, M.S.I., Elizabeth Yakel, Ph.D.
• Partners: Nancy McGovern, Ph.D. (MIT), Eric Kansa, Ph.D. (Open Context), William Fink, Ph.D. (University of Michigan Museum of Zoology)
• Students: Morgan Daniels, Rebecca Frank, Adam Kriesberg, Jessica Schaengold, Gavin Strassel, Michele DeLia, Kathleen Fear, Mallory Hood, Molly Haig, Annelise Doll, Monique Lowe
Editor's Notes
Transparency: “Communicating audit results to the public—transparency—will engender more trust, and additional objective audits, potentially leading towards certification, will promote further trust in the repository and the system that supports it” (ISO TRAC, 2012, p. 19).
A core element of transparency for digital repositories – a showstopper if it’s missing.
Establishes scope: a repository should be what it purports to be.
How might users/consumers view a mission statement? Would they be aware it exists? Should they be? Might it inform or encourage their use of content? What significance might it have for them?
Does a repository’s track record or longevity have an impact on users/consumers? Would users be aware of a repository’s track record? How? Might it encourage their use of a repository’s content?