Managing Social Science Data from the Arctic with ELOKA, ACADIS, NSIDC, and (Insert Many Other Acronyms Here!)
1. Managing Social Science Data from the Arctic with
ELOKA, ACADIS, NSIDC, and (Insert Many Other
Acronyms Here!)
Colleen Strawhacker, Peter Pulsifer, and Shari Gearheard
cstrawhacker@gmail.com
National Snow and Ice Data Center, University of Colorado
Presented for the GHEA Workshop
November 4th – 6th, 2013
2. Big Data, Cyberinfrastructure, other “techy”
terms here
Major movement to working
with “big data”
The “democratization of
science”
How do we make the most of
using these data for analysis
at different scales?
How do we ensure that data
are preserved and curated
for the future?
4. The National Snow and Ice Data Center
Creates tools
for data
access
Performs
scientific
research
Educates
the public
about the
cryosphere
Supports
data users
Manages and
distributes
scientific data
Supports local
and traditional
knowledge
5. NSIDC: An overview
ACADIS
ACADIS, a joint NSIDC,
UCAR, and NCAR
effort, will manage all
Arctic data for NSF
ACADIS
Advanced Cooperative
Arctic Data and Information Service
6.
7. Traditional and Local
Knowledge
Important source of data to
consider in the changing
Arctic
Different from data from the
physical sciences… how do
we manage, preserve, curate
it?
What about the
considerations of the
community whose
knowledge we are working
with?
Timothy Allen, BBC
12. Technology Preservation
“Uggi” CD
Fox Gearheard, S. 2003. When the weather is uggianaqtuq: Inuit
observations of environmental change. Boulder, Colorado USA:
University of Colorado Geography Department Cartography Lab.
Distributed by National Snow and Ice Data Center. CD-ROM.
13.
14. The Creation of SIZONet
Concerns from the Community
What data should be made public?
Who should be able to contribute to the database?
How will the data be contextualized?
These concerns are frequently mirrored in other
ELOKA projects.
18. The Creation of the Yup’ik
Place Names Atlas
Concerns from the Community
Who should be able to add place names to database?
Who should have access to locations? Should locations
be hidden?
Where should the database infrastructure be housed?
Do these concerns sound familiar?
And yet, a different agreement and resolution are being
made to meet the wishes of the community.
19. What about Social Science Data?
Tension between having data be open access
(or, freely available to anyone who may want to analyze
it) and privacy issues
Major considerations when it comes to protecting the
locations of archaeological sites, privacy of research
subjects
NSIDC is not adequately equipped to handle these
types of data at this time – but that’s my job!
20. Working with Archaeological Data from
NABO
Partner with tDAR and NABO to get archaeological data entered into
a system that is accessible to both archaeologists and other
scientists working in the Arctic
22. Upcoming Opportunities
IASSA Panel on Data Management in Arctic Social
Science in May 2014 at ICASS in Prince George
APECS Webinar in late November 2013 on data
management for traditional and local knowledge
NSF Proposal to put these ideas into practice in 2014
IHOPE Sponsored Session at the Resilience 2014
Conference in Montpellier, France
Contact:
Colleen Strawhacker
cstrawhacker@gmail.com
Good afternoon everyone, and thank you for attending our panel. My name is Colleen Strawhacker, and I am a postdoc at the National Snow and Ice Data Center at the University of Colorado, and you have probably seen a lot of emails from me. Along with my colleagues Peter Pulsifer and Shari Gearheard, I will be presenting on Managing Social Science Data from the Arctic with ELOKA, ACADIS, NSIDC, and many other acronyms that will come up throughout the presentation. I wanted to use this presentation to give you some background on the data management activities going on at the National Snow and Ice Data Center and to provide some ideas on what to do with the insane amounts of data that everyone here has been talking about over the last few days.
In recent years, discourse in the sciences increasingly includes reference to data management, e-science, cyberinfrastructure, data intensive science, and big data, or insert other techy terms here. In this “fourth paradigm”, massive datasets connected through networks will power a new age of scientific discovery. Or, put simply, large-scale data analysis. The concept and practice of Citizen Science (scientific activities carried out by nonprofessional scientists) has become mainstream within programs inside and outside of traditional scientific research environments (e.g., universities, government agencies). This is something that I like to call the democratization of science, and I think NABO and GHEA have a lot of good examples of it. Concurrently, social networking technology has transformed how individuals and organizations with Internet access interact and share information and knowledge. We now see a convergence of these technical and social movements, with the result being a socio-technical system that allows a broad range of different people to engage in research at scales ranging from the microscopic to the cosmic. Which leads to the questions: How do we make the most of using these data for large-scale analysis? How do we ensure that data are preserved and curated for the future? All of these aspects are why our current thrust toward cyberinfrastructure and visualization is so important.
Unfortunately, these data and their essential metadata are lost in many different ways. As Tom pointed out yesterday, there are little bits and pieces of problems with certain parts of our data where having the actual data collector there is usually best, but good metadata can help to solve those problems if the investigator cannot be there.
One of the first data management systems created by the NSIDC is ACADIS, or the Advanced Cooperative Arctic Data and Information Service. This data management system was created primarily to house physical science data from the Arctic, including measures like sea ice thickness and extent, and imagery of the Arctic and Antarctic. You can see from this chart what kinds of data typically go into this system, such as ecological data, MODIS and Landsat imagery, field data observations, etc.
Here is a screenshot of where you can access ACADIS and some of the different ways you can browse the data: geographically, by PI, by discipline, etc. So this is a really excellent system for managing data from the Arctic and Antarctic, and it’s the place where most people funded by the National Science Foundation in Polar Programs are managing and curating their data. Our data center, however, has realized in recent years that supporting and maintaining data gained from traditional knowledge and the social sciences presents many challenges that ACADIS is not equipped to deal with. And you can see from the talks (Seth in the Faroes, Jen in Mozambique, Megan in Iceland) that these data are very important to understanding whether the local system is operating sustainably or unsustainably.
For millennia, Indigenous Peoples of the world have been observing their environment, sharing information through social networks, establishing theories of how the world works, and using those theories to guide practice. Although different from Western Science, these ways of knowing and being offer valuable sources of knowledge for knowledge creators and others. The Arctic has been home to Indigenous Peoples for many generations. Indigenous Peoples have carved out a productive, vital culture and knowledge system, yet until recently, the scientific community working in the Arctic has largely overlooked Indigenous local observations and knowledge. While there is still much room for improvement today, Indigenous Peoples are increasingly being acknowledged as investigators, partners, and collaborators, and their local knowledge and observations are being documented for use in Arctic research and policy development. This increased recognition is happening within the context of Data Intensive Science, broader social media movements, and improved access to the Internet through mobile technologies. Indigenous people around the world are engaging in these emerging paradigms. Indigenous communities and organizations, however, are facing new challenges with respect to establishing models of engagement that are considered ethical, culturally appropriate, supportive of community goals, and achievable within the local context. A Data Intensive Science model envisions a high level of interaction among researchers facilitated by information technology. This model supports easy search, acquisition, and use of data and information while using a globally based collaborative infrastructure. This model of science will also see innovative analyses using scalable, widely accessible, and accepted tools.
These partnerships have resulted in the development of a number of products based on the knowledge of Indigenous partners. Here we provide a small subset of these examples from Alaska and Canada in terms of the opportunities afforded and issues raised by using digital technologies for sharing Indigenous knowledge and observations, enhancing culture, and supporting education and training initiatives.
In addition to the telling and documentation of oral histories, ELOKA is also very much a data management system, in that we are concerned about protecting technology and information against technological obsolescence. For example, this Uggi CD was created in 2003 by my co-author Shari Gearheard for her dissertation on Inuit observations of environmental change. This CD was quite popular in the past (you could order it online from the National Snow and Ice Data Center) and was actually one of NSIDC’s top sellers. The CD, however, no longer works on most computers, including mine here. It does work on a very outdated Mac that was dug out of the technological archives at the University of Colorado, so on exactly one laptop that we have at our data center, and we are currently working to ensure that this CD can be accessible again in the future.
Second, however, was providing a system in which data could be directly collected and recorded, which resulted in the creation of SIZONet. Members of several Alaskan coastal communities worked with sea ice geophysicists and other researchers from the University of Alaska Fairbanks to develop a digital collection of observations of sea ice, weather, and wildlife over many years. ELOKA got involved by developing a web-accessible site that would allow community members, researchers, or anyone else interested to access the information online. Developing the application required extensive consultation among all partners involved in this project.
Community members were interested in broadly sharing information but were clear that not all information should be made public. Some information should be restricted to members of a particular community, specifically the raw transcripts where personal references are made and sensitive sites are identified. University-based researchers were interested in establishing a systematic record of the observations made; however, to avoid inappropriately transforming the information, the form of the database was co-developed with the communities involved. Some information was appropriate for use in a database, while other information was more appropriately left as narrative. Technology was developed to establish access rules that would allow for sharing while maintaining some level of control over who could obtain the data. While technology developed through dialogue and negotiation reduces the likelihood of knowledge de-contextualization, the model is constantly evolving.
To encourage what the communities see as responsible use of the application, a “Use Agreement” was developed for ELOKA. Users from outside of the community must accept the terms of this agreement to access the information. While there is no expectation that this agreement will be applied using legal tools, the team agreed that a normative approach would improve the probability of responsible use. At present, tools to include photographs, maps, and video are being implemented to provide more context for the data collected.
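The access rules and the Use Agreement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ELOKA's actual implementation: the `Record` and `User` types, the two access levels, and the `can_view` function are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical access levels; the real system's rules were negotiated
# with each community and are not described in detail in the talk.
PUBLIC = "public"
COMMUNITY = "community"


@dataclass(frozen=True)
class Record:
    record_id: str
    kind: str        # e.g. "observation" or "raw_transcript"
    community: str   # community that contributed the record
    access: str      # PUBLIC or COMMUNITY


@dataclass(frozen=True)
class User:
    name: str
    community: Optional[str] = None       # None for outside users
    accepted_use_agreement: bool = False  # normative, not legal, agreement


def can_view(user: User, record: Record) -> bool:
    """Return True if this user may see this record."""
    # Members of the contributing community see their own material.
    if user.community == record.community:
        return True
    # Restricted material (e.g. raw transcripts) never leaves the community.
    if record.access == COMMUNITY:
        return False
    # Public material is open to outsiders who accepted the Use Agreement.
    return user.accepted_use_agreement
```

The design choice worth noting is that the restriction is attached to each record rather than to the whole database, which is what lets narrative transcripts stay private while summarized observations are shared.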
Another example is the Yup'ik Environmental Knowledge Project and Atlas. Over the last ten years, the Calista Elders Council (CEC) has worked with elders from communities in Alaska to document Yup'ik place names. The result was the documentation of more than five thousand Indigenous place names that are culturally significant and provide important knowledge of the past and present environment. The communities were interested in sharing these place names amongst themselves and with others. While many books have been published to share place names, communities recognize the broad reach of the Internet and social media, and they frequently desire to be part of these movements, including research-oriented movements such as Data Intensive Science. While the map of place names is an important knowledge artifact, stories and other knowledge of these places are also critical from the perspective of community members.
To provide this contextual information, two websites were developed. One provides extensive narrative and accompanying photographs and maps, while the other website presents place names using an interactive map that associates places with related multimedia (e.g., audio clips, video, written stories). While technical developments are an important part of establishing an information system that is ethically and culturally appropriate, an important component has been missing. The ability for community members to directly contribute place names and related information or media is seen as critically important in terms of maintaining the application over time and engaging youth members.
To this end, a training and visioning workshop was held in September 2013 in Anchorage and Bethel, Alaska. During this meeting, staff from the Calista Elders Council and members of partner communities, including college students, were trained on how to add new content to the place name atlas. This workshop also included substantial discussion as to how the development of the atlas should proceed. Who should have the ability to add names? Should the website be completely open or moderated in some way? Related to this decision, whose knowledge counts? If there are multiple place names or versions of a narrative, who decides on what is correct, or does a decision even need to be made? How is community membership defined? Should some place names or narratives be hidden from public view, and if yes, who does get access? How can we use the technology as a platform to promote interaction between youth (who may be more adept at using the tools) and Elders (who have more extensive knowledge)? When the expertise exists, should the application be physically moved from university infrastructure to the local community? Does this sound familiar? Yes, the dialogue is just like the one around the creation of SIZONet, and yet different agreements and resolutions were made to ensure the data were preserved and accessible in a way that benefitted the community.
Like the data from traditional knowledge, social science data is complicated to preserve and curate. There is a tension between having data be open access and freely available, on the one hand, and privacy on the other, so we face many of the same issues that ELOKA has in managing data from traditional sources, but social science data is still quite a bit different. For instance, ACADIS sometimes gets spreadsheets full of data from interviews with human subjects, and individuals in small villages throughout the Arctic can be easily identified if the spreadsheet is not adequately coded. Also, the locations of archaeological sites are essential to understanding the context of the data, but they are also essential to keep private to protect the sites from looting. At this point, there is a gap in our data management capabilities at NSIDC. We do data management for the physical sciences and for traditional and local knowledge very well, but we need to figure out how best to manage social science data at the Center, and that will be my job over the next 2 years.
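To make the two privacy concerns above concrete, here is a minimal Python sketch of "coding" an interview spreadsheet row and blurring a site location before public release. Everything in it is an assumption for illustration: the function names, the column names (`name`, `lat`, `lon`), and the half-degree grid are hypothetical, and a keyed hash plus coordinate rounding is not by itself enough to guarantee anonymity in a very small village.

```python
import hashlib
import hmac
import math


def code_identifier(name: str, secret_key: str) -> str:
    """Replace a respondent's name with a stable, non-reversible code."""
    digest = hmac.new(
        secret_key.encode("utf-8"), name.encode("utf-8"), hashlib.sha256
    ).hexdigest()
    return "R-" + digest[:8]


def coarsen(value: float, cell_degrees: float = 0.5) -> float:
    """Snap a coordinate to the lower edge of its grid cell."""
    return math.floor(value / cell_degrees) * cell_degrees


def prepare_public_row(row: dict, secret_key: str) -> dict:
    """Return a copy of the row that is safer to publish."""
    public = {k: v for k, v in row.items() if k not in ("name", "lat", "lon")}
    public["respondent"] = code_identifier(row["name"], secret_key)
    public["lat"] = coarsen(row["lat"])
    public["lon"] = coarsen(row["lon"])
    return public


# Made-up example row, not real interview data.
row = {"name": "Jane Doe", "answer": "yes", "lat": 64.13, "lon": -21.9}
print(prepare_public_row(row, secret_key="keep-this-private"))
```

The precise coordinates and the key that links codes back to names would stay with the data curator, so the context is preserved for approved researchers without being exposed publicly.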
For most of NABO’s datasets, however, tDAR, or the Digital Archaeological Record, is going to be the best place to put the data, as they already manage archaeological data very well. They can adequately preserve context and know the important metadata entries for archaeological data. They also have systems in place to hide archaeological site locations, standardize metadata forms, and find data by PI, by project, and geographically. But from what we have seen in all of the presentations over the past 2 days, NABO and GHEA collaborators have a lot of other types of data (historic documents, Icelandic sagas, tephra), and many of these data sources will not be appropriate for a data management system that specializes in archaeology. So, what do we do?
One option is to use the Arctic Data Explorer, or create a similar website, to ensure that all of the highly variable datasets that NABO has can be found across multiple data management systems. Right now the Arctic Data Explorer is set up to crawl all of these different sources of data, so if you search for a keyword, all of these websites will be searched and you do not have to go to each individual data management system to find all of the data that you need. The idea is to put the data where they are appropriate (physical science data, like tephras, in ACADIS; archaeological data in tDAR; Icelandic sagas in ELOKA) but to ensure that they are all tagged the same way. "NABO" may be a good tag, so someone can search for it and find all NABO data. Or perhaps we create an entirely new system altogether to figure out the best way to manage such a diverse and large source of data.
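The shared-tag idea can be sketched as a search over several catalogs at once. This is a toy illustration, not the Arctic Data Explorer's actual interface: the `CATALOGS` structure, the `search_all` function, and the record titles are all made-up placeholders, and a real system would harvest metadata from each repository rather than hold it in memory.

```python
from typing import Dict, List

# Placeholder metadata records, one list per repository (not real datasets).
CATALOGS: Dict[str, List[dict]] = {
    "ACADIS": [
        {"title": "Example tephra layer samples", "tags": {"NABO", "tephra"}},
        {"title": "Example sea ice extent series", "tags": {"sea ice"}},
    ],
    "tDAR": [
        {"title": "Example excavation records", "tags": {"NABO", "archaeology"}},
    ],
    "ELOKA": [
        {"title": "Example saga excerpts", "tags": {"NABO", "sagas"}},
    ],
}


def search_all(catalogs: Dict[str, List[dict]], tag: str) -> List[dict]:
    """Find every record carrying the tag, whichever repository holds it."""
    wanted = tag.lower()
    hits = []
    for repository, records in catalogs.items():
        for record in records:
            # Compare case-insensitively so "nabo" and "NABO" both match.
            if wanted in {t.lower() for t in record["tags"]}:
                hits.append({"repository": repository, "title": record["title"]})
    return hits


for hit in search_all(CATALOGS, "NABO"):
    print(hit["repository"], "-", hit["title"])
```

The point of the sketch is that the data stay in the repository best suited to them, and only the agreed tag has to be consistent for one search to reach everything.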