Managing large and complex data sets: The Challenges of Archiving and Online Delivery. Catherine Hardman
The problem… in 1996: My lithics report here, on floppy disc
The ADS: some ancient history
What do we do?
No need for digital preservation. Domesday Book. Publisher: William of Normandy (1086). Still readable.
Where’s preservation when you need it? Domesday Disc. Publisher: BBC (1986). Nearly lost.
Why is it important?
What’s the problem? Information Entropy
The scale of the problem in the 1990s: Strategies for protecting physical media. Findings and Recommendations from ‘Digital Data in Archaeology: A Survey of User Needs’ (Condron et al. 1999)
Protecting Physical media … never the twain
The scale of the problem in the 1990s: The popularity of storage options. Findings and Recommendations from ‘Digital Data in Archaeology: A Survey of User Needs’ (Condron et al. 1999)
8" Floppy 3.5" Floppy 5.25" Floppy 12" Optical Disk 5.25" Optical Disk CD-ROM Sparq Disk Cartridge Zip Disk Click! DVD-ROM Jaz Disk Floptical Disk Punch Tape Rectangular Hole Punch Card IBM 3480 DLT Tape DG90M Tape DC4_120 8mmD-eight QIC DC600 G2000 Tape 4mm Tape Ditto Max 9-Track Reel Cassette tape Memory Stick MultiMedia Card SD Memory Card xD Picture Card Smart Media CompactFlash Travan
Why is it all so difficult?
How do we do it? Open Archival Information System (OAIS)
But that’s people…
Migration-based approach & controlled ingest: aim to connect with data producers early in their project lifecycles, so that preservation planning is a key consideration during the project rather than an afterthought.
Guides to help you do all that.
It hasn’t really got much easier
The size of digital archives held by different types of archaeological bodies. Archaeology Data Service: http://ads.ahds.ac.uk/
Big Data Project: Roughly how much data would be generated by a single project?
Which of these data collection techniques do you carry out? Technologies used: 3D Laser Scanning; Sidescan Sonar; Multibeam Scanning; Single Beam Scanning; Geophysics; Acoustic Tracking; Sub bottom profiling; Geographic (e.g. GIS); Lidar; Digital Video; Video Movie Clips; Still Images; CAD (2D or 3D); Other. Chart values: 12%, 4%, 4%, 3%, 8%, 1%, 3%, 11%, 9%, 9%, 7%, 14%, 3%, 12%.
What are the main software packages you use?
Do you have an archiving policy for the data sets / types in question?
back-up
When you start a new project… would you consider using existing datasets?
This is the opportunity!
 
Making the inaccessible accessible
Blurring the distinction … … between publication and archives …
Making the LEAP…
 
What does that mean for you?
How do you do that?
We’re here to help


Editor's Notes

  1. How big is your data? We asked this to get an idea of the scale of the problem. You’ll see there is some quite big data being produced out there: some people are producing over 200GB for a project.
  2. We ran an online questionnaire to find out about users and uses of big data. I’ll just skim through some of the things that came out of it. We got 48 responses. This is one of the first questions we asked: we wanted to get an idea of the data collection techniques that people are using to create big data. You’ll see there’s a wide range of technologies, including the ones I mentioned on an earlier slide.
  3. Of the 101 software packages entered into the online form, a staggering 52 are unique (that is, after editing for things like upper- and lower-case differences). It seems the world of ‘big data’ is very fragmented.
  4. This is an interesting one. We asked if people had an archival policy for the data sets in question. Only 48% of respondents said they have a policy in place. Of these, many noted that their policies were localised and incomplete, not a formal written policy. A proper system of digital archiving should involve continuous active management of the data; putting data on a DVD and putting it in a drawer is not really a stable archival policy. A formal archival policy, as we see it, should ideally be based on the OAIS system: continuous active management of data to ensure its survival into the future.
  5. An overwhelming “yes” to this question. Some of the reasons that were cited: monitoring over time, avoiding duplication, and saving time and money. Of course, re-use just isn’t possible unless someone is archiving and providing access to this data.
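
Note 3 mentions editing the software package entries for upper- and lower-case differences before counting the unique ones. A minimal sketch of that kind of clean-up follows. The sample entries and the function names are hypothetical, since the survey responses themselves are not reproduced here, and the whitespace trimming is an extra assumption beyond the case-folding the note describes.

```python
from collections import Counter


def normalise(name: str) -> str:
    """Fold case and collapse whitespace so trivially different spellings match.

    Case-folding follows the note; whitespace collapsing is an assumption.
    """
    return " ".join(name.split()).casefold()


def unique_packages(entries):
    """Map each normalised package name to the number of times it was entered."""
    return Counter(normalise(e) for e in entries if e.strip())


# Hypothetical form entries, not the actual survey data.
responses = ["ArcGIS", "arcgis", "AutoCAD", " autocad ", "Surfer", "ARCGIS"]
counts = unique_packages(responses)
print(len(responses), "entries,", len(counts), "unique")
```

Run against the real form entries, the two numbers printed would correspond to the "101 entered" and "52 unique" figures in the note, provided the same normalisation rules were used.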