Addressing the Key Challenges of Storage,
Discoverability, Accessibility and Analysis.
Ivan Hanigan and Marco Fahmi
Australian SuperSite Network (ASN) and
Long Term Ecological Research Network (LTERN)
ACEAS Phenocam Workshop
2014-03-11
Ivan Hanigan and Marco Fahmi (ASN-LTERN)Addressing the Key Challenges of Storage, Discoverability, Accessibility and Analysis.2014-03-11 1 / 21
Topic
1 Introduction
2 What I want out of this workshop
3 Storage hosting of the data
4 Discoverability
5 Accessibility
6 Analysis
7 Conclusion
Introduction
Four key challenges to working with large data collections:
Storage (Big Data, resilience to disasters, future proofing)
Discoverability (exposing metadata, indexing, standard schemas)
Accessibility (who is accessing what? Is it collaborative?)
Analysis (workflow and provenance)
Phenocams
Managing phenocam data is an exemplar of these issues
What I want out of this workshop
My work as a Data Manager / Data Analyst at ASN and LTERN
Toward a better set of descriptions of the business requirements
for each of these goals
Building systems that address these challenges.
The Data Deluge
“The next five years will produce more research data than has
been produced in all of previous human history.”
The great data explosion, April 29, 2009
http://www.theaustralian.news.com.au/story/0,25197,25400306-12332,00.html
Australian Research Cloud
The IT infrastructure available is unprecedented
Often cheap or free
Storage hosting of the data
There are technical challenges in storing (as well as
uploading/downloading) data.
Sustainability and future-proofing of the storage is a logistical
challenge.
Questions arise, such as: should your store be the only location of
the data, or one of several mirrors?
Is “indefinite” storage really possible?
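If a store is to be one of several mirrors, one practical concern is checking that a mirror still matches the primary copy. The following is a minimal sketch, assuming both copies are reachable as local directory trees; the function names (`sha256_of`, `build_manifest`, `compare_mirrors`) are illustrative and not part of any ASN or LTERN system.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_mirrors(primary: Path, mirror: Path) -> dict[str, list[str]]:
    """Report files that are missing from, extra on, or different on the mirror."""
    a, b = build_manifest(primary), build_manifest(mirror)
    return {
        "missing_on_mirror": sorted(set(a) - set(b)),
        "extra_on_mirror": sorted(set(b) - set(a)),
        "content_differs": sorted(k for k in set(a) & set(b) if a[k] != b[k]),
    }
```

Storing the manifest alongside the data would also let a future custodian confirm that files have not silently changed, which is one small part of future-proofing.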
Data and metadata standards
With the gathered expertise, it will be useful to advocate:
Convention over Configuration
appropriate syntax and semantics for phenocam data, with
well-considered conceptual frameworks for grouping datasets
appropriate compatibility/compliance with other standards.
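As one illustration of convention over configuration, a shared file-naming convention lets basic metadata be derived from the file name rather than configured per dataset. This is a sketch only: the pattern `<site>_<camera>_<YYYYMMDD>_<HHMMSS>.jpg` is a hypothetical convention invented for the example, not one proposed in these slides or used by any phenocam network.

```python
import re
from datetime import datetime

# Hypothetical convention (an assumption for illustration):
#   <site>_<camera>_<YYYYMMDD>_<HHMMSS>.jpg
FILENAME_PATTERN = re.compile(
    r"^(?P<site>[a-z0-9]+)_(?P<camera>[a-z0-9]+)_"
    r"(?P<date>\d{8})_(?P<time>\d{6})\.jpg$"
)


def parse_image_name(filename: str) -> dict:
    """Derive site, camera and capture time from a conforming file name."""
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"{filename!r} does not follow the naming convention")
    captured = datetime.strptime(match["date"] + match["time"], "%Y%m%d%H%M%S")
    return {
        "site": match["site"],
        "camera": match["camera"],
        "captured": captured.isoformat(),
    }


print(parse_image_name("sitea_cam1_20140311_093000.jpg"))
```

Files that do not match are rejected rather than guessed at, so the convention itself acts as a light validation step before metadata is exposed for indexing.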
Ownership, Sharing and Anonymous Re-use
There will be some contractual obligations about sharing and
publishing data (or not!) as well as the group's general
inclination about what and when to share.
There is also the question of the appropriate licensing scheme to
govern this, along with embargoes and controlled access.
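Embargoes and controlled access can be expressed as a simple rule evaluated per request. The sketch below is one possible reading, under stated assumptions: the `Dataset` fields and the `can_access` policy (approved users always allowed, everyone else only for uncontrolled datasets whose embargo has passed) are hypothetical and would need to reflect the actual contractual obligations.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Dataset:
    """Hypothetical access metadata attached to a dataset."""
    name: str
    licence: str
    embargo_until: Optional[date] = None   # None means no embargo
    controlled: bool = False               # True means approved users only
    approved_users: frozenset = frozenset()


def can_access(dataset: Dataset, user: str, today: date) -> bool:
    """Decide whether a user may access a dataset on a given day."""
    if user in dataset.approved_users:
        return True
    if dataset.controlled:
        return False
    if dataset.embargo_until is not None and today < dataset.embargo_until:
        return False
    return True


images = Dataset(
    name="example-phenocam-images",
    licence="CC-BY",
    embargo_until=date(2015, 1, 1),
    approved_users=frozenset({"data_owner"}),
)
print(can_access(images, "anonymous", date(2014, 3, 11)))   # embargo still active
print(can_access(images, "data_owner", date(2014, 3, 11)))  # approved user
```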
Ethics and Trust
Data providers then need trust, to allay their concerns over
the re-use of data
Collaborative and respectful use should be expected of data
users.
Analysis
Workflow management matters because users will need tools or
technical savvy to do something interesting and useful with the data.
The paradigm of “Bringing the Code to the Data” rather than
“Taking the Data to the Code”
Uses remote supercomputers with very large storage and
compute capacity
However, it often feels as though one needs to be as skilful as a
“Super Scientist” to access and use a supercomputer
How can we support ordinary users wanting “Super” analyses?
Provenance tracking of analysis outputs is needed to ensure reproducibility
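Provenance tracking can start very simply: alongside each output, record fingerprints of the inputs, the code, and the result. The following is a minimal sketch under that assumption; the record layout and the names `digest` and `provenance_record` are illustrative, not an established provenance standard.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of some bytes."""
    return hashlib.sha256(data).hexdigest()


def provenance_record(inputs: dict[str, bytes], code: str, output: bytes) -> dict:
    """Describe which inputs and code produced a given output."""
    return {
        "created": datetime.now(timezone.utc).isoformat(),
        "python": platform.python_version(),
        "code_sha256": digest(code.encode("utf-8")),
        "inputs_sha256": {
            name: digest(blob) for name, blob in sorted(inputs.items())
        },
        "output_sha256": digest(output),
    }


record = provenance_record(
    inputs={"image_001.jpg": b"raw image bytes"},
    code="greenness = g / (r + g + b)",
    output=b"0.41",
)
print(json.dumps(record, indent=2))
```

Re-running the analysis and comparing the hashes then gives a quick check on whether an output was reproduced from the same inputs and code.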
Appropriate analysis
There is an implicit belief among ‘big data’ advocates that answers
to difficult environmental questions can be found through
sharing data
But ecology is inherently about understanding local patterns and
processes, and hard-won, field-based understanding is often
essential to help interpret the results of data analyses
Study designs may therefore need support from those
familiar with the ecosystem
Security against malicious misuse
A data analysis server is geared to executing software code
Analyses may require custom code to be written, or installation
of third-party software from unknown developers
There is a risk that such a Virtual Lab could be the victim of a
malicious attack.
Conclusion
These challenges are not trivial
We suspect the answers to many of these challenges will rely on
outsourcing as much of the hardware and software as possible
to shift the responsibility for upkeep and sustainability onto
someone else’s shoulders
and let the scientists focus on their science.
