Shifting the Burden from the User
to the Data Provider
Peter Fox
High Altitude Observatory,
NCAR (***)
With thanks to eGY and various NSF, DoE and
NASA projects
1
Outline
• Background, definitions
• Informatics -> e-Science
• Data has lots of uses
– Virtual Observatories: use cases
– Data Framework: Examples
– Data ingest, integration, mining and …

• Discussion

2
Fox HDF: Semantic Data Burden Shift Oct 15, 2008
Background
Scientists should be able to access a global,
distributed knowledge base of scientific data that:
• appears to be integrated
• appears to be locally available
But… data is obtained by multiple instruments, using
various protocols, in differing vocabularies, using
(sometimes unstated) assumptions, with
inconsistent (or non-existent) meta-data. It may be
inconsistent, incomplete, evolving, and distributed
And… there exist(ed) significant levels of semantic
heterogeneity, large-scale data, complex data
types, legacy systems, inflexible and unsustainable
implementation technology…
3
Information
Information has Lots of Audiences, from More Strategic to Less Strategic
But data products have Lots of Audiences: SCIENTISTS TOO
From “Why EPO?”, a NASA internal report on science education, 2005
4

The Information Era: Interoperability
Modern information and communications
technologies are creating an
“interoperable” information era in which
ready access to data and information can
be truly universal. Open access to data
and services enables us to meet the new
challenges of understanding the Earth and
its space environment as a complex
system:
• managing and accessing large data sets
• higher space/time resolution capabilities
• rapid response requirements
• data assimilation into models
• crossing disciplinary boundaries.
5
Shifting the Burden from the User
to the Provider

6
Modern capabilities

7
Mind the Gap!
As a result - finding out who is doing what, sharing experience/expertise, and substantial coordination:
• There is/was still a gap between science and the underlying infrastructure and technology that is available
• Cyberinfrastructure is the new research environment(s) that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet.
• Informatics or information science includes the science of (data and) information, the practice of information processing, and the engineering of information systems. Informatics studies the structure, behavior, and interactions of natural and artificial systems that store, process and communicate (data and) information. It also develops its own conceptual and theoretical foundations. Since computers, individuals and organizations all process information, informatics has computational, cognitive and social aspects, including study of the social impact of information technologies. Wikipedia.


8
Progression after progression
Informatics
IT Cyberinfrastructure -> Cyber Informatics -> Core Informatics -> Science Informatics, aka Xinformatics -> Science, SBAs

9
Virtual Observatories
• Conceptual examples:
• In-situ: Virtual measurements
– Related measurements

• Remote sensing: Virtual, integrative
measurements
– Data integration

• Managing virtual data products/ sets
10
Virtual Observatories
Make data and tools quickly and easily accessible
to a wide audience.
Operationally, virtual observatories need to find the
right balance of data/model holdings, portals and
client software that researchers can use without
effort or interference, as if all the materials were
available on their local computer in the
user’s preferred language: i.e. the materials appear to be
local and integrated
Likely to provide controlled vocabularies that may
be used for interoperation in appropriate
domains along with database interfaces for
access and storage and “smart” tools for
evolution and maintenance.

11
Early days of discipline specific VOs
[Diagram: a “?” facing VO1, VO2 and VO3, over databases DB1, DB2, DB3, …, DBn]
12
The Astronomy approach; datatypes as a service
[Diagram: VO App1, VO App2, VO App3 over a VO layer, over DB1, DB2, DB3, …, DBn; one element is marked “Under review”]
• VO layer: VOTable, Simple Image Access Protocol, Simple Spectrum Access Protocol, Simple Time Access Protocol
• Limited interoperability
• Lightweight semantics
• Limited meaning, hard coded
• Limited extensibility
• Open Geospatial Consortium: Web {Feature, Coverage, Mapping} Service and Sensor Web Enablement: Sensor {Observation, Planning, Analysis} Service use the same approach

13
[Diagram: layered architecture, top to bottom, with “Added value” marked at each layer]
Education, clearinghouses, other services, disciplines, etc.
Semantic mediation layer - mid-upper-level: semantic interoperability
VO Portal, VO API, Web Serv.: semantic query, hypothesis and inference; query, access and use of data
Semantic mediation layer - VSTO - low level
Metadata, schema, data
DB1, DB2, DB3, …, DBn

Mediation Layer
• Ontology - capturing concepts of Parameters, Instruments, Date/Time, Data Product (and associated classes, properties) and Service Classes
• Maps queries to underlying data
• Generates access requests for metadata, data
• Allows queries, reasoning, analysis, new hypothesis generation, testing, explanation, etc.
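
The mediation idea (a query on a shared concept is mapped to the underlying data and turned into access requests) can be pictured with a minimal sketch. This is not the VSTO implementation: the concept names, the database labels and the `mediate` function are all hypothetical, chosen only to illustrate the mapping step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    source: str    # hypothetical database label, e.g. "DB1"
    variable: str  # that database's own name for the parameter


# Hypothetical ontology fragment: shared parameter concept -> where it lives.
PARAMETER_SOURCES = {
    "NeutralTemperature": [("DB1", "tn"), ("DB2", "temp_neutral")],
    "KpIndex": [("DB3", "kp")],
}


def mediate(parameter):
    """Map a query on a shared concept to access requests on underlying data."""
    return [AccessRequest(source, variable)
            for source, variable in PARAMETER_SOURCES.get(parameter, [])]
```

A caller asks for `mediate("NeutralTemperature")` and never needs to know that one database calls it `tn` and another `temp_neutral`; that knowledge sits with the provider.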
14
Content: Coupling Energetics and Dynamics of Atmospheric Regions WEB
Community data archive for observations and models of Earth's upper atmosphere and geophysical indices and parameters needed to interpret them. Includes browsing capabilities by periods, > 310 instruments, models, > 820 parameters…
15
Content: Mauna Loa Solar Observatory
Near real-time data products from Hawaii from a variety of solar instruments. Source for space weather, solar variability, and basic solar physics
Other content used too - Center for Integrated Space Weather Modeling
16
Semantic Web Methodology and
Technology Development Process
• Establish and improve a well-defined methodology vision for Semantic Technology based application development
• Leverage controlled vocabularies, etc.
[Diagram: iterative development cycle]
Use Case -> Small Team, mixed skills -> Analysis -> Develop model/ontology -> Use Tools -> Science/Expert Review & Iteration -> Adopt Technology Approach -> Leverage Technology Infrastructure -> Rapid Prototype -> Open World: Evolve, Iterate, Redesign, Redeploy

17
Science and technical use cases
Find data which represents the state of the neutral
atmosphere anywhere above 100km and toward the
arctic circle (above 45N) at any time of high
geomagnetic activity.
– Extract information from the use-case - encode knowledge
– Translate this into a complete query for data - inference and
integration of data from instruments, indices and models

Provide semantically-enabled, smart data query services
via a SOAP web service for the Virtual Ionosphere-Thermosphere-Mesosphere Observatory that retrieve
data, filtered by constraints on Instrument, Date-Time,
and Parameter in any order and with constraints
included in any combination.
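
A query that accepts constraints on Instrument, Date-Time and Parameter in any order and in any combination can be sketched as below. This is a plain in-memory filter under stated assumptions, not the SOAP service itself: the `DataProduct` record, the `query` function and the catalog entries are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DataProduct:
    instrument: str
    parameter: str
    start: datetime
    end: datetime


def query(products, instrument=None, parameter=None, time_range=None):
    """Return products matching every constraint that was actually supplied."""
    matches = []
    for product in products:
        if instrument is not None and product.instrument != instrument:
            continue
        if parameter is not None and product.parameter != parameter:
            continue
        if time_range is not None:
            t0, t1 = time_range
            # Skip products whose time coverage does not overlap [t0, t1].
            if product.end < t0 or product.start > t1:
                continue
        matches.append(product)
    return matches


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    DataProduct("FPI", "NeutralTemperature", datetime(2005, 1, 1), datetime(2005, 1, 31)),
    DataProduct("FPI", "NeutralWind", datetime(2005, 2, 1), datetime(2005, 2, 28)),
    DataProduct("ISR", "NeutralTemperature", datetime(2005, 1, 15), datetime(2005, 3, 1)),
]
```

Because every constraint is an optional keyword argument, `query(CATALOG, parameter="NeutralTemperature")` and `query(CATALOG, time_range=(t0, t1), instrument="FPI")` are equally valid calls.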
18
VSTO - semantics and ontologies in an operational
environment: vsto.hao.ucar.edu, www.vsto.org

Web Service

19
Fox RPI: Semantic Data Frameworks May 14, 2008
Semantic filtering by
domain or instrument
hierarchy

Partial exposure of
Instrument
class
hierarchy - users
seem to LIKE THIS
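
Filtering by an instrument class hierarchy amounts to walking subclass links. The sketch below assumes a small, hypothetical hierarchy and hypothetical instrument names; an operational system would hold these in an ontology and use a reasoner rather than a dictionary.

```python
# Hypothetical fragment of an instrument class hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "OpticalInstrument": "Instrument",
    "Radar": "Instrument",
    "Interferometer": "OpticalInstrument",
    "FabryPerotInterferometer": "Interferometer",
    "IncoherentScatterRadar": "Radar",
}

# Hypothetical instruments and the most specific class each belongs to.
INSTRUMENTS = {
    "fpi_a": "FabryPerotInterferometer",
    "isr_b": "IncoherentScatterRadar",
    "cam_c": "OpticalInstrument",
}


def is_a(cls, ancestor):
    """True if cls equals ancestor or is a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def filter_by_class(instruments, ancestor):
    """Names of the instruments whose class falls under the chosen class."""
    return [name for name, cls in instruments.items() if is_a(cls, ancestor)]
```

Selecting `"OpticalInstrument"` returns both the Fabry-Perot interferometer and the generic optical instrument, without the user listing subclasses by hand.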

20
21
Inferred plot type
and return formats
for data products

22
Inferred plot type
and return required
axes data
23
Semantic Web Benefits
• Unified/abstracted query workflow: Parameters, Instruments, Date-Time
• Decreased input requirements for query: in one case reducing the number of selections from eight to three
• Generates only syntactically correct queries: which could not always be ensured in previous implementations without semantics
• Semantic query support: by using background ontologies and a reasoner, our application has the opportunity to expose only coherent queries (portal and services)
• Semantic integration: in the past users had to remember (and maintain codes) to account for numerous different ways to combine and plot the data, whereas now semantic mediation provides the level of sensible data integration required, now exposed as smart web services
– understanding of coordinate systems, relationships, data synthesis, transformations, etc.
– returns independent variables and related parameters
• A broader range of potential users (PhD scientists, students, professional research associates and those from outside the fields)
24
What is a Non-Specialist Use Case?

Teacher accesses the internet, goes to an Educational Virtual Observatory and enters a search for “Aurora”.

Someone should be able to query a virtual observatory without having specialist knowledge

25
What should the User Receive?
Teacher receives four groupings of search results:
1) Educational materials:
http://www.meted.ucar.edu/topics_spacewx.php
and http://www.meted.ucar.edu/hao/aurora/
2) Research, data and tools: via VSTO, VSPO and
VITMO, knows to search for brightness, or green/red
line emission
3) Did you know?: Aurora is a phenomenon of the
upper terrestrial atmosphere (ionosphere) also
known as Northern Lights
4) Did you mean?: Aurora Borealis or Aurora
Australis, etc.

26
Semantic Information Integration:
Concept map for educational use of
science data in a lesson plan

27
28
Issues for Virtual Observatories

• Scaling to large numbers of data providers and redefining the role(s)/relations with them
• Crossing discipline boundaries
• Security, access to resources, policies
• Branding and attribution (where did this data come from and who gets the credit, is it the correct version, is this an authoritative source?)
• Provenance/derivation (propagating key information as it passes through a variety of services, copies of processing algorithms, …)
• Data quality, preservation, stewardship
These are currently burden areas for users
29
Problem definition
• Data is coming in faster, in greater volumes and outstripping our ability to perform adequate quality control
• Data is being used in new ways and we frequently do not have sufficient information on what happened to the data along the processing stages to determine if it is suitable for a use we did not envision
• We often fail to capture, represent and propagate manually generated information that needs to go with the data flows
• Each time we develop a new instrument, we develop a new data ingest procedure and collect different metadata and organize it differently. It is then hard to use with previous projects
• The task of event determination and feature classification is onerous and we don't do it until after we get the data
30
Use cases
• Determine which flat field calibration was applied to the image taken on January 26, 2005 around 2100UT by the ACOS Mark IV polarimeter.
• Which flat-field algorithm was applied to the set of images taken during the period November 1, 2004 to February 28, 2005?
• How many different data product types can be generated from the ACOS CHIP instrument?
• What images comprised the flat field calibration image used on January 26, 2007 for all ACOS CHIP images?
• What processing steps were completed to obtain the ACOS PICS limb image of the day for January 26, 2005?
• Who (person or program) added the comments to the science data file for the best vignetted, rectangular polarization brightness image from January 26, 2005, 1849:09UT taken by the ACOS Mark IV polarimeter?
• What were the cloud cover and atmospheric seeing conditions during the local morning of January 26, 2005 at MLSO?
• Find all good images on March 21, 2008.
• Why are the quick look images from March 21, 2008, 1900UT missing?
• Why does this image look bad?
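
Questions like these become answerable once each data product carries a record of its processing steps. The following is a minimal sketch assuming a hypothetical record layout; the class names, step names, file identifiers and agents are illustrative and are not taken from the ACOS pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessingStep:
    name: str    # e.g. "flat_field_calibration"
    agent: str   # person or program that performed the step
    inputs: list = field(default_factory=list)  # identifiers of input files


@dataclass
class ProvenanceRecord:
    product_id: str
    steps: list = field(default_factory=list)  # ordered ProcessingStep entries


def steps_named(record, name):
    """All steps of a given kind applied to this product, in order."""
    return [step for step in record.steps if step.name == name]


def agents_for(record, name):
    """Who (person or program) performed steps of a given kind."""
    return sorted({step.agent for step in steps_named(record, name)})


# Hypothetical record for one image, for illustration only.
RECORD = ProvenanceRecord(
    product_id="image-0001",
    steps=[
        ProcessingStep("ingest", "ingest_program"),
        ProcessingStep("flat_field_calibration", "calibration_program", ["flat-a", "flat-b"]),
        ProcessingStep("add_comment", "observer"),
    ],
)
```

With such a record, "which images comprised the flat field calibration" is `steps_named(RECORD, "flat_field_calibration")[0].inputs`, and "who added the comments" is `agents_for(RECORD, "add_comment")`.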
31
Provenance
• Origin or source from which something
comes, intention for use, who/what
generated for, manner of manufacture,
history of subsequent owners, sense of
place and time of manufacture, production
or discovery, documented in detail
sufficient to allow reproducibility

32
33
34
35
Visual browse

36
37
38
Discussion (1)
• Taken together, an emerging set of collected
experience manifests an emerging informatics
core capability that is starting to take data
intensive science into a new realm of realizability
and potentially, sustainability
– Use cases (i.e. real users)
– X-informatics
– Core Informatics
– Cyber Informatics

• There are implications for data models
39
Progression after progression
Informatics
IT Cyberinfrastructure -> Cyber Informatics -> Core Informatics -> Science Informatics -> Science, SBAs

Example:
•CI = OPeNDAP server running over HTTP/HTTPS
•Cyberinformatics = Data (product) and service ontologies, triple store
•Core informatics = Reasoning engine (Pellet), OWL
•Science (X) informatics = Use cases, science domain terms, concepts in
an ontology
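
At the cyberinfrastructure layer of this example, a client typically asks an OPeNDAP server for a slice of one variable by appending a constraint expression to the dataset URL. The sketch below only builds such a URL using the DAP2 hyperslab form `[start:stride:stop]`; the function name, the server address and the variable name are assumptions for illustration.

```python
def opendap_request_url(dataset_url, variable, start, stop, stride=1, response="ascii"):
    """Build a DAP2-style request URL for one hyperslab of a 1-D variable."""
    if start < 0 or stop < start or stride < 1:
        raise ValueError("need 0 <= start <= stop and stride >= 1")
    # DAP2 hyperslab syntax is variable[start:stride:stop], indices inclusive.
    return f"{dataset_url}.{response}?{variable}[{start}:{stride}:{stop}]"


# Hypothetical dataset location and variable name.
EXAMPLE_URL = opendap_request_url(
    "http://example.org/opendap/data.nc", "temperature", 0, 9
)
```

The layers above it (ontologies, reasoner, use cases) decide which dataset and variable to request; this layer only moves the bytes.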

40
Discussion (2)
• Data and information science is becoming
the ‘fourth’ column (along with theory,
experiment and computation)
• Semantics (of the data) are a very key
ingredient -> may imply richer data models

41
Summary
• Informatics is playing a key role in filling the gap
between science (and the spectrum of non-expert)
use and generation and the underlying
cyberinfrastructure, i.e. in shifting the burden
– This is evident due to the emergence of Xinformatics
(world-wide)

• Our experience is implementing informatics as
semantics in Virtual Observatories (as a working
paradigm) and Grid environments
– VSTO is only one example of success
– Data mining, data integration, smart search, provenance
are close behind

• Informatics is a profession and a community activity
and requires efforts in all 3 sub-areas (science, core,
cyber) and must be synergistic
42
More Information
• Virtual Solar Terrestrial Observatory (VSTO):
http://vsto.hao.ucar.edu, http://www.vsto.org
• Semantically-Enabled Science Data Integration (SESDI):
http://sesdi.hao.ucar.edu
• Semantic Provenance Capture in Data Ingest Systems
(SPCDIS): http://spcdis.hao.ucar.edu
• Semantic Knowledge Integration Framework (SKIF/SAM):
http://skif.hao.ucar.edu
• Semantic Web for Earth and Environmental Terminology
(SWEET): http://sweet.jpl.nasa.gov
• Conferences: AGU 2008, EGU 2009, ISWC 2008, CIKM
2008, …
• Peter Fox pfox@ucar.edu
43

NIST Big Data Public Working Group NBD-PWG
 
Linked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter HaaseLinked Data and Semantic Web Application Development by Peter Haase
Linked Data and Semantic Web Application Development by Peter Haase
 
NIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data CommonsNIH Data Summit - The NIH Data Commons
NIH Data Summit - The NIH Data Commons
 
Urm concept for sharing information inside of communities
Urm concept for sharing information inside of communitiesUrm concept for sharing information inside of communities
Urm concept for sharing information inside of communities
 
FAIR play?
FAIR play? FAIR play?
FAIR play?
 

Shifting the Burden from User to Data Provider

  • 1. Shifting the Burden from the User to the Data Provider. Peter Fox, High Altitude Observatory, NCAR (***). With thanks to eGY and various NSF, DoE and NASA projects
  • 2. Outline • Background, definitions • Informatics -> e-Science • Data has lots of uses – Virtual Observatories: use cases – Data Framework: Examples – Data ingest, integration, mining and … • Discussion. Fox HDF: Semantic Data Burden Shift Oct 15, 2008
  • 3. Background Scientists should be able to access a global, distributed knowledge base of scientific data that: • appears to be integrated • appears to be locally available. But… data is obtained by multiple instruments, using various protocols, in differing vocabularies, using (sometimes unstated) assumptions, with inconsistent (or non-existent) meta-data. It may be inconsistent, incomplete, evolving, and distributed. And… there exist(ed) significant levels of semantic heterogeneity, large-scale data, complex data types, legacy systems, inflexible and unsustainable implementation technology…
  • 4. Information has Lots of Audiences (figure: audiences ranged from More Strategic to Less Strategic). But data products have SCIENTISTS TOO. From “Why EPO?”, a NASA internal report on science education, 2005
  • 5. The Information Era: Interoperability. Modern information and communications technologies are creating an “interoperable” information era in which ready access to data and information can be truly universal. Open access to data and services enables us to meet the new challenges of understanding the Earth and its space environment as a complex system: • managing and accessing large data sets • higher space/time resolution capabilities • rapid response requirements • data assimilation into models • crossing disciplinary boundaries.
  • 6. Shifting the Burden from the User to the Provider
  • 7. Modern capabilities
  • 8. Mind the Gap! As a result - finding out who is doing what, sharing experience/expertise, and substantial coordination: • There is/was still a gap between science and the underlying infrastructure and technology that is available • Cyberinfrastructure is the new research environment(s) that support advanced data acquisition, data storage, data management, data integration, data mining, data visualization and other computing and information processing services over the Internet. • Informatics (information science) includes the science of (data and) information, the practice of information processing, and the engineering of information systems. Informatics studies the structure, behavior, and interactions of natural and artificial systems that store, process and communicate (data and) information. It also develops its own conceptual and theoretical foundations. Since computers, individuals and organizations all process information, informatics has computational, cognitive and social aspects, including study of the social impact of information technologies. (Wikipedia)
  • 9. Progression after progression (figure labels: IT; Cyber Infrastructure; Informatics: Cyber Informatics, Core Informatics, Science Informatics, aka Xinformatics; Science, SBAs)
  • 10. Virtual Observatories • Conceptual examples: • In-situ: Virtual measurements – Related measurements • Remote sensing: Virtual, integrative measurements – Data integration • Managing virtual data products/sets
  • 11. Virtual Observatories. Make data and tools quickly and easily accessible to a wide audience. Operationally, virtual observatories need to find the right balance of data/model holdings, portals and client software that researchers can use without effort or interference, as if all the materials were available on their local computer in the user’s preferred language: i.e. appear to be local and integrated. Likely to provide controlled vocabularies that may be used for interoperation in appropriate domains, along with database interfaces for access and storage and “smart” tools for evolution and maintenance.
  • 12. Early days of discipline-specific VOs (figure labels: VO1, VO2, VO3, ?; DB1, DB2, DB3, …, DBn)
  • 13. The Astronomy approach; datatypes as a service. Figure: VO App1, VO App2 and VO App3 above a VO layer (VOTable, Simple Image Access Protocol, Simple Spectrum Access Protocol, Simple Time Access Protocol) above DB1, DB2, DB3, …, DBn. Callouts: Limited interoperability; Lightweight semantics; Limited meaning, hard coded; Limited extensibility; Under review. Open Geospatial Consortium: Web {Feature, Coverage, Mapping} Service and Sensor Web Enablement: Sensor {Observation, Planning, Analysis} Service use the same approach
  • 14. Figure: layered architecture, top to bottom: Education, clearinghouses, other services, disciplines, etc.; Semantic mediation layer - mid-upper-level; VO Portal, Web Serv., VO API; Semantic mediation layer - VSTO - low level; Metadata, schema, data; DB1, DB2, DB3, …, DBn. Added value: Semantic interoperability; Semantic query, hypothesis and inference; Query, access and use of data. Mediation Layer • Ontology - capturing concepts of Parameters, Instruments, Date/Time, Data Product (and associated classes, properties) and Service Classes • Maps queries to underlying data • Generates access requests for metadata, data • Allows queries, reasoning, analysis, new hypothesis generation, testing, explanation, etc.
  • 15. Content: Coupling Energetics and Dynamics of Atmospheric Regions WEB (CEDARWEB). Community data archive for observations and models of Earth's upper atmosphere and geophysical indices and parameters needed to interpret them. Includes browsing capabilities by periods, > 310 instruments, models, > 820 parameters…
  • 16. Content: Mauna Loa Solar Observatory. Near real-time data products from Hawaii from a variety of solar instruments. Source for space weather, solar variability, and basic solar physics. Other content used too - Center for Integrated Space Weather Modeling
  • 17. Semantic Web Methodology and Technology Development Process • Establish and improve a well-defined methodology vision for Semantic Technology based application development • Leverage controlled vocabularies, etc. (figure: iterative cycle with the labels Use Case; Small Team, mixed skills; Analysis; Develop model/ontology; Use Tools; Science/Expert Review & Iteration; Adopt Technology Approach; Leverage Technology Infrastructure; Rapid Prototype; Open World: Evolve, Iterate, Redesign, Redeploy)
  • 18. Science and technical use cases. Find data which represents the state of the neutral atmosphere anywhere above 100km and toward the arctic circle (above 45N) at any time of high geomagnetic activity. – Extract information from the use-case - encode knowledge – Translate this into a complete query for data - inference and integration of data from instruments, indices and models. Provide semantically-enabled, smart data query services via a SOAP web service for the Virtual Ionosphere-Thermosphere-Mesosphere Observatory that retrieve data, filtered by constraints on Instrument, Date-Time, and Parameter in any order and with constraints included in any combination.
  • 19. VSTO - semantics and ontologies in an operational environment: vsto.hao.ucar.edu, www.vsto.org. Web Service. Fox RPI: Semantic Data Frameworks May 14, 2008
  • 20. Semantic filtering by domain or instrument hierarchy. Partial exposure of Instrument class hierarchy - users seem to LIKE THIS
  • 21. 21
  • 22. Inferred plot type and return formats for data products
  • 23. Inferred plot type and return required axes data
  • 24. Semantic Web Benefits • Unified/abstracted query workflow: Parameters, Instruments, Date-Time • Decreased input requirements for query: in one case reducing the number of selections from eight to three • Generates only syntactically correct queries: which could not always be ensured in previous implementations without semantics • Semantic query support: by using background ontologies and a reasoner, our application has the opportunity to expose only coherent queries (portal and services) • Semantic integration: in the past users had to remember (and maintain codes) to account for numerous different ways to combine and plot the data, whereas now semantic mediation provides the level of sensible data integration required, now exposed as smart web services – understanding of coordinate systems, relationships, data synthesis, transformations, etc. – returns independent variables and related parameters • A broader range of potential users (PhD scientists, students, professional research associates and those from outside the fields)
  • 25. What is a Non-Specialist Use Case? A teacher accesses the internet, goes to an Educational Virtual Observatory and enters a search for “Aurora”. Someone should be able to query a virtual observatory without having specialist knowledge.
  • 26. What should the User Receive? Teacher receives four groupings of search results: 1) Educational materials: http://www.meted.ucar.edu/topics_spacewx.php and http://www.meted.ucar.edu/hao/aurora/ 2) Research, data and tools: via VSTO, VSPO and VITMO, knows to search for brightness, or green/red line emission 3) Did you know?: Aurora is a phenomenon of the upper terrestrial atmosphere (ionosphere), also known as Northern Lights 4) Did you mean?: Aurora Borealis or Aurora Australis, etc.
  • 27. Semantic Information Integration: Concept map for educational use of science data in a lesson plan
  • 28. 28
  • 29. Issues for Virtual Observatories • Scaling to large numbers of data providers and redefining the role(s)/relations with them • Crossing discipline boundaries • Security, access to resources, policies • Branding and attribution (where did this data come from and who gets the credit, is it the correct version, is this an authoritative source?) • Provenance/derivation (propagating key information as it passes through a variety of services, copies of processing algorithms, …) • Data quality, preservation, stewardship. Overlay: These are currently a burden for users
  • 30. Problem definition • Data is coming in faster, in greater volumes and outstripping our ability to perform adequate quality control • Data is being used in new ways and we frequently do not have sufficient information on what happened to the data along the processing stages to determine if it is suitable for a use we did not envision • We often fail to capture, represent and propagate manually generated information that needs to go with the data flows • Each time we develop a new instrument, we develop a new data ingest procedure and collect different metadata and organize it differently. It is then hard to use with previous projects • The task of event determination and feature classification is onerous and we don't do it until after we get the data
  • 31. Use cases • Determine which flat field calibration was applied to the image taken on January 26, 2005 around 2100UT by the ACOS Mark IV polarimeter. • Which flat-field algorithm was applied to the set of images taken during the period November 1, 2004 to February 28, 2005? • How many different data product types can be generated from the ACOS CHIP instrument? • What images comprised the flat field calibration image used on January 26, 2007 for all ACOS CHIP images? • What processing steps were completed to obtain the ACOS PICS limb image of the day for January 26, 2005? • Who (person or program) added the comments to the science data file for the best vignetted, rectangular polarization brightness image from January 26, 2005 1849:09UT taken by the ACOS Mark IV polarimeter? • What were the cloud cover and atmospheric seeing conditions during the local morning of January 26, 2005 at MLSO? • Find all good images on March 21, 2008. • Why are the quick look images from March 21, 2008, 1900UT missing? • Why does this image look bad?
  • 32. Provenance • Origin or source from which something comes, intention for use, who/what generated for, manner of manufacture, history of subsequent owners, sense of place and time of manufacture, production or discovery, documented in detail sufficient to allow reproducibility
  • 33. 33
  • 34. 34
  • 35. 35
  • 37. 37
  • 38. 38
  • 39. Discussion (1) • Taken together, an emerging set of collected experience manifests an emerging informatics core capability that is starting to take data intensive science into a new realm of realizability and potentially, sustainability – Use cases (i.e. real users) – X-informatics – Core Informatics – Cyber Informatics • There are implications for data models
  • 40. Progression after progression (figure labels: IT; Cyber Infrastructure; Informatics: Cyber Informatics, Core Informatics, Science Informatics; Science, SBAs). Example: • CI = OPeNDAP server running over HTTP/HTTPS • Cyberinformatics = Data (product) and service ontologies, triple store • Core informatics = Reasoning engine (Pellet), OWL • Science (X) informatics = Use cases, science domain terms, concepts in an ontology
  • 41. Discussion (2) • Data and information science is becoming the ‘fourth’ column (along with theory, experiment and computation) • Semantics (of the data) are a key ingredient -> may imply richer data models
  • 42. Summary • Informatics is playing a key role in filling the gap between science (and the spectrum of non-expert) use and generation and the underlying cyberinfrastructure, i.e. in shifting the burden – This is evident due to the emergence of Xinformatics (world-wide) • Our experience is in implementing informatics as semantics in Virtual Observatories (as a working paradigm) and Grid environments – VSTO is only one example of success – Data mining, data integration, smart search, provenance are close behind • Informatics is a profession and a community activity and requires efforts in all 3 sub-areas (science, core, cyber) and must be synergistic
  • 43. More Information • Virtual Solar Terrestrial Observatory (VSTO): http://vsto.hao.ucar.edu, http://www.vsto.org • Semantically-Enabled Science Data Integration (SESDI): http://sesdi.hao.ucar.edu • Semantic Provenance Capture in Data Ingest Systems (SPCDIS): http://spcdis.hao.ucar.edu • Semantic Knowledge Integration Framework (SKIF/SAM): http://skif.hao.ucar.edu • Semantic Web for Earth and Environmental Terminology (SWEET): http://sweet.jpl.nasa.gov • Conferences: AGU 2008, EGU 2009, ISWC 2008, CIKM 2008, … • Peter Fox pfox@ucar.edu
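
The query service described on slide 18 (constraints on Instrument, Date-Time and Parameter, in any order and in any combination), together with the instrument class hierarchy shown on slide 20, can be sketched roughly as follows. This is a minimal Python illustration under stated assumptions, not VSTO's implementation: the instrument classes, the hierarchy, the catalog entries and the function names (`is_a`, `find_products`) are all hypothetical, and a plain dictionary stands in for an ontology and reasoner.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical instrument class hierarchy (child -> parent), standing in
# for the instrument classes an ontology would define.
INSTRUMENT_PARENT = {
    "FabryPerot": "Interferometer",
    "Interferometer": "OpticalInstrument",
    "IncoherentScatterRadar": "Radar",
}


@dataclass(frozen=True)
class DataProduct:
    name: str
    instrument: str
    parameter: str
    start: datetime
    end: datetime


def is_a(instrument, ancestor):
    """True if `instrument` is `ancestor` or one of its descendants."""
    while instrument is not None:
        if instrument == ancestor:
            return True
        instrument = INSTRUMENT_PARENT.get(instrument)
    return False


def find_products(catalog, instrument=None, parameter=None, when=None):
    """Filter by any combination of constraints; omitted ones match all."""
    results = []
    for product in catalog:
        if instrument is not None and not is_a(product.instrument, instrument):
            continue
        if parameter is not None and product.parameter != parameter:
            continue
        if when is not None and not (product.start <= when <= product.end):
            continue
        results.append(product)
    return results


# Illustrative catalog entries only; not real holdings.
CATALOG = [
    DataProduct("fpi_winds_2005", "FabryPerot", "NeutralWind",
                datetime(2005, 1, 1), datetime(2005, 12, 31)),
    DataProduct("isr_temps_2005", "IncoherentScatterRadar", "NeutralTemperature",
                datetime(2005, 1, 1), datetime(2005, 12, 31)),
    DataProduct("fpi_temps_2004", "FabryPerot", "NeutralTemperature",
                datetime(2004, 1, 1), datetime(2004, 12, 31)),
]
```

Because the constraints are keyword arguments, a caller can supply them in any order, and asking for a broad class such as `OpticalInstrument` also matches products from its subclasses. That is the kind of burden the slides propose moving from the user to the provider.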

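The provenance use cases on slide 31 (which calibration was applied, what processing steps were completed, who added a comment) all amount to walking back through a recorded processing history of the kind slide 32 defines. The sketch below shows one way such a record could be captured and queried. It is an assumption-laden illustration, not the SPCDIS design: the `Step` record, the artifact identifiers and the agent names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str      # processing step, e.g. "flat_field"
    agent: str     # person or program that performed the step
    inputs: list   # ids of the artifacts consumed
    output: str    # id of the artifact produced


def lineage(steps, artifact_id):
    """Return the steps that led to `artifact_id`, inputs before consumers."""
    by_output = {step.output: step for step in steps}
    ordered, seen = [], set()

    def visit(aid):
        step = by_output.get(aid)
        if step is None or aid in seen:
            return  # raw input with no recorded step, or already handled
        seen.add(aid)
        for input_id in step.inputs:
            visit(input_id)
        ordered.append(step)

    visit(artifact_id)
    return ordered


# Hypothetical processing history for one image.
STEPS = [
    Step("dark_subtract", "pipeline.py", ["raw_001"], "dark_001"),
    Step("build_flat", "pipeline.py", ["flatframe_a", "flatframe_b"], "flat_v2"),
    Step("flat_field", "pipeline.py", ["dark_001", "flat_v2"], "cal_001"),
    Step("annotate", "observer", ["cal_001"], "final_001"),
]

history = lineage(STEPS, "final_001")
# "What processing steps were completed?"
step_names = [step.name for step in history]
# "Which flat field calibration was applied?"
flats_used = [i for step in history if step.name == "flat_field"
              for i in step.inputs if i.startswith("flat")]
# "Who (person or program) added the comments?"
annotators = [step.agent for step in history if step.name == "annotate"]
```

The point of the design is that each step records its inputs and agent when it runs, so these questions become lookups rather than detective work after the fact.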
Editor's Notes

  1. There are lots of different kinds of audiences interested in data. While we are talking about using data in the classroom today, several other audiences are of importance to virtual observatories. In particular, on the more strategic end are groups that, while smaller, have great impact on the public’s and the government’s perception of the value of the data and its providers. In this category, I would place both science policy specialists and the media. Policy specialists and decision makers have a tremendous impact on budgets, but also feel, at least at some level, beholden to the tax payers. They want to see the impact that data has on people’s lives. They are also looking for information that will help them make an informed decision. In addition, the media plays a critical role, providing about 85% of the science content to the general public. A third group that is worth considering is the educated general public (the science-attentive public). They take science very seriously and can be a vocal advocate for a scientific resource -- look at the Hubble scenario as an example.
  2. Interoperability technologies have a 20 year history of development and are now mature. Combined with our growing ability to transmit large amounts of information efficiently (Internet, GRID), this provides us with an unprecedented ability to address new and old scientific problems in ways that were hitherto impossible. The challenge now in the geosciences is to capitalise on this new capability. e-Science initiatives are growing up in centers around the world in response to this opportunity.
  3. In the “heroic” era of science, the provider of data plays a relatively passive role. The onus is on the user to identify each data source, accumulate the required data, and prepare the data for assimilation and analysis. A similar process applies for acquiring software for analysis, modeling, and visualisation. In many cases, the user and the provider are the same person. In an interoperable e-Science world, the provider has to put much more work into describing the structure and content of data and information, and someone has to provide and support Web Services. The user is relieved of these burdens and benefits accordingly. The overall reduction in work load is enormous, but the provider does not see that.
  4. This presentation is a template to be used by anyone as a basis for an introductory eGY presentation - please use it and modify it for your particular audience. A collection of .ppt files from past presentations are on the website: www.egy.org/resources. Use any you wish. Notes accompany each slide, so the presentation should be reviewed under “View: Normal”, or perhaps “View: Notes”. eGY Development Team July 2006
  5. VITMO (ITM); VMO (magnetospheric - east); VMO (magnetospheric - west); VIRBO (radiation belt); VHO (heliospheric); Madrigal (“VISRO”, …); VSPO (space-physics); VGMO (geo-magnetic); GAIA (auroral); and lots more…
  6. CISM - http://www.bu.edu/cism; MLSO - http://mlso.hao.ucar.edu; CEDAR - http://cedarweb.hao.ucar.edu