The document outlines plans for improving access to geoscience data through the EarthCube initiative. It summarizes discussions from a workshop focused on developing use cases and standardizing the data lifecycle. The workshop highlighted that access is crucial to research and identified challenges around social and technical issues. It also described plans for a "DataSpace" platform to enable open, extensible sharing of data and services and emphasized usability over infrastructure development. Lastly, it discussed next steps from concept meetings focused on converging roadmaps, architectures, and future directions.
GeoDataspace: Simplifying Data Management Tasks with Globus (Tanu Malik)
This document describes GeoDataspace, a framework for enabling data and model sharing in computational geosciences. GeoDataspace uses geounits, which package code, data, and environment, to capture scientific activities and research outputs. The Globus Catalog provides a flexible metadata catalog for hosting and querying geounits. GeoDataspace aims to improve reproducibility and enable validation of shared models and data through re-executing geounits. Several geoscience applications including plate tectonics, hydrology, and space science are discussed as potential adopters.
The document proposes a reproducible framework powered by Globus to help researchers share and reproduce scientific models and simulations. It describes challenges with current methods for sharing work, like dependencies or configuration issues, that prevent easy reproducibility. The framework aims to [1] capture scientific activities, code, data and environments; [2] preserve them as standardized packages called "SciUnits"; [3] share and distribute SciUnits so others can [4] re-execute and re-analyze the work without installation or configuration problems. Key components are outlined for establishing this framework to support reproducibility across different scientific domains.
Benchmarking Cloud-based Tagging Services (Tanu Malik)
This document presents a framework for benchmarking cloud-based tagging services. It proposes a tagging data model and generates workloads using a Markov chain process. Experiments were conducted using OLTP-Bench to evaluate the performance of MySQL, Postgres and SQL Server under different tagging workloads and schemas on Amazon EC2. The results show that for sparse datasets, a vertically partitioned schema outperforms horizontal schemas, and the best database depends on the sparseness of the data. The benchmark aims to help determine the most efficient cloud platform for supporting dynamic, sparse tagging data.
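To make the schema trade-off concrete, here is a small self-contained sketch (not the paper's benchmark code) that contrasts a horizontal table, where sparse tags leave most columns NULL, with a vertical entity-attribute-value table, and that generates correlated tags with a tiny first-order Markov chain. The table names, tags, and transition weights are invented for illustration.

```python
# Horizontal vs. vertical schemas for sparse tags, plus a Markov-chain
# workload generator. All names and weights are illustrative.
import random
import sqlite3

conn = sqlite3.connect(":memory:")

# Horizontal schema: one column per tag; sparse data leaves most columns NULL.
conn.execute("CREATE TABLE items_h (id INTEGER PRIMARY KEY, color TEXT, size TEXT, owner TEXT)")

# Vertical (entity-attribute-value) schema: one row per (item, tag) pair.
conn.execute("CREATE TABLE items_v (id INTEGER, tag TEXT, value TEXT)")

# First-order Markov chain over tag keys: the next tag applied depends on
# the previous one, mimicking correlated tagging behavior.
transitions = {
    "color": {"size": 0.6, "owner": 0.4},
    "size": {"color": 0.3, "owner": 0.7},
    "owner": {"color": 0.5, "size": 0.5},
}

def tag_sequence(start="color", n=3):
    seq, state = [start], start
    for _ in range(n - 1):
        options = transitions[state]
        state = random.choices(list(options), weights=list(options.values()))[0]
        seq.append(state)
    return seq

for item_id in range(100):
    for tag in tag_sequence():
        conn.execute("INSERT INTO items_v VALUES (?, ?, ?)", (item_id, tag, "v"))

print(conn.execute("SELECT COUNT(*) FROM items_v").fetchone())
```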
GEN: A Database Interface Generator for HPC Programs (Tanu Malik)
GEN is a database interface generator that takes user-supplied C declarations and provides an interface to load scientific array data into databases without requiring changes to source code. It works by wrapping POSIX I/O calls at runtime to generate database schema definitions and load data. Experiments show it can reduce the time needed to reorganize data in the database compared to loading data from files and reorganizing outside the database. Current work aims to relax GEN's assumptions and improve data loading performance.
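GEN itself interposes on POSIX I/O calls in C programs; as a hedged illustration of just the schema-generation step, the sketch below maps a user-supplied C struct declaration to a SQL CREATE TABLE statement. The type mapping, regular expression, and struct are simplifications invented for this example, not GEN's code.

```python
# Derive a relational schema from a C struct declaration (illustrative only).
import re

C_TO_SQL = {"int": "INTEGER", "long": "BIGINT", "float": "REAL",
            "double": "DOUBLE PRECISION", "char": "TEXT"}

def schema_from_struct(decl: str) -> str:
    table = re.search(r"struct\s+(\w+)", decl).group(1)
    cols = [f"{field} {C_TO_SQL[ctype]}"
            for ctype, field in re.findall(r"(int|long|float|double|char)\s+(\w+)", decl)]
    return f"CREATE TABLE {table} ({', '.join(cols)});"

decl = """
struct particle {
    long id;
    double x;
    double y;
    float mass;
};
"""
print(schema_from_struct(decl))
# CREATE TABLE particle (id BIGINT, x DOUBLE PRECISION, y DOUBLE PRECISION, mass REAL);
```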
LDV: Light-weight Database Virtualization (Tanu Malik)
The document summarizes the Light-weight Database Virtualization (LDV) framework. LDV aims to enable easy and efficient sharing of database applications by capturing an application's execution provenance and dependencies. It uses application virtualization techniques to package the application binaries, libraries, and data. For applications that interact with a database, it also records the interactions between the application and database using system call monitoring and SQL logging. This combined provenance allows recreating the application's execution environment and replaying the database interactions to validate or reproduce results. Key components of LDV include provenance modeling, package creation with necessary files and traces, and runtime redirection to reconstruct the environment.
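As a rough sketch of the record-and-replay idea (not LDV's mechanism, which monitors system calls and the client-server protocol), the snippet below logs every SQL statement issued during a reference run and replays the log against a fresh database.

```python
# Record SQL interactions during a reference run, then replay them.
import json
import sqlite3

class LoggingConnection:
    """Wraps a connection and appends each statement to a JSON-lines trace."""
    def __init__(self, db_path, trace_path):
        self.conn = sqlite3.connect(db_path)
        self.trace = open(trace_path, "a")

    def execute(self, sql, params=()):
        self.trace.write(json.dumps({"sql": sql, "params": list(params)}) + "\n")
        return self.conn.execute(sql, params)

def replay(db_path, trace_path):
    """Re-issue the recorded statements to reconstruct the database state."""
    conn = sqlite3.connect(db_path)
    with open(trace_path) as f:
        for line in f:
            entry = json.loads(line)
            conn.execute(entry["sql"], entry["params"])
    return conn

db = LoggingConnection(":memory:", "trace.jsonl")
db.execute("CREATE TABLE runs (id INTEGER, result REAL)")
db.execute("INSERT INTO runs VALUES (?, ?)", (1, 3.14))
db.trace.flush()

fresh = replay(":memory:", "trace.jsonl")
print(fresh.execute("SELECT * FROM runs").fetchall())  # [(1, 3.14)]
```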
This document proposes an approach called PTU (Provenance-To-Use) to improve the repeatability of scientific experiments by minimizing computation time during repeatability testing. PTU builds a package containing the software, input data, and provenance trace from a reference execution. Testers can then selectively replay parts of the provenance graph using the ptu-exec tool, reducing testing time compared to full re-execution. The document describes the PTU components, including tools for auditing reference runs, building provenance packages, and selectively replaying parts of the provenance graph. Examples applying PTU to the PEEL0 and TextAnalyzer applications show reductions in testing time.
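The selective-replay idea can be sketched as a graph traversal: re-execute only the nodes downstream of the point under test, and reuse recorded outputs for everything upstream. The toy provenance graph below is invented; this shows the concept behind ptu-exec, not its implementation.

```python
# Selective replay over a toy provenance DAG (illustrative names only).
from collections import deque

edges = {"input.dat": ["clean"], "clean": ["analyze"], "analyze": ["plot"]}
topo_order = ["input.dat", "clean", "analyze", "plot"]
recorded = {"input.dat", "clean", "analyze"}  # outputs saved in the package

def downstream(node):
    """All nodes reachable from `node`, i.e., what must be re-run with it."""
    seen, queue = set(), deque([node])
    while queue:
        for nxt in edges.get(queue.popleft(), []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def replay_from(node):
    rerun = downstream(node) | {node}
    for n in topo_order:
        if n in rerun:
            print("re-executing", n)
        elif n in recorded:
            print("reusing recorded output of", n)

replay_from("analyze")  # only `analyze` and `plot` are re-executed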
Plenary talk at the international Synchrotron Radiation Instrumentation conference in Taiwan, on work with great colleagues Ben Blaiszik, Ryan Chard, Logan Ward, and others.
Rapidly growing data volumes at light sources demand increasingly automated data collection, distribution, and analysis processes, in order to enable new scientific discoveries while not overwhelming finite human capabilities. I present here three projects that use cloud-hosted data automation and enrichment services, institutional computing resources, and high-performance computing facilities to provide cost-effective, scalable, and reliable implementations of such processes. In the first, Globus cloud-hosted data automation services are used to implement data capture, distribution, and analysis workflows for Advanced Photon Source and Advanced Light Source beamlines, leveraging institutional storage and computing. In the second, such services are combined with cloud-hosted data indexing and institutional storage to create a collaborative data publication, indexing, and discovery service, the Materials Data Facility (MDF), built to support a host of informatics applications in materials science. The third integrates components of the previous two projects with machine learning capabilities provided by the Data and Learning Hub for science (DLHub) to enable on-demand access to machine learning models from light source data capture and analysis workflows, and provides simplified interfaces to train new models on data from sources such as MDF on leadership-scale computing resources. I draw conclusions about best practices for building next-generation data automation systems for future light sources.
Cytoscape Tutorial Session 1 at UT-KBRIN Bioinformatics Summit 2014 (4/11/2014) (Keiichiro Ono)
This document outlines a tutorial on biological data analysis and visualization using Cytoscape. The tutorial covers basic concepts like networks and tables in Cytoscape, data import, network analysis features, and visualization techniques. It discusses loading sample network data, calculating network statistics, filtering networks, basic search functionality, and applying visual styles. The tutorial is intended to provide a practical introduction to Cytoscape's core features through examples and demos.
Workshop: Introduction to Cytoscape at UT-KBRIN Bioinformatics Summit 2014 (4... (Keiichiro Ono)
This document summarizes a presentation given by Keiichiro Ono on the open source software platform Cytoscape. Ono introduced Cytoscape as a tool for biological network analysis and visualization. He discussed how it can integrate network and attribute data, perform network analysis functions like filtering and calculating statistics, and visualize networks through customizable layouts and visual styles. Ono also highlighted Cytoscape's ecosystem of apps that extend its functionality and its use of open standards to import a variety of network and attribute data formats.
Research Automation for Data-Driven Discovery (Globus)
This document discusses research automation and data-driven discovery. It notes that data volumes are growing much faster than computational power, creating a productivity crisis in research. However, most labs have limited resources to handle these large data volumes. The document proposes applying lessons from industry to create cloud-based science services with standardized APIs that can automate and outsource common tasks like data transfer, sharing, publishing, and searching. This would help scientists focus on their core research instead of computational infrastructure. Examples of existing services from Globus and the Materials Data Facility are presented. The goal is to establish robust, scalable, and persistent cloud platforms to help address the challenges of data-driven scientific discovery.
Scaling collaborative data science with Globus and Jupyter (Ian Foster)
The Globus service simplifies the utilization of large and distributed data on the Jupyter platform. Ian Foster explains how to use Globus and Jupyter to seamlessly access notebooks using existing institutional credentials, connect notebooks with data residing on disparate storage systems, and make data securely available to business partners and research collaborators.
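A minimal sketch of what this looks like from a notebook, following the Globus SDK for Python's documented native-app flow. The client ID, endpoint UUIDs, and paths are placeholders, and exact call signatures can vary across SDK versions.

```python
# Authenticate with Globus Auth and submit a transfer between two endpoints.
import globus_sdk

CLIENT_ID = "YOUR-NATIVE-APP-CLIENT-ID"  # placeholder

auth_client = globus_sdk.NativeAppAuthClient(CLIENT_ID)
auth_client.oauth2_start_flow()
print("Log in at:", auth_client.oauth2_get_authorize_url())
code = input("Paste authorization code: ").strip()
tokens = auth_client.oauth2_exchange_code_for_tokens(code)
transfer_token = tokens.by_resource_server["transfer.api.globus.org"]["access_token"]

tc = globus_sdk.TransferClient(
    authorizer=globus_sdk.AccessTokenAuthorizer(transfer_token))

SRC, DST = "SRC-ENDPOINT-UUID", "DST-ENDPOINT-UUID"  # placeholders
tdata = globus_sdk.TransferData(tc, SRC, DST, label="notebook demo")
tdata.add_item("/shared/results.h5", "/home/me/results.h5")
task = tc.submit_transfer(tdata)
print("Submitted transfer task:", task["task_id"])
```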
Deep learning is finding applications in science such as predicting material properties. DLHub is being developed to facilitate sharing of deep learning models, data, and code for science. It will collect, publish, serve, and enable retraining of models on new data. This will help address challenges of applying deep learning to science like accessing relevant resources and integrating models into workflows. The goal is to deliver deep learning capabilities to thousands of scientists through software for managing data, models and workflows.
Data Tribology: Overcoming Data Friction with Cloud Automation (Ian Foster)
A talk at the CODATA/RDA meeting in Gaborone, Botswana. I made the case that the biggest barriers to effective data sharing and reuse are often those associated with "data friction" and that cloud automation can be used to overcome those barriers.
The image on the first slide shows a few of the more than 20,000 active Globus endpoints.
In 2001, as early high-speed networks were deployed, George Gilder observed that “when the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances.” Two decades later, our networks are 1,000 times faster, our appliances are increasingly specialized, and our computer systems are indeed disintegrating. As hardware acceleration overcomes speed-of-light delays, time and space merge into a computing continuum. Familiar questions like “where should I compute,” “for what workloads should I design computers,” and “where should I place my computers” seem to allow for a myriad of new answers that are exhilarating but also daunting. Are there concepts that can help guide us as we design applications and computer systems in a world that is untethered from familiar landmarks like center, cloud, edge? I propose some ideas and report on experiments in coding the continuum.
Jupyter notebooks are transforming the way we look at computing, coding and problem solving. But is this the only “data scientist experience” that this technology can provide?
In this webinar, Natalino will sketch how you could use Jupyter to create interactive and compelling data science web applications and provide new ways of data exploration and analysis. In the background, these apps are still powered by well understood and documented Jupyter notebooks.
They will present an architecture composed of four parts: a Jupyter kernel gateway (server only), a Scala/Spark Jupyter kernel, a Spark cluster, and an Angular/Bootstrap web application.
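As a hedged sketch of the gateway piece of that architecture, the snippet below drives a headless Jupyter Kernel Gateway through the standard Jupyter /api/kernels REST endpoints. The gateway URL is a placeholder, and actually executing code additionally requires the websocket channels endpoint and the Jupyter message protocol, omitted here.

```python
# Start, list, and shut down kernels on a Jupyter Kernel Gateway.
import requests

GATEWAY = "http://localhost:8888"  # placeholder gateway URL

# Start a kernel (any kernel spec installed on the gateway, e.g., Spark/Scala).
kernel = requests.post(f"{GATEWAY}/api/kernels", json={"name": "python3"}).json()
print("Started kernel:", kernel["id"])

# List running kernels, as the web application's backend might.
print(requests.get(f"{GATEWAY}/api/kernels").json())

# Shut the kernel down when the session ends.
requests.delete(f"{GATEWAY}/api/kernels/{kernel['id']}")
```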
2019 03-11 Bio-IT World West GenePattern Notebook slides (Michael Reich)
The GenePattern Notebook Environment for Open Science and Reproducible Research, presentation by Michael Reich, UCSD, at BioITWorld West, San Francisco, March 11, 2019
The Discovery Cloud: Accelerating Science via Outsourcing and Automation (Ian Foster)
Director's Colloquium at Los Alamos National Laboratory, September 18, 2014.
We have made much progress over the past decade toward harnessing the collective power of IT resources distributed across the globe. In high-energy physics, astronomy, and climate, thousands work daily within virtual computing systems with global scope. But we now face a far greater challenge: Exploding data volumes and powerful simulation tools mean that many more--ultimately most?--researchers will soon require capabilities not so different from those used by such big-science teams. How are we to meet these needs? Must every lab be filled with computers and every researcher become an IT specialist? Perhaps the solution is rather to move research IT out of the lab entirely: to leverage the “cloud” (whether private or public) to achieve economies of scale and reduce cognitive load. In this talk, I explore the past, current, and potential future of large-scale outsourcing and automation for science.
Accelerating Discovery via Science Services (Ian Foster)
[A talk presented at Oak Ridge National Laboratory on October 15, 2015]
We have made much progress over the past decade toward harnessing the collective power of IT resources distributed across the globe. In big-science projects in high-energy physics, astronomy, and climate, thousands work daily within virtual computing systems with global scope. But we now face a far greater challenge: Exploding data volumes and powerful simulation tools mean that many more--ultimately most?--researchers will soon require capabilities not so different from those used by such big-science teams. How are we to meet these needs? Must every lab be filled with computers and every researcher become an IT specialist? Perhaps the solution is rather to move research IT out of the lab entirely: to develop suites of science services to which researchers can dispatch mundane but time-consuming tasks, and thus to achieve economies of scale and reduce cognitive load. I explore the past, current, and potential future of large-scale outsourcing and automation for science, and suggest opportunities and challenges for today’s researchers. I use examples from Globus and other projects to demonstrate what can be achieved.
Building Reproducible Network Data Analysis / Visualization Workflows (Keiichiro Ono)
The document discusses building reproducible network data analysis and visualization workflows using REST APIs and containerization. It aims to solve problems with complex software stacks that are difficult to set up and not reproducible. The goal is to create reproducible and scalable "dry experiments" using Docker containers, GitHub for source code sharing, Jupyter notebooks as electronic lab notebooks, and the cyREST module for the Cytoscape network analysis software. Examples of scenarios using local workstations and cloud computing are presented, as well as a demo and future plans.
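A minimal example of the cyREST piece, assuming Cytoscape is running locally with cyREST on its default port 1234; the two-node network payload is invented.

```python
# Create a small network in a running Cytoscape instance via cyREST.
import requests

BASE = "http://localhost:1234/v1"  # cyREST default

print(requests.get(BASE).json())  # server status; confirms cyREST is up

network = {
    "data": {"name": "demo network"},
    "elements": {
        "nodes": [{"data": {"id": "a"}}, {"data": {"id": "b"}}],
        "edges": [{"data": {"source": "a", "target": "b"}}],
    },
}
resp = requests.post(f"{BASE}/networks", json=network)
print("Created network SUID:", resp.json()["networkSUID"])
```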
Introduction to Biological Network Analysis and Visualization with Cytoscape ... (Keiichiro Ono)
Introduction to biological network analysis and visualization with Cytoscape (using the latest version 3.4).
This is the first half of the lecture for the Applied Bioinformatics course at TSRI.
1) Scientists at the Advanced Photon Source use the Argonne Leadership Computing Facility for data reconstruction and analysis from experimental facilities in real-time or near real-time. This provides feedback during experiments.
2) Using the Swift parallel scripting language and ALCF supercomputers like Mira, scientists can process terabytes of data from experiments in minutes rather than hours or days. This enables errors to be detected and addressed during experiments.
3) Key applications discussed include near-field high-energy X-ray diffraction microscopy, X-ray nano/microtomography, and determining crystal structures from diffuse scattering images through simulation and optimization. The workflows developed provide significant time savings and improved experimental outcomes.
Materials Data Facility: Streamlined and automated data sharing, discovery, ... (Ian Foster)
Reviews recent results from the Materials Data Facility. Thanks in particular to Ben Blaiszik, Jonathon Goff, and Logan Ward, and the Globus data search team. Some features shown here are still in beta. We are grateful to NIST for their support.
Working with Instrument Data (GlobusWorld Tour - UMich) (Globus)
This document discusses using Globus to automate the management and analysis of large scientific instrument data. It provides examples of challenges with managing large datasets from the Event Horizon Telescope and applying Globus services and automation to help address these challenges. Specific use cases discussed include building connectomes from microscopy data and applying deep learning to flag bad scanning electron microscope images. The document emphasizes that automation needs transparency, results need to be easily findable, and leveraging specialized services can help.
This document discusses using cloud computing and virtualization for scientific research. Some key points:
- Scientists can access remote sensors, share data and workflows, and store personal data in the cloud. Beginners can click to code, while experts can build complex workflows.
- Services allow publishing, finding, and binding to distributed resources through registries. Data can be queried through standards like the Simple Image Access Protocol (see the sketch after this list).
- Distributed registries from various organizations harvest metadata to enable semantic search across sky regions, identifiers, tags, vocabularies, schemas, and service descriptions.
- Tools provide code/presentation environments and access to distributed data in the cloud. Services include astronomical cross-matching and event notification through Sky
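For the Simple Image Access Protocol mentioned in this list, a query is an HTTP GET with positional parameters that returns a VOTable of matching images. A minimal sketch, with a placeholder service URL:

```python
# Query a Simple Image Access (SIA) service for FITS images around a position.
import requests

SIA_BASE = "https://example.org/sia"  # placeholder SIA service URL

params = {
    "POS": "180.0,-30.0",    # RA,Dec of the search position, in degrees
    "SIZE": "0.5",           # angular size of the search region, in degrees
    "FORMAT": "image/fits",  # restrict results to FITS images
}
resp = requests.get(SIA_BASE, params=params, timeout=30)
print(resp.headers.get("Content-Type"))  # typically a VOTable (XML)
print(resp.text[:200])
```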
We presented these slides at the NIH Data Commons kickoff meeting, showing some of the technologies that we propose to integrate in our "full stack" pilot.
This document discusses tools for distributed data analysis including Apache Spark. It is divided into three parts:
1) An introduction to cluster computing architectures like batch processing and stream processing.
2) The Python data analysis library stack including NumPy, Matplotlib, Scikit-image, Scikit-learn, Rasterio, Fiona, Pandas, and Jupyter.
3) The Apache Spark cluster computing framework and examples of its use including contexts, HDFS, telemetry, MLlib, streaming, and deployment on AWS (a minimal PySpark sketch follows this list).
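A minimal PySpark word count in the spirit of part 3, using the RDD API; the HDFS path and master URL are placeholders.

```python
# Count words across log files stored in HDFS with the Spark RDD API.
from pyspark import SparkContext

sc = SparkContext(master="local[*]", appName="wordcount-demo")

counts = (
    sc.textFile("hdfs:///data/telemetry/*.log")   # placeholder path
      .flatMap(lambda line: line.split())
      .map(lambda word: (word, 1))
      .reduceByKey(lambda a, b: a + b)
)
for word, n in counts.take(10):
    print(word, n)
sc.stop()
```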
Sensors and Crowd - Steve Liang, GeoCENS Project (Cybera Inc.)
Steve Liang, assistant professor at the University of Calgary, presented these slides as part of the Cybera Summit 2010 session, Environmental Infrastructure: The Tools and Technologies Behind Water and Resource Management.
Strategies to foster OER and OER initiatives in developing regions (@cristobalcobo)
An OER action-research project, OportUnidad, funded by the European Commission, is presented. This study, led by a partnership of European and Latin American universities, aims to increase the awareness and institutional support of OER in Latin American higher education. Based on this action-research project, the article analyses the impact of digital technologies in education, particularly regarding the generation, adoption, and dissemination of educational content.
VIVO is a semantic web application that aims to address challenges in academia such as finding collaborators, generating lists of publications and inferring expertise. It extracts and connects data from various sources using ontologies and stores it as RDF triples. VIVO versions are released periodically with new features. A collaboration of organizations contribute to its development and many institutions have adopted it to make their research more visible and connectable.
This document introduces IkeWiki, a semantic wiki for collaborative knowledge management. It describes IkeWiki's vision of aligning technology with users rather than forcing predefined workflows. The document outlines IkeWiki's interface, which combines a normal wiki interface with semantic annotations. It also describes IkeWiki's architecture of separately storing content and metadata and linking them through rendering. Several current and potential applications of IkeWiki and the related KIWI project are presented.
IASSIST 2011 presentation: Problems with our Data Citation Solution (Heather Piwowar)
IASSIST 2011 presentation: "Problems with our Data Citation Solution" by Heather Piwowar, June 1 2011 #iassist
Note that these problems are solvable! Many are currently being solved... tool fixes are underway, etc. This presentation is to raise awareness of current state and challenges ahead.
Presented at STPcon 2011 on leveraging your automated tests by putting them under CI. Covers value proposition of CI, challenges to implement, and ideas for scaling.
!"#$%&'()*&#+,$)
!"#$$%&'%()*)%
• +,-%./$0$1%2,#""-34-$%#35%'676.-%&'%8/2.&'/3#32-9:#20-5%$-26./7/-$%#$%#%
$-;#.#7-%#$$-7%2"#$$%
• <#"#32-%$,--7%$6;;&.7%$7.6276.-$%'&.%'/3#32/#"%/3$7/767/&3$%
• =>#8/3/34%7,-%$&92#""-5%[email protected]$$%+-$7A%#35%7,-%2&8;-7/7/B-%/8;"/2#7/&3$%
• C#7/34%D-54-%E635$%
• +,-%F<@%!#$-G%!./$/$%#35%7,-%H#I%'&.H#.5%
• D-54-%E635$%J #%7-$7%2#$-%'&.%'676.-%.-46"#7/&3%#35%./$0%4&B-.3#32-%&'%$,#5&H%
:#30/34%#27/B/7/-$%%
!"#$$%&'%()**%
• K;7/8#"%E635/34%#35%L3B-$78-37%@7.#7-4I%'&.%#3%L37-.3#7/&3#"%M&.7'&"/&%&'%
!&88&5/7I%N-./B#7/B-$%#35%<&35$O!#$-%@765I%&'%=NM%C-3-H#:"-$%
• !&.;&.#7-%C/$0%P#3#4-8-37O=>;-./-32-$%N6./34%())Q%!./$/$%%
• C/$0%P#3#4-8-37%'&.%/3B-$7/34%/3%=8-.4/34%P#.0-7$%L32"65/34%D-54/34%
@7.#7-4/-$%
• M&H-.%@6.4-9R$$-79:#20-5%=3-.4I%+.#5/34%C/$0%#35%C-76.3%
• R$$-$$/34%#35%D-54/34%K;-.#7/&3#"%C/$09D-54/34%@I$7-8%N&H37/8-%
!"#$$%&'%()*(%
• R%E.#8-H&.0%'&.%K;-.#7/&3#"%C/$0%P/7/4#7/&3%
• @I$7-8/2%C/$0%@#'-46#.5$%'&.%!-37.#"%!"-#./34%!&637-.;#.7/-$%
• P#3#4-8-37%&'%C-;67#7/&3#"%C/$0G%R%M.#27/2#"%R;;.,%
• @7.-$$%+-$7/34%#35%C/$0%!&37.&"$G%!,#34-$%'.&8%())S%7&%()*)%
• @7.#7-4/2%P&.74#4-%N-'#6"7%/3%E/3#32/#"%L3$7/767/&3$%%
• =8-.4/34%T/U6/5/7I%C-U6/.-8-37$%#35%7,-%L8;#27%&3%<#30/34%#35%L3$6.#32-%
• L37.&562/34%=CPG%R%!#$-%@765I%&'%#%@,/;;/34%!&8;#3I%
• L8;#27%&'%C-46"#7/&3%&3%C/$0%%
%
%
!"#$$%&'%()*V%
• +,-%=''-27/B-3-$$%&'%7,-%C-46"#7&.I%@7.-$$%+-$7/34%N/$2"&$6.-%M.&2-$$%
• <.#W/"G%L$%+,/$%+/8-%N/''-.-37X%
• N&-$%C-;67#7/&3%&'%@LEL$%.-#""I%8#77-.X%
• @7.#7-4/2%D-54/34%!,&/2-$%/3%Y,&"-$#"-%#35%C-7#/"%<#30/34%
• P#3#4/34%!6..-32I%C/$0%/3%@8#""%=8-.4/34%P#.0-7$G%R%!#$-%@765I%/3%!&$7#%C/2#%
• @,#5&H%<#30/34G%!#$-%@765/-$%#35%Z"&:#"%C-46"#7&.I%C-'&.8%
• N&-$%C/$0%P#3#4-8-37%P#77-.%7&%@,#.-,&"5-.$X%
• +,-%L8;#27%&'%=>2,#34-%+.#5-5%M.&5627$%&3%7,-%M#.#8-7-.$%&'%Z&"5%
TIMELINE
Due Date | Deliverable | Comments
July 15, 2013 | Individual topic proposal | 1-2 paragraph write-up; a discussion forum is created to share ideas
July 29, 2013 | Pitching session | 5-6 minute presentation to the class
August 3, 2013 | Group formation and decision about the topic | One-page proposal stating the objectives and proposed methodology, plus a list of team members
August 26, 2013 | An adviser is assigned | Conference call/meeting with the adviser
October 2013 | Status report | Short report stating the work done so far and the steps planned for coming months
April 21, 2014 | Draft report | A complete report in all respects; any changes at this point should be limited to non-critical components
May 5, 2014 | Final report | The structure of the report must include an 'Executive Summary' followed by a detailed report, followed by appendices; the report should be as long as it needs to be and not longer
May 20, 2014 | Presentation | 20-minute presentation followed by 20-minute Q&A
Master of Science in Risk Management Program
Strategic Capstone Workshop
May 30, 2013
Agenda
• Introductio ...
Evidence for the Pareto principle in open source software activity (Tom Mens)
This document studies the distribution of activity among contributors to open source software projects. It analyzes three projects to determine if the Pareto principle applies, where 20% of contributors account for 80% of activity. Results show activity becomes more unequally distributed over time. For two projects, a limited core group performs most commits, emails, and bug changes. Future work aims to better understand how core groups evolve and identify new active contributors.
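The measurement itself is simple to sketch: given per-contributor activity counts, compute the share of activity performed by the most active 20% of contributors. The counts below are invented example data, not the paper's.

```python
# Share of total activity contributed by the top `fraction` of contributors.
def top_share(counts, fraction=0.2):
    counts = sorted(counts, reverse=True)
    k = max(1, int(len(counts) * fraction))
    return sum(counts[:k]) / sum(counts)

commits = [420, 310, 95, 40, 22, 15, 9, 5, 3, 1]  # invented example data
print(f"Top 20% of contributors made {top_share(commits):.0%} of commits")
# -> 79% for this example, close to the 80/20 Pareto split
```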
Presentation by Heather Piwowar at Simon Fraser University in October 2012 at the SFU Research Data Repository Project Launch.
Highlights current state of research data sharing. http://www.lib.sfu.ca/node/11510
Gil Elbaz, CEO and founder of Factual, gave a talk at the 2011 Web 2.0 Conference in San Francisco. His talk was entitled: "Big Data Challenges: Getting Some."
The term "life cycle" refers to the series of stages or phases that an organism, system, or product goes through from its beginning to its end. It is a concept that can be applied to various contexts, such as biology, ecology, business, technology, and project management. Here are a few examples of life cycles:
Biological Life Cycle: In biology, the life cycle refers to the sequence of stages that an organism undergoes from birth to reproduction and eventually death. This can include processes like birth or germination, growth and development, reproduction, and death.
Product Life Cycle: The product life cycle describes the stages a product goes through from its introduction to the market until its eventual decline. These stages typically include introduction, growth, maturity, and decline. Companies monitor the product life cycle to make strategic decisions regarding marketing, production, and product development.
Project Life Cycle: The project life cycle outlines the stages involved in the management and execution of a project. These stages typically include initiation, planning, execution, monitoring and control, and closure. Each phase has specific activities and deliverables, ensuring that the project progresses in a structured and organized manner.
Ecological Life Cycle: Ecological life cycles refer to the stages that ecosystems or species go through over time. This can involve the growth and decline of populations, adaptation to environmental changes, and interactions within the ecosystem.
Human Life Cycle: The human life cycle encompasses the different stages of development and growth that individuals go through from birth to death. This includes infancy, childhood, adolescence, adulthood, and eventually old age.
Understanding life cycles is important as it provides insight into the processes and changes that occur within various systems. It allows for better planning, decision-making, and adaptation to ensure sustainable growth, effective management, and optimal utilization of resources throughout the life cycle.
Big data visualization frameworks and applications at Kitware (bigdataviz_bay)
Big data visualization frameworks and applications at Kitware
Marcus Hanwell, Technical Leader at Kitware, Inc.
March 27th 2014
Kitware develops permissively licensed open source frameworks and applications for scientific data applications, and related areas. Some of the frameworks developed by our High Performance Computing and Visualization group address current challenges in big data visualization and analysis in a number of application domains including geospatial visualization, social media, finance, chemistry, biological (phylogenetics), and climate. The frameworks used to develop solutions in these areas will be described, along with the applications and the nature of the underlying data. These solutions focus on shared frameworks providing data storage, indexing, retrieval, client-server delivery models, server-side serial and parallel data reduction, analysis, and diagnostics. Additionally, they provide mechanisms that enable server-side or client-side rendering based on the capabilities and configuration of the system.
Big Data Visualization Meetup - South Bay
http://www.meetup.com/Big-Data-Visualisation-South-Bay/
This document discusses problems with traditional scholarly publishing and proposes solutions centered around open data and transparency. It notes that traditional publishing hinders reproducibility due to lack of access to data and methods. This has led to an increasing number of non-reproducible findings and retractions. The document advocates for incentivizing the publication of data, software, workflows and other research objects to improve reproducibility and transparency. It highlights several examples where making these elements openly available improved scrutiny and identified errors in published works.
ExLibris National Library Meeting @ IFLA-Helsinki - Aug 15th 2012 (Lee Dirks)
An invited talk to 40+ directors of national libraries worldwide at the annual ExLibris member meeting at IFLA (Helsinki, Finland) on August 15th, 2012.
Knowledge Infrastructure for Global Systems Science (David De Roure)
Presentation at the First Open Global Systems Science Conference, Brussels, 8-10 November 2012
http://www.gsdp.eu/nc/news/news/date/2012/10/31/first-open-global-systems-science-conference/
This document discusses information sources for research. It identifies 6 main types of information sources: 1) libraries and information centers, 2) publications such as books, journals, magazines, and newspapers, 3) experts and knowledgeable individuals, 4) organizations and associations, 5) the internet and search engines, and 6) other media such as television, radio, and films. Search engines are further divided into major search engines, meta search engines, and directory search engines. Examples are provided for each type of information source.
The document provides guidance on improving data management practices for scientists. It discusses common mistakes made, such as not teaching or practicing good data management. The document recommends reducing workload by reusing ideas and recycling data, and provides a toolbox of resources to improve data management skills.
The document discusses research data management and provides guidance on how to manage research data. It defines research data and explains why proper data management is important, such as ensuring data quality and access. It also outlines Oxford's activities to support data management, including interviews with researchers to understand challenges and requirements. Finally, it provides recommendations on developing a data management plan and offers services available at Oxford to help with file handling, metadata, storage, sharing, and long-term preservation of research data.
How e-infrastructure can contribute to Linked Germplasm Data (Stoitsis Giannis)
This document discusses how e-infrastructure can help link germplasm data. It describes the need for e-infrastructure to overcome issues like data silos and lack of interoperability. The agINFRA approach provides services like metadata aggregation, vocabulary publishing, and APIs that can transform and expose germplasm descriptions as linked data. This will help link germplasm databases and make the data more discoverable. Next steps include developing recommendations for publishing germplasm data and deploying transformers and APIs on the agINFRA platform.
EarthCube DDMA AGU
1. A Community Roadmap for Enabling Access to Geosciences Data
Tanu Malik and Ian Foster
Computation Institute, University of Chicago and Argonne National Laboratory
tanum@ci.uchicago.edu, foster@anl.gov
www.ci.anl.gov | www.ci.uchicago.edu
3. Access is Vital for EarthCube's Success
• The goal of EarthCube is to create a sustainable infrastructure that enables the sharing of all geosciences data, information, and knowledge in an open, transparent, and inclusive manner.
"I can't get access to *." "It is difficult for me to *." "I want to integrate data from other disciplines, but *."
Access refers to software and activities that make data and computational resources easily, efficiently, and reliably available to scientists across disciplines.
4. Access Workshop Goals
• Encourage discussions on emergent issues:
– Use of cloud computing
– Exploiting the general principle of moving computation to data
– A technological and governance framework for cross-disciplinary access: service architecture, brokering principles, real-time data, a uniform authentication and authorization environment, etc.
– Improving access to data in publications
• Bring some standardization to research data lifecycle issues:
– In general, data, once generated, follow a lifecycle: they are stored, described, processed, transformed, accessed, discovered, analyzed, and curated. In organized networks and campaigns, lifecycle stages are often documented and standardized, though they vary significantly across networks and campaigns. In individual initiatives, the lifecycle stages remain ad hoc and ill-defined. [RDLM-Workshop2011]
• Obtain community consensus on a few use cases
5. Workshop Activity Outcomes
• Use Case 1: Can I access "not large" but "big data" to conduct statistical analysis?
• Use Case 2: I have a hypothesis not tied to a physical instrument or geophysical parameter. Can I still access all the data, in an "interactive" fashion, to test my hypothesis?
• Use Case 3: The storm dust paper is vital to my research. Can I access the data in the publication and change parameters of experiments to understand the nature of storm dust?
6. Workshop Reflections
• It's all about data!
[Diagram: people import and export data, resources, and services to and from a shared space, with data at the center.]
7. Workshop Reflections-2
• Discussing technology issues in isolation is a recipe for disaster.
– Access is closely aligned with other subgroups
– It is important to organize in functional units
8. Workshop Reflections-3
• Challenges will continue
Social challenges: transparency; openness; establishing social ties; slow adoption; sustainability; establishing practices.
Changing requirements / changing technology: adoption culture; real-time data; cross-disciplinary data; high dimensionality; network bandwidth, computational resource, and data management constraints.
10. Enabling A Data Sharing Space: The DataSpace
• Embrace a "semi-structured" notion
• Ingest data in raw form, with ongoing structuring and refinement of the data and metadata
• Open, extensible architecture that supports the Software as a Service (SaaS) model, with a process for vetting contributed services prior to their incorporation, based on on-demand resources
• Emphasis on usability instead of developing technology/infrastructure
[Diagram: resources, services, and data imported into and exported from the DataSpace.]
11. Post-Charrette
• Two EarthCube PI meetings at the University of Colorado, Boulder:
– A Concept group meeting, with some representation from Community groups, July 10, 2012
– A Concept and Community group meeting, October 4-5, 2012
• Primary objective: convergence
– through roadmaps
– on architecture
– on future steps
12. Highlights: Summary of Roadmaps
• Workplace to collaborate
• Lower barriers for participation
• Openness and extensibility
• Feedback and reproducibility
• Discovery of materials held by long-tail scientists
• Education and reward system for scientists
• Cross-domain teams and broad collaboration
• A new community paradigm
15. Acknowledgements
• Don Middleton, NCAR
• Dave Fulker, OPeNDAP
• Robert Gibb, New Zealand Landcare Research
• Amarnath Gupta, UCSD
• Robert Jacob, ANL
• Jeff Heard, U. of North Carolina
• Chris Jenkins, JPL
• Doug Lindholm, U. of Colorado
• Craig Mattocks, U. Miami
• Joseph Baker, Virginia Tech
• Beth Plale, Indiana Univ.
• Anne Wilson, U. of Colorado
• Stephen M. Richard, AZGS
• Chris Lynnes, NASA/ESIP Federation
• Sameer Sirugeri, Microsoft
• Karsten Steinhauser, U. of Minnesota
• Zhangfan Xing, JPL
• John Williams, NCAR
• Ruth Duerr, NSIDC
Shared, standard, reusable software interfaces: for disparate data types, disparate storage, and varying protocols; deliver data in user-requested formats, with translation between standards.
Link various kinds of data: integration of high-resolution topography scans and geodetic data; integration of geologic data in deep time; geo-located and non-geo-located datasets; observation and simulation datasets for comparison.
Real-time access to data and facilities: capabilities within cloud and grid, such as shared storage and data spaces; in low-bandwidth settings; simulation and modeling capabilities within HPC and science portals.
Access refers to software and activities that make data and computational resources easily, efficiently, and reliably available to scientists.
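One way to read "deliver data in user-requested formats" is a registry of translators behind a single interface, so callers name a format rather than coding to each storage convention. A minimal sketch with invented formats and records:

```python
# A tiny format-translation registry behind one delivery interface.
import csv
import io
import json

TRANSLATORS = {}

def translator(fmt):
    def register(fn):
        TRANSLATORS[fmt] = fn
        return fn
    return register

@translator("json")
def to_json(records):
    return json.dumps(records)

@translator("csv")
def to_csv(records):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=sorted(records[0]))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()

def deliver(records, fmt):
    """Return the same data in whichever registered format was requested."""
    return TRANSLATORS[fmt](records)

obs = [{"station": "S1", "temp_c": 12.5}, {"station": "S2", "temp_c": 9.8}]
print(deliver(obs, "csv"))
print(deliver(obs, "json"))
```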
Access paradigms: the SaaS model and the brokering approach. The SaaS model increases usage and adoption by making access to data and resources easy and convenient. The brokering approach implements mediation and distribution capabilities in a transparent way. Discuss these paradigms in the context of the needs of publishers of big data and the needs of the long-tail geoscientist, along with issues relating to access control, confidentiality, and the role of governance bodies for emerging access paradigms.
Structural data integration for access: issues relating to data, data models, and standards for data integration. Discuss novel data types needed by current science cases and their abstraction to data models and knowledge-based models based on space-time integration.
Scalable resource access: scalable access to resources such as HPC systems and cloud-based systems (parallel storage systems, parallel analysis systems such as MapReduce [8], Hadoop, SciDB [19]), especially at marginal cost; the ability to store and manipulate data even when the structure of the data is not fully known to the system; associating the cloud with a set of services for recognizing the structure of the wide variety of file types used in geoscience applications, extracting structure from the data, and traversing files to extract metadata.
However, in cases where researchers are interested in studying a phenomenon, can an EarthCube framework provide adequate semantics to express a search query, a generic model for data access of events, and a way to interactively discover 'events' within data and perform 'first look' analytics, while keeping provenance and history of all analyses?
Earlier, resources were at the center, and data was massaged so that resources and services could access it. But now data is going to be central, and services will feed into it, and so the ...
The Sher Dataspace embodies a "semi-structured" notion compared, on the one hand, with rigidly structured systems such as relational database systems, where a data schema must be specified before data can be stored, and, on the other hand, with filesystems, which are unstructured and do not support any notion of a schema or content-based metadata. In Sher, data can be ingested as a file (or a heterogeneous package, e.g., a folder) with minimal metadata. Services are provided for capturing this metadata as well as the package structure.
Further services are provided for ongoing structuring and refinement of the data and metadata. Examples include user-specified annotations; extraction of information for well-known file types (e.g., netCDF); extraction of metadata for proprietary file types using software libraries (e.g., NMR data); and structuring of data and associated information, e.g., associating a set of flat files with a database along with the set of data-cleaning routines and load scripts that were used to create the data. Thus, the Dataspace concept supports the model of data being transformed incrementally from a relatively unstructured state with minimal metadata to a highly structured form with rich metadata, using an array of structuring and refinement services.
A key enabling characteristic of Sher is its open, extensible architecture that supports the Software as a Service (SaaS) model, thereby removing the burden of maintaining software and software environments from the client [52]. Using this SaaS model, Sher facilitates creation of third-party services that can be contributed into the system, i.e., a SherStore, similar to the Apple AppStore, including the notion of vetting contributed services prior to their incorporation.
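One such refinement service, metadata extraction for a well-known file type, can be sketched with the netCDF4 library; the file path is a placeholder.

```python
# Extract global attributes, dimensions, and variable shapes from a netCDF file.
from netCDF4 import Dataset

def extract_netcdf_metadata(path):
    ds = Dataset(path)
    meta = {
        "global_attributes": {a: getattr(ds, a) for a in ds.ncattrs()},
        "dimensions": {name: len(dim) for name, dim in ds.dimensions.items()},
        "variables": {name: {"dtype": str(var.dtype), "shape": var.shape}
                      for name, var in ds.variables.items()},
    }
    ds.close()
    return meta

print(extract_netcdf_metadata("observations.nc"))  # placeholder path
```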