Ilya Zaslavsky, David Valentine, Amarnath Gupta, Stephen Richard, Tanu Malik
Presentation given in the afternoon Architecture Forum Session on Day 1, June 24 at the EarthCube All-Hands Meeting
Data, Data Everywhere: What's A Publisher to Do? - Anita de Waard
The document discusses publishers' roles in data sharing and challenges in open science. It notes that while most scientists agree access to others' data would benefit research, fewer are willing to share their own data due to lack of training and incentives. Publishers are working to establish data sharing guidelines and integrate platforms to store, share, and analyze research data and tools. However, many questions remain around publishing data science given distributed and interconnected data, tools, and knowledge networks. Publishers will need to transition from pipelines to platforms and enable these new network effects.
This document outlines recommendations from a project investigating institutional data management at a UK university. It finds that while some data management capabilities exist, practices are largely ad hoc with significant variation between departments. Researchers desire more storage and backup support. Recommendations include developing a university-wide data repository, comprehensive backup services, research data lifecycle training, and embedding exemplary practices. Pilot projects in archaeology and chemistry aim to test training and metadata frameworks. A sustainable business model is needed to provide coherent, affordable data management support across all disciplines over the long term.
Big Data As a service - Sethuonline.com | Sathyabama University Chennai - Sethuraman R
An Efficient Framework for Data As A Service in the Hadoop Ecosystem.
R. Sethuraman, M.E., (PhD),
Assistant Professor,
Faculty of Computing,
Dept of Computer Science Engineering,
Sathyabama University
http://Sethuonline.com
The NSF DataNet Program aims to create exemplar data infrastructure organizations called DataNet Partners to provide researchers with access to data and advance research. SEAD is one such DataNet Partner that provides lightweight data services for sustainability science. It acts as an active content repository and curation service, and is developing tools for community exploration of data. The current focus is on an end-user workshop, conference demonstrations, and interface redesign to refine models for supporting the full lifecycle of research data objects.
Clinical Decision Support Systems (CDSS) were explicitly introduced in the 1990s with the aim of providing knowledge to clinicians in order to influence their decisions and, therefore, improve patients' health care. There are different architectural approaches for implementing CDSS. Some of these approaches are based on cloud computing, which provides on-demand computing resources over the internet. The goal of this paper is to determine and discuss key issues and approaches involving architectural designs in implementing a CDSS using cloud computing. To this end, we performed a standard Systematic Literature Review (SLR) of primary studies showing the use of cloud computing in CDSS implementations. Twenty-one primary studies were reviewed. We found that CDSS architectural components are similar in most of the studies. Cloud-based CDSS are most used in Home Healthcare and Emergency Medical Systems. Alerts/Reminders and Knowledge Service are the most common implementations. Major challenges are around security, performance, and compatibility. We conclude that a cloud-based CDSS is beneficial since it allows cost-efficient, ubiquitous, and elastic computing resources. We highlight that some studies show weaknesses regarding the conceptualization of a cloud-based computing approach and lack a formal methodology in the architectural design process.
This document discusses the need for improved scientific data management systems to support data-driven discovery. It proposes adopting a digital asset management (DAM) approach used in creative fields like photography. Key points:
- Current scientific data management is manual and cannot scale with increasing data volumes and complexity, slowing the pace of discovery.
- A DAM framework is proposed to automate data acquisition, organization, access and sharing using metadata and models tailored for each scientific domain.
- The framework would transform how scientists interact with data, facilitating analysis and reproducibility.
- An initial DAM platform called DERIVA is presented and has been evaluated positively in early use cases.
NITRD Big Data Interagency Working Group Workshop: Pioneering the Future of Federally Supported Data Repositories Jan 13, 2021 - Opening comments on where we are and one suggestion of where we might go with an International Data Science Institute (IDSI) - A blue sky view.
Infrastructure for Supporting Computational Social Science - Derek Hansen
This document discusses the need for infrastructure research to support computational social science. It notes current limitations with relying solely on corporate or third-party tools for data access and analysis. Specifically, these tools are not designed for research needs, duplication of effort is required, APIs are limited and changing, and maintaining third-party tools is challenging. The document proposes a large-scale collaborative solution involving data handling and processing, human-computer interaction, and legal/social considerations to better enable social science research. Collaboration with groups like CASCI and DSST is suggested.
Incentivising the uptake of reusable metadata in the survey production process - Louise Corti
This document discusses incentivizing the uptake of reusable metadata in survey production. It notes that there is no universal language used to document survey questions and variables, leading to wasted resources. The Data Documentation Initiative (DDI) is proposed as a standard. Barriers to adopting metadata best practices include legacy systems, manual processes, and reluctance to change. The document outlines ideas to incentivize metadata use such as specifying documentation requirements in funding calls and improving documentation tools and workflows. Showing tangible benefits through applications like question banks and data exploration systems is also suggested.
Philip Bourne presented on the NIH's Big Data to Knowledge (BD2K) initiative and the Associate Director for Data Science (ADDS) office. The goals of BD2K are to use data science to accelerate biomedical research and enhance health outcomes. BD2K supports various centers, projects, and training programs related to data discovery, standards, cloud computing, sustainability, and workforce development. The ADDS office oversees BD2K and aims to establish a sustainable data science ecosystem and well-trained workforce to enable major scientific discoveries through data-driven research.
This document discusses using data mining techniques for energy resource management. It outlines objectives like classification, regression, forecasting and anomaly detection. Techniques covered include cluster analysis, classification trees, neural networks, genetic algorithms and Bayesian models. Applications involve meeting strategic objectives, making business/engineering decisions and energy budgets. Challenges with large data include integration, wasted time and disconnects. The document proposes solutions like Extract-Transform-Load, Hadoop and cloud computing. The methodology uses a geographic information system, forecasting engines and an application programming interface to transfer and analyze data.
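None of the deck's specific tooling is reproduced here; purely as a hedged sketch of the anomaly-detection objective it lists, the snippet below flags unusual readings in synthetic hourly energy-load data with scikit-learn's IsolationForest (the data, injected spikes, and contamination rate are all invented for illustration).

```python
# Hypothetical illustration of the "anomaly detection" objective:
# flag unusual hourly energy-load readings with an Isolation Forest.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
load = rng.normal(100.0, 10.0, size=(500, 1))   # synthetic hourly load (kW)
load[::97] += 60.0                              # inject a few spikes

model = IsolationForest(contamination=0.02, random_state=0).fit(load)
flags = model.predict(load)                     # -1 = anomaly, 1 = normal
print(f"{(flags == -1).sum()} anomalous readings out of {len(load)}")
```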
Introduction to Big Data and its Potential for Dementia Research - David De Roure
Presentation at Dementia Conference (Evington Initiative) held at Wellcome Trust, 22-23 October 2012. Acknowledgements to McKinsey & Company, as well as Tim Clark (MGH) and Iain Buchan (University of Manchester), for input to the slides.
NIH Data Commons (Note: Presentation has animations) - Vivien Bonazzi
Presented at the Data Commons & Data Science Workshop (University of Chicago - Centre for Data Intensive Science):
NB: these slides contain animations, so static versions might not display well.
This document proposes a new model for biomedical computing called the NIH Commons that aims to improve data sharing, reduce costs, and increase access to computational resources. It would provide investigators with credits that can be used at multiple cloud computing providers. This 3-year pilot program seeks to test if this approach can enhance data sharing and lower costs compared to current practices. Key aspects that will need to be defined include the requirements for cloud providers to participate, criteria for approving credit requests, and metrics to evaluate the model's effectiveness. Feedback will be sought from experts to help design and implement this new biomedical computing framework.
• Improve Data Management with Semantic Data Integration
• Discuss the issues of data variety and data uncertainty
• Moving from Big Data to Big Analysis
• How to apply Analysis to Big Data (Big Analysis)
• Benefits of Advanced Analytics in Life Science
Conceptual Architecture for USDA and NSF Terrestrial Observation Network Inte... - Brian Wee
This document discusses the need for interoperability between research infrastructures like NEON and LTAR to better understand the impacts of climate change on agro-ecosystems. It proposes a requirements-driven interoperability framework to integrate data from these organizations. This framework is based on defining science requirements and measurements, establishing common algorithms and protocols, ensuring traceability of measurements, and developing supporting informatics. The goal is to seamlessly integrate data to help address challenges around food security and sustainability under a changing climate.
Life science requirements from e-infrastructure: initial results from a joint... - Rafael C. Jimenez
This document summarizes a workshop on life science requirements from e-infrastructure held by BioMedBridges. It discusses how big data is affecting challenges like data growth outpacing storage and transfer speeds. Potential solutions proposed include improving storage, compression, networking, partitioning data, and computing approaches like clouds. The workshop concluded that e-infrastructures need to better understand research infrastructure problems, evaluate bottlenecks, discuss solutions, and define requirements as big data will change current approaches to data sharing and management.
FAIR for the future: embracing all things data - ARDC
FAIR for the future: embracing all things data - Natasha Simons, Keith Russell and Liz Stokes, presented at Taylor & Francis Scholarly Summits in Sydney 11 Feb 2019 and Melbourne 14 Feb 2019.
Table of Content - International Journal of Managing Information Technology (... - IJMIT JOURNAL
The International Journal of Managing Information Technology (IJMIT) is a quarterly open access peer-reviewed journal that publishes articles that contribute new results in all areas of the strategic application of information technology (IT) in organizations. The journal focuses on innovative ideas and best practices in using IT to advance organizations - for-profit, non-profit, and governmental. The goal of this journal is to bring together researchers and practitioners from academia, government and industry to focus on understanding both how to use IT to support the strategy and goals of the organization and how to employ IT in new ways to foster greater collaboration, communication, and information sharing both within the organization and with its stakeholders. The International Journal of Managing Information Technology seeks to establish new collaborations, new best practices, and new theories in these areas.
This talk will provide a means to discuss the capture, integration and dissemination of data across large enterprises. We will show how data variety is continuing to grow, meaning new data sources are steadily becoming available for use in analysis. Data veracity is also of importance since a large amount of data is fuzzy (uncertain) in nature. The ability to integrate these various data sources and provide improved capabilities to understand and use it is of increasing importance in today’s pharma climate. We call this Reference Master Data Management (RMDM).
This talk will span an arc of data lifecycle management, beginning with instrument data, moving across to clinical studies, production, regulatory affairs and finally e-archiving (see Fig. 1). I will show how these systems can use a common semantics for modeling of important metadata, which can apply the FAIR principles of Findability, Accessibility, Interoperability and Reusability to a common “semantic hub” that can connect data sources of different varieties across the enterprise. ADF files, for example, use their Data Description layer to provide semantic metadata about file contents. Similarly, semantics can be used to describe clinical trials data, regulatory data, etc., through to archiving, for improved storage and search over long periods of time.
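As a minimal, hypothetical sketch of the "semantic hub" idea (the namespace, URIs, and property choices below are illustrative, not the speaker's actual model; DCAT and DCTERMS are standard vocabularies), dataset-level metadata can be expressed as RDF so that instrument, clinical, and archive records share one queryable vocabulary:

```python
# Minimal sketch: describe one dataset with FAIR-oriented RDF metadata.
# The hub namespace and URIs are hypothetical placeholders.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCAT, DCTERMS, RDF

HUB = Namespace("http://example.org/semantic-hub/")  # hypothetical

g = Graph()
ds = URIRef(HUB["dataset/instrument-run-42"])
g.add((ds, RDF.type, DCAT.Dataset))
g.add((ds, DCTERMS.title, Literal("Instrument run 42, raw spectra")))
g.add((ds, DCTERMS.identifier, Literal("hub:instrument-run-42")))        # findable
g.add((ds, DCAT.accessURL, URIRef("http://example.org/archive/run-42"))) # accessible
g.add((ds, DCTERMS.conformsTo, URIRef("http://example.org/formats/adf")))# interoperable (stand-in for an ADF format id)
g.add((ds, DCTERMS.license, URIRef("https://creativecommons.org/licenses/by/4.0/")))  # reusable

print(g.serialize(format="turtle"))
```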
Trust threads: Active Curation and Publishing in SEAD - Beth Plale
Describes Trust Threads, a minimalist approach to provenance capture to enhance the trustworthiness of published data. Implemented as part of SEAD's Active Curation and Publishing Services. At National Data Integrity Conference, Ft. Collins, Colorado, May 2015.
Building Data Ecosystems for Accelerated Discovery - adamkraut
Large federated data ecosystems require diverse teams that can design, build, and integrate a broad range of services to support scientific workflows. Our collaborative team operates at the intersection of science, technology, and data to assess, implement, and teach the key capabilities and capacities that modern healthcare and life science need. Learn the data management techniques, tools, platforms, and frameworks that are proven to be effective at solving complex problems at scale.
Cyberenvironments integrate shared and custom cyberinfrastructure resources into a process-oriented framework to support scientific communities and allow researchers to focus on their work rather than managing infrastructure. They enable more complex multi-disciplinary challenges to be tackled through enhanced knowledge production and application. Key challenges include coordinating distributed resources and users without centralization and evolving systems rapidly to keep pace with advancing science.
The document summarizes presentations from three perspectives on progress towards open and interoperable research data service workflows:
1) Angus Whyte of the Digital Curation Centre discussed new DCC guidance and design principles for integrating research data service workflows.
2) Rory Macneil of Research Space discussed integrating their ELN with University of Edinburgh's DataShare and Harvard's Dataverse repositories using open standards.
3) Stuart Lewis of University of Edinburgh discussed their DataVault prototype for packaging data to be archived from a Jisc Research Data Spring project. The case studies illustrate challenges and opportunities for improving integration between active data management and long-term preservation services.
This document discusses several studies on user engagement in research data curation. It finds that institutional repositories for data were developed without input from researchers, leading to systems that did not meet researchers' needs. Barriers to open data sharing included concerns over commercial use and maintaining ownership. Successful data curation requires understanding disciplinary differences and developing trusted relationships with researchers through dialogue early in projects.
Just a few slides I put together to quickly introduce the idea of Virtual Research Environments (VRE) at the University of Lincoln. All content was borrowed from JISC's work on VREs.
The document discusses the need for an NIH Data Commons to address challenges with data sharing and storage. It describes how factors like increasing data volumes, availability of cloud technologies, and emphasis on FAIR data principles are driving the need for a centralized data platform. The proposed NIH Data Commons would provide findable, accessible, interoperable and reusable data through cloud-based services and tools. It would enable data-driven science by facilitating discovery, access and analysis of biomedical data across different sources. Plans are outlined to develop and test an initial Data Commons pilot using existing genomic and other biomedical datasets.
The document discusses the Materials Genome Initiative (MGI) and the High-Throughput Experimental Materials Collaboratory (HTE-MC). It describes NIST's role in supporting MGI through developing a materials innovation infrastructure. It outlines the vision for HTE-MC, which would integrate high-throughput synthesis and characterization tools across multiple institutions through a shared network and data management platform. This would provide broader access to experimental facilities and materials data to support accelerated materials discovery. A workshop was held in 2018 to discuss establishing the HTE-MC concept and defining its technical, operational and business models.
The document describes a project between Mendeley and Symplectic to increase rates of unmandated deposit into institutional repositories. By integrating repository deposit directly into the Mendeley research collaboration tool, researchers will be able to easily sync their publications from Mendeley into their local institutional repository with a single click. This is expected to greatly increase deposit rates by removing barriers like copyright uncertainty and the time needed to submit publications manually.
A Workflow-Driven Discovery and Training Ecosystem for Distributed Analysis o... - Ilkay Altintas, Ph.D.
SDSC is a leader in high performance computing, data-intensive computing, and scientific data management. It focuses on "Big Data", "versatile computing", and "life sciences applications". The SDSC Data Science Office provides expertise, systems, and training for data science applications. Genomic analysis poses big data and computing challenges, including data management, integration, coordination, and workflow management. New tools are needed to address these challenges. bioKepler is an example of a Kepler module for data-parallel bioinformatics. Training is also needed at the interface of domains to build the next generation of interdisciplinary scientists. SDSC works with industry partners through various strategies like sponsored research and providing access to systems and expertise.
Leveraging Open Source Technologies to Enable Scientific Archiving and Discovery; Steve Hughes, NASA; Data Publication Repositories
The 2nd Research Data Access and Preservation (RDAP) Summit
An ASIS&T Summit
March 31-April 1, 2011 Denver, CO
In cooperation with the Coalition for Networked Information
http://asist.org/Conferences/RDAP11/index.html
The document discusses the need to transition from digital preservation projects to infrastructure by integrating preservation into the data production process. It outlines gaps between repositories and producer communities and suggests bridging these gaps by developing interoperable tools across production and preservation workflows, and embedding preservation awareness and skills into research training and practice. Building comprehensive digital preservation infrastructure requires considering both technical and social aspects across local, global, and disciplinary contexts.
1. The document discusses using eResearch approaches like shared data, analyses, and cyberinfrastructure to support collaborative research on free/libre and open source software (FLOSS).
2. The authors are replicating and extending several FLOSS research papers using workflow tools to make the analyses reusable, flexible, and easy to share.
3. Preliminary results found that eResearch approaches show promise for advancing social science research by facilitating analysis extension, replication, and sensitivity testing.
A Big Picture in Research Data Management - Carole Goble
A personal view of the big picture in Research Data Management, given at GFBio - de.NBI Summer School 2018 Riding the Data Life Cycle! Braunschweig Integrated Centre of Systems Biology (BRICS), 03 - 07 September 2018
Bridging Gaps and Broadening Participation in Today's and Future Research Com... - Sandra Gesing
Research computing is in an exciting era and has never evolved as fast as in the last 20 years. We can nowadays answer research questions that we could not even ask two decades ago. This has led to discoveries such as the analyses of DNA from Next-Generation Sequencing technologies. The increased complexity of software, data, hardware and lab instruments demands more openness and sharing of data and methods. Researchers and educators are not necessarily IT specialists though. Thus, a further trend in research computing is the shift from system-centric design to user-centric design and interdisciplinary teams – complex solutions are offered in self-explanatory user interfaces, so-called science gateways or virtual research environments. I will present solutions and projects supporting users to be able to focus on their research questions without the need to become acquainted with the nitty-gritty details of the complex research computing infrastructure. Key aspects of the presented projects are usability and interoperability of computational methods, reproducibility of research results, as well as sustainability of research software. Sustainability of research software has many facets. I advocate for improving diversity in workforce development and career paths for research software engineers, and for incentivizing their work via means beyond the traditional academic rewarding system.
Research process and research data management. Many universities are looking at how they can better serve the needs of researchers. Ken Chad Consulting worked with the University of Westminster to look at the needs and attitudes of researchers and admin staff in terms of research data management (RDM). The result led the University to look first at the whole lifecycle and workflows of research administration. This in turn led to the innovative, rapid development of a system to support researchers and admin staff. Presented by Suzanne Enright (University of Westminster) and Ken Chad at the annual UKSG conference in April 2014
The document summarizes research into developing a single research portal at Westminster University to improve research processes. It found that researchers were unaware of formal research data management practices and struggled with disconnected systems. A proposed solution is a central portal allowing easier identification of support needs, visibility of research, and collaboration. An initial focus on doctoral projects saw time savings. Next steps involve managing research outputs through a single interface. Key lessons are that researchers prefer easy solutions and involvement in development.
This document outlines a Ph.D. proposal to examine the use of workflow engines and coupling frameworks in developing hydrologic modeling systems. Specifically, it will develop hydrologic models within the TRIDENT workflow engine and OpenMI coupling framework to evaluate their capabilities for building community modeling systems. The research will include developing component models, building sample workflows, and testing models on three sites. The goal is to contribute optimized hydrologic modeling tools and assess the suitability of these approaches for collaborative hydrologic modeling.
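To make the coupling idea concrete, here is a toy Python sketch of two linked components in the spirit of a coupling framework like OpenMI (these interfaces are invented for illustration and are not the actual OpenMI or TRIDENT APIs):

```python
# Toy illustration of model coupling: a rainfall component feeds a runoff
# component one time step at a time. Interfaces are hypothetical, not OpenMI.
class RainfallComponent:
    def __init__(self, series):
        self.series = series  # mm of rain per time step

    def get_value(self, step):
        return self.series[step]

class RunoffComponent:
    def __init__(self, runoff_coefficient=0.4):
        self.c = runoff_coefficient

    def update(self, rainfall_mm):
        return self.c * rainfall_mm  # simple rational-method style estimate

rain = RainfallComponent([0.0, 5.0, 12.0, 3.0])
runoff = RunoffComponent()
for step in range(4):
    q = runoff.update(rain.get_value(step))
    print(f"step {step}: rainfall={rain.get_value(step):.1f} mm, runoff={q:.1f} mm")
```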
The document describes Dr. Mahdi Fahmideh's background and research interests which include disruptive technologies like cloud computing, IoT, blockchain, and data analytics. It provides examples of Dr. Fahmideh's research output, including a process model developed using design science research for migrating legacy applications to the cloud. The document identifies knowledge gaps in the current literature around cloud migration processes and outlines Dr. Fahmideh's research objective to develop a generic, customizable cloud migration process model.
Australia's Environmental Predictive Capability - TERN Australia
Federating world-leading research, data and technical capabilities to create Australia’s National Environmental Prediction System (NEPS).
Community consultation presentation.
3-12 February 2020
Dr Michelle Barker (Facilitator)
(Presentation v5)
Similar to AHM 2014: Enterprise Architecture for Transformative Research and Collaboration Across Geosciences (20)
EarthCube Community Webinar held Tuesday, Dec. 9th at 11:00 PST/2:00 EST for a virtual kick-off of the new 'Demonstration Phase' of EarthCube, including statements from your Leadership Council members and an update from NSF Program Officer, Eva Zanzerkia.
Engagement Team monthly meeting 10.10.2014 - EarthCube
The document outlines the agenda and priorities for an EarthCube Demonstration Governance Engagement Team meeting in October 2014. The agenda includes an introduction, announcing a team representative to the Leadership Council, developing internal leadership, reviewing priorities and logistical functions, and discussing future meeting schedules. Key priorities and deliverables for the team are to develop an outreach and communications plan to engage the EarthCube community and stakeholders through compiling science use cases. Housekeeping, meeting leadership, point of contact roles, work management, and collaboration with other groups are listed as important logistical functions for the team.
The document summarizes the agenda and priorities for an October meeting of the Science Standing Committee. The agenda includes an introduction, announcing committee representatives, developing internal leadership, and reviewing priorities and logistical functions. The committee's year 1 intended outcome is to support work to complete the year 1 deliverable of developing science use cases. Their priorities are housekeeping tasks like assigning a meeting lead and point of contact for the oversight office.
This document summarizes an EarthCube meeting to discuss funded demonstration projects and governance. It outlines the agenda, including introductions from new project teams and a discussion of the role of funded projects. Key points include that the Test Governance project will coordinate the demonstration governance process and report outcomes to NSF. Both the Technology & Architecture Committee and Science Committee outlined initial steps, including forming subcommittees to analyze use cases and gaps. The meeting concluded with a discussion of how funded projects can best work with standing committees through formal work plans, representatives, and regular communication.
Technology and Architecture Committee meeting slides 10.06.14 - EarthCube
The October meeting agenda of the EarthCube Technology and Architecture Standing Committee included:
1) Welcome and introductions
2) Announcement of new committee representatives
3) Discussion of the committee's internal leadership structure and responsibilities, including coordinating with other groups, monitoring working groups, and sponsoring new working groups.
4) Review of timelines for upcoming milestones and deliverables and discussion of future meeting schedules.
EarthCube Governance Intro for Solar Terrestrial End-user Workshop - EarthCube
Presentation by the EarthCube Test Enterprise Governance project for the Solar Terrestrial Research End-User Workshop, Newark, New Jersey, August 14, 2014.
AHM 2014: The CSDMS Standard Names, Cross-Domain Naming Conventions for Descr... - EarthCube
The document discusses the CSDMS Standard Names, which provide unambiguous naming conventions for describing process models, data sets, and their associated variables. The standard names aim to avoid ambiguity and domain-specific terminology. They support naming quantities, processes, mathematical operations, assumptions, and more. Developing and applying standard names helps different models to automatically match variables and understand each other.
AHM 2014: Addressing Data Heterogeneity, Semantic Building Blocks & CI Pe... - EarthCube
This panel will address data heterogeneity issues in EarthCube from the perspective of semantic building blocks and cyberinfrastructure. The panel, convened by Gary Berg-Cross of SOCoP, will feature co-conveners Pascal Hitzler of Wright State University, Kerstin Lehnert of LDEO, Columbia University, and Peter Wiebe of Woods Hole Oceanographic Institution. Additional panelists will include Scott Peckham of University of Colorado Boulder, Anthony Aufdenkampe of Stroud Water Research Center, Tim Finin of University of Maryland Baltimore County, and Krzysztof Janowicz of University of California Santa Barbara.
AHM 2014: Revisiting Governance Model, Preparing for Next Steps - EarthCube
The document lists several potential priorities for EarthCube including developing an emergent architecture, identifying and promoting success stories, providing guidelines for shared services, developing common end user training, benchmarking progress against scientific needs, creating a prototype to demonstrate connectivity and functionality, documenting scientific workflows, and coordinating projects. Additional options mentioned are scoping and articulating a vision, identifying collaborations, documenting use cases, engaging academia in education, improving data management plans and data discovery, establishing light governance led by scientists, tying different design efforts together, determining funding mechanisms, adopting standards, enabling participation from diverse fields, and engaging stakeholders.
AHM 2014: Integrated Data Management System for Critical Zone Observatories - EarthCube
Presentation by Anthony Aufdenkampe during the Addressing Data Heterogeneity, Semantic Building Blocks & CI Perspective Session on Day 2, June 25 at the EarthCube All-Hands Meeting
The document discusses the CSDMS Standard Names, which are naming conventions developed by the Community Surface Dynamics Modeling System (CSDMS) modeling framework to facilitate automatic coupling of models and data sets from different contributors. The naming conventions follow an object-oriented approach where each standard variable name is composed of an object name and quantity name joined by double underscores. This allows framework software to retrieve numerical values for variables based on their standardized names. The naming conventions were designed according to criteria such as avoiding ambiguity, using widely understood terminology, and supporting mathematical operations and assumptions. They address challenges of automatic semantic mediation when coupling diverse resources that use different naming systems.
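The double-underscore convention lends itself to mechanical composition and parsing. A small sketch follows; the helper functions are ours, and only the object__quantity pattern comes from the CSDMS convention described above:

```python
# Compose and parse CSDMS-style standard names: an object name and a
# quantity name joined by a double underscore. Helpers are illustrative.
def compose_standard_name(object_name: str, quantity_name: str) -> str:
    return f"{object_name}__{quantity_name}"

def parse_standard_name(standard_name: str) -> tuple[str, str]:
    object_name, _, quantity_name = standard_name.partition("__")
    return object_name, quantity_name

name = compose_standard_name("atmosphere_bottom_air", "temperature")
print(name)                       # atmosphere_bottom_air__temperature
print(parse_standard_name(name))  # ('atmosphere_bottom_air', 'temperature')
```

Because single underscores are reserved for word separation within each part, framework software can split on the first double underscore and match variables between models unambiguously.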
The document discusses a watershed modeling system called BCube that aims to decrease the effort of watershed initialization by brokering various global geospatial and environmental data required for watershed modeling. BCube allows researchers to focus on scientific research by providing a single access point to the different data formats and sources for elevation, soils, land use, weather, and other data needed to set up and run watershed models. The document provides an overview of the types of data BCube can broker and the workflow where a scientist requests data for a watershed area and BCube returns the available options to choose from.
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Building RAG with self-deployed Milvus vector database and Snowpark Container... - Zilliz
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
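The talk's own demo is not reproduced here; purely as a hedged sketch of the retrieval side of a RAG app against a self-deployed Milvus (the collection name, dimension, and random vectors are placeholders; a real app would use an embedding model), using the pymilvus client:

```python
# Sketch: connect to a Milvus instance running in Docker, store document
# embeddings, and retrieve nearest neighbors for a query.
import numpy as np
from pymilvus import MilvusClient

client = MilvusClient(uri="http://localhost:19530")  # self-deployed Milvus
client.create_collection(collection_name="rag_docs", dimension=384)

docs = ["Milvus runs as a Docker container.",
        "RAG grounds LLM answers in retrieved text."]
data = [
    {"id": i, "vector": np.random.rand(384).tolist(), "text": t}
    for i, t in enumerate(docs)
]
client.insert(collection_name="rag_docs", data=data)

hits = client.search(
    collection_name="rag_docs",
    data=[np.random.rand(384).tolist()],  # stand-in for an embedded query
    limit=1,
    output_fields=["text"],
)
print(hits[0][0]["entity"]["text"])
```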
In his public lecture, Christian Timmerer provides insights into the fascinating history of video streaming, starting from its humble beginnings before YouTube to the groundbreaking technologies that now dominate platforms like Netflix and ORF ON. Timmerer also presents provocative contributions of his own that have significantly influenced the industry. He concludes by looking at future challenges and invites the audience to join in a discussion.
Essentials of Automations: The Art of Triggers and Actions in FME - Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Securing your Kubernetes cluster: a step-by-step guide to success! - KatiaHIMEUR1
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
A tale of scale & speed: How the US Navy is enabling software delivery from l... - sonjaschweigert1
Rapid and secure feature delivery is a goal across every application team and every branch of the DoD. The Navy’s DevSecOps platform, Party Barge, has achieved:
- Reduction in onboarding time from 5 weeks to 1 day
- Improved developer experience and productivity through actionable findings and reduction of false positives
- Maintenance of superior security standards and inherent policy enforcement with Authorization to Operate (ATO)
Development teams can ship efficiently and ensure applications are cyber ready for Navy Authorizing Officials (AOs). In this webinar, Sigma Defense and Anchore will give attendees a look behind the scenes and demo secure pipeline automation and security artifacts that speed up application ATO and time to production.
We will cover:
- How to remove silos in DevSecOps
- How to build efficient development pipeline roles and component templates
- How to deliver security artifacts that matter for ATOs (SBOMs, vulnerability reports, and policy evidence)
- How to streamline operations with automated policy checks on container images
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
UiPath Test Automation using UiPath Test Suite series, part 6 - DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series, part 6. In this session, we will cover Test Automation with generative AI and OpenAI.
The UiPath Test Automation with generative AI and OpenAI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with OpenAI's advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
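UiPath's actual integration is not shown here; purely as an illustrative sketch of the generative-AI side (the model name and prompt are placeholders), a draft of test cases could be requested through OpenAI's Python client like so:

```python
# Hypothetical sketch: ask an OpenAI model to draft test cases for review.
# This is not UiPath's integration; model and prompt are placeholders.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You write concise software test cases."},
        {"role": "user", "content": "Draft three test cases for a login form "
                                    "with username, password, and 'remember me'."},
    ],
)
print(response.choices[0].message.content)
```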
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and OpenAI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Maruthi Prithivirajan, Head of ASEAN & IN Solution Architecture, Neo4j
Get an inside look at the latest Neo4j innovations that enable relationship-driven intelligence at scale. Learn more about the newest cloud integrations and product enhancements that make Neo4j an essential choice for developers building apps with interconnected data and generative AI.
TrustArc Webinar - 2024 Global Privacy Survey - TrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Enhancing adoption of Open Source Libraries. A case study on Albumentations.AI - Vladimir Iglovikov, Ph.D.
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
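Part of that adoption story is an API that takes only a few lines to try. A minimal usage sketch follows (the image is synthetic and the transform choices are arbitrary examples, not recommendations from the talk):

```python
# Minimal Albumentations usage: compose two transforms and apply them.
import albumentations as A
import numpy as np

transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.2),
])

image = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
augmented = transform(image=image)["image"]
print(augmented.shape)  # (224, 224, 3)
```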
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Introducing Milvus Lite: Easy-to-Install, Easy-to-Use vector database for you...Zilliz
Join us to introduce Milvus Lite, a vector database that can run on notebooks and laptops, share the same API with Milvus, and integrate with every popular GenAI framework. This webinar is perfect for developers seeking easy-to-use, well-integrated vector databases for their GenAI apps.
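As a rough sketch of what "runs on notebooks and laptops" means in practice: passing a local file path to MilvusClient starts Milvus Lite in-process, and the same calls work against a full Milvus server. The collection name, dimension, and data below are illustrative.

from pymilvus import MilvusClient

client = MilvusClient("milvus_demo.db")  # local file => Milvus Lite
client.create_collection(collection_name="demo", dimension=4)

client.insert(
    collection_name="demo",
    data=[{"id": 0, "vector": [0.1, 0.2, 0.3, 0.4], "text": "hello"}],
)

hits = client.search(
    collection_name="demo",
    data=[[0.1, 0.2, 0.3, 0.4]],  # query vector(s)
    limit=3,
    output_fields=["text"],
)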
Climate Impact of Software Testing at Nordic Testing DaysKari Kakkonen
My slides at Nordic Testing Days 6.6.2024
Climate impact and sustainability of software testing are discussed in the talk. ICT and testing must carry their part of the global responsibility to help mitigate climate warming. We can minimize the carbon footprint, but we can also have a carbon handprint, a positive impact on the climate. Sustainability can be added to the quality characteristics and then measured continuously. Test environments can be used less, at smaller scale, and on demand. Test techniques can be used to optimize or minimize the number of tests. Test automation can be used to speed up testing.
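The talk does not prescribe a tool, but as one possible way to measure a test run's footprint continuously, here is a sketch using the codecarbon package (the tool choice is mine; project name and test path are placeholders):

from codecarbon import EmissionsTracker
import pytest

tracker = EmissionsTracker(project_name="test-suite")  # placeholder name
tracker.start()
try:
    pytest.main(["tests/"])  # placeholder test directory
finally:
    emissions_kg = tracker.stop()  # estimated kg CO2-eq for this run
    print(f"Estimated emissions: {emissions_kg:.6f} kg CO2-eq")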
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
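A back-of-envelope check of what that target implies, assuming (my simplification, not the talk's parametric model) a constant average frequency offset over the whole holdover window:

holdover_s = 100 * 86_400   # 100 days in seconds
max_error_s = 100e-9        # 100 ns time-error budget
avg_frac_freq = max_error_s / holdover_s
print(f"{avg_frac_freq:.2e}")  # ~1.16e-14 average fractional frequency offset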
How to Get CNIC Information System with Paksim Ga.pptxdanishmna97
Pakdata Cf is a groundbreaking system designed to streamline and facilitate access to CNIC information. This innovative platform leverages advanced technology to provide users with efficient and secure access to their CNIC details.
AHM 2014: Enterprise Architecture for Transformative Research and Collaboration Across Geosciences
1. EarthCube Conceptual Design: Enterprise Architecture for Transformative Research and Collaboration Across the Geosciences
http://workspace.earthcube.org/transformative-research-collaboration
ILYA ZASLAVSKY, DAVID VALENTINE, AMARNATH GUPTA
San Diego Supercomputer Center/UCSD
STEPHEN RICHARD
Arizona Geological Survey
TANU MALIK
University of Chicago
2. Enterprise Architecture for Transformative Research and Collaboration Across the Geosciences
The Science Enterprise
• Ask questions
• Collect information
• Formulate hypotheses
• Test hypotheses to determine which (if any) provide a satisfactory answer
• Document, curate, and disseminate data and results
… AND INCREASINGLY:
• Integrate data, analyses, and models across domains
• Collaborate: leverage pooled expertise and resources
…increasing amount of data produced in modern science. LSDMA bridges the gap between data production and data analysis with a novel approach that combines specific community support with generic, cross-community development. In the Data Life Cycle Labs (DLCL), experts from the data domain work closely with scientific groups of selected research domains in joint R&D, where community-specific data life cycles are iteratively optimized, data and metadata formats are defined and standardized, simple access and use is established, and data and scientific insights are preserved in long-term, openly accessible archives.
Keywords: data management, data life cycle, data intensive computing, data analysis, data exploration, LSDMA, support, data infrastructure
I. INTRODUCTION
Today data is knowledge: data exploration has become the 4th pillar of modern science besides experiment, theory, and simulation, as postulated by Jim Gray in 2007 [1]. Rapidly increasing data rates in experiments, measurements, and simulations are limiting the speed of scientific production in various research communities, and the gap between the generated data and the data entering the data life cycle (cf. Fig. 1) is widening. Providing high-performance data management components, analysis tools, computing resources, storage, and services can address this challenge, but realizing a data-intensive infrastructure at institutes and universities is usually time consuming and always expensive. The "Large Scale Data Management and Analysis" (LSDMA) project introduced here extends the research services of the Helmholtz Association of research centers in Germany with community-specific Data Life Cycle Laboratories (DLCL). The LSDMA project, initiated at the Karlsruhe Institute of Technology (KIT), builds on the experience of supporting local scientists at a computing center, of running the Grid Computing Centre Karlsruhe (GridKa) [2] as the German Tier 1 hub in the Worldwide LHC Computing Grid [3] and the Large Scale Data Facility (LSDF) [4], and of the very successful Simulation Labs [5], which specialize in supporting HPC users.
Figure 1. The scientific data life cycle
3. Enterprise Architecture for Transformative Research and Collaboration Across the Geosciences
Design Framework: Federation of Systems
The research enterprise includes subsystems at the project, program, and agency level, many of which are independent of NSF.
• Requirements are a moving target
• Emergent behavior is to be expected
• Technology is constantly changing
• Community governance within the constraints of funding agencies
• Evolutionary process and adaptation: lots of variation; a mechanism to select the 'fittest'; composability
• Technology must foster delegation of responsibilities and communication: promote self-organization, cultivate ideas, maintain feedback between subsystems
• Reliability: responsiveness, robustness, correctness
• Identity of the system is based on shared goals and practices
4. Enterprise Architecture for Transformative Research and Collaboration Across the Geosciences
Communication loops
[Diagram: communication loops among Cross-Domain Scientists, Data Providers, Scientific Governance, and Technical Governance, fed by bottom-up and top-down studies. The exchanged items include trends and patterns, data-interoperability best practices, success stories, feasibility, priorities, strategies, data products, options, costs, problems and issues, related work, and questions and clarifications.]
5. Enterprise Architecture for Transformative Research and Collaboration Across the Geosciences
Communication metrics
7. Enterprise Architecture for Transformative Research and Collaboration Across the Geosciences
Converging on reference architecture semantics
• Analysis of existing building blocks and their variability, each described by: Component, System, Function, Description, Interfaces, Implementation, Steward Organization, Availability, Reference (see the sketch after this list)
• Developing cross-domain vocabularies, connecting domain models
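To make the field list concrete, here is a sketch (mine, not from the deck) of how one building-block registry entry could be captured; every value below is hypothetical.

from dataclasses import dataclass, field

@dataclass
class BuildingBlock:
    # One registry record, using the slide's description fields.
    component: str
    system: str
    function: str
    description: str
    interfaces: list[str] = field(default_factory=list)
    implementation: str = ""
    steward_organization: str = ""
    availability: str = ""
    reference: str = ""

entry = BuildingBlock(
    component="Catalog Service",
    system="Data Discovery",
    function="Register and search dataset descriptions",
    description="Catalog over domain metadata records",
    interfaces=["OGC CSW", "OpenSearch"],
    implementation="Hypothetical web service",
    steward_organization="Example Data Facility",
    availability="Production",
    reference="https://example.org/catalog",
)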
8. Enterprise Architecture for Transformative Research and Collaboration Across the Geosciences
Requirements Process
• Workshop summaries
• Surveys
• Architecture designs
• Analyze what worked
• Incorporate social technologies
• Inventory CI building blocks
9. Enterprise Architecture for Transformative Research and Collaboration Across the Geosciences
Concerns
• Hitting the right level of granularity in the design
• Identifying necessary communication channels
• Accounting for all key perspectives
• Fixing the scope and technologies
• Balancing current and future requirements
• Harmonizing technical and social subsystems and managing interactions between them
• Uneven standardization and convergence across domains and functional components
• Constructing a self-organizing plug-and-play system
• Inventorying building blocks
10. Enterprise Architecture for Transformative Research and Collaboration Across the Geosciences
Summary
The system is defined by:
• Specifications for interfaces and interchange formats (the gateways)
• Definitions of key functional components at an abstract level: discovery, workflows, data processing, annotation, documentation
Technology needs to support:
• Communication between subsystems (people and machines)
• Collection of the metrics required to assess what is working (selection of the fittest; see the sketch below)
• Assembly of components
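As an illustrative sketch (not from the deck) of the kind of metric collection the summary calls for, a registry could count component invocations and rank them by adoption; the component names below are invented.

from collections import Counter

usage = Counter()

def record_use(component: str) -> None:
    """Record one invocation of a registered component."""
    usage[component] += 1

record_use("Catalog Service")
record_use("Catalog Service")
record_use("Vocabulary Service")

# Rank components by adoption -- the 'selection of the fittest' signal.
for name, count in usage.most_common():
    print(name, count)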