1. ICOS Services and Products
Alex Vermeulen
With contributions from:
ICOS Carbon Portal,
ICOS Head Office,
The ICOS RI Team
2. ICOS – all about reliable data
Integrated Carbon Observation System
Informs:
– Scientists
– Policy makers
– General public
Generate high quality observation data
Transparent – open data
From raw to elaborated data products
Clear data lifecycle
Community based
Catalyser of biogeochemical science
Improve knowledge and inform on
– Anthropogenic and natural fluxes
– Detect climate feedbacks
– Emission trends
Timely and factual information, beyond debate!
3. ICOS RI Services to members
• Uniform station design (meeting or exceeding global standards)
• Community defined common measurement protocols, standardized instrumentation
• Central data processing at (distributed) Thematic Centers (TC)
• Full processing chain from raw to full QC’ed product, traceable, transparent
• PIs contribute metadata, check data, add quality flags
• Central Calibration lab (Germany)
– Flask and 14CO2 analysis
– Provision & reassignment of spiked natural air working standards and targets (WMO scales)
• Station networks run by nations -> monitoring station assemblies
• Legal representation in ERIC, Head Office (Finland) plus Carbon Portal (Sweden)
• Central administration
• Coordination, together with heads of TCs and MSA chairs
• Communication
• International strategy and relations: WMO GAW, SOCAT, Fluxnet, GEO Carbon and GHG Initiative
• Central data portal, open access, attribution and usage tracking
• Financial contributions by member states
– Membership, partially dependent on GDP
– Station contribution, dependent on domain, Class (I, II, associated)
• Nations contribute 80% of the funding for HO, CP, TC and CAL; the rest comes from member contributions
4. Overview of ICOS revised strategy objectives (ICOS ERIC, 2018):
• Producing standardized high-precision long-term observational data
• Stimulating scientific studies and modelling efforts and providing a platform for data analysis
and synthesis
• Promoting technical developments
• Ensuring that ICOS is the European pillar of a global GHG observation system
• Communicating science-based knowledge towards society and contributing timely
information relevant to GHG policy and decision-making
[Figure: number of media articles per year, 2013-2017; annotation: unique headlines double]
[Figure: number of media articles and potential reach (millions) by country: Finland, United States, Germany, Switzerland, France, Italy, Sweden, Belgium, Netherlands, United Kingdom, Norway, Denmark, Canada, India, Spain, New Zealand, Portugal, Singapore, Australia, Luxembourg, Poland, Ireland, Liechtenstein, Russia, Czech Republic, Austria, China]
5. Socio-economic impact: from Observations to Decisions
[Diagram: Observations -> Services -> Knowledge -> Decisions. Labels: Data sharing; Data management (incl. metadata); From data to knowledge; Model-Data Fusion projects; SBSTA]
Decisions have economic impact.
How to measure the economic impact
of the UNFCCC process?
And the role of the RIs?
Any KPIs?
7. Achievements
Tremendous progress in ICOS Research Infrastructure
• Definition of data lifecycle
• Station design and protocols
• Station qualification (labelling) well underway to more than 130 stations in 2019!
• First high quality data products are now available
• ‘FAIR’ data portal ready
• Globally well connected: WMO GAW, Fluxnet, SOCAT, GEO Carbon and GHG Initiative,
IG3IS, Copernicus
• Innovations in measurements and data products (EUDAT2020, ENVRIplus, RINGO and
soon ENVRI-FAIR)
8. Strategy 2018-2025
• Expansion and consolidation of the network
– ICOS across-themes common metadata model and exchange
– Integration of campaign, TCCON, urban data, historic time series
– Ensure sustainability, performance
• Stimulate scientific studies
– Support scientific studies, provide platform for modelling and computing through CP
– Extend user base, connect to society with policy relevant results
• Innovation
– Continuous innovation, new types of observations, instruments
• Enhance international cooperation
– Promoting our standards, federated data portal, extend the user base
– Closer international cooperation, Fluxnet, SOCAT, IG3IS, GEO-C
• Communicate Science with society through data products and services
– UNFCCC, IPCC, Paris agreement
– City, regional networks and data products (forestry, agriculture)
– General communication on climate change, raising awareness
12. About FAIR...
• Stands for Findable, Accessible, Interoperable, Reusable
• Was coined by FORCE11 in 2014, out of discussions in the
Life Sciences community
• Not a standard, but a set of principles
• Has become the new fashion (and Holy Grail!)
• Is increasingly called for by funders & policy makers
*FORCE11, 2014 (https://www.force11.org/fairprinciples)
ICOS CP is ‘FAIR avant la lettre’, concept paper is from 2013!
13. ICOS Carbon Portal, system elements
Persistent identifiers, linking to data object and metadata: DOI and/or Handle system
PID based on checksum of data object: Data Integrity control
High granularity of Data Objects
Support for versioning
Support for collections
Semantic web (Web 3.0), linked open data, the web is the database, everything is a URL
Metadata based on ontology, all elements have (linked) URIs
Machines first, humans second
Machine actionable through standard http(s) protocol, RESTful API
NoSQL RDF database
Open SPARQL endpoint
Versioned metadata store, roll-back, time dependent queries
Fully scalable and portable (dockerized), ready for the cloud
Data objects in trusted long term repository (B2SAFE, 2 replicates)
Open software, shared through GitHub, GPL licence
Efficient, robust, flexible and safe
NGINX proxy redirects to services (https://service.domain.eu), domain determines RI
15. DOAP for (data) scientists
• IP-number concept is what makes the internet tick
• PID concept is what makes data tick
• PIDs play central role
PID = Persistent IDentifier
OK, but what does that mean?
16. [Diagram: from identifier to landing page, metadata and data object]
Search -> DOI or Hdl id: pre/suffix
GET https://doi.org/pre/suffix or https://handle.net/pre/suffix -> URL: meta.icos-cp.eu/objects/suffix
GET https://meta.icos-cp.eu/objects/suffix -> Webpage (html) or Metadata (JSON, XML, …)
DO URL: data.icos-cp.eu/objects/suffix
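The chain of URLs in this diagram can be sketched in Python. This is only an illustration under stated assumptions: the helper names are hypothetical, the suffix is a placeholder, the prefix is the ICOS DOI prefix listed on the next slide, and selecting JSON through an `Accept` header is an assumption about how the metadata format is chosen. The request is prepared but never sent.

```python
from urllib.request import Request

DOI_RESOLVER = "https://doi.org"        # resolver named on the slide
META_HOST = "https://meta.icos-cp.eu"   # landing page and metadata
DATA_HOST = "https://data.icos-cp.eu"   # the data object (DO) itself


def resolver_url(prefix: str, suffix: str) -> str:
    """URL that asks the DOI resolver where the object lives."""
    return f"{DOI_RESOLVER}/{prefix}/{suffix}"


def landing_page_url(suffix: str) -> str:
    """URL of the landing page / metadata for a data object."""
    return f"{META_HOST}/objects/{suffix}"


def data_object_url(suffix: str) -> str:
    """URL of the data object itself."""
    return f"{DATA_HOST}/objects/{suffix}"


def metadata_request(suffix: str, media_type: str = "application/json") -> Request:
    """Prepare (but do not send) a GET for machine-readable metadata.

    Assumption: the representation is selected with an Accept header.
    """
    return Request(
        landing_page_url(suffix),
        headers={"Accept": media_type},
        method="GET",
    )


if __name__ == "__main__":
    suffix = "EXAMPLE-SUFFIX"  # placeholder, not a real object
    print(resolver_url("10.18160", suffix))
    print(landing_page_url(suffix))
    print(data_object_url(suffix))
```

Passing the prepared request to `urllib.request.urlopen` would perform the actual GET; that step is left out so the sketch runs offline.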
17. ICOS PIDs
• ICOS
– handle.net prefix=11676;
– DOI prefix=10.18160
• ICOS DO suffix is SHA256 checksum of data (=fingerprint)
• Combination of phone, fingerprint and personal number:
– uniquely identifies digital object and makes it findable
– ‘Phone number’: prefix (country code) + suffix (local number)
• PIDs resolve to landing page URL through telephone book:
– Handle system: https://handle.net/prefix/suffix
– DOI: https://doi.org/prefix/suffix
• DOI = PID plus prescribed metadata set for scientific documents
• Metadata human & machine readable
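The idea of a checksum-based suffix can be illustrated with a short Python sketch. It shows only the fingerprint principle: the function names are hypothetical, and the hexadecimal encoding is an assumption, since the slides do not say how the checksum is encoded in the actual suffix.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """SHA-256 digest of the data as a hex string (encoding is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def verify(data: bytes, expected_hex: str) -> bool:
    """True if the bytes still match the fingerprint carried by the identifier."""
    return fingerprint(data) == expected_hex.lower()


if __name__ == "__main__":
    original = b"time,co2\n2018-01-01T00:00,410.2\n"  # made-up content
    pid_suffix = fingerprint(original)
    print(pid_suffix)
    print(verify(original, pid_suffix))          # unchanged data
    print(verify(original + b" ", pid_suffix))   # any change breaks the match
```

Because the identifier is derived from the content, anyone holding the data can recompute the digest and check integrity without contacting the repository.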
19. Ontology and semantic web
• Ontology: formal description of knowledge, naming and describing
objects, relations and properties
• Semantic web:
– Proposed development of the World Wide Web in which data in web
pages is structured and tagged in such a way that it can be read directly
by computers.
– Term coined by Tim Berners-Lee, inventor of the World Wide Web
– Extension of the World Wide Web through standards by the World Wide
Web Consortium (W3C)
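A minimal Python sketch of the underlying idea: statements are stored as subject, predicate, object triples whose terms are URIs, so a program can query them without parsing prose. The namespace, the station and property names, and the `match` helper are all hypothetical; a real system would use an RDF store and an ontology rather than a Python list.

```python
from typing import Iterable, Iterator, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


EX = "https://example.org/"  # hypothetical namespace, not an ICOS one

TRIPLES = [
    Triple(EX + "station/A", EX + "prop/hasTheme", EX + "theme/Atmosphere"),
    Triple(EX + "station/B", EX + "prop/hasTheme", EX + "theme/Ecosystem"),
    Triple(EX + "station/A", EX + "prop/locatedIn", EX + "country/FI"),
]


def match(
    triples: Iterable[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples that agree with every given term; None acts as a wildcard."""
    for t in triples:
        if subject is not None and t.subject != subject:
            continue
        if predicate is not None and t.predicate != predicate:
            continue
        if obj is not None and t.obj != obj:
            continue
        yield t


if __name__ == "__main__":
    # Which stations have a theme, and which one?
    for t in match(TRIPLES, predicate=EX + "prop/hasTheme"):
        print(t.subject, "->", t.obj)
```

SPARQL generalises this kind of pattern matching to many patterns joined through shared variables.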
20. Open (versioned) linked data store
• The web becomes the database: Web 3.0
• All data and metadata accessible through standard http(s), no drivers required
• Any portal or portal of portals can link to ICOS (meta)data and vice versa!
ICOS Carbon Portal adds:
• Data is streamed dynamically, efficient and secure
• Data is stored in a long-term trusted repository
• License check, usage tracking while streaming
• Services on top create and are triggered by URLs (REST interface) and PIDs as
parameters (enable citation of result)
21. News: Google Dataset Search
• Launched last week
• No ICOS Data there YET
• Based on schema.org ontology and CKAN
• As always with portals of portals:
– Many dead links
– Confusion
– Bad user experience
• But this is the future direction…
23. ICOS Data Flow
[Diagram: ICOS data flow]
• Measurement stations (national networks) deliver sensor data
• Ecosystem, Atmospheric and Oceanic Thematic Centres, with the Calibration Labs: standardized processing, quality assurance & control
• ICOS Carbon Portal, with the ICOS repository (data, metadata):
– Data ingestion
– PID and DOI minting
– Metadata services
– Data discovery & access
– Usage tracking
– Data visualisation
– Long term archiving
– Repository administration
– Preservation planning
– User community support
• Diverse user communities (User 1, User 2, User 3), including data producers and other portals
• High performance and throughput computing services
• Finalized and elaborated data products
• External metadata registry & catalogue services
• Connected services: B2STAGE, B2SAFE, B2FIND, B2SHARE, VRE (EGI), DataCite
24. ICOS Data
• Level 0
– raw sensor output (either mV or physical units)
• Level 1/NRT
– calibrated and automatically Quality Assured data
• Level 2
– final observation data products
• Level 3
– elaborated data products based on ICOS data plus other model or observation data
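The levels above can be modelled as a small ordered type. The following Python sketch is illustrative only: the names `DataLevel`, `lineage` and `may_use_external_inputs` are hypothetical, and treating each level as derived from all lower levels is an assumption read from the "raw to elaborated" lifecycle.

```python
from enum import IntEnum
from typing import List


class DataLevel(IntEnum):
    RAW = 0          # Level 0: raw sensor output (mV or physical units)
    NRT = 1          # Level 1/NRT: calibrated, automatically quality assured
    FINAL = 2        # Level 2: final observation data products
    ELABORATED = 3   # Level 3: elaborated products, ICOS plus other data


def lineage(level: DataLevel) -> List[DataLevel]:
    """Levels a product passes through, from raw up to the given level.

    Assumption: every level builds on all lower ones.
    """
    return [lvl for lvl in DataLevel if lvl <= level]


def may_use_external_inputs(level: DataLevel) -> bool:
    """Only Level 3 combines ICOS data with other model or observation data."""
    return level is DataLevel.ELABORATED


if __name__ == "__main__":
    for lvl in DataLevel:
        print(int(lvl), lvl.name, [l.name for l in lineage(lvl)])
```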
25. Data services
https://data.icos-cp.eu
• Access of data object link triggers:
– Licence check
– Usage count
– https download
• Data links can be harvested and linked transparently into other portals: license check,
download and usage count still under full control, no redistribution needed
• Fully interactive search frontend (REST)
• Data cart (in user profile)
• Preview interactive charts/maps (REST)
• Supports versions, collections (subsetting planned)
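The three steps triggered by a data object link (licence check, usage count, download) can be sketched as a toy in-memory service. This is not the Carbon Portal implementation: `ToyDataService`, `LicenceNotAccepted` and the per-user licence acceptance are hypothetical names and assumptions used only to show the order of the steps.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, Set


class LicenceNotAccepted(Exception):
    """Raised when a user requests data before accepting the licence."""


@dataclass
class ToyDataService:
    objects: Dict[str, bytes]                      # suffix -> data bytes
    accepted: Set[str] = field(default_factory=set)
    usage: Counter = field(default_factory=Counter)

    def accept_licence(self, user: str) -> None:
        self.accepted.add(user)

    def access(self, suffix: str, user: str) -> bytes:
        # 1. Licence check
        if user not in self.accepted:
            raise LicenceNotAccepted(user)
        data = self.objects[suffix]  # KeyError for unknown objects
        # 2. Usage count, recorded at the source for every access
        self.usage[suffix] += 1
        # 3. Download
        return data


if __name__ == "__main__":
    service = ToyDataService({"abc": b"co2 data"})
    service.accept_licence("alice")
    print(service.access("abc", "alice"))
    print(service.usage["abc"])
```

Since every access goes through the same link, another portal that harvests the link still passes through the licence check and the counter, which is why no redistribution of the data is needed.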
35. Dynamic linking of elaborated and obs data
https://stilt.icos-cp.eu/viewer/
36. VRE to run atmospheric transport model, workflow
https://stilt.icos-cp.eu/worker/
37. Interactive analysis tools for model results & data
• Analysis of simulated fossil fuel CO2 time series (RINGO)
• Evaluation of sampling strategies
• EUROCOM inversion intercomparison
• …
40. Achievements at the ICOS data portal
• Basic functionality of data portal ready
– Ingestion, search, preview, download, usage tracking
– Advanced, scalable and generic solution
– Transparent, secure, reliable, open source
• Operational data flows established
– from raw data to QC’ed data to elaborated products
• Synergies with H2020: EUDAT2020, ENVRI+
• International cooperation: WMO GAW, IG3IS, RDA, EOSC, Copernicus in-situ
• Elaborated products and services fully in use (EUROCOM, RINGO)
41. Next steps for ICOS data developments (2018-2019)
• Adding functionality: user needs
• Dedicated portal apps for important user groups
• Support ICOS related scientific communities with services and data
publication
– e.g. VERIFY, CHE, IG3IS, MEMO2, TRANSCOM
• Further metadata integration across domains (RINGO)
• Operational transport and footprint model for flask sampling planning and
analysis
• Operationalise and connect to elaborated services:
– Carbontracker, FLUXCOM, FluxEngine: Spatially and temporally resolved flux
data based on observations
43. FAIR principles
The principles refer to three types of entities:
• data (or any digital object),
• metadata (information about that digital object),
• and infrastructure.
For instance, principle F4 defines that both metadata and data are
registered or indexed in a searchable resource (the infrastructure
component).
44. FAIR: Findable
The first step in (re)using data is to find them. Metadata and data should be
easy to find for both humans and computers. Machine-readable metadata are
essential for automatic discovery of datasets and services, so this is an
essential component of the FAIRification process.
F1. (Meta)data are assigned a globally unique and persistent identifier
F2. Data are described with rich metadata
F3. Metadata clearly and explicitly include the identifier of the data they describe
F4. (Meta)data are registered or indexed in a searchable resource
45. FAIR: Accessible
Once the user finds the required data, she/he needs to know how they
can be accessed, possibly including authentication and authorisation.
A1. (Meta)data are retrievable by their identifier using a standardised communications protocol
• A1.1 The protocol is open, free, and universally implementable
46. FAIR: Interoperable
The data usually need to be integrated with other data. In addition, the data
need to interoperate with applications or workflows for analysis, storage,
and processing.
I1. (Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
I2. (Meta)data use vocabularies that follow FAIR principles
I3. (Meta)data include qualified references to other (meta)data
47. FAIR: Reusable
The ultimate goal of FAIR is to optimise the reuse of data. To achieve this,
metadata and data should be well-described so that they can be replicated
and/or combined in different settings.
R1. (Meta)data are richly described with a plurality of accurate and relevant attributes
R1.1. (Meta)data are released with a clear and accessible data usage license
R1.2. (Meta)data are associated with detailed provenance
R1.3. (Meta)data meet domain-relevant community standards
48. Example: Wikidata
#Overall causes of death ranking
#added before 2016-10
#defaultView:BubbleChart
SELECT ?cid ?cause (count(*) as ?count)
WHERE
{
?pid wdt:P31 wd:Q5 .
?pid wdt:P509 ?cid .
OPTIONAL {
?cid rdfs:label ?cause
filter (lang(?cause) = "en")
}
}
GROUP BY ?cid ?cause
ORDER BY DESC(?count) ASC(?cause)
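
A query like this can also be sent by a program over plain http(s). The Python sketch below prepares such a request for the public Wikidata query endpoint using the standard SPARQL protocol (`query` parameter on a GET, results requested as JSON through the `Accept` header). The helper names and the User-Agent string are hypothetical, and nothing is sent, so the example runs offline.

```python
from urllib.parse import parse_qs, urlencode, urlsplit
from urllib.request import Request

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

QUERY = """
SELECT ?cid ?cause (count(*) as ?count)
WHERE
{
  ?pid wdt:P31 wd:Q5 .
  ?pid wdt:P509 ?cid .
  OPTIONAL {
    ?cid rdfs:label ?cause
    filter (lang(?cause) = "en")
  }
}
GROUP BY ?cid ?cause
ORDER BY DESC(?count) ASC(?cause)
"""


def sparql_request(endpoint: str, query: str) -> Request:
    """Prepare (but do not send) a SPARQL-protocol GET request."""
    url = endpoint + "?" + urlencode({"query": query})
    headers = {
        "Accept": "application/sparql-results+json",
        "User-Agent": "example-sparql-sketch/0.1",  # hypothetical client name
    }
    return Request(url, headers=headers, method="GET")


def decode_query(url: str) -> str:
    """Recover the SPARQL text from a request URL (round-trip check)."""
    return parse_qs(urlsplit(url).query)["query"][0]


if __name__ == "__main__":
    req = sparql_request(WIKIDATA_ENDPOINT, QUERY)
    print(req.full_url[:80] + "...")
    print(decode_query(req.full_url) == QUERY)
```

The same pattern applies to any open SPARQL endpoint: only the endpoint URL and the query text change.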