Improving access to geospatial Big Data in the hydrology domain
Claudia Vitolo (1,2) and Wouter Buytaert (1)
(1) Imperial College London
(2) Brunel University London
Big Data and Spatial Analytics - Business and Industrial Section
Royal Statistical Society, London, UK - 18.11.2015
Outline
1. Background
2. Open Data and access approaches
3. Demo
4. Conclusions
1. Background
What is Hydrology?
Hydrology is the scientific study of the movement,
distribution, and quality of water on Earth.
Source: Hydrology. In Wikipedia, The Free Encyclopedia.
What do (river) hydrologists do?
▣ Collect data on climate, soil, geology, topography, etc.
▣ Set up model
▣ Calibrate model with observed water levels and stream flows
□ locations
□ time intervals
▣ Use models to analyse scenarios and make predictions
Big Data in Hydrology
Information:
▣ Topography & bathymetry
▣ Geology
▣ Soil & Moisture
▣ Land cover
▣ Weather & Climate
▣ Hydrometry
▣ Quality samples
▣ Groundwater
▣ Infrastructures
Format:
▣ Plain text
▣ Raster
▣ Vector
▣ Binary
▣ Markup Languages
▣ Graphs & networks
▣ CAD drawings
Big Data challenges:
▣ Get large volumes of heterogeneous data
▣ Mash up information and use it to make decisions
2. Open Data and data access approaches
Open Data
“Open data and content can be freely used, modified, and
shared by anyone for any purpose”
Source: http://opendefinition.org/
The National River Flow Archive (NRFA)
River flow data from gauging station networks across the UK
including networks operated by:
● Environment Agency (England),
● Natural Resources Wales,
● Scottish Environment Protection Agency,
● Rivers Agency (Northern Ireland).
http://nrfa.ceh.ac.uk/
Point & click (GUI) vs programmatic (API) data retrieval
GUI
PROS: simple and intuitive
CONS: not scalable, not flexible
API
PROS: scalable, fast and flexible
CONS: requires programming skills
Application Programming Interface
[Diagram: USER/CLIENT, API, SERVER]
The NRFA’s API
▣ metadata catalogue,
▣ catalogue filters,
▣ time series of gauged daily data,
▣ time series of catchment monthly rainfall.
How does an API work?
server/format/service?X=1&Y=2&Z=3
QUESTION A:
How do I get information on station “18019” from the NRFA catalogue?
ANSWER:
nrfaapps.ceh.ac.uk/nrfa/json/stationSummary?db=nrfa_public&stn=18019
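The URL pattern above can be sketched in a few lines of Python. This is only an illustration: the helper name `build_api_url` is hypothetical, the server, format, service and parameter values are copied from the slide, and the endpoint may no longer be live, so the sketch only assembles the URL and does not send a request.

```python
from urllib.parse import urlencode


def build_api_url(server, fmt, service, **params):
    """Assemble a URL of the form server/format/service?X=1&Y=2&Z=3."""
    # urlencode joins the key=value pairs with '&', in the order given.
    query = urlencode(params)
    return f"{server}/{fmt}/{service}?{query}"


# Values taken from the slide: station summary for station 18019.
station_summary_url = build_api_url(
    "nrfaapps.ceh.ac.uk/nrfa",
    "json",
    "stationSummary",
    db="nrfa_public",
    stn="18019",
)
print(station_summary_url)
```

Changing only the `stn` value gives the request for another station, which is what makes the programmatic route scalable.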
QUESTION B:
How do I get the time series of gauged daily data for station “18019”?
ANSWER:
nrfaapps.ceh.ac.uk/nrfa/xml/waterml2?db=nrfa_public&stn=18019&dt=gdf
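Once the XML response arrives, it has to be parsed into a time series. The sketch below assumes the response follows the WaterML 2.0 layout of time/value pairs; `SAMPLE_WATERML2` is a hand-written, heavily simplified stand-in with placeholder values, not real NRFA output, and `parse_timeseries` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

WML2 = "http://www.opengis.net/waterml/2.0"
NS = {"wml2": WML2}

# Hand-written, simplified stand-in for a WaterML 2.0 document.
# Dates and values are placeholders, not observed data.
SAMPLE_WATERML2 = """
<wml2:Collection xmlns:wml2="http://www.opengis.net/waterml/2.0">
  <wml2:MeasurementTimeseries>
    <wml2:point>
      <wml2:MeasurementTVP>
        <wml2:time>2000-01-01</wml2:time>
        <wml2:value>1.5</wml2:value>
      </wml2:MeasurementTVP>
    </wml2:point>
    <wml2:point>
      <wml2:MeasurementTVP>
        <wml2:time>2000-01-02</wml2:time>
        <wml2:value>2.0</wml2:value>
      </wml2:MeasurementTVP>
    </wml2:point>
  </wml2:MeasurementTimeseries>
</wml2:Collection>
""".strip()


def parse_timeseries(xml_text):
    """Return a list of (time, value) tuples from WaterML2-style XML."""
    root = ET.fromstring(xml_text)
    series = []
    # iter() walks the whole tree, so nesting depth does not matter.
    for tvp in root.iter(f"{{{WML2}}}MeasurementTVP"):
        time = tvp.findtext("wml2:time", namespaces=NS)
        value = tvp.findtext("wml2:value", namespaces=NS)
        series.append((time, float(value)))
    return series


for time, value in parse_timeseries(SAMPLE_WATERML2):
    print(time, value)
```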
From machine-readable to human-readable formats
JSON
XML
Plain text
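As a rough illustration of the step from a machine-readable to a human-readable format, the sketch below turns a JSON record into aligned plain text. The payload is hypothetical: its field names and values are placeholders and do not reflect the actual structure of an NRFA response.

```python
import json

# Hypothetical payload: field names and values are placeholders only.
SAMPLE_JSON = '{"id": "18019", "name": "Example station", "river": "Example river"}'


def json_to_plain_text(json_text):
    """Render a flat JSON object as aligned 'key : value' lines."""
    record = json.loads(json_text)
    # Pad every key to the length of the longest one so the colons line up.
    width = max(len(key) for key in record)
    return "\n".join(
        f"{key.ljust(width)} : {value}" for key, value in record.items()
    )


print(json_to_plain_text(SAMPLE_JSON))
```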
R libraries to interface APIs
▣ raincpc: download and process the Climate Prediction Center's (CPC) daily rainfall data
▣ rnoaa: an interface to NOAA Climate data API
▣ soilDB: read data from USDA-NCSS soil databases.
▣ waterData: retrieve, analyse, and calculate anomalies of daily hydrologic time series data.
▣ rnrfa: an interface to the UK National River Flow Archive data API.
3. Demo
The R package rnrfa
API interface:
▣ make request
▣ parse response
▣ retrieve and filter metadata catalogue
▣ get time series of gauged daily data and catchment monthly rainfall
API interface + external libraries:
▣ make maps
▣ create interactive tables and plots
▣ simplify and speed up reporting!
Example of dynamic report
▣ Find all the stations operated by Natural Resources Wales
▣ Retrieve time series of daily flows
▣ Run a basic analysis
▣ Create interactive plot, table and map
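The first three steps of this workflow can be sketched as follows. This is not the rnrfa demo itself: the catalogue and flow values are hypothetical in-memory stand-ins for what the API would return, and the function names are made up for illustration.

```python
from statistics import mean

# Hypothetical stand-ins for API results; IDs and flows are placeholders.
CATALOGUE = [
    {"id": "A", "operator": "Natural Resources Wales"},
    {"id": "B", "operator": "Environment Agency"},
    {"id": "C", "operator": "Natural Resources Wales"},
]
DAILY_FLOWS = {"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0], "C": [2.0, 4.0]}


def stations_by_operator(catalogue, operator):
    """Step 1: filter the metadata catalogue by operator."""
    return [s["id"] for s in catalogue if s["operator"] == operator]


def mean_flow_report(catalogue, flows, operator):
    """Steps 2 and 3: look up each station's series and average it."""
    return {
        station_id: mean(flows[station_id])
        for station_id in stations_by_operator(catalogue, operator)
    }


report = mean_flow_report(CATALOGUE, DAILY_FLOWS, "Natural Resources Wales")
for station_id, avg in report.items():
    print(f"{station_id}\t{avg:.2f}")
```

Because the same functions run unchanged for one station or for many, the report can be regenerated whenever the underlying data change.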
4. Conclusions
Summary
Big Data
Large volumes of heterogeneous spatio-temporal data are becoming increasingly open in the hydrology domain.
GUIs vs APIs
GUIs may be the easiest way to browse data but not the most efficient. APIs are fast and scalable.
Hardware/software
The hardware and software burden is on the data provider side. There is no need to update your datasets: you always access the latest version.
R as interface
R is an easy-to-learn language, widely used by statisticians and scientists. It provides a number of libraries to obtain and parse data from the web.
Reproducible workflows
Query databases, filter information, convert coordinates, generate plots and maps for reproducible reporting.
Scalability & Interoperability
An approach to gather information for single as well as multiple sites. At larger scale, computing can be made more efficient by using cloud facilities.
Thanks!
Any questions?
Claudia Vitolo
Twitter: @clavitolo
Email: claudia.vitolo@gmail.com
Blog: http://claudiavitolo.com/

More Related Content

What's hot

GlobusWorld 2021: Arecibo Observatory Data Movement
GlobusWorld 2021: Arecibo Observatory Data MovementGlobusWorld 2021: Arecibo Observatory Data Movement
GlobusWorld 2021: Arecibo Observatory Data Movement
Globus
 
Big data for SAS programmers
Big data for SAS programmersBig data for SAS programmers
Big data for SAS programmers
Kevin Lee
 
Linked Logainm: Enhancing Library Metadata using Linked Data of Irish Place N...
Linked Logainm: Enhancing Library Metadata using Linked Data of Irish Place N...Linked Logainm: Enhancing Library Metadata using Linked Data of Irish Place N...
Linked Logainm: Enhancing Library Metadata using Linked Data of Irish Place N...
nunoalexandrelopes
 
GlobusWorld 2021: Saving Arecibo Observatory Data
GlobusWorld 2021: Saving Arecibo Observatory DataGlobusWorld 2021: Saving Arecibo Observatory Data
GlobusWorld 2021: Saving Arecibo Observatory Data
Globus
 
PIC Tier-1 (LHCP Conference / Barcelona)
PIC Tier-1 (LHCP Conference / Barcelona)PIC Tier-1 (LHCP Conference / Barcelona)
PIC Tier-1 (LHCP Conference / Barcelona)
Josep Flix
 
ResourceSync Introduction at SWIB13
ResourceSync Introduction at SWIB13ResourceSync Introduction at SWIB13
ResourceSync Introduction at SWIB13
Simeon Warner
 
Linked Sensor Data cube
Linked Sensor Data cubeLinked Sensor Data cube
Linked Sensor Data cube
Laurent Lefort
 
More Complete Resultset Retrieval from Large Heterogeneous RDF Sources
More Complete Resultset Retrieval from Large Heterogeneous RDF SourcesMore Complete Resultset Retrieval from Large Heterogeneous RDF Sources
More Complete Resultset Retrieval from Large Heterogeneous RDF Sources
André Valdestilhas
 
2015 09 rda-pre-meeting_jk
2015 09 rda-pre-meeting_jk2015 09 rda-pre-meeting_jk
2015 09 rda-pre-meeting_jk
Johannes Keizer
 
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
The Statistical and Applied Mathematical Sciences Institute
 
Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...
Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...
Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...
AIMS (Agricultural Information Management Standards)
 
Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...
Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...
Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...
ExtremeEarth
 
Forensic Readiness on Hadoop Platform: Non-Ambari HDP as a Case Study
Forensic Readiness on Hadoop Platform: Non-Ambari HDP as a Case StudyForensic Readiness on Hadoop Platform: Non-Ambari HDP as a Case Study
Forensic Readiness on Hadoop Platform: Non-Ambari HDP as a Case Study
IJCSIS Research Publications
 
Querying Linked Geospatial Data with Incomplete Information
Querying Linked Geospatial Data with  Incomplete InformationQuerying Linked Geospatial Data with  Incomplete Information
Querying Linked Geospatial Data with Incomplete Information
Charalampos (Babis) Nikolaou
 
Dynamic Data Center concept
Dynamic Data Center concept  Dynamic Data Center concept
Dynamic Data Center concept
Miha Ahronovitz
 
CourboSpark: Decision Tree for Time-series on Spark
CourboSpark: Decision Tree for Time-series on SparkCourboSpark: Decision Tree for Time-series on Spark
CourboSpark: Decision Tree for Time-series on Spark
DataWorks Summit
 
Big Linked Data Federation - ExtremeEarth Open Workshop
Big Linked Data Federation - ExtremeEarth Open WorkshopBig Linked Data Federation - ExtremeEarth Open Workshop
Big Linked Data Federation - ExtremeEarth Open Workshop
ExtremeEarth
 
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
Matthäus Zloch
 

What's hot (20)

GlobusWorld 2021: Arecibo Observatory Data Movement
GlobusWorld 2021: Arecibo Observatory Data MovementGlobusWorld 2021: Arecibo Observatory Data Movement
GlobusWorld 2021: Arecibo Observatory Data Movement
 
Big data for SAS programmers
Big data for SAS programmersBig data for SAS programmers
Big data for SAS programmers
 
Linked Logainm: Enhancing Library Metadata using Linked Data of Irish Place N...
Linked Logainm: Enhancing Library Metadata using Linked Data of Irish Place N...Linked Logainm: Enhancing Library Metadata using Linked Data of Irish Place N...
Linked Logainm: Enhancing Library Metadata using Linked Data of Irish Place N...
 
GlobusWorld 2021: Saving Arecibo Observatory Data
GlobusWorld 2021: Saving Arecibo Observatory DataGlobusWorld 2021: Saving Arecibo Observatory Data
GlobusWorld 2021: Saving Arecibo Observatory Data
 
PIC Tier-1 (LHCP Conference / Barcelona)
PIC Tier-1 (LHCP Conference / Barcelona)PIC Tier-1 (LHCP Conference / Barcelona)
PIC Tier-1 (LHCP Conference / Barcelona)
 
ResourceSync Introduction at SWIB13
ResourceSync Introduction at SWIB13ResourceSync Introduction at SWIB13
ResourceSync Introduction at SWIB13
 
Linked Sensor Data cube
Linked Sensor Data cubeLinked Sensor Data cube
Linked Sensor Data cube
 
HDF Town Hall
HDF Town HallHDF Town Hall
HDF Town Hall
 
More Complete Resultset Retrieval from Large Heterogeneous RDF Sources
More Complete Resultset Retrieval from Large Heterogeneous RDF SourcesMore Complete Resultset Retrieval from Large Heterogeneous RDF Sources
More Complete Resultset Retrieval from Large Heterogeneous RDF Sources
 
2015 09 rda-pre-meeting_jk
2015 09 rda-pre-meeting_jk2015 09 rda-pre-meeting_jk
2015 09 rda-pre-meeting_jk
 
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
CLIM Program: Remote Sensing Workshop, High Performance Computing and Spatial...
 
Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...
Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...
Dynamic Integrations of Crop Data and Corresponding Meteorological Data based...
 
Dynamic integrations of crop data and corresponding meteorological data based...
Dynamic integrations of crop data and corresponding meteorological data based...Dynamic integrations of crop data and corresponding meteorological data based...
Dynamic integrations of crop data and corresponding meteorological data based...
 
Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...
Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...
Artificial Intelligence and Big Data Technologies for Copernicus Data: the Ex...
 
Forensic Readiness on Hadoop Platform: Non-Ambari HDP as a Case Study
Forensic Readiness on Hadoop Platform: Non-Ambari HDP as a Case StudyForensic Readiness on Hadoop Platform: Non-Ambari HDP as a Case Study
Forensic Readiness on Hadoop Platform: Non-Ambari HDP as a Case Study
 
Querying Linked Geospatial Data with Incomplete Information
Querying Linked Geospatial Data with  Incomplete InformationQuerying Linked Geospatial Data with  Incomplete Information
Querying Linked Geospatial Data with Incomplete Information
 
Dynamic Data Center concept
Dynamic Data Center concept  Dynamic Data Center concept
Dynamic Data Center concept
 
CourboSpark: Decision Tree for Time-series on Spark
CourboSpark: Decision Tree for Time-series on SparkCourboSpark: Decision Tree for Time-series on Spark
CourboSpark: Decision Tree for Time-series on Spark
 
Big Linked Data Federation - ExtremeEarth Open Workshop
Big Linked Data Federation - ExtremeEarth Open WorkshopBig Linked Data Federation - ExtremeEarth Open Workshop
Big Linked Data Federation - ExtremeEarth Open Workshop
 
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
ESWC 2019 - A Software Framework and Datasets for the Analysis of Graphs Meas...
 

Viewers also liked

Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
Claudia Vitolo
 
Hydrology and Water resources software
Hydrology and Water resources softwareHydrology and Water resources software
Hydrology and Water resources software
Hariyali Pujara
 
Hydrology
HydrologyHydrology
Advanced hydrology & water resource engg
Advanced hydrology & water resource enggAdvanced hydrology & water resource engg
Advanced hydrology & water resource enggCivil Engineers
 
(3) irrigation hydrology
(3) irrigation hydrology(3) irrigation hydrology
(3) irrigation hydrology
Prakash Pandya
 

Viewers also liked (6)

Definition and scope
Definition and scopeDefinition and scope
Definition and scope
 
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
Data Assimilation for the Lorenz (1963) Model using Ensemble and Extended Kal...
 
Hydrology and Water resources software
Hydrology and Water resources softwareHydrology and Water resources software
Hydrology and Water resources software
 
Hydrology
HydrologyHydrology
Hydrology
 
Advanced hydrology & water resource engg
Advanced hydrology & water resource enggAdvanced hydrology & water resource engg
Advanced hydrology & water resource engg
 
(3) irrigation hydrology
(3) irrigation hydrology(3) irrigation hydrology
(3) irrigation hydrology
 

Similar to Improving access to geospatial Big Data in the hydrology domain

A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear...
A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear...A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear...
A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear...
DataWorks Summit/Hadoop Summit
 
NOAA Big Data Project Handout
NOAA Big Data Project HandoutNOAA Big Data Project Handout
NOAA Big Data Project Handout
Amy Gaskins
 
‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...
‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...
‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...
CONUL Conference
 
Presentation
PresentationPresentation
Presentationbolu804
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process Scenario
CHAKER ALLAOUI
 
Experiences as a producer, consumer and observer of open data
Experiences as a producer, consumer and observer of open dataExperiences as a producer, consumer and observer of open data
Experiences as a producer, consumer and observer of open data
ProgCity
 
DSD-INT 2015 -EU FP7 project “FAST” introduction - Mindert de Vries
DSD-INT 2015 -EU FP7 project “FAST” introduction - Mindert de VriesDSD-INT 2015 -EU FP7 project “FAST” introduction - Mindert de Vries
DSD-INT 2015 -EU FP7 project “FAST” introduction - Mindert de Vries
Deltares
 
Facing data sharing in a heterogeneous research community: lights and shadows...
Facing data sharing in a heterogeneous research community: lights and shadows...Facing data sharing in a heterogeneous research community: lights and shadows...
Facing data sharing in a heterogeneous research community: lights and shadows...
Research Data Alliance
 
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Blue BRIDGE
 
Analysis of National Footprint Accounts using MapReduce, Hive, Pig and Sqoop
Analysis of National Footprint Accounts using MapReduce, Hive, Pig and SqoopAnalysis of National Footprint Accounts using MapReduce, Hive, Pig and Sqoop
Analysis of National Footprint Accounts using MapReduce, Hive, Pig and Sqoop
sushantparte
 
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
aceas13tern
 
Dealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time InformationDealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time Information
Edward Curry
 
2008-02-11: EPA DataFed Presentation
2008-02-11: EPA DataFed Presentation2008-02-11: EPA DataFed Presentation
2008-02-11: EPA DataFed PresentationRudolf Husar
 
Esri and the Scientific Community
Esri and the Scientific CommunityEsri and the Scientific Community
Esri and the Scientific Community
Dawn Wright
 
A genealogy of data assemblages: tracing the geospatial open access and open ...
A genealogy of data assemblages: tracing the geospatial open access and open ...A genealogy of data assemblages: tracing the geospatial open access and open ...
A genealogy of data assemblages: tracing the geospatial open access and open ...
Communication and Media Studies, Carleton University
 
The Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionThe Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, Evolution
Ian Foster
 
2019 02-12 eosc-hub for eo
2019 02-12 eosc-hub for eo2019 02-12 eosc-hub for eo
2019 02-12 eosc-hub for eo
EGI Federation
 
The AGINFRA+ Virtual Research Environment (VRE)
The AGINFRA+ Virtual Research Environment (VRE)The AGINFRA+ Virtual Research Environment (VRE)
The AGINFRA+ Virtual Research Environment (VRE)
AGINFRA
 
Show me the data
Show me the dataShow me the data
Show me the data
Rich Pauloo
 

Similar to Improving access to geospatial Big Data in the hydrology domain (20)

A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear...
A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear...A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear...
A Data Lake and a Data Lab to Optimize Operations and Safety within a nuclear...
 
NOAA Big Data Project Handout
NOAA Big Data Project HandoutNOAA Big Data Project Handout
NOAA Big Data Project Handout
 
Open Spatial Data: Sources and Tools
Open Spatial Data: Sources and ToolsOpen Spatial Data: Sources and Tools
Open Spatial Data: Sources and Tools
 
‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...
‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...
‘Facilitating User Engagement by Enriching Library Data using Semantic Techno...
 
Presentation
PresentationPresentation
Presentation
 
Big Data to SMART Data : Process Scenario
Big Data to SMART Data : Process ScenarioBig Data to SMART Data : Process Scenario
Big Data to SMART Data : Process Scenario
 
Experiences as a producer, consumer and observer of open data
Experiences as a producer, consumer and observer of open dataExperiences as a producer, consumer and observer of open data
Experiences as a producer, consumer and observer of open data
 
DSD-INT 2015 -EU FP7 project “FAST” introduction - Mindert de Vries
DSD-INT 2015 -EU FP7 project “FAST” introduction - Mindert de VriesDSD-INT 2015 -EU FP7 project “FAST” introduction - Mindert de Vries
DSD-INT 2015 -EU FP7 project “FAST” introduction - Mindert de Vries
 
Facing data sharing in a heterogeneous research community: lights and shadows...
Facing data sharing in a heterogeneous research community: lights and shadows...Facing data sharing in a heterogeneous research community: lights and shadows...
Facing data sharing in a heterogeneous research community: lights and shadows...
 
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
Bridging Environmental Data Providers and SeaDataNet DIVA Service within a Co...
 
Analysis of National Footprint Accounts using MapReduce, Hive, Pig and Sqoop
Analysis of National Footprint Accounts using MapReduce, Hive, Pig and SqoopAnalysis of National Footprint Accounts using MapReduce, Hive, Pig and Sqoop
Analysis of National Footprint Accounts using MapReduce, Hive, Pig and Sqoop
 
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
SPatially Explicit Data Discovery, Extraction and Evaluation Services (SPEDDE...
 
Dealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time InformationDealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time Information
 
2008-02-11: EPA DataFed Presentation
2008-02-11: EPA DataFed Presentation2008-02-11: EPA DataFed Presentation
2008-02-11: EPA DataFed Presentation
 
Esri and the Scientific Community
Esri and the Scientific CommunityEsri and the Scientific Community
Esri and the Scientific Community
 
A genealogy of data assemblages: tracing the geospatial open access and open ...
A genealogy of data assemblages: tracing the geospatial open access and open ...A genealogy of data assemblages: tracing the geospatial open access and open ...
A genealogy of data assemblages: tracing the geospatial open access and open ...
 
The Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, EvolutionThe Earth System Grid Federation: Origins, Current State, Evolution
The Earth System Grid Federation: Origins, Current State, Evolution
 
2019 02-12 eosc-hub for eo
2019 02-12 eosc-hub for eo2019 02-12 eosc-hub for eo
2019 02-12 eosc-hub for eo
 
The AGINFRA+ Virtual Research Environment (VRE)
The AGINFRA+ Virtual Research Environment (VRE)The AGINFRA+ Virtual Research Environment (VRE)
The AGINFRA+ Virtual Research Environment (VRE)
 
Show me the data
Show me the dataShow me the data
Show me the data
 

Improving access to geospatial Big Data in the hydrology domain

  • 1. Improving access to geospatial Big Data in the hydrology domain. Claudia Vitolo (1,2) and Wouter Buytaert (1). (1) Imperial College London, (2) Brunel University London. Big Data and Spatial Analytics - Business and Industrial Section, Royal Statistical Society, London, UK - 18.11.2015
  • 2. Outline 1. Background 2. Open Data and access approaches 3. Demo 4. Conclusions
  • 4. What is Hydrology? Hydrology is the scientific study of the movement, distribution, and quality of water on Earth. Source: Hydrology. In Wikipedia, The Free Encyclopedia.
  • 5. What do (river) hydrologists do? ▣ Collect data on climate, soil, geology, topography, etc. ▣ Set up a model ▣ Calibrate the model with observed water levels and stream flows □ locations □ time intervals ▣ Use models to analyse scenarios and make predictions
  • 6. Big Data in Hydrology Information: ▣ Topography & bathymetry ▣ Geology ▣ Soil & Moisture ▣ Land cover ▣ Weather & Climate ▣ Hydrometry ▣ Quality samples ▣ Groundwater ▣ Infrastructures Format: ▣ Plain text ▣ Raster ▣ Vector ▣ Binary ▣ Markup Languages ▣ Graphs & networks ▣ CAD drawings
  • 10. Big Data challenges: ▣ Get large volume of heterogeneous data ▣ Mash-up information and use it to make decisions
  • 11. 2. Open Data and data access approaches
  • 12. Open Data “Open data and content can be freely used, modified, and shared by anyone for any purpose” Source: http://opendefinition.org/
  • 17. The National River Flow Archive (NRFA) River flow data from gauging station networks across the UK including networks operated by: ● Environment Agency (England), ● Natural Resources Wales, ● Scottish Environment Protection Agency, ● Rivers Agency (Northern Ireland). http://nrfa.ceh.ac.uk/
  • 18. Point & click (GUI) vs programmatic (API) data retrieval. GUI PROS: simple and intuitive. CONS: not scalable, not flexible. API PROS: scalable, fast and flexible. CONS: requires programming skills.
  • 20. The NRFA’s API ▣ metadata catalogue, ▣ catalogue filters, ▣ time series of gauged daily data, ▣ time series of catchment monthly rainfall.
  • 23. How does an API work? server/format/service?X=1&Y=2&Z=3 QUESTION A: How do I get information on station “18019” from the NRFA catalogue? ANSWER: nrfaapps.ceh.ac.uk/nrfa/json/stationSummary?db=nrfa_public&stn=18019
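
The server/format/service?X=1&Y=2&Z=3 pattern can be sketched in a few lines of Python, using only the standard library (the talk itself works in R). The helper name `build_request` is hypothetical; the host, format, service and parameter names are copied from the answer to Question A, and the sketch only assembles the request string without contacting the server.

```python
from urllib.parse import urlencode


def build_request(server, fmt, service, **params):
    """Assemble server/format/service?key=value&... (hypothetical helper)."""
    query = urlencode(params)  # keeps keyword order and escapes special characters
    return f"{server}/{fmt}/{service}?{query}"


# Values taken from the answer to Question A in the slides.
url = build_request(
    "nrfaapps.ceh.ac.uk/nrfa", "json", "stationSummary",
    db="nrfa_public", stn="18019",
)
print(url)
```

Changing only the `stn` argument would give the equivalent request for another station, which is what makes the programmatic route scale beyond point and click.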
  • 25. How does an API work? server/format/service?X=1&Y=2&Z=3 QUESTION B: How do I get the time series of gauged daily data for station “18019”? ANSWER: nrfaapps.ceh.ac.uk/nrfa/xml/waterml2?db=nrfa_public&stn=18019&dt=gdf
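
Going the other way, the answer to Question B can be read back against the same pattern. The sketch below is a hypothetical helper (`split_request` is not part of any NRFA tool) that splits a request string into server, format, service and parameters. It assumes the last two path segments are always the format and the service, which holds for the two URLs shown in the slides.

```python
from urllib.parse import parse_qsl, urlsplit


def split_request(url):
    """Split server/format/service?query into its parts (hypothetical helper)."""
    parts = urlsplit("//" + url)  # the leading "//" lets urlsplit recognise the host
    *prefix, fmt, service = parts.path.strip("/").split("/")
    server = "/".join([parts.netloc, *prefix])
    return server, fmt, service, dict(parse_qsl(parts.query))


# URL copied from the answer to Question B in the slides.
server, fmt, service, params = split_request(
    "nrfaapps.ceh.ac.uk/nrfa/xml/waterml2?db=nrfa_public&stn=18019&dt=gdf"
)
print(server, fmt, service, params)
```

Here the format is `xml`, the service is `waterml2`, and the `dt=gdf` parameter is what distinguishes the gauged daily data request from other series for the same station.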
  • 26. From machine-readable to human-readable formats: JSON, XML, Plain text
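
As a minimal sketch of the machine-readable to human-readable step, the snippet below parses a JSON string and prints it as aligned plain text. The payload is made up for illustration and is not a real NRFA response; the field names and the helper `to_plain_text` are assumptions.

```python
import json

# Made-up payload for illustration only; not a real NRFA response.
raw = '{"id": "18019", "name": "Example station", "river": "Example river"}'


def to_plain_text(json_text):
    """Render a flat JSON object as aligned 'key : value' lines."""
    record = json.loads(json_text)
    width = max(len(key) for key in record)  # pad keys to the longest one
    return "\n".join(
        f"{key.ljust(width)} : {value}" for key, value in record.items()
    )


print(to_plain_text(raw))
```

An XML response would need a different parser (for example `xml.etree.ElementTree`), but the idea of parsing once and then formatting for the reader is the same.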
  • 27. R libraries to interface APIs ▣ raincpc: download and process the Climate Prediction Center's (CPC) daily rainfall data ▣ rnoaa: an interface to NOAA Climate data API ▣ soilDB: read data from USDA-NCSS soil databases. ▣ waterData: retrieve, analyse, and calculate anomalies of daily hydrologic time series data. ▣ rnrfa: an interface to the UK National River Flow Archive data API.
  • 29. The R package rnrfa API interface: ▣ make request ▣ parse response ▣ retrieve and filter metadata catalogue ▣ get time series of gauged daily data and catchment monthly rainfall API interface + external libraries: ▣ make maps ▣ create interactive tables and plots ▣ simplify and speed up reporting!
  • 30. Example of dynamic report ▣ Find all the stations operated by Natural Resources Wales ▣ Retrieve time series of daily flows ▣ Run a basic analysis ▣ Create interactive plot, table and map
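
The first three steps of such a report (filter the catalogue by operator, fetch the flows, run a basic analysis) can be sketched as follows. This is not the rnrfa workflow shown in the demo: the catalogue entries, station ids and flow values are made up, and the in-memory dictionaries stand in for what would be API responses.

```python
from statistics import mean

# Made-up catalogue and flow records, for illustration only.
catalogue = [
    {"id": "A1", "operator": "Natural Resources Wales"},
    {"id": "B2", "operator": "Environment Agency"},
    {"id": "C3", "operator": "Natural Resources Wales"},
]
daily_flows = {"A1": [1.0, 2.0, 3.0], "B2": [4.0, 4.0], "C3": [0.5, 1.5]}


def stations_by_operator(records, operator):
    """Return the ids of all stations run by the given operator."""
    return [r["id"] for r in records if r["operator"] == operator]


def mean_flow_report(records, flows, operator):
    """Mean daily flow per station for one operator (a very basic analysis)."""
    return {sid: mean(flows[sid]) for sid in stations_by_operator(records, operator)}


report = mean_flow_report(catalogue, daily_flows, "Natural Resources Wales")
for station, value in report.items():
    print(f"{station}: mean flow {value:.2f}")
```

Because the filter and the analysis are plain functions, rerunning the report for another operator only means changing one argument, which is the reproducibility point the talk makes.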
  • 32. Summary Big Data: Large volumes of heterogeneous spatio-temporal data are becoming increasingly open in the hydrology domain. GUIs vs APIs: GUIs may be the easiest way to browse data, but not the most efficient; APIs are fast and scalable. Hardware/software: The hardware and software burden is on the data provider side. There is no need to update your datasets; you always access the latest version. R as interface: R is an easy-to-learn language, widely used by statisticians and scientists. It provides a number of libraries to obtain and parse data from the web. Reproducible workflows: Query databases, filter information, convert coordinates, generate plots and maps for reproducible reporting. Scalability & Interoperability: An approach to gather information for single as well as multiple sites. At larger scale, computing can be made more efficient by using cloud facilities.
  • 39. Thanks! Any questions? Claudia Vitolo Twitter: @clavitolo Email: claudia.vitolo@gmail.com Blog: http://claudiavitolo.com/