BigDataEurope - Supporting the
Variety Dimension of Big Data
Mohamed Nadjib MAMI - Fraunhofer IAIS
ICWE17 - 06.06.2017
Big Data Europe - the Project
◎ EU Horizon 2020-programme-funded
◎ Coordination & Support action (CSA) Project
o Show societal value of Big data to 7 Domains
o Lower barrier for using Big Data technologies
=> BigDataEurope Platform
Consortium Partners
Consortium of 17 Partners
o Industry, SMEs, universities, research institutes, etc.
Big Data Europe - The Platform
◎ Integrator of Big Data technologies
o Easy to use/get started (plug-and-play)
o Flexible, Customisable
◎ Bundles with only Open Source solutions
o Data Storage
o Message Passing
o Data Processing
o Data Searching & Publishing
◎ Publicly released in May 2017
BDE Platform - Components (some)
◎ Search/Indexing: Apache Solr, Elasticsearch
◎ Data Processing: Apache Spark, Apache Flink
◎ Data Acquisition: Apache Flume
◎ Message Passing: Apache Kafka
◎ Data Storage: Apache Hadoop, Apache Cassandra, Apache Hive, PostGIS
◎ Semantic Components: Strabon, Sextant, GeoTriples, Silk, SEMAGROW, LIMES, 4Store, OpenLink Virtuoso
BDE Platform - Architecture
◎ Support Layer
o Init Daemon
o GUIs
o Base Setup
◎ App Layer
o Traffic Forecast
o Satellite Image Analysis
o Real-time Stream Monitoring
o ...
◎ Platform Layer
o Spark
o Flink
o Semantic Layer: Ontario, SANSA, Semagrow
o Kafka
o ...
◎ Resource Management Layer (Swarm)
◎ Hardware Layer
o Premises
o Cloud (AWS, GCE, MS Azure, …)
◎ Data Layer
o Hadoop, NoSQL Store, Cassandra, Elasticsearch, ..., RDF Store
o Semantic Data Lake (Unified View)
BDE Platform - Hardware & Virtualization
◎ Docker used for packaging and deploying applications
◎ Based on containers:
o A lightweight environment to make a piece of
software run in isolation
❖ Shares the host operating system kernel (unlike
VMs)
❖ Reduces conflicts e.g., versions
◎ Docker Compose: creates multi-container applications
BDE Platform - Resource Management
◎ Swarm (mode) used for managing, scheduling and
orchestrating Docker containers in multi-node clusters
◎ It provides:
o Scalability and Fault Tolerance
o Containers interlinking
o Log-based monitoring
◎ Separate hardware from software management
◎ Based on Services
o Swarm execution unit running a Docker Image
BDE Platform - Support Layer
◎ Init Daemon: orchestrates the initialization process of
the components (containers of Docker Compose):
o Components report their initialization progress
o It validates whether a specific component can start
o It specifies the dependencies between services
o It indicates where human interaction is required
◎ Examples:
o Wait for data to load into HDFS before starting a Spark job
o Wait for the Spark Master to start successfully before starting a Worker
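The dependency check described above can be sketched in a few lines. This is only an illustrative sketch, not the Init Daemon's actual API: the class name `InitDaemon`, its methods, the step names and the status strings are all assumptions made for the example.

```python
from dataclasses import dataclass, field

# Hypothetical status values, mirroring the three states shown by the pipeline monitor.
NOT_STARTED, RUNNING, FINISHED = "not started", "running", "finished"


@dataclass
class InitDaemon:
    # Hypothetical structure: step name -> steps that must finish first.
    dependencies: dict
    status: dict = field(default_factory=dict)

    def report(self, step, state):
        """A component reports its initialization progress."""
        self.status[step] = state

    def can_start(self, step):
        """A step may start once every step it depends on has finished."""
        return all(
            self.status.get(dep, NOT_STARTED) == FINISHED
            for dep in self.dependencies.get(step, [])
        )


# Step names are made up to match the two examples on the slide.
daemon = InitDaemon(dependencies={
    "load_data_to_hdfs": [],
    "spark_master": [],
    "spark_worker": ["spark_master"],
    "spark_job": ["load_data_to_hdfs", "spark_worker"],
})

daemon.report("spark_master", RUNNING)
worker_before = daemon.can_start("spark_worker")  # master has not finished yet
daemon.report("spark_master", FINISHED)
worker_after = daemon.can_start("spark_worker")   # master finished, worker may start
job_ready = daemon.can_start("spark_job")         # data load and worker still pending
```

Keeping the dependencies in one place means a component only has to report its own progress and ask whether it may start.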
BDE Platform - User Interfaces
[Figure: a pipeline of three steps: Component 1, Component 2, Component 3]
Pipeline Builder: creates a step-by-step dependency
pipeline (fed to the init daemon)
BDE Platform - User Interfaces
[Figure: Component 1 (finished), Component 2 (finished), Component 3 (in progress)]
Pipeline Monitor: displays the status (not started, running or finished) of
components in a running pipeline (retrieved from the init daemon)
BDE Platform - User Interfaces
Swarm UI: clones a Git repository containing a
pipeline, then deploys, controls and monitors it on Swarm
BDE Platform - User Interfaces
Integrator UI: displays the dashboard of each running
component in a unified interface
BDE Platform - Semantic Layer > Ontario
◎ Data Lake or Swamp?
o Repository of data in its original formats
o Structured, semi-structured, unstructured
o Without unified schema
◎ Semantic Data Lake (Ontario)
o Add a Semantic Layer on top of the source datasets
❖ The data is semantically lifted using ontology
terms
❖ Provide a uniform view over nonuniform data
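As a rough illustration of semantic lifting, the sketch below turns rows of a CSV source into triples using ontology terms. The column-to-property mapping, the `ex:obs` subject prefix and the sample row (including the value 123) are invented for the example. Only the prefixed terms (`qb:Observation`, `gho:Country`, `gho:Disease`, `eg:incidence`) are taken from the query shown later in the deck.

```python
import csv
import io

# Hypothetical mapping from CSV column names to ontology terms.
COLUMN_TO_PROPERTY = {
    "country": "gho:Country",
    "disease": "gho:Disease",
    "incidence": "eg:incidence",
}


def lift_rows(csv_text, subject_prefix="ex:obs"):
    """Yield (subject, predicate, object) triples for each CSV row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for index, row in enumerate(reader, start=1):
        subject = f"{subject_prefix}{index}"
        yield (subject, "rdf:type", "qb:Observation")
        for column, prop in COLUMN_TO_PROPERTY.items():
            if row.get(column):
                yield (subject, prop, row[column])


# Made-up sample data, for illustration only.
sample = "country,disease,incidence\nIndia,Tuberculosis,123\n"
triples = list(lift_rows(sample))
```

Once every source is described with the same terms, a query can address `gho:Country` without knowing whether the value sits in a CSV column, an XML element or an RDF store.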
BDE Platform - Semantic Layer > Ontario
SELECT (COUNT(DISTINCT ?publication) AS ?no_of_publications)
       (COUNT(?deaths) AS ?no_of_deaths)
WHERE {
  ?item a qb:Observation .
  ?item gho:Country ?country .
  ?item gho:Disease ?disease .
  ?item att:unitMeasure gho:Measure .
  ?item eg:incidence ?deaths .
  ?country rdfs:label "India" .
  ?disease rdfs:label "Tuberculosis" .
  ?trial a ct:trials .
  ?trial ct:condition ?condition .
  ?trial ct:location ?location .
  ?trial ct:reference ?publication .
  ?condition owl:sameAs ?disease .
  ?location redd:locatedIn ?country .
  ?publication ct:citation ?citation .
}
Query: “number of distinct publications and number of
distinct deaths due to the disease Tuberculosis in India”
BDE Platform - Semantic Layer > Ontario
1. Query Parsing & Validation
2. Planning
3. Meta-wrapper invocation
o Publications Meta-wrapper
o Trials Meta-wrapper
o Conditions Meta-wrapper
o Observations Meta-wrapper
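One plausible reading of the planning step is that triple patterns are grouped by subject and each group is routed to the meta-wrapper responsible for its class. The sketch below illustrates that reading only. The grouping strategy, the function names and the routing table are assumptions, not a description of Ontario's actual planner.

```python
from collections import defaultdict


def group_by_subject(triple_patterns):
    """Group (subject, predicate, object) patterns into one star per subject."""
    stars = defaultdict(list)
    for subject, predicate, obj in triple_patterns:
        stars[subject].append((predicate, obj))
    return dict(stars)


# Hypothetical routing table: class of a star -> meta-wrapper handling it.
META_WRAPPERS = {
    "qb:Observation": "Observations meta-wrapper",
    "ct:trials": "Trials meta-wrapper",
}


def plan(triple_patterns):
    """Return {subject variable: meta-wrapper name} for stars with a known class."""
    routing = {}
    for subject, star in group_by_subject(triple_patterns).items():
        for predicate, obj in star:
            if predicate == "a" and obj in META_WRAPPERS:
                routing[subject] = META_WRAPPERS[obj]
    return routing


# A subset of the patterns from the example query.
patterns = [
    ("?item", "a", "qb:Observation"),
    ("?item", "gho:Country", "?country"),
    ("?trial", "a", "ct:trials"),
    ("?trial", "ct:condition", "?condition"),
]
routing = plan(patterns)
```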
BDE Platform - Semantic Layer > Ontario
4. Wrapper Selection & Query Translation
o Meta-wrappers: Publications, Observations, Trials, Conditions
o Wrappers: Wrapper (XML), Wrapper (CSV), Wrapper (RDF)
o Mapping rules drive the translation into the source query language ([Xpath], [Sparql], ...), e.g.:
?item gho:Country ?country .
?item gho:Disease ?disease .
...
=>
SELECT country, disease, ...
FROM Observations
5. Query Execution
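The translation in step 4 can be sketched for the tabular case as follows. This is a minimal sketch under stated assumptions: the `MAPPING_RULES` dictionary, the regular expression and the function `translate_star` are hypothetical and only cover triple patterns with a variable subject and a variable object. Ontario's real mapping rules are not shown in the slides.

```python
import re

# Hypothetical mapping rules for one source: predicate -> column of the
# "Observations" table.
MAPPING_RULES = {
    "gho:Country": "country",
    "gho:Disease": "disease",
    "eg:incidence": "incidence",
}

# Matches patterns of the form "?subject prefix:predicate ?object ."
TRIPLE_PATTERN = re.compile(r"(\?\w+)\s+(\S+)\s+(\?\w+)\s*\.")


def translate_star(patterns, table, rules=MAPPING_RULES):
    """Translate triple patterns sharing one subject into a SELECT over `table`."""
    columns = []
    for _subject, predicate, _obj in TRIPLE_PATTERN.findall(patterns):
        if predicate not in rules:
            raise KeyError(f"no mapping rule for {predicate}")
        columns.append(rules[predicate])
    return f"SELECT {', '.join(columns)} FROM {table}"


star = """
?item gho:Country ?country .
?item gho:Disease ?disease .
"""
sql = translate_star(star, "Observations")
```

An XML or RDF wrapper would use the same rules to emit XPath or SPARQL instead of SQL.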
BDE Platform - Semantic Layer > SANSA
SANSA: a framework for distributed RDF
data processing
◎ Read/write Layer: reads and writes
native RDF/OWL data from/to distributed
storage, e.g., Hadoop, Spark (RDD,
DataFrames, GraphX), Tensors,
following different representations &
partitioning schemes, e.g., graphs, tables
◎ Querying Layer: queries distributed
RDF using SPARQL (SPARQL-to-SQL
approaches, Virtual Views, Intelligent
Indexing, ...)
http://sansa-stack.net
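To make "partitioning schemes" and "tables" concrete, the sketch below shows vertical partitioning, a well-known way of laying RDF out as tables with one two-column table per predicate. It is written in plain Python rather than Spark and is not claimed to be SANSA's implementation. The function names and sample triples are invented for illustration.

```python
from collections import defaultdict


def vertical_partition(triples):
    """Group triples by predicate: one (subject, object) table per predicate."""
    tables = defaultdict(list)
    for subject, predicate, obj in triples:
        tables[predicate].append((subject, obj))
    return dict(tables)


def join_on_subject(left, right):
    """Join two predicate tables on their subject column."""
    right_by_subject = defaultdict(list)
    for subject, obj in right:
        right_by_subject[subject].append(obj)
    return [
        (subject, left_obj, right_obj)
        for subject, left_obj in left
        for right_obj in right_by_subject.get(subject, [])
    ]


# Made-up sample triples.
triples = [
    ("ex:obs1", "gho:Country", "ex:India"),
    ("ex:obs1", "gho:Disease", "ex:Tuberculosis"),
    ("ex:obs2", "gho:Country", "ex:India"),
]
tables = vertical_partition(triples)
rows = join_on_subject(tables["gho:Country"], tables["gho:Disease"])
```

With this layout, two triple patterns sharing a subject become a join between two tables, which is the kind of operation a SPARQL-to-SQL approach can hand to a SQL engine.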
BDE Platform - Semantic Layer > SANSA
◎ Inference Layer: Derive new facts from
existing ones, detect inconsistencies,
extract new rules to help in reasoning
◎ Machine Learning Layer: Perform ML
or analytics to gain insights for relevant
trends, predictions or detection of
anomalies from RDF data
o Tensor Factorization for e.g. KB
completion (testing stage)
o Graph Clustering (testing stage)
o Association rule mining (evaluation stage)
o Semantic Decision trees (idea stage)
o Inference in Knowledge Graph
Embeddings (idea stage)
BDE Platform - Semantic Layer > Semagrow
Semagrow: a SPARQL query processing system that federates
multiple remote endpoints
◎ Original Semagrow
o Optimizes queries transparently
o Executes sub-queries in the remote endpoints
o Integrates results dynamically in heterogeneous data
models
o Joins the partial results into the final query answer
◎ Next-gen Semagrow
o Supports different query languages
o Query planner and execution engine adapted accordingly,
e.g., translating SPARQL to CQL for Cassandra
databases
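The step "joins the partial results into the final query answer" can be illustrated with a hash join over variable bindings. This is a generic sketch, not Semagrow's execution engine. The function name, the endpoint variables and the sample bindings are assumptions, and every row in a list is assumed to bind the same variables.

```python
def hash_join(left_rows, right_rows):
    """Join two lists of variable bindings (dicts) on their shared variables."""
    if not left_rows or not right_rows:
        return []
    # Assumes every row in a list binds the same set of variables.
    shared = sorted(set(left_rows[0]) & set(right_rows[0]))
    index = {}
    for row in left_rows:
        index.setdefault(tuple(row[v] for v in shared), []).append(row)
    joined = []
    for row in right_rows:
        for match in index.get(tuple(row[v] for v in shared), []):
            joined.append({**match, **row})
    return joined


# Made-up partial results, as two remote endpoints might return them.
endpoint_a = [
    {"trial": "ex:trial1", "condition": "ex:cond1"},
    {"trial": "ex:trial2", "condition": "ex:cond2"},
]
endpoint_b = [
    {"condition": "ex:cond1", "disease": "ex:Tuberculosis"},
]
answers = hash_join(endpoint_a, endpoint_b)
```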
BDE Showcases (pilots)
SC1 - Open PHACTS discovery platform relating to biological/medical questions
SC2 - Discovery and Linking of Viticulture-relevant information
SC3 - System monitoring in energy production units
SC4 - Short-Term traffic flow forecasting.
SC5 - Supporting data-intensive climate research
SC6 - Citizens & Researchers Budget on Municipal Level
SC7 - Ingestion of remote sensing images and social sensing data to detect and verify
changes on the Earth surface for security applications
◎ 7 Societal Challenges > 7 pilot implementations
Showcase SC1: Health, demographic
change and wellbeing
◎ SC1 Implements Open PHACTS Discovery Platform
o Integrates and links data from multiple sources:
ChEBI, ChEMBL, the Gene Ontology and UniProt
(Chemistry, Biological, Medical, etc.)
o Explores the relationships between data
(compounds, targets, pathways, diseases and
tissues)
o Data accessed using RESTful-API requests
❖ Translated to SPARQL queries
◎ Technologies used:
o 4Store, Memcached, MySQL, Puelia, Swagger
Showcase SC7: Secure Societies
◎ Detect changes in land cover in satellite images (e.g.,
monitoring critical infrastructures)
◎ Display geo-located events in news sites and social
media (e.g., news articles, social networks)
◎ Three workflows:
o Change detection workflow
o Event detection workflow
o Activation workflow
◎ Technologies used: Apache Spark, Cassandra,
Sextant, Semagrow, Strabon, GeoTriples
Showcase 2 (SC7): Secure Societies
General Architecture of the SC7 Pilot
Showcase 2 (SC7): Secure Societies
Change detection workflow
[Figure: area and time interval of interest => satellite images => compare images]
Showcase 2 (SC7): Secure Societies
Event detection workflow
o Associate names with coordinates
o Cluster news into events (associate geo-location)
Showcase 2 (SC7): Secure Societies
Activation detection workflow
o Areas with changes
o Summary of events
o Spatiotemporal RDF store
Showcase 2 (SC7): Secure Societies
Refugee camps located in Zaatari, Jordan
[Figure: selected area with news, tweets and detected changes]
Thanks & Questions?
For more info...
◎ Project-related: Simon Scerri (scerri@cs.uni-bonn.de)
◎ Ontario: Mohamed Nadjib Mami (mami@cs.uni-bonn.de)
◎ SANSA: Jens Lehmann (jens.lehmann@cs.uni-bonn.de)
◎ Semagrow: Stasinos Konstantopoulos (konstant@iit.demokritos.gr)
◎ Pilots (showcases):
o SC1: Ronald Siebes (rm.siebes@few.vu.nl)
o SC7: George Papadakis (gpapadis@di.uoa.gr)
o All: Ronald Siebes (rm.siebes@few.vu.nl)
◎ Github repos: https://github.com/big-data-europe/README
◎ Website: https://big-data-europe.eu
BDE Platform vs. Hadoop Distributions
SFR = Single failure recovery
MFR = Multiple failure recovery
SF = Self healing

Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
Product School
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Tobias Schneck
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
Jemma Hussein Allen
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
DianaGray10
 

Recently uploaded (20)

From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 
UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4UiPath Test Automation using UiPath Test Suite series, part 4
UiPath Test Automation using UiPath Test Suite series, part 4
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Epistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI supportEpistemic Interaction - tuning interfaces to provide information for AI support
Epistemic Interaction - tuning interfaces to provide information for AI support
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...When stars align: studies in data quality, knowledge graphs, and machine lear...
When stars align: studies in data quality, knowledge graphs, and machine lear...
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
The Future of Platform Engineering
The Future of Platform EngineeringThe Future of Platform Engineering
The Future of Platform Engineering
 
UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3UiPath Test Automation using UiPath Test Suite series, part 3
UiPath Test Automation using UiPath Test Suite series, part 3
 

ICWE2017 BigDataEurope

  • 1. BigDataEurope - Supporting the Variety Dimension of Big Data. Mohamed Nadjib MAMI - Fraunhofer IAIS, ICWE17 - 06.06.2017
  • 2. Big Data Europe - the Project ◎ EU Horizon 2020-programme-funded ◎ Coordination & Support Action (CSA) project o Show societal value of Big Data to 7 domains o Lower barrier for using Big Data technologies => BigDataEurope Platform
  • 3. Consortium Partners: consortium of 17 partners o Industry, SMEs, universities, research institutes, etc.
  • 4. BDE Europe - The Platform ◎ Integrator of Big Data technologies o Easy to use/get started (plug-and-play) o Flexible, customisable ◎ Bundles only Open Source solutions o Data Storage o Message Passing o Data Processing o Data Searching & Publishing ◎ Publicly released in May 2017
  • 5. BDE Platform - Components (some) ◎ Search/Indexing: Apache Solr, Elasticsearch ◎ Data Acquisition: Apache Flume ◎ Message Passing: Apache Kafka ◎ Data Storage: Apache Hadoop, Apache Cassandra, Apache Hive, PostGIS ◎ Data Processing: Apache Spark, Apache Flink ◎ Semantic Components: Strabon, Sextant, GeoTriples, Silk, SEMAGROW, LIMES, 4Store, OpenLink Virtuoso
  • 6. BDE Platform - Architecture ◎ Support Layer: Init Daemon, GUIs, Base Setup ◎ App Layer: Traffic Forecast, Satellite Image Analysis, Real-time Stream Monitoring, ... ◎ Platform Layer: Spark, Flink, Kafka, Semantic Layer (Ontario, SANSA, Semagrow), ... ◎ Resource Management Layer (Swarm) ◎ Hardware Layer: Premises, Cloud (AWS, GCE, MS Azure, …) ◎ Data Layer: Hadoop, NoSQL Store, Cassandra, Elasticsearch, RDF Store, ...; Semantic Data Lake (Unified View)
  • 7. BDE Platform - Hardware & Virtualization ◎ Docker used for packaging and deploying applications ◎ Based on containers: o A lightweight environment that lets a piece of software run in isolation ❖ Shares the host operating system kernel (unlike VMs) ❖ Reduces conflicts, e.g., between versions ◎ Docker Compose: creates multi-container applications
  • 8. BDE Platform - Resource Management ◎ Swarm (mode) used for managing, scheduling and orchestrating Docker containers in multi-node clusters ◎ It provides: o Scalability and fault tolerance o Container interlinking o Log-based monitoring ◎ Separates hardware management from software management ◎ Based on Services o A Service is the Swarm execution unit, running a Docker image
  • 9. BDE Platform - Support Layer ◎ Init Daemon: orchestrates the initialization process of the components (containers of Docker Compose): o Components report their initialization progress o It validates whether a specific component can start o It specifies the dependencies between services o It indicates where human interaction is required ◎ Examples: o Wait for data to load into HDFS before starting a Spark job o Wait for the Spark Master to start successfully before starting a Worker
  • 10. BDE Platform - User Interfaces. Pipeline Builder: creates a step-by-step dependency pipeline (fed to the init daemon). Example shown: Component 1, Component 2, Component 3
  • 11. BDE Platform - User Interfaces. Pipeline Monitor: displays the status (not started, running or finished) of components in a running pipeline (retrieved from the init daemon). Example shown: Component 1 finished, Component 2 finished, Component 3 in progress
  • 12. BDE Platform - User Interfaces. Swarm UI: clones a Git repository containing a pipeline, then deploys, controls and monitors it on Swarm
  • 13. BDE Platform - User Interfaces. Integrator UI: displays the dashboard of each running component in a unified interface
  • 14. BDE Platform - Semantic Layer > Ontario ◎ Data Lake or Swamp? o Repository of data in its original formats o Structured, semi-structured, unstructured o Without unified schema ◎ Semantic Data Lake (Ontario) o Add a Semantic Layer on top of the source datasets ❖ The data is semantically lifted using ontology terms ❖ Provide a uniform view over nonuniform data
  • 15. BDE Platform - Semantic Layer > Ontario. Query “number of distinct publications and number of distinct deaths due to the disease Tuberculosis in India”: SELECT count(distinct(?publication)) AS ?no_of_publications count(?deaths) AS ?no_of_deaths WHERE { ?item a qb:Observation . ?item gho:Country ?country . ?item gho:Disease ?disease . ?item att:unitMeasure gho:Measure . ?item eg:incidence ?deaths . ?country rdfs:label "India" . ?disease rdfs:label "Tuberculosis" . ?trial a ct:trials . ?trial ct:condition ?condition . ?trial ct:location ?location . ?trial ct:reference ?publication . ?condition owl:sameAs ?disease . ?location redd:locatedIn ?country . ?publication ct:citation ?citation . }
  • 16. BDE Platform - Semantic Layer > Ontario. Query processing: 1. Query Parsing & Validation 2. Planning 3. Meta-wrapper invocation (Publications, Trials, Conditions and Observations meta-wrappers)
  • 17. BDE Platform - Semantic Layer > Ontario. Meta-wrappers (Publications, Observations, Trials, Conditions) sit on top of wrappers (XML, CSV, RDF). 4. Wrapper Selection & Query Translation: using mapping rules, triple patterns such as ?item gho:Country ?country . ?item gho:Disease ?disease . ... are translated into the source's query language, e.g., SELECT country, disease, ... FROM Observations, ... [XPath] ... or ... [SPARQL] ... 5. Query Execution
  • 18. BDE Platform - Semantic Layer > SANSA. SANSA: a framework for distributed RDF data processing (http://sansa-stack.net) ◎ Read/write Layer: read and write native RDF/OWL data in distributed storage, e.g., Hadoop, Spark (RDD, DataFrames, GraphX), Tensors, following different representations & partitioning schemes, e.g., graphs, tables ◎ Querying Layer: query distributed RDF using SPARQL (SPARQL-to-SQL approaches, Virtual Views, Intelligent Indexing, ...)
  • 19. BDE Platform - Semantic Layer > SANSA (http://sansa-stack.net) ◎ Inference Layer: derive new facts from existing ones, detect inconsistencies, extract new rules to help in reasoning ◎ Machine Learning Layer: perform ML or analytics on RDF data to gain insights into relevant trends, make predictions or detect anomalies o Tensor Factorization for e.g. KB completion (testing stage) o Graph Clustering (testing stage) o Association rule mining (evaluation stage) o Semantic Decision trees (idea stage) o Inference in Knowledge Graph Embeddings (idea stage)
  • 20. BDE Platform - Semantic Layer > Semagrow. Semagrow: a SPARQL query processing system that federates multiple remote endpoints ◎ Original Semagrow o Optimizes queries transparently o Executes sub-queries in the remote endpoints o Integrates results dynamically in heterogeneous data models o Joins the partial results into the final query answer ◎ Next-gen Semagrow o Supports different query languages o Query planner and execution engine adapted, e.g., to translate SPARQL to CQL for Cassandra databases
  • 21. BDE Showcases (pilots) ◎ 7 Societal Challenges > 7 pilot implementations o SC1 - Open PHACTS discovery platform relating to biological/medical questions o SC2 - Discovery and linking of viticulture-relevant information o SC3 - System monitoring in energy production units o SC4 - Short-term traffic flow forecasting o SC5 - Supporting data-intensive climate research o SC6 - Citizens & Researchers Budget on Municipal Level o SC7 - Ingestion of remote sensing images and social sensing data to detect and verify changes on the Earth surface for security applications
  • 22. Showcase SC1: Health, demographic change and wellbeing ◎ SC1 implements the Open PHACTS Discovery Platform o Integrates and links data from multiple sources: ChEBI, ChEMBL, the Gene Ontology and UniProt (Chemistry, Biological, Medical, etc.) o Explores the relationships between data (compounds, targets, pathways, diseases and tissues) o Data accessed using RESTful API requests ❖ Translated to SPARQL queries ◎ Technologies used: o 4Store, Memcached, MySQL, Puelia, Swagger
  • 23. Showcase SC7: Secure Societies ◎ Detect changes in land cover in satellite images (e.g., monitoring critical infrastructures) ◎ Display geo-located events in news sites and social media (e.g., news articles, social networks) ◎ Three workflows: o Change detection workflow o Event detection workflow o Activation workflow ◎ Technologies used: Apache Spark, Cassandra, Sextant, Semagrow, Strabon, GeoTriples
  • 24. Showcase 2 (SC7): Secure Societies. Figure: General Architecture of the SC7 Pilot
  • 25. Showcase 2 (SC7): Secure Societies. Figure: Change detection workflow (area and time interval of interest; Satellite Images; Compare Images)
  • 26. Showcase 2 (SC7): Secure Societies. Figure: Event detection workflow (associate names with coordinates; cluster news into events and associate a geo-location)
  • 27. Showcase 2 (SC7): Secure Societies. Figure: Activation detection workflow (areas with changes; summary of events; spatiotemporal RDF store)
  • 28. Showcase 2 (SC7): Secure Societies. Figure: refugee camps located in Zaatari, Jordan (selected area; detected changes; news; tweets)
  • 29. Thanks & Questions? For more info... ◎ Project-related: Simon Scerri (scerri@cs.uni-bonn.de) ◎ Ontario: Mohamed Nadjib Mami (mami@cs.uni-bonn.de) ◎ SANSA: Jens Lehmann (jens.lehmann@cs.uni-bonn.de) ◎ Semagrow: Stasinos Konstantopoulos (konstant@iit.demokritos.gr) ◎ Pilots (showcases): o SC1: Ronald Siebes (rm.siebes@few.vu.nl) o SC7: George Papadakis (gpapadis@di.uoa.gr) o All: Ronald Siebes (rm.siebes@few.vu.nl) ◎ GitHub repos: https://github.com/big-data-europe/README ◎ Website: https://big-data-europe.eu
  • 30. BDE Platform vs. Hadoop Distributions (comparison table). Legend: SFR = Single failure recovery, MFR = Multiple failure recovery, SF = Self healing
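
The Init Daemon behaviour from slide 9 (components report their progress, and a component may start only once the components it depends on have finished) can be illustrated with a minimal Python sketch. The class InitDaemonSketch, its methods and the step names are hypothetical and chosen for illustration only; this is not the BDE init daemon's actual interface. The three statuses are the ones listed for the Pipeline Monitor on slide 11.

```python
from enum import Enum


class Status(Enum):
    NOT_STARTED = "not started"
    RUNNING = "running"
    FINISHED = "finished"


class InitDaemonSketch:
    """Toy stand-in for an init daemon (hypothetical, not the BDE API)."""

    def __init__(self, dependencies):
        # dependencies maps a step name to the steps that must finish first.
        for step, deps in dependencies.items():
            for dep in deps:
                if dep not in dependencies:
                    raise ValueError(f"{step} depends on unknown step {dep}")
        self.dependencies = dependencies
        self.status = {step: Status.NOT_STARTED for step in dependencies}

    def report(self, step, status):
        """A component reports its own initialization progress."""
        if step not in self.status:
            raise KeyError(step)
        self.status[step] = status

    def can_start(self, step):
        """A step may start once every step it depends on has finished."""
        return all(
            self.status[dep] is Status.FINISHED
            for dep in self.dependencies[step]
        )


# Assumed example pipeline mirroring the two examples on slide 9:
# the Spark job waits for the HDFS load, a Worker waits for the Master.
pipeline = {
    "hdfs_load": [],
    "spark_master": [],
    "spark_worker": ["spark_master"],
    "spark_job": ["hdfs_load", "spark_worker"],
}

daemon = InitDaemonSketch(pipeline)
daemon.report("spark_master", Status.FINISHED)
print(daemon.can_start("spark_worker"))
print(daemon.can_start("spark_job"))
```

In this sketch the worker becomes startable as soon as the master reports that it has finished, while the Spark job stays blocked until both the HDFS load and the worker have reported the same.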