BDE EuroPro Workshop


Project contributions described at the EuroPro Workshop at the EDBT/ICDT 2017 Conference in Venice.

Published in: Technology

  2. BigDataEurope Project: Supporting the Societal Domains with Big Data Technology
  3. BigDataEurope Action
      • EC Horizon 2020 Coordination & Support Action
          o ~5 million €, 2015-2017
      • Show societal value of Big Data
          o Across all societal challenges addressed by H2020
      • Lower barrier for using big data technologies
          o Effort to set up and deploy use-case workflows
          o Lack of skills & expertise
      • Help establish data value chains across domains & organisations
  4. Consortium NCSR DEMOKRITOS
  5. Data Value Chain Evolution (22 March 2017)
      [Figure. Value chain stages: Extraction, Curation, Quality, Linking, Integration, Publication, Visualization, Analysis. Domains: Health, Transport, Security, Food, Societies, Climate, Energy. Time axis labels: Proprietary, 'locked-in' solutions; OS Solutions, Big Data. Other labels: Data Repositories, Linked Open Data.]
  6. Big Data Integrator (2. Architecture): A flexible, generic platform for (Big) Data Value Chain Deployment
  7. Big Data Integrator: Architecture
      • Stacks open-source solutions (free)
      • Dockerization
      • Facilitates integration and deployment
      • Plug-and-play BD platform
      • Key BDE additions
          o Support layer: integrated UI
          o Semantification layer
      • Final BDI release:
          o (3rd) May 2017
  8. Big Data Integrator: In-Use
      • Big Data Integrator wiki: extensive documentation, information on supported components, instructions, etc.
  9. BDE vs Hadoop distributions

      | Feature | Hortonworks | Cloudera | MapR | Bigtop | BDE |
      | File system | HDFS | HDFS | NFS | HDFS | HDFS |
      | Installation | Native | Native | Native | Native | Lightweight virtualization |
      | Plug & play components (no rigid schema) | No | No | No | No | Yes |
      | High availability | Single failure recovery (YARN) | Single failure recovery (YARN) | Self-healing, multiple failure recovery | Single failure recovery (YARN) | Multiple failure recovery |
      | Cost | Commercial | Commercial | Commercial | Free | Free |
      | Scaling | Freemium | Freemium | Freemium | Free | Free |
      | Addition of custom components | Not easy | No | No | No | Yes |
      | Integration testing | Yes | Yes | Yes | Yes | -- |
      | Operating systems | Linux | Linux | Linux | Linux | All |
      | Management tool | Ambari | Cloudera Manager | MapR Control System | - | Docker Swarm UI + custom |
  10. BDI Docker Containers (...and counting)
      • Data Acquisition: Apache Flume
      • Data Storage: Hue, Apache Cassandra, ScyllaDB, Apache Hive, PostGIS
      • Search/Indexing: Apache Solr
      • Message Passing: Apache Kafka
      • Data Processing: Spark, Flink
      • Semantic Components: SANSA, Silk, Strabon, Sextant, GeoTriples, Semagrow, Limes, 4Store, OpenLink Virtuoso
  11. Semantic Layer
      • Semantic Data Lakes
          o Minimal ingestion pre-processing
          o Semantic layer maintains metadata
          o Add meaning when retrieving/processing
      • Data Lake: scalable unstructured data store
      • Relationship definitions and metadata: JSON-LD, CSVW, R2RML, XML2RDF
      • Ongoing research for Semantic Big Data & Analytics, Knowledge Graphs
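
The Semantic Data Lake idea on this slide (store raw data with minimal pre-processing, keep the metadata in a separate layer, add meaning only on retrieval) can be illustrated with a minimal Python sketch. This is not BDE code: the `MAPPING` structure, the `read_as_triples` helper and the `example.org` vocabulary are all hypothetical, and the mapping is only loosely inspired by what CSVW or R2RML would express.

```python
import csv
import io

# Raw data lands in the lake untouched (minimal ingestion pre-processing).
RAW_CSV = "id,temp_c,station\n1,21.5,Venice\n2,19.0,Bonn\n"

# Hypothetical semantic-layer metadata: maps column names to property IRIs.
# The example.org vocabulary is invented for this sketch.
MAPPING = {
    "subject_template": "http://example.org/obs/{id}",
    "properties": {
        "temp_c": "http://example.org/vocab#temperatureCelsius",
        "station": "http://example.org/vocab#station",
    },
}


def read_as_triples(raw_csv, mapping):
    """Attach meaning at retrieval time: yield (subject, predicate, value) triples."""
    for row in csv.DictReader(io.StringIO(raw_csv)):
        subject = mapping["subject_template"].format(**row)
        for column, predicate in mapping["properties"].items():
            yield (subject, predicate, row[column])


triples = list(read_as_triples(RAW_CSV, MAPPING))
for triple in triples:
    print(triple)
```

The raw CSV is never rewritten; changing `MAPPING` changes how the same stored data is interpreted, which is the point of keeping the semantics in a separate layer.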
  12. Semantic Layer tools
      • BDE tooling for Semantic Data Lake:
          o Swagger: Semantics of RESTful APIs
          o Semantic Analytics Stack (SANSA): Distributed data processing over large-scale Knowledge Graphs
          o Semagrow: SPARQL over Big Data stores
          o Ontario: Querying over Semantic
  13. BigDataEurope Pilots (1. Overview): Demonstrating the Societal Value through 7 'Real-world' Pilot Use-cases
  14. 7 Pilots
      • BDI Platform Instantiations
          o Allow end-users to easily deploy functionality in their own system environment
          o Modularized Docker approach: easier to replace components
          o Reduces effort to keep 3rd-party software updated & integrated
      • 7 Societal Challenge Pilots
          o Aligned with the 7 European Commission H2020 Societal Challenges
          o Real-world use-cases (Data, Objectives, Solutions)
          o Some pilots have different data & objectives but a similar solution
      • More info:
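
As a rough illustration of what "platform instantiation" with replaceable components could look like, the sketch below models a stack as a mapping from role to component and derives pilot-specific instances by overriding single roles. It is an assumption-laden toy, not the BDI deployment mechanism: the component names are taken from the container list on slide 10, while `BASE_STACK`, `instantiate`, the role names and the two pilot configurations are hypothetical.

```python
# Base stack keyed by role. Component names come from the container list
# on slide 10; the role names and this pairing are invented for the sketch.
BASE_STACK = {
    "acquisition": "Apache Flume",
    "messaging": "Apache Kafka",
    "processing": "Spark",
    "storage": "Apache Cassandra",
    "search": "Apache Solr",
}


def instantiate(base, **overrides):
    """Return a new stack with selected roles swapped; the base is left untouched."""
    unknown = set(overrides) - set(base)
    if unknown:
        raise KeyError(f"unknown roles: {sorted(unknown)}")
    return {**base, **overrides}


# Two hypothetical pilots that differ from the base in one component each.
pilot_a = instantiate(BASE_STACK, processing="Flink")
pilot_b = instantiate(BASE_STACK, storage="Apache Hive")

print(pilot_a)
print(pilot_b)
```

Because each role is filled independently, two pilots with different data and objectives can still share most of a solution, and updating one component does not require touching the rest of the stack.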
  15. 7 BDI Instances
  16. BigDataEurope Activities: Free Workshops, Hangouts & Webinars
  17. 3rd round of Societal Workshops

      | Domain | Date | Location | Notes |
      | Health | 11/12 May 2017 | Malta | Collocated with EC eHealth Week |
      | Food & Agri | 31 March 2017 | Brussels | Co-organized with e-ROSA H2020 project |
      | Energy | Autumn 2017 | Brussels | Details to follow |
      | Transport | 12/13 September 2017 | - | Collocated with Big Data for Transport, Tisa workshop, organised by Ertico |
      | Climate | 21/22 September 2017 | Brussels | Collocated with iScape Project Workshop (Improving the Smart Control of Air Pollution in Europe) by EC JRC Ispra |
      | Societies | Autumn 2017 | T.B.C. | Details to follow |
      | Security | Autumn 2017 | Brussels | Standalone event |
  18. Questions & Contacts
      • Links: Big Data Integrator; Seven Pilot Descriptions
      • Project coordination (Fraunhofer IAIS):
          o Prof. Sören Auer, auer © cs.uni-bonn · de
          o Dr. Simon Scerri, scerri © cs.uni-bonn · de
      • EIS Department/Group, Fraunhofer IAIS & CS Department Uni-Bonn, Bonn, Germany
      • #BigDataEurope
      • leads the Fraunhofer Big Data Alliance