Platform introduction & Summary

Hajira Jabeen introduces the Big Data Europe Integrator Platform. The deck also includes the slides used to summarise the other presentations in the launch webinar.

Published in: Data & Analytics

  1. 1. Big Data Europe Platform Release, May 3rd 2017. Big Data Europe Integrator Platform: Empowering Communities with Data Technologies. Platform release. Dr. Hajira Jabeen, Senior researcher, University of Bonn
  2. 2. Platform Goals ◎ Open source ◎ Ease of use ◎ Support a variety of use cases ◎ Embrace emerging Big Data technologies ◎ Simple integration with custom components
  3. 3. Key actors
  4. 4. Platform Architecture Evolution
  5. 5. Platform Architecture Evolution
  6. 6. Platform Architecture Evolution
  7. 7. Platform Architecture Existing
  8. 8. Platform Architecture Existing
  9. 9. Platform Architecture, Alternate View ◎ Support Layer: Init Daemon, GUIs, Monitor ◎ App Layer: Traffic Forecast, Satellite Image Analysis, Real-time Stream Monitoring, ... ◎ Platform Layer: Spark, Flink, Kafka, Semantic Layer (Ontario, SANSA, Semagrow), ... ◎ Data Layer: Hadoop, NoSQL Store, Cassandra, Elasticsearch, RDF Store, ... ◎ Resource Management Layer (Swarm) ◎ Hardware Layer: Premises, Cloud (AWS, GCE, MS Azure, …)
  10. 10. BDE Supported Frameworks ◎ Search/indexing: Apache Solr ◎ Data acquisition: Apache Flume ◎ Message passing: Apache Kafka ◎ Data storage: Hue, Apache Cassandra, ScyllaDB, Apache Hive, PostGIS ◎ Data processing: Apache Spark, Apache Flink ◎ Semantic components: Strabon, Sextant, GeoTriples, Silk, SEMAGROW, LIMES, 4Store, OpenLink Virtuoso
  11. 11. Platform features ◎ BDE Development Environment o Stack builder o Workflow builder o Instructions to add custom components to the BDE stack ◎ Administrator Interface o SwarmUI ◎ UI Integrator o Workflow monitor o Integrated web interface
  12. 12. What BDE Provides ◎ Platform installation instructions ◎ Usage instructions o Creating a stack o Creating a workflow o Monitoring the stack o Integration of custom components
  13. 13. Platform installation ◎ Manual installation guide ◎ Using Docker Machine o On local machine (VirtualBox) o In cloud (AWS, DigitalOcean, Azure) o Bare metal ◎ Screencasts
  14. 14. Deploying a Big Data Stack ◎ Stack Builder ◎ Stack o Collection of communicating components to solve a specific problem ◎ Described in Docker Compose o Component configuration o Application topology
  15. 15. Creation of Workflows ◎ Pipeline Builder o Allows creation of dependencies among different applications ◎ Workflow Monitor o Monitoring of pipeline-workflow using
  16. 16. Integrating Custom Components ◎ Instructions o Orchestrator required for initialization process (init_daemon) ❖ Components may depend on each other ❖ Components may require manual intervention o User Interface Integration ❖ Standard interfaces from components ❖ Combine and align the interfaces
  17. 17. User Interfaces ◎ Target: Facilitate the use of the platform o User Interface Adaptation ◎ Available interfaces o Workflow UIs ❖ Workflow Builder ❖ Workflow Monitor o Swarm UI o Integrator UI
  18. 18. Details!
  19. 19. Presentations by Ivan, Aad and Jens
  20. 20. Summary
  21. 21. Platform Architecture
  22. 22. Pilot Show Cases ◎ SC1 - Open PHACTS discovery platform relating to biological/medical questions ◎ SC2 - Discovery and linking of viticulture-relevant information ◎ SC3 - System monitoring in energy production units ◎ SC4 - Short-term traffic flow forecasting ◎ SC5 - Supporting data-intensive climate research ◎ SC6 - Citizens & Researchers Budget on Municipal Level ◎ SC7 - Ingestion of remote sensing images and social sensing data to detect and verify changes on the Earth's surface for security applications
  23. 23. SC1 - Health
  24. 24. SC2 - Food
  25. 25. SC3 - Energy
  26. 26. SC4 - Transport (FCD: Floating Car Data; NRT: Near Real Time)
  27. 27. SC5 - Climate
  28. 28. SC6 - Social Sciences
  29. 29. SC7 - Security
  30. 30. BDE vs Hadoop distributions
      | | Hortonworks | Cloudera | MapR | Bigtop | BDE |
      | File system | HDFS | HDFS | NFS | HDFS | HDFS |
      | Installation | Native | Native | Native | Native | Lightweight virtualization |
      | Flexible modular architecture | No | No | No | No | Yes |
      | High availability | Single failure recovery (YARN) | Single failure recovery (YARN) | Self healing, multiple failure recovery | Single failure recovery (YARN) | Failure recovery |
      | Cost | Commercial | Commercial | Commercial | Free | Free |
      | Scaling | Freemium | Freemium | Freemium | Free | Free |
      | Addition of custom components | Not easy | No | No | No | Yes |
      | Integration testing | Yes | Yes | Yes | Yes | -- |
      | Operating systems | Linux | Linux | Linux | Linux | Windows/Mac/Linux |
      | Management tool | Ambari | Cloudera Manager | MapR Control System | - | Docker Swarm UI + custom |
  31. 31. BDE vs Hadoop distributions ◎ BDE is not built on top of existing distributions ◎ Targets o Communities o Research institutions ◎ Bridges scientists and open data ◎ Multi-tier research efforts towards Smart Data
  32. 32. Maintenance and Uptake ◎ Community Driven ◎ Adapters ■ Feuga, Eurostat, ILVO ■ I2cat, Vicomtech, IoF ◎ Follow Up Projects ■ HOBBIT ■ Special ■ Big Ocean ■ Qrowd
  33. 33. Wrap up ◎ Big Data Europe - Platform o Containerized Components o Development and Runtime Facilities ◎ Show Cases o From Components to Architectures o Evolving Microservice Architectures
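
Slide 14 describes a stack as a collection of communicating components, written down in Docker Compose as component configuration plus application topology. The sketch below is only an illustration of that idea, not the platform's Stack Builder: it builds a Compose-style structure in Python and derives the topology from it. All service names, image names and environment variables are hypothetical placeholders, not taken from the deck.

```python
import json


def build_stack():
    """Return a Compose-style description of a small, hypothetical stack."""
    return {
        "version": "3",
        "services": {
            # Component configuration: one image and its settings per component.
            "spark-master": {
                "image": "example/spark-master",  # hypothetical image name
                "environment": {"ROLE": "master"},  # hypothetical variable
            },
            "spark-worker": {
                "image": "example/spark-worker",  # hypothetical image name
                "environment": {"SPARK_MASTER": "spark://spark-master:7077"},
                # Application topology: the worker needs the master.
                "depends_on": ["spark-master"],
            },
            "demo-app": {
                "image": "example/demo-app",  # hypothetical image name
                "depends_on": ["spark-master", "spark-worker"],
            },
        },
    }


def topology(stack):
    """Map each service name to the sorted list of services it depends on."""
    return {
        name: sorted(service.get("depends_on", []))
        for name, service in stack["services"].items()
    }


if __name__ == "__main__":
    stack = build_stack()
    # Print the stack description for inspection.
    print(json.dumps(stack, indent=2))
    # Print which component depends on which.
    for name, needs in topology(stack).items():
        print(f"{name} -> {needs}")
```

Keeping configuration and topology in one structure means a single description can be reviewed, versioned and handed to the container runtime as a unit.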
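
Slides 15 and 16 note that applications in a workflow depend on each other, that some steps may need manual intervention, and that an orchestrator (init_daemon) coordinates initialization. The following is a minimal in-memory sketch of that coordination logic under stated assumptions; it does not reproduce the real init_daemon or its interface. The class `InitGate`, its methods and the step names in `DEPS` are all hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical workflow: each step maps to the steps it depends on.
DEPS = {
    "hdfs": [],
    "spark-master": ["hdfs"],
    "spark-worker": ["spark-master"],
    "load-data": ["hdfs"],
    "demo-app": ["spark-worker", "load-data"],
}


def startup_order(dependencies):
    """Return an order in which every step comes after its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


class InitGate:
    """Tracks finished steps and steps that wait for manual approval."""

    def __init__(self, dependencies, manual=()):
        self.dependencies = {k: set(v) for k, v in dependencies.items()}
        self.manual = set(manual)   # steps requiring manual intervention
        self.approved = set()       # manual steps an operator has released
        self.finished = set()       # steps that reported completion

    def approve(self, step):
        """Record that an operator has released a manual step."""
        self.approved.add(step)

    def can_start(self, step):
        """A step may start once approved (if manual) and all dependencies finished."""
        if step in self.manual and step not in self.approved:
            return False
        return self.dependencies.get(step, set()) <= self.finished

    def finish(self, step):
        """Mark a step as finished; refuse if it was not allowed to run."""
        if not self.can_start(step):
            raise RuntimeError(f"{step} is not allowed to run yet")
        self.finished.add(step)


if __name__ == "__main__":
    gate = InitGate(DEPS, manual=["load-data"])
    for step in startup_order(DEPS):
        if step in gate.manual:
            gate.approve(step)  # stands in for an operator's confirmation
        gate.finish(step)
        print("finished", step)
```

A workflow monitor could read `finished` and `approved` from such an object to show which steps are done and which are waiting on an operator.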
