The document discusses the Big Data Integrator (BDI) platform, a one-stop solution for big and smart data management developed by the BigDataEurope project. The BDI is a flexible, generic platform that supports a variety of big data components through its Docker-based architecture. It addresses requirements from multiple stakeholders and goes beyond existing solutions by incorporating semantic capabilities and enabling easy deployment of customized data pipelines. A demo of the BDI platform shows how different big data stacks can be deployed through its user-friendly interfaces.
BigDataEurope @BDVA Summit2016 1: The BDE Platform
1. BIG DATA EUROPE'S INTEGRATOR PLATFORM
A ONE-STOP SOLUTION FOR BIG AND SMART DATA MANAGEMENT
BDVA Summit 2016, Valencia, 1 December 2016
2. Talk outline
The BigDataEurope Project, Mission & BDVA Synergies
The Big Data Integrator (BDI) platform
o Stakeholder Requirements
o Architecture
o Supported Components
o Beyond the State-of-the-Art
A look into the BDI platform [DEMO]
6 Dec 2016, www.big-data-europe.eu
3. Supporting the Societal Domains with Big Data Technology
BigDataEurope Project
4. BigDataEurope Action
EC Horizon 2020 Coordination & Support Action
o ~€5 million, 2015-2017
Show societal value of Big Data
o Across all societal challenges addressed by H2020
Lower barrier for using big data technologies
o Effort to set up and deploy use-case workflows
o Lack of skills & expertise
Help establish data value chains across domains & orgs.
6. Stakeholder Engagement Cycle
Present action, showcase deployments
Raise awareness about BDE results, what they mean for stakeholders
Collect requirements to drive further development
Timeline: M6, M12, M18, M24, M30
7. Data Value Chain Evolution
[Figure: data value chain evolution along a time axis]
Value chain steps: Extraction, Curation, Quality, Linking, Integration, Publication, Visualization, Analysis
Domains: Health, Food, Energy, Transport, Climate, Societies, Security
Data sources: Data Repositories, Linked Open Data
Time axis labels: Proprietary, 'locked-in' solutions; OS Solutions, Big Data Stacks
8. Parallels to BDVA Mission
Task Force 6 (Technical)
o SG1: Management
o SG2: Big Data Architectures and Infrastructures
The Big Data Integrator Platform (SG2)
o Generic Architecture (Blueprint) & Instances
Smart Big Data Management (SG1)
o Support for Semantic Components & Data Lakes
9. A flexible, generic platform for (Big) Data Value Chain Deployment
1. Stakeholder Requirements
Big Data Integrator
14. A flexible, generic platform for (Big) Data Value Chain Deployment
2. Architecture
Big Data Integrator
15. Big Data Integrator Architecture
Prototype developed by BDE
o Incorporates existing BD technology
o Facilitates integration and deployment
Main points of the architecture
o Dockerization
o Support layer, including integrated UI
o Semantification layer
17. Docker containers
Docker offers lightweight virtualization
o Containers can be shared/provisioned on different Linux variations/versions
Identical base system
o NOT Required
All BDI components
o Docker containers
22. BDE vs Hadoop distributions
BDE is not built on top of existing distributions
Targets
o Communities
o Research institutions
Bridges scientists and open data
Multi-Tier research efforts towards Smart Data
23. BDE vs Hadoop distributions
Feature | Hortonworks | Cloudera | MapR | Bigtop | BDE
File System | HDFS | HDFS | NFS | HDFS | HDFS
Installation | Native | Native | Native | Native | Lightweight virtualization
Plug & play components (no rigid schema) | no | no | no | no | yes
High Availability | Single failure recovery (yarn) | Single failure recovery (yarn) | Self healing, multiple failure recovery | Single failure recovery (yarn) | Multiple failure recovery
Cost | Commercial | Commercial | Commercial | Free | Free
Scaling | Freemium | Freemium | Freemium | Free | Free
Addition of custom components | Not easy | No | No | No | Yes
Integration testing | yes | yes | yes | yes | --
Operating systems | Linux | Linux | Linux | Linux | All
Management tool | Ambari | Cloudera manager | MapR Control system | - | Docker swarm UI + Custom
24. A flexible, generic platform for (Big) Data Value Chain Deployment
3. Supported Components
Big Data Integrator
25. Dockerized Components
Processing and storage components
o Re-used existing docker containers (where available)
o Dockerized by BDE otherwise
o Ensuring all can be provisioned through Docker Swarm
Other Components
o Semantic Layer
o Support Layer
27. A flexible, generic platform for (Big) Data Value Chain Deployment
4. In-use: Deployment & Installation
Big Data Integrator
29. Platform installation
Manual installation guide
Using Docker Machine
o On local machine (VirtualBox)
o In cloud (AWS, DigitalOcean, Azure)
o Bare metal
Screencasts (Getting Started with the Platform)
30. Developing a component
Base Docker images
o Serve as a template for a (Big Data) technology
o Easily extendable custom algorithm/data
Published components
o Responsibilities divided b/w partners
o Image repositories on GitHub
o Automated builds on DockerHub
o Documentation on BDE Wiki
31. Deploying a Big Data Stack
Stack: Collection of communicating components to solve a specific problem
Described in Docker Compose
o Component configuration
o Application topology
Orchestrator required for initialization process
o Components may depend on each other
o Components may require manual intervention
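The dependency-driven initialization described on this slide can be sketched in Python. This is only an illustration, not the BDE orchestrator: the component names and the `depends_on` mapping are hypothetical, and the standard-library `graphlib` module (Python 3.9+) stands in for whatever ordering logic the real initialization process uses.

```python
from graphlib import TopologicalSorter

# Hypothetical stack: each component maps to the components it depends on.
depends_on = {
    "hdfs-namenode": [],
    "hdfs-datanode": ["hdfs-namenode"],
    "spark-master": ["hdfs-namenode"],
    "spark-worker": ["spark-master"],
    "app": ["spark-worker", "hdfs-datanode"],
}


def startup_order(dependencies):
    """Return one valid start-up order in which every component
    appears after all of the components it depends on."""
    return list(TopologicalSorter(dependencies).static_order())


order = startup_order(depends_on)
print(order)
```

A circular dependency makes `static_order()` raise `graphlib.CycleError`, which is one reason an explicit ordering step is useful before starting containers. Steps that need manual intervention are not modelled here.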
32. Support Layer (User Interfaces)
Integrator UI
o Web UIs from BDE dockers (including 3rd party components) follow these BDE stylesheets
Stack Monitor App
o Workflow Builder
o Workflow Monitor
Swarm UI
o Allows scaling up/down multiple Docker instances
Stack
37. BDI Platform – A Demo
Demonstrating the ease of use in deploying custom instances of the BDI Platform
A recorded video showing an example is available:
https://www.youtube.com/watch?v=1zHIhFDDdCg
38. A flexible, generic platform for (Big) Data Value Chain Deployment
5. Beyond the State-of-the-Art
Big Data Integrator
40. Variety – The most neglected V?
Data source heterogeneity
Lack of interoperability/semantics
Source: Gesellschaft für Informatik
41. Semantic Layer tools
BDE tooling for Semantic Data Lake:
o Swagger: Semantics of RESTful APIs
o Semantic Analytics Stack (SANSA): Distributed data processing over large-scale Knowledge Graphs
o Semagrow: SPARQL over Big Data stores
o Ontario: Querying over Semantic Data Lakes
42. Semantic Layer
Semantic Data Lakes
o Minimal ingestion pre-processing
o Semantic layer maintains metadata
o Add meaning when retrieving/processing
Data Lake: scalable unstructured data store
Relationship definitions and metadata
JSON-LD, CSVW, R2RML, XML2RDF
Ongoing Research for Semantic Big Data & Analytics
Knowledge Graphs
43. Ontario: Semantic Data Lakes
Repository of data in its raw format
o Structured, semi-structured, unstructured
Schema-less
o No schema is defined on write, it is defined only on read
Open to any kind of processing
Add a Semantic layer on top of the source datasets
o Semantic data is handled as-is
o Non-Semantic data is semantically lifted using existing ontology terms
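Schema-on-read and semantic lifting, as listed on this slide, can be illustrated with a small Python sketch. It is not Ontario's implementation: the CSV content, the `COLUMN_TO_TERM` mapping, the `lift` function and the `example.org` IRIs are all hypothetical. Only `http://schema.org/name` is an existing vocabulary term.

```python
import csv
import io

# Raw CSV kept as-is in the lake; no schema is applied on write.
RAW_CSV = "station,temp_c\nBonn,11.5\nValencia,19.0\n"

# Hypothetical mapping from column names to ontology terms.
COLUMN_TO_TERM = {
    "station": "http://schema.org/name",
    "temp_c": "http://example.org/vocab#temperatureCelsius",
}


def lift(raw_csv, mapping, base="http://example.org/obs/"):
    """Interpret the raw CSV at read time and yield
    (subject, predicate, object) triples, one per cell."""
    for i, row in enumerate(csv.DictReader(io.StringIO(raw_csv))):
        subject = f"{base}{i}"
        for column, value in row.items():
            yield (subject, mapping[column], value)


triples = list(lift(RAW_CSV, COLUMN_TO_TERM))
for triple in triples:
    print(triple)
```

The raw data is never rewritten. Changing the mapping changes how the same file is interpreted on the next read, which is the point of defining the schema only on read.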
45. SANSA: Semantic Analytics Stack
Abundant machine readable structured information is available (e.g. in RDF)
o Across SCs, e.g. Life Science Data (OpenPhacts)
o General: DBpedia, Google knowledge graph
o Social graphs: Facebook, Twitter
Need for scalable querying, inference & ML
o Link prediction
o Knowledge base completion
o Predictive analytics
48. More Information
Big Data Integrator:
https://github.com/big-data-europe
README includes extensive documentation, instructions and information on supported components
50. 2nd round of Societal Workshops
Domain | Date | Location | Remarks
Transport | 22 September 2016 | Brussels | Collocated with Big Data for Transport, Tisa workshop
Food&Agri | 30 September 2016 | Brussels | Collocated with DG AGRI WP2018-20 stakeholder consultation
Energy | 4 October 2016 | Brussels | Collocated with EC H2020 Info Day on “Smart Grids and Storage”
Climate | 11 October 2016 | Brussels | Collocated with Melodies Project Event – Exploiting Open Data
Security | 18 October 2016 | Brussels | Standalone Workshop
Societies | 5 December 2016 | Cologne | Collocated with EDDI16 - 8th Annual European DDI User Conference
Health | 9 December 2016 | Brussels | Standalone Workshop
51. Other Activities
Fresh set (7) of Societal Workshops in 2017
Various SC-focussed and general hangouts, follow!
o Apache Flink & BDE (20 Oct) – available online
o BDVA & BDE Webinar planned early next year
o Keep track on BDE Website (Events)
54. SANSA: Read Write Layer
Ingest RDF and OWL data in different formats using Jena / OWL API style interfaces
Represent data in multiple formats (e.g. RDD, Data Frames, GraphX, Tensors)
Allow transformation among these formats
Compute dataset statistics and apply functions to URIs, literals, subjects, objects → Distributed LODStats
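The kind of dataset statistics mentioned on this slide can be sketched on a single machine in plain Python. This is not the SANSA API: the triples, the prefixed names and the `dataset_stats` function are made up for illustration, and an in-memory list replaces the distributed representations (such as RDDs) named above.

```python
from collections import Counter

# Toy (subject, predicate, object) triples; all names are illustrative.
triples = [
    ("ex:alice", "rdf:type", "ex:Person"),
    ("ex:bob", "rdf:type", "ex:Person"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", '"Alice"'),
]


def dataset_stats(triples):
    """Compute a few LODStats-style counts over an iterable of triples.
    Literals are recognised here simply by a leading double quote."""
    triples = list(triples)
    return {
        "triples": len(triples),
        "distinct_subjects": len({s for s, _, _ in triples}),
        "predicate_usage": Counter(p for _, p, _ in triples),
        "literals": sum(1 for _, _, o in triples if o.startswith('"')),
    }


stats = dataset_stats(triples)
print(stats)
```

Each statistic is a count or a set size, so the same computation can in principle be expressed as map and reduce steps over partitions of the data.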
55. SANSA: Query Layer
To make generic queries efficient and fast using:
o Intelligent indexing
o Splitting strategies
o Distributed Storage
o Distributed/ Federated Querying
Early work in progress: query evaluation (SPARQL-to-SQL approaches, Virtual Views)
Provision of W3C SPARQL compliant endpoint
56. SANSA: Inference Layer
W3C Standards for Modelling: RDFS and OWL
Parallel in-memory inference via rule-based forward chaining
Beyond state of the art: dynamically build a rule dependency graph for a rule set
→ Adjustable performance levels
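Rule-based forward chaining can be illustrated with a naive, sequential Python sketch over two RDFS entailment rules (rdfs9 and rdfs11). It does not reproduce SANSA's parallel in-memory engine or its rule dependency graph. The `forward_chain` function and the `ex:` names are hypothetical.

```python
def forward_chain(triples):
    """Apply two RDFS rules until no new triple is derived (a fixpoint):
    rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    rdfs9:  (x type A), (A subClassOf B)       => (x type B)
    """
    inferred = set(triples)
    while True:
        sub = {(s, o) for s, p, o in inferred if p == "rdfs:subClassOf"}
        new = set()
        # rdfs11: transitivity of the subclass relation
        for a, b in sub:
            for b2, c in sub:
                if b == b2:
                    new.add((a, "rdfs:subClassOf", c))
        # rdfs9: propagate instance types up the class hierarchy
        for x, p, a in inferred:
            if p == "rdf:type":
                for a2, b in sub:
                    if a == a2:
                        new.add((x, "rdf:type", b))
        if new <= inferred:
            return inferred
        inferred |= new


facts = {
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:rex", "rdf:type", "ex:Dog"),
}
closure = forward_chain(facts)
print(sorted(closure))
```

The loop re-evaluates every rule on every pass. A rule dependency graph of the kind mentioned above would record which rules can feed which others, so an engine could skip rules whose inputs did not change.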
57. SANSA: ML Layer
Distributed Machine Learning (ML) algorithms that work on RDF data and make use of its structure / semantics
Work in Progress:
o Tensor Factorization for e.g. KB completion (testing stage)
o Simple spatiotemporal analytics (idea stage)
o Graph Clustering (testing stage)
o Association rule mining (evaluation stage)
o Semantic Decision trees (idea stage)