ICWE2017 BigDataEurope
1. BigDataEurope - Supporting the Variety Dimension of Big Data
Mohamed Nadjib MAMI - Fraunhofer IAIS
ICWE17 - 06.06.2017
2. Big Data Europe - the Project
◎ Funded by the EU Horizon 2020 programme
◎ Coordination & Support Action (CSA) project
o Show the societal value of Big Data in 7 domains
o Lower the barrier for using Big Data technologies
=> BigDataEurope Platform
4. BDE Europe - The Platform
◎ Integrator of Big Data technologies
o Easy to use and get started with (plug-and-play)
o Flexible, customisable
◎ Bundles only open source solutions for:
o Data Storage
o Message Passing
o Data Processing
o Data Searching & Publishing
◎ Publicly released in May 2017
6. BDE Platform - Architecture
[Figure: layered architecture of the BDE Platform]
◎ Support Layer: Init Daemon, GUIs, Base Setup
◎ App Layer: Traffic Forecast, Satellite Image Analysis, ...
◎ Platform Layer: Spark, Flink, Semantic Layer (Ontario, SANSA, Semagrow), Kafka, Real-time Stream Monitoring, ...
◎ Resource Management Layer (Swarm)
◎ Hardware Layer: Premises, Cloud (AWS, GCE, MS Azure, …)
◎ Data Layer: Hadoop, NoSQL Store, Cassandra, Elasticsearch, ..., RDF Store; Semantic Data Lake (Unified View)
7. BDE Platform - Hardware & Virtualization
◎ Docker used for packaging and deploying applications
◎ Based on containers:
o A lightweight environment that makes a piece of software run in isolation
❖ Shares the host operating system kernel (unlike VMs)
❖ Reduces conflicts, e.g., between versions
◎ Docker Compose: creates multi-container applications
8. BDE Platform - Resource Management
◎ Swarm (mode) used for managing, scheduling and orchestrating Docker containers in multi-node clusters
◎ It provides:
o Scalability and fault tolerance
o Container interlinking
o Log-based monitoring
◎ Separates hardware management from software management
◎ Based on services
o Service: the Swarm execution unit, running a Docker image
9. BDE Platform - Support Layer
◎ Init Daemon: orchestrates the initialization process of the components (containers of Docker Compose):
o Components report their initialization progress
o It validates whether a specific component can start
o It specifies the dependencies between services
o It indicates where human interaction is required
◎ Examples:
o Wait for data to load into HDFS before starting a Spark job
o Wait for the Spark Master to start successfully before starting a Worker
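The init daemon behaviour described above can be sketched as a small dependency tracker. This is only an illustrative model, not the BDE implementation: the class name `InitDaemonSketch`, its methods, and the component names are assumptions, and the three status values are borrowed from the Pipeline Monitor description in this deck.

```python
from typing import Dict, Iterable, Set

# Status values as listed for the Pipeline Monitor in this deck.
STATUSES = ("not started", "running", "finished")


class InitDaemonSketch:
    """Toy model of an init daemon (hypothetical API, not the BDE code)."""

    def __init__(self) -> None:
        self.depends_on: Dict[str, Set[str]] = {}
        self.status: Dict[str, str] = {}

    def register(self, component: str, depends_on: Iterable[str] = ()) -> None:
        # Record the dependencies of a component and mark it as not started.
        self.depends_on[component] = set(depends_on)
        self.status[component] = "not started"

    def report(self, component: str, status: str) -> None:
        # Components report their own initialization progress.
        if status not in STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.status[component] = status

    def can_start(self, component: str) -> bool:
        # A component may start once every dependency has finished.
        return all(self.status.get(dep) == "finished"
                   for dep in self.depends_on[component])


# The two examples from the slide, with made-up component names.
daemon = InitDaemonSketch()
daemon.register("hdfs-load")
daemon.register("spark-master")
daemon.register("spark-worker", depends_on=["spark-master"])
daemon.register("spark-job", depends_on=["hdfs-load", "spark-worker"])
```

In this sketch a worker asks `daemon.can_start("spark-worker")` and only proceeds after the master has reported `"finished"`.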
10. BDE Platform - User Interfaces
Pipeline Builder: creates a step-by-step dependency pipeline (fed to the init daemon)
[Screenshot: Component 1, Component 2, Component 3]
11. BDE Platform - User Interfaces
Pipeline Monitor: displays the status (not started, running or finished) of components in a running pipeline (retrieved from the init daemon)
[Screenshot: Component 1 - Finished; Component 2 - Finished; Component 3 - In progress]
12. BDE Platform - User Interfaces
Swarm UI: clones a Git repository containing a pipeline and deploys, controls and monitors it on Swarm
13. BDE Platform - User Interfaces
Integrator UI: displays the dashboard of each running component in a unified interface
14. BDE Platform - Semantic Layer > Ontario
◎ Data Lake or Swamp?
o Repository of data in its original formats
o Structured, semi-structured, unstructured
o Without unified schema
◎ Semantic Data Lake (Ontario)
o Add a Semantic Layer on top of the source datasets
❖ The data is semantically lifted using ontology terms
❖ Provide a uniform view over nonuniform data
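As a rough illustration of semantic lifting, the sketch below maps fields of a CSV record and a JSON record to shared ontology terms, so that both sources can be read through one uniform view. The `ex:` terms, the field names, the mapping tables and the sample values are all made up for illustration; the slides do not describe Ontario's actual mapping mechanism.

```python
import csv
import io
import json

# Hypothetical mappings from source fields to ontology terms (not from the slides).
CSV_MAPPING = {"country_name": "ex:country", "tb_deaths": "ex:deaths"}
JSON_MAPPING = {"loc": "ex:country", "condition": "ex:disease"}


def lift(record, mapping, subject):
    """Turn one source record into (subject, predicate, object) triples."""
    return [(subject, mapping[key], value)
            for key, value in record.items() if key in mapping]


# Two toy sources in their original formats (values are invented).
csv_source = "country_name,tb_deaths\nIndia,100\n"
json_source = '{"loc": "India", "condition": "Tuberculosis", "internal_id": 7}'

triples = []
for i, row in enumerate(csv.DictReader(io.StringIO(csv_source))):
    triples += lift(row, CSV_MAPPING, f"ex:obs{i}")
triples += lift(json.loads(json_source), JSON_MAPPING, "ex:trial0")

# Uniform view: one predicate covers the differently named source fields.
countries = {obj for (_, pred, obj) in triples if pred == "ex:country"}
```

Fields without a mapping (here `internal_id`) are simply left out of the lifted view.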
15. BDE Platform - Semantic Layer > Ontario
Query: “number of distinct publications and number of distinct deaths due to the disease Tuberculosis in India”
SELECT (COUNT(DISTINCT ?publication) AS ?no_of_publications)
       (COUNT(?deaths) AS ?no_of_deaths)
WHERE {
  ?item a qb:Observation .
  ?item gho:Country ?country .
  ?item gho:Disease ?disease .
  ?item att:unitMeasure gho:Measure .
  ?item eg:incidence ?deaths .
  ?country rdfs:label "India" .
  ?disease rdfs:label "Tuberculosis" .
  ?trial a ct:trials .
  ?trial ct:condition ?condition .
  ?trial ct:location ?location .
  ?trial ct:reference ?publication .
  ?condition owl:sameAs ?disease .
  ?location redd:locatedIn ?country .
  ?publication ct:citation ?citation .
}
The same triple patterns, regrouped by subject variable:
  ?item a qb:Observation .
  ?item gho:Country ?country .
  ?item gho:Disease ?disease .
  ?item att:unitMeasure gho:Measure .
  ?item eg:incidence ?deaths .

  ?trial a ct:trials .
  ?trial ct:condition ?condition .
  ?trial ct:location ?location .
  ?trial ct:reference ?publication .

  ?condition owl:sameAs ?disease .
  ?disease rdfs:label "Tuberculosis" .
  ?country rdfs:label "India" .
  ?location redd:locatedIn ?country .
  ?publication ct:citation ?citation .
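The triple patterns of this query can be grouped by their subject variable, which gives one group per entity (observation, trial, disease and so on). The sketch below shows that grouping; whether Ontario decomposes queries in exactly this way is an assumption, since the slides do not say, and the function name `group_by_subject` is hypothetical.

```python
from collections import defaultdict

# The triple patterns of the query, as (subject, predicate, object).
PATTERNS = [
    ("?item", "a", "qb:Observation"),
    ("?item", "gho:Country", "?country"),
    ("?item", "gho:Disease", "?disease"),
    ("?item", "att:unitMeasure", "gho:Measure"),
    ("?item", "eg:incidence", "?deaths"),
    ("?country", "rdfs:label", '"India"'),
    ("?disease", "rdfs:label", '"Tuberculosis"'),
    ("?trial", "a", "ct:trials"),
    ("?trial", "ct:condition", "?condition"),
    ("?trial", "ct:location", "?location"),
    ("?trial", "ct:reference", "?publication"),
    ("?condition", "owl:sameAs", "?disease"),
    ("?location", "redd:locatedIn", "?country"),
    ("?publication", "ct:citation", "?citation"),
]


def group_by_subject(patterns):
    """Group triple patterns that share the same subject."""
    groups = defaultdict(list)
    for subject, predicate, obj in patterns:
        groups[subject].append((predicate, obj))
    return dict(groups)


groups = group_by_subject(PATTERNS)
```

Each group could then be answered by whichever source holds that kind of entity, with the shared variables (`?country`, `?disease`, ...) linking the groups together.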
18. BDE Platform - Semantic Layer > SANSA
SANSA: a framework for distributed RDF data processing
http://sansa-stack.net
◎ Read/write Layer: reads and writes native RDF/OWL data in distributed storage, e.g., Hadoop, Spark (RDD, DataFrames, GraphX), Tensors, following different representations and partitioning schemes, e.g., graphs, tables
◎ Querying Layer: queries distributed RDF using SPARQL (SPARQL-to-SQL approaches, Virtual Views, Intelligent Indexing, ...)
19. BDE Platform - Semantic Layer > SANSA
◎ Inference Layer: derives new facts from existing ones, detects inconsistencies, extracts new rules to help in reasoning
◎ Machine Learning Layer: performs ML or analytics on RDF data to gain insights into relevant trends, predictions or anomalies
o Tensor Factorization, e.g., for KB completion (testing stage)
o Graph Clustering (testing stage)
o Association rule mining (evaluation stage)
o Semantic Decision trees (idea stage)
o Inference in Knowledge Graph Embeddings (idea stage)
20. BDE Platform - Semantic Layer > Semagrow
Semagrow: a SPARQL query processing system that federates multiple remote endpoints
◎ Original Semagrow
o Optimizes queries transparently
o Executes sub-queries in the remote endpoints
o Integrates results dynamically in heterogeneous data models
o Joins the partial results into the final query answer
◎ Next-gen Semagrow
o Supports different query languages
o Query planner and execution engine adapted, e.g., to translate SPARQL to CQL for Cassandra databases
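The federation steps listed above (execute sub-queries remotely, then join the partial results) can be illustrated with a toy example. This is not Semagrow's planner or API: the two "endpoints" are in-memory lists of variable bindings, and `run_subquery`, `join` and all data values are hypothetical.

```python
from typing import Dict, List

Binding = Dict[str, str]

# In-memory stand-ins for two remote endpoints (invented data).
ENDPOINT_A: List[Binding] = [
    {"?disease": "ex:tb", "?country": "ex:india"},
    {"?disease": "ex:flu", "?country": "ex:india"},
]
ENDPOINT_B: List[Binding] = [
    {"?trial": "ex:t1", "?disease": "ex:tb"},
    {"?trial": "ex:t2", "?disease": "ex:tb"},
    {"?trial": "ex:t3", "?disease": "ex:malaria"},
]


def run_subquery(endpoint: List[Binding], constraints: Binding) -> List[Binding]:
    """Return the bindings of one endpoint that satisfy the given constraints."""
    return [b for b in endpoint
            if all(b.get(var) == val for var, val in constraints.items())]


def join(left: List[Binding], right: List[Binding]) -> List[Binding]:
    """Join partial results on the variables they have in common."""
    joined = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[var] == r[var] for var in shared):
                joined.append({**l, **r})
    return joined


partial_a = run_subquery(ENDPOINT_A, {"?country": "ex:india"})
partial_b = run_subquery(ENDPOINT_B, {})
answer = join(partial_a, partial_b)
```

A nested-loop join keeps the sketch short; a real federated engine would choose join order and strategy as part of query optimization.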
21. BDE Showcases (pilots)
◎ 7 Societal Challenges > 7 pilot implementations
SC1 - Open PHACTS discovery platform relating to biological/medical questions
SC2 - Discovery and Linking of Viticulture-relevant information
SC3 - System monitoring in energy production units
SC4 - Short-Term traffic flow forecasting
SC5 - Supporting data-intensive climate research
SC6 - Citizens & Researchers Budget on Municipal Level
SC7 - Ingestion of remote sensing images and social sensing data to detect and verify changes on the Earth's surface for security applications
22. Showcase SC1: Health, demographic change and wellbeing
◎ SC1 implements the Open PHACTS Discovery Platform
o Integrates and links data from multiple sources: ChEBI, ChEMBL, the Gene Ontology and UniProt (chemical, biological, medical, etc.)
o Explores the relationships between data (compounds, targets, pathways, diseases and tissues)
o Data accessed using RESTful API requests
❖ Translated to SPARQL queries
◎ Technologies used:
o 4store, Memcached, MySQL, Puelia, Swagger
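The idea of translating a RESTful request into a SPARQL query can be sketched with a simple template lookup. The route `/compound`, the `label` parameter, the host `api.example.org` and the query template are all hypothetical; they do not reproduce the Open PHACTS API.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical route table: API path -> SPARQL template.
TEMPLATES = {
    "/compound": 'SELECT ?compound WHERE {{ ?compound rdfs:label "{label}" . }}',
}


def rest_to_sparql(url: str) -> str:
    """Translate a REST-style URL into a SPARQL query string."""
    parsed = urlparse(url)
    template = TEMPLATES[parsed.path]
    params = {key: values[0] for key, values in parse_qs(parsed.query).items()}
    label = params["label"]
    # Accept only plain labels so the value cannot break out of the string literal.
    if not all(ch.isalnum() or ch in " -" for ch in label):
        raise ValueError("unsupported character in label")
    return template.format(label=label)


query = rest_to_sparql("https://api.example.org/compound?label=Aspirin")
```

Keeping the SPARQL in templates lets API users work with plain URLs while the queries stay under the control of the platform.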
23. Showcase SC7: Secure Societies
◎ Detect changes in land cover in satellite images (e.g., monitoring critical infrastructures)
◎ Display geo-located events from news sites and social media (e.g., news articles, social networks)
◎ Three workflows:
o Change detection workflow
o Event detection workflow
o Activation workflow
◎ Technologies used: Apache Spark, Cassandra, Sextant, Semagrow, Strabon, GeoTriples
24. Showcase 2 (SC7): Secure Societies
[Figure: General Architecture of the SC7 Pilot]
25. Showcase 2 (SC7): Secure Societies
[Figure: Change detection workflow. Labels: area and time interval of interest; Satellite Images; Compare Images]
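The "Compare Images" step of the change detection workflow can be pictured with plain pixel differencing. This is a deliberately simple stand-in: the slides do not state which algorithm the SC7 pilot uses, and the function name, the threshold and the pixel values below are invented.

```python
def detect_changes(before, after, threshold):
    """Return (row, col) cells whose absolute difference exceeds the threshold."""
    if len(before) != len(after) or any(
            len(b) != len(a) for b, a in zip(before, after)):
        raise ValueError("images must have the same shape")
    return [(r, c)
            for r, (row_b, row_a) in enumerate(zip(before, after))
            for c, (b, a) in enumerate(zip(row_b, row_a))
            if abs(a - b) > threshold]


# Two tiny single-band "images" of the same area at different times.
before = [[10, 10, 10],
          [10, 10, 10]]
after = [[10, 60, 10],
         [12, 10, 90]]

changed = detect_changes(before, after, threshold=20)
```

The threshold filters out small differences (such as the 10 to 12 change above) that are more likely noise than an actual change on the ground.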
26. Showcase 2 (SC7): Secure Societies
[Figure: Event detection workflow. Labels: Associate names with coordinates; Cluster news into events (associate geo-location)]
27. Showcase 2 (SC7): Secure Societies
[Figure: Activation detection workflow. Labels: Areas with changes; Summary of events; Spatiotemporal RDF store]
28. Showcase 2 (SC7): Secure Societies
[Figure: refugee camps located in Zaatari, Jordan. Labels: News; Tweets; Selected Area; Detected changes]
29. Thanks & Questions?
For more info...
◎ Project-related: Simon Scerri (scerri@cs.uni-bonn.de)
◎ Ontario: Mohamed Nadjib Mami (mami@cs.uni-bonn.de)
◎ SANSA: Jens Lehmann (jens.lehmann@cs.uni-bonn.de)
◎ Semagrow: Stasinos Konstantopoulos (konstant@iit.demokritos.gr)
◎ Pilots (showcases):
o SC1: Ronald Siebes (rm.siebes@few.vu.nl)
o SC7: George Papadakis (gpapadis@di.uoa.gr)
o All: Ronald Siebes (rm.siebes@few.vu.nl)
◎ Github repos: https://github.com/big-data-europe/README
◎ Website: https://big-data-europe.eu
30. BDE Platform vs. Hadoop Distributions
[Comparison table not recovered]
SFR = Single failure recovery
MFR = Multiple failure recovery
SF = Self healing