More and more applications require real-time processing of heterogeneous data streams. In terms of the “Vs” of Big Data (volume, velocity, variety and veracity), they require addressing velocity and variety at the same time. Big Data solutions that handle velocity and variety separately have been around for a while, but only Stream Reasoning approaches those two dimensions at once. Current results in the Stream Reasoning field are relevant for application areas that need to handle massive datasets, process data streams on the fly, cope with heterogeneous, incomplete and noisy data, provide reactive answers, support fine-grained information access, and integrate complex domain models. Starting from those requirements, this talk frames the problem addressed by Stream Reasoning. It poses the research question and operationalises it with four simpler sub-questions. It describes how the database group of Politecnico di Milano positively answered those sub-questions over the last seven years of research. It briefly surveys alternative approaches investigated by other research groups worldwide and elaborates on current limitations and open challenges.
Stream reasoning: mastering the velocity and the variety dimensions of Big Data at once
1. Stream Reasoning: mastering the velocity and the variety dimensions of Big Data at once
Emanuele Della Valle
DEIB - Politecnico di Milano
@manudellavalle
emanuele.dellavalle@polimi.it
http://emanueledellavalle.org
University of Oslo, Norway - 3.11.2015
2. It's a streaming world …
• Off-shore oil operations
• Smart Cities
• Global Contact Center
• Social networks
• Generate data streams!
E. Della Valle, S. Ceri, F. van Harmelen, D. Fensel. It's a Streaming World! Reasoning upon Rapidly Changing Information. IEEE Intelligent Systems 24(6): 83-89 (2009)
UiO, Norway - 3.11.2015 @manudellavalle - http://emanueledellavalle.org
3. … looking for reactive answers …
• What is the expected time to failure when that turbine's bearing starts to vibrate as detected in the last 10 minutes?
• Is public transportation where the people are?
• Who are the best available agents to route all these unexpected contacts about the tariff plan launched yesterday?
• Who is driving the discussion about the top 10 emerging topics?
• Require continuous processing and reactive answers
4. … with conflicting requirements 1/8
A system able to answer those queries must be able to
• handle massive datasets
– A typical oil production platform is equipped with about 400,000 sensors
– Telecom data is the most pervasive data source in urban areas; in Milano there are 1.8 million mobile users
– A global contact centre of a Telecom operator counts 500 million clients
– Facebook alone has 1.1 billion active users
5. … with conflicting requirements 2/8
A system able to answer those queries must be able to
• process data streams on the fly
– The sensors on a typical oil production platform generate 10,000 observations per minute with peaks of 100,000 o/m
– The mobile users in Milano generate 20,000 call/sms/data connections per minute with peaks of 80,000 c/m
– A global contact centre receives 10,000 contacts per minute with peaks of 30,000 c/m
– Facebook, as of May 2013, observes 3 million "I like" per minute
6. … with conflicting requirements 3/8
A system able to answer those queries must be able to
• cope with heterogeneous datasets
– The sensors on a typical oil production platform have been deployed over 10 years by 10s of different producers
– Tens of data sources are normally needed to make sense of an urban phenomenon
– A global contact centre consists of 100s of offices owned by different subsidiary companies engaged yearly
– Each social network has its own data model, APIs, …
7. … with conflicting requirements 4/8
A system able to answer those queries must be able to
• cope with incomplete data
– 10s of sensors and networking links break down daily
– Coverage is incomplete
– Only standard cases are covered by fully machine-processable data records; 100s of contacts per minute are managed ad hoc
– Conversations happen outside the social networks, too!
8. … with conflicting requirements 5/8
A system able to answer those queries must be able to
• cope with noisy data
– Sensors out of operating range
– Faulty sensors
– Agents misunderstand, get tired, …
– Irony, sarcasm, …
9. … with conflicting requirements 6/8
A system able to answer those queries must be able to
• provide reactive answers
– detection of dangerous situations must occur within minutes
– recommendations to citizens must be performed in a few seconds
– routing a contact through each step of the decision tree must take less than a second
– search autocompletion may need to be updated every few minutes
10. … with conflicting requirements 7/8
A system able to answer those queries must be able to
• support fine-grained information access
– Identify a turbine among thousands
– Locate a bus among thousands
– Contact an agent among thousands
– Identify an opinion maker among thousands of influencers for a topic
11. … with conflicting requirements 8/8
A system able to answer those queries must be able to
• integrate complex domain models of
– operational and control processes
– various city aspects
– contact management, contract types, agent skills, contactor profiles, …
– topics, user profiles, …
12. Challenges
A system able to answer those queries must be able to
• handle massive datasets
• process data streams on the fly
• cope with heterogeneous datasets
• cope with incomplete data
• cope with noisy data
• provide reactive answers
• support fine-grained access
• integrate complex domain models
[Table: each requirement is marked against the dimensions it involves in Big Data terms: Volume, Velocity, Variety, Veracity]
13. Grand challenge
• Volume + Velocity + Variety = hard deal
[Figure: three axes. Volume (KB, MB, GB, TB, PB, EB, ZB), velocity (months, days, hours, min., sec., ms.), Variety]
14. A good reason to embrace it!
• ++ Variety → ++ value
[Figure: three axes. Value, velocity (ms., sec., min., hours, days, months, years), Variety]
15. From challenges to opportunities
• Formally, data streams are:
– unbounded sequences of time-varying data elements
• Less formally, in many application domains, they are:
– a "continuous" flow of information
– where recent information is more relevant as it describes the current state of a dynamic system
• Opportunities
– Forget old enough information
– Exploit the implicit ordering (by recency) in the data
[Figure: data elements along a time axis]
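The two opportunities named on this slide can be illustrated with a minimal sketch of a time-based sliding window. The class name `SlidingWindow`, the `width` parameter and the sample readings are assumptions made for illustration; they do not come from the talk.

```python
from collections import deque


class SlidingWindow:
    """Keeps only the stream elements observed in the last `width` time units."""

    def __init__(self, width):
        self.width = width
        self.items = deque()  # (timestamp, element) pairs, oldest first

    def push(self, timestamp, element):
        # Elements arrive ordered by time, so the oldest ones sit at the left end.
        self.items.append((timestamp, element))
        # Forget everything that is "old enough": timestamp <= now - width.
        while self.items and self.items[0][0] <= timestamp - self.width:
            self.items.popleft()

    def content(self):
        return [element for _, element in self.items]


# Hypothetical readings: (timestamp, value)
window = SlidingWindow(width=10)
for t, reading in [(1, "a"), (4, "b"), (9, "c"), (12, "d"), (15, "e")]:
    window.push(t, reading)
print(window.content())
```

Because the input is ordered by recency, eviction only ever inspects the left end of the queue, so forgetting costs constant amortised time per element.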
16. State-of-the-art: DSMS and CEP
• A paradigmatic change!
• Continuous queries registered over streams that are observed through windows
[Figure labels: Dynamic System, input streams, window, Registered Continuous Query, streams of answers]
17. DSMS and CEP vs. requirements
Requirements compared against DSMS/CEP:
• massive datasets
• data streams
• heterogeneous dataset
• incomplete data
• noisy data
• reactive answers
• fine-grained information access
• complex domain models
[Table: three of these requirements are marked ✗ for DSMS/CEP; the transcript does not preserve which rows]
18. State of the art: OBDA
• Given ontology O and query Q, use O to rewrite Q as Q' so that, for any set of ground facts A contained in multiple databases:
– answer(Q,O,A) = answer(Q',∅,A)
The answer of the query Q using the ontology O for any set of ground facts A is equal to the answer of a query Q' without considering the ontology O
• Use mapping M to map Q' to multiple SQL queries to the various databases
[Figure labels: Q, O, Rewrite, Q', M, Map, SQL, A, answer]
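The equality answer(Q,O,A) = answer(Q',∅,A) can be checked on a toy case. The sketch below is only an illustration under strong assumptions: the ontology contains nothing but subclass axioms, queries ask for the members of one class, and every class and individual name is hypothetical.

```python
# Toy ontology O: subclass axioms (child -> parent). All names are hypothetical.
SUBCLASS_OF = {"Turbine": "Equipment", "Pump": "Equipment"}

# Ground facts A: (individual, class) assertions.
FACTS = {("t1", "Turbine"), ("p1", "Pump"), ("e1", "Equipment"), ("s1", "Sensor")}


def answer_with_ontology(cls, facts):
    """answer(Q, O, A): materialise the entailed memberships, then query."""
    entailed = set(facts)
    changed = True
    while changed:
        changed = False
        for individual, c in list(entailed):
            parent = SUBCLASS_OF.get(c)
            if parent and (individual, parent) not in entailed:
                entailed.add((individual, parent))
                changed = True
    return {i for i, c in entailed if c == cls}


def rewrite(cls):
    """Rewrite Q into Q': the set of classes whose union needs no ontology."""
    result = {cls}
    changed = True
    while changed:
        changed = False
        for child, parent in SUBCLASS_OF.items():
            if parent in result and child not in result:
                result.add(child)
                changed = True
    return result


def answer_rewritten(classes, facts):
    """answer(Q', empty ontology, A): evaluate the union over the ground facts."""
    return {i for i, c in facts if c in classes}


print(answer_with_ontology("Equipment", FACTS))
print(answer_rewritten(rewrite("Equipment"), FACTS))
```

The rewritten query touches only the stored facts, which is what allows the mapping M to translate it into plain SQL over the underlying databases.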
19. DSMS/CEP, OBDA vs. requirements
Requirements compared against DSMS/CEP and OBDA:
• massive datasets
• data streams
• heterogeneous dataset
• incomplete data
• noisy data
• reactive answers
• fine-grained information access
• complex domain models
[Table: six cells are marked ✗ across the DSMS/CEP and OBDA columns; the transcript does not preserve which cells]
20. Stream Reasoning
• Research question
– is it possible to make sense in real time of multiple, heterogeneous, gigantic and inevitably noisy and incomplete data streams in order to support the decision processes of extremely large numbers of concurrent users?
• Proposed approach
[Figure: Complexity vs. Dynamics. Layers: Raw Stream Processing, Semantic Streams, DL-Lite, DL. Tasks: Abstraction, Selection, Interpretation, Reasoning, Querying, Re-writing. Complexity axis: AC0, PTIME, NEXPTIME. Change Frequency axis: 10^4 Hz to 1 Hz]
H. Stuckenschmidt, S. Ceri, E. Della Valle, F. van Harmelen: Towards Expressive Stream Reasoning. Proceedings of the Dagstuhl Seminar on Semantic Aspects of Sensor Networks, 2010.
21. Sub-research questions
1. Is it possible to extend the Semantic Web stack in order to represent heterogeneous data streams, continuous queries, and continuous reasoning tasks?
2. Do the ordered nature of data streams and the possibility to forget old enough information allow optimizing continuous querying and continuous reasoning tasks, so as to provide reactive answers to large numbers of concurrent users without forsaking correctness or completeness?
3. Can Semantic Web and Machine Learning technologies be jointly employed to cope with the noisy and incomplete nature of data streams?
4. Are there practical cases where processing data streams at the semantic level is the best choice?
23. State-of-the-art: RDF model
• RDF: Resource Description Framework
– It allows making statements about resources in the form of subject-predicate-object expressions
• In RDF terminology: triples
• E.g. @BarackObama posts "Four more years"
– A collection of RDF statements represents a labelled, directed graph
• In RDF terminology: a graph
• E.g., the tweet above by Barack Obama is connected to
– 800,000+ twitter user profiles via retweets
– 300,000+ twitter user profiles via favorites
– …
[Figure: a triple as subject, predicate, object]
24. Contribution: RDF stream models
• RDF Stream (the C-SPARQL way)
– Unbounded sequence of time-varying triples
– each represented by a pair made of an RDF triple and its timestamp
– Timestamps are non-decreasing (allowing for simultaneity)
…
(@BarackObama posts "Four more years", 8:16PM 6 Nov 2012)
(@Alice posts "RT: Four more years", 8:17PM 6 Nov 2012)
…
[Figure: a stream element as subject, predicate, object, timestamp]
D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: Querying RDF streams with C-SPARQL. SIGMOD Record 39(1): 20-26 (2010)
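A minimal sketch of the triple-based stream model described on this slide, assuming plain strings for RDF terms. The names `TimestampedTriple` and `is_valid_stream` are hypothetical and are not part of any C-SPARQL API.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TimestampedTriple:
    """One stream element: an RDF triple paired with its timestamp."""
    subject: str
    predicate: str
    obj: str
    timestamp: datetime


def is_valid_stream(prefix):
    """Check that timestamps never decrease (equal timestamps are allowed)."""
    return all(a.timestamp <= b.timestamp for a, b in zip(prefix, prefix[1:]))


stream = [
    TimestampedTriple("@BarackObama", "posts", "Four more years",
                      datetime(2012, 11, 6, 20, 16)),
    TimestampedTriple("@Alice", "posts", "RT: Four more years",
                      datetime(2012, 11, 6, 20, 17)),
]
print(is_valid_stream(stream))
```

Using `<=` rather than `<` in the check is what "allowing for simultaneity" amounts to: two triples may carry the same timestamp.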
25. Contribution: RDF stream models
• RDF Stream (the Streaming Linked Data way)
– Unbounded sequence of time-varying graphs
– each represented by a pair made of an RDF graph and its timestamp
– Timestamps (if present) are monotonically increasing
– Graphs act as a form of punctuation (all triples in a graph are simultaneous)
D.F. Barbieri, E. Della Valle: A Proposal for Publishing Data Streams as Linked Data - A Position Paper. LDOW (2010)
26. RDF streams time semantics 1/3
• An RDF stream without timestamps is an ordered sequence of data items
• The order can be exploited to perform queries
– Does Alice meet Bob before Carl?
– Who does Carl meet first?
Stream S:
e1 :alice :isWith :bob
e2 :alice :isWith :carl
e3 :bob :isWith :diana
e4 :diana :isWith :carl
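The two order-based questions can be answered with position comparisons alone. The following sketch assumes that `isWith` is symmetric (being with someone counts as a meeting for both parties), which the slide does not state; the helper names are hypothetical.

```python
# Stream S from the slide: an ordered sequence without timestamps.
S = [
    ("alice", "isWith", "bob"),    # e1
    ("alice", "isWith", "carl"),   # e2
    ("bob", "isWith", "diana"),    # e3
    ("diana", "isWith", "carl"),   # e4
]


def first_meeting(stream, a, b):
    """Position of the first item in which a and b are together, or None."""
    for position, (s, p, o) in enumerate(stream):
        if p == "isWith" and {s, o} == {a, b}:
            return position
    return None


def meets_before(stream, who, first, second):
    """True if `who` meets `first` at an earlier position than `second`."""
    i = first_meeting(stream, who, first)
    j = first_meeting(stream, who, second)
    return i is not None and j is not None and i < j


def meets_first(stream, who):
    """The first person `who` is with, in stream order, or None."""
    for s, p, o in stream:
        if p == "isWith" and who in (s, o):
            return o if s == who else s
    return None


print(meets_before(S, "alice", "bob", "carl"))
print(meets_first(S, "carl"))
```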
27. RDF streams time semantics 2/3
• One timestamp: the time instant at which the data item occurs
• We can start to compose queries taking into account the time
– How many people has Alice met in the last 5m?
– Does Diana meet Bob and then Carl within 5m?
[Figure: the items e1 to e4 of stream S placed as instants on a time axis t]
e1 :alice :isWith :bob
e2 :alice :isWith :carl
e3 :bob :isWith :diana
e4 :diana :isWith :carl
28. RDF streams time semantics 3/3
• Two timestamps: the time range in which the data item is valid (from, to]
• It is possible to write even more complex constraints:
– Which are the meetings that last less than 5m?
– Which are the meetings with conflicts?
[Figure: the items e1 to e4 of stream S drawn as intervals on a time axis t]
e1 :alice :isWith :bob
e2 :alice :isWith :carl
e3 :bob :isWith :diana
e4 :diana :isWith :carl
D. Anicic, P. Fodor, S. Rudolph, N. Stojanovic: EP-SPARQL: a unified language for event processing and stream reasoning. In WWW 2011, pages 635–644
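With two timestamps per item, both questions reduce to interval arithmetic. In the sketch below the interval bounds (in minutes) are invented for illustration, since the transcript does not preserve them, and "conflict" is assumed to mean two overlapping meetings that share a participant.

```python
from itertools import combinations

# (subject, predicate, object, valid_from, valid_to); valid in (valid_from, valid_to].
# The minute values are hypothetical.
S = [
    ("alice", "isWith", "bob", 0, 4),
    ("alice", "isWith", "carl", 3, 9),
    ("bob", "isWith", "diana", 5, 12),
    ("diana", "isWith", "carl", 12, 15),
]


def shorter_than(stream, minutes):
    """Meetings whose validity interval lasts less than `minutes`."""
    return [item for item in stream if item[4] - item[3] < minutes]


def overlaps(a, b):
    # Two intervals (from, to] share an instant iff each starts before the other ends.
    return a[3] < b[4] and b[3] < a[4]


def conflicts(stream):
    """Pairs of overlapping meetings that share a participant."""
    return [
        (a, b)
        for a, b in combinations(stream, 2)
        if overlaps(a, b) and {a[0], a[2]} & {b[0], b[2]}
    ]


print(shorter_than(S, 5))
print(conflicts(S))
```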
29. Finding
• The Semantic Web stack can be extended so as to incorporate streaming data as a first-class citizen
– RDF stream data model(s)
– Continuous SPARQL syntax and semantics
– Continuous deductive reasoning semantics
30. Work in progress
• In 2013, an RDF Stream Processing (RSP) community group was created at W3C: http://www.w3.org/community/rsp/
• RSP data model and serialization
– https://github.com/streamreasoning/RSP-QL/blob/master/Serialization.md
33. Contribution: Continuous-SPARQL
Who are the opinion makers? i.e., the users who are likely to influence the behavior of their followers
REGISTER STREAM OpinionMakers COMPUTED EVERY 5m AS
CONSTRUCT { ?opinionMaker sd:about ?res }
FROM STREAM <http://…> [RANGE 30m STEP 5m]
WHERE {
  ?opinionMaker ?opinion ?res .
  ?follower sioc:follows ?opinionMaker .
  ?follower ?opinion ?res .
  FILTER (cs:timestamp(?follower ?opinion ?res) >
          cs:timestamp(?opinionMaker ?opinion ?res) )
}
HAVING ( COUNT(DISTINCT ?follower) > 3 )
SR 2015, Austria - 3.11.2015 - @manudellavalle - http://emanueledellavalle.org
34. Contribution: Continuous-SPARQL
The opinion-makers query of slide 33, annotated with the extensions C-SPARQL adds to SPARQL:
• REGISTER STREAM …: query registration (for continuous execution)
• FROM STREAM clause
• [RANGE 30m STEP 5m]: WINDOW
• RDF Stream added as new output format
• cs:timestamp(…): built-in to access timestamps
D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: Querying RDF streams with C-SPARQL. SIGMOD Record 39(1): 20-26 (2010)
35. Finding
• The Semantic Web stack can be extended so as to incorporate streaming data as a first-class citizen
– RDF stream data model
– Continuous SPARQL syntax and semantics
– Continuous deductive reasoning semantics
36. Alternatives to C-SPARQL
• CQELS
– What: STREAM clause, focus on new answers
– Ref: Le-Phuoc, D., Dao-Tran, M., Xavier Parreira, J., & Hauswirth, M. A native and adaptive approach for unified processing of linked streams and linked data. In ISWC 2011, pages 370–388.
• SPARQLStream
– What: window in the past, focus on RDF to Stream operators
– Ref: Calbimonte, J.-P., Corcho, O., & Gray, A. J. G. Enabling ontology-based access to streaming data sources. In ISWC 2010, pages 96–111.
• EP-SPARQL
– What: focus on event-specific operators
– Ref: Anicic, D., Fodor, P., Rudolph, S., & Stojanovic, N. EP-SPARQL: a unified language for event processing and stream reasoning. In WWW 2011, pages 635–644.
• TEF-SPARQL
– What: adds "facts" as first-class elements
– Ref: https://www.merlin.uzh.ch/publication/show/8467
37. Alternatives to C-SPARQL
• Comparison between existing approaches
System | S2R | R2R | Time-aware | R2S
C-SPARQL Engine | Logical and triple-based | SPARQL 1.1 query | timestamp function | Batch only
Streaming Linked Data Framework | Logical and graph-based | SPARQL 1.1 | no | Batch only
SPARQLstream | Logical and triple-based | SPARQL 1.1 query | no | Ins, batch, del
CQELS | Logical and triple-based | SPARQL 1.1 query | no | Ins only
TEF-SPARQL | no | SPARQL-like | Temporarily Facts, BEFORE, SINCE, UNTIL, DURING | Batch only
EP-SPARQL | no | SPARQL 1.0 | SEQ, PAR, AND, OR, DURING, STARTS, EQUALS, NOT, MEETS, FINISHES | Ins only
38. Work in progress at RSP@W3C
• RSP-QL
– Syntax
• https://github.com/streamreasoning/RSP-QL/blob/master/RSP-QL%20Sample%20Queries.md
– Proposed semantics
• D. Dell'Aglio, E. Della Valle, J.-P. Calbimonte, Ó. Corcho: RSP-QL Semantics: A Unifying Query Model to Explain Heterogeneity of RDF Stream Processing Systems. Int. J. Semantic Web Inf. Syst. 10(4): 17-44 (2014)
– Semantics (work in progress)
• https://github.com/streamreasoning/RSP-QL/blob/master/Semantics.md
– Quick ref.
• D. Dell'Aglio, J.-P. Calbimonte, E. Della Valle, Ó. Corcho: Towards a Unified Language for RDF Stream Query Processing. ESWC (Satellite Events) 2015: 353-363
39. Contribution: continuous deductive reasoning
• DL Ontology Stream S_T
– An ontology stream with respect to a static TBox T is a sequence of ABox axioms S_T(i)
• A Windowed Ontology Stream S_T(o,c]
– A windowed ontology stream with respect to a static TBox T is the union of the ABox axioms S_T(i) where o < i ≤ c
• Reasoning on a Windowed Ontology Stream S_T(o,c] is as reasoning on a static DL KB
Emanuele Della Valle, Stefano Ceri, Davide Francesco Barbieri, Daniele Braga, Alessandro Campi: A First Step Towards Stream Reasoning. FIS 2008: 72-81
40. Example of continuous deductive reasoning
What impact has my micropost p1 had in the last hour?
Let's count the number of microposts that discuss it …
REGISTER STREAM ImpactMeter AS
SELECT (count(?p) AS ?impact)
FROM STREAM <http://…/fb> [RANGE 60m STEP 10m]
WHERE {
  :Alice posts [ sr:discusses ?p ]
}
[Figure: microposts p1 to p8 linked by seven "discusses" edges. Alice posts p1; because discusses is a transitive property, the answer is 7!]
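The example can be reproduced with a small sketch that first builds the window content and then reasons on it as on a static knowledge base. The stream below is hypothetical: the transcript keeps the eight posts and the seven "discusses" edges but not how they are connected, so the edge layout and time indexes are assumptions chosen so that every other post transitively discusses p1.

```python
# Hypothetical stream: at each time index i, a set of (x, y) pairs meaning
# "x discusses y".
STREAM = {
    1: {("p2", "p1")},
    2: {("p3", "p1"), ("p4", "p2")},
    3: {("p5", "p3")},
    4: {("p6", "p4"), ("p7", "p4")},
    5: {("p8", "p5")},
}


def window(stream, o, c):
    """Window content (o, c]: union of the assertions at indexes o < i <= c."""
    return set().union(*(axioms for i, axioms in stream.items() if o < i <= c))


def transitive_closure(pairs):
    """Apply the transitivity rule until no new pair is derived."""
    closure = set(pairs)
    while True:
        derived = {(a, d) for a, b in closure for c, d in closure if b == c}
        if derived <= closure:
            return closure
        closure |= derived


def impact(stream, o, c, post):
    """Number of posts in the window that (transitively) discuss `post`."""
    closure = transitive_closure(window(stream, o, c))
    return len({x for x, y in closure if y == post})


print(impact(STREAM, 0, 5, "p1"))
```

Without the transitivity rule only the two posts that discuss p1 directly would be counted; recomputing the closure from scratch on every window, as done here, is the naive strategy that incremental algorithms aim to avoid.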
41. Finding
• The Semantic Web stack can be extended so as to incorporate streaming data as a first-class citizen
– RDF stream data model
– Continuous SPARQL syntax and semantics
– Continuous deductive reasoning semantics
42. Alternatives to continuous deductive (RDFS++) reasoning
• ETALIS
– What: RDFS + Allen Algebra
– Ref: Anicic, D., Rudolph, S., Fodor, P., & Stojanovic, N. Stream reasoning and complex event processing in ETALIS. Semantic Web, 3(4), 2012, 397–407.
• STARQL
– What:
• DL-Lite + Conjunctive Query + time-series
• SHI + Grounded Conjunctive Queries + time-series
– Ref: Ö.L. Özçep, R. Möller. Ontology Based Data Access on Temporal and Streaming Data. Reasoning Web, 2014
• ASP-based
– What: time-decaying ASP
– Ref: http://arxiv.org/abs/1301.1392
• LARS
– What: high-level unified formal foundation for stream reasoning
– Ref: H. Beck, M. Dao-Tran, T. Eiter, M. Fink: LARS: A Logic-Based Framework for Analyzing Reasoning over Streams. AAAI 2015: 1431-1438
43. Sub-research questions (the four questions of slide 21, shown again)
44. Contribution: optimize querying for reactive answers
• C-SPARQL engine time window-based selection outperforms SPARQL filter-based selection (Jena-ARQ)
[Figure: performance comparison; label: our in-memory RDF stream processing engine]
D. Barbieri, D. Braga, S. Ceri, E. Della Valle, Y. Huang, V. Tresp, A. Rettinger, H. Wermser: Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics. IEEE Intelligent Systems, 30 Aug. 2010.
45. Finding
• The Stream Reasoning task is feasible, and the very nature of streaming data offers opportunities to optimise reasoning tasks where data is ordered by recency and can be forgotten after a while
– C-SPARQL Engine prototype
– IMaRS continuous incremental reasoning algorithm
46. Work in progress
• When volume also matters …
[Figure labels: Data Stream, Window, Join, Local View, Maintenance Policy, RSP engine, SPARQL endpoint, Web]
Soheila Dehghanzadeh, Daniele Dell'Aglio, Shen Gao, Emanuele Della Valle, Alessandra Mileo, Abraham Bernstein: Approximate Continuous Query Answering over Streams and Dynamic Linked Data Sets. ICWE 2015: 307-325
47. State-of-the-art deductive reasoning
• Data-driven (a.k.a. forward reasoning)
• Query-driven – backward reasoning
• Query-driven – query rewriting (a.k.a. ontology based data access)
[Figure: one pipeline per approach. Labels: RDF data, ontology, Reasoner, Inferred data, Rewritten query, SPARQL, data]
48. Naïve approaches to Stream Reasoning: windowing then reasoning
• Data-driven (a.k.a. forward reasoning)
• Query-driven – backward reasoning
• Query-driven – query rewriting (a.k.a. ontology based data access)
[Figure: the three pipelines of slide 47, each with a Window placed on its input. Labels: Window, RDF data, ontology, Reasoner, Inferred data, Rewritten query, SPARQL, data]
49. Not so naïve approach to stream reasoning
• The problem is that materializations (the results of data-driven processing) are very difficult to decrement efficiently.
– State-of-the-art: DRed algorithm
• Over delete
• Re-derive
• Insert
[Figure labels: window, insertions, deletions, ontology, Reasoner (Incremental !!!), Inferred data, SPARQL]
Y. Ren, J. Z. Pan. Optimising ontology stream reasoning with truth maintenance system. In CIKM (2011)
50. Is DRed needed?
• DRed works with random insertions and deletions
• In a streaming setting, when a triple enters the window, given the size of the window, the reasoner knows already when it will be deleted!
• E.g.,
– if the window is 40 minutes long, and
– it is 10:00, the triple(s) entering now
– will exit at 10:40.
• Conclusion
– deletions are predictable
Time | Enter window | Exit window
10:00 | A→B |
10:10 | B→C |
10:20 | A→E |
10:30 | E→C |
10:40 | | A→B
10:50 | | B→C
11:00 | | A→E
[Figure: for each instant, the graph over A, B, C, E that is explicitly in the window and the graph inferred in the window]
51. Contribution: IMaRS algorithm
• Idea:
– add an expiration time to each triple and
– use a hash table to index triples by their expiration time
• The algorithm
1. deletes expired triples,
2. adds the new derivations that are consequences of insertions, annotating each inferred triple with an expiration time (the min of those of the triples it is derived from), and
3. when multiple derivations occur, for each multiple derivation, it keeps the max expiration time.
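The three steps can be sketched for the simplest possible rule set. The code below is not the IMaRS implementation: it is a minimal illustration that assumes a single transitive property, integer timestamps in minutes, and hypothetical names (`ExpiringTransitiveReasoner`, `advance`). It replays the timeline of slide 50 with a 40-minute window.

```python
from collections import defaultdict


class ExpiringTransitiveReasoner:
    """Maintains the transitive closure of edges whose expiration is known on entry."""

    def __init__(self, window):
        self.window = window
        self.expiry = {}                   # triple (a, b) -> expiration time
        self.by_expiry = defaultdict(set)  # expiration time -> triples (the index)

    def _store(self, triple, expires):
        old = self.expiry.get(triple)
        if old is not None and old >= expires:
            return False                   # an equally or longer lived derivation exists
        if old is not None:
            self.by_expiry[old].discard(triple)
        self.expiry[triple] = expires      # step 3: keep the max expiration time
        self.by_expiry[expires].add(triple)
        return True

    def advance(self, now, inserted=()):
        # Step 1: delete expired triples, located through the expiration index.
        for t in [t for t in self.by_expiry if t <= now]:
            for triple in self.by_expiry.pop(t):
                del self.expiry[triple]
        # Step 2: add insertions and their consequences; an inferred triple
        # expires with the earliest-expiring triple it was derived from.
        agenda = [(triple, now + self.window) for triple in inserted]
        while agenda:
            (a, b), expires = agenda.pop()
            if not self._store((a, b), expires):
                continue
            for (c, d), other in list(self.expiry.items()):
                if c == b:                 # (a, b) and (b, d) entail (a, d)
                    agenda.append(((a, d), min(expires, other)))
                if d == a:                 # (c, a) and (a, b) entail (c, b)
                    agenda.append(((c, b), min(expires, other)))


# Timeline of slide 50, in minutes since midnight (10:00 = 600).
reasoner = ExpiringTransitiveReasoner(window=40)
reasoner.advance(600, [("A", "B")])
reasoner.advance(610, [("B", "C")])   # infers A->C, expiring with A->B at 10:40
reasoner.advance(620, [("A", "E")])
reasoner.advance(630, [("E", "C")])   # second derivation of A->C, expiring at 11:00
reasoner.advance(640)                 # A->B expires; A->C survives
print(sorted(reasoner.expiry.items()))
```

No over-deletion or re-derivation is needed when A→B leaves the window: the inferred triple A→C already carries the later expiration time obtained from its second derivation.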
52. Contribution: IMaRS algorithm
• Incremental Reasoning on RDF streams (IMaRS): new reasoning algorithm optimized for reactive query answering
[Figure: three strategies (re-materialize after each window slide, use DRed, IMaRS) plotted against the % of deletions w.r.t. the content of the window]
D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, M. Grossniklaus: Incremental Reasoning on Streams and Rich Background Knowledge. ESWC (1) 2010: 1-15
D. Dell'Aglio, E. Della Valle: Incremental Reasoning on RDF Streams. In A. Harth, K. Hose, R. Schenkel (Eds.) Linked Data Management, CRC Press 2014, ISBN 9781466582408
53. Contribution: IMaRS algorithm
• Comparison of the average time needed to answer a C-SPARQL query, when 2% of the content exits the window each time it slides, using
– A backward reasoner on the window content
– DRed + standard SPARQL on the materialization
– IMaRS + standard SPARQL on the materialization
54. Finding
• The Stream Reasoning task is feasible, and the very nature of streaming data offers opportunities to optimise reasoning tasks where data is ordered by recency and can be forgotten after a while
– C-SPARQL Engine prototype
– IMaRS continuous incremental reasoning algorithm
55. Optimizing for stream reasoning: alternative approaches
• DyKnow
– How: logical models of an observed dynamic system + metric temporal logics
– Ref: Fredrik Heintz, Jonas Kvarnström, Patrick Doherty: Bridging the sense-reasoning gap: DyKnow - Stream-based middleware for knowledge processing. Advanced Engineering Informatics 24(1): 14-26 (2010)
• MorphStream
– How: rewriting in DSMS languages (one at a time)
– Ref: Calbimonte, J.-P., Corcho, O., & Gray, A. J. G. Enabling ontology-based access to streaming data sources. In ISWC 2010, pages 96–111.
• TR-OWL
– How: truth maintenance for EL++ with syntactic approximations
– Ref: Y. Ren, J. Z. Pan. Optimising ontology stream reasoning with truth maintenance system. In CIKM (2011)
• ETALIS
– How: rewriting in Prolog
– Ref: Anicic, D., Rudolph, S., Fodor, P., & Stojanovic, N. Stream reasoning and complex event processing in ETALIS. Semantic Web, 3(4), 2012, 397–407.
(continues in the next slide)
56. Optimizing for stream reasoning: alternative approaches
• Sparkwave
– How: extended RETE algorithm for windows and RDFS
– Ref: S. Komazec, D. Cerri: Sparkwave: Continuous Schema-Enhanced Pattern Matching over RDF Data Streams. DEBS 2012
• DynamiTE
– How: truth maintenance for ρDF (a fragment of RDFS)
– Ref: J. Urbani, A. Margara, C. J. H. Jacobs, F. van Harmelen, H.E. Bal: DynamiTE: Parallel Materialization of Dynamic RDF Data. ISWC (1) 2013: 657-672
• STARQL
– How: rewriting on a scalable DSMS with time-series support
– Ref: Ö.L. Özçep, R. Möller. Ontology Based Data Access on Temporal and Streaming Data. Reasoning Web, 2014
• ASP-based
– How: optimizing ASP for incremental and time-decaying programs
– Ref: http://arxiv.org/abs/1301.1392
• The Backward/Forward Algorithm
– How: optimizing DRed
– Ref: B. Motik, Y. Nenov, R.E.F. Piro, I. Horrocks: Incremental Update of Datalog Materialisation: the Backward/Forward Algorithm. AAAI 2015: 1560-1568
57. Sub-research questions (the four questions of slide 21, shown again)
58. Cope with noisy and incomplete data
• "Noise" is reduced using DSMS techniques
• Deductive stream reasoning copes with incompleteness by deducing implicit facts
• Inductive stream reasoning copes with "irreparable" incompleteness by inducing missing facts
D.F. Barbieri, D. Braga, S. Ceri, E. Della Valle, Y. Huang, V. Tresp, A. Rettinger, H. Wermser: Deductive and Inductive Stream Reasoning for Semantic Social Media Analytics. IEEE Intelligent Systems 25(6): 32-41 (2010)
59. Findings
• A combination of deductive and inductive stream reasoning techniques can cope with incomplete and noisy data
60. Alternative approaches
• Stream Reasoning with Probabilistic Answer Set Programming
– Matthias Nickles, Alessandra Mileo: Web Stream Reasoning Using Probabilistic Answer Set Programming. RR 2014: 197-205
– Anastasios Skarlatidis, Georgios Paliouras, Alexander Artikis, George A. Vouros: Probabilistic Event Calculus for Event Recognition. ACM Trans. Comput. Log. 16(2): 11:1-11:37 (2015)
– Anni-Yasmin Turhan, Erik Zenker: Towards Temporal Fuzzy Query Answering on Stream-based Data. HiDeSt@KI 2015: 56-69
61. Sub-research questions (the four questions of slide 21, shown again)
62. Contribution: Streaming Linked Data Framework
[Figure: architecture. A Data Source feeds, over HTTP, the Streaming Linked Data Server, whose components (Adapter, Decorator, Analyser, Publisher, Recorder, Re-player) share a Stream Bus; the server feeds, over HTTP, a Visualizer in an HTML5 Browser]
Marco Balduini, Emanuele Della Valle, Daniele Dell'Aglio, Mikalai Tsytsarau, Themis Palpanas, Cristian Confalonieri: Social Listening of City Scale Events Using the Streaming Linked Data Framework. International Semantic Web Conference (2) 2013: 1-16
64. Practical cases
• 10+ deployments in Sensor Networks & Social media analytics, e.g.
– BOTTARI: Winner of Semantic Web Challenge 2011
– City Data Fusion: Winner of IBM faculty award 2013
– Social Listener
M. Balduini, I. Celino, D. Dell'Aglio, E. Della Valle, Y. Huang, T. Lee, S.-H. Kim, V. Tresp: BOTTARI: An augmented reality mobile application to deliver personalized and location-based recommendations by continuous analysis of social media streams. J. Web Sem. 16: 33-41 (2012)
M. Balduini, E. Della Valle, M. Azzi, R. Larcher, F. Antonelli, P. Ciuccarelli: CitySensing: Fusing City Data for Visual Storytelling. IEEE MultiMedia 22(3): 44-53 (2015)
65. Findings
1. The Semantic Web stack can be extended so as to incorporate streaming data as a first-class citizen
– RDF stream data model
– Continuous SPARQL syntax and semantics
– Continuous deductive reasoning semantics
2. The Stream Reasoning task is feasible, and the very nature of streaming data offers opportunities to optimise reasoning tasks where data is ordered by recency and can be forgotten after a while
– IMaRS continuous incremental reasoning algorithm
– C-SPARQL Engine prototype
3. A combination of deductive and inductive stream reasoning techniques can cope with incomplete and noisy data
4. There are application domains where Stream Reasoning offers an adequate solution
66. Open issues
1. The Semantic Web stack can be extended
– "Navigating the Chasm between the Scylla of Practical Applications and the Charybdis of Theoretical Approaches" A. Bernstein, 2015
2. The Stream Reasoning task is feasible
– It's time to start removing assumptions
• knowledge does not change
• background data does not change
– OBDA for SQL ≠ OBDA for continuous querying
3. Stream reasoning can cope with incomplete and noisy data
– Theory is needed!
4. There are application domains where Stream Reasoning offers an adequate solution
– Rigorous quantitative comparative research is needed
67. Advertisements :-P
• Check out my PhD thesis
– http://dare.ubvu.vu.nl/handle/1871/53293
– Chapter 1: Introduction
• The content of this presentation
– Chapter 8: Conclusions
• A review of stream reasoning approaches updated in spring 2015
• Put an "I like" to Stream Reasoning on Facebook
– https://www.facebook.com/streamreasoning
68. Thank you! Any questions?
Emanuele Della Valle
DEIB - Politecnico di Milano
emanuele.dellavalle@polimi.it
http://emanueledellavalle.org
University of Oslo, Norway - 3.11.2015