At the eGov Innovation Day 2014 ("DONNÉES DE L’ADMINISTRATION, UNE MINE (qui) D’OR(t)"), Philippe Cudré-Mauroux presents Big Data and eGovernment.
Big Data and eGovernment
1. Big Data & eGovernment
Prof. Dr. Philippe Cudré-Mauroux
eXascale Infolab, University of Fribourg
Switzerland
eGov Innovation Day
November 28, 2014
Fribourg – Switzerland
9. Big Data Buzz
Between now and 2015, the firm expects big data to
create some 4.4 million IT jobs globally; of those, 1.9
million will be in the U.S. Applying an economic
multiplier to that estimate, Gartner expects each new big-data-
related IT job to create work for three more people
outside the tech industry, for a total of almost 6 million
more U.S. jobs.
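The multiplier arithmetic in the quote above can be checked in a few lines. This is only a sketch of the calculation as quoted; the function name `induced_jobs` is illustrative and not taken from Gartner.

```python
# Figures quoted in the text above (Gartner estimate).
US_IT_JOBS = 1_900_000   # new big-data IT jobs expected in the U.S.
MULTIPLIER = 3           # additional non-tech jobs per new IT job


def induced_jobs(it_jobs: int, multiplier: int) -> int:
    """Jobs created outside the tech industry by the new IT jobs."""
    return it_jobs * multiplier


extra_us_jobs = induced_jobs(US_IT_JOBS, MULTIPLIER)
print(f"{extra_us_jobs:,}")  # 5,700,000, i.e. "almost 6 million"
```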
Growth in the Asia Pacific Big Data market
is expected to accelerate rapidly in two to
three years' time, from a mere US$258.5
million last year to in excess of $1.76 billion
in 2016, with highest growth in the storage
segment.
10. Big Data as a New Class of Asset
• The Age of Big Data (NYTimes Feb. 11, 2012)
“The new megarich of Silicon Valley, first at Google and now Facebook, are
masters at harnessing the data of the Web — online searches, posts and
messages — with Internet advertising. At the World Economic Forum last
month in Davos, Switzerland, Big Data was a marquee topic. A report by the
forum, “Big Data, Big Impact,” declared data a new class of economic asset,
like currency or gold.”
• Hype => fact (deal with it)
• Problem => opportunity
11. Big Data Central Theorem
Data + Technology → Actionable Insight → $$
Reporting, Monitoring, Root Cause Analysis,
(User) Modeling, Prediction
13. 10 ways big data changes everything
• Some concrete examples
– http://gigaom.com/2012/03/11/10-ways-big-data-is-changing-everything/2/
1. Can gigabytes predict the next Lady Gaga?
2. How big data can curb the world’s energy consumption
3. Big data is now your company’s virtual assistant
4. The future of Foursquare is data-fueled recommendations
5. How Twitter data-tracked cholera in Haiti
6. Revolutionizing Web publishing with big data
7. Can cell phone data cure society’s ills?
8. How data can help predict and create video hits
9. The new face of data visualization
10. One hospital’s embrace of big data
14. The 3-Vs of Big Data
• Volume
– Amount of data
• Velocity
– Speed of data in and out
• Variety
– Range of data types and sources
• [Gartner 2012] "Big Data are high-volume, high-velocity, and/or high-variety
information assets that require new forms of processing to
enable enhanced decision making, insight discovery and process
optimization"
20. 3 eGov / SmartCities Examples from XI
• Three Big Data examples from the eXascale Infolab:
– Volume: energy provisioning
– Velocity: detecting anomalies in smart-cities
– Variety: integrating information
21. Volume: Energy Provisioning
• Wide adoption of smart-meter technology
– Individuals / neighborhood / city / country
More data => better energy provisioning?
22. Not that Easy…
• Very difficult to analyze energy signals in a database!
• Solution: new encoding system, new database
23. Results
• 250x faster than current solutions
• Error on prediction reduced by 100x
• Paper presented at BigData 2014, Washington DC
24. Velocity: Real-Time Data Management for Smarter Cities
• Detecting leaks / pipe bursts / contamination in real-time
for water distribution networks
25. Sensors installed in the water pipes!
• Spatial + temporal statistical processing (mini-Lisas)
• Stream processing (Storm) + Array processing (SciDB)
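"Mini-Lisas" most likely refers to LISA statistics (local indicators of spatial association). As a rough illustration, the sketch below computes the textbook local Moran's I for a row of sensors. It is not the authors' implementation: the sensor layout, the readings, and the function name `local_morans_i` are all hypothetical.

```python
from statistics import fmean


def local_morans_i(values, neighbors):
    """Local Moran's I per site, using row-standardised binary weights.

    values:    one reading per sensor
    neighbors: dict mapping a sensor index to its neighbor indexes
    """
    n = len(values)
    mean = fmean(values)
    z = [v - mean for v in values]          # deviations from the global mean
    m2 = sum(d * d for d in z) / n          # second moment (variance)
    if m2 == 0:
        return [0.0] * n                    # all readings identical
    scores = []
    for i in range(n):
        nbrs = neighbors.get(i, [])
        lag = fmean([z[j] for j in nbrs]) if nbrs else 0.0
        scores.append(z[i] * lag / m2)
    return scores


# Hypothetical pressure readings from seven sensors along one pipe.
readings = [5.0, 5.1, 4.9, 2.0, 5.0, 5.1, 4.9]
line_neighbors = {i: [j for j in (i - 1, i + 1) if 0 <= j < len(readings)]
                  for i in range(len(readings))}
scores = local_morans_i(readings, line_neighbors)
suspect = scores.index(min(scores))
print(suspect)
```

A strongly negative score marks a sensor whose reading disagrees with its neighbors, which is the kind of local outlier a leak or burst would produce.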
[Figure: system architecture. Labels: sensors (sensor 1053, sensor 1054); base stations (17, 29, 42); Peer Information Management overlay; OLTP / HYRISE / OLAP (shown three times); Array Data Management System; Anomaly Detection; Alert.]
[Figure: Stream Processing Flow. Decisions: Missing Data?, Anomaly Detected?, Fluctuation? Processing steps: Sliding-Window Average, Mini-Lisa Computations, Delta Compression. Emitted events: Data Gap Event, Anomaly Event, Publish Value Event, Alive Event.]
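One plausible reading of the stream processing flow is sketched below in plain Python. The order of the steps, the thresholds, and the class name `SensorStream` are assumptions, and a simple deviation-from-average test stands in for the mini-Lisa computation. The actual system runs on Storm and SciDB, which are not used here.

```python
from collections import deque


class SensorStream:
    """Per-sensor event flow: gap check, anomaly check, delta compression."""

    def __init__(self, window=3, anomaly_threshold=1.0, delta=0.1):
        self.window = deque(maxlen=window)        # recent accepted readings
        self.anomaly_threshold = anomaly_threshold
        self.delta = delta                        # minimum change worth publishing
        self.last_published = None

    def process(self, reading):
        # Missing data? -> Data Gap event
        if reading is None:
            return ("data_gap", None)
        # Anomaly detected? (stand-in for the mini-Lisa computation)
        if len(self.window) == self.window.maxlen:
            baseline = sum(self.window) / len(self.window)
            if abs(reading - baseline) > self.anomaly_threshold:
                return ("anomaly", reading)
        # Sliding-window average over accepted readings
        self.window.append(reading)
        smoothed = sum(self.window) / len(self.window)
        # Delta compression: publish only when the value fluctuates
        if (self.last_published is None
                or abs(smoothed - self.last_published) > self.delta):
            self.last_published = smoothed
            return ("publish_value", smoothed)
        return ("alive", None)


stream = SensorStream()
events = [stream.process(r) for r in [5.0, 5.0, 5.02, None, 5.01, 2.0, 5.6]]
kinds = [kind for kind, _ in events]
print(kinds)
```

Delta compression keeps network traffic low: a sensor whose smoothed value barely changes sends only a lightweight "alive" event instead of the value itself.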
26. Variety: Integrating eGov Data
• Integration: still the biggest IT problem (Gartner)
• 2 inherently difficult problems
– Integrating various data formats / text
– Automated integration
27. Paradigm Change
• Use of Semantic / Knowledge Graphs to store, trace
(provenance) and integrate information
– RDF, Linked Data
– Excel, Word, CSV, XML, Relational
• Combines both algorithmic and human matchers
using probabilistic networks
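To make the idea of storing and tracing information in a graph concrete, here is a minimal plain-Python stand-in for RDF triples, with a fourth field recording provenance. The file names, column names, and values are invented for illustration; a real deployment would use an RDF store rather than tuples.

```python
import csv
import io


def csv_to_quads(text, source, key):
    """Turn each CSV row into (subject, predicate, object, source) tuples."""
    quads = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = row[key]
        for column, value in row.items():
            if column != key and value:
                quads.append((subject, column, value, source))
    return quads


# Two hypothetical sources describing the same entities.
population_csv = "commune,population\ncommune-A,100\ncommune-B,200\n"
energy_csv = "commune,kwh\ncommune-A,5000\n"

graph = (csv_to_quads(population_csv, "population.csv", "commune")
         + csv_to_quads(energy_csv, "energy.csv", "commune"))

# Everything known about one entity, with the source of each fact.
about_a = [(p, o, src) for s, p, o, src in graph if s == "commune-A"]
print(about_a)
```

Because every fact is a small self-describing statement, data from differently shaped sources can be merged by simple concatenation, and the extra field keeps track of where each fact came from.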
28. ZenCrowd
• Uses sets of algorithmic matchers to match entities to
rich knowledge graphs
• Uses human intelligence at scale through crowdsourcing
• Combines both algorithmic and human matchers using
probabilistic networks
[Figure: ZenCrowd architecture. Input: HTML pages. Output: HTML + RDFa pages. Labels: Entity Extractors; Algorithmic Matchers; LOD Index (Get Entity); LOD Open Data Cloud; Decision Engine; Probabilistic Network; Micro-Task Manager; Micro Matching Tasks; Crowdsourcing Platform; Workers Decisions.]
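The idea of combining algorithmic and human matchers probabilistically can be illustrated with a deliberately simplified model. The sketch below treats every matcher as an independent voter with a known reliability and applies Bayes' rule. This naive-Bayes stand-in is an assumption made for illustration, not ZenCrowd's actual probabilistic network, and the reliabilities are invented.

```python
def match_posterior(prior, votes):
    """Probability that an entity link is correct, given matcher votes.

    prior: prior probability that the candidate link is correct
    votes: list of (says_match, reliability) pairs, where reliability is the
           probability (strictly between 0 and 1) that the matcher is right
    """
    p_match = prior
    p_no_match = 1.0 - prior
    for says_match, reliability in votes:
        if says_match:
            p_match *= reliability
            p_no_match *= 1.0 - reliability
        else:
            p_match *= 1.0 - reliability
            p_no_match *= reliability
    return p_match / (p_match + p_no_match)


# One algorithmic matcher and two crowd workers (hypothetical reliabilities).
votes = [
    (True, 0.6),   # algorithmic matcher says "match"
    (True, 0.8),   # reliable worker agrees
    (False, 0.7),  # another worker disagrees
]
posterior = match_posterior(0.5, votes)
print(round(posterior, 2))  # 0.72
```

A decision engine built on such a score could accept links above a chosen threshold and send the uncertain ones back to the crowd as further micro-tasks.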