Transforming Big Data into Smart Data:
Deriving Value via harnessing Volume, Variety and Velocity
using semantics and Semantic Web
Keynote at 30th IEEE International Conference on Data Engineering (ICDE) 2014
Amit Sheth
LexisNexis Ohio Eminent Scholar & Exec. Director,
The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis)
Wright State, USA
2
Amit Sheth's PhD students
Ashutosh Jadhav
Hemant Purohit
Vinh Nguyen
Lu Chen
Pramod Anantharam
Sujan Perera
Alan Smith
Maryam Panahiazar
Sarasi Lalithsena
Cory Henson
Kalpa Gunaratna
Delroy Cameron
Sanjaya Wijeratne
Wenbo Wang
Kno.e.sis in 2012 = ~100 researchers (15 faculty, ~50 PhD students)
Special Thanks
Pavan Kapanipathi
Shreyansh Bhatt
Acknowledgements: Kno.e.sis team, Funds - NSF, NIH, AFRL, Industry…
2011
How much data?
48
(2013)
500
(2013)
http://www.knowledgeinfusion.com/blog/2011/11/get-your-head-out-of-the-clouds-and-into-big-data/
Only 0.5% to 1% of
the data is used for
analysis.
5
http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode
http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume
Variety – not just structure but modality: multimodal, multisensory
Semi structured
6
Velocity
Fast Data
Rapid Changes
Real-Time/Stream Analysis
Current application examples: financial services, stock brokerage, weather tracking, movies/entertainment and online retail
• What if your data volume gets so large and
varied you don't know how to deal with it?
• Do you store all your data?
• Do you analyze it all?
• What is coverage, skew, quality?
How can you find out which data points are
really important?
• How can you use it to your best advantage?
9
Questions typically asked on Big Data
http://www.sas.com/big-data/
http://techcrunch.com/2012/10/27/big-data-right-now-five-trendy-open-source-technologies/
Variety of Data Analytics Enablers
10
• Prediction of the spread of flu in real time during H1N1 2009
– Google tested some 450 million different mathematical
models against candidate search terms and identified 45 important
parameters
– The model was put to the test when the H1N1 crisis struck in 2009 and gave more
meaningful and valuable real-time information than any official public health
system [Big Data, Viktor Mayer-Schönberger and Kenneth Cukier, 2013]
• FareCast: predicts the direction of air fares over different
routes [Big Data, Viktor Mayer-Schönberger and Kenneth Cukier, 2013]
• New York City manhole problem [ICML Discussion, 2012]
11
Illustrative Big Data Applications
Current focus is mainly on serving business intelligence and
targeted analytics needs, not on serving complex
individual and collective human needs (e.g., empowering
humans in health, fitness and well-being; better disaster
coordination; personalized smart energy)
12
What is missing?
• Highly personalized/individualized/contextualized
• Incorporate real-world complexity:
- multi-modal and multi-sensory nature of
physical-world and human perception
• Can More Data beat better algorithms?
• Can Big Data replace human judgment?
13
Many opportunities, many challenges, lessons to apply
• Not just data to information, not just analysis, but actionable
information that delivers insight and supports better decision
making right in the context of human activities
15
What is needed?
Data Information
Actionable: An apple a day
keeps the doctor away
16
What is needed? Taking inspiration from cognitive models
• Bottom up and top down cognitive
processes:
– Bottom up: find patterns, mine (ML, …)
– Top down: Infusion of models and background
knowledge (data + knowledge + reasoning)
Left(plans)/Right(perceives) Brain
Top(plans)/Bottom(perceives) Brain
http://online.wsj.com/news/articles/SB10001424052702304410204579139423079198270
• Ambient processing as much as possible while enabling
natural human involvement to guide the system
17
What is needed?
Smart Refrigerator: Low on Apples
Adapting the Plan:
shopping for apples
Makes Sense to a human
Is actionable –
timely and better decisions/outcomes
18
20
My 2004-2005 formulation of SMART DATA - Semagix
Formulation of Smart Data
strategy providing services
for Search, Explore, Notify.
“Use of Ontologies and
Data repositories to gain
relevant insights”
Smart Data (2013 retake)
Smart data makes sense out of Big data
It provides value by harnessing the
challenges posed by the volume, velocity,
variety and veracity of big data, in turn
providing actionable information and
improving decision making.
21
OF human, BY human FOR human
Smart data is focused on the actionable
value achieved by human involvement in
data creation, processing and consumption
phases for improving
the human experience.
Another perspective on Smart Data
22
OF human, BY human FOR human
Another perspective on Smart Data
23
Petabytes of Physical(sensory)-Cyber-Social Data everyday!
More on PCS Computing: http://wiki.knoesis.org/index.php/PCS 24
'OF human': Relevant Real-time Data
Streams for Human Experience
OF human, BY human FOR human
25
Another perspective on Smart Data
Use of Prior Human-created Knowledge Models
26
'BY human': Involving
Crowd Intelligence in data processing workflows
Crowdsourcing and Domain-expert guided
Machine Learning Modeling
OF human, BY human FOR human
Another perspective on Smart Data
27
Detection of events, such as wheezing
sound, indoor temperature, humidity,
dust, and CO level
Weather Application
Asthma Healthcare
Application
Close the windows at home
during the day to keep out gusts of CO
and avoid asthma attacks
at night
28
'FOR human':
Improving Human Experience
Population Level
Personal
Public Health
Action in the Physical World
Luminosity
CO level
CO in gush
during day time
Electricity usage over a day, device at
work, power consumption, cost/kWh,
heat index, relative humidity, and public
events from social stream
Weather Application
Power Monitoring
Application
29
'FOR human':
Improving Human Experience
Population Level Observations
Personal Level Observations
Action in the Physical World
Washing and drying has
resulted in significant cost
since it was done during peak
load period. Consider
changing this time to night.
30
Everyone and everything has Big Data;
it is Smart Data that matters!
• Healthcare:
ADHF, Asthma, GI
– Using kHealth system
• Social Media Analysis:
Crisis coordination
– Using Twitris platform
• Smart Cities:
Traffic management
31
I will use applications in 3 domains to demonstrate
• Healthcare:
ADHF, Asthma, GI
– Using kHealth system
• Social Media Analysis:
Crisis coordination
– Using Twitris platform
• Smart Cities:
Traffic management
43
Smart Data Applications
44
A Historical Perspective on Collecting Health Observations
Diseases treated only
by external observations
First peek beyond just
external observations
Information overload!
Doctors relied only on
external observations
Stethoscope was the
first instrument to go
beyond just external
observations
Though the stethoscope
has survived, it is only one
among many observations
in modern medicine
http://en.wikipedia.org/wiki/Timeline_of_medicine_and_medical_technology
2600 BC ~1815 Today
Imhotep
Laennec’s stethoscope
Image Credit: British Museum
The Patient of the Future
MIT Technology Review, 2012
http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/ 45
Through physical monitoring and
analysis, our cellphones could act as
an early warning system to detect
serious health conditions, and
provide actionable information
canary in a coal mine
Empowering Individuals (who are not Larry Smarr!) for their own health
kHealth: knowledge-enabled healthcare
46
Weight Scale
Heart Rate Monitor
Blood Pressure
Monitor
47
Sensors
Android Device
(w/ kHealth App)
Readmissions cost $17B/year: $50K/readmission;
Total kHealth kit cost: < $500
kHealth Kit for the application for reducing ADHF readmission
ADHF – Acute Decompensated Heart Failure
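The cost figures on this slide imply a simple back-of-the-envelope comparison. The sketch below only restates the slide's numbers ($17B/year, $50K/readmission, kit cost under $500); the function names are hypothetical, and the derived values are arithmetic consequences of those figures rather than numbers reported in the talk.

```python
# Figures quoted on the slide.
ANNUAL_READMISSION_COST_USD = 17_000_000_000  # $17B per year
COST_PER_READMISSION_USD = 50_000             # $50K per readmission
KIT_COST_USD = 500                            # upper bound ("< $500")


def implied_readmissions_per_year() -> int:
    """Number of readmissions implied by the two cost figures."""
    return ANNUAL_READMISSION_COST_USD // COST_PER_READMISSION_USD


def kits_per_readmission_cost() -> int:
    """How many kits cost as much as a single readmission."""
    return COST_PER_READMISSION_USD // KIT_COST_USD


print(implied_readmissions_per_year())  # 340000
print(kits_per_readmission_cost())      # 100
```

Under these figures, a kit would pay for itself if it prevented one readmission per hundred patients equipped.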
48
Asthma: Severity of the problem
25 million: people in the U.S. are diagnosed with asthma (7 million are children) [1]
300 million: people suffering from asthma worldwide [2]
$50 billion: spent on asthma alone in a year [2]
155,000: hospital admissions in 2006 [3]
593,000: emergency department visits in 2006 [3]
[1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/
[2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html
[3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.
Sensordrone
(Carbon monoxide,
temperature, humidity)
Node Sensor
(exhaled Nitric
Oxide)
49
Sensors
Android Device
(w/ kHealth App)
Total cost: ~ $500
kHealth Kit for the application for Asthma management
*Along with two sensors in the kit, the application uses a variety of population level signals from the web:
Pollen level Air Quality
Temperature & Humidity
51
Data Overload for Patients/health aficionados
Providing actionable information in a timely manner is
crucial to avoid information overload or fatigue
Personal level
Signals
Public level
Signals
Population level
Signals
52
Data Overload Spanning Physical-Cyber-Social Modalities
Increasingly, real-world events are:
(a) Continuous: Observations are fine grained over time
(b) Multimodal, multisensory: Observations span PCS modalities
What can we do to avoid an asthma episode?
54
Real-time health signals from personal level (e.g., Wheezometer, NO in breath,
accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and
population level (e.g., pollen level, CO2) arriving continuously in fine grained
samples potentially with missing information and uneven sampling frequencies.
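Signals that arrive with uneven sampling and gaps are usually aligned onto a common time grid before any joint analysis. The following is a minimal sketch of that step, assuming fixed-width time bins and simple averaging; the function name `resample` and the choice of averaging are illustrative assumptions, not the kHealth implementation.

```python
from statistics import mean
from typing import List, Optional, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, reading)


def resample(samples: List[Sample], start: float, end: float,
             width: float) -> List[Optional[float]]:
    """Average readings into fixed-width bins; empty bins become None."""
    n_bins = int((end - start) // width)
    bins: List[List[float]] = [[] for _ in range(n_bins)]
    for timestamp, value in samples:
        # Ignore readings that fall outside the covered interval.
        if start <= timestamp < start + n_bins * width:
            bins[int((timestamp - start) // width)].append(value)
    return [mean(b) if b else None for b in bins]


# Three unevenly spaced readings over three one-minute bins.
readings = [(0.0, 1.0), (10.0, 3.0), (130.0, 5.0)]
print(resample(readings, start=0.0, end=180.0, width=60.0))  # [2.0, None, 5.0]
```

Keeping missing bins as `None` instead of interpolating leaves the decision about how to treat gaps to the downstream risk model.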
Variety, Volume, Veracity, Velocity
Value
What risk factors influence asthma control?
What is the contribution of each risk factor?
semantics
Understanding relationships between
health signals and asthma attacks
for providing actionable information
WHY Big Data to Smart Data: Asthma example
kHealth: Health Signal Processing Architecture
Personal level
Signals
Public level
Signals
Population level
Signals
Domain
Knowledge
Risk Model
Events from
Social Streams
Take Medication before
going to work
Avoid going out in the
evening due to high pollen
levels
Contact doctor
Analysis
Personalized
Actionable
Information
Data Acquisition &
aggregation
55
57
Asthma Domain Knowledge
Asthma Control → Recommended Action, by Severity Level of Asthma

Severity Level of Asthma | Daily Medication: choices for starting therapy | Not Well Controlled | Poorly Controlled
Intermittent Asthma | SABA prn | - | -
Mild Persistent Asthma | Low dose ICS | Medium ICS | Medium ICS
Moderate Persistent Asthma | Medium dose ICS alone or with LABA/montelukast | Medium ICS + LABA/montelukast or High dose ICS | Medium ICS + LABA/montelukast or High dose ICS*
Severe Persistent Asthma | High dose ICS with LABA/montelukast | Needs specialist care | Needs specialist care

ICS = inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA = inhaled short-acting beta2-agonist
*consider referral to specialist
Asthma Control and Actionable Information
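Domain knowledge of this kind can be held as a plain lookup keyed by severity and control level. The sketch below transcribes the slide's table into a dictionary; the key names (`"starting"`, `"not_well_controlled"`, `"poorly_controlled"`) and the function `recommended_action` are hypothetical, and the structure is an illustration of encoding the table, not the kHealth knowledge base.

```python
from typing import Dict, Optional, Tuple

# (severity, control level) -> recommended action, transcribed from the slide.
# None stands for the "-" cells of the table.
ACTIONS: Dict[Tuple[str, str], Optional[str]] = {
    ("intermittent", "starting"): "SABA prn",
    ("intermittent", "not_well_controlled"): None,
    ("intermittent", "poorly_controlled"): None,
    ("mild_persistent", "starting"): "Low dose ICS",
    ("mild_persistent", "not_well_controlled"): "Medium ICS",
    ("mild_persistent", "poorly_controlled"): "Medium ICS",
    ("moderate_persistent", "starting"):
        "Medium dose ICS alone or with LABA/montelukast",
    ("moderate_persistent", "not_well_controlled"):
        "Medium ICS + LABA/montelukast or High dose ICS",
    ("moderate_persistent", "poorly_controlled"):
        "Medium ICS + LABA/montelukast or High dose ICS "
        "(consider referral to specialist)",
    ("severe_persistent", "starting"): "High dose ICS with LABA/montelukast",
    ("severe_persistent", "not_well_controlled"): "Needs specialist care",
    ("severe_persistent", "poorly_controlled"): "Needs specialist care",
}


def recommended_action(severity: str, control: str) -> Optional[str]:
    """Look up the table cell; raises KeyError for unknown combinations."""
    return ACTIONS[(severity, control)]


print(recommended_action("mild_persistent", "starting"))  # Low dose ICS
```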
58
Patient Health Score (diagnostic)
Risk assessment
model
Semantic
Perception
Personal level
Signals
Public level
Signals
Domain
Knowledge
Population level
Signals
GREEN: Well Controlled
YELLOW: Not Well Controlled
RED: Poorly Controlled
How controlled is my asthma?
59
Patient Vulnerability Score (prognostic)
Risk assessment
model
Semantic
Perception
Personal level
Signals
Public level
Signals
Domain
Knowledge
Population level
Signals
Patient health
Score
How vulnerable* is my control level today?
*considering changing environmental conditions and current control level
60
3.4 billion people will have smartphones or tablets by 2017
-- Research2Guidance
“Intelligence at the Edges” for Digital Health
http://www.digikey.com/us/en/techzone/energy-harvesting/resources/articles/zigbees-smart-energy-20-profile.html
m-health app market is predicted to reach $26 billion in 2017
-- Research2Guidance
63
Sensordrone – for monitoring
environmental air quality
Wheezometer – for monitoring
wheezing sounds
Can I reduce my asthma attacks at night?
What are the triggers? What is the wheezing level?
What is the propensity toward asthma?
What is the exposure level over a day?
Commute to Work
Asthma: Actionable Information for Asthma Patients
Luminosity
CO level
CO in gush
during day time
Actionable
Information
Personal level
Signals
Public level
Signals
Population level
Signals
What is the air quality indoors?
64
Population Level
Personal
Wheeze – Yes
Do you have tightness of chest? –Yes
Observations; Physical-Cyber-Social System; Health Signal Extraction; Health Signal Understanding
<Wheezing=Yes, time, location>
<ChestTightness=Yes, time, location>
<PollenLevel=Medium, time, location>
<Pollution=Yes, time, location>
<Activity=High, time, location>
Wheezing, ChestTightness, PollenLevel, Pollution, Activity
Wheezing, ChestTightness, PollenLevel, Pollution, Activity, RiskCategory
<PollenLevel, ChestTightness, Pollution,
Activity, Wheezing, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
<2, 1, 1, 3, 1, RiskCategory>
...
Expert
Knowledge
Background
Knowledge
tweet reporting pollution level
and asthma attacks
Acceleration readings from
on-phone sensors
Sensor and personal
observations
Signals from personal, personal
spaces, and community spaces
Risk Category assigned by
doctors
Qualify
Quantify
Enrich
Outdoor pollen and pollution
Public Health
Health Signal Extraction to Understanding
Well Controlled: continue
Not Well Controlled: contact nurse
Poorly Controlled: contact doctor
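The "Qualify, Quantify" step on this slide turns qualitative observations into numeric feature vectors. A minimal sketch follows, assuming an ordinal coding (No=0, Yes=1, Low=1, Medium=2, High=3) that happens to reproduce the slide's example vector; the coding scheme and the names `ORDINAL`, `FEATURE_ORDER` and `quantify` are assumptions for illustration.

```python
from typing import Dict, List

# Assumed ordinal coding of qualitative values.
ORDINAL: Dict[str, int] = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}

# Feature order as written in the slide's tuple template.
FEATURE_ORDER: List[str] = [
    "PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing",
]


def quantify(observation: Dict[str, str]) -> List[int]:
    """Map one set of qualitative observations to a numeric vector."""
    return [ORDINAL[observation[name]] for name in FEATURE_ORDER]


observation = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}
print(quantify(observation))  # [2, 1, 1, 3, 1]
```

Vectors of this form, paired with a doctor-assigned RiskCategory, are the kind of rows a supervised model could be trained on.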
70
RDF OWL
How are machines supposed to integrate and interpret sensor data?
Semantic Sensor Networks (SSN)
71
W3C Semantic Sensor Network Ontology
Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O., Garcia-Castro, R., Graybeal, J., Herzog, A., Janowicz, K.,
Neuhaus, H., Nikolov, A., and Page, K.: Semantic Sensor Network XG Final Report, W3C Incubator Group Report (2011).
73
W3C Semantic Sensor Network Ontology
Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O., Garcia-Castro, R., Graybeal, J., Herzog, A., Janowicz, K.,
Neuhaus, H., Nikolov, A., and Page, K.: Semantic Sensor Network XG Final Report, W3C Incubator Group Report (2011).
SSN
Ontology
Levels of Abstraction (Intellego)
3 Interpreted data (abductive) [in OWL], e.g., diagnosis: Hyperthyroidism, ...
2 Interpreted data (deductive) [in OWL], e.g., threshold: Elevated Blood Pressure
1 Annotated data [in RDF], e.g., label: Systolic blood pressure of 150 mmHg
0 Raw data [in TEXT], e.g., number: "150"
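The four levels of abstraction can be read as a small pipeline from a raw string to candidate diagnoses. The sketch below walks the slide's "150" example through each level in plain Python; the 140 mmHg threshold, the candidate list, and all function names are assumptions made for illustration, and plain dictionaries stand in for the RDF and OWL representations named on the slide.

```python
from typing import Dict, List, Optional

SYSTOLIC_ELEVATED_MMHG = 140  # assumed threshold, for illustration only

# Assumed background knowledge: abstraction -> features that could explain it.
CANDIDATE_EXPLANATIONS: Dict[str, List[str]] = {
    "Elevated Blood Pressure": [
        "Hypertension", "Hyperthyroidism", "Pulmonary Edema",
    ],
}


def annotate(raw: str) -> Dict[str, object]:
    """Level 0 -> 1: attach a label and unit to the raw number."""
    return {"property": "systolic blood pressure",
            "value": int(raw), "unit": "mmHg"}


def deduce(annotated: Dict[str, object]) -> Optional[str]:
    """Level 1 -> 2: deductive step, a threshold check."""
    if int(annotated["value"]) >= SYSTOLIC_ELEVATED_MMHG:
        return "Elevated Blood Pressure"
    return None


def abduce(abstraction: Optional[str]) -> List[str]:
    """Level 2 -> 3: abductive step, list the candidate explanations."""
    if abstraction is None:
        return []
    return CANDIDATE_EXPLANATIONS.get(abstraction, [])


level1 = annotate("150")
level2 = deduce(level1)
level3 = abduce(level2)
print(level2)  # Elevated Blood Pressure
print(level3)  # ['Hypertension', 'Hyperthyroidism', 'Pulmonary Edema']
```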
75
76
Making sense of sensor data with
People are good at making sense of sensory input
What can we learn from cognitive models of perception?
• The key ingredient is prior knowledge
77
* based on Neisser’s cognitive model of perception
Observe
Property
Perceive
Feature
Explanation
Discrimination
1
2
Perception Cycle*
Translating low-level signals
into high-level knowledge
Focusing attention on those
aspects of the environment that
provide useful information
Prior Knowledge
78
To enable machine perception,
Semantic Web technology is used to integrate
sensor data with prior knowledge on the Web
79
Prior knowledge on the Web
W3C Semantic Sensor
Network (SSN) Ontology Bi-partite Graph
80
Prior knowledge on the Web
W3C Semantic Sensor
Network (SSN) Ontology Bi-partite Graph
81
Observe
Property
Perceive
Feature
Explanation
1
Translating low-level signals
into high-level knowledge
Explanation
Explanation is the act of choosing the objects or events that best account for a
set of observations; often referred to as hypothesis building
82
Discrimination is the act of finding those properties that, if observed, would help distinguish
between multiple explanatory features
Observe
Property
Perceive
Feature
Explanation
Discrimination
2
Focusing attention on those
aspects of the environment that
provide useful information
Discrimination
85
Discrimination
Discriminating Property: is neither expected nor not-applicable
DiscriminatingProperty ≡ ¬ExpectedProperty ⊓ ¬NotApplicableProperty
elevated blood pressure
clammy skin
palpitations
Hypertension
Hyperthyroidism
Pulmonary Edema
Discriminating Property Explanatory Feature
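Explanation and discrimination can be sketched with ordinary sets over a bipartite graph of features and properties. The code below assumes that an expected property is one shared by every current explanatory feature and a not-applicable property is one shared by none, which makes a discriminating property one held by some but not all of them, matching the formula above. The edges in `KB` are invented for illustration and are not medical facts; all names are hypothetical.

```python
from typing import Dict, Set

# Assumed bipartite prior knowledge: feature -> observable properties.
KB: Dict[str, Set[str]] = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin",
                        "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def explain(observed: Set[str]) -> Set[str]:
    """Features that account for every observed property."""
    return {f for f, props in KB.items() if observed <= props}


def discriminating(observed: Set[str]) -> Set[str]:
    """Unobserved properties that are neither expected nor not-applicable."""
    features = explain(observed)
    if not features:
        return set()
    property_sets = [KB[f] for f in features]
    expected = set.intersection(*property_sets)   # held by all features
    applicable = set.union(*property_sets)        # held by at least one
    return applicable - expected - observed


seen = {"elevated blood pressure"}
print(sorted(explain(seen)))
print(sorted(discriminating(seen)))  # ['clammy skin', 'palpitations']
```

Observing one of the discriminating properties next narrows the explanation set, which is the perception cycle's way of focusing attention.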
89
Semantic scalability: Resource savings of abstracting sensor data
90
Orders of magnitude resource savings for generating and storing relevant
abstractions vs. raw observations.
Relevant abstractions
Raw observations
How do we implement machine perception efficiently on a
resource-constrained device?
Use of OWL reasoner is resource intensive
(especially on resource-constrained devices),
in terms of both memory and time
• Runs out of resources with prior knowledge >> 15 nodes
• Asymptotic complexity: O(n^3)
92
intelligence at the edge
Approach 1: Send all sensor observations
to the cloud for processing
Approach 2: downscale semantic
processing so that each device is capable
of machine perception
93
Henson et al. 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices',
ISWC 2012.
Efficient execution of machine perception
Use bit vector encodings and their operations to encode prior knowledge and
execute semantic reasoning
010110001101
0011110010101
1000110110110
101100011010
0111100101011
000110101100
0110100111
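To make the bit vector idea concrete, the sketch below encodes each property as an integer whose bits mark the features that have it, so that explanation becomes a chain of bitwise ANDs. This is an illustration of the general technique under assumed data, not the algorithm from the cited paper; the `KB` edges and all function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Assumed prior knowledge: feature -> properties (illustrative only).
KB: Dict[str, Set[str]] = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin",
                        "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def encode(kb: Dict[str, Set[str]]) -> Tuple[List[str], Dict[str, int]]:
    """Give each feature a bit position; each property gets a bit vector."""
    features = sorted(kb)
    prop_bits: Dict[str, int] = {}
    for i, feature in enumerate(features):
        for prop in kb[feature]:
            prop_bits[prop] = prop_bits.get(prop, 0) | (1 << i)
    return features, prop_bits


def explain(observed: Set[str], prop_bits: Dict[str, int], n: int) -> int:
    """AND the vectors of all observed properties (one pass, linear)."""
    mask = (1 << n) - 1
    for prop in observed:
        mask &= prop_bits.get(prop, 0)
    return mask


def discriminating(observed: Set[str], prop_bits: Dict[str, int],
                   mask: int) -> Set[str]:
    """Properties held by some, but not all, explanatory features."""
    result = set()
    for prop, bits in prop_bits.items():
        overlap = bits & mask
        if prop not in observed and overlap and overlap != mask:
            result.add(prop)
    return result


def decode(mask: int, features: List[str]) -> Set[str]:
    """Turn a feature bit vector back into feature names."""
    return {f for i, f in enumerate(features) if (mask >> i) & 1}


features, prop_bits = encode(KB)
mask = explain({"elevated blood pressure"}, prop_bits, len(features))
print(bin(mask))  # 0b111
print(sorted(discriminating({"elevated blood pressure"}, prop_bits, mask)))
```

Each reasoning step touches every property vector at most once, which is where the linear growth comes from in this simplified setting.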
94
O(n^3) < x < O(n^4) reduced to O(n)
Efficiency Improvement
• Problem size increased from 10’s to 1000’s of nodes
• Time reduced from minutes to milliseconds
• Complexity growth reduced from polynomial to linear
Evaluation on a mobile device
95
Semantic Perception for smarter analytics: 3 ideas to take away
1 Translate low-level data to high-level knowledge
Machine perception can be used to convert low-level sensory
signals into high-level knowledge useful for decision making
2 Prior knowledge is the key to perception
Using SW technologies, machine perception can be formalized and
integrated with prior knowledge on the Web
3 Intelligence at the edge
By downscaling semantic inference, machine perception can
execute efficiently on resource-constrained devices
96
• Healthcare:
ADHF, Asthma, GI
– Using kHealth system
• Social Media Analysis:
Crisis coordination
– Using Twitris platform
• Smart Cities:
Traffic management
98
Smart Data Applications
99
Smart Data for Social Good
Mining human behavior to help
societal and humanitarian
development
• crisis response coordination,
harassment, gender-based
violence, …
20 million tweets with “sandy, hurricane”
keywords between Oct 27th and Nov 1st
2nd most popular topic on Facebook during 2012
Social (Big) Data during Crisis- Example of Hurricane Sandy
100
• http://www.guardian.co.uk/news/datablog/2012/oct/31/twitter-sandy-flooding
• http://www.huffingtonpost.com/2012/11/02/twitter-hurricane-sandy_n_2066281.html
• http://mashable.com/2012/10/31/hurricane-sandy-facebook/
http://usatoday30.usatoday.com/news/politics/twitter-election-meter
http://twitris.knoesis.org/
Twitris' Dimensions of Integrated Semantic Analysis
Sheth et al. Twitris: a System for Collective Social Intelligence, ESNAM-2014
What is Smart Data in the context of
Disaster Management
ACTIONABLE: Timely delivery of
right resources and information to
the right people at right location!
113
Because everyone wants to help, but DOESN'T KNOW HOW!
Really sparse Signal to Noise:
• 2M tweets during the first 48 hrs. of #Oklahoma-tornado-2013
- 1.3% as the precise resource donation requests to help
- 0.02% as the precise resource donation offers to help
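The percentages above translate into small absolute counts. The sketch below works them out with integer arithmetic from the slide's figures (2M tweets, 1.3% requests, 0.02% offers); the variable names are hypothetical, and the counts are derived values, not numbers reported in the talk.

```python
TOTAL_TWEETS = 2_000_000  # tweets in the first 48 hours, per the slide

# Percentages expressed in hundredths of a percent to stay in integers:
# 1.3% = 130 / 10_000 and 0.02% = 2 / 10_000.
REQUEST_RATE_PER_10K = 130
OFFER_RATE_PER_10K = 2

requests = TOTAL_TWEETS * REQUEST_RATE_PER_10K // 10_000
offers = TOTAL_TWEETS * OFFER_RATE_PER_10K // 10_000

print(requests)            # 26000
print(offers)              # 400
print(requests // offers)  # 65
```

Roughly 400 offers hidden among two million tweets is the sparse signal that automatic filtering has to find.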
114
• Anyone know how to get involved to
help the tornado victims in
Oklahoma??#tornado #oklahomacity
(OFFER)
• I want to donate to the Oklahoma cause
shoes clothes even food if I can (OFFER)
Disaster Response Coordination:
Finding Actionable Nuggets for Responders to act
• Text REDCROSS to 909-99 to donate to
those impacted by the Moore tornado!
http://t.co/oQMljkicPs (REQUEST)
• Please donate to Oklahoma disaster
relief efforts.: http://t.co/crRvLAaHtk
(REQUEST)
For responders, most important information is the scarcity and
availability of resources
Blog by our colleague Patrick Meier on this analysis: http://irevolution.net/2013/05/29/analyzing-tweets-tornado/
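As a toy illustration of separating requests from offers, the sketch below applies a handful of hand-written patterns to the example tweets from this slide. It is a deliberately simple rule-based stand-in, assumed for illustration only; the patterns and the function `label` are hypothetical and are not the classifier used in the analysis described here.

```python
import re
from typing import List

# Hypothetical patterns chosen to fit the slide's four examples.
REQUEST_PATTERNS: List[str] = [
    r"\bplease donate\b",
    r"\btext \w+ to \d",
    r"\bhelp is (desperately )?needed\b",
]
OFFER_PATTERNS: List[str] = [
    r"\bhow to get involved\b",
    r"\bi want to (donate|help|volunteer)\b",
    r"\bwhere (to|do i go to) (donate|help)\b",
]


def label(tweet: str) -> str:
    """Return REQUEST, OFFER, or OTHER for a single tweet."""
    text = tweet.lower()
    if any(re.search(p, text) for p in REQUEST_PATTERNS):
        return "REQUEST"
    if any(re.search(p, text) for p in OFFER_PATTERNS):
        return "OFFER"
    return "OTHER"


examples = [
    "Anyone know how to get involved to help the tornado victims in Oklahoma??",
    "I want to donate to the Oklahoma cause shoes clothes even food if I can",
    "Text REDCROSS to 909-99 to donate to those impacted by the Moore tornado!",
    "Please donate to Oklahoma disaster relief efforts.",
]
for tweet in examples:
    print(label(tweet), "|", tweet)
```

Hand-written rules like these are brittle on real streams; they are shown only to make the two labels concrete.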
Join us for the Social
Good!
http://twitris.knoesis.org
RT @OpOKRelief:
Southgate Baptist Church
on 4th Street in Moore
has food, water, clothes,
diapers, toys, and more.
If you can't go,call 794
Text "FOOD" to
32333, REDCROSS to
90999, or STORM to
80888 to donate $10
in storm relief.
#moore #oklahoma
#disasterrelief
#donate
Want to help animals in
#Oklahoma? @ASPCA tells
how you can help:
http://t.co/mt8l9PwzmO
CITIZEN SENSORS
RESPONSE TEAMS
(including humanitarian
org. and ‘pseudo’ responders)
VICTIM SITE
Coordination of
needs and offers
Using Social Media
Does anyone
know where to
send a check to
donate to the
tornado
victims?
Where do I go
to help out for
volunteer work
around Moore?
Anyone know?
Anyone know
where to donate
to help the
animals from the
Oklahoma
disaster? #oklahoma #dogs
Matched
Matched
Matched
Serving the need!
If you would like to volunteer
today, help is desperately
needed in Shawnee. Call
273-5331 for more info
http://www.slideshare.net/knoesis/iccm-2013ignitetalkhemantpurohitunnairobi
115
Purohit et al. Emergency-relief coordination on social media: Automatically matching resource requests and offers, 2014. With Int’l collaborator
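The matching shown on this slide pairs people asking how to help with messages that tell them where help is needed. A minimal sketch of that pairing follows, assuming a tiny hand-made keyword list per resource type; `RESOURCE_KEYWORDS`, `resources` and `match` are hypothetical names, and keyword overlap is a stand-in for the automatic matching described in the cited work.

```python
import re
from typing import Dict, List, Set, Tuple

# Assumed keyword lists per resource type (illustrative only).
RESOURCE_KEYWORDS: Dict[str, Set[str]] = {
    "money": {"check", "$10", "money"},
    "volunteer": {"volunteer"},
    "animals": {"animals", "dogs", "aspca"},
}


def resources(text: str) -> Set[str]:
    """Resource types mentioned in a message."""
    tokens = set(re.findall(r"[a-z0-9$]+", text.lower()))
    return {name for name, words in RESOURCE_KEYWORDS.items() if tokens & words}


def match(seekers: List[str], providers: List[str]) -> List[Tuple[str, str]]:
    """Pair each seeker with the provider sharing the most resource types."""
    pairs = []
    for seeker in seekers:
        wanted = resources(seeker)
        best = max(providers, key=lambda p: len(wanted & resources(p)),
                   default=None)
        if best is not None and wanted & resources(best):
            pairs.append((seeker, best))
    return pairs


seekers = [
    "Does anyone know where to send a check to donate to the tornado victims?",
    "Where do I go to help out for volunteer work around Moore? Anyone know?",
    "Anyone know where to donate to help the animals from the Oklahoma "
    "disaster? #oklahoma #dogs",
]
providers = [
    'Text "FOOD" to 32333, REDCROSS to 90999, or STORM to 80888 to donate '
    "$10 in storm relief.",
    "If you would like to volunteer today, help is desperately needed in "
    "Shawnee. Call 273-5331 for more info",
    "Want to help animals in #Oklahoma? @ASPCA tells how you can help",
]
for seeker, provider in match(seekers, providers):
    print(seeker[:40], "->", provider[:40])
```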
Continuous Semantics for Evolving Events to Extract Smart Data
126
Dynamic Model Creation
Continuous Semantics 127
• Healthcare:
ADHF, Asthma, GI
– Using kHealth system
• Social Media Analysis:
Crisis coordination
– Using Twitris platform
• Smart Cities:
Traffic management
130
Smart Data Applications
131
Traffic Management
To improve everyday life,
which is entangled in our most
common problem:
being 'stuck in traffic'
1 IBM Smarter Traffic
Severity of the Traffic Problem
Vehicular traffic data from San Francisco Bay Area aggregated from on-road
sensors (numerical) and incident reports (textual)
133
http://511.org/
Every minute update of speed, volume, travel time, and occupancy resulting in
178 million link status observations, 738 active events, and 146 scheduled
events with many unevenly sampled observations collected over 3 months.
Variety, Volume, Veracity, Velocity
Value
Can we detect the onset of traffic congestion?
Can we characterize traffic congestion based on events?
Can we estimate traffic delays in a road network?
semantics
Representing prior knowledge of
traffic leads to a focused exploration
of this massive dataset
Big Data to Smart Data: Traffic Management example
134
Duration: 36 months
Requested funding: 2.531.202 €
CityPulse Consortium
City of Aarhus
City of Brasov
Textual Streams for City Related Events
135
City Event Extraction Solution Architecture
[Architecture diagram; recoverable labels follow]
Labels: City Infrastructure; Tweets from a city
Steps: POS Tagging; Hybrid NER + Event term extraction; Geohashing; Temporal Estimation; Impact Assessment; Event Aggregation
Knowledge sources: OSM Locations; SCRIBE ontology; 511.org hierarchy
Stages: City Event Annotation; City Event Extraction
OSM: OpenStreetMap
NER: Named Entity Recognition
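Geohashing, one of the steps named in the architecture, maps a latitude/longitude pair to a short string so that nearby events share a prefix and can be grouped by cell. The sketch below implements the standard public geohash encoding; it makes no claim about how the pipeline described here configures or uses it, and the sample coordinates are an arbitrary test point.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash(lat: float, lon: float, precision: int = 6) -> str:
    """Encode a coordinate by alternately halving longitude and latitude."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # geohash starts with a longitude bit
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = bits * 2 + 1
                lon_lo = mid
            else:
                bits = bits * 2
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = bits * 2 + 1
                lat_lo = mid
            else:
                bits = bits * 2
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


print(geohash(57.64911, 10.40744, precision=5))  # u4pru
```

Truncating the hash to fewer characters gives a coarser cell, which is a simple way to aggregate events at different spatial granularities.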
City Event Annotation – CRF Annotation Examples
Last O night O in O CA... O (@ O Half B-LOCATION Moon I-LOCATION Bay B-LOCATION
Brewing I-LOCATION Company O w/ O 8 O others) O http://t.co/w0eGEJjApY O
B-LOCATION
I-LOCATION
B-EVENT
I-EVENT
O
Tags used in our approach:
These are the annotations provided
by a Conditional Random Field model
trained on tweet corpus to spot
city related events and location
BIO (Beginning, Inside, Outside) is a notation used in multi-phrase entity spotting
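Once tokens carry B-/I-/O tags, the tagged sequence has to be collapsed into entity spans. The sketch below does that for the annotated tweet shown on this slide, using the tags exactly as they appear there; the function names are hypothetical, and the CRF model that produces the tags is not reproduced.

```python
from typing import List, Tuple

# The slide's example, as alternating token/tag pairs.
ANNOTATED = (
    "Last O night O in O CA... O (@ O Half B-LOCATION Moon I-LOCATION "
    "Bay B-LOCATION Brewing I-LOCATION Company O w/ O 8 O others) O "
    "http://t.co/w0eGEJjApY O"
)


def parse_pairs(annotated: str) -> List[Tuple[str, str]]:
    """Split 'token tag token tag ...' into (token, tag) pairs."""
    parts = annotated.split()
    return list(zip(parts[0::2], parts[1::2]))


def extract_spans(pairs: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Collapse B-/I- runs into (label, text) spans."""
    spans: List[Tuple[str, str]] = []
    label, tokens = None, []
    for token, tag in pairs:
        starts_new = tag.startswith("B-") or (
            tag.startswith("I-") and tag[2:] != label)
        if starts_new or tag == "O":
            if label is not None:  # close the span in progress
                spans.append((label, " ".join(tokens)))
            label, tokens = (tag[2:], [token]) if starts_new else (None, [])
        else:  # I- tag continuing the current span
            tokens.append(token)
    if label is not None:
        spans.append((label, " ".join(tokens)))
    return spans


print(extract_spans(parse_pairs(ANNOTATED)))
# [('LOCATION', 'Half Moon'), ('LOCATION', 'Bay Brewing')]
```

With the tags as given, the place name is split into two spans because "Bay" carries a B- tag, which shows how tagging errors surface directly in the extracted locations.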
City Events from Sensor and Social Streams can be…
• Complementary
• Additional information
• e.g., slow traffic from sensor data and accident from textual data
• Corroborative
• Additional confidence
• e.g., accident event supporting an accident report from ground truth
• Timely
• Additional insight
• e.g., knowing poor visibility before formal report from ground truth
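The three relationships above can be expressed as a small decision rule over a social-stream event and an official report. The sketch below is one possible reading, assuming events are comparable by type, location cell and time, and that a 60-minute window defines "related"; the `Event` fields, the window, and the function `relate` are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str      # e.g., "accident", "slow traffic", "fog"
    cell: str      # location cell identifier (for example a geohash)
    minute: int    # time of the observation, in minutes


def relate(social: Event, official: Event, window: int = 60) -> str:
    """Classify how a social event relates to an official report."""
    same_place = social.cell == official.cell
    close_in_time = abs(social.minute - official.minute) <= window
    if not (same_place and close_in_time):
        return "unrelated"
    if social.kind != official.kind:
        return "complementary"   # adds information of a different kind
    if social.minute < official.minute:
        return "timely"          # same event, seen before the formal report
    return "corroborative"       # same event, adds confidence


tweet = Event("accident", "9q8yy", 100)
print(relate(tweet, Event("slow traffic", "9q8yy", 110)))  # complementary
print(relate(tweet, Event("accident", "9q8yy", 95)))       # corroborative
print(relate(tweet, Event("accident", "9q8yy", 130)))      # timely
```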
143
Events from Social Streams and City Department*
Corroborative Events; Complementary Events
Event Sources
City events extracted from tweets
511.org, Active events e.g., accidents, breakdowns
511.org, Scheduled events e.g., football game, parade
City event from twitter providing complementary and
corroborative evidence for fog reported by 511.org
*511.org
146
147
Actionable Information in City Management
Tweets from a City; Traffic Sensor Data
OSM
Locations
SCRIBE
ontology
511.org hierarchy
Web of Data
How can issues in a city be resolved?
e.g., what should I do in foggy conditions?
• Big Data is everywhere
– at the individual level and not just limited to
corporations
– with growing complexity: multimodal, Physical-
Cyber-Social
• Analysis is not sufficient
• Bottom-up techniques are not sufficient; we need
top-down processing and background
knowledge
149
Take Away
Take Away
• Focus on Humans and Improve human life and
experience with SMART Data.
– Data to Information to Contextually Relevant
Abstractions
– Actionable Information (Value from data) to assist
and support humans in decision making.
• Focus on Value -- SMART Data
– Big Data Challenges without the intention of deriving
Value is a “Journey without GOAL”.
150
153
thank you, and please visit us at
http://knoesis.org/vision
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing
Wright State University, Dayton, Ohio, USA
Smart Data
Ohio Center of Excellence in Knowledge-enabled
Computing
• Among top universities in the world in World Wide Web (cf: 5-yr
impact, Microsoft Academic Search: shared 2nd place in Mar13)
• Largest academic group in the US in Semantic Web + Social/Sensor
Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT,
Health/Clinical & Biomedicine Applications
• Exceptional student success: internships and jobs at top salary (IBM
Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research
universities, NLM, startups )
• 100 researchers including 15 World Class faculty (>3K
citations/faculty) and 45+ PhD students- practically all funded
• $2M+/yr research for largely multidisciplinary projects; world class
resources; industry sponsorships/collaborations (Google, IBM, …)
155

More Related Content

What's hot

Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...
Data Processing and Semantics for Advanced Internet of Things (IoT) Applicati...Artificial Intelligence Institute at UofSC
 
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Kno.e.sis Approach to Impactful Research & Training for Exceptional CareersKno.e.sis Approach to Impactful Research & Training for Exceptional Careers
Kno.e.sis Approach to Impactful Research & Training for Exceptional CareersAmit Sheth
 
Semantic, Cognitive, and Perceptual Computing – three intertwined strands of ...
Semantic, Cognitive, and Perceptual Computing – three intertwined strands of ...Semantic, Cognitive, and Perceptual Computing – three intertwined strands of ...
Semantic, Cognitive, and Perceptual Computing – three intertwined strands of ...Amit Sheth
 
Physical Cyber Social Computing: An early 21st century approach to Computing ...
Physical Cyber Social Computing: An early 21st century approach to Computing ...Physical Cyber Social Computing: An early 21st century approach to Computing ...
Physical Cyber Social Computing: An early 21st century approach to Computing ...Amit Sheth
 
Smart Data and real-world semantic web applications (2004)
Smart Data and real-world semantic web applications (2004)Smart Data and real-world semantic web applications (2004)
Smart Data and real-world semantic web applications (2004)Amit Sheth
 
What's up at Kno.e.sis?
What's up at Kno.e.sis? What's up at Kno.e.sis?
What's up at Kno.e.sis? Amit Sheth
 
A Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionA Semantics-based Approach to Machine Perception
A Semantics-based Approach to Machine PerceptionCory Andrew Henson
 
Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social ...
Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social ...Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social ...
Knowledge-empowered Probabilistic Graphical Models for Physical-Cyber-Social ...Artificial Intelligence Institute at UofSC
 
Semantics-empowered Smart City applications: today and tomorrow
Semantics-empowered Smart City applications: today and tomorrowSemantics-empowered Smart City applications: today and tomorrow
Semantics-empowered Smart City applications: today and tomorrowAmit Sheth
 
Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...
Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...
Ontology-enabled Healthcare Applications exploiting Physical-Cyber-Social Big...Amit Sheth
 
Guidance for Incorporating Big Data into Humanitarian Operations - 2015 - web...
Guidance for Incorporating Big Data into Humanitarian Operations - 2015 - web...Guidance for Incorporating Big Data into Humanitarian Operations - 2015 - web...
Guidance for Incorporating Big Data into Humanitarian Operations - 2015 - web...Katie Whipkey
 
ON EXPLOITING MULTIMODAL INFORMATION FOR MACHINE INTELLIGENCE AND NATURAL IN...
ON EXPLOITING MULTIMODAL INFORMATION FOR MACHINE INTELLIGENCE AND NATURAL IN...ON EXPLOITING MULTIMODAL INFORMATION FOR MACHINE INTELLIGENCE AND NATURAL IN...
ON EXPLOITING MULTIMODAL INFORMATION FOR MACHINE INTELLIGENCE AND NATURAL IN...Amit Sheth
 
The Age of Big Data: A New Class of Economic Asset
The Age of Big Data: A New Class of Economic AssetThe Age of Big Data: A New Class of Economic Asset
The Age of Big Data: A New Class of Economic AssetChulalongkorn University
 
Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation...
Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation...Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation...
Hemant Purohit PhD Defense: Mining Citizen Sensor Communities for Cooperation...Artificial Intelligence Institute at UofSC
 
Extracting City Traffic Events from Social Streams
 Extracting City Traffic Events from Social Streams Extracting City Traffic Events from Social Streams
Extracting City Traffic Events from Social StreamsPramod Anantharam
 
Challenges in Analytics for BIG Data
Challenges in Analytics for BIG DataChallenges in Analytics for BIG Data
Challenges in Analytics for BIG DataPrasant Misra
 
wireless sensor network
wireless sensor networkwireless sensor network
wireless sensor networkparry prabhu
 
Big Data, AI, and Pharma
Big Data, AI, and PharmaBig Data, AI, and Pharma
Big Data, AI, and PharmaAmit Sheth
 
Philosophy of Big Data: Big Data, the Individual, and Society
Philosophy of Big Data: Big Data, the Individual, and SocietyPhilosophy of Big Data: Big Data, the Individual, and Society
Philosophy of Big Data: Big Data, the Individual, and SocietyMelanie Swan
 

TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, Variety, and Velocity using Semantic Techniques and Technologies

  • 1. Transforming Big Data into Smart Data: Deriving Value via harnessing Volume, Variety and Velocity using semantics and Semantic Web. Keynote at 30th IEEE International Conference on Data Engineering (ICDE) 2014. Amit Sheth, LexisNexis Ohio Eminent Scholar & Exec. Director, The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State, USA
  • 3. Amit Sheth’s PhD students: Ashutosh Jadhav, Hemant Purohit, Vinh Nguyen, Lu Chen, Pramod Anantharam, Sujan Perera, Alan Smith, Maryam Panahiazar, Sarasi Lalithsena, Cory Henson, Kalpa Gunaratna, Delroy Cameron, Sanjaya Wijeratne, Wenbo Wang. Kno.e.sis in 2012 = ~100 researchers (15 faculty, ~50 PhD students). Special Thanks. Pavan Kapanipathi, Shreyansh Bhatt. Acknowledgements: Kno.e.sis team, Funds - NSF, NIH, AFRL, Industry…
  • 5. Only 0.5% to 1% of the data is used for analysis. http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume
  • 6. Variety – not just structure but modality: multimodal, multisensory. Semi-structured.
  • 7. Velocity: fast data, rapid changes, real-time/stream analysis. Current application examples: financial services, stock brokerage, weather tracking, movies/entertainment and online retail.
  • 8. Questions typically asked on Big Data: • What if your data volume gets so large and varied you don't know how to deal with it? • Do you store all your data? • Do you analyze it all? • What is coverage, skew, quality? How can you find out which data points are really important? • How can you use it to your best advantage? http://www.sas.com/big-data/
  • 10. Illustrative Big Data Applications: • Prediction of the spread of flu in real time during H1N1 2009 – Google tested a mammoth 450 million different mathematical models to test the search terms, which provided 45 important parameters – The model was tested when the H1N1 crisis struck in 2009 and gave more meaningful and valuable real-time information than any official public health system [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013] • FareCast: predict the direction of air fares over different routes [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013] • NY city manholes problem [ICML Discussion, 2012]
  • 11. What is missing? The current focus is mainly on serving business intelligence and targeted analytics needs, not on serving complex individual and collective human needs (e.g., empowering humans in health, fitness and well-being; better disaster coordination; personalized smart energy).
  • 12. Many opportunities, many challenges, lessons to apply: • Highly personalized/individualized/contextualized • Incorporate real-world complexity: multi-modal and multi-sensory nature of the physical world and human perception • Can more data beat better algorithms? • Can Big Data replace human judgment?
  • 13. What is needed? • Not just data to information, not just analysis, but actionable information that delivers insight and supports better decision making right in the context of human activities. Data, Information, Actionable: An apple a day keeps the doctor away
  • 14. What is needed? Taking inspiration from cognitive models • Bottom-up and top-down cognitive processes: – Bottom-up: find patterns, mine (ML, …) – Top-down: infusion of models and background knowledge (data + knowledge + reasoning). Left (plans)/Right (perceives) Brain; Top (plans)/Bottom (perceives) Brain. http://online.wsj.com/news/articles/SB10001424052702304410204579139423079198270
  • 15. What is needed? • Ambient processing as much as possible while enabling natural human involvement to guide the system. Smart Refrigerator: Low on Apples. Adapting the Plan: shopping for apples
  • 16. Makes sense to a human. Is actionable – timely and better decisions/outcomes.
  • 17. My 2004-2005 formulation of SMART DATA - Semagix. Formulation of Smart Data strategy providing services for Search, Explore, Notify. “Use of Ontologies and Data repositories to gain relevant insights”
  • 18. Smart Data (2013 retake): Smart data makes sense out of Big Data. It provides value from harnessing the challenges posed by the volume, velocity, variety and veracity of Big Data, in turn providing actionable information and improving decision making.
  • 19. Another perspective on Smart Data: OF human, BY human, FOR human. Smart data is focused on the actionable value achieved by human involvement in the data creation, processing and consumption phases for improving the human experience.
  • 20. Another perspective on Smart Data: OF human, BY human, FOR human
  • 21. 'OF human': Relevant Real-time Data Streams for Human Experience. Petabytes of Physical(sensory)-Cyber-Social data every day! More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
  • 22. Another perspective on Smart Data: OF human, BY human, FOR human
  • 23. 'BY human': Involving Crowd Intelligence in data processing workflows. Use of prior human-created knowledge models. Crowdsourcing and domain-expert guided Machine Learning modeling.
  • 24. Another perspective on Smart Data: OF human, BY human, FOR human
  • 25. 'FOR human': Improving Human Experience. Weather Application; Asthma Healthcare Application. Detection of events, such as wheezing sound, indoor temperature, humidity, dust, and CO level. Signal levels: Population Level, Personal, Public Health. Observations: luminosity, CO level, CO in gush during day time. Action in the Physical World: close the window at home during the day to avoid CO in gush, to avoid asthma attacks at night.
  • 26. 'FOR human': Improving Human Experience. Weather Application; Power Monitoring Application. Population Level Observations and Personal Level Observations: electricity usage over a day, device at work, power consumption, cost/kWh, heat index, relative humidity, and public events from social stream. Action in the Physical World: washing and drying has resulted in significant cost since it was done during the peak load period. Consider changing this time to night.
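The peak-load recommendation on this slide can be illustrated with a small cost comparison. This is only a sketch: the tariff values, the peak window, and the appliance power below are hypothetical assumptions chosen for illustration, not figures from the talk.

```python
# Hypothetical time-of-use tariff; every number here is an illustrative assumption.
PEAK_HOURS = range(14, 20)   # assumed peak load period, 14:00 to 19:59
PEAK_RATE = 0.30             # assumed cost per kWh during the peak period
OFF_PEAK_RATE = 0.10         # assumed cost per kWh at other times


def run_cost(start_hour, duration_hours, power_kw):
    """Cost of running a device at constant power, billed hour by hour."""
    cost = 0.0
    for offset in range(duration_hours):
        hour = (start_hour + offset) % 24
        rate = PEAK_RATE if hour in PEAK_HOURS else OFF_PEAK_RATE
        cost += power_kw * rate  # one hour at power_kw kW consumes power_kw kWh
    return cost


def suggest_shift(start_hour, duration_hours, power_kw, alternative_hour):
    """Return an actionable message if the alternative start time is cheaper."""
    current = run_cost(start_hour, duration_hours, power_kw)
    shifted = run_cost(alternative_hour, duration_hours, power_kw)
    if shifted < current:
        saving = current - shifted
        return f"Consider starting at {alternative_hour}:00 (saves {saving:.2f})"
    return "No cheaper slot found"


if __name__ == "__main__":
    # A 3 kW washing and drying cycle of 2 hours, run at 15:00 versus 22:00.
    print(suggest_shift(15, 2, 3.0, 22))
```

The point of the sketch is the shape of the output: raw usage and tariff observations are reduced to one sentence a person can act on.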
  • 27. Everyone and everything has Big Data – it is Smart Data that matters!
  • 28. I will use applications in 3 domains to demonstrate: • Healthcare: ADHF, Asthma, GI – using the kHealth system • Social Media Analysis: crisis coordination – using the Twitris platform • Smart Cities: traffic management
  • 29. Smart Data Applications: • Healthcare: ADHF, Asthma, GI – using the kHealth system • Social Media Analysis: crisis coordination – using the Twitris platform • Smart Cities: traffic management
  • 30. A Historical Perspective on Collecting Health Observations. 2600 BC (Imhotep): diseases treated only by external observations; doctors relied only on external observations. ~1815 (Laennec’s stethoscope): first peek beyond just external observations; the stethoscope was the first instrument to go beyond just external observations. Today: information overload! Though the stethoscope has survived, it is only one among many observations in modern medicine. http://en.wikipedia.org/wiki/Timeline_of_medicine_and_medical_technology Image Credit: British Museum
  • 31. The Patient of the Future. MIT Technology Review, 2012. http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/
  • 32. kHealth: knowledge-enabled healthcare. Empowering individuals (who are not Larry Smarr!) for their own health. Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information: a canary in a coal mine.
  • 33. kHealth Kit for the application for reducing ADHF readmission (ADHF – Acute Decompensated Heart Failure). Sensors: Weight Scale, Heart Rate Monitor, Blood Pressure Monitor. Android Device (w/ kHealth App). Readmissions cost $17B/year: $50K/readmission; Total kHealth kit cost: < $500
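The cost figures on this slide imply some simple arithmetic, worked through below. Only the three constants come from the slide; the derived quantities and variable names are a back-of-the-envelope illustration, not a claim made in the talk.

```python
# Figures stated on the slide.
READMISSION_COST_PER_YEAR = 17e9  # $17B/year
COST_PER_READMISSION = 50e3       # $50K/readmission
KIT_COST_UPPER_BOUND = 500        # total kHealth kit cost is stated as < $500

# Number of readmissions per year implied by the two cost figures.
implied_readmissions = READMISSION_COST_PER_YEAR / COST_PER_READMISSION

# How many kits could be bought for the cost of a single readmission.
kits_per_readmission = COST_PER_READMISSION / KIT_COST_UPPER_BOUND

if __name__ == "__main__":
    print(f"Implied readmissions per year: {implied_readmissions:,.0f}")
    print(f"Kits per avoided readmission:  {kits_per_readmission:,.0f}")
```

Under these figures, a kit deployment would cover its hardware cost if it prevented roughly one readmission per hundred kits.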
  • 34. Asthma: Severity of the problem. 25 million people in the U.S. are diagnosed with asthma (7 million are children) [1]. 300 million people suffer from asthma worldwide [2]. $50 billion spent on asthma alone in a year [2]. 155,000 hospital admissions in 2006 [3]. 593,000 emergency department visits in 2006 [3]. [1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/ [2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html [3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.
  • 35. kHealth Kit for the application for Asthma management. Sensors: Sensordrone (carbon monoxide, temperature, humidity), Node Sensor (exhaled nitric oxide). Android Device (w/ kHealth App). Total cost: ~ $500. *Along with two sensors in the kit, the application uses a variety of population level signals from the web: pollen level, air quality, temperature & humidity
  • 36. Data Overload for Patients/health aficionados. Personal level signals, public level signals, population level signals. Providing actionable information in a timely manner is crucial to avoid information overload or fatigue.
  • 37. Data Overload Spanning Physical-Cyber-Social Modalities. Increasingly, real-world events are: (a) Continuous: observations are fine grained over time (b) Multimodal, multisensory: observations span PCS modalities
  • 38. Big Data to Smart Data: Asthma example. Volume, Variety, Velocity, Veracity: real-time health signals from personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and population level (e.g., pollen level, CO2) arriving continuously in fine grained samples, potentially with missing information and uneven sampling frequencies. Semantics: understanding relationships between health signals and asthma attacks for providing actionable information. Value (WHY): What can we do to avoid an asthma episode? What risk factors influence asthma control? What is the contribution of each risk factor?
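Signals with missing values and uneven sampling frequencies have to be brought onto a common time grid before they can be related to one another. The sketch below shows one simple way to do that, hourly bucket averaging; it is an assumed illustration rather than the kHealth implementation, and the function names and signal names are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def align_hourly(readings):
    """Average (timestamp_seconds, value) pairs into hourly buckets.

    Readings whose value is None (missing information) are skipped.
    """
    buckets = defaultdict(list)
    for timestamp, value in readings:
        if value is None:
            continue
        buckets[timestamp // 3600].append(value)
    return {hour: mean(values) for hour, values in buckets.items()}


def join_signals(signals):
    """Join several unevenly sampled signals on a shared hourly grid.

    signals maps a signal name to its list of readings. Hours in which a
    signal has no usable reading are reported as None for that signal.
    """
    aligned = {name: align_hourly(readings) for name, readings in signals.items()}
    hours = sorted(set().union(*(per_hour.keys() for per_hour in aligned.values())))
    return {
        hour: {name: aligned[name].get(hour) for name in aligned}
        for hour in hours
    }


if __name__ == "__main__":
    # Hypothetical sample: CO sampled every 30 minutes, pollen once an hour.
    example = {
        "co_ppm": [(0, 2.0), (1800, 4.0), (3700, None)],
        "pollen": [(3600, 5)],
    }
    for hour, row in join_signals(example).items():
        print(hour, row)
```

Keeping gaps as explicit None values, instead of filling them, leaves the choice of how to treat missing information to the later analysis step.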
  • 39. kHealth: Health Signal Processing Architecture. Data acquisition & aggregation: personal level signals, public level signals, population level signals, events from social streams. Analysis: domain knowledge, risk model. Personalized actionable information: take medication before going to work; avoid going out in the evening due to high pollen levels; contact doctor.
  • 40. Asthma Control and Actionable Information. Asthma domain knowledge: Asthma Control → Daily Medication.
      Severity Level of Asthma | Choices for starting therapy (Recommended Action) | Not Well Controlled (Recommended Action) | Poorly Controlled (Recommended Action)
      Intermittent Asthma | SABA prn | - | -
      Mild Persistent Asthma | Low dose ICS | Medium ICS | Medium ICS
      Moderate Persistent Asthma | Medium dose ICS alone or with LABA/montelukast | Medium ICS + LABA/montelukast or high dose ICS | Medium ICS + LABA/montelukast or high dose ICS*
      Severe Persistent Asthma | High dose ICS with LABA/montelukast | Needs specialist care | Needs specialist care
      ICS = inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA = inhaled short-acting beta2-agonist; *consider referral to specialist
  • 41. How controlled is my asthma? Patient Health Score (diagnostic). Components shown: risk assessment model, Semantic Perception, personal level signals, public level signals, population level signals, domain knowledge. GREEN: well controlled; YELLOW: not well controlled; RED: poorly controlled.
  • 42. 59 Patient Vulnerability Score (prognostic) Risk assessment model Semantic Perception Personal level Signals Public level Signals Domain Knowledge Population level Signals Patient health Score How vulnerable* is my control level today? *considering changing environmental conditions and current control level
  • 43. 60 3.4 billion people will have smartphones or tablets by 2017 -- Research2Guidance “Intelligence at the Edges” for Digital Health http://www.digikey.com/us/en/techzone/energy-harvesting/resources/articles/zigbees-smart-energy-20-profile.html m-health app market is predicted to reach $26 billion in 2017 -- Research2Guidance
  • 44. Asthma: Actionable Information for Asthma Patients. Sensordrone – for monitoring environmental air quality. Wheezometer – for monitoring wheezing sounds. Questions: Can I reduce my asthma attacks at night? What are the triggers? What is the wheezing level? What is the propensity toward asthma? What is the exposure level over a day? What is the air quality indoors? Commute to work. Signals: luminosity, CO level. Actionable information: CO in gush during day time. Signal sources: personal level signals, public level signals, population level signals.
  • 45. Health Signal Extraction to Understanding. Observations from a Physical-Cyber-Social system: personal (Wheeze – Yes; Do you have tightness of chest? – Yes), population level (outdoor pollen and pollution), and public health. Sources: sensor and personal observations; acceleration readings from on-phone sensors; tweets reporting pollution level and asthma attacks; signals from personal, personal spaces, and community spaces. Health signal extraction: <Wheezing=Yes, time, location>, <ChestTightness=Yes, time, location>, <PollenLevel=Medium, time, location>, <Pollution=Yes, time, location>, <Activity=High, time, location>. Health signal understanding (qualify, quantify, enrich, using expert knowledge and background knowledge): <PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>, e.g., <2, 1, 1, 3, 1, RiskCategory>, with the risk category assigned by doctors. Risk categories: Well Controlled – continue; Not Well Controlled – contact nurse; Poorly Controlled – contact doctor.
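
A minimal sketch of how the qualified observations on slide 45 could be quantified into the vector <PollenLevel, ChestTightness, Pollution, Activity, Wheezing>. The codes Medium=2, High=3 and Yes=1 are read off the slide's example <2, 1, 1, 3, 1>; the codes for No and Low, and the names `ORDER`, `CODES` and `to_vector`, are assumptions made for illustration and are not taken from the kHealth system.

```python
# Order of the signals in the feature vector, as listed on the slide.
ORDER = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]

# Numeric codes. Medium=2, High=3, Yes=1 match the slide's example vector;
# No=0 and Low=1 are assumed here to complete the scale.
CODES = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}


def to_vector(observations):
    """Turn {signal name: qualitative value} into a list of numeric codes."""
    return [CODES[observations[name]] for name in ORDER]


observations = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}
print(to_vector(observations))  # [2, 1, 1, 3, 1]
```

The risk category is left out of the sketch on purpose: on the slide it is assigned by doctors, not computed from the vector.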
  • 46. 70 RDF OWL How are machines supposed to integrate and interpret sensor data? Semantic Sensor Networks (SSN)
  • 47. 71 W3C Semantic Sensor Network Ontology Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O., Garcia-Castro, R., Graybeal, J., Herzog, A., Janowicz, K., Neuhaus, H., Nikolov, A., and Page, K.: Semantic Sensor Network XG Final Report, W3C Incubator Group Report (2011).
  • 49. Levels of Abstraction. 0: Raw data [in TEXT], e.g., number: “150”. 1: Annotated data [in RDF], e.g., label: Systolic blood pressure of 150 mmHg. 2: Interpreted data (deductive) [in OWL], e.g., threshold: Elevated Blood Pressure. 3: Interpreted data (abductive) [in OWL], e.g., diagnosis: Hyperthyroidism. Tools shown: SSN Ontology, Intellego.
  • 50. 76 Making sense of sensor data with
  • 51. People are good at making sense of sensory input What can we learn from cognitive models of perception? • The key ingredient is prior knowledge 77
  • 52. Perception Cycle*. Components: Observe Property, Perceive Feature, Prior Knowledge. 1. Explanation: translating low-level signals into high-level knowledge. 2. Discrimination: focusing attention on those aspects of the environment that provide useful information. *Based on Neisser’s cognitive model of perception.
  • 53. To enable machine perception, Semantic Web technology is used to integrate sensor data with prior knowledge on the Web 79
  • 54. Prior knowledge on the Web W3C Semantic Sensor Network (SSN) Ontology Bi-partite Graph 80
  • 56. Explanation (step 1 of the perception cycle: Observe Property, Perceive Feature): translating low-level signals into high-level knowledge. Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building.
  • 57. Discrimination (step 2 of the perception cycle: Observe Property, Perceive Feature): focusing attention on those aspects of the environment that provide useful information. Discrimination is the act of finding those properties that, if observed, would help distinguish between multiple explanatory features.
  • 58. Discrimination. Discriminating Property: a property that is neither expected nor not-applicable. DiscriminatingProperty ≡ ¬ExpectedProperty ⊓ ¬NotApplicableProperty. Example properties: elevated blood pressure, clammy skin, palpitations. Example explanatory features: Hypertension, Hyperthyroidism, Pulmonary Edema.
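
The definitions on slides 56 to 58 can be sketched with plain Python sets. This is an illustration under stated assumptions, not the OWL formalization used in the talk: the feature-to-property mapping in `PRIOR` is invented for the example, and the readings "expected = shared by every explanatory feature" and "not-applicable = linked to no explanatory feature" are one interpretation of the slide's terms.

```python
# Prior knowledge as a bipartite graph: feature -> properties it can cause.
# The mapping is an illustrative assumption, not taken from the slides.
PRIOR = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def explanatory_features(observed):
    """Explanation: features that account for every observed property."""
    return {f for f, props in PRIOR.items() if observed <= props}


def discriminating_properties(observed):
    """Discrimination: properties that are neither expected nor not-applicable."""
    features = explanatory_features(observed)
    if not features:
        return set()
    all_properties = set().union(*PRIOR.values())
    # Expected: shared by every explanatory feature, so observing it rules nothing out.
    expected = set.intersection(*(PRIOR[f] for f in features))
    # Not applicable: linked to none of the explanatory features.
    not_applicable = all_properties - set().union(*(PRIOR[f] for f in features))
    return all_properties - expected - not_applicable


print(sorted(explanatory_features({"elevated blood pressure"})))
# ['Hypertension', 'Hyperthyroidism', 'Pulmonary Edema']
print(sorted(discriminating_properties({"elevated blood pressure"})))
# ['clammy skin', 'palpitations']
```

Under this mapping, observing only elevated blood pressure leaves all three features as candidates, so the next observation worth asking for is clammy skin or palpitations.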
  • 59. Semantic scalability: Resource savings of abstracting sensor data 90 Orders of magnitude resource savings for generating and storing relevant abstractions vs. raw observations. Relevant abstractions Raw observations
  • 60. How do we implement machine perception efficiently on a resource-constrained device? Use of an OWL reasoner is resource intensive (especially on resource-constrained devices), in terms of both memory and time • Runs out of resources with prior knowledge >> 15 nodes • Asymptotic complexity: O(n^3)
  • 61. Intelligence at the edge. Approach 1: Send all sensor observations to the cloud for processing. Approach 2: Downscale semantic processing so that each device is capable of machine perception. Henson et al. 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices', ISWC 2012.
  • 62. Efficient execution of machine perception Use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning 010110001101 0011110010101 1000110110110 101100011010 0111100101011 000110101100 0110100111 94
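
A minimal sketch of the bit-vector idea on slide 62, using Python integers as bit vectors. It is not the algorithm from the cited ISWC 2012 paper; the encoding (one bit vector over features per property), the prior knowledge in `ROWS`, and the function names are assumptions made for illustration.

```python
FEATURES = ["Hypertension", "Hyperthyroidism", "Pulmonary Edema"]
PROPERTIES = ["elevated blood pressure", "clammy skin", "palpitations"]

# ROWS[i] is a bit vector over FEATURES for PROPERTIES[i]:
# bit j is set when FEATURES[j] can cause that property (assumed mapping).
ROWS = [0b111, 0b010, 0b110]


def explain(observed):
    """AND together the rows of all observed properties.

    `observed` is a bit vector over PROPERTIES; the result is a bit vector
    over FEATURES with a bit set for each feature that explains everything.
    """
    result = (1 << len(FEATURES)) - 1  # start with every feature as a candidate
    for i, row in enumerate(ROWS):
        if (observed >> i) & 1:
            result &= row
    return result


def discriminate(observed):
    """Unobserved properties held by some, but not all, candidate features."""
    candidates = explain(observed)
    result = 0
    for i, row in enumerate(ROWS):
        hit = row & candidates
        if not (observed >> i) & 1 and hit != 0 and hit != candidates:
            result |= 1 << i
    return result


def decode(vector, names):
    """List the names whose bit is set in the vector."""
    return [name for i, name in enumerate(names) if (vector >> i) & 1]


observed = 0b101  # elevated blood pressure and palpitations
print(decode(explain(observed), FEATURES))         # ['Hyperthyroidism', 'Pulmonary Edema']
print(decode(discriminate(observed), PROPERTIES))  # ['clammy skin']
```

Each reasoning step is a single pass over the property rows with one AND per row, which is the kind of work a phone can do without an OWL reasoner.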
  • 63. Evaluation on a mobile device. Efficiency improvement: • Problem size increased from 10’s to 1000’s of nodes • Time reduced from minutes to milliseconds • Complexity growth reduced from polynomial (O(n^3) < x < O(n^4)) to linear (O(n))
  • 64. Semantic Perception for smarter analytics: 3 ideas to take away. 1. Translate low-level data to high-level knowledge: machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making. 2. Prior knowledge is the key to perception: using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web. 3. Intelligence at the edge: by downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices.
  • 65. Smart Data Applications • Healthcare: ADHF, Asthma, GI – Using kHealth system • Social Media Analysis: Crisis coordination – Using Twitris platform • Smart Cities: Traffic management
  • 66. 99 Smart Data for Social Good Mining human behavior to help societal and humanitarian development • crisis response coordination, harassment, gender-based violence, …
  • 67. Social (Big) Data during Crisis: Example of Hurricane Sandy. 20 million tweets with “sandy, hurricane” keywords between Oct 27th and Nov 1st. 2nd most popular topic on Facebook during 2012. • http://www.guardian.co.uk/news/datablog/2012/oct/31/twitter-sandy-flooding • http://www.huffingtonpost.com/2012/11/02/twitter-hurricane-sandy_n_2066281.html • http://mashable.com/2012/10/31/hurricane-sandy-facebook/
  • 69. Twitris' Dimensions of Integrated Semantic Analysis. Sheth et al. Twitris - a System for Collective Social Intelligence, ESNAM-2014
  • 70. What is Smart Data in the context of Disaster Management ACTIONABLE: Timely delivery of right resources and information to the right people at right location! 113 Because everyone wants to Help, but DON’T KNOW HOW!
  • 71. Disaster Response Coordination: Finding Actionable Nuggets for Responders to act. Really sparse signal to noise: 2M tweets during the first 48 hrs. of #Oklahoma-tornado-2013, with 1.3% as the precise resource donation requests to help and 0.02% as the precise resource donation offers to help. Examples: • Anyone know how to get involved to help the tornado victims in Oklahoma??#tornado #oklahomacity (OFFER) • I want to donate to the Oklahoma cause shoes clothes even food if I can (OFFER) • Text REDCROSS to 909-99 to donate to those impacted by the Moore tornado! http://t.co/oQMljkicPs (REQUEST) • Please donate to Oklahoma disaster relief efforts.: http://t.co/crRvLAaHtk (REQUEST). For responders, the most important information is the scarcity and availability of resources. Blog by our colleague Patrick Meier on this analysis: http://irevolution.net/2013/05/29/analyzing-tweets-tornado/
  • 72. Coordination of needs and offers using social media. Actors: citizen sensors, response teams (including humanitarian org. and ‘pseudo’ responders), victim site. Tweets offering or pointing to help: • RT @OpOKRelief: Southgate Baptist Church on 4th Street in Moore has food, water, clothes, diapers, toys, and more. If you can't go, call 794 • Text "FOOD" to 32333, REDCROSS to 90999, or STORM to 80888 to donate $10 in storm relief. #moore #oklahoma #disasterrelief #donate • Want to help animals in #Oklahoma? @ASPCA tells how you can help: http://t.co/mt8l9PwzmO • If you would like to volunteer today, help is desperately needed in Shawnee. Call 273-5331 for more info. Tweets asking how to help: • Does anyone know where to send a check to donate to the tornado victims? • Where do I go to help out for volunteer work around Moore? Anyone know? • Anyone know where to donate to help the animals from the Oklahoma disaster? #oklahoma #dogs. The slide marks three such pairs as Matched: serving the need! Join us for the Social Good! http://twitris.knoesis.org http://www.slideshare.net/knoesis/iccm-2013ignitetalkhemantpurohitunnairobi Purohit et al. Emergency-relief coordination on social media: Automatically matching resource requests and offers, 2014. With Int’l collaborator
  • 73. Continuous Semantics for Evolving Events to Extract Smart Data 126
  • 75. Smart Data Applications • Healthcare: ADHF, Asthma, GI – Using kHealth system • Social Media Analysis: Crisis coordination – Using Twitris platform • Smart Cities: Traffic management
  • 76. Traffic Management. Improving everyday life, which is entangled in our most common problem: being ‘stuck in traffic’
  • 77. Severity of the Traffic Problem. Source: [1] IBM Smarter Traffic
  • 78. Big Data to Smart Data: Traffic Management example. Dimensions: Variety, Volume, Veracity, Velocity, Value. Data: vehicular traffic data from the San Francisco Bay Area, aggregated from on-road sensors (numerical) and incident reports (textual), http://511.org/. Every-minute updates of speed, volume, travel time, and occupancy, resulting in 178 million link status observations, 738 active events, and 146 scheduled events, with many unevenly sampled observations, collected over 3 months. Questions: Can we detect the onset of traffic congestion? Can we characterize traffic congestion based on events? Can we estimate traffic delays in a road network? Semantics: representing prior knowledge of traffic leads to a focused exploration of this massive dataset.
  • 79. 134 Duration: 36 months Requested funding: 2.531.202 € CityPulse Consortium City of Aarhus City of Brasov
  • 80. Textual Streams for City Related Events 135
  • 81. City Event Extraction: Solution Architecture. Input: tweets from a city. Stages: POS tagging, hybrid NER + event term extraction (city event annotation), geohashing, temporal estimation, impact assessment, event aggregation. City infrastructure knowledge: OSM locations, SCRIBE ontology, 511.org hierarchy. OSM – OpenStreetMap; NER – Named Entity Recognition
  • 82. City Event Annotation – CRF Annotation Examples. Example (each token is followed by its tag): Last O night O in O CA... O (@ O Half B-LOCATION Moon I-LOCATION Bay B-LOCATION Brewing I-LOCATION Company O w/ O 8 O others) O http://t.co/w0eGEJjApY O. Tags used in our approach: B-LOCATION, I-LOCATION, B-EVENT, I-EVENT, O. These are the annotations provided by a Conditional Random Field model trained on a tweet corpus to spot city-related events and locations. BIO (Beginning, Intermediate, and Other) is a notation used in multi-phrase entity spotting.
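
A short sketch of how BIO tags like those on slide 82 can be grouped back into multi-word spans. The function `bio_spans` is hypothetical and only post-processes tags; it does not reproduce the CRF model itself. The token and tag sequence is copied from the slide's example as it appears in the transcript.

```python
def bio_spans(tagged):
    """Group (token, tag) pairs into (label, text) spans.

    A B-X tag opens a span, a following I-X tag extends it, and any other
    tag (including a stray or mismatched I- tag) closes the open span.
    """
    spans, current, label = [], [], None
    for token, tag in tagged:
        if tag.startswith("B-"):
            if current:
                spans.append((label, " ".join(current)))
            current, label = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == label:
            current.append(token)
        else:
            if current:
                spans.append((label, " ".join(current)))
            current, label = [], None
    if current:
        spans.append((label, " ".join(current)))
    return spans


tagged = [
    ("Last", "O"), ("night", "O"), ("in", "O"), ("CA...", "O"), ("(@", "O"),
    ("Half", "B-LOCATION"), ("Moon", "I-LOCATION"),
    ("Bay", "B-LOCATION"), ("Brewing", "I-LOCATION"),
    ("Company", "O"), ("w/", "O"), ("8", "O"), ("others)", "O"),
    ("http://t.co/w0eGEJjApY", "O"),
]
print(bio_spans(tagged))
# [('LOCATION', 'Half Moon'), ('LOCATION', 'Bay Brewing')]
```

With the tags exactly as transcribed, the venue name comes out as two location spans, because the tag on "Bay" starts a new span.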
  • 83. City Events from Sensor and Social Streams can be… • Complementary: additional information, e.g., slow traffic from sensor data and accident from textual data • Corroborative: additional confidence, e.g., accident event supporting an accident report from ground truth • Timely: additional insight, e.g., knowing poor visibility before formal report from ground truth
  • 84. Events from Social Streams and City Department*. Event sources: city events extracted from tweets; 511.org active events, e.g., accidents, breakdowns; 511.org scheduled events, e.g., football game, parade. Corroborative events and complementary events: a city event from Twitter providing complementary and corroborative evidence for fog reported by 511.org. *511.org
  • 85. Actionable Information in City Management. Sources: tweets from a city, traffic sensor data, OSM locations, SCRIBE ontology, 511.org hierarchy, Web of Data. How can issues in a city be resolved? E.g., what should I do in fog conditions?
  • 86. Take Away • Big Data is everywhere – at the individual level and not just limited to corporations – with growing complexity: multimodal, Physical-Cyber-Social • Analysis is not sufficient • Bottom-up techniques are not sufficient; need top-down processing, need background knowledge
  • 87. Take Away • Focus on Humans and improve human life and experience with SMART Data. – Data to Information to Contextually Relevant Abstractions – Actionable Information (Value from data) to assist and support humans in decision making. • Focus on Value -- SMART Data – Big Data challenges without the intention of deriving Value are a “Journey without GOAL”.
  • 88. 153 thank you, and please visit us at http://knoesis.org/vision Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing Wright State University, Dayton, Ohio, USA Smart Data
  • 89. Ohio Center of Excellence in Knowledge-enabled Computing • Among top universities in the world in World Wide Web (cf: 5-yr impact, Microsoft Academic Search: shared 2nd place in Mar13) • Largest academic group in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications • Exceptional student success: internships and jobs at top salary (IBM Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research universities, NLM, startups ) • 100 researchers including 15 World Class faculty (>3K citations/faculty) and 45+ PhD students- practically all funded • $2M+/yr research for largely multidisciplinary projects; world class resources; industry sponsorships/collaborations (Google, IBM, …)
  • 90. 155

Editor's Notes

  1. Starting slide. Various Big Data problems – traditional examples vs. examples of what we are doing. Variety and Velocity more than Volume. kHealth problem. People will be interested in Smart Data. Traditional ML techniques, High Performance Computing, Statistics. Human level of abstraction is Smart Data.
  2. Note: For images and sources, if not on slides, please see slide notes. Some images were taken from Web search results, and all such images belong to their respective owners; we are grateful to the owners for the usefulness of these images in our context.
  3. http://www.knowledgeinfusion.com/blog/2011/11/get-your-head-out-of-the-clouds-and-into-big-data/
  4. http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume
  5. Types of data. Formats of data. Also talk about the increase in the platforms that help generate these data.
  6. Example high-velocity Big Data applications at work: financial services, stock brokerage, weather tracking, movies/entertainment and online retail. Fast data (rate at which data is coming, esp. from mobile, social and sensor sources); Rapid changes – in the data content; Stream analysis – to cope with the incoming data for real-time online analytics.
  7. Source: http://techcrunch.com/2012/10/27/big-data-right-now-five-trendy-open-source-technologies
  8. http://radhakrishna.typepad.com/rks_musings/2013/04/big-data-review.html Google predicted the spread of flu in real time after analyzing two datasets: (a) the 50 million most common terms that Americans type, and (b) data on the spread of seasonal flu from the public health agency. It tested a mammoth 450 million different mathematical models to test the search terms, comparing their predictions against the actual flu cases. The model was tested when the H1N1 crisis struck in 2009 and gave more meaningful and valuable real-time information than any public health official system (Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013).
  9. Better Algorithms Beat More Data – And Here’s Why: http://allthingsd.com/20121128/better-algorithms-beat-more-data-and-heres-why/ Big Data Cannot Replace Human Judgment: http://www.matchcite.com/blog/blog/2012/july/big-data-cannot-replace-human-judgment.aspx **Comments about the articles
  12. Top and bottom part of the brain -- http://online.wsj.com/news/articles/SB10001424052702304410204579139423079198270 The top part of the brain is known for generating plans. The bottom part of the brain deals with current situational awareness. Perception through senses happens in the primitive part of the brain (mostly subconsciously). Machine perception allows us to transform low-level sensor observations to higher-level abstractions that are directly communicable to the upper part of the brain (non-subconscious). Thus, people can understand/adapt their plan quickly with abstractions. The left brain here is generating the plan of having an apple a day to make a healthy living. The right part of the brain identifies an apple through senses.
  13. Communicating the “abstraction” of fewer apples at home through “Ambient processing/intelligence”. The left/top part of the brain will adapt the plan to shopping for apples soon so that the overall plan of having an apple a day can be achieved.
  14. Smart data makes sense out of big data – it provides value from harnessing the challenges posed by volume, velocity, variety and veracity of big data, to provide actionable information and improve decision making.
  15. - HUMAN CENTRIC!!
  16. All the data related to human activity, existence and experiences. More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
  17. Information is CREATED by humans with the machinery available – Wikipedia tool, sensors and social networks. Information is STORED in Man+Machine readable format, LOD. Information is PROCESSED using the LOD and Human assisted Knowledge-based. Higher-level abstraction on info is now consumed in many mechanistic ways (including GIS) to provide EXPERIENCE for humans. Example of human-guided modeling and improved performance: http://research.microsoft.com/en-us/um/people/akapoor/papers/IJCAI%202011a.pdf
  18. Actionable information example: In the Asthma use case we have a sensor, Sensordrone, which records luminosity and CO levels. A high correlation between CO level and luminosity is found. This is actionable information to the user, interpreting it as CO in gush during day time => mitigating action can be “closing the window” during the day.
  19. Also, we have a weather application which performs abstraction on weather sensory observations to identify blizzard conditions (food for actions!!): -- 20,000 weather stations (with ~5 sensors per station) -- Real-Time Feature Streams - live demo: http://knoesis1.wright.edu/EventStreams/ - video demo: https://skydrive.live.com/?cid=77950e284187e848&sc=photos&id=77950E284187E848%21276
  20. Let's find it.
  21. http://www.huffingtonpost.com/2012/10/30/hurricane-sandy-power-outage-map-infographic_n_2044411.html I would like to start with a motivational example here.
  22. Fraustino, Julia Daisy, Brooke Liu and Yan Jin. “Social Media Use during Disasters: A Review of the Knowledge Base and Gaps,” Final Report to Human Factors/Behavioral Sciences Division, Science and Technology Directorate, U.S. Department of Homeland Security. College Park, MD: START, 2012.
Disaster communication deals with disaster information disseminated to the public by governments, emergency management organizations, and disaster responders as well as disaster information created and shared by journalists and the public. Disaster communication increasingly occurs via social media in addition to more conventional communication modes such as traditional media (e.g., newspaper, TV, radio) and word-of-mouth (e.g., phone call, face-to-face, group). Timely, interactive communication and user-generated content are hallmarks of social media, which include a diverse array of web- and mobile-based tools.
Disaster communication deals with (1) disaster information disseminated to the public by governments, emergency management organizations, and disaster responders often via traditional and social media; as well as (2) disaster information created and shared by journalists and affected members of the public often through word-of-mouth communication and social media.
For information seeking. Disasters often breed high levels of uncertainty among the public (Mitroff, 2004), which prompts them to engage in heightened information seeking (Boyle, Schmierbach, Armstrong, & McLeod, 2004; Procopio & Procopio, 2007). As expected, information seeking is a primary driver of social media use during routine times and during disasters (Liu et al., in press; PEW Internet, 2011).
For timely information. Social media provide real-time disaster information, which no other media can provide (Kavanaugh et al., 2011; Kodrich & Laituri, 2011).
Social media can become the primary source of time-sensitive disaster information, especially when official sources provide information too slowly or are unavailable (Spiro et al., 2012). For example, during the 2007 California wildfires, the public turned to social media because they thought journalists and public officials were too slow to provide relevant information about their communities (Sutton, Palen, & Shklovski, 2008). Time-sensitive information provided by social media during disasters is also useful for officials. For example, in an analysis of more than 500 million tweets, Culotta (2010) found Twitter data forecasted future influenza rates with high accuracy during the 2009 pandemic, obtaining a 95% correlation with national health statistics. Notably, the national statistics came from hospital survey reports, which typically had a lag time of one to two weeks for influenza reporting.
For unique information. One of the primary reasons the public uses social media during disaster is to obtain unique information (Caplan, Perse, & Gennaria, 2007). Applied to a disaster setting, which is inherently unpredictable and evolving, it follows that individuals turn to whatever source will provide the newest details. Oftentimes, individuals experiencing the event first-hand are on the scene of the disaster and can provide updates more quickly than traditional news sources and disaster response organizations. For instance, in the Mumbai terrorist attacks that included multiple coordinated shootings and bombings across two days, laypersons were first to break the news on Twitter (Merrifield & Palenchar, 2012). Research participants report using social media to satisfy their need to have the latest information available during disasters and for information gathering and sharing during disasters (Palen, Starbird, Vieweg, & Hughes, 2010; Vieweg, Hughes, Starbird, & Palen, 2010).
For unfiltered information.
To obtain crisis information, individuals often communicate with one another via social media rather than seeking a traditional news source or organizational website (Stephens & Malone, 2009). The public check in with social media not only to obtain up-to-date, timely information unavailable elsewhere, but also because they appreciate that information may be unfiltered by traditional media, organizations, or politicians (Liu et al., in press).
To determine disaster magnitude. The public uses social media to stay apprised of the extent of a disaster (Liu et al., in press). They may turn to governmental or organizational sources for this information, but research has shown that if the public do not receive the information they desire when they desire it, they, along with others, will fill in the blanks (Stephens & Malone, 2009), which can create rumors and misinformation. On the flipside, when the public believed that officials were not disseminating enough information regarding the size and trajectory of the 2007 California wildfires, they took matters into their own hands, using social media to track fire locations in real-time and notify residents who were potentially in danger (Sutton, Palen, & Shklovski, 2008).
To check in with family and friends. While Americans predominately use social media to connect with family and friends (PEW Internet, 2011), during disasters those connections may shift. For those with family or friends directly involved with the disaster, social media can provide a way to ensure safety, offer support, and receive timely status updates (Procopio & Procopio, 2007; Stephens & Malone, 2009). In a survey of 1,058 Americans, the American Red Cross (2010) found that nearly half of their respondents would use social media to let loved ones know they are safe during disasters.
After the 2011 earthquake and tsunami in Japan, the public turned to Twitter, Facebook, Skype, and local Japanese social networks to keep in touch with loved ones while mobile networks were down (Gao, Barbier, & Goolsby, 2011). Researchers also note that disasters may enhance feelings of affection toward family members, and indeed survey participants reported expressing more positive emotions toward their loved ones than usual as a result of the September 11 terrorist attacks, even if they were not directly impacted by the disaster (Fredrickson et al., 2003). Finally, disasters can motivate the public to reconnect with family and friends via social media (Procopio & Procopio, 2009; Semaan & Mark, 2012).
To self-mobilize. During disasters, the public may use social media to organize emergency relief and ongoing assistance efforts from both near and afar. In fact, one research group dubbed those who surge to the forefront of digital and in-person disaster relief efforts as “voluntweeters” (Starbird & Palen, 2011). Other research documents the role of Facebook and Twitter in disaster relief fundraising (Horrigan & Morris, 2005; PEJ, 2010). Research also reveals how social media can help identify and respond to urgent needs after disasters. For example, just two hours after the 2010 Haitian earthquake Tufts University volunteers created Ushahidi-Haiti, a crisis map where disaster survivors and volunteers could send incident reports via text messages and tweets. In less than two weeks, 2,500 incident reports were sent to the map (Gao, Barbier, & Goolsby, 2011).
To maintain a sense of community. During disasters the media in general and social media in particular may provide a unique gratification: sense of community.
That is, as the public logs in online to share their feelings and thoughts, they assist each other in creating a sense of security and community, even when scattered across a vast geographical area (Lev-On, 2011; Procopio & Procopio, 2007). As Reynolds and Seeger (2012) observed, social media create communities during disasters that may be temporary or may continue well into the future.
To seek emotional support and healing. Finally, disasters are often inherently tragic, prompting individuals to seek not only information but also human contact, conversation, and emotional care (Sutton et al., 2008). Social media are positioned to facilitate emotional support, allowing individuals to foster virtual communities and relationships, share information and feelings, and even demand resolution (Choi & Lin, 2009; Stephens & Malone, 2009). Indeed, social media in general and blogs in particular are instrumental for providing emotional support during and after disasters (Macias, Hilyard, & Freimuth, 2009; PEJ New Media Index, 2011). Additionally, social media in general and Twitter in particular can aid healing, as research finds during both natural disasters, such as Hurricane Katrina (Procopio & Procopio, 2007), and man-made disasters, such as the July 2011 attacks in Oslo, Norway (Perng et al., 2012).
  23. http://www.buzzfeed.com/annanorth/how-social-media-is-aiding-the-hurricane-sandy-rec -- Facebook help during Hurricane Sandy. http://blog.twitter.com/2012/10/hurricane-sandy-resources-on-twitter.html – Twitter page for Hurricane Sandy. http://www.treehugger.com/culture/12-ways-help-hurricane-sandy-relief-efforts.html Categorization of severity based on weather conditions. Actionable information is contextually dependent.
  24. http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/ http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US Let me consider one small example of how social data (in turn data) can help people during disasters. Data becomes smart data if it takes the recipient into account: context. Sensor data for emergency responders: who in the population needs immediate attention? (1) Location (2) Severity (3) Health condition. Need for abstraction – Semantic Perception needs abstraction. 90 + heart problem => don’t run out; 23 => run out.
  25. http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/ http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US http://www.internews.org/sites/default/files/resources/InternewsEurope_Report_Japan_Connecting%20the%20last%20mile%20Japan_2013.pdf Let me consider one small example of how social data (in turn data) can help people during disasters. Data becomes smart data if it takes the recipient into account and changes context accordingly. Sensor data for emergency responders: who in the population needs immediate attention? (1) Location (2) Severity (3) Health condition. Need for abstraction – Semantic Perception needs abstraction. 90 + heart problem => don’t run out; 23 => run out.
  26. http://news.cnet.com/8301-1023_3-57541566-93/report-twitter-hits-half-a-billion-tweets-a-day/
      http://semiocast.com/en/publications/2012_07_30_Twitter_reaches_half_a_billion_accounts_140m_in_the_US
      Let me consider one small example of how social data (in turn, data) can help people during disasters. Data becomes smart data if it takes the recipient into account and changes context accordingly.
      Sensor data for emergency responders: who in the population needs immediate attention? (1) Location (2) Severity (3) Health condition.
      Need for abstraction: semantic perception needs abstraction.
      90 + heart problem → Don’t run out; 23 → Run out
  27. http://www.buzzfeed.com/jackstuef/the-man-behind-comfortablysmug-hurricane-sandys
      During the storm last night, user @comfortablysmug was the source of a load of frightening but false information about conditions in New York City that spread wildly on Twitter and onto news broadcasts before Con Ed, the MTA, and Wall Street sources had to take time out of the crisis situation to refute them.
  28. We face challenges like these with data every time. The most important thing is what you aim to do with the data, that is, what value you intend to provide from it.
  29. http://www.wired.com/insights/2013/04/big-data-fast-data-smart-data/
  30. http://www.wired.com/insights/2013/04/big-data-fast-data-smart-data/
  31. -- Contextual Questioning – Potential Information needed from Humans
  32. "2600 BC – Imhotep wrote texts on ancient Egyptian medicine describing diagnosis and treatment of 200 diseases in 3rd dynasty Egypt."
      Sir William Osler, 1st Baronet, was a Canadian physician and one of the four founding professors of Johns Hopkins Hospital. He was called the father of modern medicine. Sir William Osler called Imhotep the true father of medicine.
      - Observations related to the human body were quite limited.
      - Initially, doctors communicated with patients, asking for their symptoms (subjective).
      - Laennec’s [René Théophile Hyacinthe Laënnec, a French physician] stethoscope was the first peek into observations of the human body (objective).
      - Now, petabytes of data are being generated from observations of the human body.
  33. Larry Smarr is a professor at the University of California, San Diego, and he was diagnosed with Crohn's disease. What’s interesting about this case is that Larry diagnosed himself. He is a pioneer in the area of Quantified Self, which uses sensors to monitor physiological symptoms. Through this process he discovered inflammation, which led him to the discovery of Crohn's disease. This type of self-tracking is becoming more and more common.
  34. - With this ability, many problems could be solved.
      - For example, we could help solve health problems (before they become serious health problems) through monitoring symptoms and real-time sense making, acting as an early warning system to detect problematic health conditions.
  35. ADHF – Acute Decompensated Heart Failure
  36. 1) www.pollen.com (for pollen levels)
      2) http://www.airnow.gov/ (for air quality levels)
      3) http://www.weatherforyou.com/ (for temperature and humidity)
  37. Data overload in the context of asthma
  38. Amit Sheth, Pramod Anantharam, Cory Henson, 'Physical-Cyber-Social Computing: An Early 21st Century Approach,' IEEE Intelligent Systems, vol. 28, no. 1, pp. 78-82, Jan.-Feb. 2013.
  39. Research on asthma has three phases:
      - Data collection: what signals to collect?
      - Analysis: what analysis is to be done?
      - Actionable information: what action to recommend?
      In the next slide, we take a peek into the analysis that we do for asthma.
  40. What is the current state of a person/patient? => Summarize all the observations (sensor and personal) into a single score indicating the health of the person. The raw data is often too much to be comprehensible to the patient: the asthma application we have developed collects CO, temperature, and humidity every 10 seconds, resulting in 8,640 observations/day. Instead of presenting all of it, we empower patients by providing actionable summaries.
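The observation count in this note can be checked with a short sketch, which also shows one simple way raw readings might be collapsed into a summary. The hourly-mean summary and the synthetic readings are assumptions for illustration; the note does not say how the application's health score is actually computed.

```python
from statistics import mean

SECONDS_PER_DAY = 24 * 60 * 60


def observations_per_day(interval_seconds: int) -> int:
    """Number of readings one sensor produces per day at a fixed sampling interval."""
    return SECONDS_PER_DAY // interval_seconds


def hourly_means(readings: list[float], interval_seconds: int) -> list[float]:
    """Collapse a day's raw readings into one mean value per hour (illustrative summary)."""
    per_hour = 3600 // interval_seconds
    return [mean(readings[i:i + per_hour]) for i in range(0, len(readings), per_hour)]


# Synthetic stand-in for one day of CO readings (not real sensor data).
day_of_readings = [float(i % 7) for i in range(observations_per_day(10))]
summary = hourly_means(day_of_readings, 10)
print(observations_per_day(10), len(summary))  # 8640 raw readings reduced to 24 values
```

Sampling every 10 seconds gives 86,400 / 10 = 8,640 readings per day, which is the figure quoted in the note.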
  41. What is the likely state of the person in the future? => Given the current state and the changing environmental conditions, estimate the state of the person by summarizing it into a number that is actionable. For example, the vulnerability score for a person with asthma is computed from environmental factors (pollen, air quality, external temperature, and humidity) and the current state of the patient. Intuitively, a person with well-controlled asthma should have a lower vulnerability score than a person with poorly controlled asthma when both are in a poor environmental state.
  42. In the absence of declarative knowledge in a domain, we resort to statistical approaches to glean insights from data. Even if there is declarative knowledge of a domain, it may have to be personalized.
      The CO level may be related to the luminosity as observed by the Sensordrone: as it gets brighter, the CO level also increases => high CO level in the daytime.
      If such an insight is provided to a person, the interpretation can be:
      - Some activity inside the house leads to high CO levels.
      - Outside activity leads to high CO levels inside the house.
      Since the person knows that he/she is away from the house during mornings, it has to be something from outside.
      - The person narrows it down to a window possibly left open at home (they often forget to close it).
  43. There are two components in making sense of health signals:
      - Health signal extraction: processing, aggregating, and abstracting from raw sensor/textual data to create human-intelligible abstractions.
      - Health signal understanding: deriving (1) connections between abstractions and (2) an action recommendation: continue, contact nurse, or contact doctor.
  44. Only score-based structure extraction is presented here. Other popular structure extraction techniques include constraint-based approaches, which find independences between random variables X1, …, Xn.
      I-Map => different structures can result in the same log-likelihood score. Thus recovering the original structure of the graph that generated the data, using data alone, is considered impossible! We turn to declarative knowledge (1) to choose promising structures and (2) to break ties when two structures result in the same score.
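The tie described in this note can be reproduced on a toy case. The sketch below scores two structures over two binary variables, X -> Y and Y -> X, with maximum-likelihood parameters and shows that their log-likelihoods coincide. The data and the function name `log_likelihood` are made up for illustration and are not taken from the talk.

```python
import math
from collections import Counter

# Synthetic binary observations of (x, y); purely illustrative.
data = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0), (1, 1), (0, 0)]


def log_likelihood(samples, parent_index, child_index):
    """Log-likelihood of the structure parent -> child under maximum-likelihood parameters."""
    n = len(samples)
    parent_counts = Counter(s[parent_index] for s in samples)
    joint_counts = Counter((s[parent_index], s[child_index]) for s in samples)
    total = 0.0
    for s in samples:
        p, c = s[parent_index], s[child_index]
        p_parent = parent_counts[p] / n                                  # P(parent)
        p_child_given_parent = joint_counts[(p, c)] / parent_counts[p]   # P(child | parent)
        total += math.log(p_parent) + math.log(p_child_given_parent)
    return total


score_x_to_y = log_likelihood(data, 0, 1)
score_y_to_x = log_likelihood(data, 1, 0)
print(math.isclose(score_x_to_y, score_y_to_x))
```

Both factorizations reproduce the same empirical joint distribution, so the score cannot tell the two directions apart; this is where prior knowledge about which direction is plausible would break the tie.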
  45. A massive amount of data is collected by sensors and mobile devices, yet patients and doctors care about “actionable” information. This data has all four Vs of big data, and we used knowledge-enabled techniques to transform it into value. In the context of PD, we analyzed a massive amount of data collected by sensors on smartphones to detect PD and characterize its severity.
  46. Main idea: prior knowledge of PD was used to facilitate its detection from massive sensor data by reducing the search space.
      Details:
      - Declarative knowledge of PD includes PD severity levels and their symptoms, as shown in the logical rule above.
      - Each PD severity level is a conjunction of a set of PD symptoms.
      - Each symptom was mapped to its manifestation in sensor observations.
      - The availability of declarative knowledge significantly improved the analytics by aiding the feature selection process.
      - The graphs above contrast the physical movements and voice of two control group members and two PD patients.
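The idea of a severity level as a conjunction of symptoms, with each symptom mapped to sensor features, might be sketched as follows. All rule contents, symptom names, and feature names here are hypothetical placeholders; the actual rule is on the slide and is not reproduced in these notes.

```python
# Hypothetical severity rules: each level is a conjunction (all listed symptoms must hold).
SEVERITY_RULES = {
    "mild": {"tremor"},
    "moderate": {"tremor", "slow_movement"},
    "severe": {"tremor", "slow_movement", "monotone_speech"},
}

# Hypothetical mapping from each symptom to the sensor features that manifest it.
SYMPTOM_TO_FEATURES = {
    "tremor": {"accelerometer_oscillation"},
    "slow_movement": {"accelerometer_low_activity"},
    "monotone_speech": {"audio_pitch_variance"},
}


def matching_levels(observed_symptoms):
    """Return the severity levels whose symptoms are all present in the observations."""
    return [level for level, symptoms in SEVERITY_RULES.items()
            if symptoms <= observed_symptoms]


def features_to_extract(level):
    """Only features tied to a level's symptoms need extracting, which narrows the search space."""
    return set().union(*(SYMPTOM_TO_FEATURES[s] for s in SEVERITY_RULES[level]))


print(matching_levels({"tremor", "slow_movement"}))
print(sorted(features_to_extract("moderate")))
```

The point of the sketch is the feature-selection effect: instead of mining every sensor channel, only the features that a rule's symptoms map to are considered.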
  47. sense making based on human cognitive models
  48. The perception cycle contains two primary phases:
      - Explanation: translating low-level signals into high-level abstractions; inference to the best explanation.
      - Discrimination: focusing attention on those properties that will help distinguish between multiple possible explanations; used to intelligently task sensors and collect additional observations (rather than the brute-force approach of blindly collecting all observations).
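A minimal sketch of the two phases over a toy knowledge base may help make them concrete. The knowledge base entries are illustrative assumptions only (they are not the authors' ontology and not medical guidance), and the function names `explain` and `discriminate` are chosen here for readability.

```python
# Toy knowledge base: feature (a condition) -> properties it can explain.
# Entries are illustrative assumptions, not the authors' ontology.
KB = {
    "hypertension": {"elevated_blood_pressure", "palpitations"},
    "hyperthyroidism": {"elevated_blood_pressure", "palpitations", "clammy_skin"},
    "pulmonary_edema": {"elevated_blood_pressure", "palpitations"},
}


def explain(observed):
    """Explanation: features that account for every observed property."""
    return {feature for feature, props in KB.items() if observed <= props}


def discriminate(observed):
    """Discrimination: unobserved properties expected by some, but not all, explanations."""
    candidates = explain(observed)
    expected = set().union(*(KB[f] for f in candidates))
    return {p for p in expected - observed
            if any(p not in KB[f] for f in candidates)}


observed = {"elevated_blood_pressure"}
print(sorted(explain(observed)))       # three candidate explanations remain
print(sorted(discriminate(observed)))  # the property worth measuring next
```

With only elevated blood pressure observed, all three features remain candidates; in this toy knowledge base the only property that would separate them is `clammy_skin`, so that is the observation to request next instead of collecting everything.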
  49. The perception cycle contains two primary phases:
      - Explanation: translating low-level signals into high-level abstractions; inference to the best explanation.
      - Discrimination: focusing attention on those properties that will help distinguish between multiple possible explanations; used to intelligently task sensors and collect additional observations (rather than the brute-force approach of blindly collecting all observations).
  50. A single-feature (disease) assumption means that all the observed properties (symptoms) must be explained by a single feature; i.e., this framework is not expressive enough to model comorbidity, where more than one feature (disease) may co-exist. For example, if two diseases cause disjoint symptoms, and all the symptoms of both diseases are observed, then this framework will not be able to find a coverage and returns no diseases.
      - The parsimony criterion used to choose among multiple explanations is the single-feature assumption.
      - Not true if multiple diseases account for a single property…
      - Rewrite with a more relaxed parsimony criterion (complex, cannot be modeled in OWL).
      - Make the KB more intelligent: create an individual that represents the two diseases which together explain a symptom.
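The limitation and the proposed workaround can be shown on a toy case. In this sketch the knowledge base, the names `disease_a` and `disease_b`, and the helper `with_composite` are hypothetical; it only illustrates the idea of adding an individual that stands for two diseases together.

```python
# Hypothetical knowledge base: two diseases with disjoint symptoms.
KB = {
    "disease_a": {"symptom_1", "symptom_2"},
    "disease_b": {"symptom_3", "symptom_4"},
}


def explain(kb, observed):
    """Single-feature explanation: features that alone cover every observed property."""
    return {feature for feature, props in kb.items() if observed <= props}


def with_composite(kb, first, second):
    """Return a copy of the KB with an extra individual standing for both features together."""
    extended = dict(kb)
    extended[f"{first}+{second}"] = kb[first] | kb[second]
    return extended


all_symptoms = {"symptom_1", "symptom_2", "symptom_3", "symptom_4"}
print(explain(KB, all_symptoms))  # empty: no single disease covers everything
print(explain(with_composite(KB, "disease_a", "disease_b"), all_symptoms))
```

Under the single-feature assumption the first call finds no explanation, while the extended knowledge base returns the composite individual.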
  51. The perception cycle contains two primary phases:
      - Explanation: translating low-level signals into high-level abstractions; inference to the best explanation.
      - Discrimination: focusing attention on those properties that will help distinguish between multiple possible explanations; used to intelligently task sensors and collect additional observations (rather than the brute-force approach of blindly collecting all observations).
  52. So check galvanic skin response sensor
  53. Intelligence distributed at the edge of the network. This requires resource-constrained devices (mobile phones, gateway nodes, etc.) to be able to utilize SW technologies.
  54. Intelligence distributed at the edge of the network. This requires resource-constrained devices (mobile phones, gateway nodes, etc.) to be able to utilize SW technologies.
      Henson et al., 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices,' ISWC 2012.
  55. Compute machine perception inferences of high complexity, i.e., explanation and discrimination, on resource-constrained devices in milliseconds. Difference between the other systems and what this system provides.
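One way to see why bit vectors suit small devices is that explanation reduces to a handful of bitwise AND operations. The sketch below is a simplified illustration under assumed encodings, not the algorithm from the cited paper; the knowledge base entries and function names are hypothetical.

```python
FEATURES = ["hypertension", "hyperthyroidism", "pulmonary_edema"]


def bitmask(names):
    """Encode a set of feature names as an integer with one bit per feature."""
    mask = 0
    for name in names:
        mask |= 1 << FEATURES.index(name)
    return mask


# For each property, the features that can explain it (illustrative entries only).
EXPLAINED_BY = {
    "elevated_blood_pressure": bitmask(FEATURES),
    "palpitations": bitmask(FEATURES),
    "clammy_skin": bitmask(["hyperthyroidism"]),
}


def explain(observed):
    """Intersect the candidate sets of all observed properties with bitwise AND."""
    candidates = (1 << len(FEATURES)) - 1  # start with every feature as a candidate
    for prop in observed:
        candidates &= EXPLAINED_BY[prop]
    return candidates


def decode(mask):
    """Turn a bit vector back into a list of feature names."""
    return [f for i, f in enumerate(FEATURES) if (mask >> i) & 1]


print(decode(explain(["elevated_blood_pressure", "clammy_skin"])))
```

Each observed property costs one AND on a machine word, which is the kind of operation that stays cheap on phones and gateway devices.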
  56. Intelligence at the edge. Shipping computation and domain models to the edge (distributed).
  57. http://www.adrants.com/2013/09/what-is-social-good-and-how-can-brands.php
  58. http://www.guardian.co.uk/news/datablog/2012/oct/31/twitter-sandy-flooding
      http://www.huffingtonpost.com/2012/11/02/twitter-hurricane-sandy_n_2066281.html
      http://mashable.com/2012/10/31/hurricane-sandy-facebook/
      We have quite a bit of social data research going on in our lab, so I would like to focus on the use of social networks during disasters and crises. Twitter and Facebook are used massively during disasters. During Hurricane Sandy there were … Beyond this, there was a major outbreak of tweets during the Japan earthquake, which crossed more than 2000 tweets/sec. So why do people use social networks to this extent during disasters?
  59. http://www.flickr.com/photos/twitteroffice/5897088517/sizes/o/in/photostream/
      http://bayarea.sbnation.com/49ers/2013/2/3/3947738/super-bowl-prop-bets-2013-twitter
      http://expandedramblings.com/index.php/march-2013-by-the-numbers-a-few-amazing-twitter-stats/
  60. Much of the early work in Big data is being done with focusing on uni-directional among XYZ.
  61. http://semanticweb.com/picking-the-president-twindex-twitris-track-social-media-electorate_b31249
      http://semanticweb.com/election-2012-the-semantic-recap_b33278
  62. http://knoesis.wright.edu/library/resource.php?id=1787
  63. Categorization of severity based on weather conditions. Actionable information is contextually dependent.
  64. - 1 (+half) minute
      Alright, so let’s motivate this with a situation during an emergency.
      - Various actors: resource seekers, responder teams, and resource providers at remote sites.
      - Each of these actor groups has questions (needs, providers, responders): wondering!
      Here we have social networks to connect these actors and bridge the gap as a communication platform, but their potential for effective help is yet to be realized.
  65. Talk about what kind of smart data we provide that helps the actions of crisis response coordination.
  66. Source: Purohit et al. 2013 (https://docs.google.com/a/knoesis.org/document/d/1aBJ2egHICUwaWxR8jOoTIUfEYj1QAnUt0q7haIKoYGY/edit# , http://www.knoesis.org/library/resource.php?id=1865)
  67. http://twitris.knoesis.org/oklahomatornado
  68. (It is real-time widget for monitoring of needs, so will not be active after the event has passed) http://twitris.knoesis.org/oklahomatornado
  69. Highly rich interface for response team
  70. Definition of the event
      - US Elections and some changes/subevents: primaries, debates, and the people/places/organizations involved in the event.
      - Arab Spring: subevents during those, e.g., the Egypt protests.
  71. Explain about continuous semantics
  72. Pucher, J., Korattyswaroopam, N., & Ittyerah, N. (2004). The crisis of public transport in India: Overwhelming needs but limited resources. Journal of Public Transportation, 7(4), 1-30.
  73. Twitter as a source of real-time information:
      - There are over 200 million users generating 500 million tweets/day.
      Twitter as a source of events in a city:
      - Citizens use Twitter to express their concerns about city infrastructure that impacts their lives.
  74. The red tweets are the tweets that are related to city infrastructure, e.g., traffic. There are two steps in converting raw tweets from a city into city-related events:
      - City event annotation: a sequence labeling technique to spot location and event terms.
      - City event extraction: aggregating all the location + event terms to derive events.
      To do this aggregation, we follow some principles that characterize city events.
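The two steps can be sketched on already-annotated tweets. The BIO-style tag names (`B-LOCATION`, `I-EVENT`, and so on), the sample tweets, and the simple counting rule are assumptions for illustration; the notes do not specify the tag set or the aggregation principles in detail.

```python
from collections import Counter

# Hypothetical annotated tweets: (token, tag) pairs using an assumed BIO-style tag set.
tagged_tweets = [
    [("accident", "B-EVENT"), ("on", "O"), ("market", "B-LOCATION"), ("street", "I-LOCATION")],
    [("bad", "O"), ("accident", "B-EVENT"), ("near", "O"),
     ("market", "B-LOCATION"), ("street", "I-LOCATION")],
    [("traffic", "B-EVENT"), ("jam", "I-EVENT"), ("on", "O"),
     ("bay", "B-LOCATION"), ("bridge", "I-LOCATION")],
]


def spans(tagged, label):
    """Collect contiguous B-/I- runs for one label as space-joined phrases."""
    phrases, current = [], []
    for token, tag in tagged:
        if tag == f"B-{label}":
            if current:
                phrases.append(" ".join(current))
            current = [token]
        elif tag == f"I-{label}" and current:
            current.append(token)
        else:
            if current:
                phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases


def aggregate(tweets):
    """Count how many tweets mention each (location, event) pair."""
    counts = Counter()
    for tagged in tweets:
        for location in spans(tagged, "LOCATION"):
            for event in spans(tagged, "EVENT"):
                counts[(location, event)] += 1
    return counts


events = aggregate(tagged_tweets)
print(events.most_common())
```

Pairs mentioned by several tweets, such as the accident on Market Street here, are the natural candidates to report as city events.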
  75. - The CRF assigns a tag to each token.
      - Global normalization is the argmax term.
      - The RHS is just a regression-based implementation of a linear-chain CRF (potentials defined only over adjacent tags).
      - The LingPipe implementation of CRF is used in our experiments.
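For reference, the standard textbook form of a linear-chain CRF is given below. The slide's own equation is not reproduced in these notes, so the symbols here (feature functions f_k, weights \lambda_k, partition function Z) are the usual conventions and may differ from the slide's notation.

```latex
\begin{align}
p(\mathbf{y} \mid \mathbf{x}) &= \frac{1}{Z(\mathbf{x})}
  \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \Big) \\
Z(\mathbf{x}) &= \sum_{\mathbf{y}'}
  \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y'_{t-1}, y'_t, \mathbf{x}, t) \Big) \\
\hat{\mathbf{y}} &= \arg\max_{\mathbf{y}} \; p(\mathbf{y} \mid \mathbf{x})
\end{align}
```

Each feature function looks only at adjacent tags y_{t-1} and y_t, which is what "linear chain" means; Z(x) normalizes over all tag sequences, and the predicted tagging is the sequence with the highest conditional probability.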
  76. These principles characterize city events
  77. Localized event detection strategy: a city is a composition of smaller geographical units. We call these geographical units grids. Geohash provides us a way of compartmentalizing a city into uniquely addressable grids.
      Distance is computed using the formula:
          dlon = lon2 - lon1
          dlat = lat2 - lat1
          a = (sin(dlat/2))^2 + cos(lat1) * cos(lat2) * (sin(dlon/2))^2
          c = 2 * atan2( sqrt(a), sqrt(1-a) )
          d = R * c (where R is the radius of the Earth)
      Found the box for the tweet!
          37.7545166015625, -122.420654296875
          37.7545166015625, -122.40966796875
          37.7490234375, -122.40966796875
          37.7490234375, -122.420654296875
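The distance formula in this note is the haversine formula, and it translates directly into code. In the sketch below the Earth radius value (6371 km) and the function names are assumptions; the four corner coordinates are the ones listed in the note, and the sample point is arbitrary.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed value for R


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    c = 2 * math.atan2(math.sqrt(a), math.sqrt(1 - a))
    return EARTH_RADIUS_KM * c


# Edges of the grid cell whose corners are listed in the note.
NORTH, WEST = 37.7545166015625, -122.420654296875
SOUTH, EAST = 37.7490234375, -122.40966796875


def in_cell(lat, lon):
    """True if the point falls inside the grid cell's bounding box."""
    return SOUTH <= lat <= NORTH and WEST <= lon <= EAST


cell_height_km = haversine_km(NORTH, WEST, SOUTH, WEST)
cell_width_km = haversine_km(NORTH, WEST, NORTH, EAST)
print(round(cell_height_km, 2), round(cell_width_km, 2), in_cell(37.75, -122.415))
```

Note that the trigonometric functions expect radians, so the degree coordinates are converted first; skipping that conversion is a common source of wrong distances.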
  78. These algorithms take the annotated tweets as input and then emit events with their metadata
  79. Now that we have presented the (1) event extraction and (2) event aggregation algorithms, how well are we doing? We evaluate both components. The ground truth is the set of events reported on 511.org; we compare the events we extract from tweets with the 511.org events.
  80. We evaluate the extracted events based on three orthogonal metrics. We compare (1) events extracted from tweets using our algorithms and (2) 511.org events:
      - Complementary events: the events from (1) and (2) may complement each other, i.e., one provides a different view from the other.
      - Corroborative events: the events from (1) and (2) may support each other, i.e., redundant events.
      - Timeliness: the events were reported on (1) before they were reported on (2).
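A rough sketch of how a tweet-derived event might be labeled against a 511.org event is given below. The matching rule (same grid cell and same event type means corroborative; same cell but different type means complementary) is an assumption made here for illustration, as are the `Event` fields and the sample values; the notes do not spell out the matching criteria.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    cell: str           # grid identifier such as a geohash (hypothetical values below)
    kind: str           # event type, e.g. "accident"
    reported: datetime  # time the event was first reported by this source


def compare(tweet_event, official_event):
    """Label a tweet-derived event against one 511.org event (assumed matching rule)."""
    same_place = tweet_event.cell == official_event.cell
    same_kind = tweet_event.kind == official_event.kind
    if same_place and same_kind:
        earlier = tweet_event.reported < official_event.reported
        return "corroborative, timely" if earlier else "corroborative"
    if same_place:
        return "complementary"
    return "unrelated"


from_tweets = Event("9q8yy", "accident", datetime(2013, 1, 1, 8, 0))
from_511 = Event("9q8yy", "accident", datetime(2013, 1, 1, 8, 30))
print(compare(from_tweets, from_511))
```

Timeliness is only meaningful once two events are judged to be the same, which is why it is checked inside the corroborative branch.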
  81. Next few slides give examples of the evaluation metric
  82. The record of 511.org may have its own timestamp which may be before tweets
  83. More at: http://wiki.knoesis.org/index.php/PCS and http://knoesis.org/projects/ssw/