
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, Variety, and Velocity using Semantic Techniques and Technologies

Keynote given at ICDE2014, April 2014. Details at: http://ieee-icde2014.eecs.northwestern.edu/keynotes.html

A video of a version of this talk is available here: http://youtu.be/8RhpFlfpJ-A

(download to see many hidden slides).

Two versions of this talk, targeted at Smart Energy and Personalized Digital Health domains/apps at: http://wiki.knoesis.org/index.php/Smart_Data

Previous (older) version replaced by this version: http://www.slideshare.net/apsheth/big-data-to-smart-data-keynote

Published in: Education, Technology, Business

  1. 1. Transforming Big Data into Smart Data: Deriving Value via Harnessing Volume, Variety and Velocity Using Semantics and Semantic Web. Keynote at the 30th IEEE International Conference on Data Engineering (ICDE) 2014. Amit Sheth, LexisNexis Ohio Eminent Scholar & Exec. Director, The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State, USA
  3. 3. Amit Sheth's PhD students: Ashutosh Jadhav, Hemant Purohit, Vinh Nguyen, Lu Chen, Pramod Anantharam, Sujan Perera, Alan Smith, Maryam Panahiazar, Sarasi Lalithsena, Cory Henson, Kalpa Gunaratna, Delroy Cameron, Sanjaya Wijeratne, Wenbo Wang, Pavan Kapanipathi, Shreyansh Bhatt (several marked "Special Thanks" on the slide). Kno.e.sis in 2012 = ~100 researchers (15 faculty, ~50 PhD students). Acknowledgements: Kno.e.sis team; funds from NSF, NIH, AFRL, industry…
  4. 4. How much data? [2011 infographic, annotated on the slide with 2013 values: 48 and 500] http://www.knowledgeinfusion.com/blog/2011/11/get-your-head-out-of-the-clouds-and-into-big-data/
  5. 5. Only 0.5% to 1% of the data is used for analysis. http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume
  6. 6. Variety – not just structure (e.g., semi-structured) but also modality: multimodal, multisensory
  7. 7. Velocity: fast data, rapid changes, real-time/stream analysis. Current application examples: financial services, stock brokerage, weather tracking, movies/entertainment, and online retail
  8. 8. Questions typically asked on Big Data (http://www.sas.com/big-data/): • What if your data volume gets so large and varied you don't know how to deal with it? • Do you store all your data? • Do you analyze it all? • What is coverage, skew, quality? How can you find out which data points are really important? • How can you use it to your best advantage?
  9. 9. Variety of Data Analytics Enablers. http://techcrunch.com/2012/10/27/big-data-right-now-five-trendy-open-source-technologies/
  10. 10. Illustrative Big Data Applications • Prediction of the spread of flu in real time during H1N1 2009 – Google tested a mammoth 450 million different mathematical models over search terms, yielding 45 important parameters – The model was tested when the H1N1 crisis struck in 2009 and gave more meaningful and valuable real-time information than any official public health system [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013] • FareCast: predict the direction of air fares over different routes [Big Data, Viktor Mayer-Schonberger and Kenneth Cukier, 2013] • NY city manholes problem [ICML Discussion, 2012]
  11. 11. What is missing? The current focus is mainly on serving business intelligence and targeted analytics needs, not on serving complex individual and collective human needs (e.g., empowering humans in health, fitness, and well-being; better disaster coordination; personalized smart energy)
  12. 12. Many opportunities, many challenges, lessons to apply • Highly personalized/individualized/contextualized • Incorporate real-world complexity: the multi-modal and multi-sensory nature of the physical world and human perception • Can more data beat better algorithms? • Can Big Data replace human judgment?
  13. 13. What is needed? • Not just data to information, not just analysis, but actionable information, delivering insight and supporting better decision making right in the context of human activities. Data, Information, Actionable: "An apple a day keeps the doctor away"
  14. 14. What is needed? Taking inspiration from cognitive models • Bottom-up and top-down cognitive processes: – Bottom up: find patterns, mine (ML, …) – Top down: infusion of models and background knowledge (data + knowledge + reasoning). Left (plans)/Right (perceives) brain; Top (plans)/Bottom (perceives) brain. http://online.wsj.com/news/articles/SB10001424052702304410204579139423079198270
  15. 15. What is needed? • Ambient processing as much as possible while enabling natural human involvement to guide the system. Smart refrigerator: low on apples. Adapting the plan: shopping for apples
  16. 16. Makes sense to a human; is actionable – timely and better decisions/outcomes
  17. 17. My 2004-2005 formulation of SMART DATA (Semagix): formulation of a Smart Data strategy providing services for Search, Explore, Notify. "Use of Ontologies and Data repositories to gain relevant insights"
  18. 18. Smart Data (2013 retake): Smart data makes sense out of big data. It provides value by harnessing the challenges posed by the volume, velocity, variety, and veracity of big data, in turn providing actionable information and improving decision making.
  19. 19. Another perspective on Smart Data: OF human, BY human, FOR human. Smart data is focused on the actionable value achieved by human involvement in the data creation, processing, and consumption phases for improving the human experience.
  21. 21. 'OF human': Relevant real-time data streams for human experience. Petabytes of Physical (sensory)-Cyber-Social data every day! More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
  23. 23. 'BY human': Involving crowd intelligence in data processing workflows. Use of prior human-created knowledge models; crowdsourcing and domain-expert guided machine learning modeling
  25. 25. 'FOR human': Improving human experience. Detection of events, such as wheezing sound, indoor temperature, humidity, dust, and CO level (population level, personal, public health; weather application, asthma healthcare application). Action in the physical world: close the window at home during the day to keep CO from gushing in, and so avoid asthma attacks at night. [Chart labels: luminosity, CO level, CO in gush during day time]
  26. 26. 'FOR human': Improving human experience. Electricity usage over a day, device at work, power consumption, cost/kWh, heat index, relative humidity, and public events from the social stream (population-level observations, personal-level observations; weather application, power monitoring application). Action in the physical world: washing and drying resulted in significant cost because it was done during the peak load period. Consider moving it to the night.
  27. 27. Everyone and everything has Big Data. It is Smart Data that matters!
  28. 28. I will use applications in 3 domains to demonstrate: • Healthcare: ADHF, Asthma, GI – using the kHealth system • Social Media Analysis: crisis coordination – using the Twitris platform • Smart Cities: traffic management
  29. 29. Smart Data Applications • Healthcare: ADHF, Asthma, GI – using the kHealth system • Social Media Analysis: crisis coordination – using the Twitris platform • Smart Cities: traffic management
  30. 30. A Historical Perspective on Collecting Health Observations. 2600 BC (Imhotep): diseases treated only by external observations; doctors relied only on external observations. ~1815 (Laennec's stethoscope): first peek beyond just external observations; the stethoscope was the first instrument to go beyond just external observations. Today: information overload! Though the stethoscope has survived, it is only one among many observations in modern medicine. http://en.wikipedia.org/wiki/Timeline_of_medicine_and_medical_technology Image credit: British Museum
  31. 31. The Patient of the Future, MIT Technology Review, 2012. http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/
  32. 32. kHealth: knowledge-enabled healthcare. Empowering individuals (who are not Larry Smarr!) for their own health. Through physical monitoring and analysis, our cellphones could act as an early warning system (a canary in a coal mine) to detect serious health conditions and provide actionable information
  33. 33. kHealth kit for the application for reducing ADHF (Acute Decompensated Heart Failure) readmission. Sensors: weight scale, heart rate monitor, blood pressure monitor; Android device (w/ kHealth app). Readmissions cost $17B/year ($50K/readmission); total kHealth kit cost: < $500
  34. 34. Asthma: Severity of the problem. 25 million people in the U.S. are diagnosed with asthma (7 million are children) [1]. 300 million people suffer from asthma worldwide [2]. $50 billion is spent on asthma alone in a year [2]. 155,000 hospital admissions in 2006 [3]. 593,000 emergency department visits in 2006 [3]. Sources: [1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/ [2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html [3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.
  35. 35. kHealth kit for the application for asthma management. Sensors: Sensordrone (carbon monoxide, temperature, humidity), Node sensor (exhaled nitric oxide); Android device (w/ kHealth app). Total cost: ~$500. *Along with the two sensors in the kit, the application uses a variety of population-level signals from the web: pollen level, air quality, temperature & humidity
  36. 36. Data Overload for Patients/Health Aficionados. Providing actionable information in a timely manner is crucial to avoid information overload or fatigue. Signals: personal level, public level, population level
  37. 37. Data Overload Spanning Physical-Cyber-Social Modalities. Increasingly, real-world events are: (a) Continuous: observations are fine-grained over time (b) Multimodal, multisensory: observations span PCS modalities
  38. 38. Big Data to Smart Data: Asthma example. Volume, Variety, Velocity, Veracity: real-time health signals from the personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health (e.g., CDC, hospital EMR), and population level (e.g., pollen level, CO2), arriving continuously in fine-grained samples, potentially with missing information and uneven sampling frequencies. Semantics: understanding relationships between health signals and asthma attacks for providing actionable information. Value (the WHY): What can we do to avoid an asthma episode? What risk factors influence asthma control? What is the contribution of each risk factor?
  39. 39. kHealth: Health Signal Processing Architecture. Inputs: personal-level, public-level, and population-level signals; events from social streams; domain knowledge; risk model. Stages: data acquisition & aggregation, analysis, personalized actionable information. Example outputs: take medication before going to work; avoid going out in the evening due to high pollen levels; contact doctor
  40. 40. Asthma Control and Actionable Information (asthma domain knowledge). Asthma control → daily medication, by severity level of asthma; recommended action per column (choices for starting therapy | not well controlled | poorly controlled): Intermittent asthma: SABA prn | - | -. Mild persistent asthma: low-dose ICS | medium ICS | medium ICS. Moderate persistent asthma: medium-dose ICS alone or with LABA/montelukast | medium ICS + LABA/montelukast or high-dose ICS | medium ICS + LABA/montelukast or high-dose ICS*. Severe persistent asthma: high-dose ICS with LABA/montelukast | needs specialist care | needs specialist care. ICS = inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA = inhaled short-acting beta2-agonist; *consider referral to specialist
  41. 41. How controlled is my asthma? Patient Health Score (diagnostic). Components: risk assessment model, semantic perception. Inputs: personal-level, public-level, and population-level signals; domain knowledge. GREEN: well controlled; YELLOW: not well controlled; RED: poorly controlled
  42. 42. How vulnerable* is my control level today? Patient Vulnerability Score (prognostic). Components: risk assessment model, semantic perception. Inputs: personal-level, public-level, and population-level signals; domain knowledge; patient health score. *Considering changing environmental conditions and current control level
  43. 43. "Intelligence at the Edges" for Digital Health. 3.4 billion people will have smartphones or tablets by 2017 (Research2Guidance). The m-health app market is predicted to reach $26 billion in 2017 (Research2Guidance). http://www.digikey.com/us/en/techzone/energy-harvesting/resources/articles/zigbees-smart-energy-20-profile.html
  44. 44. Asthma: Actionable Information for Asthma Patients. Sensordrone – for monitoring environmental air quality; Wheezometer – for monitoring wheezing sounds. Can I reduce my asthma attacks at night? What are the triggers? What is the wheezing level? What is the propensity toward asthma? What is the exposure level over a day? What is the air quality indoors? [Figure labels: commute to work; luminosity; CO level; CO in gush during day time; personal-level, public-level, and population-level signals; actionable information]
  45. 45. Health Signal Extraction to Understanding (Physical-Cyber-Social system). Observations (personal, population level, public health): signals from personal, personal spaces, and community spaces; sensor and personal observations (Wheeze – Yes; Do you have tightness of chest? – Yes); acceleration readings from on-phone sensors; outdoor pollen and pollution; tweets reporting pollution level and asthma attacks. Health signal extraction (qualify, quantify, enrich; background knowledge): <Wheezing=Yes, time, location> <ChestTightness=Yes, time, location> <PollenLevel=Medium, time, location> <Pollution=Yes, time, location> <Activity=High, time, location>. Health signal understanding (expert knowledge): Wheezing, ChestTightness, PollenLevel, Pollution, and Activity relate to a RiskCategory assigned by doctors, with records of the form <PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>, e.g., <2, 1, 1, 3, 1, RiskCategory>. Outcomes: Well Controlled – continue; Not Well Controlled – contact nurse; Poorly Controlled – contact doctor
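The step from qualitative signals to the numeric record shown on this slide can be sketched as follows. This is a minimal illustration, not the kHealth implementation: the ordinal codes in `LEVELS` are an assumption, chosen only because they reproduce the slide's example record <2, 1, 1, 3, 1>, and the function name `to_vector` is hypothetical.

```python
# Hypothetical ordinal codes; the slide shows only the resulting numbers.
LEVELS = {"No": 0, "Yes": 1, "Low": 1, "Medium": 2, "High": 3}

# Feature order taken from the record format on the slide.
FEATURES = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]


def to_vector(observations):
    """Map qualitative observations to a fixed-order ordinal vector."""
    return [LEVELS[observations[name]] for name in FEATURES]


# The five extracted signals listed on the slide (time and location omitted).
observations = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}

print(to_vector(observations))  # [2, 1, 1, 3, 1]
```

A doctor-assigned risk category would then be attached to each such vector as the label.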
  46. 46. How are machines supposed to integrate and interpret sensor data? Semantic Sensor Networks (SSN): RDF, OWL
  47. 47. W3C Semantic Sensor Network Ontology. Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O., Garcia-Castro, R., Graybeal, J., Herzog, A., Janowicz, K., Neuhaus, H., Nikolov, A., and Page, K.: Semantic Sensor Network XG Final Report, W3C Incubator Group Report (2011).
  49. 49. Levels of Abstraction (SSN Ontology, Intellego). 0 Raw data [in TEXT], e.g., number: "150". 1 Annotated data [in RDF], e.g., label: systolic blood pressure of 150 mmHg. 2 Interpreted data (deductive) [in OWL], e.g., threshold: elevated blood pressure. 3 Interpreted data (abductive) [in OWL], e.g., diagnosis: hyperthyroidism, …
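The first three levels can be illustrated with a small sketch, using plain Python objects in place of RDF and OWL. The cutoff `ELEVATED_SYSTOLIC_MMHG` is a hypothetical value chosen for illustration (the slides give no threshold, and this is not clinical guidance); the class and function names are likewise assumptions.

```python
from dataclasses import dataclass

# Hypothetical threshold for illustration only; not taken from the slides.
ELEVATED_SYSTOLIC_MMHG = 140


@dataclass
class AnnotatedObservation:
    prop: str
    value: float
    unit: str


def annotate(raw: str) -> AnnotatedObservation:
    """Level 0 -> 1: attach a property label and a unit to a raw reading."""
    return AnnotatedObservation("systolic blood pressure", float(raw), "mmHg")


def interpret(obs: AnnotatedObservation) -> str:
    """Level 1 -> 2: deductive abstraction by comparing against a threshold."""
    if obs.value >= ELEVATED_SYSTOLIC_MMHG:
        return "elevated blood pressure"
    return "normal blood pressure"


obs = annotate("150")
print(obs)
print(interpret(obs))  # elevated blood pressure
```

Level 3 (abductive interpretation, e.g., a diagnosis) needs background knowledge linking abstractions to candidate explanations, which is what the perception cycle on the following slides addresses.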
  50. 50. 76 Making sense of sensor data with
  51. 51. What can we learn from cognitive models of perception? People are good at making sense of sensory input • The key ingredient is prior knowledge
  52. 52. Perception Cycle* (*based on Neisser's cognitive model of perception). 1 Explanation: translating low-level signals into high-level knowledge. 2 Discrimination: focusing attention on those aspects of the environment that provide useful information. Elements: observe property, perceive feature, prior knowledge
  53. 53. To enable machine perception, Semantic Web technology is used to integrate sensor data with prior knowledge on the Web
  54. 54. Prior knowledge on the Web: W3C Semantic Sensor Network (SSN) Ontology; bipartite graph
  56. 56. Explanation (step 1 of the perception cycle; observe property, perceive feature): translating low-level signals into high-level knowledge. Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building
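Explanation as defined here can be sketched over a small bipartite knowledge base. The feature and property names come from the discrimination slide later in the deck, but the links between them are assumptions made for illustration, and the sketch is simplified to single features that account for every observed property.

```python
# Hypothetical bipartite knowledge base: feature -> properties it can account for.
# The edges are assumed for illustration; the transcript does not list them.
KB = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def explain(observed, kb):
    """Return the features that account for all observed properties."""
    return {feature for feature, props in kb.items() if observed <= props}


print(sorted(explain({"elevated blood pressure"}, KB)))
print(sorted(explain({"elevated blood pressure", "palpitations"}, KB)))
```

With one observation all three features remain candidate explanations; adding a second observation narrows the set, which motivates asking what to observe next.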
  57. 57. Discrimination (step 2 of the perception cycle): focusing attention on those aspects of the environment that provide useful information. Discrimination is the act of finding those properties that, if observed, would help distinguish between multiple explanatory features
  58. 58. Discrimination. Discriminating property: is neither expected nor not-applicable. DiscriminatingProperty ≡ ¬ExpectedProperty ⊓ ¬NotApplicableProperty. Example graph: properties (elevated blood pressure, clammy skin, palpitations) and explanatory features (Hypertension, Hyperthyroidism, Pulmonary Edema)
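The definition above can be read as set arithmetic. In the sketch below, a property is treated as expected when every current explanatory feature has it, and as not applicable when no current explanatory feature has it; that reading, and the edges in `KB`, are assumptions for illustration rather than the formalization used in the talk.

```python
# Hypothetical bipartite knowledge base (edges assumed for illustration).
KB = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def discriminating(observed, kb):
    """Properties that are neither expected nor not-applicable."""
    # Property sets of the features that explain every observation so far.
    explanations = [props for props in kb.values() if observed <= props]
    if not explanations:
        return set()
    all_props = set().union(*kb.values())
    expected = set.intersection(*explanations)            # shared by all explanations
    not_applicable = all_props - set().union(*explanations)  # linked to none of them
    return all_props - expected - not_applicable


print(sorted(discriminating({"elevated blood pressure"}, KB)))
```

Under these assumed links, observing elevated blood pressure alone leaves clammy skin and palpitations as the properties worth checking next, since each would rule out at least one candidate feature.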
  59. 59. Semantic scalability: resource savings of abstracting sensor data. Orders-of-magnitude resource savings for generating and storing relevant abstractions vs. raw observations
  60. 60. How do we implement machine perception efficiently on a resource-constrained device? Use of an OWL reasoner is resource intensive (especially on resource-constrained devices), in terms of both memory and time • Runs out of resources with prior knowledge >> 15 nodes • Asymptotic complexity: O(n^3)
  61. 61. Intelligence at the edge. Approach 1: send all sensor observations to the cloud for processing. Approach 2: downscale semantic processing so that each device is capable of machine perception. Henson et al., 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices', ISWC 2012.
  62. 62. Efficient execution of machine perception: use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning [illustration: rows of bit strings such as 010110001101]
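The general idea of replacing reasoning steps with bitwise operations can be sketched as below. This is not the encoding from the cited ISWC 2012 paper; it is a minimal illustration that packs each feature's properties into one integer mask, with knowledge-base edges assumed for the example.

```python
# One bit per property; the order is arbitrary but fixed.
PROPERTIES = ["elevated blood pressure", "clammy skin", "palpitations"]
BIT = {prop: 1 << i for i, prop in enumerate(PROPERTIES)}


def encode(props):
    """Pack a collection of property names into an integer bit mask."""
    mask = 0
    for prop in props:
        mask |= BIT[prop]
    return mask


def decode(mask):
    """Unpack a bit mask into property names, in PROPERTIES order."""
    return [prop for prop in PROPERTIES if mask & BIT[prop]]


# Hypothetical knowledge base (edges assumed), stored as one mask per feature.
KB_BITS = {
    "Hypertension": encode(["elevated blood pressure"]),
    "Hyperthyroidism": encode(["elevated blood pressure", "clammy skin", "palpitations"]),
    "Pulmonary Edema": encode(["elevated blood pressure", "palpitations"]),
}


def explain(observed):
    """Features whose mask contains every observed bit."""
    return [f for f, m in KB_BITS.items() if (m & observed) == observed]


def discriminating(observed):
    """Bits set in some, but not all, explanatory feature masks."""
    masks = [m for m in KB_BITS.values() if (m & observed) == observed]
    if not masks:
        return 0
    expected, applicable = masks[0], 0
    for m in masks:
        expected &= m      # bits shared by every explanation
        applicable |= m    # bits set in at least one explanation
    return applicable & ~expected


seen = encode(["elevated blood pressure"])
print(explain(seen))
print(decode(discriminating(seen)))
```

Each check is a handful of AND/OR operations per feature, so the work grows linearly with the number of features in this sketch.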
  63. 63. Efficiency improvement: evaluation on a mobile device • Problem size increased from 10's to 1000's of nodes • Time reduced from minutes to milliseconds • Complexity growth reduced from polynomial (O(n^3) < x < O(n^4)) to linear (O(n))
  64. 64. Semantic Perception for smarter analytics: 3 ideas to take away. 1 Translate low-level data to high-level knowledge: machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making. 2 Prior knowledge is the key to perception: using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web. 3 Intelligence at the edge: by downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices
  65. 65. Smart Data Applications • Healthcare: ADHF, Asthma, GI – using the kHealth system • Social Media Analysis: crisis coordination – using the Twitris platform • Smart Cities: traffic management
  66. 66. Smart Data for Social Good: mining human behavior to help societal and humanitarian development • crisis response coordination, harassment, gender-based violence, …
  67. 67. Social (Big) Data during Crisis: Example of Hurricane Sandy. 20 million tweets with "sandy, hurricane" keywords between Oct 27th and Nov 1st; 2nd most popular topic on Facebook during 2012 • http://www.guardian.co.uk/news/datablog/2012/oct/31/twitter-sandy-flooding • http://www.huffingtonpost.com/2012/11/02/twitter-hurricane-sandy_n_2066281.html • http://mashable.com/2012/10/31/hurricane-sandy-facebook/
  68. 68. http://usatoday30.usatoday.com/news/politics/twitter-election-meter http://twitris.knoesis.org/
  69. 69. Twitris' Dimensions of Integrated Semantic Analysis. Sheth et al., Twitris: a System for Collective Social Intelligence, ESNAM-2014
  70. 70. What is Smart Data in the context of Disaster Management? ACTIONABLE: timely delivery of the right resources and information to the right people at the right location! Because everyone wants to help, but DON'T KNOW HOW!
  71. 71. Disaster Response Coordination: Finding Actionable Nuggets for Responders to act. Really sparse signal to noise: • 2M tweets during the first 48 hrs. of #Oklahoma-tornado-2013 - 1.3% were precise resource donation requests to help - 0.02% were precise resource donation offers to help. Examples: • Anyone know how to get involved to help the tornado victims in Oklahoma??#tornado #oklahomacity (OFFER) • I want to donate to the Oklahoma cause shoes clothes even food if I can (OFFER) • Text REDCROSS to 909-99 to donate to those impacted by the Moore tornado! http://t.co/oQMljkicPs (REQUEST) • Please donate to Oklahoma disaster relief efforts.: http://t.co/crRvLAaHtk (REQUEST). For responders, the most important information is the scarcity and availability of resources. Blog by our colleague Patrick Meier on this analysis: http://irevolution.net/2013/05/29/analyzing-tweets-tornado/
  72. 72. Coordination of needs and offers using social media: citizen sensors, response teams (including humanitarian org. and 'pseudo' responders), victim site. Questions, each shown as matched on the slide: • Does anyone know where to send a check to donate to the tornado victims? • Where do I go to help out for volunteer work around Moore? Anyone know? • Anyone know where to donate to help the animals from the Oklahoma disaster? #oklahoma #dogs. Tweets serving the need: • RT @OpOKRelief: Southgate Baptist Church on 4th Street in Moore has food, water, clothes, diapers, toys, and more. If you can't go,call 794 • Text "FOOD" to 32333, REDCROSS to 90999, or STORM to 80888 to donate $10 in storm relief. #moore #oklahoma #disasterrelief #donate • Want to help animals in #Oklahoma? @ASPCA tells how you can help: http://t.co/mt8l9PwzmO • If you would like to volunteer today, help is desperately needed in Shawnee. Call 273-5331 for more info. Join us for the Social Good! http://twitris.knoesis.org http://www.slideshare.net/knoesis/iccm-2013ignitetalkhemantpurohitunnairobi Purohit et al., Emergency-relief coordination on social media: Automatically matching resource requests and offers, 2014. With Int'l collaborator
  73. 73. Continuous Semantics for Evolving Events to Extract Smart Data
  74. 74. Continuous Semantics: Dynamic Model Creation
  75. 75. Smart Data Applications • Healthcare: ADHF, Asthma, GI – using the kHealth system • Social Media Analysis: crisis coordination – using the Twitris platform • Smart Cities: traffic management
  76. 76. Traffic Management: improving everyday life, which is entangled in our most common problem of being 'stuck in traffic'
  77. 77. Severity of the Traffic Problem (source: IBM Smarter Traffic)
  78. 78. Big Data to Smart Data: Traffic Management example. Volume, Variety, Velocity, Veracity: vehicular traffic data from the San Francisco Bay Area aggregated from on-road sensors (numerical) and incident reports (textual), http://511.org/. Every-minute updates of speed, volume, travel time, and occupancy, resulting in 178 million link status observations, 738 active events, and 146 scheduled events, with many unevenly sampled observations collected over 3 months. Semantics: representing prior knowledge of traffic leads to a focused exploration of this massive dataset. Value: Can we detect the onset of traffic congestion? Can we characterize traffic congestion based on events? Can we estimate traffic delays in a road network?
  79. 79. CityPulse Consortium; City of Aarhus; City of Brasov. Duration: 36 months. Requested funding: 2.531.202 €
  80. 80. Textual Streams for City Related Events
  81. 81. City Event Extraction: Solution Architecture. Inputs: tweets from a city; city infrastructure. Pipeline components (city event annotation, city event extraction): POS tagging, hybrid NER + event term extraction, geohashing, temporal estimation, impact assessment, event aggregation. Knowledge sources: OSM locations, SCRIBE ontology, 511.org hierarchy. OSM – OpenStreetMap; NER – Named Entity Recognition
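The geohashing component named in this architecture maps a latitude/longitude pair to a short string so that nearby events share a prefix and can be grouped by grid cell. The sketch below implements the standard public geohash encoding; how the talk's pipeline actually uses it (precision, grouping rules) is not stated in the transcript, so the default `precision` and the sample coordinates are assumptions.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash(lat, lon, precision=7):
    """Encode a latitude/longitude pair as a geohash of the given length."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


# Approximate coordinates for San Francisco, used only as sample input.
cell = geohash(37.7749, -122.4194, precision=6)
print(cell)
```

Truncating a geohash yields the enclosing, coarser cell, so events can be aggregated at several spatial resolutions by comparing prefixes.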
  82. 82. City Event Annotation – CRF Annotation Examples. Example (token followed by its tag): Last O, night O, in O, CA... O, (@ O, Half B-LOCATION, Moon I-LOCATION, Bay B-LOCATION, Brewing I-LOCATION, Company O, w/ O, 8 O, others) O, http://t.co/w0eGEJjApY O. Tags used in our approach: B-LOCATION, I-LOCATION, B-EVENT, I-EVENT, O. These are the annotations provided by a Conditional Random Field model trained on a tweet corpus to spot city-related events and locations. BIO (Beginning, Intermediate, and Other) is a notation used in multi-phrase entity spotting
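Once a tagger has produced BIO tags, multi-token entities are recovered by grouping each B- tag with the I- tags that follow it. The sketch below shows that grouping step only; it does not train or run a CRF, and the sample tokens and tags are a hypothetical sentence, not output from the model described in the talk.

```python
def bio_to_spans(tokens, tags):
    """Group BIO-tagged tokens into (entity_type, text) spans."""
    spans = []
    current_tokens, current_type = [], None

    def flush():
        # Emit the span collected so far, if any.
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_tokens, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current_type == tag[2:]:
            current_tokens.append(token)
        else:
            # "O", or an I- tag that does not continue the open span.
            flush()
            current_tokens, current_type = [], None
    flush()
    return spans


# Hypothetical tagged tweet fragment, for illustration only.
tokens = ["Accident", "on", "Bay", "Bridge", "near", "Treasure", "Island"]
tags = ["B-EVENT", "O", "B-LOCATION", "I-LOCATION", "O", "B-LOCATION", "I-LOCATION"]

print(bio_to_spans(tokens, tags))
# [('EVENT', 'Accident'), ('LOCATION', 'Bay Bridge'), ('LOCATION', 'Treasure Island')]
```

An I- tag that does not continue an open span of the same type is dropped here; a stricter decoder could instead treat it as the start of a new span.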
  83. 83. City Events from Sensor and Social Streams can be… • Complementary: additional information, e.g., slow traffic from sensor data and an accident from textual data • Corroborative: additional confidence, e.g., an accident event supporting an accident report from ground truth • Timely: additional insight, e.g., knowing of poor visibility before the formal report from ground truth
  84. 84. Events from Social Streams and City Department* (*511.org). Event sources: city events extracted from tweets; 511.org active events, e.g., accidents, breakdowns; 511.org scheduled events, e.g., football game, parade. Complementary and corroborative events: a city event from Twitter providing complementary and corroborative evidence for fog reported by 511.org
  85. 85. Actionable Information in City Management. Sources: tweets from a city, traffic sensor data, OSM locations, SCRIBE ontology, 511.org hierarchy, Web of Data. How can issues in a city be resolved? E.g., what should I do when there is fog?
  86. 86. Take Away • Big Data is everywhere – at the individual level and not just limited to corporations – with growing complexity: multimodal, Physical-Cyber-Social • Analysis is not sufficient • Bottom-up techniques are not sufficient; we need top-down processing and background knowledge
  87. 87. Take Away • Focus on humans and improve human life and experience with SMART Data. – Data to information to contextually relevant abstractions – Actionable information (value from data) to assist and support humans in decision making. • Focus on value -- SMART Data – Tackling Big Data challenges without the intention of deriving value is a "journey without GOAL".
  88. 88. Smart Data. Thank you, and please visit us at http://knoesis.org/vision Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, Ohio, USA
  89. 89. Ohio Center of Excellence in Knowledge-enabled Computing • Among top universities in the world in World Wide Web (cf: 5-yr impact, Microsoft Academic Search: shared 2nd place in Mar13) • Largest academic group in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications • Exceptional student success: internships and jobs at top salary (IBM Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research universities, NLM, startups ) • 100 researchers including 15 World Class faculty (>3K citations/faculty) and 45+ PhD students- practically all funded • $2M+/yr research for largely multidisciplinary projects; world class resources; industry sponsorships/collaborations (Google, IBM, …)
