Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities

  1. Smart Data - How you and I will exploit Big Data for personalized digital health and many other activities. Keynote at IEEE BigData 2014, Oct 28, 2014. Amit Sheth, LexisNexis Ohio Eminent Scholar & Exec. Director, The Ohio Center of Excellence in Knowledge-enabled Computing (Kno.e.sis), Wright State, USA
  2. Thanks: My team (missing Pramod, Hemant, ...). Collaborators: Clinicians: Dr. William Abrahams (OSU-Wexner), Dr. Shalini Forbis (Dayton Children's), Dr. Sangeeta Agrawal (VA); Valerie Shalin (WSU, cognitive scientist), Payam Barnaghi (U-Surrey), Ramesh Jain (UCI), … Funding: NSF (esp. IIS-1111183 “SoCS: Social Media Enhanced Organizational Sensemaking in Emergency Response”), AFRL, NIH, Industry…
  3. 3 Big Data 2014 http://hrboss.com/hiringboss/articles/big-data-infographic
  4. Only 0.5% to 1% of the data is used for analysis. http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode 4 http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume
  5. Variety – not just structure but modality: multimodal, multisensory Semi structured 5
  6. Velocity Fast Data Rapid Changes Real-Time/Stream Analysis Current application examples: financial services, stock brokerage, weather tracking, movies/entertainment and online retail 6
  7. Ever Increasing Connected Devices and People. About 2 billion of the 5+ billion have data connections, so they perform “citizen sensing”. And there are more devices connected to the Internet than the entire human population. These ~2 billion citizen sensors and 10 billion devices & objects connected to the Internet make this an era of IoT (Internet of Things) and Internet of Everything (IoE). http://www.cisco.com/web/about/ac79/docs/innov/IoT_IBSG_0411FINAL.pdf
  8. Internet of Things / Everything: Future Trends. “The next wave of dramatic Internet growth will come through the confluence of people, process, data, and things – the Internet of Everything (IoE).” - CISCO IBSG, 2013. Beyond the IoE-based infrastructure, it is the possibility of developing applications that span the Physical, Cyber and Social Worlds that is very exciting. http://www.cisco.com/web/about/ac79/docs/innov/IoE_Economy.pdf
  9. What has not changed? We are still working on the simpler representations of the real world! Figure labels: real-world, represent, simplified representation, compute, interpret, solve. http://artint.info/html/ArtInt_8.html http://en.wikipedia.org/wiki/Traffic_congestion
  10. What should change? We need computational paradigms to tap into the rich pulse of the human populace and utilize diverse data. Represent, capture, and compute with richer and fine-grained representations of real-world problems. Figure labels: real-world, represent, richer representation, compute, interpret, solve; Richer representation of traffic observations; People interpreting a real-world event; Effective solutions.
  11. Physical-Cyber-Social Computing for Actionable Insights from Multimodal Data. Quote: “a holistic treatment of data, information, and knowledge from the PCS worlds to integrate, correlate, interpret, and provide contextually relevant abstractions to humans.” [1] Vertical operators (semantic abstraction) operate on artifacts at each level and transcend them to the next level. Horizontal operators (semantic integration) operate on data from heterogeneous sources to create integrated/correlated data streams. Figure example: Luminosity (Low/High), Carbon Monoxide, and Wheeze (Low/High) streams; High CO influences Wheezing Level (Low/High); Reduced CO level => better Asthma control. [1] Amit Sheth, Pramod Anantharam, Cory Henson, 'Physical-Cyber-Social Computing: An Early 21st Century Approach,' IEEE Intelligent Systems, vol. 28, no. 1, pp. 78-82, Jan.-Feb., 2013. http://doi.ieeecomputersociety.org/10.1109/MIS.2013.20
  12. I will use applications in 3 domains to demonstrate: • Healthcare: ADHF, Asthma, GI, Dementia – Using kHealth system • Traffic Analytics: – Understanding traffic flow • Social Media Analysis: – Crisis coordination using Twitris
  13. 14 MIT Technology Review, 2012 The Patient of the Future http://www.technologyreview.com/featuredstory/426968/the-patient-of-the-future/
  14. Asthma: A Multi-faceted and Symptomatically Variable Health Challenge 15 Personal level Signals Public level Signals Population level Signals “ … survey indicates that adult patients and caregivers of pediatric patients report variability in asthma symptoms over time, even when asthma medications are taken.”1 1Marcus, Philip, Kevin R. Murphy, Abid Rahman, and Christopher D. O’Brien. "Intrapatient symptom variability in adults and children with asthma: Results of a survey." Advances in therapy 22, no. 5 (2005): 488-497.
  15. Far better an approximate answer to the right question, which is often vague, than the exact answer to the wrong question, which can always be made precise. -- John Tukey, Ann. Math. Stat. 33 (1962) 16 Asthma: Actionable Information How is my Asthma control? Should I take additional medication today? How can I reduce my asthma attacks at home?
  16. 17 Asthma: Challenges in Heterogeneity, Variability, and Personalization Contextual Personalized Actionable Personal level Signals Public level Signals Population level Signals Domain Knowledge http://www.tuberktoraks.org/managete/fu_folder/2011-03/html/2011-3-291-311.html OR
  17. 18 My 2004-2005 formulation of SMART DATA - Semagix Formulation of Smart Data strategy providing services for Search, Explore, Notify. “Use of Ontologies and Data repositories to gain relevant insights”
  18. Smart Data (2014 retake). Smart data makes sense out of Big data. It provides value by harnessing the challenges posed by the volume, velocity, variety and veracity of big data, in turn providing actionable information and improving decision making.
  19. Another perspective on Smart Data OF human, BY human FOR human Smart data is about extracting value by improving human involvement in data creation, processing and consumption. It is about (improving) computing for human experience. 20
  20. ‘OF human’ : Relevant Real-time Data Streams for Human Experience Petabytes of Physical(sensory)-Cyber-Social Data everyday! More on PCS Computing: http://wiki.knoesis.org/index.php/PCS 21
  21. Use of Prior Human-created Knowledge Models 22 ‘BY human’: Involving Crowd Intelligence in data processing Crowdsourcing and Domain-expert guided Machine Learning Modeling
  22. ‘FOR human’: Improving Human Experience (Smart Health). Weather Application, Asthma Healthcare Application. Personal, Public Health, and Population Level signals: detection of events, such as wheezing sound, indoor temperature, humidity, dust, and CO level. Observation from luminosity and CO level: high CO content at home during the day (CO in gush during day time). Action in the Physical World.
  23. ‘FOR human’: Improving Human Experience (Smart Energy). Weather Application, Power Monitoring Application. Personal Level Observations and Population Level Observations: electricity usage over a day, device at work, power consumption, cost/kWh, heat index, relative humidity, and public events from social stream. Action in the Physical World: Washing and drying have resulted in significant cost since they were done during the peak load period. Consider changing this time to night.
  24. 25 Big Data is pervasive - It is Smart Data that matter!
  25. DATA Observations from machine and social sensors KNOWLEDGE for interpretation of observations ACTIONS situation awareness useful for decision making 26 Primary challenge is to bridge the gap between data and actions Contextualization Personalization
  26. “the top part of the brain is involved in setting up plans, controlling movements, registering changes in where objects are located in space, and revising plans when anticipated events do not occur.” 27 In the process, engaging both top and bottom brain “bottom is involved in classifying and interpreting what we perceive, and allows us to confer meaning on the world.” “The Theory of Cognitive Modes* emphasizes the constant and close interaction of the top and bottom systems. They don’t work in isolation — or in competition — but seamlessly together.” *http://brainblogger.com/2013/12/19/top-brain-bottom-brain-part-3-the-theory-of-cognitive-modes/ by G. Wayne Miller and Stephen M. Kosslyn, PhD | December 19, 2013
  27. 28 Can we take inspiration from the ‘Theory of Cognitive Modes’ to develop a computational model? T & B B T Mover Perceiver Simulator Adaptor http://online.stanford.edu/pgm-fa12 T- Top brain, B- Bottom brain our baby step toward a computational model for perception (Machine Perception)
  28. Toward a symbiotic partnership between machines and people. J. McCarthy, M. Weiser, D. Engelbart, J. C. R. Licklider. http://j.mp/k-che http://knoesis.org/index.php/Computing_For_Human_Experience
  29. 30 How are machines supposed to integrate and interpret sensor data? RDF OWL Semantic Sensor Networks (SSN)
  30. 31 W3C Semantic Sensor Network Ontology Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O., Garcia-Castro, R., Graybeal, J., Herzog, A., Janowicz, K., Neuhaus, H., Nikolov, A., and Page, K.: Semantic Sensor Network XG Final Report, W3C Incubator Group Report (2011).
  31. 32 W3C Semantic Sensor Network Ontology Lefort, L., Henson, C., Taylor, K., Barnaghi, P., Compton, M., Corcho, O., Garcia-Castro, R., Graybeal, J., Herzog, A., Janowicz, K., Neuhaus, H., Nikolov, A., and Page, K.: Semantic Sensor Network XG Final Report, W3C Incubator Group Report (2011).
  32. SSN Ontology 3 Interpreted data (abductive) [in OWL] e.g., diagnosis 2 Interpreted data (deductive) [in OWL] e.g., threshold 1 Annotated Data [in RDF] e.g., label 0 Raw Data [in TEXT] e.g., number Intellego Hyperthyroidism … … Elevated Blood Pressure Systolic blood pressure of 150 mmHg “150” 33 Levels of Abstraction
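
The four levels of abstraction on this slide can be sketched in Python. This is a minimal illustration, not the Intellego implementation: the function names, the 140 mmHg threshold, and the toy knowledge table are assumptions introduced here; only the example values (“150”, systolic blood pressure, elevated blood pressure, Hyperthyroidism) come from the slide.

```python
from typing import Optional

# Hypothetical threshold used only for this illustration (not from the slides).
ELEVATED_SYSTOLIC_MMHG = 140.0

# Toy prior knowledge (assumed): disorders that can account for a finding.
CAN_EXPLAIN = {
    "elevated blood pressure": ["Hypertension", "Hyperthyroidism"],
}


def annotate(raw: str) -> dict:
    """Level 0 -> 1: turn a raw reading into labeled (annotated) data."""
    return {"property": "systolic blood pressure", "value": float(raw), "unit": "mmHg"}


def deduce(observation: dict) -> Optional[str]:
    """Level 1 -> 2: deductive interpretation via a threshold rule."""
    if observation["value"] >= ELEVATED_SYSTOLIC_MMHG:
        return "elevated blood pressure"
    return None


def abduce(finding: Optional[str]) -> list:
    """Level 2 -> 3: abductive interpretation, i.e. candidate explanations."""
    return CAN_EXPLAIN.get(finding, [])


observation = annotate("150")      # level 1: annotated data
finding = deduce(observation)      # level 2: deductive abstraction
candidates = abduce(finding)       # level 3: candidate diagnoses
```

Each step discards detail and adds meaning: the raw string becomes a labeled value, then a qualitative finding, then a short list of hypotheses that a person can act on.
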
  33. 34 What if we could automate this interpretation of Data? … and do it efficiently and at scale
  34. Making sense of sensor data. Henson et al., 'An Ontological Approach to Focusing Attention and Enhancing Machine Perception on the Web,' Applied Ontology, 2011
  35. 36 People are good at making sense of sensory input What can we learn from cognitive models of perception? The key ingredient is prior knowledge
  36. Perception Cycle* (* based on Neisser’s cognitive model of perception). Convert a large number of observations to semantic abstractions that provide insights and translate into decisions. Cycle: Observe Property, Perceive Feature, driven by Prior Knowledge. 1. Explanation: translating low-level signals into high-level knowledge. 2. Discrimination: focusing attention on those aspects of the environment that provide useful information.
  37. 38 To enable machine perception, Semantic Web technology is used to integrate sensor data with prior knowledge on the Web W3C SSN XG 2010-2011, SSN Ontology
  38. W3C Semantic Sensor Network (SSN) Ontology Bi-partite Graph 39 Prior knowledge on the Web
  39. W3C Semantic Sensor Network (SSN) Ontology Bi-partite Graph 40 Prior knowledge on the Web
  40. Explanation (step 1 of the perception cycle: Observe Property, Perceive Feature): translating low-level signals into high-level knowledge. Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building.
  41. Explanation: inference to the best explanation. Explanation is the act of choosing the objects or events that best account for a set of observations; often referred to as hypothesis building. • In general, explanation is an abductive problem, and hard to compute. Finding the sweet spot between abduction and OWL: • Single-feature assumption* enables use of an OWL-DL deductive reasoner (* an explanation must be a single feature which accounts for all observed properties). Representation of Parsimonious Covering Theory in OWL-DL.
  42. Explanation. ExplanatoryFeature ≡ ∃ssn:isPropertyOf⁻.{p1} ⊓ … ⊓ ∃ssn:isPropertyOf⁻.{pn}. Explanatory Feature: a feature that explains the set of observed properties. Figure: bipartite graph of properties (elevated blood pressure, clammy skin, palpitations) and features (Hypertension, Hyperthyroidism, Pulmonary Edema), highlighting the observed properties and the explanatory features.
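
Under the single-feature assumption, the definition of an explanatory feature reduces to a subset test, which can be sketched with Python sets. This is an illustration, not the OWL-DL encoding from the talk: the edges between features and properties are not recoverable from this transcript, so the `KNOWLEDGE` table below is an assumed example, and `explanatory_features` is a hypothetical helper name.

```python
# Assumed toy knowledge base: feature -> properties it can account for.
# The node names come from the slide; the edges are an assumption.
KNOWLEDGE = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def explanatory_features(observed, knowledge=KNOWLEDGE):
    """Return every single feature that accounts for all observed properties."""
    return {
        feature
        for feature, properties in knowledge.items()
        if observed <= properties  # subset test: all observations are covered
    }


observed = {"elevated blood pressure", "palpitations"}
explanations = explanatory_features(observed)
```

With these assumed edges, two features remain as candidate explanations, which is the situation the discrimination step is designed to resolve.
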
  43. Discrimination (step 2 of the perception cycle: Observe Property, Perceive Feature): focusing attention on those aspects of the environment that provide useful information. Discrimination is the act of finding those properties that, if observed, would help distinguish between multiple explanatory features.
  44. Discrimination. ExpectedProperty ≡ ∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ∃ssn:isPropertyOf.{fn}. Expected Property: would be explained by every explanatory feature. Figure: bipartite graph of properties (elevated blood pressure, clammy skin, palpitations) and features (Hypertension, Hyperthyroidism, Pulmonary Edema), highlighting the expected properties.
  45. Discrimination. NotApplicableProperty ≡ ¬∃ssn:isPropertyOf.{f1} ⊓ … ⊓ ¬∃ssn:isPropertyOf.{fn}. Not Applicable Property: would not be explained by any explanatory feature. Figure: the same bipartite graph, highlighting the not-applicable properties.
  46. Discrimination. DiscriminatingProperty ≡ ¬ExpectedProperty ⊓ ¬NotApplicableProperty. Discriminating Property: is neither expected nor not-applicable. Figure: the same bipartite graph, highlighting the discriminating properties.
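
The three property classes used in discrimination (expected, not applicable, discriminating) can be sketched with plain sets. As before, this is only an illustration of the definitions, not the OWL-DL reasoner setup: the edges in `KNOWLEDGE` are assumed, and `classify_properties` is a hypothetical helper name.

```python
# Assumed toy knowledge base: feature -> properties it can account for.
KNOWLEDGE = {
    "Hypertension": {"elevated blood pressure"},
    "Hyperthyroidism": {"elevated blood pressure", "clammy skin", "palpitations"},
    "Pulmonary Edema": {"elevated blood pressure", "palpitations"},
}


def classify_properties(explanatory, knowledge=KNOWLEDGE):
    """Split all known properties into expected, not-applicable, discriminating."""
    all_properties = set().union(*knowledge.values())
    # Expected: a property of every explanatory feature.
    expected = {
        p for p in all_properties
        if all(p in knowledge[f] for f in explanatory)
    }
    # Not applicable: a property of none of the explanatory features.
    not_applicable = {
        p for p in all_properties
        if not any(p in knowledge[f] for f in explanatory)
    }
    # Discriminating: neither expected nor not applicable.
    discriminating = all_properties - expected - not_applicable
    return expected, not_applicable, discriminating


expected, not_applicable, discriminating = classify_properties(
    {"Hyperthyroidism", "Pulmonary Edema"}
)
```

Only the discriminating properties are worth observing next, since observing an expected or not-applicable property cannot change the set of explanations.
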
  47. Semantic scalability: Resource savings of abstracting sensor data 48 Orders of magnitude resource savings for generating and storing relevant abstractions vs. raw observations. Relevant abstractions Raw observations
  48. Risk Score: from Data to Abstraction and Actionable Information. Model creation: historical observations (e.g., EMR, sensor observations) and longitudinal studies of cardiovascular risks are used to find risk factors (validation with domain knowledge and a domain expert), validate correlations, and find the contribution of each risk factor, yielding a Risk Assessment Model; cf. comorbidity risk scores, e.g., the Charlson Index. kHealth inputs: machine sensors, personal input, EMR/PHR. Current observations (physical, physiological, history) are abstracted into qualities (High BP, Increased Weight) and entities (Hypertension, Hypothyroidism). Risk Score: e.g., 1 => continue, 3 => contact clinic.
  49. How do we implement machine perception efficiently on a resource-constrained device? Use of an OWL reasoner is resource intensive (especially on resource-constrained devices), in terms of both memory and time • Runs out of resources with prior knowledge >> 15 nodes • Asymptotic complexity: O(n³)
  50. Approach 1: Send all sensor observations to the cloud for processing. Approach 2 (intelligence at the edge): downscale semantic processing so that each device is capable of machine perception.
  51. Efficient execution of machine perception. Use bit vector encodings and their operations to encode prior knowledge and execute semantic reasoning. Henson et al., 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices,' ISWC 2012.
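
The general idea of replacing reasoner calls with bit operations can be sketched as follows. This is an assumed, simplified encoding for illustration and not the algorithm from the cited ISWC 2012 paper: each property gets one bit, each feature becomes an integer mask, and the explanation test becomes a single AND plus a comparison. The knowledge edges and helper names are hypothetical.

```python
PROPERTIES = ["elevated blood pressure", "clammy skin", "palpitations"]
BIT = {name: 1 << index for index, name in enumerate(PROPERTIES)}


def encode(properties):
    """Encode a collection of property names as an integer bit vector."""
    mask = 0
    for name in properties:
        mask |= BIT[name]
    return mask


# Assumed toy knowledge base, stored as one bit vector per feature.
FEATURE_MASKS = {
    "Hypertension": encode(["elevated blood pressure"]),
    "Hyperthyroidism": encode(["elevated blood pressure", "clammy skin", "palpitations"]),
    "Pulmonary Edema": encode(["elevated blood pressure", "palpitations"]),
}


def explain(observed_mask):
    """Features whose bit vector covers every observed bit."""
    return [
        feature
        for feature, mask in FEATURE_MASKS.items()
        if (mask & observed_mask) == observed_mask
    ]


observed_mask = encode(["elevated blood pressure", "palpitations"])
explanations = explain(observed_mask)
```

Each coverage check is a constant-time integer operation for a small property vocabulary, which is the kind of saving that makes this style of reasoning attractive on a phone.
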
  52. Efficiency Improvement (evaluation on a mobile device) • Problem size increased from 10’s to 1000’s of nodes • Time reduced from minutes to milliseconds • Complexity growth reduced from polynomial, O(n³) < x < O(n⁴), to linear, O(n)
  53. Semantic Perception for smarter analytics: 3 ideas to take away. 1. Translate low-level data to high-level knowledge: machine perception can be used to convert low-level sensory signals into high-level knowledge useful for decision making. 2. Prior knowledge is the key to perception: using SW technologies, machine perception can be formalized and integrated with prior knowledge on the Web. 3. Intelligence at the edge: by downscaling semantic inference, machine perception can execute efficiently on resource-constrained devices.
  54. kHealth Knowledge-enabled Healthcare Applied to ADHF, Asthma, GI, and Dementia 55
  55. Brief Introduction Video
  56. Empowering Individuals (who are not Larry Smarr!) for their own health Through physical monitoring and analysis, our cellphones could act as an early warning system to detect serious health conditions, and provide actionable information canary in a coal mine kHealth: knowledge-enabled healthcare 57
  57. Asthma: Severity of the problem. 25 million people in the U.S. are diagnosed with asthma (7 million are children) [1]. 300 million people suffer from asthma worldwide [2]. $50 billion is spent on asthma alone in a year [2]. 155,000 hospital admissions in 2006 [3]. 593,000 emergency department visits in 2006 [3]. [1] http://www.nhlbi.nih.gov/health/health-topics/topics/asthma/ [2] http://www.lung.org/lung-disease/asthma/resources/facts-and-figures/asthma-in-adults.html [3] Akinbami et al. (2009). Status of childhood asthma in the United States, 1980–2007. Pediatrics, 123(Supplement 3), S131-S145.
  58. Big Data to Smart Data: Asthma example. Volume, Variety, Velocity, Veracity: real-time health signals from personal level (e.g., Wheezometer, NO in breath, accelerometer, microphone), public health (e.g., CDC, Hospital EMR), and population level (e.g., pollen level, CO2) arriving continuously in fine-grained samples, potentially with missing information and uneven sampling frequencies. Semantics: understanding relationships between health signals and asthma attacks for providing actionable information. Value (WHY): What can we do to avoid an asthma episode? What risk factors influence asthma control? What is the contribution of each risk factor?
  59. kHealth: Health Signal Processing Architecture Personal level Signals Public level Signals Population level Signals Domain Knowledge Risk Model Events from Social Streams Take Medication before going to work Contact doctor Avoid going out in the evening due to high pollen levels Analysis Personalized Actionable Information Data Acquisition & aggregation 62
  60. Asthma Domain Knowledge: Asthma Control and Actionable Information. Domain knowledge: Asthma Control → Daily Medication. Recommended action by severity level of asthma, listed as choices for starting therapy / not well controlled / poorly controlled. Intermittent Asthma: SABA prn / - / -. Mild Persistent Asthma: low dose ICS / medium ICS / medium ICS. Moderate Persistent Asthma: medium dose ICS alone or with LABA/montelukast / medium ICS + LABA/montelukast or high dose ICS / medium ICS + LABA/montelukast or high dose ICS*. Severe Persistent Asthma: high dose ICS with LABA/montelukast / needs specialist care / needs specialist care. ICS = inhaled corticosteroid, LABA = inhaled long-acting beta2-agonist, SABA = inhaled short-acting beta2-agonist; *consider referral to specialist
  61. Patient Health Score (diagnostic): How controlled is my asthma? Inputs: Personal level Signals, Public level Signals, Population level Signals, Domain Knowledge. Processing: Semantic Perception, Risk assessment model. Output: GREEN -- Well controlled; YELLOW -- Not well controlled; RED -- Poorly controlled
  62. Patient Health Score (diagnostic): Details. Physical-Cyber-Social System → Observations → Health Signal Extraction → Health Signal Understanding, using Background Knowledge and Expert Knowledge. Signals from personal, personal spaces, and community spaces: sensor and personal observations (acceleration readings from on-phone sensors; Wheeze – Yes; Do you have tightness of chest? – Yes), tweets reporting pollution level and asthma attacks (Public Health), and outdoor pollen and pollution (Population Level). Qualify: <Wheezing=Yes, time, location> <ChestTightness=Yes, time, location> <PollenLevel=Medium, time, location> <Pollution=Yes, time, location> <Activity=High, time, location>. Quantify: <PollenLevel, ChestTightness, Pollution, Activity, Wheezing, RiskCategory>, with rows such as <2, 1, 1, 3, 1, RiskCategory>. Enrich: Risk Category assigned by doctors: Well Controlled – continue; Not Well Controlled – contact nurse; Poorly Controlled – contact doctor.
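
The qualify-then-quantify step on this slide can be sketched as a small encoder. The slide's example maps Medium pollen, chest tightness Yes, pollution Yes, High activity, and wheezing Yes to <2, 1, 1, 3, 1>; the codes for the values not shown (Low = 1, No = 0) and all names below are assumptions made for this illustration.

```python
# Ordinal and binary codes. Medium=2, High=3 and Yes=1 match the slide's
# example row; Low=1 and No=0 are assumed.
LEVELS = {"Low": 1, "Medium": 2, "High": 3}
YES_NO = {"No": 0, "Yes": 1}

# Feature order as listed on the slide.
FEATURE_ORDER = ["PollenLevel", "ChestTightness", "Pollution", "Activity", "Wheezing"]
ENCODERS = {
    "PollenLevel": LEVELS,
    "ChestTightness": YES_NO,
    "Pollution": YES_NO,
    "Activity": LEVELS,
    "Wheezing": YES_NO,
}

# Recommended follow-up per risk category, as stated on the slide.
ACTIONS = {
    "Well Controlled": "continue",
    "Not Well Controlled": "contact nurse",
    "Poor Controlled": "contact doctor",
}


def to_feature_vector(observations):
    """Quantify qualified observations into a fixed-order numeric vector."""
    return [ENCODERS[name][observations[name]] for name in FEATURE_ORDER]


qualified = {
    "Wheezing": "Yes",
    "ChestTightness": "Yes",
    "PollenLevel": "Medium",
    "Pollution": "Yes",
    "Activity": "High",
}
vector = to_feature_vector(qualified)
```

Vectors like this, labeled with the risk category assigned by doctors, are the kind of training rows a risk assessment model could learn from.
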
  63. 66 Patient Vulnerability Score (prognostic) How vulnerable* is my control level today? Risk assessment model Semantic Perception Personal level Signals Public level Signals Domain Knowledge Population level Signals Patient health Score *considering changing environmental conditions and current control level
  64. 67 Patient Vulnerability Score (prognostic): Details Sensordrone – for monitoring environmental air quality Wheezometer – for monitoring wheezing sounds Can I reduce my asthma attacks at night? What are the triggers? What is the wheezing level? What is the exposure level over a day? What is the propensity toward asthma? Commute to Work Luminosity CO level CO in gush during day time Actionable Information Personal level Signals Public level Signals Population level Signals What is the air quality indoors?
  65. Sensordrone (Carbon monoxide, temperature, humidity) Node Sensor (exhaled Nitric Oxide) 68 Sensors Android Device (w/ kHealth App) Total cost: ~ $500 kHealth Kit for the application for Asthma management Along with two sensors in the kit, the application uses a variety of population level signals from the web: Pollen level Air Quality Temperature & Humidity
  66. 69 Usability and decision support trial Dr. Shalini G. Forbis, MD, MPH
  67. Preliminary insights from patient data S1 S2 Sensor data QA data Number of Observations 36 108 40 121
  68. Chart: Medication (Albuterol) related to decreasing Exhaled Nitric Oxide. Series: “Did patient take albuterol last night due to cough or wheeze?” (axis 0 to 1) and Exhaled Nitric Oxide (axis 0 to 0.25).
  69. Chart: Activity limitation related to high exhaled Nitric Oxide. Series: “How much did asthma or asthma symptoms limit patient's activity today?” (axis 0 to 1) and Exhaled Nitric Oxide (axis 0 to 0.25).
  70. Chart: Low exhaled Nitric Oxide observed with absence of coughing. Series: “Has patient had wheeze, chest tightness, or asthma related cough today?” (axis 0 to 1) and Nitric Oxide (axis 0 to 0.25), by day from 6/2/2014 to 6/12/2014.
  71. Chart: Activity limitation observed with high pollen activity. Series: Pollen (axis 0 to 2.5) and “How much did asthma or asthma symptoms limit patient's activity today?” (axis 0 to 1).
  72. Two research directions for kHealth asthma with more data… Root cause analysis (Find Triggers of Asthma): derive the cause of asthma attacks for a given patient using statistical techniques + knowledge of asthma and its triggers. Action Recommendation (Minimize Asthma Attacks): model actions based on utility theory (cost of actions & their rewards) + knowledge of action consequences.
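
The action-recommendation direction can be illustrated with a bare-bones expected-utility calculation. The slide names utility theory but gives no model, so everything below is a hypothetical sketch: the action names are borrowed from examples elsewhere in the deck, and the costs, rewards, and success probabilities are invented placeholders, not results.

```python
# Hypothetical actions with made-up costs, rewards and success probabilities.
ACTIONS = {
    "close window during the day": {"cost": 1.0, "reward": 5.0, "p_success": 0.6},
    "avoid going out in the evening": {"cost": 3.0, "reward": 8.0, "p_success": 0.5},
    "do nothing": {"cost": 0.0, "reward": 0.0, "p_success": 1.0},
}


def expected_utility(action):
    """Probability-weighted reward minus the cost of taking the action."""
    return action["p_success"] * action["reward"] - action["cost"]


def best_action(actions):
    """Name of the action with the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name]))


recommendation = best_action(ACTIONS)
```

In a real system the probabilities would come from patient data and knowledge of action consequences rather than fixed constants.
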
  73. I will use applications in 3 domains to demonstrate: • Healthcare: ADHF, Asthma, GI – Using kHealth system • Traffic Analytics: – Understanding traffic flow • Social Media Analysis: – Crisis coordination using Twitris
  74. 78 Understanding traffic flow variations
  75. Big Data to Smart Data: Traffic Management example. Vehicular traffic data from the San Francisco Bay Area aggregated from on-road sensors (numerical) and incident reports (textual). http://511.org/ Volume, Variety, Velocity, Veracity: every-minute updates of speed, volume, travel time, and occupancy, resulting in 178 million link status observations, 738 active events, and 146 scheduled events, with many unevenly sampled observations collected over 3 months. Semantics: representing prior knowledge of traffic leads to a focused exploration of this massive dataset. Value: Can we detect the onset of traffic congestion? Can we characterize traffic congestion based on events? Can we provide actionable information to decision makers?
  76. Semantic Annotation using Background Knowledge. Domain knowledge in the form of traffic vocabulary (e.g., slow-moving-traffic) and domain knowledge of traffic flow synthesized from sensor data, linked by Explained-by. Horizontal operator: relating/mapping data from different modalities to a concept (theme) within a spatio-temporal context; spatial context even includes what it means to have slow traffic for the type of road. Image Credit: http://traffic.511.org/index
  77. I will use applications in 3 domains to demonstrate: • Healthcare: ADHF, Asthma, GI – Using kHealth system • Traffic Analytics: – Understanding traffic flow • Social Media Analysis: – Crisis coordination using Twitris
  78. [BIG] Ad-hoc Community with Varying but [FEW] Important Intents. [REQUEST/DEMAND] “Me and @CeceVancePR are coordinating a clothing/food drive for families affected by Hurricane Sandy. If you would like to donate, DM us” [OFFER/SUPPLY] “Does anyone know how to donate clothes to hurricane #Sandy victims?” Coordination teams want to hear! BIG QUESTION: Can these needles be identified in the haystack of massive datasets? Image: http://www.gizmodo.com.au/2012/04/how-we-identify-single-voices-in-a-crowd/
  79. Uncoordinated Engagement • May lead to a second disaster to be managed: – Under-supply of required demands – Over-supply of not required resources • Hurricane Sandy example, “Thanks, but no thanks”, NPR, Jan 12 2013. Story link: http://www.npr.org/2013/01/09/168946170/thanks-but-no-thanks-when-post-disaster-donations-overwhelm
  80. 84 How to volunteer, donate to Hurricane Sandy: <URL> If you have clothes to donate to those who are victims of Hurricane Sandy … Red Cross is urging blood donations to support those affected <URL> I have TONS of cute shoes & purses I want to donate to hurricane victims … Does anyone know how to donate clothes to hurricane #Sandy victims? Does anyone know of community service organizations to volunteer to help out? Needs to get something, suggests scarcity: REQUEST (demand) Offers or wants to give, suggests abundance: OFFER (supply) Matching requests with offers
  81. Match-making: Assisting Coordination. CITIZEN SENSORS at the VICTIM SITE and RESPONSE TEAMS (including humanitarian org. and ‘pseudo’ responders) express DEMAND and SUPPLY. Matchable: “Anyone know where to donate to help the animals from the Oklahoma disaster? #oklahoma #dogs” and “Want to help animals in #Oklahoma? @ASPCA tells how you can help: http://t.co/mt8l9PwzmO”. Matchable: “Where do I go to help out for volunteer work around Moore? Anyone know?” and “If you would like to volunteer today, help is desperately needed in Shawnee. Call 273-5331 for more info”. Image: http://offthewallsocial.com/tag/social-media/
  82. Two excellent videos • Vinod Khosla: the Power of Storytelling and the Future of Healthcare • Larry Smarr: The Human Microbiome and the Revolution in Digital Health 86 Wrapping up: For more on importance of what we talked about
  83. Wrapping up: Take Away • Big Data is everywhere – at individual and community levels – not just limited to corporations – with growing complexity: Physical-Cyber-Social • Analysis is not sufficient • Need interaction between bottom-up techniques and top-down processing
  84. Wrapping up: Take Away • Focus on Humans and Improve human life and experience with SMART Data. – Data to Information to Personally and Contextually Relevant Abstractions (Semantic Perception) – Actionable Information (Value from data) to assist and support human in decision making. • Focus on Value -- SMART Data – Big Data Challenges without the intention of deriving Value is a “Journey without GOAL”. 88
  85. Special thanks: Pramod. This presentation covers some of the work of my PhD students. Key contributors: Pramod Anantharam, Cory Henson and TK Prasad. Amit Sheth’s PhD students: Ashutosh Jadhav*, Hemant Purohit, Vinh Nguyen, Lu Chen, Pavan Kapanipathi*, Pramod Anantharam*, Sujan Perera, Maryam Panahiazar, Sarasi Lalithsena, Shreyansh Batt, Kalpa Gunaratna, Delroy Cameron, Sanjaya Wijeratne, Wenbo Wang
  86. • Among top universities in the world in World Wide Web (cf: 10-yr impact, Microsoft Academic Search: among top 10 in June2014) • Among the largest academic groups in the US in Semantic Web + Social/Sensor Webs, Mobile/Cloud/Cognitive Computing, Big Data, IoT, Health/Clinical & Biomedicine Applications • Exceptional student success: internships and jobs at top salary (IBM Watson/Research, MSR, Amazon, CISCO, Oracle, Yahoo!, Samsung, research universities, NLM, startups ) • 100 researchers including 15 World Class faculty (>3K citations/faculty avg) and ~45 PhD students- practically all funded • Extensive research for largely multidisciplinary projects; world class resources; industry sponsorships/collaborations (Google, IBM, …) 90
  87. 91 Top organization in WWW: 10-yr Field Rating (MAS)
  88. 92
  89. 93 Smart Data - How you and I will exploit Big Data thank you, and please visit us at http://knoesis.org

Editor's Notes

  1. Starting slide Various Big data problems – Traditional examples vs what we are doing examples. Variety and Velocity than Volume. kHealth problem. People will be interested in Smart Data. Traditional ML techniques, High Performance Computing, Statistics. Human level of Abstraction is Smart data.
  2. http://www.csc.com/insights/flxwd/78931-big_data_growth_just_beginning_to_explode http://www.guardian.co.uk/news/datablog/2012/dec/19/big-data-study-digital-universe-global-volume
  3. Types of Data Formats of Data Also talk about the increase in the platforms that helps generating these data
  4. Example high velocity Big Data applications at work: financial services, stock brokerage, weather tracking, movies/entertainment and online retail. Fast data (rate at which data is coming: esp from mobile, social and sensor sources), Rapid changes – in the data content, Stream analysis – to cope with the incoming data for real-time online analytics
  5. Over 99.4% of the physical devices that may one day be connected to the Internet are still unconnected. - CISCO IBSG, 2013
  6. Human interpretation of the world along with personalization context …
  7. Raw data → annotated data → statistical analysis → background knowledge based interpretation for actionable information
  8. - Larry Smarr is a professor at the University of California, San Diego, and he was diagnosed with Crohn's disease. What’s interesting about this case is that Larry diagnosed himself. He is a pioneer in the area of Quantified-Self, which uses sensors to monitor physiological symptoms. Through this process he discovered inflammation, which led him to the discovery of Crohn's disease. This type of self-tracking is becoming more and more common. (Add link to video.)
  9. Characteristics of asthma – why is it a complex condition?
  10. Asthma requires that we provide contextual, personalized, and actionable information to the patient by analyzing observations from Personal, Public, and Population level modalities
  11. - HUMAN CENTRIC!!
  12. All the data related to human activity, existence and experiences More on PCS Computing: http://wiki.knoesis.org/index.php/PCS
  13. Information is CREATED by human with the Machinery available – Wikipedia tool, sensors and social networks Information is STORED in Man+Machine readable format, LOD Information is PROCESSED using the LOD and Human assisted Knowledge-based Higher level abstraction on info is now consumed in many mechanistic ways (including GIS) to provide EXPERIENCE for humans Example of a human guided modeling and improved performance http://research.microsoft.com/en-us/um/people/akapoor/papers/IJCAI%202011a.pdf
  14. Actionable information example: In the asthma use case we have a sensor, Sensordrone, which records luminosity and CO levels. A high correlation between CO level and luminosity is found. This is actionable information for the user, interpreted as CO in gush during day time => a mitigating action can be “closing the window” during the day.
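
The correlation check described in this note can be sketched with a hand-written Pearson coefficient. The readings below are invented placeholder values, not patient or sensor data from the study, and `pearson` is a hypothetical helper written for this illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Invented example readings over one day (night, dawn, midday, afternoon, dusk).
luminosity = [5, 10, 200, 400, 380, 20]
co_level = [1.0, 1.2, 6.0, 9.5, 9.0, 1.5]

r = pearson(luminosity, co_level)
co_rises_with_daylight = r > 0.9
```

A strong positive coefficient only says the two signals move together; turning that into “close the window during the day” still relies on background knowledge about where the CO comes from.
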
  15. Also, we have a weather application that performs abstraction on weather sensor observations to identify blizzard conditions (food for actions!!): -- 20,000 weather stations (with ~5 sensors per station) -- Real-Time Feature Streams - live demo: http://knoesis1.wright.edu/EventStreams/ - video demo: https://skydrive.live.com/?cid=77950e284187e848&sc=photos&id=77950E284187E848%21276
  16. Let's find it...
  17. Add personalization and context
  18. - what if we could automate this sense making ability? - and what if we could do this at scale?
  19. sense making based on human cognitive models
  20. The perception cycle contains two primary phases. Explanation: translating low-level signals into high-level abstractions (inference to the best explanation). Discrimination: focusing attention on those properties that will help distinguish between multiple possible explanations; used to intelligently task sensors and collect additional observations (rather than the brute-force approach of blindly collecting all observations).
  21. The perception cycle contains two primary phases. Explanation: translating low-level signals into high-level abstractions (inference to the best explanation). Discrimination: focusing attention on those properties that will help distinguish between multiple possible explanations; used to intelligently task sensors and collect additional observations (rather than the brute-force approach of blindly collecting all observations).
  22. A single-feature (disease) assumption means that all the observed properties (symptoms) must be explained by a single feature, i.e., this framework is not expressive enough to model comorbidity, where more than one feature (disease) may co-exist. For example, if two diseases cause disjoint symptoms and all the symptoms of both diseases are observed, this framework will not be able to find a cover and returns no diseases.
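The comorbidity limitation can be illustrated with a minimal sketch. The knowledge base below (disease names and symptom sets) is entirely hypothetical and is not taken from the talk; it only assumes that an explanation is a single disease whose known symptoms cover every observed symptom.

```python
# Hypothetical toy knowledge base: disease -> symptoms it can explain.
KB = {
    "flu": {"fever", "cough"},
    "allergy": {"sneezing", "itchy eyes"},
}

def explain_single(observed, kb):
    """Return the diseases that, on their own, cover all observed symptoms."""
    return sorted(d for d, symptoms in kb.items() if observed <= symptoms)

# One disease is enough to explain these observations.
single = explain_single({"fever", "cough"}, KB)

# Symptoms of two co-existing diseases: no single disease covers them all.
comorbid = explain_single({"fever", "cough", "sneezing", "itchy eyes"}, KB)

print(single)
print(comorbid)
```

In the second call the two diseases have disjoint symptoms, so the single-feature search comes back empty even though the pair together would explain everything.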
  23. The perception cycle contains two primary phases. Explanation: translating low-level signals into high-level abstractions (inference to the best explanation). Discrimination: focusing attention on those properties that will help distinguish between multiple possible explanations; used to intelligently task sensors and collect additional observations (rather than the brute-force approach of blindly collecting all observations).
  24. Intelligence distributed at the edge of the network. Requires resource-constrained devices (mobile phones, gateway nodes, etc.) to be able to utilize Semantic Web (SW) technologies.
  25. Intelligence distributed at the edge of the network. Requires resource-constrained devices (mobile phones, gateway nodes, etc.) to be able to utilize Semantic Web (SW) technologies. Henson et al., 'An Efficient Bit Vector Approach to Semantics-based Machine Perception in Resource-Constrained Devices', ISWC 2012.
  26. Compute high-complexity machine perception inferences (i.e., explanation and discrimination) on resource-constrained devices in milliseconds. Difference between the other systems and what this system provides.
  27. Intelligence at the edge. Shipping computation and domain models to the edge (distributed).
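To give a feel for why bit vectors suit small devices, here is a minimal sketch of explanation and discrimination using integer bitmasks. It shows the general idea only and is not the encoding from Henson et al.; the property names, the feature names, and the `explain`/`discriminate` helpers are all hypothetical.

```python
# Hypothetical properties; each one gets its own bit position.
PROPERTIES = ["elevated heart rate", "shortness of breath", "swelling", "wheezing"]
BIT = {p: 1 << i for i, p in enumerate(PROPERTIES)}

def mask(props):
    """Encode a collection of property names as one integer bitmask."""
    m = 0
    for p in props:
        m |= BIT[p]
    return m

# Hypothetical features, each encoded as the set of properties it explains.
FEATURES = {
    "feature_a": mask(["elevated heart rate", "shortness of breath", "swelling"]),
    "feature_b": mask(["elevated heart rate", "shortness of breath", "wheezing"]),
    "feature_c": mask(["swelling"]),
}

def explain(observed):
    """Features whose property mask covers every observed property."""
    return sorted(f for f, m in FEATURES.items() if (m & observed) == observed)

def discriminate(observed, candidates):
    """Unobserved properties explained by some, but not all, candidates."""
    union, common = 0, ~0
    for f in candidates:
        union |= FEATURES[f]
        common &= FEATURES[f]
    useful = union & ~common & ~observed
    return [p for p in PROPERTIES if BIT[p] & useful]

observed = mask(["elevated heart rate", "shortness of breath"])
candidates = explain(observed)
next_to_check = discriminate(observed, candidates)
print(candidates, next_to_check)
```

Each inference step reduces to a handful of AND/OR operations per feature, and the discrimination result names the properties worth observing next instead of collecting everything.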
  28. ADHF – Acute Decompensated Heart Failure
  29. - With this ability, many problems could be solved - For example: we could help solve health problems (before they become serious health problems) through monitoring symptoms and real-time sense making, acting as an early warning system to detect problematic health conditions
  30. ADHF – Acute Decompensated Heart Failure
  31. Research on Asthma has three phases. Data collection: what signals to collect? Analysis: what analysis is to be done? Actionable information: what action to recommend? In the next slide, we take a peek into the analysis that we do for Asthma.
  32. What is the current state of a person/patient? => Summarizing all the observations (sensor and personal) into a single score indicating the health of a person. Instead of presenting all the raw data, which is often too much and may not be comprehensible to the patient (e.g., the Asthma application we have developed collects CO, temperature, and humidity every 10 seconds, resulting in 8,640 observations/day), we empower them by providing actionable summaries.
  33. There are two components in making sense of Health Signals. Health signal extraction – processing, aggregating, and abstracting from raw sensor/textual data to create human-intelligible abstractions. Health signal understanding – deriving (1) connections between abstractions and (2) an action recommendation: continue, contact nurse, or contact doctor.
  34. What is the likely state of the person in the future? => Given the current state and the changing environmental conditions, estimate the state of the person by summarizing it into a number which is actionable. For example, the vulnerability score for a person with Asthma is computed from environmental factors (pollen, air quality, external temperature, and humidity) and the current state of the patient. Intuitively, a person with well-controlled asthma should have a lower vulnerability score than a person with poorly controlled asthma when both are in a poor environmental state.
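A minimal sketch of such a score follows. The notes do not give the actual formula, so everything here is an assumption made for illustration: the 0-to-1 normalization of each environmental factor, the equal weighting, the `CONTROL_WEIGHT` values, and the 0-100 scale.

```python
from dataclasses import dataclass

@dataclass
class Environment:
    # Assumed: each factor is pre-normalized to 0.0 (good) .. 1.0 (poor).
    pollen: float
    air_quality: float
    temperature: float
    humidity: float

# Hypothetical weights: poorer asthma control amplifies environmental risk.
CONTROL_WEIGHT = {"well controlled": 0.5, "poorly controlled": 1.0}

def vulnerability(env, control_level):
    """Combine environmental risk and control level into a 0-100 score."""
    env_risk = (env.pollen + env.air_quality + env.temperature + env.humidity) / 4
    return round(100 * env_risk * CONTROL_WEIGHT[control_level], 1)

# The same poor environment for two patients with different control levels.
poor_env = Environment(pollen=0.9, air_quality=0.8, temperature=0.6, humidity=0.7)
well = vulnerability(poor_env, "well controlled")
poorly = vulnerability(poor_env, "poorly controlled")
print(well, poorly)
```

Whatever the real weighting is, the sketch preserves the stated intuition: in the same poor environment, the well-controlled patient scores lower than the poorly controlled one.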
  35. In the absence of declarative knowledge in a domain, we resort to statistical approaches to glean insights from data. Even if there is declarative knowledge of a domain, it may have to be personalized. The CO level may be related to the luminosity as observed by the sensordrone – as it gets brighter the CO level also increases => high CO level in the daytime. If such an insight is provided to a person, the interpretation can be: (a) some activity inside the house leads to high CO levels, or (b) outside activity leads to high CO levels inside the house. Since the person knows that he/she is away from the house in the mornings, it has to be something from outside. - The person narrows it down to a window possibly left open at home (one they often forget to close).
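The statistical step can be sketched as a plain Pearson correlation between the two sensor streams. The readings below are made up for illustration, and the 0.8 threshold is an arbitrary assumption; neither comes from the study.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up readings over one day (luminosity in lux, CO in ppm).
luminosity = [5, 20, 180, 420, 610, 380, 90, 10]
co_ppm = [0.4, 0.5, 1.1, 2.0, 2.6, 1.9, 0.8, 0.4]

r = pearson(luminosity, co_ppm)

# Assumed threshold for flagging the pattern to the user.
insight = "CO rises with daylight" if r > 0.8 else "no clear daytime pattern"
print(round(r, 2), insight)
```

The correlation only flags the pattern. Deciding that an open window is the cause still relies on the person's own knowledge of their routine, as described above.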
  36. 1) www.pollen.com (for pollen levels) 2) http://www.airnow.gov/ (for air quality levels) 3) http://www.weatherforyou.com/ (for temperature and humidity)
  37. Subject 1: 121 data points from sensor observations; 40 data points from QA, including one comment. Subject 2: 108 data points from sensor observations; 36 data points from QA, including one comment.
  38. Pucher, J., Korattyswaroopam, N., & Ittyerah, N. (2004). The crisis of public transport in India: Overwhelming needs but limited resources. Journal of Public Transportation, 7(4), 1-30.
  39. Horizontal operation
  40. People join these social media (SM) communities with a variety of intentions. Among these varying intents is a very small sample of important intentions that can assist the coordination of actions: requests for help and offers to help.
  41. (1) Example overview
  42. Alright, so let’s motivate with this situation during an emergency. - Various actors: resource seekers, responder teams, resource providers at a remote site. And each of these actor groups has questions --- - needs - providers - responders: wondering! Here we have a social network to connect these actors and bridge the gap as a communication platform. But its potential use is yet to be realized for effective help. Because... (next slide)
  43. More at: http://wiki.knoesis.org/index.php/PCS and http://knoesis.org/projects/ssw/