1) Amit Sheth presented on how knowledge can help machines better understand big data.
2) He discussed challenges like understanding implicit entities, analyzing drug abuse forums, and understanding city traffic using sensors and text.
3) Sheth argued that knowledge graphs and ontologies can help interpret diverse data types and provide contextual understanding to help solve real-world problems.
Semantic, Cognitive, and Perceptual Computing – three intertwined strands of ...Amit Sheth
Keynote at Web Intelligence 2017: http://webintelligence2017.com/program/keynotes/
Video: https://youtu.be/EIbhcqakgvA Paper: http://knoesis.org/node/2698
Abstract: While Bill Gates, Stephen Hawking, Elon Musk, Peter Thiel, and others engage in OpenAI discussions of whether or not AI, robots, and machines will replace humans, proponents of human-centric computing continue to extend work in which humans and machines partner in contextualized and personalized processing of multimodal data to derive actionable information.
In this talk, we discuss how maturing towards the emerging paradigms of semantic computing (SC), cognitive computing (CC), and perceptual computing (PC) provides a continuum through which to exploit the ever-increasing diversity of data that could enhance people’s daily lives. SC and CC sift through raw data to personalize it according to context and individual users, creating abstractions that move the data closer to what humans can readily understand and apply in decision-making. PC, which interacts with the surrounding environment to collect data that is relevant and useful in understanding the outside world, is characterized by interpretative and exploratory activities that are supported by the use of prior/background knowledge. Using the examples of personalized digital health and a smart city, we will demonstrate how the trio of these computing paradigms forms complementary capabilities that will enable the development of the next generation of intelligent systems. For background: http://bit.ly/PCSComputing
Presented at SW2012 @ ISWC2012.
http://amitsheth.blogspot.com/2012/08/semantics-empowered-physical-cyber.html
This is an old version of this talk. For more recent information on this topic (e.g., talks, papers, events), see: http://wiki.knoesis.org/index.php/PCS
Smart Data for you and me: Personalized and Actionable Physical Cyber Social ...Amit Sheth
Featured Keynote at Worldcomp'14, July 2014: http://www.world-academy-of-science.org/worldcomp14/ws/keynotes/keynote_sheth
Video of the talk at: http://youtu.be/2991W7OBLqU
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there has been rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is human health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information, etc.). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!
In this talk, I will put forward the concept of Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If I am an asthma patient, for all the data relevant to me with the four V-challenges, what I care about is simply, “How is my current health, and what is the risk of having an asthma attack in my personal situation, especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain-specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP.
For harnessing Volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city. I will present examples from a couple of these.
This tutorial presents tools and techniques for effectively utilizing the Internet of Things (IoT) for building advanced applications, including Physical-Cyber-Social (PCS) systems. The issues and challenges related to IoT, semantic data modelling, annotation, knowledge representation (e.g., modelling for constrained environments, complexity issues, and time/location dependency of data), integration, analysis, and reasoning will be discussed. The tutorial will describe recent developments on creating annotation models and semantic description frameworks for IoT data (e.g., the W3C Semantic Sensor Network ontology). A review of enabling technologies and common scenarios for IoT applications from the data and knowledge engineering point of view will be discussed. Information processing, reasoning, and knowledge extraction, along with existing solutions related to these topics, will be presented. The tutorial summarizes state-of-the-art research and developments on PCS systems, IoT-related ontology development, linked data, domain knowledge integration and management, querying large-scale IoT data, and AI applications for automated knowledge extraction from real-world data.
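As a rough illustration of what semantic annotation of IoT data can look like, the sketch below wraps a raw sensor reading as a JSON-LD observation. It assumes the SOSA terms (`sosa:Observation`, `sosa:madeBySensor`, `sosa:observedProperty`, `sosa:hasSimpleResult`, `sosa:resultTime`) from the 2017 W3C revision of the SSN ontology, which postdates these talks. The function name `annotate_reading` and the `example.org` IRIs are hypothetical and not taken from the tutorial.

```python
import json

# Namespace IRIs of the W3C SOSA vocabulary (core of the 2017 SSN revision)
# and of XML Schema datatypes.
SOSA = "http://www.w3.org/ns/sosa/"
XSD = "http://www.w3.org/2001/XMLSchema#"


def annotate_reading(sensor_iri, property_iri, value, timestamp):
    """Wrap a raw sensor reading as a JSON-LD sosa:Observation (illustrative helper)."""
    return {
        "@context": {"sosa": SOSA, "xsd": XSD},
        "@type": "sosa:Observation",
        "sosa:madeBySensor": {"@id": sensor_iri},
        "sosa:observedProperty": {"@id": property_iri},
        "sosa:hasSimpleResult": value,
        "sosa:resultTime": {"@type": "xsd:dateTime", "@value": timestamp},
    }


if __name__ == "__main__":
    observation = annotate_reading(
        "http://example.org/sensor/thermo-17",          # hypothetical sensor IRI
        "http://example.org/property/airTemperature",   # hypothetical property IRI
        21.4,
        "2014-07-21T09:30:00Z",
    )
    print(json.dumps(observation, indent=2))
```

Once readings from different devices share such a vocabulary, downstream integration and querying can work on the annotations rather than on each device's native format.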
Related: Semantic Sensor Web: http://knoesis.org/projects/ssw
Physical-Cyber-Social Computing: http://wiki.knoesis.org/index.php/PCS
Kno.e.sis Approach to Impactful Research & Training for Exceptional Careers Amit Sheth
Abstract
Kno.e.sis (http://knoesis.org) is a world-class research center that uses semantic, cognitive, and perceptual computing for gathering insights from physical/IoT, cyber/Web, and social and enterprise (e.g., clinical) big data. We innovate and employ semantic web, machine learning, NLP/IR, data mining, network science, and highly scalable computing techniques. Our highly interdisciplinary research impacts health and clinical applications, biomedical and translational research, epidemiology, cognitive science, social good, policy, development, etc. A majority of our $12+ million in active funds comes from the NSF and NIH. In this talk, I will provide an overview of some of our major research projects.
Kno.e.sis is highly successful in its primary mission of exceptional student outcomes: our students have exceptional publication and real-world impact and our PhDs compete with their counterparts from top 10 schools for initial jobs in research universities, top industry research labs, and highly competitive companies. A key reason for Kno.e.sis' success is its unique work culture involving teamwork to solve complex problems. Practically all our work involves real-world challenges, real-world data, interdisciplinary collaborators, path-breaking research to solve challenges, real-world deployments, real-world use, and measurable real-world impact.
In this talk, I will also seek to discuss our choice of research topics and our unique ecosystem that prepares our students for exceptional careers.
TRANSFORMING BIG DATA INTO SMART DATA: Deriving Value via Harnessing Volume, ...Amit Sheth
Keynote given at ICDE2014, April 2014. Details at: http://ieee-icde2014.eecs.northwestern.edu/keynotes.html
A video of a version of this talk is available here: http://youtu.be/8RhpFlfpJ-A
(download to see many hidden slides).
Two versions of this talk, targeted at Smart Energy and Personalized Digital Health domains/apps at: http://wiki.knoesis.org/index.php/Smart_Data
Previous (older) version replaced by this version: http://www.slideshare.net/apsheth/big-data-to-smart-data-keynote
Presentation at the AAAI 2013 Fall Symposium on Semantics for Big Data, Arlington, Virginia, November 15-17, 2013
Additional related material at: http://wiki.knoesis.org/index.php/Smart_Data
Related paper at: http://www.knoesis.org/library/resource.php?id=1903
Abstract: We discuss the nature of Big Data and address the role of semantics in analyzing and processing Big Data that arises in the context of Physical-Cyber-Social Systems. We organize our research around the five V's of Big Data, where four of the Vs are harnessed to produce the fifth V - value. To handle the challenge of Volume, we advocate semantic perception that can convert low-level observational data to higher-level abstractions more suitable for decision-making. To handle the challenge of Variety, we resort to the use of semantic models and annotations of data so that much of the intelligent processing can be done at a level independent of heterogeneity of data formats and media. To handle the challenge of Velocity, we seek to use continuous semantics capability to dynamically create event- or situation-specific models and recognize new concepts, entities, and facts. To handle Veracity, we explore the formalization of trust models and approaches to glean trustworthiness. The above four Vs of Big Data are harnessed by semantics-empowered analytics to derive Value for supporting practical applications transcending the physical-cyber-social continuum.
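The idea of converting low-level observations into higher-level abstractions can be illustrated with a small sketch. It is a simplified set-based reading of semantic perception, not the authors' implementation: `explain` returns the abstractions whose known properties cover everything observed, and `discriminate` suggests which further property would narrow the candidates. The weather knowledge base and both function names are hypothetical.

```python
# Hypothetical background knowledge: the observable properties that each
# high-level abstraction (here, a weather condition) can account for.
KNOWLEDGE = {
    "blizzard": {"low_temperature", "high_wind", "snowfall"},
    "flurry": {"low_temperature", "snowfall"},
    "rainstorm": {"high_wind", "rainfall"},
}


def explain(observed, knowledge=KNOWLEDGE):
    """Return the abstractions that account for every observed property."""
    observed = set(observed)
    return sorted(name for name, props in knowledge.items() if observed <= props)


def discriminate(observed, knowledge=KNOWLEDGE):
    """Return unobserved properties that would narrow the candidate abstractions."""
    observed = set(observed)
    candidates = explain(observed, knowledge)
    # Properties of the remaining candidates that have not been observed yet.
    unobserved = set().union(*(knowledge[c] for c in candidates)) - observed
    useful = set()
    for prop in unobserved:
        holders = [c for c in candidates if prop in knowledge[c]]
        # A property helps only if it splits the candidates into two groups.
        if 0 < len(holders) < len(candidates):
            useful.add(prop)
    return sorted(useful)


if __name__ == "__main__":
    seen = {"low_temperature", "snowfall"}
    print(explain(seen))        # candidate abstractions
    print(discriminate(seen))   # what to observe next
```

In this toy setting, observing low temperature and snowfall leaves both "blizzard" and "flurry" as explanations, and checking for high wind is the one observation that separates them.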
Smart Data - How you and I will exploit Big Data for personalized digital hea...Amit Sheth
Amit Sheth's keynote at IEEE BigData 2014, Oct 29, 2014.
Abstract from:
http://cci.drexel.edu/bigdata/bigdata2014/keynotespeech.htm
Big Data has captured a lot of interest in industry, with the emphasis on the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity, and their applications to drive value for businesses. Recently, there has been rapid growth in situations where a big data challenge relates to making individually relevant decisions. A key example is personalized digital health, which relates to making better decisions about our health, fitness, and well-being. Consider, for instance, understanding the reasons for and avoiding an asthma attack based on Big Data in the form of personal health signals (e.g., physiological data measured by devices/sensors or Internet of Things around humans, on the humans, and inside/within the humans), public health signals (e.g., information coming from the healthcare system such as hospital admissions), and population health signals (such as Tweets by people related to asthma occurrences and allergens, Web services providing pollen and smog information). However, no individual has the ability to process all these data without the help of appropriate technology, and each human has a different set of relevant data!
In this talk, I will describe Smart Data that is realized by extracting value from Big Data, to benefit not just large companies but each individual. If my child is an asthma patient, for all the data relevant to my child with the four V-challenges, what I care about is simply, “How is her current health, and what is the risk of having an asthma attack in her current situation (now and today), especially if that risk has changed?” As I will show, Smart Data that gives such personalized and actionable information will need to utilize metadata, use domain-specific knowledge, employ semantics and intelligent processing, and go beyond traditional reliance on ML and NLP. I will motivate the need for a synergistic combination of techniques similar to the close interworking of the top brain and the bottom brain in the cognitive models.
For harnessing Volume, I will discuss the concept of Semantic Perception, that is, how to convert massive amounts of data into information, meaning, and insight useful for human decision-making. For dealing with Variety, I will discuss experience in using agreement represented in the form of ontologies, domain models, or vocabularies, to support semantic interoperability and integration. For Velocity, I will discuss somewhat more recent work on Continuous Semantics, which seeks to use dynamically created models of new objects, concepts, and relationships, using them to better understand new cues in the data that capture rapidly evolving events and situations.
Smart Data applications in development at Kno.e.sis come from the domains of personalized health, energy, disaster response, and smart city.
Presented at the Panel on Sensor, Data, Analytics and Integration in Advanced Manufacturing, at the Connected Manufacturing track of the Bosch-USA organized "Leveraging Public-Private Partnerships for Regional Growth Summit". Panel statement: Sensors, data and analytics are the core of any smart manufacturing system. What are the main challenges to create actionable outputs, replicate systems and scale efficiency gains across industries?
Moderator: Thomas Stiedl, Bosch
Panelists:
1. Amit Sheth, Wright State University
2. Howie Choset, Carnegie Mellon University
3. Nagi Gebraeel, Georgia Institute of Technology
4. Brian Anthony, Massachusetts Institute of Technology
5. Yarom Polosky, Oak Ridge National Laboratory
For in-depth look:
Smart IoT: IoT as a human agent, human extension, and human complement
http://amitsheth.blogspot.com/2015/03/smart-iot-iot-as-human-agent-human.html
Semantic Gateway: http://knoesis.org/library/resource.php?id=2154
SSN Ontology: http://knoesis.org/library/resource.php?id=1659
Applications of Multimodal Physical (IoT), Cyber and Social Data for Reliable and Actionable Insights: http://knoesis.org/library/resource.php?id=2018
Smart Data: Transforming Big Data into Smart Data...: http://wiki.knoesis.org/index.php/Smart_Data
Historic use of the term Smart Data (2004): http://www.scribd.com/doc/186588820
Physical Cyber Social Computing: An early 21st century approach to Computing ...Amit Sheth
Keynote given at WiMS 2013 Conference, June 12-14 2013, Madrid, Spain. http://aida.ii.uam.es/wims13/keynotes.php
Video of this talk at: http://videolectures.net/wims2013_sheth_physical_cyber_social_computing/
More information at: http://wiki.knoesis.org/index.php/PCS
and http://knoesis.org/projects/ssw/
Replacing earlier versions: http://www.slideshare.net/apsheth/physical-cyber-social-computing & http://www.slideshare.net/apsheth/semantics-empowered-physicalcybersocial-systems-for-earthcube
Abstract: The proper role of technology to improve human experience has been discussed by visionaries and scientists from the early days of computing and electronic communication. Technology now plays an increasingly important role in facilitating and improving personal and social activities and engagements, decision making, interaction with physical and social worlds, generating insights, and just about anything that an intelligent human seeks to do. I have used the term Computing for Human Experience (CHE) [1] to capture this essential role of technology in a human centric vision. CHE emphasizes the unobtrusive, supportive and assistive role of technology in improving human experience, so that technology “takes into account the human world and allows computers themselves to disappear in the background” (Mark Weiser [2]).
In this talk, I will portray physical-cyber-social (PCS) computing that takes ideas from, and goes significantly beyond, the current progress in cyber-physical systems, socio-technical systems and cyber-social systems to support CHE [3]. I will exemplify future PCS application scenarios in healthcare and traffic management that are supported by (a) a deeper and richer semantic interdependence and interplay between sensors and devices at physical layers, (b) rich technology-mediated social interactions, and (c) the gathering and application of collective intelligence characterized by massive and contextually relevant background knowledge and advanced reasoning in order to bridge machine and human perceptions. I will share an example of PCS computing using semantic perception [4], which converts low-level, heterogeneous, multimodal and contextually relevant data into high-level abstractions that can provide insights and assist humans in making complex decisions. The key proposition is that PCS computing will need to move away from traditional data processing to multi-tier computation along the data-information-knowledge-wisdom dimension, which supports reasoning to convert data into abstractions that humans are adept at using.
[1] A. Sheth, Computing for Human Experience
[2] M. Weiser, The Computer for the 21st Century
[3] A. Sheth, Semantics empowered Cyber-Physical-Social Systems
[4] C. Henson, A. Sheth, K. Thirunarayan, Semantic Perception: Converting Sensory Observations to Abstractions
This is a brief review of current multi-disciplinary and collaborative projects at Kno.e.sis led by Prof. Amit Sheth. They cover research in big social data, IoT, semantic web, semantic sensor web, health informatics, personalized digital health, social data for social good, smart city, crisis informatics, digital data for the materials genome initiative, etc. Dec 2015 edition.
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...Amit Sheth
Keynote at the Workshop on Building Research Collaboration: Electricity Systems. Purdue University, West Lafayette, IN. Aug 28-29, 2013.
Abstract:
Big Data has captured much interest in research and industry, with anticipation of better decisions, efficient organizations, and many new jobs. Much of the emphasis is on technology that handles volume, including storage and computational techniques to support analysis (Hadoop, NoSQL, MapReduce, etc.), and the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity. However, the most important feature of data, the raison d'etre, is neither volume, variety, velocity, nor veracity -- but value. In this talk, I will emphasize the significance of Smart Data, and discuss how it can be realized by extracting value from Big Data. Accomplishing this task requires organized ways to harness and overcome the original four V-challenges; and while the technologies currently touted may provide some necessary infrastructure, they are far from sufficient. In particular, we will need to utilize metadata, employ semantics and intelligent processing, and leverage some of the extensive work that predates Big Data.
For achieving energy sustainability, Smart Grids are known to transform the way we generate, distribute, and consume power. An unprecedented amount of data is being collected from smart meters, smart devices, and sensors throughout the power grid. I will discuss the central question of deriving Value from the entire smart grid data deluge by discussing novel algorithms and techniques such as Semantic Perception for dealing with Volume, use of ontologies and vocabularies for dealing with Variety, and Continuous Semantics for dealing with Velocity. I will discuss scenarios that exemplify the process of deriving Value from Big Data in the context of Smart Grid.
Additional background is at: http://wiki.knoesis.org/index.php/Smart_Data
A previous version of this talk with more technical details but not focused on energy: http://j.mp/SmatData
There is a rapid intertwining of sensors and mobile devices into the fabric of our lives. This has resulted in unprecedented growth in the number of observations from the physical and social worlds reported in the cyber world. A system of sensing and computational components embedded in the physical world is termed a Cyber-Physical System (CPS). The current science of CPS has yet to effectively integrate citizen observations into CPS analysis. We demonstrate the role of citizen observations in CPS and propose a novel approach to perform a holistic analysis of machine and citizen sensor observations. Specifically, we demonstrate the complementary, corroborative, and timely aspects of citizen sensor observations compared to machine sensor observations in Physical-Cyber-Social (PCS) Systems.
Physical processes are inherently complex and embody uncertainties. They manifest as machine and citizen sensor observations in PCS Systems. We propose a generic framework to move from observations to decision-making and actions in PCS systems consisting of: (a) PCS event extraction, (b) PCS event understanding, and (c) PCS action recommendation. We demonstrate the role of Probabilistic Graphical Models (PGMs) as a unified framework to deal with uncertainty, complexity, and dynamism that helps translate observations into actions. Data-driven approaches alone are not guaranteed to synthesize PGMs that accurately reflect real-world dependencies. To overcome this limitation, we propose to empower PGMs using declarative domain knowledge. Specifically, we propose four techniques: (a) automatic creation of massive training data for Conditional Random Fields (CRFs) using domain knowledge of entities, used in PCS event extraction, (b) Bayesian Network structure refinement using causal knowledge from ConceptNet, used in PCS event understanding, (c) knowledge-driven piecewise linear approximation of nonlinear time series dynamics using Linear Dynamical Systems (LDS), used in PCS event understanding, and (d) transforming knowledge of goals and actions into a Markov Decision Process (MDP) model, used in PCS action recommendation.
We evaluate the benefits of the proposed techniques on real-world applications involving traffic analytics and Internet of Things (IoT).
Abstract: http://j.mp/1MhWWei
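As a rough illustration of technique (d) above, the sketch below encodes a tiny, entirely hypothetical route-recommendation problem as an MDP and solves it with standard value iteration. The states, actions, probabilities, and rewards are invented for illustration and are not taken from the work described; the abstract does not say which MDP solver was used.

```python
# Toy MDP for action recommendation. All states, actions, transition
# probabilities, and rewards below are hypothetical illustrations.
from typing import Dict, List, Tuple

# transitions[state][action] -> list of (probability, next_state, reward)
Transitions = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

TRANSITIONS: Transitions = {
    "free_flow": {
        "stay_on_route": [(0.9, "free_flow", 1.0), (0.1, "congested", -1.0)],
        "reroute": [(1.0, "free_flow", 0.5)],
    },
    "congested": {
        "stay_on_route": [(0.3, "free_flow", 1.0), (0.7, "congested", -1.0)],
        "reroute": [(0.8, "free_flow", 0.5), (0.2, "congested", -1.0)],
    },
}


def q_value(values: Dict[str, float],
            outcomes: List[Tuple[float, str, float]],
            gamma: float) -> float:
    """Expected discounted return of one action given current state values."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)


def value_iteration(transitions: Transitions, gamma: float = 0.9,
                    tol: float = 1e-8):
    """Return (state values, greedy policy) computed by value iteration."""
    values = {s: 0.0 for s in transitions}
    while True:
        delta = 0.0
        for s, actions in transitions.items():
            best = max(q_value(values, o, gamma) for o in actions.values())
            delta = max(delta, abs(best - values[s]))
            values[s] = best
        if delta < tol:
            break
    policy = {
        s: max(actions, key=lambda a: q_value(values, actions[a], gamma))
        for s, actions in transitions.items()
    }
    return values, policy


VALUES, POLICY = value_iteration(TRANSITIONS)
```

With these made-up numbers the greedy policy stays on the route in free flow and reroutes under congestion; different domain knowledge about goals and actions would yield a different transition table and therefore a different recommendation.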
Healthcare applications now have the ability to exploit big data in all its complexity. A crucial challenge is to achieve interoperability or integration so that a variety of content from diverse physical (IoT), cyber (web-based), and social sources, with diverse formats and modalities (text, image, video), can be used in analysis, insight, and decision-making. At Kno.e.sis, an Ohio Center of Excellence in BioHealth Innovation, we have a variety of large, collaborative healthcare/clinical/biomedical projects, all involving domain experts and end-users, and access to real-world data that include: clinical/EMR data (of individual patients and that related to public health), data from a variety of sensors (IoT) on and around patients (measuring real-time physiological and environmental observations), social data (Twitter, Web forums, PatientsLikeMe), Web search logs, etc. Key projects include: Prescription drug abuse online-surveillance and epidemiology (PREDOSE), Social media analysis to monitor cannabis and synthetic cannabinoid use (eDrugTrends), Modeling Social Behavior for Healthcare Utilization in Depression, Medical Information Decision Assistant and Support (MIDAS) with application to musculoskeletal issues, kHealth: A Semantic Approach to Proactive, Personalized Asthma Management Using Multimodal Sensing (also for Dementia), and Cardiology Semantic Analysis System (with applications to Computer Assisted Coding and Computerized Document Improvement).
This talk will review how ontologies or knowledge graphs play a central role in supporting semantic filtering, interoperability and integration (including the issues such as disambiguation), reasoning and decision-making in all our health-centric research and applications. Additional relevant information is at the speaker’s HCLS page. http://knoesis.org/amit/hcls
Citizen Sensor Data Mining, Social Media Analytics and Applications - Amit Sheth
Opening talk at Singapore Symposium on Sentiment Analysis (S3A), February 6, 2015, Singapore. http://s3a.sentic.net/#s3a2015
Abstract
With the rapid rise in the popularity of social media, and near ubiquitous mobile access, the sharing of observations and opinions has become common-place. This has given us an unprecedented access to the pulse of a populace and the ability to perform analytics on social data to support a variety of socially intelligent applications -- be it for brand tracking and management, crisis coordination, organizing revolutions or promoting social development in underdeveloped and developing countries.
I will review: 1) understanding and analysis of informal text, esp. microblogs (e.g., issues of cultural entity extraction and role of semantic/background knowledge enhanced techniques), and 2) how we built Twitris, a comprehensive social media analytics (social intelligence) platform.
I will describe the analysis capabilities along three dimensions: spatio-temporal-thematic, people-content-network, and sentiment-emotion-intent. I will couple technical insights with identification of computational techniques and real-world examples using live demos of Twitris (http://twitris2.knoesis.org).
Social media provides a natural platform for the dynamic emergence of citizen (as) sensor communities, where citizens share information, express opinions, and engage in discussions. Often, such an Online Citizen Sensor Community (CSC) has stated or implied goals related to workflows of organizational actors with defined roles and responsibilities. For example, a community of crisis response volunteers may inform the prioritization of responses to resource needs (e.g., medical) to assist the managers of crisis response organizations. However, in a CSC, there are challenges related to information overload for organizational actors, including finding reliable information providers and finding actionable information from citizens. This threatens the awareness and articulation of workflows needed to enable cooperation between citizens and organizational actors. CSCs supported by Web 2.0 social media platforms offer new opportunities and pose new challenges. This work addresses issues of ambiguity in interpreting unconstrained natural language (e.g., ‘wanna help’ appearing in both types of messages for asking and offering help during crises), sparsity of user and group behaviors (e.g., expression of specific intent), and diversity of user demographics (e.g., medical or technical professional) for interpreting user-generated data of citizen sensors. Interdisciplinary research involving social and computer sciences is essential to address these socio-technical issues in CSCs, and to give organizational actors better access to user-generated data at a higher level of information abstraction. This study presents a novel web information processing framework focused on actors and actions in cooperation, called Identify-Match-Engage (IME), which fuses top-down and bottom-up computing approaches to design a cooperative web information system between citizens and organizational actors. It includes (a) identification of action-related seeking-offering intent behaviors from short, unstructured text documents using a classification model based on both declarative and statistical knowledge, (b) matching of intentions about seeking and offering, and (c) engagement models of users and groups in a CSC to prioritize whom to engage, by modeling context with social theories using features of users, their generated content, and their dynamic network connections in the user interaction networks. The results show that fusing top-down knowledge-driven and bottom-up data-driven approaches improves modeling efficiency over conventional bottom-up approaches alone for modeling intent and engagement. Applications of this work include use of the engagement interface tool during recent crises to enable efficient citizen engagement for spreading critical information about prioritized needs, to ensure that citizens donate only the required supplies. The engagement interface application also won the United Nations ICT agency ITU's Young Innovator 2014 award.
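The identify and match steps of IME can be pictured with a deliberately small sketch. It covers only the declarative (rule-based) side; the statistical classifier and the engagement models described above are omitted. The patterns, the resource list, and the function names are hypothetical and chosen only for illustration.

```python
import re
from typing import List, Set, Tuple

# Hypothetical declarative knowledge: patterns for seeking and offering intent,
# plus a small vocabulary of resources. None of this comes from the IME work.
SEEK_PATTERNS = [r"\bneed(s|ed)?\b", r"\blooking for\b",
                 r"\bwhere can i (get|find)\b"]
OFFER_PATTERNS = [r"\b(can|want to|wanna) (donate|give|offer)\b",
                  r"\bdonating\b", r"\bhave extra\b"]
RESOURCES = ["water", "blankets", "medical supplies"]


def classify_intent(text: str) -> str:
    """Label a message as 'offer', 'seek', or 'other' using the patterns."""
    lowered = text.lower()
    if any(re.search(p, lowered) for p in OFFER_PATTERNS):
        return "offer"
    if any(re.search(p, lowered) for p in SEEK_PATTERNS):
        return "seek"
    return "other"


def find_resources(text: str) -> Set[str]:
    """Return the known resources mentioned in a message."""
    lowered = text.lower()
    return {r for r in RESOURCES if r in lowered}


def match_messages(messages: List[str]) -> List[Tuple[str, str]]:
    """Pair each seeking message with offers that share a resource."""
    seeks = [m for m in messages if classify_intent(m) == "seek"]
    offers = [m for m in messages if classify_intent(m) == "offer"]
    return [(s, o) for s in seeks for o in offers
            if find_resources(s) & find_resources(o)]


MESSAGES = [
    "We need water at the shelter",
    "I can donate water and blankets",
    "wanna help",
    "Looking for medical supplies",
]
MATCHES = match_messages(MESSAGES)
```

Note that the ambiguous message "wanna help" falls through to "other" here, which is exactly the kind of case where rules alone fall short and a statistical model would be needed.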
Cognitive Computing by Professor Gordon Pipa - diannepatricia
Professor Dr. Gordon Pipa, University of Osnabrueck, Germany is making this presentation for the Cognitive Systems Institute Speaker Series on May 26, 2016.
Ohio Center of Excellence in Knowledge-Enabled Computing at Wright State (Kno.e.sis)
Center overview: http://bit.ly/coe-k
Invitation: http://bit.ly/COE-invite
Smart Data and real-world semantic web applications (2004) - Amit Sheth
Probably the first recorded use of "smart data" for achieving the Semantic Web and for realizing productivity, efficiency, and effectiveness gains by using semantics to transform raw data into Smart Data.
2013 retake on this is discussed at: http://wiki.knoesis.org/index.php/Smart_Data
Understanding speed and travel-time dynamics in response to various city related events is an important and challenging problem. Sensor data (numerical) containing average speed of vehicles passing through a road link can be interpreted in terms of traffic related incident reports from city authorities and social media data (textual), providing a complementary understanding of traffic dynamics. State-of-the-art research is focused on either analyzing sensor observations or citizen observations; we seek to exploit both in a synergistic manner.
We demonstrate the role of domain knowledge in capturing the non-linearity of speed and travel-time dynamics by segmenting speed and travel-time observations into simpler components amenable to description using linear models such as Linear Dynamical System (LDS). Specifically, we propose Restricted Switching Linear Dynamical System (RSLDS) to model normal speed and travel time dynamics and thereby characterize anomalous dynamics. We utilize the city traffic events extracted from text to explain anomalous dynamics. We present a large scale evaluation of the proposed approach on a real-world traffic and twitter dataset collected over a year with promising results.
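To make the idea of knowledge-driven segmentation concrete, here is a much-simplified sketch. It is not RSLDS: it fits one first-order linear model per regime label, where the labels (for example "off_peak" and "peak") are assumed to come from domain knowledge, and it flags observations whose one-step prediction error is large. All function names, labels, and numbers are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple


def fit_ar1(pairs: List[Tuple[float, float]]) -> Tuple[float, float]:
    """Least-squares fit of x_next = a * x_prev + b over (x_prev, x_next) pairs."""
    xs = [prev for prev, _ in pairs]
    ys = [nxt for _, nxt in pairs]
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    return a, my - a * mx


def fit_regimes(speeds: List[float],
                regimes: List[str]) -> Dict[str, Tuple[float, float, float]]:
    """Fit one linear model (a, b, residual std) per knowledge-defined regime."""
    grouped: Dict[str, List[Tuple[float, float]]] = {}
    for t in range(1, len(speeds)):
        grouped.setdefault(regimes[t], []).append((speeds[t - 1], speeds[t]))
    models = {}
    for regime, pairs in grouped.items():
        a, b = fit_ar1(pairs)
        residuals = [nxt - (a * prev + b) for prev, nxt in pairs]
        models[regime] = (a, b, pstdev(residuals))
    return models


def find_anomalies(speeds: List[float], regimes: List[str],
                   models: Dict[str, Tuple[float, float, float]],
                   k: float = 3.0, min_sigma: float = 1.0) -> List[int]:
    """Indices whose one-step prediction error exceeds k * max(sigma, min_sigma)."""
    flagged = []
    for t in range(1, len(speeds)):
        a, b, sigma = models[regimes[t]]
        if abs(speeds[t] - (a * speeds[t - 1] + b)) > k * max(sigma, min_sigma):
            flagged.append(t)
    return flagged


# Hypothetical "normal" history used for fitting (speeds in km/h).
TRAIN_SPEEDS = [60, 61, 59, 60, 61, 60, 30, 31, 29, 30, 31, 30]
TRAIN_REGIMES = ["off_peak"] * 6 + ["peak"] * 6
MODELS = fit_regimes(TRAIN_SPEEDS, TRAIN_REGIMES)

# New observations with a sudden drop at index 3 (e.g. an incident).
NEW_SPEEDS = [60, 61, 60, 12, 60, 30, 31, 30]
NEW_REGIMES = ["off_peak"] * 5 + ["peak"] * 3
ANOMALIES = find_anomalies(NEW_SPEEDS, NEW_REGIMES, MODELS)
```

The drop at index 3 is flagged. Because each prediction conditions on the previous observation, the step right after an incident can be flagged as well. Anomalous indices found this way are the points one would then try to explain with traffic events extracted from text.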
Semantics-empowered Smart City applications: today and tomorrow - Amit Sheth
Citation:
Amit Sheth, "Semantics-empowered Smart City applications: today and tomorrow," Keynote presented at the 6th Workshop on Semantics for Smarter Cities (S4SC 2015), collocated with the 14th International Semantic Web Conference (ISWC2015), Bethlehem, PA, USA. Oct 11-12, 2015.
http://kat.ee.surrey.ac.uk/wssc/index.html
Abstract: There has been massive growth in potentially relevant physical (sensor/IoT), cyber (Web), and social data related to activities and operations of cities and citizens. As part of our participation in smart city projects, including the EU-funded CityPulse project, we have analyzed a large number of use cases with inputs from city administrations and end users, and developed a few early applications. In this talk, I will present some exciting smart city applications possible today and venture to speculate on some future ones where Big Data technologies and semantic computing, including the use of domain knowledge, play a critical role.
This is a version of a series of talks given at NCSA-UIUC's director seminar, IBM Almaden, HP Labs, DERI-Galway, City Univ of Dublin, and KMI-Open University during Aug-Oct 2010 (replaces earlier keynote version). It deals with a couple of items of the vision outlined at http://bit.ly/4ynB7A
A video of this presentation: http://www.ncsa.illinois.edu/News/Video/2010/sheth.html
Link to this talk as http://bit.ly/CHE-talk
Cities are composed of complex systems with physical, cyber, and social components. Current works on extracting and understanding city events mainly rely on technology enabled infrastructure to observe and record events. In this work, we propose an approach to leverage citizen observations of various city systems and services such as traffic, public transport, water supply, weather, sewage, and public safety as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present a novel training data creation process for training sequence labeling models. Our automatic training data creation process utilizes instance level domain knowledge (e.g., locations in a city, possible event terms). We compare this automated annotation process to a state-of-the-art tool that needs manually created training data and show that it has comparable performance in annotation tasks. An aggregation algorithm is then presented for event extraction from annotated text. We carry out a comprehensive evaluation of the event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over four months from San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social stream for extracting city events.
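The automatic training data creation step can be pictured as dictionary-driven labeling: known locations and event terms are matched against the text to produce BIO tags, which could then train a sequence labeling model. The sketch below shows only that labeling step. The term lists, tag names, and function names are hypothetical, not taken from the work described.

```python
import re
from typing import List, Tuple

# Hypothetical instance-level domain knowledge; a real system would load
# city locations and event terms from external sources.
LOCATIONS = ["golden gate bridge", "market street", "bay bridge"]
EVENT_TERMS = ["accident", "road closure", "flooding"]


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def label_phrases(tokens: List[str], phrases: List[str], tag: str,
                  labels: List[str]) -> None:
    """Write B-/I- tags for every phrase match that does not overlap a label."""
    # Longer phrases first, so multi-word matches win over shorter ones.
    for phrase in sorted(phrases, key=lambda p: -len(p.split())):
        words = phrase.split()
        n = len(words)
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == words and all(
                    lab == "O" for lab in labels[i:i + n]):
                labels[i] = "B-" + tag
                for j in range(i + 1, i + n):
                    labels[j] = "I-" + tag


def annotate(text: str) -> List[Tuple[str, str]]:
    """Return (token, BIO label) pairs usable as sequence-labeling training data."""
    tokens = tokenize(text)
    labels = ["O"] * len(tokens)
    label_phrases(tokens, LOCATIONS, "LOC", labels)
    label_phrases(tokens, EVENT_TERMS, "EVENT", labels)
    return list(zip(tokens, labels))


EXAMPLE = annotate("Accident on Bay Bridge, avoid Market Street")
```

Running this over a large stream of city-related tweets would yield labeled sequences without manual annotation, at the cost of noise wherever a dictionary term is used in a different sense.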
How HudsonAlpha Innovates on IT for Research-Driven Education, Genomic Medici... - Dana Gardner
Transcript of a discussion on how HudsonAlpha leverages modern IT infrastructure and big data analytics to power research projects as well as pioneering genomic medicine findings.
Presented at the Panel on Sensor, Data, Analytics and Integration in Advanced Manufacturing, at the Connected Manufacturing track of the Bosch-USA organized "Leveraging Public-Private Partnerships for Regional Growth Summit". Panel statement: Sensors, data and analytics are the core of any smart manufacturing system. What are the main challenges to create actionable outputs, replicate systems and scale efficiency gains across industries?
Moderator: Thomas Stiedl, Bosch
Panelists:
1. Amit Sheth, Wright State University
2. Howie Choset, Carnegie Mellon University
3. Nagi Gebraeel, Georgia Institute of Technology
4. Brian Anthony, Massachusetts Institute of Technology
5. Yarom Polosky, Oak Ridge National Laboratory
For in-depth look:
Smart IoT: IoT as a human agent, human extension, and human complement
http://amitsheth.blogspot.com/2015/03/smart-iot-iot-as-human-agent-human.html
Semantic Gateway: http://knoesis.org/library/resource.php?id=2154
SSN Ontology: http://knoesis.org/library/resource.php?id=1659
Applications of Multimodal Physical (IoT), Cyber and Social Data for Reliable and Actionable Insights: http://knoesis.org/library/resource.php?id=2018
Smart Data: Transforming Big Data into Smart Data...: http://wiki.knoesis.org/index.php/Smart_Data
Historic use of the term Smart Data (2004): http://www.scribd.com/doc/186588820
Physical Cyber Social Computing: An early 21st century approach to Computing ... - Amit Sheth
Keynote given at WiMS 2013 Conference, June 12-14 2013, Madrid, Spain. http://aida.ii.uam.es/wims13/keynotes.php
Video of this talk at: http://videolectures.net/wims2013_sheth_physical_cyber_social_computing/
More information at: http://wiki.knoesis.org/index.php/PCS
and http://knoesis.org/projects/ssw/
Replacing earlier versions: http://www.slideshare.net/apsheth/physical-cyber-social-computing & http://www.slideshare.net/apsheth/semantics-empowered-physicalcybersocial-systems-for-earthcube
Abstract: The proper role of technology to improve human experience has been discussed by visionaries and scientists from the early days of computing and electronic communication. Technology now plays an increasingly important role in facilitating and improving personal and social activities and engagements, decision making, interaction with physical and social worlds, generating insights, and just about anything that an intelligent human seeks to do. I have used the term Computing for Human Experience (CHE) [1] to capture this essential role of technology in a human centric vision. CHE emphasizes the unobtrusive, supportive and assistive role of technology in improving human experience, so that technology “takes into account the human world and allows computers themselves to disappear in the background” (Mark Weiser [2]).
In this talk, I will portray physical-cyber-social (PCS) computing that takes ideas from, and goes significantly beyond, the current progress in cyber-physical systems, socio-technical systems and cyber-social systems to support CHE [3]. I will exemplify future PCS application scenarios in healthcare and traffic management that are supported by (a) a deeper and richer semantic interdependence and interplay between sensors and devices at physical layers, (b) rich technology mediated social interactions, and (c) the gathering and application of collective intelligence characterized by massive and contextually relevant background knowledge and advanced reasoning in order to bridge machine and human perceptions. I will share an example of PCS computing using semantic perception [4], which converts low-level, heterogeneous, multimodal and contextually relevant data into high-level abstractions that can provide insights and assist humans in making complex decisions. The key proposition is to explain that PCS computing will need to move away from traditional data processing to multi-tier computation along data-information-knowledge-wisdom dimension that supports reasoning to convert data into abstractions that humans are adept at using.
[1] A. Sheth, Computing for Human Experience
[2] M. Weiser, The Computer for the 21st Century
[3] A. Sheth, Semantics empowered Cyber-Physical-Social Systems
[4] C. Henson, A. Sheth, K. Thirunarayan, Semantic Perception: Converting Sensory Observations to Abstractions
This is a brief a brief review of current multi-disciplinary and collaborative projects at Kno.e.sis led by Prof. Amit Sheth. They cover research in big social data, IoT, semantic web, semantic sensor web, health informatics, personalized digital health, social data for social good, smart city, crisis informatics, digital data for material genome initiative, etc. Dec 2015 edition.
Transforming Big Data into Smart Data for Smart Energy: Deriving Value via ha...Amit Sheth
Keynote at the Workshop on Building Research Collaboration: Electricity Systems. Purdue University, West Lafayette, IN. Aug 28-29, 2013.
Abstract:
Big Data has captured much interest in research and industry, with anticipation of better decisions, efficient organizations, and many new jobs. Much of the emphasis is on technology that handles volume, including storage and computational techniques to support analysis (Hadoop, NoSQL, MapReduce, etc), and the challenges of the four Vs of Big Data: Volume, Variety, Velocity, and Veracity. However, the most important feature of data, the raison d'etre, is neither volume, variety, velocity, nor veracity -- but value. In this talk, I will emphasize the significance of Smart Data, and discuss how it is can be realized by extracting value from Big Data. Accomplishing this task requires organized ways to harness and overcome the original four V-challenges; and while the technologies currently touted may provide some necessary infrastructure-- they are far from sufficient. In particular, we will need to utilize metadata, employ semantics and intelligent processing, and leverage some of the extensive work that predates Big Data.
For achieving energy sustainability, Smart Grids are known to transform the way we generate, distribute, and consume power. Unprecedented amount of data is being collected from smart meters, smart devices, and sensors all throughout the power grid. I will discuss the central question of deriving Value from the entire smart grid data deluge by discussing novel algorithms and techniques such as Semantic Perception for dealing with Velocity, use of ontologies and vocabularies for dealing with Variety, and Continuous Semantics for dealing with Velocity. I will discuss scenarios that exemplify the process of deriving Value from Big Data in the context of Smart Grid.
Additional background is at: http://wiki.knoesis.org/index.php/Smart_Data
A previous version of this talk with more technical details but not focused on energy: http://j.mp/SmatData
There is a rapid intertwining of sensors and mobile devices into the fabric of our lives. This has resulted in unprecedented growth in the number of observations from the physical and social worlds reported in the cyber world. Sensing and computational components embedded in the physical world is termed as Cyber-Physical System (CPS). Current science of CPS is yet to effectively integrate citizen observations in CPS analysis. We demonstrate the role of citizen observations in CPS and propose a novel approach to perform a holistic analysis of machine and citizen sensor observations. Specifically, we demonstrate the complementary, corroborative, and timely aspects of citizen sensor observations compared to machine sensor observations in Physical-Cyber-Social (PCS) Systems.
Physical processes are inherently complex and embody uncertainties. They manifest as machine and citizen sensor observations in PCS Systems. We propose a generic framework to move from observations to decision-making and actions in PCS systems consisting of: (a) PCS event extraction, (b) PCS event understanding, and (c) PCS action recommendation. We demonstrate the role of Probabilistic Graphical Models (PGMs) as a unified framework to deal with uncertainty, complexity, and dynamism that help translate observations into actions. Data driven approaches alone are not guaranteed to be able to synthesize PGMs reflecting real-world dependencies accurately. To overcome this limitation, we propose to empower PGMs using the declarative domain knowledge. Specifically, we propose four techniques: (a) automatic creation of massive training data for Conditional Random Fields (CRFs) using domain knowledge of entities used in PCS event extraction, (b) Bayesian Network structure refinement using causal knowledge from Concept Net used in PCS event understanding, (c) knowledge-driven piecewise linear approximation of nonlinear time series dynamics using Linear Dynamical Systems (LDS) used in PCS event understanding, and the (d) transforming knowledge of goals and actions into a Markov Decision Process (MDP) model used in PCS action recommendation.
We evaluate the benefits of the proposed techniques on real-world applications involving traffic analytics and Internet of Things (IoT).
Abstract: http://j.mp/1MhWWei
Healthcare applications now have the ability to exploit big data in all its complexity. A crucial challenge is to achieve interoperability or integration so that a variety of content from diverse physical (IoT)- cyber (web-based)- and social sources, with diverse formats and modality (text, image, video), can be used in analysis, insight, and decision-making. At Kno.e.sis, an Ohio Center of Excellence in BioHealth Innovation, we have a variety of large, collaborative healthcare/clinical/biomedical projects, all involving domain experts and end-users, and access to real world data that include: clinical/EMR data (of individual patients and that related to public health), data from a variety of sensors (IoT) on and around patients measuring real-time physiological and environmental observations), social data (Twitter, Web forums, PatientsLikeMe), Web search logs, etc. Key projects include: Prescription drug abuse online-surveillance and epidemiology (PREDOSE), Social media analysis to monitor cannabis and synthetic cannabinoid use (eDrugTrends), Modeling Social Behavior for Healthcare Utilization in Depression, Medical Information Decision Assistant and Support (MIDAS) with application to musculoskeletal issues, kHealth: A Semantic Approach to Proactive, Personalized Asthma Management Using Multimodal Sensing (also for Dementia), and Cardiology Semantic Analysis System (with applications to Computer Assisted Coding and Computerized Document Improvement).
This talk will review how ontologies or knowledge graphs play a central role in supporting semantic filtering, interoperability and integration (including the issues such as disambiguation), reasoning and decision-making in all our health-centric research and applications. Additional relevant information is at the speaker’s HCLS page. http://knoesis.org/amit/hcls
Citizen Sensor Data Mining, Social Media Analytics and ApplicationsAmit Sheth
Opening talk at Singapore Symposium on Sentiment Analysis (S3A), February 6, 2015, Singapore. http://s3a.sentic.net/#s3a2015
Abstract
With the rapid rise in the popularity of social media, and near ubiquitous mobile access, the sharing of observations and opinions has become common-place. This has given us an unprecedented access to the pulse of a populace and the ability to perform analytics on social data to support a variety of socially intelligent applications -- be it for brand tracking and management, crisis coordination, organizing revolutions or promoting social development in underdeveloped and developing countries.
I will review: 1) understanding and analysis of informal text, esp. microblogs (e.g., issues of cultural entity extraction and role of semantic/background knowledge enhanced techniques), and 2) how we built Twitris, a comprehensive social media analytics (social intelligence) platform.
I will describe the analysis capabilities along three dimensions: spatio-temporal-thematic, people-content-network, and sentiment-emption-intent. I will couple technical insights with identification of computational techniques and real-world examples using live demos of Twitris (http://twitris2.knoesis.org).
Social media provides a natural platform for dynamic emergence of citizen (as) sensor communities, where the citizens share information, express opinions, and engage in discussions. Often such a Online Citizen Sensor Community (CSC) has stated or implied goals related to workflows of organizational actors with defined roles and responsibilities. For example, a community of crisis response volunteers, for informing the prioritization of responses for resource needs (e.g., medical) to assist the managers of crisis response organizations. However, in CSC, there are challenges related to information overload for organizational actors, including finding reliable information providers and finding the actionable information from citizens. This threatens awareness and articulation of workflows to enable cooperation between citizens and organizational actors. CSCs supported by Web 2.0 social media platforms offer new opportunities and pose new challenges. This work addresses issues of ambiguity in interpreting unconstrained natural language (e.g., ‘wanna help’ appearing in both types of messages for asking and offering help during crises), sparsity of user and group behaviors (e.g., expression of specific intent), and diversity of user demographics (e.g., medical or technical professional) for interpreting user-generated data of citizen sensors. Interdisciplinary research involving social and computer sciences is essential to address these socio-technical issues in CSC, and allow better accessibility to user-generated data at higher level of information abstraction for organizational actors. This study presents a novel web information processing framework focused on actors and actions in cooperation, called Identify-Match-Engage (IME), which fuses top-down and bottom-up computing approaches to design a cooperative web information system between citizens and organizational actors. It includes a.) 
identification of action related seeking-offering intent behaviors from short, unstructured text documents using both declarative and statistical knowledge based classification model, b.) matching of intentions about seeking and offering, and c.) engagement models of users and groups in CSC to prioritize whom to engage, by modeling context with social theories using features of users, their generated content, and their dynamic network connections in the user interaction networks. The results show an improvement in modeling efficiency from the fusion of top-down knowledge-driven and bottom-up data-driven approaches than from conventional bottom-up approaches alone for modeling intent and engagement. Several applications of this work include use of the engagement interface tool during recent crises to enable efficient citizen engagement for spreading critical information of prioritized needs to ensure donation of only required supplies by the citizens. The engagement interface application also won the United Nations ICT agency ITU's Young Innovator 2014 award.
Cognitive Computing by Professor Gordon Pipa (diannepatricia)
Professor Dr. Gordon Pipa, University of Osnabrueck, Germany is making this presentation for the Cognitive Systems Institute Speaker Series on May 26, 2016.
Ohio Center of Excellence in Knowledge-Enabled Computing at Wright State (Kno.e.sis)
Center overview: http://bit.ly/coe-k
Invitation: http://bit.ly/COE-invite
Smart Data and real-world semantic web applications (2004) (Amit Sheth)
Probably the first recorded use of "smart data" for achieving the Semantic Web and for realizing productivity, efficiency, and effectiveness gains by using semantics to transform raw data into Smart Data.
A 2013 retake on this is discussed at: http://wiki.knoesis.org/index.php/Smart_Data
Understanding speed and travel-time dynamics in response to various city related events is an important and challenging problem. Sensor data (numerical) containing average speed of vehicles passing through a road link can be interpreted in terms of traffic related incident reports from city authorities and social media data (textual), providing a complementary understanding of traffic dynamics. State-of-the-art research is focused on either analyzing sensor observations or citizen observations; we seek to exploit both in a synergistic manner.
We demonstrate the role of domain knowledge in capturing the non-linearity of speed and travel-time dynamics by segmenting speed and travel-time observations into simpler components amenable to description using linear models such as a Linear Dynamical System (LDS). Specifically, we propose a Restricted Switching Linear Dynamical System (RSLDS) to model normal speed and travel-time dynamics and thereby characterize anomalous dynamics. We utilize the city traffic events extracted from text to explain anomalous dynamics. We present a large-scale evaluation of the proposed approach on a real-world traffic and Twitter dataset collected over a year, with promising results.
Semantics-empowered Smart City applications: today and tomorrow (Amit Sheth)
Citation:
Amit Sheth, "Semantics-empowered Smart City applications: today and tomorrow," Keynote presented at the 6th Workshop on Semantics for Smarter Cities (S4SC 2015), collocated with the 14th International Semantic Web Conference (ISWC 2015), Bethlehem, PA, USA. Oct 11-12, 2015.
http://kat.ee.surrey.ac.uk/wssc/index.html
Abstract: There has been massive growth in potentially relevant physical (sensor/IoT), cyber (Web), and social data related to the activities and operations of cities and citizens. As part of our participation in smart city projects, including the EU-funded CityPulse project, we have analyzed a large number of use cases with inputs from city administrations and end users, and developed a few early applications. In this talk, I will present some exciting smart city applications possible today and venture to speculate on some future ones where Big Data technologies and semantic computing, including the use of domain knowledge, play a critical role.
This is a version of a series of talks given at NCSA-UIUC's director seminar, IBM Almaden, HP Labs, DERI-Galway, City Univ of Dublin, and KMI-Open University during Aug-Oct 2010 (replaces earlier keynote version). It deals with a couple of items of the vision outlined at http://bit.ly/4ynB7A
A video of this presentation: http://www.ncsa.illinois.edu/News/Video/2010/sheth.html
Link to this talk as http://bit.ly/CHE-talk
Cities are composed of complex systems with physical, cyber, and social components. Current works on extracting and understanding city events mainly rely on technology enabled infrastructure to observe and record events. In this work, we propose an approach to leverage citizen observations of various city systems and services such as traffic, public transport, water supply, weather, sewage, and public safety as a source of city events. We investigate the feasibility of using such textual streams for extracting city events from annotated text. We formalize the problem of annotating social streams such as microblogs as a sequence labeling problem. We present a novel training data creation process for training sequence labeling models. Our automatic training data creation process utilizes instance level domain knowledge (e.g., locations in a city, possible event terms). We compare this automated annotation process to a state-of-the-art tool that needs manually created training data and show that it has comparable performance in annotation tasks. An aggregation algorithm is then presented for event extraction from annotated text. We carry out a comprehensive evaluation of the event annotation and event extraction on a real-world dataset consisting of event reports and tweets collected over four months from San Francisco Bay Area. The evaluation results are promising and provide insights into the utility of social stream for extracting city events.
Knowledge Will Propel Machine Understanding of Big Data
1. Amit Sheth
Kno.e.sis – Ohio Center of Excellence in Knowledge-enabled Computing, Wright State University, Dayton, Ohio
Knowledge will Propel Machine Understanding of Big Data
Keynote at the China Conference on Knowledge Graph and Semantic Computing, Chengdu, China, 26-29 August 2017. Invited talk at the Summer School on Learning in Data Science: Models, Algorithms and Tools, Ahmedabad, 17 July 2017. Colloquium at Fraunhofer, Berlin, 23 Aug 2017.
2. Machine Intelligence: we will interpret it much more broadly than Google's “all aspect of machine learning”... We will define it as machines (any system) performing similarly to (nearly emulating) human intelligence.
For this talk, our focus will be limited to (big) data/content, especially how machines will “understand” the data/signals/observations so that they can (help) take timely and good (evidence-based) decisions and actions.
4. The Brain: Inspiration for Intelligent Processing
• The astounding bandwidth of your senses is 11 million bits of information every second.
• In conscious activities like reading, the human brain distills approximately 40 bits of information per second.
What if we could automate the interpretation of data? …and do it efficiently and at scale
http://www.cisco.com/c/en/us/solutions/collateral/service-provider/visual-networking-index-vni/mobile-white-paper-c11-520862.html
5. A Big Challenge and Opportunity in Recent Times
• In 2008, data generated > storage available. Less than 0.5% of data gets analyzed.
• Vast variety of data: text > images > A/V > genome sequencing > IoT.
• Of all the data generated, which data is relevant, and why? Which data to analyze? Which data can offer insight? Who cares about what data? How to get the attention of a human decision maker? What we need is intelligent processing to get actionable, smart data.
[Figure labels: Scale of Data; Analysis of Data; Different forms of Data; Uncertainty of Data]
Credit: Looi Consulting (http://www.looiconsulting.com/home/enterprise-big-data/)
6. Smart data makes sense out of big data.
First used in 2004; redefined in 2013: http://wiki.knoesis.org/index.php/Smart_Data.
How do we solve problems with real-world complexity, gather vast amounts of data and diverse knowledge, and come up with intelligent decisions and timely actions?
Smart Data provides value from harnessing the challenges posed by the volume, velocity, variety, and veracity of big data, in turn providing actionable information and improving decision making.
7. Levels of Abstraction (listed from most to least abstract)
Interpreted data (abductive) [in OWL], e.g., diagnosis: Hyperthyroidism
Interpreted data (deductive) [in OWL], e.g., threshold: Elevated Blood Pressure
Interpreted data (deductive) [in RDF], e.g., label: Systolic blood pressure of 150 mmHg
Raw data [in TEXT], e.g., number: “150”
[Figure labels: Intellego; SSN Ontology]
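The climb from the raw value “150” to a candidate diagnosis can be sketched in a few lines of Python. This is only an illustration of the idea, not Intellego or the SSN Ontology: the 140 mmHg cut-off, the function names, and the one-entry knowledge base are assumptions made for the example.

```python
from typing import Dict, List, Optional

# Assumed cut-off for "elevated"; chosen for illustration only.
ELEVATED_SYSTOLIC_MMHG = 140

# Hypothetical background knowledge: disorder -> properties it can explain.
KNOWLEDGE: Dict[str, set] = {
    "Hyperthyroidism": {"Elevated Blood Pressure"},
}


def label(raw: str) -> dict:
    """Raw text -> labeled observation (the role an RDF annotation plays)."""
    return {"property": "Systolic blood pressure", "value": int(raw), "unit": "mmHg"}


def abstract(observation: dict) -> Optional[str]:
    """Deductive step: apply a threshold to the labeled value."""
    if observation["value"] >= ELEVATED_SYSTOLIC_MMHG:
        return "Elevated Blood Pressure"
    return None


def explain(abstraction: str) -> List[str]:
    """Abductive step: list disorders that would explain the abstraction."""
    return sorted(d for d, props in KNOWLEDGE.items() if abstraction in props)
```

Calling `explain(abstract(label("150")))` walks all four levels in order; the abductive step returns candidate explanations, not a confirmed diagnosis.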
11. Today’s focus is on how computers can better “understand” diverse, multimodal data
The focus is on the role knowledge plays, often complementing/enhancing ML and NLP techniques, in the contextual “understanding” of data to help solve the problem for which the data is potentially relevant.
This encompasses the topics of information extraction and semantic annotation.
13. Short detour: it is becoming easier to find or create relevant knowledge for a given application
• Existence of large knowledge bases
• Ability to search/find relevant knowledge bases [WI’13]
• Ability to extract a relevant subset [IEEE Big Data’16]
• Ability to enrich by deriving new concepts and new facts [BIBM’12]
Knowledge graphs are already playing influential roles in many applications involving big data, starting with search [15 years of search & knowledge graphs].
14. Knowledge Graphs become prominent
[Figure labels: Linked Open Data (> 9960 datasets, > 149 B triples); 38.3 M entities and 8.8 B facts; Google Knowledge Graph (570 M entities and 18 B facts); Schema.org annotations; LinkedIn knowledge graph]
15. Domain-specific knowledge extraction from LOD
Question: Book related information?
Source: Linked Open Data
Step 1: Filter relevant datasets. [Figure labels: Project Gutenberg; DBpedia; DBTropes; Books, Countries, Drugs; Books, movie, games; Books]
Step 2: Extract the relevant portion of a data set. [Figure labels: Book specific DBpedia; Book specific DBTropes]
http://knoesis.org/node/2272
http://knoesis.org/node/2793
16. Ability to enrich knowledge graphs
Initial knowledge graph on disorders and symptoms: Atrial fibrillation, Hypertension, Diabetes, Fatigue, Syncope, Weight loss, Chest pain, Discomfort in chest, Dizzy, Shortness of Breath, Nausea, Vomiting, Headache, Cough, Weight gain
Patient notes: Atrial fibrillation, Hypertension, Diabetes, Chest pain, Weight gain, Discomfort in chest, Cough, Headache, Edema, Shortness of Breath
The initial knowledge base does not know about edema. Can edema be a symptom of any of the disorders mentioned in the patient notes?
http://knoesis.org/node/2642
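A minimal sketch of how the edema question could be posed programmatically: find terms in a patient note that the knowledge graph does not know, and pair each with the disorders mentioned in the same note as candidate (symptom, disorder) links. The split of the slide's terms into disorders and symptoms and the function name are assumptions for illustration; deciding which candidates are actually valid is the job of the cited work and is not attempted here.

```python
# Assumed split of the slide's terms into disorders and known symptoms.
DISORDERS = {"Atrial fibrillation", "Hypertension", "Diabetes"}
KNOWN_SYMPTOMS = {
    "Fatigue", "Syncope", "Weight loss", "Chest pain", "Discomfort in chest",
    "Dizzy", "Shortness of Breath", "Nausea", "Vomiting", "Headache",
    "Cough", "Weight gain",
}


def enrichment_candidates(note_terms):
    """Pair each unknown term in a note with every disorder in that note."""
    disorders = sorted(t for t in note_terms if t in DISORDERS)
    unknown = sorted(
        t for t in note_terms if t not in DISORDERS and t not in KNOWN_SYMPTOMS
    )
    return [(symptom, disorder) for symptom in unknown for disorder in disorders]


patient_note = [
    "Atrial fibrillation", "Hypertension", "Diabetes", "Chest pain",
    "Weight gain", "Discomfort in chest", "Cough", "Headache", "Edema",
    "Shortness of Breath",
]
candidates = enrichment_candidates(patient_note)
```

Each pair in `candidates` is a hypothesis to be checked against further evidence, not a fact to add to the graph directly.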
17. Knowledge plays an indispensable role in deeper understanding of content
Especially interesting situations:
I. Large amounts of training data are unavailable,
II. The objects to be recognized are complex, such as implicit entities and highly subjective content, and
III. Applications need to use complementary or related data in multiple modalities/media.
18. Challenging Examples/Applications
I. Implicit entity recognition and linking
II. Understanding and analyzing drug abuse related discussions on web forums
III. Understanding city traffic dynamics using sensor and textual observations
IV. Emoji similarity and sense disambiguation
19. Implicit Entity Recognition and Linking
Sujan Perera, Pablo N. Mendes, Adarsh Alex, Amit Sheth, Krishnaprasad Thirunarayan. Implicit Entity Linking in Tweets. Extended
Semantic Web Conference. Heraklion, Crete, Greece : Springer; 2016. p. 118-132. http://knoesis.org/node/2644
Sujan Perera, Pablo Mendes, Amit Sheth, Krishnaprasad Thirunarayan, Adarsh Alex, Christopher Heid, Greg Mott. Implicit Entity
Recognition in Clinical Documents. 4th Joint Conference on Lexical and Computational Semantics (*SEM) 2015. Denver, CO:
Association for Computational Linguistics; 2015. p. 228-238. http://knoesis.org/node/2171
20. Implicit Entity Recognition and Linking
[Figure labels: Named Entity Recognition; Relationship Extraction; Entity Linking; Implicit information extraction]
24. Understanding and Analyzing Drug Abuse Related Discussions on Web Forums
Cameron, Delroy, Gary A. Smith, Raminta Daniulaityte, Amit P. Sheth, Drashti Dave, Lu Chen, Gaurish Anand, Robert Carlson, Kera Z.
Watkins, and Russel Falck. "PREDOSE: a semantic web platform for drug abuse epidemiology using social media." Journal of biomedical
informatics 46, no. 6 (2013): 985-997. http://knoesis.org/node/2469
25. Forum post (input):
I was sent home with 5 x 2 mg Suboxones. I also got a bunch of phenobarbital (I took all 180 mg and it didn't do shit except make me a walking zombie for 2 days). I waited 24 hours after my last 2 mg dose of Suboxone and tried injecting 4 mg of the bupe. It gave me a bad headache, for hours, and I almost vomited. I could feel the bupe working but overall the experience sucked.
Of course, junkie that I am, I decided to repeat the experiment. Today, after waiting 48 hours after my last bunk 4 mg injection, I injected 2 mg. There wasn't really any rush to speak of, but after 5 minutes I started to feel pretty damn good. So I injected another 1 mg. That was about half an hour ago. I feel great now.
DIVERSE DATA TYPES: ENTITIES, DOSAGE, PRONOUN, INTERVAL, Route of Admin., RELATIONSHIPS, SENTIMENTS
Entity Identification with the Drug Abuse Ontology (DAO): 83 Classes, 37 Properties
Suboxone, Subutex: subClassOf Buprenorphine
Buprenorphine: has_slang_term bupe, bupey
33:1 Buprenorphine
24:1 Loperamide
Sentiment Extraction:
-ve: experience sucked, didn’t do shit, bad headache
+ve: feel pretty damn good, feel great
Codes | Triples (subject-predicate-object)
Suboxone used by injection, negative experience | Suboxone injection-causes-Cephalalgia
Suboxone used by injection, amount | Suboxone injection-dosage amount-2mg
Suboxone used by injection, positive experience | Suboxone injection-has_side_effect-Euphoria
26. PREDOSE: Smarter Data through Shared Context and Data Integration
Techniques: Ontology, Lexicon, Lexico-ontology, Rule-based Grammar
ENTITIES: Suboxone, Kratom, Heroin
TRIPLES: Suboxone-CAUSE-Cephalalgia
EMOTION: disgusted, amazed, irritated
INTENSITY: more than, a, few of
PRONOUN: I, me, mine, my
SENTIMENT: Im glad, turn out bad, weird
DRUG-FORM: ointment, tablet, pill, film
ROUTE OF ADM: smoke, inject, snort, sniff
SIDEEFFECT: Itching, blisters, flushing, shaking hands, difficulty breathing
DOSAGE: <AMT><UNIT> (e.g. 5mg, 2-3 tabs)
FREQUENCY: <AMT><FREQ_IND><PERIOD> (e.g. 5 times a week)
INTERVAL: <PERIOD_IND><PERIOD> (e.g. several years)
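The rule-based grammar patterns on this slide (DOSAGE, FREQ, INTERVAL) map naturally onto regular expressions. The sketch below is one possible encoding, not the PREDOSE implementation: the unit, period, and indicator vocabularies are small assumed lists chosen to cover the slide's examples.

```python
import re

# Assumed vocabularies, sized to cover the examples on the slide.
AMT = r"\d+(?:\.\d+)?(?:\s*-\s*\d+(?:\.\d+)?)?"  # 5, 2.5, 2-3
UNIT = r"(?:mg|tabs?|pills?)"
PERIOD = r"(?:hours?|days?|weeks?|months?|years?)"
FREQ_IND = r"(?:times?\s+(?:a|per))"
PERIOD_IND = r"(?:several|few|many)"

PATTERNS = {
    # DOSAGE: <AMT><UNIT>
    "DOSAGE": re.compile(rf"\b{AMT}\s*{UNIT}\b", re.IGNORECASE),
    # FREQ: <AMT><FREQ_IND><PERIOD>
    "FREQUENCY": re.compile(rf"\b{AMT}\s+{FREQ_IND}\s+{PERIOD}\b", re.IGNORECASE),
    # INTERVAL: <PERIOD_IND><PERIOD>
    "INTERVAL": re.compile(rf"\b{PERIOD_IND}\s+{PERIOD}\b", re.IGNORECASE),
}


def extract(text):
    """Return every match of each grammar rule found in the text."""
    return {
        name: [m.group(0) for m in pattern.finditer(text)]
        for name, pattern in PATTERNS.items()
    }


found = extract("I injected 2 mg, then 2-3 tabs, about 5 times a week for several years.")
```

Keeping the building blocks (`AMT`, `UNIT`, `PERIOD`) as separate strings mirrors the slide's `<AMT><UNIT>` notation and makes it easy to extend a vocabulary without touching the rules.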
27. Understanding city traffic using sensor and textual observations
Pramod Anantharam, Krishnaprasad Thirunarayan, Surendra Marupudi, Amit Sheth, Tanvi Banerjee. Understanding City Traffic
Dynamics Utilizing Sensor and Textual Observations. In 30th AAAI Conference on Artificial Intelligence (AAAI-16). Phoenix, Arizona;
2016. http://knoesis.org/node/2145
Pramod Anantharam, Krishnaprasad Thirunarayan, Amit Sheth. Traffic Analytics using Probabilistic Graphical Models Enhanced with
Knowledge Bases. In 2nd International Workshop on Analytics for Cyber-Physical Systems (ACS-2013) at SIAM International
Conference on Data Mining (SDM 2013). Austin, Texas; 2013. http://knoesis.org/node/2476
28. Severity of the Traffic Problem
By 2001 over 285 million Indians lived in cities, more than in all North American cities combined (Office of the Registrar General of India 2001; see note 1).
[Figure labels: Modes of Transportation in Indian Cities; The Texas Transportation Institute (TTI) Congestion report for the United States; [2011]; 2030]
Notes: 1. The Crisis of Public Transport in India. 2. IBM Smarter Traffic.
29. Questions Asked Daily
• What time to start?
• What route to take?
• What is the reason for traffic?
• Wait for some time or re-route?
32. Learning Context-specific LDS Models
Input: speed/travel-time time series data from a link.
Step 1: Index data for each link for day of week and hour of day, utilizing the traffic domain knowledge for piece-wise linear approximation. This gives time series data for each hour of day (1-24) for each day of week (Monday – Sunday).
Step 2: Find the “typical” dynamics by computing the mean and choosing the medoid for each hour of day and day of week. The mean time series is computed for each day of week and hour of day along with the medoid.
Step 3: Learn LDS parameters for the medoid for each hour of day (24 hours) and each day of week (7 days), resulting in 24 × 7 = 168 models for each link.
The result is a 7 × 24 grid of models indexed by day of week di (Mon. to Sun.) and hour of day hj: LDS(1,1), LDS(1,2), …, LDS(1,24) through LDS(7,1), LDS(7,2), …, LDS(7,24).
168 LDS models for each link; total models learned = 425,712, i.e., 2,534 links × 168 models per link.
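Steps 1 and 2 can be sketched in plain Python: group each link's hourly series by (day of week, hour of day), then pick a representative series per group. The sketch makes two assumptions: each observation arrives as a hypothetical `(day, hour, series)` tuple, and the medoid is approximated as the observed series closest to the group mean, which is one reading of Step 2. Step 3 (fitting LDS parameters) is not shown.

```python
from collections import defaultdict
from statistics import fmean

DAYS, HOURS = 7, 24
MODELS_PER_LINK = DAYS * HOURS  # 168 context-specific models per link


def index_by_context(observations):
    """Step 1: group hourly series by (day_of_week, hour_of_day)."""
    buckets = defaultdict(list)
    for day, hour, series in observations:
        buckets[(day, hour)].append(series)
    return buckets


def mean_series(series_list):
    """Point-wise mean of equally long series."""
    return [fmean(values) for values in zip(*series_list)]


def medoid(series_list):
    """Step 2: observed series closest to the mean (squared Euclidean distance)."""
    center = mean_series(series_list)

    def distance(series):
        return sum((a - b) ** 2 for a, b in zip(series, center))

    return min(series_list, key=distance)


# Toy speed readings (two samples per hour) for one link.
observations = [
    (1, 8, [50.0, 52.0]),
    (1, 8, [30.0, 28.0]),
    (1, 8, [48.0, 50.0]),
    (2, 8, [60.0, 60.0]),
]
buckets = index_by_context(observations)
typical = medoid(buckets[(1, 8)])
```

Using an observed series rather than the mean itself keeps the representative physically plausible: a mean of a free-flowing hour and a congested hour may describe dynamics that never occurred.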
33. Tagging Anomalies with LDS Models
(Input) Speed and travel-time observations from a link.
Compute the log likelihood for each hour of observed data (di,hj) under the matching model LDS(di,hj), taken from the grid LDS(1,1), LDS(1,2), …, LDS(1,24) through LDS(7,1), LDS(7,2), …, LDS(7,24).
Train?
Yes (Training Phase): collect the likelihoods L = Lik(1,1), Lik(1,2), …, Lik(1,24) through Lik(7,1), Lik(7,2), …, Lik(7,24). The log likelihood min. and max. values are obtained from the five-number summary.
No: tag anomalous hours using the log likelihood range, comparing the likelihood of (di,hj) against the min. likelihood.
(Output) Anomalies
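A minimal sketch of the tagging step, assuming the log likelihoods have already been computed by the per-context LDS models (not shown). It derives the range from the five-number summary of the training likelihoods and tags an hour as anomalous when its likelihood falls below the minimum. Treating "below the minimum" as the anomaly rule is an assumption based on the "(min. likelihood)" label on the slide.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, q2, q3 = quantiles(values, n=4, method="inclusive")
    return min(values), q1, q2, q3, max(values)


def likelihood_range(training_log_liks):
    """Training phase: keep the min. and max. log likelihood for one context."""
    low, _, _, _, high = five_number_summary(training_log_liks)
    return low, high


def tag_anomalies(observed, ranges):
    """Tag (day, hour) contexts whose log likelihood is below the range minimum.

    observed: {(day, hour): log_likelihood}
    ranges:   {(day, hour): (low, high)}
    """
    return sorted(key for key, ll in observed.items() if ll < ranges[key][0])


ranges = {(1, 8): likelihood_range([-12.0, -10.0, -11.0, -9.5, -10.5])}
anomalies = tag_anomalies({(1, 8): -30.0}, ranges)
```

A very low log likelihood means the hour's observations are poorly explained by the model of normal dynamics for that day and hour, which is what makes it a candidate anomaly.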
35. Traffic Data: Possible Explanation
Most drivers tend to go 5 km/h over the posted speed limit.
There are relatively few drivers who go more than 10 km/h over the posted speed limit.
There are situations in a day where drivers are forced to go below the speed limit, e.g., rush hour traffic.
Do these histograms resemble any probability distribution?
37. Extracting City Events from Textual Data
Pramod Anantharam, Payam Barnaghi, Krishnaprasad Thirunarayan, and Amit Sheth. 2015. Extracting City Traffic Events from Social Streams. ACM Trans. Intell. Syst. Technol. 6, 4, Article 43 (July 2015), 27 pages. DOI: 10.1145/2717317. http://doi.acm.org/10.1145/2717317/
Last O night O in O CA... O (@ O Half B-LOCATION Moon I-LOCATION Bay B-LOCATION Brewing I-LOCATION Company O w/ O 8 O others) O http://t.co/w0eGEJjApY O
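The annotated tweet on this slide uses B-/I-/O tags, the usual encoding for sequence labeling. A short sketch of how such tags can be decoded into entity spans follows; the function name is hypothetical, and the tokens and tags are copied from the slide's example as given.

```python
def bio_spans(tokens, tags):
    """Decode parallel token/tag lists into (entity_type, text) spans."""
    spans, current, kind = [], [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag always starts a new span, closing any open one.
            if current:
                spans.append((kind, " ".join(current)))
            current, kind = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == kind:
            # An I- tag extends the open span of the same type.
            current.append(token)
        else:
            # O (or a stray I-) closes any open span.
            if current:
                spans.append((kind, " ".join(current)))
            current, kind = [], None
    if current:
        spans.append((kind, " ".join(current)))
    return spans


tokens = ["(@", "Half", "Moon", "Bay", "Brewing", "Company", "w/"]
tags = ["O", "B-LOCATION", "I-LOCATION", "B-LOCATION", "I-LOCATION", "O", "O"]
spans = bio_spans(tokens, tags)
```

With the tags exactly as shown on the slide, the decoder yields two LOCATION spans because the second `B-LOCATION` starts a new entity.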
41. Understanding: Semantic Annotation of Sensor + Textual Data Utilizing Background Knowledge
[Figure labels: Overturned Truck; Domain knowledge in the form of traffic vocabulary; Explained-by; Domain knowledge of traffic flow synthesized from sensor data]
Horizontal operator: relating/mapping data from different modalities to a concept (theme) within a spatio-temporal context.
Spatial context even includes what it means to have slow traffic for the type of road (http://wiki.knoesis.org/index.php/PCS).
Image Credit: http://traffic.511.org/index
42. How does traffic analysis capture the complexity of the real world?
This example demonstrates use of:
• Multimodal data streams (types of events from text, signature from sensor data).
• Multiple sources of declarative knowledge/ontologies.
• Semantic annotations and enrichments.
• Rich representation (PGM).
• Learned probabilistic models improved using declarative knowledge.
• A statistical approach to create normalcy models and understand anomalies using historical data, with anomalies explained using extracted events.
• Declarative knowledge to approximate nonlinear models using a collection of linear dynamical systems.
• Actionable information.
43. Emoji Similarity and Sense Disambiguation
Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. EmojiNet: Building a Machine Readable Sense Inventory for Emoji. In
8th International Conference on Social Informatics (SocInfo 2016). Bellevue, WA, USA; 2016. http://knoesis.org/node/2781
Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. EmojiNet: An Open Service and API for Emoji Sense Discovery. In
11th International AAAI Conference on Web and Social Media (ICWSM 2017). Montreal, Canada; 2017. http://knoesis.org/node/2819
Sanjaya Wijeratne, Lakshika Balasuriya, Amit Sheth, Derek Doran. A Semantics-Based Measure of Emoji Similarity. In 2017 IEEE/WIC/ACM
International Conference on Web Intelligence (Web Intelligence 2017). Leipzig, Germany; 2017. http://knoesis.org/node/2834
44. • 6B messages with emoji are exchanged every day!
https://www.appboy.com/blog/emojis-used-in-777-more-campaigns/
45. Understanding Emoji Meanings
• The ability to automatically process, derive meaning, and
interpret text fused with emoji will be essential to
understand emoji
• Having access to knowledge bases that capture emoji
meaning can play a vital role in representing, contextually
disambiguating, and converting emoji into text
• They can help to leverage already existing NLP techniques
for processing and better understanding emoji
http://knoesis.org/node/2781
46. EmojiNet: A machine-readable emoji sense inventory
http://knoesis.org/node/2819
[Figure caption: Creation of EmojiNet, with the nonuple of an emoji]
Emoji Sense Disambiguation
“The ability to identify the meaning of an emoji in the
context of a message in a computational manner”
Emoji usage in social media with multiple senses
http://knoesis.org/node/2819
Currently there’s no labeled dataset that can be used to solve emoji sense
disambiguation in a supervised learning setting.
48. Tackling Emoji Sense Disambiguation
• Use the Simplified Lesk algorithm to disambiguate emoji senses
http://knoesis.org/node/2819
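Simplified Lesk picks the sense whose definition shares the most words with the surrounding context, so it needs a sense inventory but no labeled training data. The sketch below shows the idea; the two sense definitions and the stopword list are made up for illustration and are not EmojiNet entries.

```python
import re

# Small assumed stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "the", "of", "to", "and", "or", "in", "is", "i", "my", "so", "at"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS}


def simplified_lesk(context, senses):
    """Return the sense whose definition overlaps most with the context.

    senses: {sense_label: definition_text}; ties go to the first sense listed.
    """
    context_words = tokenize(context)
    best, best_overlap = None, -1
    for sense, definition in senses.items():
        overlap = len(context_words & tokenize(definition))
        if overlap > best_overlap:
            best, best_overlap = sense, overlap
    return best


# Hypothetical senses for an emoji that can signal laughing or crying.
senses = {
    "laugh": "laugh loudly because something is funny",
    "cry": "shed tears because of sadness or pain",
}
sense_a = simplified_lesk("that joke was so funny i can't stop", senses)
sense_b = simplified_lesk("tears of pain after the game", senses)
```

The quality of the result depends on how rich the sense definitions are, which is the gap a machine-readable inventory of emoji senses is meant to fill.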
49. Emoji Similarity
“Given two or more emoji, how to calculate the semantic similarity between them in a computational manner?”
[Figure caption: Top-5 emoji pairs with the highest inter-annotator agreement for each ordinal value from 0 to 4 for two questions. Q1 was on the equivalence of the two emoji and Q2 was on the relatedness between them. Ordinal values 0 and 4 represent the least and the highest relatedness/equivalence, respectively.]
http://knoesis.org/node/2834
50. Using EmojiNet to measure Emoji Similarity
• Different types of emoji meanings extracted from EmojiNet are used to model the meaning of an emoji (more details at http://knoesis.org/node/2834)
51. Using EmojiNet to measure Emoji Similarity
• We combine distributional semantics of words (learned via word embeddings) and emoji definitions in EmojiNet (external knowledge) to model emoji embeddings
• Our emoji embedding models outperform the previous emoji embedding models (based on purely distributional semantics) by ~10% in a benchmark sentiment analysis task
http://knoesis.org/node/2834
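One simple way to combine word embeddings with emoji definitions is to average the vectors of the words in an emoji's definition and compare emoji by cosine similarity. The sketch below assumes that averaging scheme and uses tiny made-up 2-dimensional vectors; the slide does not state how the combination is actually done, and the names here are hypothetical.

```python
import math


def embed(definition_words, word_vectors):
    """Average the vectors of the definition words that have an embedding."""
    vectors = [word_vectors[w] for w in definition_words if w in word_vectors]
    if not vectors:
        raise ValueError("no known words in definition")
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Made-up word vectors standing in for a trained embedding model.
word_vectors = {
    "laugh": [1.0, 0.0],
    "funny": [0.8, 0.2],
    "cry": [0.0, 1.0],
    "sad": [0.2, 0.8],
}
joy = embed(["laugh", "funny"], word_vectors)
grin = embed(["funny"], word_vectors)
sob = embed(["cry", "sad"], word_vectors)
```

In this toy setup `cosine(joy, grin)` is higher than `cosine(joy, sob)`, since the first pair's definitions share related words.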
52. Knowledge-based Approaches and the Resulting Improvements
Problem Domain | Use of Knowledge/Knowledge bases | Problems we could solve that could not be solved (well) w/o knowledge
Implicit Entity Linking | Adapted UMLS definitions for identifying medical entities, and Wikipedia and Twitter data for identifying Twitter entities | Was not solved before
Understanding Drug Abuse-related Discussions | Application of Drug Abuse Ontology along with slang term dictionaries and grammar | Not solved well at all
Traffic Data Analysis | Statistical knowledge extraction and using ontologies for Twitter event extraction | Multi-modal data stream correlation and explanation virtually impossible
Emoji Similarity and Sense Disambiguation | Generation and application of EmojiNet | Emoji interpretation solved much better
53. Take away
“Data alone is not enough”: https://homes.cs.washington.edu/~pedrod/papers/cacm12.pdf
Consider combining data-centric/bottom up/statistical learning with
knowledge-based/top down techniques
• To improve understanding of simpler content
• To understand complex content and concepts
• To understand heterogeneous/multimodal content