Citizen Sensing<br />Opportunities and Challenges in Mining Social Signals and Perceptions<br />Amit P. Shethamit@knoesis....
Ohio Center of Excellence in Knowledge-enabled Computing<br />Semantics as core enabler, enhancer @ Kno.e.sis<br />one of ...
Semantics & Semantic Web in 1999-2002<br />
ATTRIBUTE & KEYWORDQUERYING<br />SEMANTIC BROWSING<br />uniform view of worldwide distributed assets of similar type<br />...
Links to news on companies that compete against Commerce One<br />Links to news on companies Commerce One competes against...
Semantic Search/Browsing/Directory (2001-….)<br />System recognizes ENTITY & CATEGORY<br />Relevant portionof the Director...
Semantic Search/Browsing/Directory (2001-….)<br />Users can explore<br />Semantically related<br />Information.<br />
Extracting Semantic Metadata from Semistructured and Structured Sources (1999 – 2002)<br />Semagix Freedom for building <b...
Fast forward to 2010-2011<br />
Search<br />Integration<br />Analysis<br />Discovery<br />Question <br />  Answering<br />Situational <br />   Awareness<b...
Let Us Start with Social Data<br />
Jan. 2011<br />Egypt Protest<br />Image:http://bit.ly/g4yPXS<br />Image:http://bit.ly/qHI7wI<br />Image:http://bit.ly/qmDo...
http://bit.ly/nP1E4q<br />Image:http://bit.ly/fl4gEJ<br />Mar. 2011 <br />http://cnet.co/jdQgME<br />Japan Earthquake <br ...
Image:http://bit.ly/nqq6Wj<br />Jul. 2011<br />I-75 Traffic Jam in US<br />
Citizen Sensing<br />Who?<br />An interconnected network of people<br />What?<br />Observe, report, collect, analyze, and ...
Enablers: Mobile Devices & Ubiquitous Connectivity<br />Mobile Platforms Hit Critical Mass, Over 5 billions users<br /> 1+...
Enablers: Web 2.0 & Social Media<br />A huge number of users<br />750M+ active Facebook Users<br />1+B tweets/wk; 175M+ Tw...
Role of Semantics in Citizen Sensing<br />Key of citizen sensing: extract metadata/annotate<br /> different types of metad...
Research Application: Twitris<br />
Twitris - Motivation<br />Image: http://bit.ly/etFezl<br />What were people in U.S.A. saying about <br />		Bin Laden’s dea...
Twitris: Semantic Social Web Mash-up<br />Select topic<br />Select date<br />N-gram summaries<br />Topic tree<br />Spatial...
Analyzing Events from Temporal Perspective<br />How did tweets in United States on the death of Bin Laden evolve over time...
Analyzing Events from Spatial Perspective<br />Tweets (Death of Bin Laden) in Egypt VS tweets in India<br />May <br />2nd<...
A sample of current research @ Kno.e.sis demonstrating role of semantics in Citizen Sensing & Social Media Analysis<br />
User-communityEngagement Analysis<br />
User-community Engagement<br />How do we understand  the phenomenon of user participation (engagement) in topic discussion...
Analysis Framework:People-Content-network Analysis (PCNA)<br />28<br />Understanding User-Community Engagement by Multi-fa...
Three Sets of Features<br />Which set is more important?<br />Content features[Characteristics of tweets posted by active ...
Experiments: Results<br />Content is the key to understand user community engagement<br />High<br />Low<br />Performance<b...
Background Knowledge to improve Social Data Analysis<br />Event oriented Community<br />External Knowledge bases<br />Dyna...
Analysing the Content can be Hard…<br />Is Merry Christmas a song? <br />If it is, which ‘Merry Christmas’since there are ...
Real Time Social Media Data Analysis<br />
Motivation<br />People can’t wait for Information<br />Disaster Management<br />Ushahidi (www.ushahidi.org)<br />Real-Time...
Scenarios<br />Brand Tracking<br />Give me a stream of locations whereKinectis being mentioned right now<br />Give me all ...
Twarql (Twitter Feeds through SPARQL)<br />Semantically annotate tweets with entities, hashtags, URLs, sentiments, etc.<br...
Twarql Architecture<br />
Back to the Scenario<br />Give me a stream of locations where Kinect is being mentioned now<br />Give me all people that h...
Dynamic Domain Models for Semantic Analysis of Real-Time DataakaContinuous Semantics<br />
Motivation<br />Semantic processing using a model of the domain<br />But it is difficult to model dynamic domains on socia...
Dynamic Model Creation<br />Heliopolis is a suburb of Cairo.<br />
Dynamic Evolving Models to underpin Semantics<br />“Reports from Azadi Square - 4 people killed by police, people killed p...
Sentiment/Opinion Extraction<br />
Challenges<br />Domain/Topic-dependency: spotting the target of the sentiment is as important as finding sentiment itself<...
The Usage of Background Knowledge <br />
Real Time Feature Stream<br />
Real-Time Sensor, Social, Multi-media data<br />2010’s<br />Dynamic User Generated Content<br />2000’s<br />1990’s<br />St...
Semantic Abstraction<br />Does -1,-2,-4,-4 make any sense to you?<br />Overwhelming amount of raw sensor data doesn’t make...
A cross-country flight from New York to Los Angeles on a Boeing 737 plane generates a massive 240 terabytes of data<br />-...
Background  Knowledge<br />Blizzard<br />Rain Storm<br />ABSTRACTION<br />Huge amount of <br />Raw Sensor Data<br />Featur...
Weather Alert Application<br />Detection of events, such as blizzards, from weather station observations<br />
Evaluation<br /><ul><li>Data Used: Nevada Blizzard (April 1st – April 6th)</li></ul>Show me which places had blizzard in p...
Evaluation (cont.)<br />Abstraction can:<br />Make sense out of raw data<br />Greatly reduce the size of data<br /><ul><li...
Traffic Application<br />Reasons can be found via other types data: tweets, news papers, etc.<br />Sensor data: 10 passing...
Integration and Abstraction of Traffic DataStratified explanation<br />Knowledge Model Empowered Abstraction<br />Knowledg...
Take Home Message<br />Amount of citizen sensing (and machine sensing) data is huge,  varied, and growing rapidly.  Search...
Take Home Message (Cont.)<br />Semantics play a key role in refering "meaning" behind the data. Requires progress from key...
Take Home Message (Cont.)<br />Wide variety of semantic models and KBs(vocabularies, social dictionaries, community create...
Interested in more?<br />Kno.e.sis Wiki for the following and more:<br />Computing for Human Experience<br />Continuous Se...
Upcoming SlideShare
Loading in...5
×

Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Perceptions

2,459

Published on

Invited talk at Session on Semantic Knowledge for Commodity Computing, at Microsoft Research Faculty Summit 2011, July 19-20, 2011, Redmond, WA. http://research.microsoft.com/en-us/events/fs2011/default.aspx

Published in: Education, Technology
1 Comment
4 Likes
Statistics
Notes
No Downloads
Views
Total Views
2,459
On Slideshare
0
From Embeds
0
Number of Embeds
2
Actions
Shares
0
Downloads
86
Comments
1
Likes
4
Embeds 0
No embeds

No notes for slide
  • Kno.e.sis has 15 faculty in Computer Science, life sciences and health care, cognitive science and business. It has about 50 PhD students and post docs– about 2/3 of these in Computer Science. Its faculty members have 40 labs, and occupies a majority of 50K sqft Joshi Research Center. Its students are highly successful– eg tenure track faculty @ Case Western Reserve Univ or Researcher at IBM Almaden. It has received recent funding from funding from Microsoft Research. IBM Research, HP Labs, Google, and small companies (Janya, EZdi,…) and collaborates with many more (Yahoo! Labs, NLM, …).
  • Taalee (subsequently Voquette and Semagix) was founded in 1999 as an Audio/Video Web Search Company (focus on A/V mainly for scalability and market focus reasons, servicename: MediaAnywhere). Domain models/ontologies were created in major areas (many more than what you can find on Bing in 2011) and automatically populated to build knowledge bases (populated ontologies or WorldModel) from a variety of structured and semistructured sources, and periodically kept up to date. This was than used for semantic annotation/metadata extraction to drive semantic search, browsing, etc applications over data crawled from Web sites.
  • Mediaanywhere’s search snapshot.
  • The important thing is that the system knew that Robert Duval is a movie actor, is a different person that David Duval who is a golfer and a sportsperson, and had understanding of a variety of relationships Robert Duval participates in – such as
  • Many types of data: Web 2.0/microblogging/social networking with informal text, Blogs/News/Documents with structured text, structured and semistructured data, multimedia data, sensor data.Many types of domain models: formal ontologies, folksonomies, nomenclatures or core vocabularies, community created/managed dictionaries, …Extraction of semantic metadata (labels referenced with domain models) make it possible to build many applications.
  • There is an event oriented community being formed during an eventDynamic domain model would give enhanced understanding of what is going on in the event, and hence, the communityCommunity has people, sub-community of users in the community will form an implicit network for ‘resource sharing’ and ‘resource provider’ How do we get to know such people? Semantic analysis of user profile description, with help of external knowledge bases to tell us, ‘what is a user interest I’ will tell us, ‘who is what’ – Blogger, news journalist, trustee, Red Cross Member etc. and also, ‘who’ is interested in ‘what’This information of mined user interests and types, from users profiles, can be leveraged to perform semantic association with dynamic domain model of the event
  • Architecture of twarqlStream tweetsCovert it into RDFUse SPARQL Query to filter it
  • Again SPARQL query is a subset of domain modelModifying this will help us to be updated with the informationTherefore we use Doozer
  • We use background knowledge to help identifying the entity mentioned in the text, e.g., the knowledge from IMDB and Freebase is used to determine whether a noun phrase in the text is the name of a movie or a person. The lexical resources such as Urban Dictionary are used to help identifying the sentiment clues in the text. Urban Dictionary is a popular online slang dictionary with word definitions written by users. Each word is associated with a list of related words to interpret it, and many glossary definitions given by different users. Both the related words list and glossary definitions can be used to help determining the sentiment of the spotted word. E.g., the word “wicked” has a list of related words, and most of those words carry positive sentiment, so that we can infer that “wicked” is highly possible a positive sentiment clue. In addition, there is also a definition of “wicked” given by user saying that it has different meanings in different countries. Given this knowledge, if we know the location of the author who wrote the tweet, we can infer whether “wicked” in the tweet  is used as a sentiment clue, and whether it is positive, negative or neutral.
  • Consider the number of observation records generated by a commercial flight. A single flight from New York to Los Angeles on a twin-engine Boeing 737 generates around 240 terabytes of data; and, on any given day, there are around 28,537 such commercial flights in the United States (Higginbotham, 2010). So, for only a single day, the amount of sensor data generated can reach the petabyte scale. But how much of this sensor data is actually useful for decision-making? The pilot may need to understand the general condition of the plane and the external environment during flight; the ground crew may need to know the speed, location and trajectory of the aircraft; and the mechanic may need to know of any anomalous behavior detected during the flight. Much of this knowledge must be derived from the low-level observational data, but only the high-level inferences may be required for actual decision-making.
  • The architecture illustrates an abstraction (or explanation) of traffic data through a stratified interpretation of machine sensor observations, information on the Web, and real-time social data. This process involves:+ an integration of various domain (i.e., traffic) and domain-independent (i.e., time, space, sensor) ontologies to annotate and interpret the data+ observations from traffic speed sensors on the road to detect traffic jams (slow moving traffic),+ determine the cause of the traffic jam through a stratified explanation process: - first check if normal/everyday events could explain (i.e., rush-hour) - second check if planned events could explain (i.e., planned events such as road construction, music concerts, and sporting events) by accessing background knowledge on the Web (such as US Department of Transportation, BuckeyeTraffic, or Eventful) -third check if unplanned events could explain (i.e., traffic accident / wreck) by accessing social media (such as twitter)
  • The architecture illustrates an abstraction (or explanation) of traffic data through a stratified interpretation of machine sensor observations, information on the Web, and real-time social data. This process involves:+ an integration of various domain (i.e., traffic) and domain-independent (i.e., time, space, sensor) ontologies to annotate and interpret the data+ observations from traffic speed sensors on the road to detect traffic jams (slow moving traffic),+ determine the cause of the traffic jam through a stratified explanation process: - first check if normal/everyday events could explain (i.e., rush-hour) - second check if planned events could explain (i.e., planned events such as road construction, music concerts, and sporting events) by accessing background knowledge on the Web (such as US Department of Transportation, BuckeyeTraffic, or Eventful) - third check if unplanned events could explain (i.e., traffic accident / wreck) by accessing social media (such as twitter)
  • Citizen Sensing: Opportunities and Challenges in Mining Social Signals and Perceptions

    1. 1.
    2. 2. Citizen Sensing<br />Opportunities and Challenges in Mining Social Signals and Perceptions<br />Amit P. Shethamit@knoesis.org<br />LexisNexis Ohio Eminent Scholar<br />Ohio Center of Excellence in Knowledge enabled Computing (Kno.e.sis)<br />Wright State University, Dayton, OHhttp://knoesis.org<br />Thanks: Kno.e.sis team, esp. Wenbo Wang, Chen Lu, Cory, Hemant, Pavan<br />
    3. 3. Ohio Center of Excellence in Knowledge-enabled Computing<br />Semantics as core enabler, enhancer @ Kno.e.sis<br />one of the two largest academic groups in Semantic Web; multidisciplinary<br />
    4. 4. Semantics & Semantic Web in 1999-2002<br />
    5. 5. ATTRIBUTE & KEYWORDQUERYING<br />SEMANTIC BROWSING<br />uniform view of worldwide distributed assets of similar type<br />Targeted e-shopping/e-commerce<br />assets access<br />Taalee Semantic/Faceted Search & Browsing (1999-2001)<br />BLENDED BROWSING & QUERYING<br />Taalee Semantic Search ….<br />
    6. 6. Links to news on companies that compete against Commerce One<br />Links to news on companies Commerce One competes against<br />(To view news on Ariba, click on the link for Ariba)<br />Crucial news on Commerce One’s competitors (Ariba) can be accessed easily and automatically<br />Search for company ‘Commerce One’<br />Semantic Search/Browsing/Directory (2001-….)<br />
    7. 7. Semantic Search/Browsing/Directory (2001-….)<br />System recognizes ENTITY & CATEGORY<br />Relevant portionof the Directory is automatically <br />presented.<br />
    8. 8. Semantic Search/Browsing/Directory (2001-….)<br />Users can explore<br />Semantically related<br />Information.<br />
    9. 9. Extracting Semantic Metadata from Semistructured and Structured Sources (1999 – 2002)<br />Semagix Freedom for building <br />ontology-driven information system<br />Managing Semantic Content on the Web<br />
    10. 10. Fast forward to 2010-2011<br />
    11. 11. Search<br />Integration<br />Analysis<br />Discovery<br />Question <br /> Answering<br />Situational <br /> Awareness<br />Domain Models<br />Patterns / Inference / Reasoning<br />RDB<br />Relationship Web<br />Metadata Extraction<br />Meta data / Semantic Annotations<br />Text<br />(formal/<br />Informal)<br />Structured and Semi-structured data<br />Sensor Data<br />Multimedia Content and Web data<br />
    12. 12. Let Us Start with Social Data<br />
    13. 13. Jan. 2011<br />Egypt Protest<br />Image:http://bit.ly/g4yPXS<br />Image:http://bit.ly/qHI7wI<br />Image:http://bit.ly/qmDocA<br />
    14. 14. http://bit.ly/nP1E4q<br />Image:http://bit.ly/fl4gEJ<br />Mar. 2011 <br />http://cnet.co/jdQgME<br />Japan Earthquake <br />and Tsunami<br />http://bit.ly<br />/gWboib<br />Recently funded NSF proposal: Social Media Enhanced Organizational Sensemakingin Emergency Response<br />
    15. 15. Image:http://bit.ly/nqq6Wj<br />Jul. 2011<br />I-75 Traffic Jam in US<br />
    16. 16. Citizen Sensing<br />Who?<br />An interconnected network of people<br />What?<br />Observe, report, collect, analyze, and disseminate information<br />How?<br />Via text, audio, video and built in device sensor (and smart devices)<br />Image: http://bit.ly/nvm2iP<br /> "Citizen Sensing, Social Signals, and Enriching Human Experience”, <br />Internet Computing, July-Aug. 2009<br />
    17. 17. Enablers: Mobile Devices & Ubiquitous Connectivity<br />Mobile Platforms Hit Critical Mass, Over 5 billions users<br /> 1+B with internet connected mobile devices (2010)<br />Smartphones > PCs + Notebooks > Notebooks + Netbooks (2010E)<br />500K+ mobile phone applications<br />74% of mobile phone users (2.4B)worldwide used SMS (2007)<br />Mobile is Global; Ubiquitous; 24x7<br />Built in sensors<br />environmental, biometric/biomedical,...<br />Image: http://bit.ly/mYqcPF<br />
    18. 18. Enablers: Web 2.0 & Social Media<br />A huge number of users<br />750M+ active Facebook Users<br />1+B tweets/wk; 175M+ Twitter users<br />Internet Users:2Bln<br />Image: http://bit.ly/euLETT<br />Image: http://bit.ly/euLETT<br />Image: http://bit.ly/euLETT<br />Image: http://bit.ly/euLETT<br />Image: http://bit.ly/euLETT<br />
    19. 19. Role of Semantics in Citizen Sensing<br />Key of citizen sensing: extract metadata/annotate<br /> different types of metadata (depend on application need)<br />Spatial, temporal, thematic: key phrase, named entity, relationship, topic/category, event descriptors, sentiment…<br />People, network, content<br />Semantics: provide the meaning of data<br />various forms of semantic models: core vocabularies/nomenclatures,community created dictionaries/folksonomies/reference databases, automatically extracted domain models, manually created taxonomies, formal ontologies<br />deal with complexities of user generated data; supplement well-known statistical and natural language processing (NLP) techniques<br />
    20. 20. Research Application: Twitris<br />
    21. 21. Twitris - Motivation<br />Image: http://bit.ly/etFezl<br />What were people in U.S.A. saying about <br /> Bin Laden’s death?<br />How about people in Egypt?<br />How about people in India?<br />TOO MANY tweets to be read each day!!!<br />Twitris<br />Now: WHEN, WHERE, People are talking WHAT<br />Future: socio, cultural, behavioral studies<br />‘Spatio-Temporal-Thematic Analysis of Citizen-Sensor Data – Challenges and Experiences’, 2009<br />
    22. 22. Twitris: Semantic Social Web Mash-up<br />Select topic<br />Select date<br />N-gram summaries<br />Topic tree<br />Spatial Marker<br />Tweet traffic<br />Sentiment Analysis<br />Images & Videos<br />Reference news<br />Related tweets<br />Wikipedia articles<br />More: TWITRIS<br />
    23. 23. Analyzing Events from Temporal Perspective<br />How did tweets in United States on the death of Bin Laden evolve over time?<br />May <br />2nd<br />May<br /> 4th<br />“RT @TWlTTERWHALE: Please do not click on any links saying Osama Bin Laden EXECUTION Video! This is a virus that hacks accounts. ”<br />RT @ReallyVirtual: Here's a picture of OBL'shideout in Abbottabad, as shared by a friend @Rahat http://yfrog.com/h7w4izmj<br />Img:http://www.twitpic.com/4t1mt0<br />
    24. 24. Analyzing Events from Spatial Perspective<br />Tweets (Death of Bin Laden) in Egypt VS tweets in India<br />May <br />2nd<br />Egypt<br />May<br /> 2nd<br />India<br />“RT @mvatlarge: U.S. has given Pakistani military nearly $20 billion since 9/11 for the privilege of housing bin Laden: http://is.gd/xegnFm”<br />“#Egypt foreign minister: Egygov't has no official comment but we condemn all forms of violence in international relations. #osama #obl”<br />
    25. 25. A sample of current research @ Kno.e.sis demonstrating role of semantics in Citizen Sensing & Social Media Analysis<br />
    26. 26. User-communityEngagement Analysis<br />
    27. 27. User-community Engagement<br />How do we understand the phenomenon of user participation (engagement) in topic discussions?<br />How communities form during the product launch?<br />What factors can attract users to engage in these communities, therefore further spreading the message?<br />How quickly we can disseminate information between resource providers and people in need of resources in case of emergency?<br />Image: http://itcilo.wordpress.com<br />
    28. 28. Analysis Framework:People-Content-network Analysis (PCNA)<br />28<br />Understanding User-Community Engagement by Multi-faceted Features: A Case Study on Twitter. SoME @ WWW2011. <br />
    29. 29. Three Sets of Features<br />Which set is more important?<br />Content features[Characteristics of tweets posted by active friends of U]:<br />keywords: number of event-relevant keywords<br />hashtags: number of event-relevant hashtags<br />retweet: number of retweets<br />mention: number of mentions<br />url: number of relevancy-adjust hyperlinks<br />Irrelevant hyperlink is given number -1<br />subjectivity: Subjectivity scores for words and emoticons<br />Linguistic Cues (LIWC1 analysis): Features for the language usage. Top-3 transformed features using Principle Component Analysis (PCA) extracted<br />Community features: [Characteristics of the active community/network under consideration]<br />wccSize: size of the weakly-connected component (WCC) which U’s friends belongs to in the active network.<br />wccPercent: ratio of wccSizeto the size of the active network.<br />connectivity: number of active friends (i.e. followees) in the community.<br />communitySize: size of the active community.<br />Author features [Characteristics of friends that U is following]:<br />Only friends in the active community are considered.<br />logFollower: logarithm of follower count<br />logFollowee: logarithm of followee count<br />Klout[1]: a integrated measure of user influence and popularity<br />Other profile information and activity history[2].<br />
    30. 30. Experiments: Results<br />Content is the key to understand user community engagement<br />High<br />Low<br />Performance<br />Network<br />Content<br />All<br />People<br />Event-Type<br />Summary of Prediction Accuracy (%)<br />Statistical significant results are in bold<br />30<br />
    31. 31. Background Knowledge to improve Social Data Analysis<br />Event oriented Community<br />External Knowledge bases<br />Dynamic Domain Model for the event<br />SEMANTIC ASSOCIATION TO UNDERSTAND ENGAGEMENT LEVEL &<br />IMPROVE IE<br />Social Network<br />User Profiles<br />Mined User Interests and User Types<br />
    32. 32. Analysing the Content can be Hard…<br />Is Merry Christmas a song? <br />If it is, which ‘Merry Christmas’since there are 60 songs of the same name.<br />‘So Good’ is also a song!<br />Using a domain model (E.g., MusicBrainz)<br />Using context cues from the content<br /><ul><li>e.g. new Merry Christmas tune</li></ul>Reduce potential entity spot size (with restrictions) <br /><ul><li>e.g. new albums/songs </li></ul> ‘Multimodal Social Intelligence in a Real-Time Dashboard System’ VLDB Journal 2010<br />
    33. 33. Real Time Social Media Data Analysis<br />
    34. 34. Motivation<br />People can’t wait for Information<br />Disaster Management<br />Ushahidi (www.ushahidi.org)<br />Real-Time Markets<br />RealTimeMarkets (http://www.realtimemarkets.com/) <br />Brand Tracking<br />Twarql (http://wiki.knoesis.org/index.php/Twarql)<br />Movie reviews<br />Flicktweets (www.flicktweets.com)<br />Journalism<br />
    35. 35. Scenarios<br />Brand Tracking<br />Give me a stream of locations whereKinectis being mentioned right now<br />Give me all people that have said negative things aboutKinect<br />How can we do this?<br />
    36. 36. Twarql (Twitter Feeds through SPARQL)<br />Semantically annotate tweets with entities, hashtags, URLs, sentiments, etc.<br />Encode content in a structured format (RDF) using shared vocabularies (FOAT, SIOC, MOAT, etc.)<br />Structured querying of tweets<br />Subscribe to a stream of tweets that match a given query<br />Real-time delivery of streaming data.<br />More: TWARQL<br />
    37. 37. Twarql Architecture<br />
    38. 38. Back to the Scenario<br />Give me a stream of locations where Kinect is being mentioned now<br />Give me all people that have said negative things about Kinect<br />
    39. 39. Dynamic Domain Models for Semantic Analysis of Real-Time DataakaContinuous Semantics<br />
    40. 40. Motivation<br />Semantic processing using a model of the domain<br />But it is difficult to model dynamic domains on social web<br />spontaneous (arising suddenly)<br />real-time data requiring continuous searching and analysis<br />distributed participants with fragmented and opinionated information<br />diverse viewpoints involving topical or contentious subjects<br />feature context colored by local knowledge as well as perceptions based on different observations and their socio-cultural background.<br />More: Continuous Semantics<br />
    41. 41. Dynamic Model Creation<br />Heliopolis is a suburb of Cairo.<br />
    42. 42. Dynamic Evolving Models to underpin Semantics<br />“Reports from Azadi Square - 4 people killed by police, people killed police who shot. More shots being fired #iranelections”<br />“Both Ahmadinejad& Mousavideclare victory in Iranian Elections.”<br />“situation in tehran University is so worrisome. police have attacked to girls dormitory #tehran #iranelection”<br />Events<br />June 13 2009<br />June 15 2009<br />June 12 2009<br />Ahmadinejad & Mousavi are politicians in Iran<br />Tehran University is a University in Iran<br />Azadi Square is a city square in Tehran<br />Key phrases<br />Models<br />
    43. 43. Sentiment/Opinion Extraction<br />
    44. 44. Challenges<br />Domain/Topic-dependency: spotting the target of the sentiment is as important as finding sentiment itself<br />E.g., “longriver” (no sentiment), “long battery life” (positive), or “long time for downloading” (negative).<br />Context-awareness : encoding the context information into the extracted sentiment<br />E.g., “mustwatch a movie today” (no sentiment) and “this movie is a must see” (positive). <br />Informal language (abbreviations, misspelling, slang...): using Urban Dictionary<br />
    45. 45. The Usage of Background Knowledge <br />
    46. 46. Real Time Feature Stream<br />
    47. 47. Real-Time Sensor, Social, Multi-media data<br />2010’s<br />Dynamic User Generated Content<br />2000’s<br />1990’s<br />Static Document and files <br />Web DATA evolved over time<br />
    48. 48. Semantic Abstraction<br />Does -1,-2,-4,-4 make any sense to you?<br />Overwhelming amount of raw sensor data doesn’t make much sense to decision makers<br />Freezing temperature<br />Rain<br />What if the data is from sensors on a highway?<br />Freezing temperature + rain => icy road<br />Close the highway? OR <br /> Spread salt on the road to melt ice?<br />So what?<br />Semantic Sensor Web Demo<br />
    49. 49. A cross-country flight from New York to Los Angeles on a Boeing 737 plane generates a massive 240 terabytes of data<br />- GigaOmni Media<br />But a pilot or a ground engineer at the destination is interested in very small<br />number of events and associated observational data that are relevant to their work.<br />Higginbotham, S. (2010, September). Sensor Networks Top Social Networks for Big Data.<br />Gigaom.com. http://gigaom.com/cloud/sensor-networks-top-social-networks-for-big-data-2/.<br />49<br />
    50. 50. Background Knowledge<br />Blizzard<br />Rain Storm<br />ABSTRACTION<br />Huge amount of <br />Raw Sensor Data<br />Features representing Real-World events<br />Abstraction<br />More: Semantic Sensor Web<br />
    51. 51. Weather Alert Application<br />Detection of events, such as blizzards, from weather station observations<br />
    52. 52. Evaluation<br /><ul><li>Data Used: Nevada Blizzard (April 1st – April 6th)</li></ul>Show me which places had blizzard in past 24 hrs<br />70% Data clear<br />30% Feature Observed<br />
    53. 53. Evaluation (cont.)<br />Abstraction can:<br />Make sense out of raw data<br />Greatly reduce the size of data<br /><ul><li>Data Used: Nevada Blizzard (April 1st – April 6th)</li></ul>Amount of Raw Sensor Data<br />Amount of Abstraction Data<br />Semantic Scalability<br />
    54. 54. Traffic Application<br />Reasons can be found via other types data: tweets, news papers, etc.<br />Sensor data: 10 passing cars per minute<br />What might be the reason?<br />
    55. 55. Integration and Abstraction of Traffic DataStratified explanation<br />Knowledge Model Empowered Abstraction<br />Knowledge Models<br />Different types of Data<br />
    56. 56. Take Home Message<br />Amount of citizen sensing (and machine sensing) data is huge, varied, and growing rapidly. Search and Sift won’t work. <br />
    57. 57. Take Home Message (Cont.)<br />Semantics play a key role in refering "meaning" behind the data. Requires progress from keywords -> entities -> relationships -> events, from raw data to human-centric abstractions.<br />
    58. 58. Take Home Message (Cont.)<br />Wide variety of semantic models and KBs(vocabularies, social dictionaries, community created semi-structured knowledge, domain-specific datasets, ontologies)empower semantic solutions. This can lead to Semantic Scalability – scalability that is meaningful to human activities and decision making.<br />
    59. 59. Interested in more?<br />Kno.e.sis Wiki for the following and more:<br />Computing for Human Experience<br />Continuous Semantics to Analyze Real-Time Data<br />Semantic Modeling for Cloud Computing<br />Citizen Sensing, Social Signals, and Enriching Human Experience<br />Semantics-Empowered Social Computing<br />Semantic Sensor Web <br />Traveling the Semantic Web through Space, Theme and Time <br />Relationship Web: Blazing Semantic Trails between Web Resources <br />SA-REST: Semantically Interoperable and Easier-to-Use Services and Mashups<br />Semantically Annotating a Web Service<br />Tutorial: Citizen Sensor Data Mining, Social Media Analytics and Development Centric Web Applications (WWW2011)<br />Partial Funding: NSF (Semantic Discovery: IIS: 071441, Spatio Temporal Thematic: IIS-0842129), AFRL and DAGSI (Semantic Sensor Web), Microsoft Research (Semantic Search) and IBM Research (Analysis of Social Media Content),and HP Researh(Knowledge Extraction from Community-Generated Content).<br />
    1. A particular slide catching your eye?

      Clipping is a handy way to collect important slides you want to go back to later.

    ×