Detecting Signals from Real-time Social Web

Amit Sheth's presentation at the ‘Semantic Social Networking’ panel at Semantic Technology Conference, June 24, 2010. http://bit.ly/bp81jl


  • My research has focused on three challenges in understanding user-generated content (UGC), all with the goal of adding structure to unstructured content.

  • In each of these areas I have contributed specific algorithms and techniques, several of which have been published.

    Mention names of techniques.

    Collaborations.
  • The first work I want to tell you about is a joint collaboration with researchers at IBM over the last two years.

    It is a deployed social web application aimed at real-time analytics of music popularity using data from social networks: basically, using crowd-sourced social intelligence for business intelligence.

  • BBC SoundIndex: a platform for ingesting content from popular online sources of music discussion to generate Billboard-like popularity charts, except from user chatter.

    Differs from traditional polling.
  • There are two kinds of data that go into SoundIndex.

    One is structured: here you are seeing the structured metadata for artists,
    but this also includes structured attention metadata (user listens, plays).

    The second type is unstructured text.
    Its significant volume reflects user attention to this space.

    Ingesting into a common format: fetch and process are separate steps.

    Point polling, along with ongoing verification by subject-matter experts (DJs).
  • Top 45 - showing 10

    However, for SoundIndex (SI) we were interested in one-dimensional lists.

    Talk about ordering overlaps.
  • We conclude that new opportunities for self expression on the web provide a more accurate place to gather data on what people are really interested in than traditional methods. The even stronger results from the younger audience suggests that this trend is, if anything, accelerating.
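
The "ordering overlaps" mentioned in the notes can be illustrated with a minimal sketch. It uses the two Top 10 lists that appear on slide 31 of the transcript (Billboard.com vs. the MySpace analysis) and reports which artists occur in both and at what rank. The function name `shared_ranks` is hypothetical and this is only one simple way to compare ranked lists, not necessarily the comparison used in the SoundIndex work.

```python
def shared_ranks(list_a, list_b):
    """Return (item, rank_in_a, rank_in_b) for items present in both lists.

    Ranks are 1-based; results follow the order of list_a.
    """
    rank_b = {item: i for i, item in enumerate(list_b, start=1)}
    return [
        (item, i, rank_b[item])
        for i, item in enumerate(list_a, start=1)
        if item in rank_b
    ]


# Top 10 lists as shown on slide 31 (week of Sept 22-28 '07).
billboard = [
    "Soulja Boy", "Kanye West", "Timbaland", "Fergie", "J. Holiday",
    "50 Cent", "Keyshia Cole", "Nickelback", "Pink", "Colbie Caillat",
]
myspace = [
    "T.I.", "Soulja Boy", "Fall Out Boy", "Rihanna", "Keyshia Cole",
    "Avril Lavigne", "Timbaland", "Pink", "50 Cent", "Alicia Keys",
]

shared = shared_ranks(billboard, myspace)
for artist, b_rank, m_rank in shared:
    print(f"{artist}: Billboard #{b_rank}, MySpace #{m_rank}")
print(f"{len(shared)} of {len(billboard)} artists appear in both lists")
```

On these two lists, half of the Billboard Top 10 also appears in the MySpace Top 10, though at different positions.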


Transcript

  • 1. Detecting Signals from Real-time Social Web Semantic Social Networking Panel @ STC 2010 June 24, 2010 Amit Sheth Kno.e.sis, Ohio Center of Excellence in Knowledge-enabled Computing Wright State University, Dayton, OH Thanks - Meena Nagarajan, Kno.e.sis
  • 2. Our Approach • Semantics of ‘Semantic Social Networking’ • Bottom-up and top-down • Statistical semantics powered by domain model semantics • Social Networks of Interest • Not the friend/peer/co-author network • Event/topic oriented dynamic networks
  • 3. Dynamic Social Networks: Citizen Journalism, Online Communities.. http://www.telegraph.co.uk/news/worldnews/asia/india/3530640/Mumbai-attacks-Twitter-and-Flickr-used-to-break-news-Bombay-India.html
  • 4. Other Areas of Focus
  • 5. Other Areas of Focus WHAT “I decided to check out Wanted demo today even though I really did not like the movie” “It was THE HANGOVER of the year..lasted forever.. so I went to the movies..bad choice picking “GI Jane” worse now” WHAT: Named entity recognition, topics..
  • 6. Other Areas of Focus “Looking for a cheap body shop mechanic in Dayton OH” - Transactional “Check out these links..” - Information Sharing “Where can I find a good psp cam” - Information Seeking WHAT: Named entity recognition, topics.. WHY: User intent identification ...
  • 7. Other Areas of Focus Male: “I graduated in '04 from USC... now working in Austin... I like stuff, and i like doing stuff. What stuff do you like to do?” Female: “Well Im a pretty easy going person. Love the outdoors and going camping, boating, fishing, short weekend trips,the horseraces, drag races, hanging out at home, doing yard work,or just watching movies or having BBQ's with friends.” WHAT: Named entity recognition, topics.. WHY: User intent identification ... HOW: Word usages and an active population..
  • 8. Other Areas of Focus
    WHAT (NER): “Context and Domain Knowledge Enhanced Entity Spotting in Informal Text”, The 8th International Semantic Web Conference, 2009; “A Measure of Extraction Complexity: a Novel Prior for Improving Recognition of Cultural Entities”, Manuscript in preparation
    WHY (Intents): “Monetizing User Activity on Social Networks - Challenges and Experiences”, International Conference on Web Intelligence, 2009
    HOW: “An Examination of Language Use in Online Dating Personals”, 3rd Int'l AAAI Conference on Weblogs and Social Media, 2009
  • 9. Sample showcases Social Computing @ Kno.e.sis • Social perceptions behind events : Twitris http://twitris.knoesis.org • Online popularity of music artists: BBC Sound Index (IBM Almaden) http://www.almaden.ibm.com/cs/projects/iis/sound/
  • 10. http://twitris.knoesis.org/ TWITRIS online pulse of a populace around news-worthy events.. Mumbai terror attack, Health care debate ..
  • 11. Chatter around news-worthy events.. Hundreds of tweets, Facebook posts, and blogs about a single event: multiple narratives, strong opinions, breaking news..
  • 12. TWITRIS : Twitter+Tetris • WHAT are people saying, WHEN and from WHERE • Browse citizen reports using social perceptions as the fulcrum • Citizen reports in context by overlaying it with Web articles!
  • 13. What, When and Where: The Power of Spatio-Temporal-Thematic slices
  • 14. 1. Preserving Social Perceptions The Health Care Reform Debate
  • 15. Zooming in on Florida
  • 16. Summaries of Citizen Reports
  • 17. Zooming in on Washington
  • 18. Summaries of Citizen Reports RT @WestWingReport: Obama reminds the faith-based groups "we're neglecting 2 live up 2 the call" of being R brother's keeper on #healthcare
  • 19. 2. Social Media in Context: SOYLENT GREEN and the HEALTH CARE REFORM. Information right where you need it! Find resources related to social perceptions: News and Wikipedia articles to put extracted descriptors in context. Cull well blog! Exploit spatio, temporal semantics for thematic aggregation
  • 20. Quick Show & Tell: http://twitris.knoesis.org
  • 21. Spatial Aggregation Assisted by a model of a domain/event... [remaining slide text not recoverable]
  • 22. Twitris - A Village Effort! We are very excited for what is to come! Stay Tuned! http://twitris.knoesis.org/
  • 23. Things we are working on.. • Factual vs. Opinionated tweets • Polarized opinions: what is breaking up a community • Joe Wilson: “You lie!” • Personalized Tweets: what do people like me think about X. • Customizing it to events you want to track! • Trust in Social Media & Content ...... and much more!
  • 24. BBC SoundIndex (IBM Almaden): Pulse of the Online Music Populace. http://www.almaden.ibm.com/cs/projects/iis/sound/ Daniel Gruhl, Meenakshi Nagarajan, Jan Pieper, Christine Robson, Amit Sheth: “Multimodal Social Intelligence in a Real-Time Dashboard System”, to appear in a special issue of the VLDB Journal on "Data Management and Mining for Social Networks and Social Media", 2010
  • 25. The Vision
    • Netizens do not always buy their music, let alone buy in a CD store.
    • Traditional sales figures are a poor indicator of music popularity.
    • What is ‘really’ hot?
    • BBC: Are online music communities good proxies for popular music listings?!
    • BBC SoundIndex - “A pioneering project to tap into the online buzz surrounding artists and songs, by leveraging several popular online sources” http://www.almaden.ibm.com/cs/projects/iis/sound/
  • 26. “Multimodal Social Intelligence in a Real-Time Dashboard System”, VLDB Journal 2010 Special Issue: Data Management and Mining for Social Networks and Social Media. Artist/Track Metadata. User metadata, unstructured, structured attention metadata
  • 27. “Multimodal Social Intelligence in a Real-Time Dashboard System”, VLDB Journal 2010 Special Issue: Data Management and Mining for Social Networks and Social Media. Album/Track identification Sentiment Identification Spam and off-topic comments UIMA Analytics Environment
  • 28. “Multimodal Social Intelligence in a Real-Time Dashboard System”, VLDB Journal 2010 Special Issue: Data Management and Mining for Social Networks and Social Media. Extracted concepts into explorable data structures
  • 29. “Multimodal Social Intelligence in a Real-Time Dashboard System”, VLDB Journal 2010 Special Issue: Data Management and Mining for Social Networks and Social Media. What are 18 year olds in London listening to?
  • 30. “Multimodal Social Intelligence in a Real-Time Dashboard System”, VLDB Journal 2010 Special Issue: Data Management and Mining for Social Networks and Social Media. What are 18 year olds in London listening to? Crowd-sourced preferences
  • 31. The Word on the Street: Billboard's Top 50 Singles chart during the week of Sept 22-28 ’07 vs. MySpace popularity charts
    Table 8: Billboard’s Top Artists vs. our generated list (showing Top 10)
    Billboard.com     MySpace Analysis
    Soulja Boy        T.I.
    Kanye West        Soulja Boy
    Timbaland         Fall Out Boy
    Fergie            Rihanna
    J. Holiday        Keyshia Cole
    50 Cent           Avril Lavigne
    Keyshia Cole      Timbaland
    Nickelback        Pink
    Pink              50 Cent
    Colbie Caillat    Alicia Keys
  • 32. The Word on the Street: Billboard's Top 50 Singles chart during the week of Sept 22-28 ’07 vs. MySpace popularity charts (Table 8: Billboard’s Top Artists vs. our generated list, showing Top 10)
    * Several overlaps: top artists appear in both lists
    * Predictive power of MySpace: Billboard next week looked a lot like MySpace this week..
    * Teenagers are big music influencers [MediaMark2004]
  • 33. Powerful Proxies for Popularity
    • “Which list more accurately reflects the artists that were more popular last week?”
    • 75 participants
    • Overall 2:1 preference for MySpace list
    • Younger age groups: 6:1 (8-15 yrs)
    Challenging traditional polling methods!
    Table 7: Annotation Statistics
    38% of total comments were spam
    61% of total comments had positive sentiments
    4% of total comments had negative sentiments
    35% of total comments had no identifiable sentiments
    Table 8: Billboard’s Top Artists vs. our generated list
    Paper excerpt: "As described in Section 8, the structured metadata (artist name, timestamp, etc.) and annotation results (spam/non-spam, sentiment, etc.) were loaded in the hypercube. The data represented by each cell of the cube is the ..."
  • 34. Details here.. Social Computing research at Kno.e.sis http://knoesis.wright.edu/research/semweb/projects/socialmedia/ Meena Nagarajan’s research on understanding user-generated content http://knoesis.wright.edu/researchers/meena/
  • 35. Semantic Social Networking Panel @ STC 2010 • How can we use the Social Web to detect and observe signals from real time social data? • How to study diversity and change, identify patterns of interactions, and extract insights • What can we learn about social perceptions of real time events? • Tools for visualization and analysis in space, time and theme • Can social network analysis be trusted? • Capturing social network content to track and analyze buyer preferences, shopping experience, demographics, and other characteristics that influence purchasing behavior
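
Slides 28 to 30 describe loading extracted concepts into explorable data structures and asking questions such as "What are 18 year olds in London listening to?". The following is a minimal sketch of that idea, assuming a tiny in-memory cube keyed by (artist, age, city) that counts non-spam comments. The field names, the functions `build_cube` and `top_artists`, and the sample records are all hypothetical and made up for illustration; the actual SoundIndex system is described in the VLDB Journal paper cited above.

```python
from collections import Counter


def build_cube(comments):
    """Count non-spam comments per (artist, age, city) cell."""
    cube = Counter()
    for c in comments:
        if c["spam"]:
            continue  # spam and off-topic comments are filtered out before counting
        cube[(c["artist"], c["age"], c["city"])] += 1
    return cube


def top_artists(cube, age=None, city=None, n=3):
    """Slice the cube on age and/or city, then rank artists by comment count."""
    totals = Counter()
    for (artist, cell_age, cell_city), count in cube.items():
        if age is not None and cell_age != age:
            continue
        if city is not None and cell_city != city:
            continue
        totals[artist] += count
    return totals.most_common(n)


# Hypothetical annotated comments (illustrative data only, not real results).
comments = [
    {"artist": "Rihanna", "age": 18, "city": "London", "spam": False},
    {"artist": "Rihanna", "age": 18, "city": "London", "spam": False},
    {"artist": "Rihanna", "age": 18, "city": "London", "spam": True},
    {"artist": "Pink", "age": 18, "city": "London", "spam": False},
    {"artist": "T.I.", "age": 25, "city": "London", "spam": False},
    {"artist": "Pink", "age": 18, "city": "Dayton", "spam": False},
]

cube = build_cube(comments)
london_18 = top_artists(cube, age=18, city="London")
print(london_18)
```

Leaving `age` or `city` as `None` aggregates over that dimension, which is how a one-dimensional list of artists can be obtained from a multi-dimensional cube.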