Entity-Based Semantics Emerging from Personal Awareness Streams
 

  • Comment on why the BLT sub is different for each person. Stress that the lightweight ontologies emerging are personal ontologies: the way in which a user refers to or characterises a set of entities. Go more slowly in the evaluation .. Say what you
  • The past few years have seen the launch of different social networking platforms that allow users to expose their online presence, create groups and build bridges for communicating within their online social spheres. The high usage of these platforms has generated an enormous amount of personal information online, creating unprecedented opportunities for a wide range of research related to knowledge management, user contextualisation, and the Semantic Web. As we have talked about this morning, there is a large amount of online personal information provided by social web platforms.
  • Social Awareness Streams is a term introduced by Naaman et al. In their work they examined characteristics of social activity and patterns of communication on Twitter. They defined Social Awareness Streams as a collection of semi-public, natural-language messages produced by different users and characterised by their brevity. These streams have revealed new patterns of communication characterised by high social connectivity and the ability to communicate trends. They classified users as .. In particular, the content coming from microblog services has acquired a new name: “social awareness streams”.
  • In their study they categorised messages as: Information sharing .. Self .. Based on a data set consisting of three weeks of activity of 350 users, they found that the four dominant categories were Information Sharing, Opinions/Complaints, Statements and Random Thoughts, and Me Now. We do tweet a lot about ourselves.
  • Based on the definition of Naaman, Wagner and Strohmaier introduce a notation for characterising these streams. They refer to this notation as a Tweetonomy. We’ll talk about it later on. They also introduce the term Personal Awareness Stream for referring to the c
  • In this work we explore whether personal awareness streams can convey meaningful information for modelling user context.
  • User context is about understanding the user’s relationships to the people, places and things in their micro worlds.
  • Modelling user context can help to predict services and information that can be useful for you. For example, a common weekday consists of waking up, going to the train station, working till 5, going shopping and playing some games at night. Modelling you and the entities around you could trigger services such as suggestions that are useful in the correct places at the correct times.
  • Some of the existing work: Java correlates the differences in users’ network connection structures with the users’ activity types. Ramage uses LDA for generating a topical classification of tweets obtained from the public stream, finding four dimensions. Krishnamurthy presents a characterisation of the Twitter social network, using patterns in geographic growth and social activity.
  • Our methodology is based on the use of lightweight ontologies and the Tweetonomy formalisation of Wagner. A Tweetonomy consists of a set of users qualified by a label given by q1, for example the author of the tweet or the person who retweeted a message (in our case we only consider authors); a set of messages qualified by q2, which could be a direct message, a retweeted message, etc. (in this case we only consider direct messages); a set of resources contained in the messages, which are qualified by q3 (in our case ..); and T, a ternary relation which associates a user, a message and resources.
  • Our approach for modelling context consisted of observing the way in which a user referred to entities within their messages. We categorised these entities as being part of the following aspects ..
  • If we continue following him, we would see that he starts featuring the entities he usually talks about. In this work we analyse these data structures by considering facets. For example, we can analyse all the keywords related to the messages in which @johnbinns appeared by obtaining a Keyword/People matrix and projecting on @johnbinns, defining in this way a lightweight ontology. From this matrix we can already derive .. In the same way we can derive data structures for keywords and locations, and for keywords and time. These matrices tend to be quite sparse, so methods such as Principal Component Analysis are used for highlighting only those keywords which provide more information to the whole set.
  • In order to analyse the simultaneous correlation of different facets for deriving common concepts, we encapsulated each matrix in terms of K-K (keyword by keyword) and added them to a tensor. Having them encapsulated in this form, we performed a Tucker decomposition, which is a higher-order Principal Component Analysis.
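
The notes do not say which ranks or which algorithm were used for the Tucker decomposition. As a minimal sketch under that caveat, the pure-Python code below computes a rank-(1,1,1) truncated higher-order SVD, a simple special case of a Tucker decomposition: one leading factor per mode (found by power iteration on each mode-n unfolding) plus a scalar core. The names `unfold` and `leading_factor`, the `tensor[i][j][k]` layout and the toy tensor are illustrative assumptions, not the authors' implementation.

```python
from math import sqrt

def unfold(tensor, mode):
    """Mode-n unfolding of a 3-way tensor stored as nested lists tensor[i][j][k]."""
    I, J, K = len(tensor), len(tensor[0]), len(tensor[0][0])
    if mode == 0:
        return [[tensor[i][j][k] for j in range(J) for k in range(K)] for i in range(I)]
    if mode == 1:
        return [[tensor[i][j][k] for i in range(I) for k in range(K)] for j in range(J)]
    return [[tensor[i][j][k] for i in range(I) for j in range(J)] for k in range(K)]

def leading_factor(matrix, iterations=200):
    """Leading left singular vector of `matrix`, by power iteration on matrix * matrix^T."""
    n = len(matrix)
    gram = [[sum(a * b for a, b in zip(matrix[r], matrix[s])) for s in range(n)]
            for r in range(n)]
    v = [1.0] * n  # starting vector; adequate for non-negative co-occurrence data
    for _ in range(iterations):
        w = [sum(gram[r][s] * v[s] for s in range(n)) for r in range(n)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            return v  # all-zero matrix: nothing to extract
        v = [x / norm for x in w]
    return v

# Toy tensor (hypothetical data): an exact outer product a x b x c.
a, b, c = [1.0, 2.0], [2.0, 1.0], [1.0, 1.0, 2.0]
tensor = [[[x * y * z for z in c] for y in b] for x in a]

# One factor vector per mode, then the scalar core of the rank-(1,1,1) model.
factors = [leading_factor(unfold(tensor, mode)) for mode in range(3)]
core = sum(tensor[i][j][k] * factors[0][i] * factors[1][j] * factors[2][k]
           for i in range(2) for j in range(2) for k in range(3))
```

On real data one would keep several components per mode; large entries in a factor vector then point to the keywords or facets that contribute most to a component.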
  • For evaluating this technique we followed .. In the absence of a gold standard, evaluating the concepts that emerge from a user’s social aggregation given a context is a difficult task; it requires consulting the author of the social stream whose context-induced concepts are being mapped. Although this technique allows you to simultaneously evaluate as many contexts as you want, we decided to model contexts defined by pairs of facets like ..
  • We evaluated the mean average precision. Microblogging verbosity provided a better basis for deriving meaningful concepts; the relevance of the concepts given a context depended highly on the user’s patterns of correlating the entities through keywords.
  • We evaluated the mean average precision. Users tend to forget what they have tweeted about, so entity relations decay in time … In our experiments
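
For reference, mean average precision can be computed as in the sketch below. The notes do not give the exact variant used, so this assumes the common definition: average precision per ranked list, normalised by the number of relevant items, then averaged over all evaluated contexts. The function names and the toy rankings are hypothetical.

```python
def average_precision(ranked, relevant):
    """Average precision of one ranked concept list against the concepts
    the stream's author judged relevant."""
    hits = 0
    total = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(relevant) if relevant else 0.0

def mean_average_precision(runs):
    """Mean of average precision over (ranked, relevant) pairs, one per context."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(ranked, relevant) for ranked, relevant in runs) / len(runs)

# Toy judgements (hypothetical): two contexts evaluated by the same user.
runs = [
    (["work", "fan", "alcohol", "therapy"], {"work", "alcohol"}),
    (["Leeds", "Sheffield"], {"Sheffield"}),
]
map_score = mean_average_precision(runs)
```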

Entity-Based Semantics Emerging from Personal Awareness Streams: Presentation Transcript

  • Capturing Entity-Based Semantics Emerging from Personal Awareness Streams
    A.E. Cano, S. Tucker, F. Ciravegna
    The Oak Group, Department of Computer Science, The University of Sheffield
  • Outline
    Introduction
    Related Work
    Social Stream Aggregation and Entity-Based Concept Induction
    Modelling Context with Personal Awareness Streams
    Methodology
    Evaluation
    Conclusions
  • Introduction
    Social Awareness Streams
    Collection of semi-public, natural language messages produced by different users and characterised by their brevity
    People talk a lot about themselves!!
    [1] M. Naaman, J. Boase, and C. H. Lai. Is it really about me?: message content in social awareness streams. In CSCW ’10: Proceedings of the 2010 ACM conference on Computer supported cooperative work, pages 189–192, New York, NY, USA, 2010. ACM.
  • Introduction
    Personal Awareness Streams
    Collection of semi-public, natural language messages produced by a user and characterised by their brevity
    [2] C. Wagner and M. Strohmaier. The wisdom in tweetonomies: Acquiring latent conceptual structures from social awareness streams. In Proc. of the Semantic Search 2010 Workshop (SemSearch2010), April 2010.
  • Introduction
    Can personal awareness streams convey meaningful information for modelling user context?
  • Introduction
    Modelling User Context
    People
    Location
    Things
  • Introduction
    Modelling User Context
    Relationships:
    - Semantic - Spatial
    - Social - Temporal
  • Introduction
    Modelling User Context: what for?
    [Figure: daily timeline (8:00, 9:00, 13:00, 17:00, 20:00; M-F and S-S) annotated with context-triggered suggestions: "BLT offer, 500m", "Tuna", "Suggested by a, b, c"]
  • Related Work
    Social Awareness Streams
    • Java et al. [3] present an analysis of Twitter which suggests that the differences in users’ network connection structures can be explained by the following types of user activities: information seeking, information sharing and social activity.
    • Ramage et al. [4] apply labelled Latent Dirichlet Allocation (LDA) for mapping content of the public Twitter feed into four dimensions, including style and substance.
    • Krishnamurthy et al. [5] present a characterisation of the Twitter social network, which includes patterns in geographic growth and users’ social activity.
  • Related Work
    Social Awareness Streams Using Linked Data
    • Wagner and Strohmaier [2] introduce the Tweetonomy model
    • Formalisation of social awareness streams.
    • Based on lightweight associative ontologies.
    • Stankovic et al. [6] study conference-related tweets.
    • Map tweets to the talks and sub-events that they refer to.
    • Using linked data they derive additional knowledge about event dynamics and user activities.
  • Related Work
    Our work differs from existing work in …
    • Focus on deriving person-based lightweight ontologies from personal awareness streams, which enrich concepts and reveal structures that are meaningful to the owner of the stream.
    • Analyse the content of the messages not only in terms of traditional resources such as hashtags and links, but also in terms of entities (e.g. location, people, organisations and time).
    • Present a methodology based on tensor analysis that allows the definition of entity-based context for deriving person-based ontologies.
  • Social Stream Aggregation and Entity-Based Concept Induction
    Defining a Tweetonomy
    Tweetonomy: $S = \{U_{q1}, M_{q2}, R_{q3}, T, ft\}$
    $U_{q1}$: users, q1 = {author}
    $M_{q2}$: messages, q2 = {direct message}
    $R_{q3}$: resources, q3 = {Links, Hash tags, Location, People, Places, Organisation}
    $T$: ternary relation, $T \subseteq U \times M \times R$
    $ft$: function that assigns a temporal marker to each ternary edge.
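
One way to hold such a structure in code is sketched below. This is only an illustration of the tuple S = {U_q1, M_q2, R_q3, T, ft}; the class name, field names, method names and the toy stream are assumptions and do not come from the slides or from Wagner and Strohmaier's work.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple

Edge = Tuple[str, str, str]  # (user, message id, resource)

@dataclass
class Tweetonomy:
    """Illustrative container for S = {U_q1, M_q2, R_q3, T, ft}."""
    users: Set[str] = field(default_factory=set)                # U, q1 = author
    messages: Set[str] = field(default_factory=set)             # M, q2 = direct message
    resources: Dict[str, str] = field(default_factory=dict)     # R: resource -> q3 label
    edges: Set[Edge] = field(default_factory=set)               # T, a subset of U x M x R
    timestamps: Dict[Edge, int] = field(default_factory=dict)   # ft: edge -> temporal marker

    def add(self, user: str, message: str, resource: str, q3_label: str, timestamp: int) -> None:
        """Register one ternary edge together with its temporal marker."""
        edge = (user, message, resource)
        self.users.add(user)
        self.messages.add(message)
        self.resources[resource] = q3_label
        self.edges.add(edge)
        self.timestamps[edge] = timestamp

    def resources_with_label(self, q3_label: str) -> Set[str]:
        """All resources qualified by a given q3 label, e.g. 'Location'."""
        return {r for r, label in self.resources.items() if label == q3_label}

# Toy stream (hypothetical data): one author, two messages.
stream = Tweetonomy()
stream.add("user_a", "m1", "Sheffield", "Location", 1)
stream.add("user_a", "m1", "#work", "Hash tags", 1)
stream.add("user_a", "m2", "@Johnbinns", "People", 2)
```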
  • Social Stream Aggregation and Entity-Based Concept Induction
    Modelling User Context with a Tweetonomy
    Keyword/People: $O_{kp} = (R_k M)(M R_p)$
    [Figure: keywords (fan, work, therapy, alcohol) related to people (@Johnbinns, @Tony)]
  • Social Stream Aggregation and Entity-Based Concept Induction
    Modelling User Context with a Tweetonomy
    Keyword/Location: $O_{kl} = (R_k M)(M R_l)$
    [Figure: keywords (alcohol, fan, work, therapy) related to locations (Leeds, Sheffield)]
  • Social Stream Aggregation and Entity-Based Concept Induction
    Modelling User Context with a Tweetonomy
    Keyword/Time: $O_{kt} = (R_k M)(M R_t)$
    [Figure: keywords (alcohol, fan, work, therapy) related to time (Morning, Rest of the Day)]
  • Social Stream Aggregation and Entity-Based Concept Induction
    Modelling User Context with a Tweetonomy
    What are the concepts that emerge when analysing BigGayShaun in the context of Sheffield (Location), @Johnbinns (Person), during the evening?
    $O_{kp} = (R_k M)(M R_p)$
    $O_{kl} = (R_k M)(M R_l)$
    $O_{kt} = (R_k M)(M R_t)$
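
The products above can be sketched in plain Python. This reads (R_k M) as a keyword-by-message incidence matrix and (M R_p) as a message-by-people incidence matrix, which is an interpretation of the slide notation rather than something the slides spell out; the helper names and the three toy messages are hypothetical.

```python
from typing import Dict, List, Set

def incidence(rows: List[str], messages: Dict[str, Set[str]]) -> List[List[int]]:
    """rows x messages 0/1 matrix: 1 if the row's resource occurs in the message."""
    return [[1 if r in content else 0 for content in messages.values()] for r in rows]

def transpose(m: List[List[int]]) -> List[List[int]]:
    return [list(col) for col in zip(*m)]

def matmul(a: List[List[int]], b: List[List[int]]) -> List[List[int]]:
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

# Toy messages (hypothetical): each message is the set of resources it mentions.
messages = {
    "m1": {"work", "@Johnbinns"},
    "m2": {"alcohol", "@Johnbinns", "@Tony"},
    "m3": {"work", "alcohol", "@Tony"},
}
keywords = ["alcohol", "work"]
people = ["@Johnbinns", "@Tony"]

R_k_M = incidence(keywords, messages)            # keywords x messages
M_R_p = transpose(incidence(people, messages))   # messages x people
O_kp = matmul(R_k_M, M_R_p)                      # keywords x people co-occurrence counts

# Projecting on one person gives the keywords associated with that person.
johnbinns_column = people.index("@Johnbinns")
keywords_for_johnbinns = {k: O_kp[i][johnbinns_column] for i, k in enumerate(keywords)}
```

Swapping `people` for a list of locations or time slots yields the keyword/location and keyword/time matrices in the same way.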
  • Social Stream Aggregation and Entity-Based Concept Induction
    Modelling User Context with a Tweetonomy
    Given P lightweight ontologies characterising a user’s social streams consisting of N messages, we define a tensor $\mathcal{O} \in \mathbb{R}^{N \times N \times P}$ consisting of frontal slices of the form $O_p = B_p B_p^T$ with $p = 1, \dots, P$, where $B_p$ is a bipartite ontology.
    What are the concepts that emerge when analysing BigGayShaun in the context of Sheffield (Location (1)), @Johnbinns (Person (2)), during the evening (Time (3))?
    $O^{(1)} = O_{kl} (O_{kl})^T$
    $O^{(2)} = O_{kp} (O_{kp})^T$
    $O^{(3)} = O_{kt} (O_{kt})^T$
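
A minimal sketch of how the frontal slices could be assembled is shown below. Each bipartite matrix is multiplied by its own transpose, giving square keyword-by-keyword slices that can be stacked. The helper names, the slice-first layout `tensor[p][i][j]` and the small matrices are assumptions for illustration only.

```python
from typing import List

Matrix = List[List[int]]

def gram(b: Matrix) -> Matrix:
    """Return B * B^T for a matrix given as a list of rows."""
    return [[sum(x * y for x, y in zip(r, s)) for s in b] for r in b]

def build_tensor(ontologies: List[Matrix]) -> List[Matrix]:
    """Stack one square slice per bipartite ontology; result is tensor[p][i][j]."""
    slices = [gram(b) for b in ontologies]
    size = len(slices[0])
    if any(len(s) != size for s in slices):
        raise ValueError("all ontologies must share the same row (keyword) set")
    return slices

# Toy bipartite matrices over the same two keywords (hypothetical counts).
O_kl = [[1, 0], [1, 2]]  # keywords x locations
O_kp = [[1, 2], [1, 1]]  # keywords x people
O_kt = [[2, 0], [0, 1]]  # keywords x time slots

tensor = build_tensor([O_kl, O_kp, O_kt])
```

Because every slice is indexed by the same keywords on both axes, facets with different numbers of columns (locations, people, time slots) end up in slices of equal size.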
  • Evaluation
    Data Set
    Four active microbloggers
    From Jul - Sep 2010
    From each message, entities were extracted using the OpenCalais service.
  • Evaluation
    Concepts in the context of Hashtags-Places-Time
  • Evaluation
    • User-based evaluation:
    Consulting the author of the social stream whose context-induced concepts are being mapped.
    Evaluated contexts: hashtag-time, location-people, and organisation-people
  • Evaluation
    Higher lexical diversity (K/M) leads to better MAP results (see Figure 3 b)); this is an expected result, since CSISSA explores the way in which an entity is linked to another one through keywords.
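
As a small illustration of the K/M ratio, the sketch below assumes that K is the number of distinct keywords in a user's stream and M the number of messages; the slide does not define the symbols, so that reading, the function name and the toy messages are all assumptions.

```python
from typing import Iterable, List

def lexical_diversity(messages: List[Iterable[str]]) -> float:
    """K / M: distinct keywords divided by number of messages."""
    keywords = set()
    for tokens in messages:
        keywords.update(tokens)
    return len(keywords) / len(messages) if messages else 0.0

# Toy stream (hypothetical): three messages, four distinct keywords.
toy_stream = [["work", "fan"], ["work", "alcohol"], ["therapy"]]
diversity = lexical_diversity(toy_stream)
```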
  • Evaluation
    Highlights
    • Users tend to forget what they’ve tweeted about.
    • Entity relationships decay with time.
    • The relevance of users’ streaming topics was in many cases volatile; further research is necessary to address these issues.
  • Conclusions
    Awareness streams can be used to model context by leveraging the user’s entity affiliations.
    In our experiments a fairly naive approach was taken by not considering the ambiguity with which users can relate two entities through a keyword.
    Future work considers:
    Introduction of concept disambiguation for tackling this issue.
    Use of this approach for merging user contexts in pervasive environments.
  • References
    [1] M. Naaman, J. Boase, and C. H. Lai. Is it really about me?: message content in social awareness streams. In CSCW ’10: Proceedings of the 2010 ACM conference on Computer supported cooperative work, pages 189–192, New York, NY, USA, 2010. ACM.
    [2] C. Wagner and M. Strohmaier. The wisdom in tweetonomies: Acquiring latent conceptual structures from social awareness streams. In Proc. of the Semantic Search 2010 Workshop (SemSearch2010), April 2010.
    [3] A. Java, X. Song, T. Finin, and B. Tseng. Why we twitter: understanding microblogging usage and communities. In WebKDD/SNA-KDD ’07: Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, pages 56–65, New York, NY, USA, 2007. ACM.
    [4] D. Ramage, D. Hall, R. Nallapati, and C. D. Manning. Labeled LDA: a supervised topic model for credit attribution in multi-labeled corpora. In EMNLP ’09: Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, pages 248–256, Morristown, NJ, USA, 2009. Association for Computational Linguistics.
    [5] B. Krishnamurthy, P. Gill, and M. Arlitt. A few chirps about twitter. In WOSN ’08: Proceedings of the first workshop on Online social networks, pages 19–24, New York, NY, USA, 2008. ACM.
    [6] M. R. M. Stankovic and P. Laublet. Mapping tweets to conference talks: A goldmine for semantics. In Proceedings of Social Data on the Web workshop, ISWC 2010, Shanghai, China, 2010.
  • SlideShare
    http://www.slideshare.net/ampaeli/modellingContext