Searching the Social Web: The Challenges of Socially-Connected Search. IR Leadership Seminar 2008 / Ofer Egozi
The problem… What to choose? Whom to trust??...
… The solution? What to choose? Whom to trust??...
The solution? Socially-connected search: leveraging the Social Graph in web search. Focused crawling; personalized ranking; Delver is a first-mover. What to choose? Whom to trust??...
The solution? Socially-connected search. Trusted results: friends qualify content/sources; potential contact in reach; spam is inherently low. Reasoning over results: ranking is transparent; easier to assess relevance. Network discovery: experts in my network; serendipity. What to choose? Whom to trust??...
Outline: Approaches to Social Search; The Social Graph; Graph-Related Challenges; Search-Related Challenges
Humans in the loop. Search = crawl + index + rank + query. Crawling (Dmoz, Mahalo); indexing (del.icio.us, Flickr); querying (ChaCha); ranking – that’s what we’ll discuss…
A Taxonomy of Social Search. Aggregated, Personalized; Network-based, Behavior-based; ? ?
The Social Graph. A directed, cyclic graph: nodes are people (identities); edges are relations between them. A large portion is public on social networks; a lot isn’t (cellular, email, non-digital). Emerging web standards: OpenID/hCard (identifier/identity); Contact APIs/PoCo/XFN (private/public contact lists); FB Connect (a full proprietary framework).
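A minimal sketch of the structure this slide describes, assuming a simple in-memory adjacency map. The class name `SocialGraph`, its methods, and the identities used below are illustrative assumptions, not part of the talk or of any Delver API.

```python
from collections import defaultdict


class SocialGraph:
    """Directed graph: nodes are identities, edges are labeled relations."""

    def __init__(self):
        # identity -> {neighbor identity -> set of relation labels}
        self._out = defaultdict(lambda: defaultdict(set))

    def add_relation(self, source, target, label):
        """Add a directed edge source -> target carrying a relation label."""
        self._out[source][target].add(label)
        # Touch the target so it exists as a node even without outgoing edges.
        _ = self._out[target]

    def neighbors(self, identity):
        """Identities that `identity` points to (empty set if unknown)."""
        return set(self._out[identity]) if identity in self._out else set()

    def relations(self, source, target):
        """Relation labels on the edge source -> target (empty set if none)."""
        return set(self._out.get(source, {}).get(target, set()))


graph = SocialGraph()
graph.add_relation("joe", "dana", "friend-of")
graph.add_relation("dana", "joe", "friend-of")    # cycles are allowed
graph.add_relation("joe", "expert42", "follows")  # a one-way relation
```

Keeping the relation label on the edge leaves room to distinguish "friend-of" from "follows" later, for example when weighting edges.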
Social Graph in Research. Extraction from interactions: email (Van Alstyne et al. 2003), chat (Tuulos & Tirri 2004), IM (Lang 2004). Correlation with “physical”: Bluetooth contact (Mtibaa et al. 2008). Security and privacy: shared knowledge authentication (Toomim et al. 2008); graph link privacy (Xu 2008), (Korolova et al. 2008). Enhancing IR ranking: index friends’ browse history (Mislove et al. 2006); rank by author centrality (Kirchhoff et al. 2008); rerank by sampling network click-log (Das et al. 2008).
So first we need to draw the graph…
Social Graph - challenges. Social graph nodes: identities/relations across networks. [Diagram: identities labeled Joe, Joe, JJ123 and Joey, connected by friend-of and follows edges]
 
Social Graph - challenges. Social graph nodes: identities/relations across networks; identity impersonation; non-individual identities (groups, shared authorship…); privacy is an issue, even with public data. Social graph edges: relation “strength” not exposed; super nodes may dominate results; “politeness” relations are not filtered out; automatic generation is a double-edged sword. [Diagram: identities labeled Joe, Joe, JJ123 and Joey, connected by friend-of and follows edges]
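One way to picture the "identities across networks" problem is as a merge of (network, handle) pairs into one person. The sketch below uses a union-find structure and assumes the links come from some trusted signal, such as an XFN `rel="me"` link between two profiles. The class name `IdentityMerger` and the network names are hypothetical, and this is not presented as the method used in the talk.

```python
class IdentityMerger:
    """Union-find over (network, handle) pairs; each set is one person."""

    def __init__(self):
        self._parent = {}

    def _find(self, item):
        """Return the representative of the set containing `item`."""
        self._parent.setdefault(item, item)
        root = item
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every visited node directly at the root.
        while self._parent[item] != root:
            next_item = self._parent[item]
            self._parent[item] = root
            item = next_item
        return root

    def link(self, a, b):
        """Record that identities a and b belong to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def same_person(self, a, b):
        return self._find(a) == self._find(b)


merger = IdentityMerger()
merger.link(("network_a", "Joe"), ("network_b", "JJ123"))
merger.link(("network_b", "JJ123"), ("network_c", "Joey"))
```

A matching display name alone is deliberately not treated as a link here: `("network_d", "Joe")` stays a separate person until an explicit link is recorded, which is the conservative choice given the impersonation risk listed on the slide.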
So now we’ve mapped the social graph… … and attached its content to each node…
… can we finally go fetch?
S-C Search - challenges. Must build a search engine… Store graph, attach content to nodes; reranking will not do, this is the long tail.
Not in Google’s / Yahoo!’s top-1000!… (dominated by authorities)
S-C Search - challenges. Must build a search engine… Store graph, attach content to nodes; reranking will not do, this is the long tail; scale well, including graph functions. Personalized graph-based rank: integrate content-based with static ranking; use web graph structure, like PageRank etc.; network is egocentric, unlike PageRank.
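To illustrate what "egocentric" means for ranking, here is a rough sketch that blends a content score with a proximity score derived from hop distance in the searcher's own network. The function names, the `1 / (1 + hops)` proximity formula, the `social_weight` parameter, and the sample data are all assumptions made for illustration; the talk does not specify a scoring formula.

```python
from collections import deque


def social_distances(graph, searcher, max_hops=3):
    """BFS hop counts from the searcher; graph maps identity -> neighbors."""
    distances = {searcher: 0}
    queue = deque([searcher])
    while queue:
        person = queue.popleft()
        if distances[person] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(person, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[person] + 1
                queue.append(neighbor)
    return distances


def rank(documents, graph, searcher, social_weight=0.5):
    """Order documents given as (doc_id, author, content_score in [0, 1])."""
    distances = social_distances(graph, searcher)
    scored = []
    for doc_id, author, content_score in documents:
        hops = distances.get(author)
        # Closer authors score higher; authors outside the network get 0.
        proximity = 0.0 if hops is None else 1.0 / (1 + hops)
        score = (1 - social_weight) * content_score + social_weight * proximity
        scored.append((score, doc_id))
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


graph = {"me": ["ann"], "ann": ["bob"], "bob": []}
docs = [("d1", "stranger", 0.9), ("d2", "ann", 0.6), ("d3", "bob", 0.6)]
print(rank(docs, graph, "me"))   # ['d2', 'd3', 'd1']
print(rank(docs, graph, "bob"))  # ['d3', 'd1', 'd2']
```

The same documents come back in a different order for each searcher. That is the contrast with a global static score such as PageRank, which assigns one value per page regardless of who is asking.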
Socially-Connected Search. What are the enablers? Social networks; users’ content boom. What can be achieved? Search-based access to network content; trusted and transparent social ranking. What are the challenges? Fragmented social graph; personal-network ranking. Thank you! http://www.delver.com


Editor's Notes

  • #2 Talk in IBM-HRL Information Retrieval seminar 16/12/2008