Measuring the dynamic bi-directional influence between content and social networks

  1. 1. Measuring the dynamic bi-directional influence between content and social networks Shenghui Wang and Paul Groth {swang,pgroth}@few.vu.nl @shenghui, @pgroth ISWC 2010 Shanghai, China
  2. 2. Outline •  Influence over time – Social Networks – Content Networks •  Influence Framework 1.  Network Generation 2.  Measuring Network Properties 3.  Time Series Analysis •  Results
  3. 3. http://www.flickr.com/photos/quinnanya/3570357215/sizes/l/in/photostream/
  4. 4. http://www.flickr.com/photos/a-barth/2846621384/sizes/o/in/photostream/
  5. 5. ISWC CFP keywords 2005
  6. 6. ISWC CFP keywords 2010
  7. 7. [Figure: keyword co-occurrence network in WWW 2008]
  8. 8. [Figure: keyword co-occurrence network in WWW 2010]
  9. 9. Social Networks Content Networks [Figure: the WWW 2008 and WWW 2010 keyword co-occurrence networks shown together]
  10. 10. Question: What is that influence? •  If a researcher identifies a new topic one year, does that result in the researcher having more co-authors the next? •  Does an informative post on a microblogging service lead to a user gaining followers? – If a user is popular in a social network, will his new status updates be widely quoted?
  11. 11. Influence Framework 1.  Network Generation 2.  Measuring Network Properties 3.  Time Series Analysis
  12. 12. 1. Network Generation •  Input: Domain data with identified social actors and content elements •  Outputs: – Series of social networks in time – Series of content networks in time – Bindings between these networks •  Key point: Networks must evolve
  13. 13. [Figure: Research paper; Author; Keywords; Scientific publication]
  14. 14. [Figure: Research paper; Author; Keywords; Scientific publication; Co-authoring network; Research topic network]
  15. 15. [Figure: Research paper; Author; Keywords; Scientific publication; Co-authoring network; Research topic network]
  16. 16. Semantic Web makes this easy! •  Content and social networks are already bound – E.g. a resource can represent a person or a paper I can point at •  SPARQL queries easily extract the separate networks
  17. 17. Semantic Web makes this easy! •  SPARQL queries extract co-author pairs and co-occurring keywords •  Traditionally, this takes a lot of extraction effort in terms of content analysis
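
The network generation step can be sketched in a few lines. The slides extract co-author pairs and keyword co-occurrences with SPARQL; the sketch below instead assumes the records are already available as plain Python tuples, which is an assumption made for illustration. The `PAPERS` records, the author names and the `build_networks` helper are all hypothetical and are not data or code from the talk.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records (year, authors, keywords); not data from the slides.
PAPERS = [
    (2008, ["alice", "bob"], ["rdf", "search"]),
    (2008, ["bob", "carol"], ["search", "rank"]),
    (2009, ["alice", "carol", "dave"], ["rdf", "ontolog"]),
]


def build_networks(papers):
    """Return per-year social edges, content edges and author-keyword bindings."""
    social = defaultdict(set)    # year -> {(author, author)} co-authoring edges
    content = defaultdict(set)   # year -> {(keyword, keyword)} co-occurrence edges
    bindings = defaultdict(set)  # year -> {(author, keyword)} links between the two
    for year, authors, keywords in papers:
        # Sorting makes each undirected edge appear in one canonical order.
        social[year].update(combinations(sorted(set(authors)), 2))
        content[year].update(combinations(sorted(set(keywords)), 2))
        bindings[year].update((a, k) for a in authors for k in keywords)
    return dict(social), dict(content), dict(bindings)


social, content, bindings = build_networks(PAPERS)
for year in sorted(social):
    print(year, sorted(social[year]), sorted(content[year]))
```

Keeping one edge set per year is what gives the series of networks in time that the framework requires; the bindings record which author is attached to which keyword in each period.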
  18. 18. 2. Measuring network properties •  Various measures of the centrality of a node determine its relative importance in a network •  The local clustering coefficient is an indication of the embeddedness of single vertices, i.e., the degree to which individuals tend to cluster together. [Figure: example network with nodes 1 to 7]
  19. 19. 2. Measuring network properties • Standard network properties, such as degree centrality, betweenness centrality, clustering coefficient • Domain specific network properties or content variables can be used to gain additional insight • Need an interpretation for social reality
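
Two of the standard properties named above, degree centrality and the local clustering coefficient, can be computed directly from an edge list. This is a minimal sketch using the usual textbook definitions for an undirected graph without self-loops; the function names and the toy `EDGES` list are assumptions for illustration. Nodes with fewer than two neighbours get a clustering coefficient of 0 here, which is a convention chosen for this sketch.

```python
from itertools import combinations


def adjacency(edges):
    """Build an undirected adjacency map {node: set of neighbours}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    if n < 2:
        return {v: 0.0 for v in adj}
    return {v: len(neighbours) / (n - 1) for v, neighbours in adj.items()}


def local_clustering(adj):
    """Fraction of a node's neighbour pairs that are themselves connected."""
    result = {}
    for v, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            result[v] = 0.0  # convention for this sketch
            continue
        links = sum(1 for a, b in combinations(neighbours, 2) if b in adj[a])
        result[v] = 2 * links / (k * (k - 1))
    return result


# Hypothetical toy network: a triangle a-b-c with a pendant node d.
EDGES = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
adj = adjacency(EDGES)
dc = degree_centrality(adj)
cc = local_clustering(adj)
print(dc["c"], cc["c"])
```

Running these functions on each yearly network from the previous step yields one row of properties per actor and period.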
  20. 20. Output of the previous steps before time series analysis (bc = betweenness centrality, dc = degree centrality, cc = clustering coefficient)

      Author                                              Year  Social bc  Social dc  Social cc  Content dc  Content bc
      http://data.semanticweb.org/person/daqing-he        2007  0          0.0118     0          0           0
      http://data.semanticweb.org/person/daqing-he        2008  0          0.0123     1.0000     0.0189      0.0372
      http://data.semanticweb.org/person/daqing-he        2009  0          0          0          0           0
      http://data.semanticweb.org/person/daqing-he        2010  0          0          0          0           0
      http://data.semanticweb.org/person/chengxiang-zhai  2007  0          0.0118     1.0000     0           0
      http://data.semanticweb.org/person/chengxiang-zhai  2008  0.0005     0.0123     1.0000     0.0184      0.0289
      http://data.semanticweb.org/person/chengxiang-zhai  2009  0          0          0          0           0
      http://data.semanticweb.org/person/chengxiang-zhai  2010  0          0.0031     1.0000     0.0093      0.0147
  21. 21. 3. Time Series Analysis •  Fit the data to multilevel time series models •  Use autoregressive methods •  Must deal with: – Fixed effects in general – Random effects considering differences between individuals
  22. 22. Multilevel time-series regression models
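
The core idea of the lagged analysis can be illustrated with a much simpler stand-in than the multilevel models named in the slides: pair each individual's content property at time t-1 with the same individual's social property at time t, then fit one pooled least-squares line. This sketch has no random effects and is not the authors' model. The `PANEL` values, the `content_dc` and `social_dc` keys and both helper functions are hypothetical.

```python
def lagged_pairs(panel, x_key, y_key):
    """Pair x at time t-1 with y at time t, separately for each individual."""
    pairs = []
    for series in panel.values():
        years = sorted(series)
        for prev, cur in zip(years, years[1:]):
            if cur - prev == 1:  # only use consecutive periods
                pairs.append((series[prev][x_key], series[cur][y_key]))
    return pairs


def ols_slope(pairs):
    """Ordinary least squares fit y = intercept + slope * x for one predictor."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical panel, constructed so that social_dc(t) = 2 * content_dc(t-1) + 0.1.
PANEL = {
    "p1": {
        2007: {"content_dc": 0.1, "social_dc": 0.0},
        2008: {"content_dc": 0.2, "social_dc": 0.3},
        2009: {"content_dc": 0.3, "social_dc": 0.5},
        2010: {"content_dc": 0.0, "social_dc": 0.7},
    },
    "p2": {
        2007: {"content_dc": 0.0, "social_dc": 0.0},
        2008: {"content_dc": 0.4, "social_dc": 0.1},
        2009: {"content_dc": 0.1, "social_dc": 0.9},
    },
}

pairs = lagged_pairs(PANEL, "content_dc", "social_dc")
slope, intercept = ols_slope(pairs)
print(round(slope, 6), round(intercept, 6))
```

A positive slope in such a fit would suggest that a higher content property in one period goes with a higher social property in the next. Swapping `x_key` and `y_key` probes the opposite direction, which is what makes the influence bi-directional. A multilevel model additionally lets intercepts or slopes vary per individual.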
  23. 23. 3. Time Series Analysis •  Produces an influence network [Figure: influence network with weighted edges between social network and content network properties; node labels: Degree, clustering coefficient, Betweenness, Degree]
  24. 24. Results in two domains •  Influence between co-authors of academic papers and the topics they address •  Influence between social status of online forum participants and the attention they give to particular parties
  25. 25. Results for WWW • Publications of WWW (Semantic Web Dogfood repository) • Time series of four years, from 2007 to 2010 [Figure: influence network with weighted edges between social network and content network properties; node labels: Degree, clustering coefficient, Betweenness, Degree]
  26. 26. Results for Political Forum • Discussions from online forum nl.politiek • Time series of 259 weeks • More than 21,000 participants • The content is the attention that 19 Dutch political parties receive [Figure: influence network with weighted edges between social network and content network properties; node labels: in-degree, out-degree, Betweenness, degree, Betweenness]
  27. 27. Online participation vs. emotions, mass media [Figure: influence network with weighted edges between the social network and the content network; node labels: Popularity, Betweenness, OtherEmotion, DisgustHate, NewspaperContagion] [Figure: influence network with weighted edges between the social network and the communication content; node labels: popularity, Betweenness, activity, DisgustHate, NewspaperContagion, OtherEmotion]
  28. 28. Conclusion •  The world is full of dynamic networks •  The Semantic Web is all about networks •  The influence networks show us how those networks influence each other