Influence-based Network-oblivious - ICDM 2013


How can we detect communities when the social graph is not available?

Published in: Education, Technology, Business


  1. 1. Influence-based Network-oblivious Community Detection.
        Nicola Barbieri, Francesco Bonchi, Yahoo! Labs - Barcelona, Spain, {barbieri,bonchi}@yahoo-
        Giuseppe Manco, ICAR-CNR - Rende, Italy
  2. 2. The task of detecting close-knit communities of like-minded people in on-line social networks has plenty of applications in marketing and personalization.
        If a user responded positively to a certain campaign: TARGET USERS IN THE SAME COMMUNITY.
        By homophily, one can expect similar users to be more likely to be interested in the same product than random users.
        If more users in the same community adopt the same product, this might eventually create a word-of-mouth buzz.
  3. 3. The companies that would benefit most from knowing the structure of the social network often do not have access to the network!
        How can we detect communities when the social graph is not available?
        A company advertising or developing applications over an on-line social network owns the log of user activity that it produces.
        Exploit the phenomenon of social contagion to detect communities.
  4. 4. Influence-driven information propagation.
        Users perform actions (likes, purchases, shares, tweets) and those actions propagate across the network.
        A propagation model governs how influence propagates across a network.
        Independent Cascade Model: when a node v becomes active it is considered contagious, and it has a single chance to activate each inactive neighbor u with probability $p_{v,u}$.
        As information spreads over social connections, the network naturally shapes the process of information diffusion.
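The Independent Cascade rule described on this slide can be sketched in a few lines. This is only an illustration: the adjacency format (a dict mapping each node to its out-neighbors and activation probabilities) and the function name `simulate_ic` are assumptions made for the example, not something given in the slides.

```python
import random

def simulate_ic(graph, seeds, rng):
    """Simulate one Independent Cascade run.

    graph: dict mapping node v -> {u: p_vu} (hypothetical input format),
           where p_vu is the probability that v activates u.
    seeds: initially active nodes.
    Returns a dict node -> step at which it became active.
    """
    activated = {s: 0 for s in seeds}
    frontier = list(seeds)
    step = 0
    while frontier:
        step += 1
        next_frontier = []
        for v in frontier:
            # v is contagious once: a single attempt on each inactive neighbor
            for u, p in graph.get(v, {}).items():
                if u not in activated and rng.random() < p:
                    activated[u] = step
                    next_frontier.append(u)
        frontier = next_frontier
    return activated

if __name__ == "__main__":
    toy = {"a": {"b": 0.9, "c": 0.1}, "b": {"c": 0.5}}
    print(simulate_ic(toy, ["a"], random.Random(7)))
```

Each activated node enters the frontier exactly once, which is what gives it a single chance per neighbor.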
  5. 5. Stochastic Framework for Network-Oblivious CD.
        [Diagram labels: (Unobserved) Social Network; Propagation Log; Communities]
        Our framework assumes the existence of an unobserved social network having a modular structure.
        We assume that user activities are governed by an underlying stochastic diffusion process over the unobserved social network.
        Each user is associated with a level of membership and influence in each community.
        We assume that each propagation trace is independent from the others, and we adopt a maximum a-posteriori perspective. That is, we hypothesize that action probabilities adhere to a mathematical model governed by a set of parameters $\Theta$. The likelihood of the data given the model parameters $\Theta$ can hence be expressed as
        $L(\Theta; D) = \prod_{u \in V} P(u \mid \Theta)$,
        where $P(u \mid \Theta)$ represents the likelihood of observing u's behavior relative to $D$. As a consequence, the corresponding learning problem is finding the optimal $\hat{\Theta}$ that maximizes $L(\Theta; D)$.
        We can model the behavior of users by exploiting the standard mixture modeling approach [?]: we assume that users' actions can only happen relative to a community of membership. That is, a hidden binary variable $z_{u,k}$ denotes the membership of user u in community k, with the constraint $\sum_{k=1}^{K} z_{u,k} = 1$. Thus, $\Theta$ can be partitioned into $\{\pi_1, \ldots, \pi_K, \Theta_1, \ldots, \Theta_K\}$, where $\Theta_k$ represents the parameter set relative to community k, and $\pi_k = P(z_{u,k} = 1)$. We can rewrite the likelihood as
        $L(\Theta; D) = \prod_{u} \sum_{k=1}^{K} P(u \mid \Theta_k)\, \pi_k$.
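To make the mixture likelihood $L(\Theta; D) = \prod_u \sum_k P(u \mid \Theta_k)\,\pi_k$ concrete, the sketch below evaluates its logarithm from per-community log-likelihoods. It also computes posterior membership weights by Bayes' rule, which is the usual E-step of a mixture model and is an assumption here rather than something spelled out on the slide. The function names and the input layout are hypothetical.

```python
import math

def mixture_log_likelihood(user_log_liks, priors):
    """Return log L = sum_u log sum_k pi_k * P(u | Theta_k).

    user_log_liks[u][k] holds log P(u | Theta_k); priors[k] holds pi_k.
    The log-sum-exp trick avoids underflow for long activity logs.
    """
    total = 0.0
    for log_liks in user_log_liks:
        terms = [math.log(pi) + ll for pi, ll in zip(priors, log_liks)]
        m = max(terms)
        total += m + math.log(sum(math.exp(t - m) for t in terms))
    return total

def memberships(log_liks, priors):
    """Posterior P(z_uk = 1 | u) for one user, via Bayes' rule."""
    terms = [math.log(pi) + ll for pi, ll in zip(priors, log_liks)]
    m = max(terms)
    weights = [math.exp(t - m) for t in terms]
    norm = sum(weights)
    return [w / norm for w in weights]

if __name__ == "__main__":
    liks = [[math.log(0.2), math.log(0.4)]]
    print(mixture_log_likelihood(liks, [0.5, 0.5]))
    print(memberships(liks[0], [0.5, 0.5]))
```

Working in log space matters in practice because $P(u \mid \Theta_k)$ is a product over all of a user's actions and quickly becomes too small for floating point.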
  6. 6. Community-Independent Cascade (C-IC).
        Generalization of the IC model: each newly active user v exerts her influence globally, with a strength $p_v^k$ that depends on the community k of the targeted node.
        Following [13], we adopt a delay threshold $\Delta$ to define influencers. Specifically, we define $F^+_{i,u} = \{v \in V \mid 0 \le t_u(i) - t_v(i) \le \Delta\}$ as the set of users who potentially influenced u in the adoption of i. Similarly, we define $F^-_{i,u} = \{v \in V \mid t_u(i) - t_v(i) > \Delta\}$ as the set of users who definitely failed in influencing u over i. Then we can specify
        $P(u \mid \Theta_k) = \prod_i P_+(i \mid u, \Theta_k) \cdot P_-(i \mid u, \Theta_k)$   (3)
        where $P_+(i \mid u, \Theta_k)$ is the probability that some of the potential influencers succeeded in activating u, and $P_-(i \mid u, \Theta_k)$ is the probability that none of the "out-of-react" influencers succeeded in activating u:
        $P_+(i \mid u, \Theta_k) = 1 - \prod_{v \in F^+_{i,u}} (1 - p_v^k)$,
        $P_-(i \mid u, \Theta_k) = \prod_{v \in F^-_{i,u}} (1 - p_v^k)$.
        Learning influence weights. An Expectation-Maximization algorithm determines the parameters that maximize the likelihood. The analytical optimization of $Q(\Theta; \Theta')$ is still difficult, so we resort to the explicit modeling of the influencers as hidden data to simplify the optimization procedure. That is, let $w_{i,u,v}$ be a binary variable such that $w_{i,u,v} = 1$ if v triggered the adoption of item i by u, and let $W$ denote the set of all possible $w_{i,u,v}$ such that $v \in F^+_{i,u}$. Then we can rewrite the complete-data likelihood relative to $W$ as
        $P(D, Z, W, \Theta) = P(D, W \mid \Theta, Z) \cdot P(Z \mid \Theta) \cdot P(\Theta)$,
        where
        $P(D, W \mid \Theta, Z) = \prod_u \prod_k \prod_i \Big[ \prod_{v \in F^+_{i,u}} (p_v^k)^{w_{i,u,v} z_{u,k}} (1 - p_v^k)^{(1 - w_{i,u,v}) z_{u,k}} \cdot \prod_{v \in F^-_{i,u}} (1 - p_v^k)^{z_{u,k}} \Big]$.
        As a consequence, the contribution to $Q(\Theta; \Theta')$ can be rewritten as
        $\sum_u \sum_k \gamma_{u,k} \Big( \log \pi_k + \sum_i \sum_{v \in F^+_{i,u}} \big[ \eta_{i,u,v,k} \log p_v^k + (1 - \eta_{i,u,v,k}) \log(1 - p_v^k) \big] + \sum_i \sum_{v \in F^-_{i,u}} \log(1 - p_v^k) \Big)$,
        where $\gamma_{u,k}$ is the posterior membership of u in community k, and $\eta_{i,u,v,k}$ is the "responsibility" of user v in triggering u's adoption of i in the context of community k:
        $\eta_{i,u,v,k} = P(w_{i,u,v} = 1 \mid u, i, z_{u,k} = 1, \Theta') = \dfrac{p_v^k}{1 - \prod_{w \in F^+_{i,u}} (1 - p_w^k)}$.
        Finally, optimizing $Q(\Theta; \Theta')$ with respect to $p_v^k$ yields
        $p_v^k = \dfrac{\sum_{\langle u,i \rangle : v \in F^+_{i,u}} \gamma_{u,k}\, \eta_{i,u,v,k}}{S^+_{v,k} + S^-_{v,k}}$, with $S^+_{v,k} = \sum_{\langle u,i \rangle : v \in F^+_{i,u}} \gamma_{u,k}$ and $S^-_{v,k} = \sum_{\langle u,i \rangle : v \in F^-_{i,u}} \gamma_{u,k}$.
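As a small worked illustration of the C-IC quantities, the sketch below splits earlier adopters into potential influencers and failed influencers using a delay threshold, then evaluates the probability that at least one potential influencer succeeded, the probability that every failed influencer indeed failed, and each influencer's responsibility. The function names, the dict-based inputs, and the toy numbers are all hypothetical; the formulas follow the standard noisy-or form $1 - \prod (1 - p)$.

```python
def split_influencers(other_times, t_u, delta):
    """Split other adopters of an item into (F_plus, F_minus).

    other_times: dict v -> adoption time of v (must not contain u itself).
    F_plus:  adopted at most `delta` before u (potential influencers).
    F_minus: adopted more than `delta` before u (failed influencers).
    """
    f_plus = [v for v, t_v in other_times.items() if 0 <= t_u - t_v <= delta]
    f_minus = [v for v, t_v in other_times.items() if t_u - t_v > delta]
    return f_plus, f_minus

def prob_none(nodes, p_k):
    """Probability that no node in `nodes` succeeds, given strengths p_k[v]."""
    prod = 1.0
    for v in nodes:
        prod *= 1.0 - p_k[v]
    return prod

def p_plus(f_plus, p_k):
    """Probability that at least one potential influencer succeeded."""
    return 1.0 - prob_none(f_plus, p_k)

def p_minus(f_minus, p_k):
    """Probability that all out-of-reach influencers failed."""
    return prob_none(f_minus, p_k)

def responsibilities(f_plus, p_k):
    """eta_v = p_v / (1 - prod_w (1 - p_w)) for each potential influencer."""
    denom = p_plus(f_plus, p_k)
    return {v: p_k[v] / denom for v in f_plus}

if __name__ == "__main__":
    times = {"a": 1.0, "b": 8.0, "c": 9.5}   # toy adoption times
    strengths = {"a": 0.5, "b": 0.5, "c": 0.2}  # toy influence strengths in community k
    fp, fm = split_influencers(times, t_u=10.0, delta=3.0)
    print(fp, fm, p_plus(fp, strengths), p_minus(fm, strengths))
    print(responsibilities(fp, strengths))
```

Note that the responsibilities of one adoption need not sum to one: under this model several influencers may succeed at once.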
  7. 7. Modeling temporal dynamics with C-Rate.
        C-IC does not explicitly model temporal dynamics, as it focuses on modeling just binary activations by employing a discrete-time propagation model. Here we present an alternative modeling that exploits time delays to characterize the overall diffusion process.
        Given an observation window $[0, T]$, the idea is to explicitly model the likelihood of the time at which each user adopted each item, or the likelihood that the considered adoption did not happen within time $T$. This approach assumes that there is a dependency between the adoption time of the influencer and that of the influenced. In NetRate [3], this dependency is modeled by a conditional likelihood $f(t_u \mid t_v, \alpha_{v,u})$ of transmission, which depends on the delay $\Delta_{v,u} = t_u - t_v$.
        The likelihood of an activation can be formulated by applying standard survival analysis [14], in terms of the survival function $S(t_u \mid t_v, \alpha_{v,u})$ (modeling the probability that a user survives uninfected at least until time $t_u$) and the hazard function $H(t_u \mid t_v, \alpha_{v,u})$ (modeling instantaneous infections).
        We reformulate this framework into a community-based scenario. The Community-Rate (C-Rate) propagation model is characterized by the following assumptions:
        • A user's influence is limited to the community she belongs to. That is, the user is likely to influence or be influenced by members of the same community, while the effect of influence is marginal on members of a different community.
        • Information diffusion from user v within the k-th community is characterized by the density $f(t_u \mid t_v, \alpha_{v,k})$, where $\alpha_{v,k}$ is related to the expected delay on the activations that v triggers within community k.
        The parameter $\alpha_{v,k}$ has a direct interpretation in terms of influence: high values of $\alpha_{v,k}$ cause short delays and, as a consequence, denote v as strongly influential within k.
        On the basis of the above observations, we can adapt the NetRate model to fit the mixture framework by plugging in
        $P(u \mid \Theta_k) = \prod_{i : u \notin C_i} \prod_{v \in C_i} S(T \mid t_v(i), \alpha_{v,k}) \cdot \prod_{i : u \in C_i} \Big[ \prod_{v \in C_{i,t_u(i)}} S(t_u(i) \mid t_v(i), \alpha_{v,k}) \cdot \sum_{v \in C_{i,t_u(i)}} H(t_u(i) \mid t_v(i), \alpha_{v,k}) \Big]$   (5)
        with $C_i$ the users who adopted i and $C_{i,t}$ those who did so before time t.
        Learning. Again, instead of directly optimizing the above likelihood, we introduce the latent binary variable $w_{i,u,v}$ denoting the fact that u has been infected by v on i. Then the likelihood can be rewritten by defining
        $P(D, W \mid Z, \Theta) = \prod_{\langle u,i \rangle \notin D} \prod_k \prod_{v \in C_i} S(T \mid t_v(i), \alpha_{v,k})^{z_{u,k}} \cdot \prod_{\langle u,i \rangle \in D} \prod_k \prod_{v \in C_{i,t_u(i)}} H(t_u(i) \mid t_v(i), \alpha_{v,k})^{w_{i,u,v} z_{u,k}} \cdot S(t_u(i) \mid t_v(i), \alpha_{v,k})^{z_{u,k}}$
        and replacing $P(D \mid Z, \Theta)$ with the above component in the likelihood.
        By adopting the exponential distribution as density for the conditional transmission likelihood, $f(t_u \mid t_v, \alpha_{v,k}) = \alpha_{v,k} \exp\{-\alpha_{v,k} \Delta_{v,u}\}$, which gives $S(t_u \mid t_v, \alpha_{v,k}) = \exp\{-\alpha_{v,k} \Delta_{v,u}\}$ and $H(t_u \mid t_v, \alpha_{v,k}) = \alpha_{v,k}$, and by introducing hidden variables for modeling the identity of the influencer, we obtain:
        $Q(\Theta; \Theta') \propto \sum_u \sum_k \gamma_{u,k} \log \pi_k + \sum_{\langle u,i \rangle \in D} \sum_k \sum_{v \in C_{i,t_u(i)}} \gamma_{u,k} \big[ \eta_{i,u,v,k} \log \alpha_{v,k} - \Delta_{v,u}\, \alpha_{v,k} \big] - \sum_{\langle u,i \rangle \notin D} \sum_k \sum_{v \in C_i} \gamma_{u,k}\, (T - t_v(i))\, \alpha_{v,k}$,
        where $\gamma_{u,k}$ and $\eta_{i,u,v,k}$ are the posterior expectations of the hidden membership and influencer variables under the current parameters $\Theta'$.
        Similar formulations can be obtained by adopting different densities.
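The exponential survival-analysis quantities used by C-Rate are easy to check numerically. The sketch below assumes the exponential transmission density, so the survival function is $\exp\{-\alpha\,\Delta\}$ and the hazard is the constant $\alpha$. It also evaluates the NetRate-style log-likelihood of one observed adoption (product of survivals times sum of hazards over earlier adopters). The function names and toy values are hypothetical.

```python
import math

def survival(t_u, t_v, alpha):
    """P(u still uninfected by v at time t_u), exponential model."""
    return math.exp(-alpha * (t_u - t_v))

def hazard(t_u, t_v, alpha):
    """Instantaneous infection rate; constant for the exponential model."""
    return alpha

def density(t_u, t_v, alpha):
    """Transmission density f = hazard * survival."""
    return hazard(t_u, t_v, alpha) * survival(t_u, t_v, alpha)

def adoption_log_likelihood(t_u, earlier, alpha_k):
    """Log-likelihood that u adopts at t_u given earlier adopters.

    earlier: dict v -> t_v with t_v < t_u; alpha_k: dict v -> rate of v
    in the community under consideration. Computes
    log( prod_v S(t_u | t_v) * sum_v H(t_u | t_v) ).
    """
    log_surv = sum(-alpha_k[v] * (t_u - t_v) for v, t_v in earlier.items())
    total_hazard = sum(alpha_k[v] for v in earlier)
    return log_surv + math.log(total_hazard)

if __name__ == "__main__":
    print(density(3.0, 1.0, 0.5))
    print(adoption_log_likelihood(3.0, {"a": 1.0, "b": 2.0}, {"a": 0.5, "b": 1.0}))
```

A larger rate makes short delays more likely, which matches the reading of a high rate as strong influence within the community.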
  8. 8. Evaluation on Synthetic Data.
        We use a generator of benchmark graphs [1], which generates directed unweighted graphs with possibly overlapping communities.
        •  Number of nodes = 1000;
        •  Average in-degree = 10;
        •  Maximum in-degree = 150;
        •  Min/max of the community sizes = 50/750.
        The four networks differ in the percentage μ of overlapping memberships.
        •  Propagation cascades are generated according to the Net-Rate propagation model.
        •  The transmission rate for each link is sampled from a Gamma distribution (shape=2, scale=0.3).
        [Figure: the benchmark networks; recoverable panel titles: µ = 0.001, (b) µ = 0.01]
        TABLE I: Statistics for the synthetic data: four networks corresponding to four values of µ as in Figure ??.
        Network  # of communities (K)  avg # of adoptions  avg trace length  avg % of communities traversed by a trace
        S1       9                     56k                 38                17%
        S2       7                     59k                 38                24%
        S3       11                    82k                 54                24%
        S4       6                     370k                256               82%
        The strength of each link is determined by considering both the out-degree of the source and the in-degree of the destination.
        [1] A. Lancichinetti and S. Fortunato. Benchmarks for testing community detection algorithms on directed and weighted graphs with overlapping communities. Physical Review E, 80, 2009.
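The rate-sampling step of the synthetic setup can be sketched with the standard library alone. The Gamma parameters (shape 2, scale 0.3) come from the slide; drawing each transmission delay from an exponential distribution with the sampled rate is an assumption made here for illustration, as are the function names.

```python
import random

def sample_link_rates(edges, rng, shape=2.0, scale=0.3):
    """Draw one transmission rate per directed edge from Gamma(shape, scale)."""
    return {edge: rng.gammavariate(shape, scale) for edge in edges}

def sample_delay(rate, rng):
    """Draw a transmission delay, assuming an exponential density with this rate."""
    return rng.expovariate(rate)

if __name__ == "__main__":
    rng = random.Random(2013)
    rates = sample_link_rates([("a", "b"), ("b", "c")], rng)
    for edge, rate in rates.items():
        print(edge, round(rate, 3), round(sample_delay(rate, rng), 3))
```

With these parameters the mean rate is shape times scale, that is 0.6, so the average delay on a typical link is somewhat under two time units.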
  9. 9. Results.
        Baseline Models
        •  Based on network reconstruction (assuming a dense graph):
           •  Inference for the IC Model [2];
           •  Net-Rate [3];
           •  Communities are detected by applying METIS [4] on the reconstructed graph.
        •  Multinomial EM
        [2] K. Saito, R. Nakano, and M. Kimura. Prediction of information diffusion probabilities for independent cascade model. KES'08.
        [3] M. Gomez-Rodriguez, D. Balduzzi, B. Schölkopf. Uncovering the Temporal Dynamics of Diffusion Networks. ICML 2011.
        [4] G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing, vol. 20, no. 1, pp. 359–392, 1999.
  10. 10. Evaluation on real data.
        Twitter data
        •  Number of nodes = 28,185;
        •  Number of links = 1,636,4511;
        •  Number of propagations (urls) = 8,541;
        •  Tweets = 516,412.
        TABLE II: Summary of the evaluation on real data.
                                           C-IC            C-Rate
        Communities                        20              64
        Community size (min/max/median)    156/3651/1319   97/1758/328
        QG                                 0.3274          0.2424
        Conductance                        0.5849          0.6791
        Internal Density                   0.031           0.051
        Cut Ratio                          0.001           0.0009
        Time (mins)                        105             122
        Internal density is an order of magnitude higher than the density of the whole graph (0.0041).
        Modularity and the diagonal block structure of the incidence matrix confirm the existence of a good community structure.
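The structural scores reported in Table II can be computed from a graph and one community. The slides do not say which variants were used, so the sketch below assumes the common undirected definitions: internal density is internal edges over possible internal pairs, cut ratio is boundary edges over possible boundary pairs, and conductance is boundary edges over the community's total degree. The function name and the edge-list input are hypothetical.

```python
def community_metrics(edges, community, n_nodes):
    """Score one community on an undirected graph given as an edge list."""
    members = set(community)
    internal = 0  # edges with both endpoints inside the community
    boundary = 0  # edges with exactly one endpoint inside
    for a, b in edges:
        inside = (a in members) + (b in members)
        if inside == 2:
            internal += 1
        elif inside == 1:
            boundary += 1
    n_s = len(members)
    possible_internal = n_s * (n_s - 1) / 2
    possible_cut = n_s * (n_nodes - n_s)
    volume = 2 * internal + boundary
    return {
        "internal_density": internal / possible_internal if possible_internal else 0.0,
        "cut_ratio": boundary / possible_cut if possible_cut else 0.0,
        "conductance": boundary / volume if volume else 0.0,
    }

if __name__ == "__main__":
    toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4), (4, 5)]
    print(community_metrics(toy_edges, {1, 2, 3}, n_nodes=5))
```

A good community scores high on internal density and low on cut ratio and conductance, which is the pattern the table is used to argue for.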
  11. 11. THANKS!