V. Batagelj - Big data Networks from data bases


  1. Big data: Networks from data bases. Vladimir Batagelj, University of Ljubljana. Undicesima conferenza nazionale di statistica, Rome, February 20-21, 2013.
  2. Outline: 1 Two mode networks; 2 Multiplication; 3 Derived networks; 4 Pajek.
  3. Example: Internet Movie Data Base
     [Figure: two-mode network linking the films Die Another Day, Casino Royale and Skyfall to people involved in them: Lee Tamahori, Pierce Brosnan, Halle Berry, Paul Haggis, Neal Purvis, Robert Wade, Ian Fleming, Martin Campbell, Judi Dench, Daniel Craig, Mads Mikkelsen, Eva Green, Sam Mendes, John Logan, Ralph Fiennes, Javier Bardem.]
     On February 17, 2013 IMDB (Internet Movie Data Base) contained 2,262,638 titles and 4,745,392 names.
     Similar data bases: Web of Science, Scopus, Zentralblatt Math, Google Scholar, DBLP, Amazon, etc.
  4. Two mode networks from data bases
     A simple data base $B$ is a set of records $B = \{R_k : k \in K\}$, where $K$ is the set of keys. A record has the form $R_k = (k, q_1(k), q_2(k), \ldots, q_r(k))$, where $q_i(k)$ is the description of the property (attribute) $q_i$ for the key $k$.
     Suppose that the description $q(k)$ takes values in a finite set $Q$. A property can always be brought into this form by partitioning its set of values and recoding them. Then we can assign to the property $q$ a two-mode network $K \times q = (K, Q, L, w)$, where $(k, v) \in L$ iff $v \in q(k)$, and $w(k, v)$ is the weight of the link $(k, v)$; often $w(k, v) = 1$.
     Single-valued properties can be represented by a partition.
     Examples: (papers, authors, was written by), (papers, keywords, is described by), (parliamentarians, problems, positive vote), (persons, journals, is reading), (persons, societies, is member of, years of membership), (buyers/consumers, goods, bought, quantity), etc.
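A minimal sketch of the construction above, assuming the data base is held as a Python dictionary from keys to records and that the property is set-valued. The function name `two_mode_network`, the property name `cast` and the toy records are illustrative choices, not part of the slides.

```python
def two_mode_network(records, prop):
    """Build the links of K x q as {(key, value): weight}, with w = 1."""
    links = {}
    for key, record in records.items():
        for value in record.get(prop, ()):   # every v in q(k) gives a link (k, v)
            links[(key, value)] = 1
    return links

# Toy data base (assumption): keys are film titles, "cast" is a set-valued property.
records = {
    "Die Another Day": {"cast": {"Pierce Brosnan", "Halle Berry", "Judi Dench"}},
    "Casino Royale":   {"cast": {"Daniel Craig", "Eva Green", "Judi Dench"}},
    "Skyfall":         {"cast": {"Daniel Craig", "Javier Bardem", "Judi Dench"}},
}

links = two_mode_network(records, "cast")
K = sorted({k for k, _ in links})   # first mode: keys
Q = sorted({v for _, v in links})   # second mode: property values
print(len(K), len(Q), len(links))
```

A weighted property such as (buyers, goods, bought, quantity) would store the quantity instead of the constant 1.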
  5. Methods: degree distributions
     In a network $(V, L)$ the degree $\deg(v)$ of a vertex $v \in V$ is equal to the number of links that have the vertex $v$ as their end-vertex. The indegree / outdegree is equal to the number of incoming / outgoing links.
     Usually one of the first analyses of a network is to look at its degree distribution(s). Are there isolated nodes ($\deg(v) = 0$)? Which are the nodes with the largest degrees? What is the average degree? What is the shape of the degree distribution?
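The questions above can be answered directly from a link list. The following is a small sketch for an undirected network without loops; the vertex names and the helper `degrees` are made up for illustration.

```python
from collections import Counter

def degrees(vertices, links):
    """Degree of each vertex in an undirected network given as a list of links."""
    deg = {v: 0 for v in vertices}
    for u, v in links:
        deg[u] += 1
        deg[v] += 1
    return deg

vertices = ["a", "b", "c", "d"]
links = [("a", "b"), ("a", "c"), ("b", "c")]

deg = degrees(vertices, links)
isolated = [v for v, d in deg.items() if d == 0]          # nodes with deg(v) = 0
max_deg = max(deg.values())
largest = [v for v, d in deg.items() if d == max_deg]     # nodes with the largest degree
average = sum(deg.values()) / len(deg)                    # average degree
distribution = Counter(deg.values())                      # degree -> number of nodes
print(isolated, largest, average, dict(distribution))
```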
  6. Methods: two-mode cores and 4-rings weights
     The subset of vertices $C \subseteq V$ is a $(p, q)$-core in a two-mode network $N = (V_1, V_2; L)$, $V = V_1 \cup V_2$, iff
     a. in the induced subnetwork $K = (C_1, C_2; L(C))$, $C_1 = C \cap V_1$, $C_2 = C \cap V_2$, it holds $\forall v \in C_1 : \deg_K(v) \geq p$ and $\forall v \in C_2 : \deg_K(v) \geq q$;
     b. $C$ is the maximal subset of $V$ satisfying condition a.
     A k-ring is a simple closed chain of length $k$. Using k-rings we can define a weight of edges as $w_k(e)$ = number of different k-rings containing the edge $e \in E$.
     In a two-mode network there are no 3-rings. The densest substructures are complete bipartite subgraphs $K_{p,q}$. They contain many 4-rings. Therefore these weights can be used to identify the dense parts of a network.
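Both notions can be illustrated on a toy network. The sketch below computes a (p, q)-core by repeatedly deleting vertices that violate condition a, and counts the 4-rings through one edge by intersecting neighbourhoods. It is a simple illustration under the assumption that the two vertex sets have disjoint labels, not the algorithm implemented in Pajek; all names are hypothetical.

```python
def adjacency(links):
    """Neighbour sets for both modes of a two-mode network given as (u, v) pairs."""
    adj1, adj2 = {}, {}
    for u, v in links:
        adj1.setdefault(u, set()).add(v)
        adj2.setdefault(v, set()).add(u)
    return adj1, adj2

def pq_core(links, p, q):
    """Return (C1, C2): delete vertices with too small degree until none is left."""
    adj1, adj2 = adjacency(links)
    changed = True
    while changed:
        changed = False
        for u in [u for u, nb in adj1.items() if len(nb) < p]:
            for v in adj1.pop(u):          # remove u and its links
                adj2[v].discard(u)
            changed = True
        for v in [v for v, nb in adj2.items() if len(nb) < q]:
            for u in adj2.pop(v):          # remove v and its links
                adj1[u].discard(v)
            changed = True
    return set(adj1), set(adj2)

def four_ring_weight(links, u, v):
    """w4 of the edge (u, v): number of 4-rings u - v - u2 - v2 - u."""
    adj1, adj2 = adjacency(links)
    return sum(len((adj1[u] & adj1[u2]) - {v}) for u2 in adj2[v] if u2 != u)

# K_{2,3} on {a, b} x {x, y, z} plus a pendant vertex c attached to x.
links = [("a", "x"), ("a", "y"), ("a", "z"),
         ("b", "x"), ("b", "y"), ("b", "z"), ("c", "x")]

core_2_2 = pq_core(links, 2, 2)            # the pendant vertex c drops out
core_3_3 = pq_core(links, 3, 3)            # empty: x, y, z have degree 2 only
w4_ax = four_ring_weight(links, "a", "x")  # rings a-x-b-y-a and a-x-b-z-a
w4_cx = four_ring_weight(links, "c", "x")  # a pendant edge lies on no ring
```

Edges inside the complete bipartite part get a positive weight while the pendant edge gets 0, which is why w4 separates dense parts from the rest.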
  7. Example: (247,2)-core and (27,22)-core in IMDB
     [Figure: the two cores of the IMDB two-mode network; the vertices are wrestling shows and events (e.g. 'WWF Smackdown!', 'Raw Is War', Royal Rumble, Survivor Series) and their performers.]
     IMDB 2005: n1 = 428440, n2 = 896308, m = 3792390.
  8. Example: Islands for w4 / Charlie Brown and Adult
     [Figure: two link islands for the weight w4, drawn in Pajek: one formed by Charlie Brown titles (e.g. Charlie Brown Christmas, Snoopy Come Home) and their cast, the other by adult-film performers.]
  9. Sparsity and Dunbar's number
     Networks obtained from data bases are usually large: tens of thousands or millions of nodes. Large networks are usually sparse: they have a small average degree.
     In one-mode networks describing relations among people this can be related to Dunbar's number, with a value around 150. See Wikipedia: Dunbar's number.
     In general, if the initiator of a link wants to keep the link, he or she has to invest in it a certain amount of the finite total "energy" he or she has.
  10. Multiplication of networks
     To a simple two-mode network $N = (I, J, E, w)$, where $I$ and $J$ are sets of vertices, $E$ is a set of edges linking $I$ and $J$, and $w : E \to \mathbb{R}$ (or some other semiring) is a weight, we can assign a network matrix $W = [w_{i,j}]$ with elements $w_{i,j} = w(i, j)$ for $(i, j) \in E$ and $w_{i,j} = 0$ otherwise.
     Given a pair of compatible networks $N_A = (I, K, E_A, w_A)$ and $N_B = (K, J, E_B, w_B)$ with corresponding matrices $A_{I \times K}$ and $B_{K \times J}$, we call a product of networks $N_A$ and $N_B$ a network $N_C = (I, J, E_C, w_C)$, where $E_C = \{(i, j) : i \in I, j \in J, c_{i,j} \neq 0\}$ and $w_C(i, j) = c_{i,j}$ for $(i, j) \in E_C$. The product matrix $C = [c_{i,j}]_{I \times J} = A * B$ is defined in the standard way:
     $c_{i,j} = \sum_{k \in K} a_{i,k} \cdot b_{k,j}$
     In the case when $I = K = J$ we are dealing with ordinary one-mode networks (with square matrices).
  11. Multiplication of networks
     [Figure: the sets I, K and J, with a link of weight $a_{i,k}$ from $i \in I$ to $k \in K$ (matrix A) and a link of weight $b_{k,j}$ from $k$ to $j \in J$ (matrix B).]
     $c_{i,j} = \sum_{k \in K} a_{i,k} \cdot b_{k,j}$
     If all weights in networks $N_A$ and $N_B$ are equal to 1, the value of $c_{i,j}$ counts the number of ways we can go from $i \in I$ to $j \in J$ passing through $K$.
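The product can be computed by visiting only existing links. The sketch below stores each network as a dictionary of rows; this representation and the function name `multiply` are assumptions made for illustration, not the data structure used by Pajek.

```python
def multiply(A, B):
    """Product of sparse networks.

    A: {i: {k: a_ik}}, B: {k: {j: b_kj}}; returns C: {i: {j: c_ij}} with
    c_ij = sum over k of a_ik * b_kj, keeping only nonzero entries.
    """
    C = {}
    for i, row in A.items():
        out = {}
        for k, a in row.items():
            for j, b in B.get(k, {}).items():   # only links that exist in B
                out[j] = out.get(j, 0) + a * b
        out = {j: c for j, c in out.items() if c != 0}
        if out:
            C[i] = out
    return C

# All weights equal 1, so c_ij counts the ways from i to j through K.
A = {"i1": {"k1": 1, "k2": 1}, "i2": {"k2": 1}}
B = {"k1": {"j1": 1}, "k2": {"j1": 1, "j2": 1}}
C = multiply(A, B)
print(C)
```

Here `C["i1"]["j1"]` is 2 because i1 reaches j1 both through k1 and through k2.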
  12. Multiplication of networks
     The standard matrix multiplication has the complexity $O(|I| \cdot |K| \cdot |J|)$; it is too slow to be used for large networks. For sparse large networks we can multiply much faster by considering only nonzero elements.
     In general the multiplication of large sparse networks is a 'dangerous' operation since the result can 'explode': the product need not be sparse.
     If in the sparse networks $N_A$ and $N_B$ only a few vertices of $K$ have large degree, and none of them has large degree in both networks, then the resulting product network $N_C$ is also sparse.
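One way to see the danger before multiplying: every vertex k of K contributes deg_A(k) · deg_B(k) elementary products, and the sum of these terms is an upper bound on the number of nonzero entries of the product. The sketch below computes this bound; the helper name and the toy networks are illustrative assumptions.

```python
from collections import Counter

def product_size_bound(A_links, B_links):
    """Sum over k of deg_A(k) * deg_B(k), for A_links = (i, k) and B_links = (k, j)."""
    deg_A = Counter(k for _, k in A_links)
    deg_B = Counter(k for k, _ in B_links)
    return sum(deg_A[k] * deg_B[k] for k in deg_A)

n = 1000

# One hub in K with large degree in both networks: 2n links, n * n products.
hub_A = [(f"i{t}", "hub") for t in range(n)]
hub_B = [("hub", f"j{t}") for t in range(n)]
hub_bound = product_size_bound(hub_A, hub_B)

# Same number of links, but every vertex of K has degree 1 in both networks.
flat_A = [(f"i{t}", f"k{t}") for t in range(n)]
flat_B = [(f"k{t}", f"j{t}") for t in range(n)]
flat_bound = product_size_bound(flat_A, flat_B)

print(hub_bound, flat_bound)
```

In the hub case the bound is attained: every i is linked to every j in the product, so 2,000 input links produce 1,000,000 links.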
  13. Derived networks
     From a bibliographical data base we get two-mode networks $WA$ = Works × Authors and $WK$ = Works × Keywords. Since they have a common set Works, the networks $WA^T$ and $WK$ are compatible, and multiplying them we obtain a derived network
     $AK = WA^T * WK$
     The entry $ak_{it}$ = number of times author $i$ used keyword $t$ in his/her works.
     The dataset of EU projects on simulation (January 2006) contains data about research groups. We obtain networks: P = Groups × Projects, C = Groups × Countries, and U = Groups × Institutions. Sizes: |Groups| = 8869, |Projects| = 933, |Institutions| = 3438, |Countries| = 60.
     In the derived network W = Projects × Institutions = $P^T * U$ we determine link islands for w4.
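Because Works is the shared set, the derived network can be accumulated work by work without forming the transpose explicitly. A sketch under the assumption that both networks are stored as dictionaries keyed by work, with weights 1; the names and data are invented for illustration.

```python
def authors_keywords(WA, WK):
    """AK = WA^T * WK for 0/1 networks given as {work: list of authors / keywords}."""
    AK = {}
    for work, authors in WA.items():
        for author in authors:
            row = AK.setdefault(author, {})
            for keyword in WK.get(work, ()):
                row[keyword] = row.get(keyword, 0) + 1   # one more work of author with keyword
    return AK

WA = {"p1": ["Ana", "Bo"], "p2": ["Ana"]}
WK = {"p1": ["cores", "islands"], "p2": ["cores"]}
AK = authors_keywords(WA, WK)
print(AK)
```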
  14. Analysis of Projects × Institutions
     [Figure: link islands in the Projects × Institutions network, drawn in Pajek; the vertices are project codes (e.g. IST-2000-29207, G4RD-CT-2002-00836) and institutions (e.g. AIRBUS UK LIMITED, VOLKSWAGEN AG, BAE SYSTEMS).]
  15. Collaboration networks
     Let $WA$ be the works × authors two-mode network; $wa_{pi} \in \{0, 1\}$ describes the authorship of author $i$ of work $p$.
     $\sum_{i \in A} wa_{pi} = \deg(p)$ = number of authors of work $p$
     Let $N$ be its normalized version, $\forall p \in W : \sum_{i \in A} n_{pi} = 1$, obtained from $WA$ by $n_{pi} = wa_{pi} / \deg(p)$, or by some other rule determining the author's contribution.
     The first collaboration network $Co = WA^T * WA$:
     $co_{ij} = \sum_{p \in W} wa_{pi} \, wa_{pj} = \sum_{p \in N(i) \cap N(j)} 1$
     $co_{ij}$ = the number of works that authors $i$ and $j$ wrote together.
     Problem: the $Co$ network is composed of complete graphs on the sets of works' authors. Works with many authors produce large complete subgraphs.
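A short sketch of the first collaboration network on toy data, assuming WA is stored as a dictionary from works to author sets (an illustrative representation). It also shows the stated problem: a single work with 50 authors already yields 50 · 50 entries.

```python
from itertools import product

def first_collaboration(WA):
    """Co = WA^T * WA: co[(i, j)] = number of works written by both i and j."""
    co = {}
    for authors in WA.values():
        for i, j in product(authors, repeat=2):   # complete graph on the authors, with loops
            co[(i, j)] = co.get((i, j), 0) + 1
    return co

WA = {"p1": {"a", "b"}, "p2": {"a"}, "p3": {"a", "b", "c"}}
co = first_collaboration(WA)

# One work with 50 authors (hypothetical) fills a complete 50 x 50 block.
big = first_collaboration({"p": set(range(50))})
print(co[("a", "b")], co[("a", "a")], len(big))
```

The diagonal entry `co[(i, i)]` is the number of works of author i.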
  16. Cores of orders 10–21 in Computational Geometry
     [Figure: cores of orders 10 to 21; the vertices are authors (e.g. P.K.Agarwal, M.Sharir, L.J.Guibas, H.Edelsbrunner, B.Chazelle, J.S.B.Mitchell).]
  17. $p_S$-core at level 46 of Computational Geometry
     [Figure: the $p_S$-core at level 46; the vertices are authors (e.g. P.Agarwal, M.Sharir, L.Guibas, H.Edelsbrunner, B.Chazelle, R.Tamassia).]
  18. Second collaboration network
     The second collaboration network $Cn = WA^T * N$:
     $cn_{ij} = \sum_{p \in W} b_{ip} \, n_{pj} = \sum_{p \in N(i) \cap N(j)} n_{pj}$, where $b_{ip} = wa_{pi}$ is the entry of $WA^T$.
     $cn_{ij}$ = contribution of author $j$ to works that (s)he wrote together with the author $i$.
     It holds $\sum_{i \in A} \sum_{j \in A} b_{ip} \, n_{pj} = \deg(p)$ and $\sum_{j \in A} cn_{ij} = \deg(i)$.
     $cn_{ii} = \sum_{p \in N(i)} n_{pi}$ is the contribution of author $i$ to his/her works.
     Self-sufficiency: $S_i = cn_{ii} / \deg(i)$
     Collaborativeness (co-authorship index): $K_i = 1 - S_i$
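A worked toy example of these quantities, assuming the equal-share normalization n_pj = wa_pj / deg(p) and using exact fractions so that the identities can be checked. The data and function names are illustrative only.

```python
from fractions import Fraction

def second_collaboration(WA):
    """Cn = WA^T * N with n_pj = 1 / deg(p) for every author j of work p."""
    cn = {}
    for authors in WA.values():
        if not authors:                      # a work without authors contributes nothing
            continue
        share = Fraction(1, len(authors))
        for i in authors:
            for j in authors:
                cn[(i, j)] = cn.get((i, j), 0) + share
    return cn

def degree(WA, i):
    """deg(i): number of works of author i."""
    return sum(1 for authors in WA.values() if i in authors)

WA = {"p1": {"a", "b"}, "p2": {"a"}, "p3": {"a", "b", "c"}}
cn = second_collaboration(WA)

row_sum_a = sum(w for (i, _), w in cn.items() if i == "a")   # should equal deg(a)
S_a = cn[("a", "a")] / degree(WA, "a")                       # self-sufficiency
K_a = 1 - S_a                                                # collaborativeness
print(cn[("a", "a")], row_sum_a, S_a, K_a)
```

For author a the contribution is 1/2 + 1 + 1/3 = 11/6 over 3 works, so S_a = 11/18 and K_a = 7/18.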
  19. The "best" authors in Statistics
     Columns: contrib = $cn_{ii}$, pap = number of papers $\deg(i)$, self = $S_i$, collab = $K_i$.
          name         contrib     pap  self      collab
      1.  Burt R       83.716667   96   0.872049  0.127951
      2.  Newman M     59.533333   87   0.684291  0.315709
      3.  Doreian P    59.070408   75   0.787605  0.212395
      4.  Bonacich P   45.416667   59   0.769774  0.230226
      5.  Marsden P    41.000000   50   0.820000  0.180000
      6.  White H      39.986111   51   0.784041  0.215959
      7.  Wellman B    38.754762   57   0.679908  0.320092
      8.  Friedkin N   36.333333   40   0.908333  0.091667
      9.  Leydesdo L   34.533333   47   0.734752  0.265248
     10.  Borgatti S   30.469048   57   0.534545  0.465455
     11.  Freeman L    30.250000   36   0.840278  0.159722
     12.  Everett M    27.450000   45   0.610000  0.390000
     13.  Litwin H     26.166667   32   0.817708  0.182292
     14.  Snijders T   23.920408   42   0.569534  0.430466
     15.  Skvoretz J   23.691667   39   0.607479  0.392521
     16.  Breiger R    23.520408   30   0.784014  0.215986
     17.  Krackhar D   22.031519   35   0.629472  0.370528
     18.  Valente T    21.616667   44   0.491288  0.508712
     19.  Barabasi A   18.755159   42   0.446551  0.553449
     20.  Mizruchi M   18.333333   25   0.733333  0.266667
     21.  Carley K     17.616667   35   0.503333  0.496667
     22.  Cohen C      17.111111   32   0.534722  0.465278
     23.  Moody J      16.916667   22   0.768939  0.231061
     24.  Rothenbe R   16.492063   40   0.412302  0.587698
     25.  Pattison P   16.483333   34   0.484804  0.515196
     26.  Batagelj V   16.353741   29   0.563922  0.436078
     27.  Lazega E     16.000000   20   0.800000  0.200000
     28.  Latkin C     15.896032   49   0.324409  0.675591
     29.  Wasserma S   15.803741   33   0.478901  0.521099
     30.  Berkman L    15.767857   36   0.437996  0.562004
  20. Third collaboration network
     The third collaboration network $Ct = N^T * N$:
     $ct_{ij}$ = the total contribution of collaboration of authors $i$ and $j$ to works.
     It holds $ct_{ij} = ct_{ji}$, $\sum_{i \in A} \sum_{j \in A} ct_{ij} = |W|$ and, for each work $p$, $\sum_{i \in A} \sum_{j \in A} n_{pi} \, n_{pj} = 1$: the total contribution of a complete subgraph corresponding to the authors of a work is 1.
     $\sum_{j \in A} ct_{ij} = \sum_{p \in W} n_{pi}$ is the total contribution of author $i$ to works from $W$.
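The identities above can be checked on a small example. The sketch assumes the equal-share normalization n_pi = 1 / deg(p) and that every work has at least one author; the data and names are invented for illustration.

```python
from fractions import Fraction

def third_collaboration(WA):
    """Ct = N^T * N with n_pi = 1 / deg(p) for every author i of work p."""
    ct = {}
    for authors in WA.values():
        share = Fraction(1, len(authors))    # assumes a non-empty author set
        for i in authors:
            for j in authors:
                ct[(i, j)] = ct.get((i, j), 0) + share * share
    return ct

WA = {"p1": {"a", "b"}, "p2": {"a"}, "p3": {"a", "b", "c"}}
ct = third_collaboration(WA)

total = sum(ct.values())                                     # should equal |W|
row_sum_a = sum(w for (i, _), w in ct.items() if i == "a")   # total contribution of a
print(total, row_sum_a, ct[("a", "b")])
```

Each work spreads a total weight of 1 over its complete subgraph, so works with many authors no longer dominate the network.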
  21. Components in SN5 cut at level 0.5
     Network SN5 (2008): for "social network*" + most frequent references + around 100 social networkers; |W| = 193376, |C| = 7950, |A| = 75930, |J| = 14651, |K| = 29267.
     [Figure: components of the network after the cut at level 0.5; the vertices are authors (e.g. Batagelj_V, Ferligoj_A, Doreian_P, Newman_M, Barabasi_A, Wasserma_S, Borgatti_S).]
  22. Authors' citations network
     [Figure: author $i$ is linked by $wa_{s,i}$ to work $s$, work $s$ cites work $t$ ($ci_{s,t}$), and work $t$ is linked by $wa_{t,j}$ to author $j$; the chain A, W, W, A corresponds to the matrices $WA^T$, $Ci$, $WA$.]
     $Ca = WA^T * Ci * WA$ is a network of citations between authors.
     The weight $w(i, j)$ counts the number of times a work authored by $i$ is citing a work authored by $j$.
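The triple product can be accumulated citation by citation: each citation from work s to work t adds 1 to every pair (author of s, author of t). A sketch with 0/1 weights and made-up data; the representation of Ci as a list of (citing, cited) pairs is an assumption.

```python
def author_citations(WA, Ci):
    """Ca = WA^T * Ci * WA for WA = {work: authors} and Ci = [(citing, cited), ...]."""
    ca = {}
    for s, t in Ci:                          # work s cites work t
        for i in WA.get(s, ()):
            for j in WA.get(t, ()):
                ca[(i, j)] = ca.get((i, j), 0) + 1
    return ca

WA = {"p1": ["a", "b"], "p2": ["c"], "p3": ["a"]}
Ci = [("p1", "p2"), ("p3", "p2"), ("p3", "p1")]
ca = author_citations(WA, Ci)
print(ca)
```

The entry for (a, a) is a self-citation: p3 and p1 are both authored by a.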
