V. Batagelj - Big data Networks from data bases

Transcript

  • 1. Big data: Networks from data bases. Vladimir Batagelj, University of Ljubljana. Undicesima conferenza nazionale di statistica (Eleventh National Conference of Statistics), Rome, February 20-21, 2013.
  • 2. Outline: 1. Two mode networks; 2. Multiplication; 3. Derived networks; 4. Pajek.
  • 3. Example: Internet Movie Data Base. [Figure: the films Die Another Day, Casino Royale and Skyfall linked to people involved in them: Lee Tamahori, Pierce Brosnan, Halle Berry, Paul Haggis, Neal Purvis, Robert Wade, Ian Fleming, Martin Campbell, Judi Dench, Daniel Craig, Mads Mikkelsen, Eva Green, Sam Mendes, John Logan, Ralph Fiennes, Javier Bardem.] On February 17, 2013 IMDB (Internet Movie Data Base) contained 2,262,638 titles and 4,745,392 names. Other data bases: Web of Science, Scopus, Zentralblatt Math, Google Scholar, DBLP, Amazon, etc.
  • 4. Two mode networks from data bases. A simple data base $B$ is a set of records $B = \{R_k : k \in K\}$, where $K$ is the set of keys. A record has the form $R_k = (k, q_1(k), q_2(k), \ldots, q_r(k))$, where $q_i(k)$ is the description of the property (attribute) $q_i$ for the key $k$. Suppose that the description $q(k)$ takes values in a finite set $Q$. If it does not, $Q$ can always be transformed into such a set by partitioning it and recoding the values. Then we can assign to the property $q$ a two-mode network $K \times q = (K, Q, L, w)$, where $(k, v) \in L$ iff $v \in q(k)$. $w(k, v)$ is the weight of the link $(k, v)$; often $w(k, v) = 1$. Single-valued properties can be represented by a partition. Examples: (papers, authors, was written by), (papers, keywords, is described by), (parliamentarians, problems, positive vote), (persons, journals, is reading), (persons, societies, is member of, years of membership), (buyers/consumers, goods, bought, quantity), etc.
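The construction on slide 4 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy records and the helper name `two_mode_network` are hypothetical and do not come from the slides.

```python
# Toy data base (hypothetical data): key -> record of properties.
records = {
    "p1": {"authors": {"Ann", "Bob"}, "year": 2011},
    "p2": {"authors": {"Bob"}, "year": 2012},
    "p3": {"authors": {"Ann", "Cid"}, "year": 2012},
}

def two_mode_network(records, prop):
    """Return (K, Q, L, w) for the property `prop`, with all weights 1."""
    keys = set(records)
    links = set()
    for k, rec in records.items():
        value = rec[prop]
        # A single-valued property is treated as a one-element set.
        if isinstance(value, (set, frozenset, list, tuple)):
            values = value
        else:
            values = [value]
        for v in values:
            links.add((k, v))  # (k, v) in L iff v in q(k)
    q_values = {v for _, v in links}
    weights = {link: 1 for link in links}
    return keys, q_values, links, weights

K, Q, L, w = two_mode_network(records, "authors")
print(sorted(L))
```

Calling the same helper with `"year"` yields the single-valued case, where every key has exactly one link and the network is equivalent to a partition of the keys.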
  • 5. Methods: degree distributions. In a network $(V, L)$ the degree $\deg(v)$ of vertex $v \in V$ is equal to the number of links that have vertex $v$ as their end-vertex. The indegree / outdegree is equal to the number of incoming / outgoing links. Usually one of the first analyses of a network is to look at its degree distribution(s). Are there isolated nodes ($\deg(v) = 0$)? Which are the nodes with the largest degrees? What is the average degree? What is the shape of the degree distribution?
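The questions on slide 5 translate directly into code. The following is a minimal sketch on a hypothetical five-vertex undirected network; the vertex names and links are made up for illustration.

```python
from collections import Counter

# Hypothetical undirected network: a vertex set and a list of links.
vertices = {"a", "b", "c", "d", "e"}
links = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]

# deg(v) = number of links that have v as an end-vertex.
degree = {v: 0 for v in vertices}
for u, v in links:
    degree[u] += 1
    degree[v] += 1

isolated = sorted(v for v, d in degree.items() if d == 0)
max_degree = max(degree.values())
largest = sorted(v for v, d in degree.items() if d == max_degree)
average = sum(degree.values()) / len(vertices)
distribution = Counter(degree.values())  # degree -> number of vertices

print(isolated, largest, average, dict(distribution))
```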
  • 6. Methods: two-mode cores and 4-rings weights. The subset of vertices $C \subseteq V$ is a $(p, q)$-core in a two-mode network $N = (V_1, V_2; L)$, $V = V_1 \cup V_2$, iff (a) in the induced subnetwork $K = (C_1, C_2; L(C))$, $C_1 = C \cap V_1$, $C_2 = C \cap V_2$, it holds $\forall v \in C_1 : \deg_K(v) \geq p$ and $\forall v \in C_2 : \deg_K(v) \geq q$; (b) $C$ is the maximal subset of $V$ satisfying condition (a). A $k$-ring is a simple closed chain of length $k$. Using $k$-rings we can define a weight of edges as $w_k(e)$ = number of different $k$-rings containing the edge $e \in E$. In a two-mode network there are no 3-rings. The densest substructures are complete bipartite subgraphs $K_{p,q}$. They contain many 4-rings. Therefore these weights can be used to identify the dense parts of a network.
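Both notions on slide 6 can be computed with short routines. The sketch below is one straightforward way to do it, not necessarily the algorithm used in Pajek: the core is found by repeatedly deleting vertices of too small degree, and $w_4$ by direct counting. It assumes a simple network (no repeated links) with disjoint vertex names in the two modes; the function names and the toy data are hypothetical.

```python
def pq_core(v1, v2, links, p, q):
    """Return (C1, C2), the (p, q)-core, by repeatedly deleting
    vertices whose degree in the remaining subnetwork is too small."""
    c1, c2 = set(v1), set(v2)
    changed = True
    while changed:
        deg = {x: 0 for x in c1 | c2}
        for u, v in links:
            if u in c1 and v in c2:
                deg[u] += 1
                deg[v] += 1
        drop1 = {u for u in c1 if deg[u] < p}
        drop2 = {v for v in c2 if deg[v] < q}
        changed = bool(drop1 or drop2)
        c1 -= drop1
        c2 -= drop2
    return c1, c2

def four_ring_weights(links):
    """w4(e) = number of different 4-rings containing the edge e."""
    nbrs1, nbrs2 = {}, {}
    for u, v in links:
        nbrs1.setdefault(u, set()).add(v)
        nbrs2.setdefault(v, set()).add(u)
    w4 = {}
    for u, v in links:
        # A 4-ring through (u, v) is u - v - u2 - v2 - u.
        w4[(u, v)] = sum(
            len((nbrs1[u] & nbrs1[u2]) - {v}) for u2 in nbrs2[v] - {u}
        )
    return w4

V1, V2 = {"a", "b", "c"}, {"x", "y", "z"}
links = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("c", "y"), ("c", "z")]
core = pq_core(V1, V2, links, 2, 2)
w4 = four_ring_weights(links)
print(core, w4)
```

In the toy network the (2,2)-core is the complete bipartite subgraph on {a, b} and {x, y}, and exactly its four edges have a nonzero 4-ring weight.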
  • 7. Example: (247,2)-core and (27,22)-core in IMDB. [Figure: the two cores; node labels are people (e.g. Hogan, Hulk; Austin, Steve (IV); McMahon, Vince) and titles (e.g. ’WWF Smackdown!’, ’Raw Is War’, Royal Rumble, Survivor Series).] IMDB 2005: $n_1 = 428440$, $n_2 = 896308$, $m = 3792390$.
  • 8. Example: Islands for $w_4$ / Charlie Brown and Adult. [Figure: two islands; node labels are titles (e.g. Charlie Brown Christmas; It’s the Great Pumpkin, Charlie Brown; Snoopy Come Home) and people (e.g. Melendez, Bill; Robbins, Peter (I); Jeremy, Ron; North, Peter (I)).]
  • 9. Sparsity and Dunbar’s number. Networks obtained from data bases are usually large: tens of thousands or millions of nodes. Large networks are usually sparse: they have a small average degree. In one-mode networks describing relations among people this can be related to Dunbar’s number, with a value around 150. See Wikipedia: Dunbar’s number. In general, the initiator of a link who wants to keep it has to spend / invest a certain amount of the finite total "energy" he or she has.
  • 10. Multiplication of networks. To a simple two-mode network $N = (I, J, E, w)$, where $I$ and $J$ are sets of vertices, $E$ is a set of edges linking $I$ and $J$, and $w : E \to \mathbb{R}$ (or some other semiring) is a weight, we can assign a network matrix $W = [w_{i,j}]$ with elements $w_{i,j} = w(i, j)$ for $(i, j) \in E$ and $w_{i,j} = 0$ otherwise. Given a pair of compatible networks $N_A = (I, K, E_A, w_A)$ and $N_B = (K, J, E_B, w_B)$ with corresponding matrices $A_{I \times K}$ and $B_{K \times J}$, we call the product of networks $N_A$ and $N_B$ the network $N_C = (I, J, E_C, w_C)$, where $E_C = \{(i, j) : i \in I, j \in J, c_{i,j} \neq 0\}$ and $w_C(i, j) = c_{i,j}$ for $(i, j) \in E_C$. The product matrix $C = [c_{i,j}]_{I \times J} = A * B$ is defined in the standard way: $c_{i,j} = \sum_{k \in K} a_{i,k} \cdot b_{k,j}$. In the case when $I = K = J$ we are dealing with ordinary one-mode networks (with square matrices).
  • 11. Multiplication of networks. [Figure: a vertex $i \in I$ linked to $k \in K$ with weight $a_{i,k}$ in $A$, and $k$ linked to $j \in J$ with weight $b_{k,j}$ in $B$.] $c_{i,j} = \sum_{k \in K} a_{i,k} \cdot b_{k,j}$. If all weights in networks $N_A$ and $N_B$ are equal to 1, the value of $c_{i,j}$ counts the number of ways we can go from $i \in I$ to $j \in J$ passing through $K$.
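The path-counting reading of the product can be checked on a tiny example. The sketch below uses plain nested lists and the standard (dense) definition; the matrices `A` and `B` are hypothetical, with $|I| = 2$, $|K| = 3$, $|J| = 2$ and all weights equal to 1.

```python
# Rows of A are vertices of I, columns are vertices of K.
A = [[1, 1, 0],
     [0, 1, 0]]
# Rows of B are vertices of K, columns are vertices of J.
B = [[1, 0],
     [1, 1],
     [0, 1]]

def product(a, b):
    """Standard matrix product: c[i][j] = sum over k of a[i][k] * b[k][j]."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]

C = product(A, B)
print(C)
```

Here `C[0][0]` is 2 because the first vertex of $I$ reaches the first vertex of $J$ through two different vertices of $K$.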
  • 12. Multiplication of networks. The standard matrix multiplication has the complexity $O(|I| \cdot |K| \cdot |J|)$; it is too slow to be used for large networks. For sparse large networks we can multiply much faster by considering only nonzero elements. In general the multiplication of large sparse networks is a "dangerous" operation, since the result can "explode": it is not sparse. If for the sparse networks $N_A$ and $N_B$ there are in $K$ only few vertices with large degree, and none of them has large degree in both networks, then the resulting product network $N_C$ is also sparse.
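A sketch of sparse multiplication, and of the "explosion" slide 12 warns about, assuming networks are stored as dictionaries of nonzero entries only. The representation, the function names and the two toy cases are illustrative assumptions, not Pajek's implementation.

```python
def multiply_sparse(a, b):
    """Product of networks stored as {row: {col: weight}} holding only
    nonzero entries; the work is proportional to the number of
    (a[i][k], b[k][j]) pairs that actually meet in some k."""
    c = {}
    for i, row in a.items():
        for k, a_ik in row.items():
            for j, b_kj in b.get(k, {}).items():
                c_row = c.setdefault(i, {})
                c_row[j] = c_row.get(j, 0) + a_ik * b_kj
    return c

def nnz(net):
    """Number of stored (nonzero) entries."""
    return sum(len(row) for row in net.values())

n = 100
# Case 1: a single hub "k0" in K linked to every vertex of I and of J.
hub_a = {f"i{x}": {"k0": 1} for x in range(n)}
hub_b = {"k0": {f"j{y}": 1 for y in range(n)}}
# Case 2: no hub, every i and j is linked to its own vertex of K.
thin_a = {f"i{x}": {f"k{x}": 1} for x in range(n)}
thin_b = {f"k{x}": {f"j{x}": 1} for x in range(n)}

hub_c = multiply_sparse(hub_a, hub_b)
thin_c = multiply_sparse(thin_a, thin_b)
print(nnz(hub_a), nnz(hub_b), nnz(hub_c))
print(nnz(thin_a), nnz(thin_b), nnz(thin_c))
```

Both inputs have 100 links per factor, but the hub case produces a complete product with $100 \cdot 100$ links, while the hub-free case stays at 100.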
  • 13. Derived networks. From a bibliographical data base we get two-mode networks $WA$ = Works × Authors and $WK$ = Works × Keywords. Since they have a common set Works, the networks $WA^T$ and $WK$ are compatible, and multiplying them we obtain a derived network $AK = WA^T * WK$. The entry $ak_{it}$ is the number of times author $i$ used keyword $t$ in his/her works. The dataset of EU projects on simulation (January 2006) contains data about research groups. We obtain networks: $P$ = Groups × Projects, $C$ = Groups × Countries, and $U$ = Groups × Institutions. Sizes: |Groups| = 8869, |Projects| = 933, |Institutions| = 3438, |Countries| = 60. In the derived network $W$ = Projects × Institutions = $P^T * U$ we determine link islands for $w_4$.
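The derived network $AK = WA^T * WK$ can be computed without forming the transpose explicitly, by walking over the shared set of works. A minimal sketch with hypothetical bibliographic data (the works, authors and keywords are made up, and all weights are 1):

```python
from collections import defaultdict

# Hypothetical data: work -> set of authors, work -> set of keywords.
WA = {"w1": {"Ann", "Bob"}, "w2": {"Ann"}, "w3": {"Bob", "Cid"}}
WK = {"w1": {"cores", "islands"}, "w2": {"cores"}, "w3": {"islands"}}

# ak[i][t] = sum over works p of wa[p][i] * wk[p][t]
AK = defaultdict(lambda: defaultdict(int))
for work, authors in WA.items():
    for author in authors:
        for keyword in WK.get(work, ()):
            AK[author][keyword] += 1

for author in sorted(AK):
    print(author, dict(AK[author]))
```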
  • 14. Analysis of Projects × Institutions. [Figure: link islands in the Projects × Institutions network; node labels are project codes (e.g. G4RD-CT-2002-00836, IST-2000-29207, 7210-PR/163) and institutions (e.g. AIRBUS FRANCE SAS, DAIMLER CHRYSLER AG, VOLKSWAGEN AG, THYSSENKRUPP STAHL A.G.).]
  • 15. Collaboration networks. Let $WA$ be the works × authors two-mode network; $wa_{pi} \in \{0, 1\}$ describes the authorship of author $i$ of work $p$. $\sum_{i \in A} wa_{pi} = \deg(p)$ = number of authors of work $p$. Let $N$ be its normalized version, $\forall p \in W : \sum_{i \in A} n_{pi} = 1$, obtained from $WA$ by $n_{pi} = wa_{pi} / \deg(p)$, or by some other rule determining the author’s contribution. The first collaboration network is $Co = WA^T * WA$, with $co_{ij} = \sum_{p \in W} wa_{pi} \, wa_{pj} = \sum_{p \in N(i) \cap N(j)} 1$, so $co_{ij}$ = the number of works that authors $i$ and $j$ wrote together. Problem: the $Co$ network is composed of complete graphs on the sets of works’ authors. Works with many authors produce large complete subgraphs.
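A small sketch of slide 15, assuming the equal-share normalization $n_{pi} = wa_{pi} / \deg(p)$ and a hypothetical three-work data set. Exact fractions are used so the row sums of $N$ can be checked without rounding.

```python
from fractions import Fraction
from itertools import product

# Hypothetical works × authors network: work -> list of its authors.
WA = {"p1": ["Ann", "Bob"], "p2": ["Ann", "Bob", "Cid"], "p3": ["Ann"]}

# Normalized network N: n_pi = wa_pi / deg(p), so each row sums to 1.
N = {p: {a: Fraction(1, len(auth)) for a in auth} for p, auth in WA.items()}

# First collaboration network Co = WA^T * WA:
# every work adds 1 to co_ij for each ordered pair of its authors.
Co = {}
for authors in WA.values():
    for i, j in product(authors, repeat=2):
        Co[(i, j)] = Co.get((i, j), 0) + 1

print(Co[("Ann", "Bob")], Co[("Ann", "Ann")])
```

A work with $m$ authors contributes $m^2$ entries to $Co$, which is the "large complete subgraphs" problem stated on the slide.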
  • 16. Cores of orders 10–21 in Computational Geometry. [Figure: cores of the collaboration network; node labels are authors (e.g. P.K.Agarwal, M.Sharir, L.J.Guibas, H.Edelsbrunner, J.S.Snoeyink, B.Chazelle, R.Tamassia, G.T.Toussaint).]
  • 17. $p_S$-core at level 46 of Computational Geometry. [Figure: the core; node labels are authors (e.g. P.Agarwal, M.Sharir, L.Guibas, H.Edelsbrunner, J.Snoeyink, B.Chazelle, R.Tamassia, G.Toussaint).]
  • 18. Second collaboration network. The second collaboration network is $Cn = WA^T * N$, with $cn_{ij} = \sum_{p \in W} wa_{pi} \, n_{pj} = \sum_{p \in N(i) \cap N(j)} n_{pj}$, so $cn_{ij}$ = the contribution of author $j$ to works that (s)he wrote together with the author $i$. It holds $\sum_{i \in A} \sum_{j \in A} wa_{pi} \, n_{pj} = \deg(p)$ and $\sum_{j \in A} cn_{ij} = \deg(i)$. $cn_{ii} = \sum_{p \in N(i)} n_{pi}$ is the contribution of author $i$ to his/her works. Self-sufficiency: $S_i = cn_{ii} / \deg(i)$. Collaborativeness (co-authorship index): $K_i = 1 - S_i$.
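The quantities on slide 18 can be verified on a toy example. This sketch assumes the equal-share rule $n_{pj} = 1 / \deg(p)$ and hypothetical data; it checks the row-sum identity $\sum_j cn_{ij} = \deg(i)$ and computes $S_i$ and $K_i$ with exact fractions.

```python
from fractions import Fraction

# Hypothetical works × authors network: work -> list of its authors.
WA = {"p1": ["Ann", "Bob"], "p2": ["Ann", "Bob", "Cid"], "p3": ["Ann"]}
authors = sorted({a for auth in WA.values() for a in auth})
deg = {a: sum(a in auth for auth in WA.values()) for a in authors}

# Second collaboration network Cn = WA^T * N.
Cn = {(i, j): Fraction(0) for i in authors for j in authors}
for auth in WA.values():
    share = Fraction(1, len(auth))  # n_pj under the equal-share rule
    for i in auth:
        for j in auth:
            Cn[(i, j)] += share

S = {i: Cn[(i, i)] / deg[i] for i in authors}  # self-sufficiency
K = {i: 1 - S[i] for i in authors}             # collaborativeness

for i in authors:
    print(i, Cn[(i, i)], S[i], K[i])
```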
  • 19. The "best" authors in Statistics.
         name         contrib     pap   self       collab
     1.  Burt R       83.716667   96    0.872049   0.127951
     2.  Newman M     59.533333   87    0.684291   0.315709
     3.  Doreian P    59.070408   75    0.787605   0.212395
     4.  Bonacich P   45.416667   59    0.769774   0.230226
     5.  Marsden P    41.000000   50    0.820000   0.180000
     6.  White H      39.986111   51    0.784041   0.215959
     7.  Wellman B    38.754762   57    0.679908   0.320092
     8.  Friedkin N   36.333333   40    0.908333   0.091667
     9.  Leydesdo L   34.533333   47    0.734752   0.265248
    10.  Borgatti S   30.469048   57    0.534545   0.465455
    11.  Freeman L    30.250000   36    0.840278   0.159722
    12.  Everett M    27.450000   45    0.610000   0.390000
    13.  Litwin H     26.166667   32    0.817708   0.182292
    14.  Snijders T   23.920408   42    0.569534   0.430466
    15.  Skvoretz J   23.691667   39    0.607479   0.392521
    16.  Breiger R    23.520408   30    0.784014   0.215986
    17.  Krackhar D   22.031519   35    0.629472   0.370528
    18.  Valente T    21.616667   44    0.491288   0.508712
    19.  Barabasi A   18.755159   42    0.446551   0.553449
    20.  Mizruchi M   18.333333   25    0.733333   0.266667
    21.  Carley K     17.616667   35    0.503333   0.496667
    22.  Cohen C      17.111111   32    0.534722   0.465278
    23.  Moody J      16.916667   22    0.768939   0.231061
    24.  Rothenbe R   16.492063   40    0.412302   0.587698
    25.  Pattison P   16.483333   34    0.484804   0.515196
    26.  Batagelj V   16.353741   29    0.563922   0.436078
    27.  Lazega E     16.000000   20    0.800000   0.200000
    28.  Latkin C     15.896032   49    0.324409   0.675591
    29.  Wasserma S   15.803741   33    0.478901   0.521099
    30.  Berkman L    15.767857   36    0.437996   0.562004
  • 20. Third collaboration network. The third collaboration network is $Ct = N^T * N$, where $ct_{ij}$ = the total contribution of collaboration of authors $i$ and $j$ to works. It holds $ct_{ij} = ct_{ji}$, $\sum_{i \in A} \sum_{j \in A} ct_{ij} = |W|$, and, for each work $p$, $\sum_{i \in A} \sum_{j \in A} n_{pi} \, n_{pj} = 1$: the total contribution of a complete subgraph corresponding to the authors of a work is 1. $\sum_{j \in A} ct_{ij} = \sum_{p \in W} n_{pi}$ is the total contribution of author $i$ to works from $W$.
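The identities on slide 20 can be checked numerically. The sketch below again assumes the equal-share normalization and a hypothetical three-work data set, and uses exact fractions so the total comes out as exactly $|W|$.

```python
from fractions import Fraction

# Hypothetical works × authors network: work -> list of its authors.
WA = {"p1": ["Ann", "Bob"], "p2": ["Ann", "Bob", "Cid"], "p3": ["Ann"]}
authors = sorted({a for auth in WA.values() for a in auth})

# Third collaboration network Ct = N^T * N with n_pi = 1 / deg(p).
Ct = {}
for auth in WA.values():
    share = Fraction(1, len(auth))
    for i in auth:
        for j in auth:
            Ct[(i, j)] = Ct.get((i, j), Fraction(0)) + share * share

total = sum(Ct.values())  # equals the number of works |W|
row_ann = sum(Ct.get(("Ann", j), Fraction(0)) for j in authors)
print(total, row_ann)
```

Each work contributes exactly 1 to the total regardless of how many authors it has, which is what removes the bias toward works with many authors.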
  • 21. Components in SN5 cut at level 0.5. Network SN5 (2008): for "social network*" + most frequent references + around 100 social networkers; $|W| = 193376$, $|C| = 7950$, $|A| = 75930$, $|J| = 14651$, $|K| = 29267$. [Figure: components of the network after the cut; node labels are authors (e.g. Batagelj_V, Ferligoj_A, Doreian_P, Newman_M, Barabasi_A, Wasserma_S, Borgatti_S, Wellman_B).]
  • 22. Authors’ citations network. [Figure: author $i$ linked to work $s$ with weight $wa_{s,i}$, work $s$ citing work $t$ with weight $ci_{s,t}$, and work $t$ linked to author $j$ with weight $wa_{t,j}$.] $Ca = WA^T * Ci * WA$ is a network of citations between authors. The weight $w(i, j)$ counts the number of times a work authored by $i$ is citing a work authored by $j$.
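The triple product $Ca = WA^T * Ci * WA$ can be evaluated by walking over the citation links, which avoids building any matrix. A minimal sketch with hypothetical works `s`, `t`, `u` and unit weights:

```python
from collections import defaultdict

# Hypothetical data: work -> authors, and citation links (citing, cited).
WA = {"s": ["Ann"], "t": ["Bob", "Cid"], "u": ["Bob"]}
Ci = [("s", "t"), ("s", "u"), ("t", "u")]

# ca[i][j] = sum over citations s -> t of wa[s][i] * wa[t][j]
Ca = defaultdict(int)
for citing, cited in Ci:
    for i in WA[citing]:
        for j in WA[cited]:
            Ca[(i, j)] += 1

print(dict(Ca))
```

Note that the result can contain loops such as (Bob, Bob), which here records a self-citation between two different works of the same author.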
  • 23. Islands in SN5 authors citation network. Network SN5 (2008): for "social network*" + most frequent references + around 100 social networkers; $|W| = 193376$, $|C| = 7950$, $|A| = 75930$, $|J| = 14651$, $|K| = 29267$. [Figure: islands of the authors citation network; node labels are authors (e.g. BURT_R, GRANOVET_M, FREEMAN_L, WASSERMA_S, NEWMAN_M, WATTS_D, BARABASI_A, BORGATTI_S).]
  • 24. Pajek and ESNA. Pajek, a program for analysis and visualization of large networks, is freely available for noncommercial use at its web site: http://pajek.imfm.si/ An introduction to social network analysis with Pajek is available in the book ESNA (de Nooy, Mrvar, Batagelj 2005). The second, extended edition appeared in September 2011. ESNA in Japanese was published by Tokyo Denki University Press in 2010; the Chinese edition followed in November 2012. Pajek 2.* → Pajek 3.*
  • 25. References. Batagelj, V.: Social Network Analysis, Large-Scale. In: R.A. Meyers (ed.), Encyclopedia of Complexity and Systems Science, Springer 2009: 8245-8265. Batagelj, V., Cerinšek, M.: On bibliographic networks. Scientometrics (2013). DOI: 10.1007/s11192-012-0940-1. Batagelj, V., Mrvar, A.: Analysis of Kinship Relations With Pajek. Social Science Computer Review 26(2), 224-246, 2008. The work was supported in part by the ARRS, Slovenia, grant P1-0294, as well as by grant N1-0011 within the EUROCORES Programme EUROGIGA (project GReGAS) of the European Science Foundation. http://pajek.imfm.si/lib/exe/fetch.php?media=pub:cns11.pdf