Disc2013 keynote speakers
Transcript

  • 1. Exploring the Structure of Government on the Web. Presentation by Robert Ackland at DISC2013, 12-14 December 2013, Daegu, South Korea. Authors: Robert Ackland (Australian National University), Paul Henman (University of Queensland), Tim Graham (University of Queensland). Homepage: https://researchers.anu.edu.au/researchers/ackland-rj Project: http://voson.anu.edu.au
  • 2. VOSON Project at the ANU (http://voson.anu.edu.au): teaching, research and tool development in computational social science, network science and web science since 2003.
  • 3. Background
    ● Government use of the Internet has rapidly evolved.
    ● While this evolution has been examined in terms of the content, usability and interactivity of sites, the institutional structure of government on the web is less explored.
    ● Australian Research Council-funded project titled "The institutional structure of e-government: a cross-policy, cross-country comparison" (Henman, Ackland, Margetts)
  • 4. Overall aims of project
    ● Aim 1: Assess whether government hyperlink networks reflect offline institutional structures. Is e-government facilitating joined-up government, or are jurisdictional boundaries still a significant barrier?
    ● Whalen (2011) studied the hyperlink structure of the US .gov domain, assessing the correspondence between the online structure of US government and its offline hierarchy.
    ● The major difference is that our project compares the UK and Australia, identifying both similarities and contrasts in the relationship between institutional structure and online presence.
  • 5. Aim 2: Use hyperlink data to assess the “nodality” of government (Hood & Margetts 2007): is government at the centre of informational networks on the Web?
    ● Nodality affects whether government messages are received by the population.
    ● The Web might increase government nodality, but it can also decrease nodality through increased competition from other information providers (who may destabilise/confuse/subvert the messages and actions of government). Example: anti-vaccination lobby groups.
    ● We ask: is government using the web to enhance its visibility? Are there differences in nodality across policy domains and countries (AU and UK)?
    ● Our approach differs from that used by Escher et al. (2006):
        ● Escher et al. focused only on the UK Foreign Office (and its US and Australian counterparts); our analysis includes other sectors of government, allowing cross-country and cross-sector comparisons.
        ● We collect more hyperlink data, allowing us to identify the connections between sites that link to (or are linked to by) government sites.
        ● We can construct nodality measures that are different to those used by Escher et al. (e.g. those requiring complete network data).
  • 6. Webometrics (link count analysis)
    ● Focuses on ego networks, rather than complete networks
    ● Typically only the attributes of ego are known, not those of alters
  • 7. Today – some methodological aspects
    ● Hyperlink network data collection (VOSON)
    ● Network reduction techniques
    ● Community structure in government hyperlink networks
    ● Coding websites (machine learning)
  • 8. Hyperlink network data collection (VOSON)
  • 9. Manually identified AU and UK government seed pages (typically, entry pages to government websites): AU 88 pages, UK 92 pages
    ● Used the VOSON software (http://voson.anu.edu.au) to construct hyperlink network data using a two-stage approach:
        ● Stage 1:
            ● The VOSON in-built crawler crawled the seed sites, finding internal pages linked to from the entry page.
            ● Collected outbound links from each of the internal pages, and also text content.
            ● The Bing API was used to find all inbound links to each of the internal pages (including the seed page).
        ● Stage 2:
            ● Every new page discovered above (i.e. pages that either link to or are linked to by a government web page) was then crawled by the VOSON in-built crawler to find connections among these pages.
    ● Data collected in 2012
  • 10.
  • 11. VOSON 2.0 web interface works with Firefox, Chrome, Safari and iPad. VOSON+NodeXL allows construction and import of hyperlink networks from within NodeXL.
  • 12. Network reduction techniques
  • 13. Network size (pages): AU 1,517,020 nodes (pages); UK 1,588,757 nodes (pages)
    ● First major network reduction technique: construct a network of websites rather than pages
        ● VOSON has an approach for automatically grouping pages into “pagegroups”
        ● e.g. for AU, 6,694 pages from the Australian Taxation Office are all included in a single node, “ato.gov.au”
    ● Full network size (pagegroups/sites):
        ● AU: 110,665 nodes (pagegroups), 290,031 edges
        ● UK: 109,161 nodes (pagegroups), 280,580 edges
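The page-to-site reduction described on slide 13 can be sketched roughly as follows. This is only an illustration, not VOSON's pagegroup algorithm: it assumes a site is simply the hostname with a leading "www." removed, and it uses the edge-weight definition given on slide 18 (number of pages on site i that link to site j). The page URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse


def site_of(url):
    """Map a page URL to a site label (assumption: hostname minus a leading 'www.')."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def collapse_to_sites(page_links):
    """Collapse page-level links into weighted site-level edges.

    The weight of edge (i, j) is the number of distinct pages on site i
    that link to site j; links within a site are dropped.
    """
    linking_pages = defaultdict(set)
    for src, dst in page_links:
        s, t = site_of(src), site_of(dst)
        if s and t and s != t:
            linking_pages[(s, t)].add(src)
    return {edge: len(pages) for edge, pages in linking_pages.items()}


# Hypothetical page-level links, for illustration only.
page_links = [
    ("http://www.ato.gov.au/a", "http://www.australia.gov.au/"),
    ("http://www.ato.gov.au/a", "http://australia.gov.au/help"),
    ("http://www.ato.gov.au/b", "http://australia.gov.au/"),
    ("http://www.ato.gov.au/b", "http://www.ato.gov.au/a"),
]
site_edges = collapse_to_sites(page_links)
print(site_edges)
```

Counting distinct source pages rather than raw links keeps a single page with many links to the same site from inflating the weight.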
  • 14. Gephi map of the UK network, showing only the 30K+ nodes with indegree+outdegree>1 ...not much analytical potential from this visualisation...
  • 15. In future work we will be investigating approaches for removing edges to reveal the “backbone” of the UK and AU government hyperlink networks
    ● e.g. Serrano, M., Boguñá, M. and A. Vespignani (2009): “Extracting the multiscale backbone of complex weighted networks,” PNAS, 106(16), 6483-6488.
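As a rough illustration of what backbone extraction could look like, the sketch below applies a disparity-filter style test in the spirit of Serrano et al. (2009) to a small undirected weighted graph. It is an assumption-laden sketch, not the project's method: the networks described here are directed, the significance level `alpha` is arbitrary, and the toy edge list is made up.

```python
from collections import defaultdict


def disparity_backbone(edges, alpha=0.05):
    """Keep edges whose weight is unexpectedly large for at least one endpoint.

    edges: dict {(u, v): weight} for an undirected graph, each pair listed once.
    For a node with degree k and strength s, an edge of weight w gets the
    score (1 - w / s) ** (k - 1); small scores mark locally dominant edges.
    """
    strength = defaultdict(float)
    degree = defaultdict(int)
    for (u, v), w in edges.items():
        for node in (u, v):
            strength[node] += w
            degree[node] += 1

    def score(node, w):
        k = degree[node]
        if k < 2:
            return 1.0  # a degree-1 node cannot make its only edge significant
        return (1.0 - w / strength[node]) ** (k - 1)

    return {
        (u, v): w
        for (u, v), w in edges.items()
        if min(score(u, w), score(v, w)) < alpha
    }


# Toy graph (hypothetical weights): one heavy edge and three light ones.
edges = {("hub", "a"): 100.0, ("hub", "b"): 1.0, ("hub", "c"): 1.0, ("hub", "d"): 1.0}
backbone = disparity_backbone(edges, alpha=0.05)
print(backbone)
```

Only the heavy edge survives here, because it carries almost all of the hub's strength while the light edges are indistinguishable from a uniform split.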
  • 16. Community structure in government hyperlink networks
  • 17. Some approaches for 'community' detection in networks
    ● Modularity maximisation (Lancichinetti & Fortunato, 2012)
    ● Edge-Betweenness (Girvan & Newman, 2001)
    ● Fast-Greedy (Clauset et al, 2004)
    ● Multi-Level (Blondel et al, 2008)
    ● Walktrap (Pons & Latapy, 2005)
    ● Infomap (Rosvall, Axelsson & Bergstrom, 2009)
  • 18. The hyperlink networks we have collected are both directed and weighted (the weight on the edge from node i to node j is the number of pages on site i with links to site j)
    ● Of the above, only Edge-Betweenness and Infomap support directed and weighted graphs
  • 19. Edge-Betweenness
    ● We found the Edge-Betweenness algorithm (as implemented in igraph/R) does not scale well.
    ● In a test run with the UK hyperlink network, the algorithm did not converge after 24 hours running...
  • 20. Infomap
    ● See: http://www.mapequation.org
    ● Scales well for large, dense networks
    ● Information theoretic approach, appropriate to this network, where there is a flow of information and attention
    ● If site i links to site j, we can think of a flow of information from j to i and a flow of attention from i to j.
    ● We do not have data on the flow of web users from site i to site j, i.e. 'clickstream data'
    ● We therefore make the assumption that the number of pages on site i that contain hyperlinks to site j (these are our edge weights) is proportional to the flow of attention/information
  • 21. First attempt...
    ● Tried Infomap implemented in R/iGraph (v. 0.6.5)
    ● Results: Not good! The algorithm consistently generated a single massive community (approx. 95% of nodes) and thousands of tiny communities (1 or 2 nodes per community)
    ● Results do not pass the ‘sanity test’ (i.e. face validity)
    ● The problem:
        ● Many nodes in the UK network have no outlinks
        ● Therefore, the effect of teleportation in the Infomap algorithm is significant (it randomly connects nodes)
        ● This problem was solved in Lambiotte and Rosvall (2012)
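To make the teleportation issue on slide 21 concrete, here is a minimal sketch of a standard PageRank power iteration on a weighted directed graph. It is not Infomap's code and not the Lambiotte and Rosvall (2012) fix; the damping value and the tiny graph are assumptions chosen for illustration. The point is that rank sitting on nodes without outlinks has nowhere to go, so it is spread uniformly over all nodes, which is a random connection.

```python
def pagerank(out_links, damping=0.85, iters=100):
    """Power-iteration PageRank on a weighted directed graph.

    out_links: {node: {target: weight}}; every node must appear as a key.
    Rank held by dangling nodes (no outlinks) is redistributed uniformly.
    """
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        dangling = sum(rank[v] for v in nodes if not out_links[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            total = sum(out_links[v].values())
            for target, weight in out_links[v].items():
                new[target] += damping * rank[v] * weight / total
        rank = new
    return rank


# Hypothetical graph: one site with outlinks, two sites with none.
graph = {"gov": {"a": 3, "b": 1}, "a": {}, "b": {}}
rank = pagerank(graph)
dangling_mass = sum(rank[v] for v in graph if not graph[v])
print(rank, dangling_mass)
```

When `dangling_mass` is large, most of the modelled flow moves by uniform redistribution rather than along observed hyperlinks, which is consistent with the degenerate single-community result described on the slide.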
  • 22. Second attempt...
    ● Results from Lambiotte and Rosvall (2012) were recently incorporated into the Infomap algorithm
    ● This latest code is not yet integrated in R/iGraph
    ● So, next steps:
        ● Download and compile the C++ source code for Infomap (v. 0.12.13) from http://www.mapequation.org/code.html
        ● Run the standalone Infomap algorithm
    ● Using the Infomap Map Generator, we can examine the community structure of the UK network at different scales (varying the number of communities displayed and the number of links between communities)
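Running a standalone tool means writing the site network to a file first. The sketch below writes a weighted directed edge list in Pajek .net layout (a `*Vertices` section followed by `*Arcs`), which is one of the input formats the standalone Infomap accepts. The function name and the example edge weight are hypothetical, and the slides do not say which format was actually used.

```python
def to_pajek(site_edges):
    """Render {(source, target): weight} as Pajek .net text with an *Arcs section."""
    names = sorted({name for edge in site_edges for name in edge})
    index = {name: i for i, name in enumerate(names, start=1)}  # Pajek ids start at 1
    lines = [f"*Vertices {len(names)}"]
    lines += [f'{index[name]} "{name}"' for name in names]
    lines.append("*Arcs")
    lines += [
        f"{index[s]} {index[t]} {w}" for (s, t), w in sorted(site_edges.items())
    ]
    return "\n".join(lines) + "\n"


# Hypothetical edge: 3 pages on bis.gov.uk link to direct.gov.uk.
pajek_text = to_pajek({("bis.gov.uk", "direct.gov.uk"): 3})
print(pajek_text)
```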
  • 23. 17 out of 4571 communities (44% of all flow)
  • 24. 45 out of 4571 communities (70% of all flow)
  • 25. Each community is named after the website that has the highest flow and PageRank in that particular community (i.e. the ‘top dog’ website)
    ● The distribution of flow across the network follows a power law
        ● There are many communities, but a very small percentage ‘hog’ all the flow across the network
        ● The top 5% of communities (229 out of 4571) account for about 86% of all flow in the network
    ● Infomap uses an implementation of the PageRank algorithm to calculate the ‘importance’ of each community (aggregate PageRank of all websites in that community)
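The "top 5% of communities" figure is a simple concentration measure. A minimal sketch of how such a share could be computed is shown below; the helper name `top_share` and the toy flow values are made up, and only the count of 4571 communities comes from the slides.

```python
import math


def top_share(flows, fraction):
    """Share of total flow held by the top `fraction` of communities (count rounded up)."""
    ranked = sorted(flows, reverse=True)
    k = max(1, math.ceil(fraction * len(ranked)))
    return sum(ranked[:k]) / sum(ranked)


# Hypothetical per-community flow values.
toy_flows = [50, 30, 10, 5, 5]
share = top_share(toy_flows, 0.2)
print(share)

# With 4571 communities, the top 5% corresponds to this many communities:
top_count = math.ceil(0.05 * 4571)
print(top_count)
```

Rounding up reproduces the 229 communities quoted on the slide.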
  • 26. Preliminary findings
    ● Extremely influential communities form around social media and blogging platforms
        ● A massive amount of flow is directed through the ‘Twitter’ community (e.g. from Twitter to www.parliament.uk)
    ● Many UK seed sites form influential communities (i.e. Top 20), but not all.
    ● Somewhat unexpectedly, two UK Gov ‘business’ websites each form highly influential communities
        ● http://www.direct.gov.uk (community rank #4, 0.048% of all flow throughout network)
        ● http://bis.gov.uk (community rank #8, 0.025% of all flow throughout network)
  • 27. Coding websites
  • 28. To understand the structure of government hyperlink networks, we need to know something about the websites in these networks
    ● What policy domain are they in? (health, education, social security?)
    ● Generic top-level domains (.edu, .com, .org etc.) will only give very coarse-grained information on who these sites are
    ● This is social science research so we need more information on nodes
    ● Options:
        1. Manually code every site (not feasible, as we have >100K sites)
        2. Manually code a subset of sites e.g. the “most important” sites based on centrality measure (scientifically valid?)
        3. Manually code a sample of sites (e.g. adaptive sampling). To be explored in future...
        4. Manually code training dataset and then use machine learning to predict website type
    ● The following is a summary of preliminary work on approach 4...
  • 29. Data collection
    ● A subset of 'important' websites in the UK network were coded into discrete policy domains by a human coder
        ● Subset chosen as seed sites plus sites connected to two or more seed sites
        ● e.g. coding: ‘Community services’, ‘Health’, ‘Foreign Affairs’
    ● Need to collect and ‘clean’ the HTML data from websites in the network
    ● While the original VOSON crawl collected text content for all websites crawled, for this proof of concept we re-collected the text content (in future we will use the VOSON-collected text data)
  • 30. Text processing
    ● R ‘XML’ package used to clean the HTML (strip HTML tags, remove white spaces, remove strange ASCII characters, convert to lowercase, extract key word frequencies)
    ● 2157 websites were usable (i.e. with ‘clean’ web text and a known policy domain)
    ● Machine learning using the ‘RTextTools’ package in R (supervised learning for text classification)
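The cleaning steps listed on slide 30 were done with the R ‘XML’ package; the sketch below shows the same kind of pipeline (strip tags, normalise case, count keywords) using only the Python standard library. It is an illustration under assumptions: the tokenisation rule (runs of ASCII letters) and the sample HTML are invented, not taken from the project.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def keyword_frequencies(html):
    """Strip tags, lowercase the text and count words made of ASCII letters."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts).lower()
    return Counter(re.findall(r"[a-z]+", text))


# Hypothetical page content.
html = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Health Services</h1><p>Find   health advice.</p></body></html>"
)
freq = keyword_frequencies(html)
print(freq.most_common(3))
```

Word counts of this kind are the usual input to a document-term matrix for supervised text classification.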
  • 31. Support Vector Machine (SVM)
    ● Websites with known policy codes = 2157
        ● SVM ‘training sample’ = 2000
        ● SVM ‘test sample’ = 157
    ● Some example results of classification:
        Policy domain      PRECISION   RECALL   F-SCORE
        Education          0.94        0.83     0.88
        Employment         1.00        0.14     0.25
        Environment        0.99        0.79     0.88
        Foreign Affairs    1.00        0.44     0.61
        Health             0.52        0.97     0.68
        Housing            0.96        0.79     0.87
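The F-scores in the table are consistent with the usual harmonic mean of precision and recall (F1). The slides do not state the formula, so that is an assumption; the short check below recomputes the column from the reported precision and recall values.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1); 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported on the slide.
reported = {
    "Education": (0.94, 0.83),
    "Employment": (1.00, 0.14),
    "Environment": (0.99, 0.79),
    "Foreign Affairs": (1.00, 0.44),
    "Health": (0.52, 0.97),
    "Housing": (0.96, 0.79),
}
recomputed = {name: round(f_score(p, r), 2) for name, (p, r) in reported.items()}
print(recomputed)
```

The Employment row shows why the F-score matters: perfect precision with very low recall still gives a poor combined score.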
  • 32. SVM Conclusion
    ● Surprising level of accuracy
    ● Future work will involve:
        ● More data (will use HTML collected via VOSON)
        ● Investigating different machine learning algorithms
  • 33.
  • 34. Motivation
  • 35. Research Goal
  • 36. Previous studies: individual researcher level (authors and results)
    ● Newman (2001): A small-world effect existed between co-authors, and the degree distribution roughly follows the power law in co-authorship networks in the fields of physics, biomedicine and computer science.
    ● Barabasi et al. (2002): The co-authorship network in mathematics and neuroscience is scale-free, and the network evolution is characterized by preferential attachment.
    ● Ramasco et al. (2004): The co-authorship network in the field of condensed matter showed that the degree distribution follows a power law.
    ● Tomassini and Luthi (2007): The co-authorship network in the field of genetic programming changes in accordance with preferential attachment.
    ● Wagner and Leydesdorff (2005): International co-authorship grew based on the principle of preferential attachment, although the attachment mechanism was not fitted to a pure power law.
    ● Moody (2004): The co-authorship network in sociology does not have a small-world structure.
    ● Brantle and Fallah (2011): The collaboration network of patent inventors has a scale-free power law property.
  • 37. Previous studies: organization level (authors and results)
    ● Verspagen and Duysters (2004): Strategic technology alliances, in the two technology fields of chemicals and food, could be characterized as small worlds.
    ● Powell et al. (2005): The alliance network among dedicated biotech firms is scale-free.
    ● Gay and Dousset (2005): The alliance network in the biotechnology industry has a small-world effect with a scale-free property based on preferential attachment.
    ● Barber et al. (2006); Breschi and Cusmano (2004): Both studies reported the existence of small-world and scale-free properties in inter-organizational R&D relationships from EU-FP Programmes data.
  • 38. Brief history of governmental policy for UIG collaboration (‘00~’11)
  • 39. Brief history of governmental policy for UIG collaboration (‘00~’11)
  • 40. Research design
  • 41. Methodology: network topological analysis
    ● Measures: density; average degree; average path length; diameter (the largest geodesic path length in the network); clustering coefficient; degree centralization; power law distribution
  • 42. Methodology: centrality measures
    ● Degree centrality: $C_D(i) = \frac{\sum A_i}{n-1}$, where $A_i$ is the number of direct links of node $i$ and $n$ is the total number of nodes
    ● Closeness centrality: $C_C(i) = \frac{n-1}{\sum_j D_{ij}}$, where $D_{ij}$ is the number of links in the geodesic linking node $i$ and node $j$, and $n$ is the total number of nodes
    ● Betweenness centrality: $C_B(i) = \frac{\sum_{j<k} g_{jk}(i)/g_{jk}}{(n-1)(n-2)/2}$, where $g_{jk}$ is the number of geodesics linking node $j$ and node $k$, $g_{jk}(i)$ is the number of those geodesics that contain node $i$, and $n$ is the total number of nodes
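The three centrality definitions on slide 42 can be evaluated directly on a small graph. The sketch below is a plain-Python illustration, assuming an undirected, unweighted, connected graph stored as adjacency sets and taking the degree term as the count of direct links; the star-shaped example graph is hypothetical, and the brute-force betweenness loop is only suitable for small networks.

```python
from collections import deque


def bfs(adj, source):
    """Return (distances, shortest-path counts) from source to every reachable node."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                sigma[v] = 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma


def centralities(adj):
    """Normalised degree, closeness and betweenness for a connected undirected graph."""
    nodes = sorted(adj)
    n = len(nodes)
    info = {v: bfs(adj, v) for v in nodes}
    degree = {v: len(adj[v]) / (n - 1) for v in nodes}
    closeness = {v: (n - 1) / sum(info[v][0].values()) for v in nodes}
    between = {}
    for i in nodes:
        dist_i, sigma_i = info[i]
        total = 0.0
        for a, j in enumerate(nodes):
            dist_j, sigma_j = info[j]
            for k in nodes[a + 1:]:
                if i in (j, k):
                    continue
                # i lies on a j-k geodesic when the two legs add up exactly.
                if dist_j[i] + dist_i[k] == dist_j[k]:
                    total += sigma_j[i] * sigma_i[k] / sigma_j[k]
        between[i] = total / ((n - 1) * (n - 2) / 2)
    return degree, closeness, between


# Hypothetical star network: one central actor linked to three others.
star = {"c": {"a", "b", "d"}, "a": {"c"}, "b": {"c"}, "d": {"c"}}
degree, closeness, between = centralities(star)
print(degree, closeness, between)
```

In the star, the central node scores 1.0 on all three measures, which is the upper bound the normalising denominators are designed to produce.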
  • 43. Methodology: block modeling
  • 44. Data and network construction
    ● Data collection and network construction
    ● 75 innovative actors (2010)
  • 45. Results: the number of joint patents (bar chart; x-axis: Year, with periods 2000-2003, 2004-2007, 2008-2011 and 2000-2011)
  • 46. Results
  • 47. Results: network topology by period (values in parentheses are for a random network)
        Measure                          2000~2003       2004~2007       2008~2011
        No. of nodes                     46              61              60
        No. of links                     90              209             387
        Density                          0.087           0.114           0.219
        Clustering coefficient           0.323 (0.069)   0.375 (0.125)   0.498 (0.213)
        Average degree                   1.957           3.410           6.450
        Average path length              2.997 (2.919)   2.366 (2.310)   1.933 (1.827)
        Diameter                         7               5               4
        Degree centralization            0.351           0.331           0.493
        Power-law exponent               2.768           2.924           3.305
        Power-law KS statistic           0.193           0.138           0.115
        Power-law p-value                0.03            0.05            0.23
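The density row can be cross-checked against the node and link counts. The slides do not give the formula, so the sketch below assumes the standard definition for an undirected simple graph, density = 2L / (n(n-1)); under that assumption the reported values are reproduced.

```python
def density(n_nodes, n_links):
    """Density of an undirected simple graph: realised links over possible links."""
    return 2 * n_links / (n_nodes * (n_nodes - 1))


# Node and link counts as reported on the slide.
periods = {"2000~2003": (46, 90), "2004~2007": (61, 209), "2008~2011": (60, 387)}
densities = {p: round(density(n, l), 3) for p, (n, l) in periods.items()}
print(densities)
```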
  • 48. Results: centrality of organizations by period (rank in parentheses; "-" where the slide shows no rank)
        2000-2003
        Organization   Degree        Closeness     Betweenness
        SEC            0.422 (1)     0.506 (2)     0.253 (2)
        ETRI           0.378 (2)     0.479 (3)     0.252 (3)
        KAIST          0.289 (3)     0.511 (1)     0.241 (4)
        KRICT          0.200 (4)     0.421 (7)     0.049 (-)
        HMC            0.178 (5)     0.437 (5)     0.290 (1)
        POSTECH        0.156 (6)     0.421 (7)     0.084 (9)
        LGE            0.156 (6)     0.446 (4)     0.078 (10)
        CII            0.156 (6)     0.402 (10)    0.013 (-)
        KICT           0.111 (9)     0.360 (-)     0.136 (5)
        KIMM           0.111 (9)     0.395 (-)     0.104 (7)
        KIST           0.111 (9)     0.437 (5)     0.046 (-)
        KT             0.111 (9)     0.409 (9)     0.003 (-)
        HMB            0.044 (-)     0.319 (-)     0.127 (6)
        KHNP           0.067 (-)     0.249 (-)     0.087 (8)

        2004-2007
        Organization   Degree        Closeness     Betweenness
        ETRI           0.433 (1)     0.594 (1)     0.155 (1)
        SEC            0.400 (2)     0.583 (2)     0.104 (4)
        SNU            0.350 (3)     0.577 (3)     0.146 (2)
        HYU            0.333 (4)     0.571 (4)     0.118 (3)
        KAIST          0.283 (5)     0.522 (10)    0.082 (8)
        YSU            0.267 (6)     0.536 (6)     0.094 (6)
        HMC            0.250 (7)     0.526 (7)     0.097 (5)
        KRU            0.250 (7)     0.545 (5)     0.092 (7)
        SKKU           0.217 (9)     0.526 (7)     0.051 (9)
        POSTECH        0.217 (9)     0.526 (7)     0.031 (-)
        KT             0.183 (-)     0.517 (-)     0.010 (-)
        LGE            0.167 (-)     0.458 (-)     0.042 (10)
        KHU            0.167 (-)     0.500 (-)     0.038 (-)
        KRICT          0.133 (-)     0.455 (-)     0.021 (-)

        2008-2011
        Organization   Degree        Closeness     Betweenness
        SNU            0.695 (1)     0.756 (1)     0.144 (1)
        KAIST          0.593 (2)     0.702 (2)     0.112 (2)
        YSU            0.559 (3)     0.686 (3)     0.043 (5)
        KRU            0.542 (4)     0.686 (3)     0.052 (4)
        HYU            0.492 (5)     0.656 (5)     0.076 (3)
        ETRI           0.475 (6)     0.634 (6)     0.042 (6)
        SEC            0.458 (7)     0.634 (6)     0.037 (9)
        POSTECH        0.424 (8)     0.615 (9)     0.029 (-)
        SKKU           0.407 (9)     0.621 (8)     0.039 (8)
        HMC            0.373 (10)    0.602 (10)    0.034 (10)
        KIST           0.356 (-)     0.602 (10)    0.010 (-)
        IHU            0.322 (-)     0.578 (-)     0.030 (-)
        CAU            0.322 (-)     0.590 (-)     0.018 (-)
        KRICT          0.305 (-)     0.578 (-)     0.041 (7)
  • 49. Results
  • 50. Conclusions and discussion
    ● Conclusions
    ● Policy implications
  • 51. Contributions
  • 52.
  • 53. Perspectives on Triple Helix. Fred Phillips, General Informatics LLC. DISC 2013, Daegu
  • 54. Agenda
    1. 3-Helix as a meso-level notion
        – Epicycle in a grander tech-psych-inst cycle
    2. Speed (differentials) as high-level system metric
        – Roles of buffering institutions and ICT
        – Need for smart engagement
    3. Applying 3-helix in the developing world
    4. SUNY Korea’s joint TS/CS research
  • 55. 3-Helix papers published in Technological Forecasting & Social Change
    • Wilfred Dolfsma, Loet Leydesdorff, “Lock-in and break-out from technological trajectories: Modeling and policy implications,” 76(7), Sept. 2009, 932-941.
    • Raul Gouvea, Sul Kassicieh, M.J.R. Montoya, “Using the quadruple helix to design strategies for the green economy,” 80(2), Feb. 2013, 221-230.
    • Øivind Strand, Loet Leydesdorff, “Where is synergy indicated in the Norwegian innovation system? Triple-Helix relations among technology, organization, and geography,” 80(3), Mar. 2013, 471-484.
    • Inga A. Ivanova, Loet Leydesdorff, “Rotational symmetry and the transformation of innovation systems in a Triple Helix of university–industry–government relations,” In Press, Corrected Proof, Available online 19 Sept. 2013.
  • 56. In D.S. Oh & F. Phillips (Eds), Technopolis: Best Practices for Science and Technology Cities (Springer, 2014)
    • E. Becker, B. Burger and T. Hülsmann, “Regional Innovation and Cooperation among Industries, Universities, R&D Institutes, and Governments”
    • F. Phillips, S. Alarakhia and P. Limprayoon, “The Triple Helix: International Cases and Critical Summary”
    • José Alberto Sampaio Aranha, “Arrangement of Actors in the Triple Helix Innovation”
  • 57. IC2 Model
    • Preceded 3-helix by several years
    • But only parts were made mathematical (Bard et al)
    • Diagram labels: Academia, Industry, Government, Community; Talent, Technology, Capital, Know-How, Market Needs; Value-Added Economic Development
  • 58. The math of Academic-Government-Industry dynamics is interesting, but... It is just part of a bigger picture.
  • 59. The cycle of innovation and change: Lab to society & back again
    • Elements of the cycle: Technological innovation; New products & services; New ways of producing and using products & services; New ways to interact socially; New ways to organize (public & private); New desires & dreams
    • Note how this schema extends Everett Rogers’ more linear model.
  • 60. We might think all the elements move together in an orderly way.
    • Diagram labels: Social Needs, Institutional Change, Technological Change, Psychological Change, Organizational Change
  • 61. But in a free-market economy, they do not. • They continually engage and disengage. • Sometimes they move each other only by friction. • 90% of MOT and Tech Policy problems stem from the differing speeds of the 3 sectors.
  • 62. Example: Transportation • Mobile-web rideshare services – Gain VC investment – Start operations – Get shut down by city governments trying to regulate them under old taxi rules. • Institutions have changed slower than technology and social demand.
  • 63. Example: Health • An elderly person dies because he was too proud to wear – A medical bracelet – or – An emergency signaller. • Psychology has changed slower than technology.
  • 64. Example: Software • Record companies and publishers – Sue student MP3 pirates – Develop DRM software that further alienates customers – Can’t adapt away from paper and CD publishing. • Business organizations change more slowly than technology and social demand.
  • 65. Example: More and more often, social/institutional change outpaces tech change - or will do so soon. • In most of the world, an excess of funds is chasing too few growth investment opportunities. • Fewer US companies are making IPOs. • Small-government activists rail indiscriminately against direct government monetary support for new technologies. See Phillips (2011).
  • 66. This can be good. • Individual creativity may bloom. • Mistakes... – Can be undone efficiently. – Don’t necessarily infect the whole system.
  • 67. It (disengagement) can be bad. • Alienation • Lack of coordination and cooperation • Little institutional or organizational creativity • Waste and pollution • Lives lost
  • 68. Speed as the system metric • Really, speed differentials among the sectors. • A “clutch” and “transmission” are needed. • The question is less how to engage, but rather, when. • The key is not engagement per se, but smart (well-timed) engagement.
  • 69. Not bridging organizations, but buffering organizations
    • Civic groups
    • Workforce training programs
    • Economic development agencies
    • Technology brokers
    • Open innovation integrators
    • Accountancies
    • Industry associations
    • NGOs
    • Incubators
    • Law firms
    • Venture capital
    • TTOs
    The IC2 Model partially captured this.
  • 70. 3-Helix as meso-level construct: An epicycle within the Technology-Psychology-Institutional dynamic
    • Macro: Tech-Psych-Inst
    • Meso: Aca-Gov-Indus – “Triple Helix”
    • Micro:
        – Dynamics within people and within organizations;
        – Technology life cycles
    • The buffering institutions span all 3 levels.
  • 71. What causes TOPI* disengagement? (*Technological-Organizational-Psychological-Institutional)
    • Bad marketing, bad market research
    • Mistrust, bad service
    • Technology inaccessible to underserved populations
    • Competition among de facto standards (e.g., VHS vs Beta)
    • Lack of vision
    • Poor design of information & communication products and programs.
  • 72. “Engaging” doesn’t mean “attractive nuisance.”
  • 73. Intrusive ‘engagement’ Update this app!
  • 74. Marketing guru Geoffrey Moore says, • “People have disengaged, for ... self-preservation.” – With “consequences for consumer and brand marketing, – “and long-term implications for education, health care, citizen participation, and workforce involvement. • “So engagement is rightfully going to be a big investment theme.”
  • 75. Moore: Engagement is taking center stage in business. • Off-line retailers are using digital interactions/devices in their in-store experiences. – Example: Starbucks. • “Social marketing foster[s] engagement around topics that ... reflect well upon the sponsor.” – Example: Sephora. • “Big data analytics drive communications that can break through the wall of detachment.” – Example: Obama campaign 2012.
  • 76. Moore is saying • Advertising used to be like this. – Annoying! Consumers disengaged. • Now with social media, mobile web, Yelp.com, – Consumers share product reviews & complaints. – Advertisers have to treat consumers more gently. – To make us want to continually re-engage. • Engaging doesn’t mean shouting.
  • 77. ICT for an Intelligently Engaged Society?
  • 78. What kinds of IT foster positive, voluntary engagement? Why?
  • 79. What kinds of IT discourage it? Why?
  • 80. People are proud to participate electronically. • Fighting crime – Zapruder film; Rodney King videos • Supporting favorite businesses, authors – Amazon reviews • For post-disaster aid – Crowd-mapping of post-earthquake Haiti • Crowd-funding research projects and entrepreneurs • Though there are abuses.
  • 81. Source: Ganti et al, Mobile Crowdsensing: Current State and Future Challenges.
  • 82. Micro Level: Workforce Engagement • Definition: The measure of whether employees merely do the minimum required of them, versus proactively driving innovation and new value for the organization. • Thus, engagement – “can only ever be partially accounted for by deploying the latest new collaborative technology, – “and probably significantly less than many of its proponents would have you believe.” Source: Hinchcliffe
  • 83. Current state of worker engagement
  • 84. ICT for engagement? Summary • ICT alone cannot create/sustain engagement. – Human intervention, via buffering institutions, can achieve ICT-aided engagement. • ICT, especially sensing and crowdsourcing, may assist in deciding when to engage. – Thus achieving smart engagement. • This applies to all 3 levels (macro, meso, micro) of our multi-level Technology & Society diagram.
  • 85. For many countries where central government direction is the norm, 3-helix thinking is premature. • Indonesia, Mongolia • USA: Industry lobbying government presents a slightly different problem...
  • 86. Big man little man game
  • 87. In sum, the problem is not disengagement, but mis-engagement among governments, people, organizations and products, due to: • Speed differentials (i.e., poor timing) • Lack of vision • Poor design of information & communication products and programs. – Lack of feedback – Excess complexity, leading to slow comprehension and adoption – Excess technology push (solutions without problems) – Excess demand pull (unrealistic expectations) – Other factors
  • 88. SUNY Korea’s research agenda
    • Combine social science and computer science...
    • To find principles of IT design that more quickly lead to engagement that is...
        – Well-timed
        – Smart
        – Satisfying
    • Among
        – Individuals
        – Businesses
        – Government institutions
        – Technology developers
    • With secure applications in several techno-policy domains (health, energy, etc.).
  • 89. Some Implications • For IT: Meeting users halfway • For managers: Engagement plans for each constituency • For theorists: – Modeling the moderating effect of buffering institutions – Impact of coalitions on the 3-helix dynamic
  • 90. The math of Academic-Government-Industry dynamics is interesting, but... It is just part of a bigger picture.
  • 91. An aside: Spatializing an innovation diffusion model
    • F. Phillips, On S-curves and Tipping Points. Technological Forecasting & Social Change, 74(6), July 2007, 715-730.
    • Alan M. Turing, The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London B 237, 37–72 (1952).
    • http://www.cgjennings.ca/toybox/turingmorph/
  • 92. References
    • http://davidsasaki.name/2013/01/beyond-technology-fortransparency/
    • A. Charnes, S. Littlechild and S. Sorensen, “Core-stem Solutions of N-person Essential Games.” Socio-Econ. Plan. Sci. Vol. I, pp. 649-660 (1973).
    • David Watson, The Engaged University. Routledge, 2013.
    • Dion Hinchcliffe, “Does technology improve employee engagement?” Enterprise Web 2.0, Nov. 5, 2013. http://www.zdnet.com/doestechnology-improve-employee-engagement-7000021695/
    • Jonathan Bard, Boaz Golany and Fred Phillips, “Bubble Planning and the Mathematics of Consortia.” Third International Conference on Technology Policy and Innovation, Austin, Texas, September, 1999.
    • F. Phillips, The state of technological and social change: Impressions. Technological Forecasting & Social Change, 78(6), July 2011, 1072-1078.
  • 93. 감사합니다 Thank you fred.phillips@stonybrook.edu fp@generalinformatics.com
  • 94. A Network Analysis of Web-Citations Among the World’s Universities. George A. Barnett, Department of Communication, University of California, Davis (gbarnett@ucdavis.edu). Daegu Gyeongbuk International Social Network Conference, December 12-14, 2013
  • 95. Research Aims
    • Network analysis of URL-citations among
        – 1,000 universities with greatest presence on WWW (1 million edges)
        – In 58 different countries
        – Multi-level analysis (both universities & countries)
    • Antecedent factors that determine the network’s structure
        – University level: physical distance; same country; language of instruction; size; Ph.D. granting; prestige; research excellence (Nobel Prizes)
        – National level: hyperlink connections; international bandwidth capacity; GDP, GDP/capita; international student flows; Nobel Prizes
  • 96. Data: Web-Citations
    • Web-citations among universities collected using Google
        – 2,100 × 2,100 matrix of universities (4,407,900 cells) generated
        – Search query: “university A webdomain” site:university B webdomain, e.g. "harvard.edu" site:stanford.edu
        – Not all URL-citations are links, e.g., email addresses in coauthored papers
        – Removed universities with no ties & the smaller of a university’s multiple domains; retained the 1,000 most interlinked universities
        – Matrix of inter-citations aggregated to the national level
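The query template on slide 96 implies one search per ordered pair of distinct domains, which is where the 4,407,900 cells come from (2,100 × 2,099). A small sketch of generating such query strings follows; it only builds the strings and does not call any search service, and the function name and the three-domain list are illustrative.

```python
from itertools import permutations


def citation_queries(domains):
    """Yield (cited, citing, query) for every ordered pair of distinct domains."""
    for cited, citing in permutations(domains, 2):
        yield cited, citing, f'"{cited}" site:{citing}'


# Illustrative subset of university web domains.
domains = ["harvard.edu", "stanford.edu", "ucdavis.edu"]
queries = list(citation_queries(domains))
for _, _, query in queries:
    print(query)

# Number of off-diagonal cells for the full 2,100-university matrix.
n_cells = 2100 * (2100 - 1)
print(n_cells)
```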
  • 97. Data: Antecedents, University Level
    – Physical location: Google Maps
    – Country: ccTLD of website (USA: .edu)
    – Language of instruction: country of university (India & Singapore: English)
    – Size of university: Europe (EUMIDA, http://thedatahub.org/dataset/eumida); U.S. (College Handbook 2012); Asia, Africa, Oceania, Latin America & Canada (universities’ websites)
    – Prestige: U.S. News, World’s Best Universities 2012, http://www.usnews.com/education/
    – Nobel Prizes: http://www.nobelprize.org
  • 98. Data: Antecedents, National Level
    – Total hyperlinks: Barnett & Park (2012)
    – International Internet bandwidth, GDP & population: TeleGeography (2012) (http://www.telegeography.com/)
    – Student exchange: UNESCO (http://stats.uis.unesco.org/unesco)
    – International co-authorships: Leydesdorff & Wagner (2008)
    – International citations: Science Citation Index
  • 99. Results - Universities
    • Over 9.6 million links among 1,000 universities
    • Density = .606
    • Mean # of links = 24.0; S.D. = 2,208.6
    • Greatest # of links (322,000): Universität Trier & Rheinisch Westfalische Technische Hochschule Aachen, two German institutions that host huge & popular bibliographic systems (DBLP & SunSite)
  • 100. Results - Universities
  • 101. Results - Universities
  • 102. Results – Clusters of Universities (cluster: defining attributes)
    1. German, Swiss & Italian, not English, central, low prestige, less bandwidth connections
    2. English (U.S., Canada, U.K., Australia), central, high prestige, strong bandwidth connections
    3. Low prestige, peripheral, less bandwidth connections
    4. English, not French, peripheral, no Ph.D.s, strong bandwidth connections
    5. Continental Europe, not English
    6. Chinese, less bandwidth connections
    7. French, not English, peripheral, lower prestige
    8. English, primarily (Jesuit Institutions), peripheral, low prestige
    9. English, peripheral
    10. Japanese & other Asian, peripheral, little bandwidth connections
  • 103. Results - National
    • N = 58 countries
    • Density = .924
    • United States most central, followed by Germany, U.K., Canada
        – >30% of links; >4 million outward & 1.9 million inward
        – Eigenvector centrality 10 times > Germany
    • Gini = .672, a core-periphery structure
        – U.S. (359), Germany (67), U.K. (67) & Canada (38): 53.1% of the universities
        – These four nations account for 68.3% of the links
        – Links distributed by power law; concentrated in a few countries
    • Cluster analysis
        – 1 group of countries centered about U.S. & U.K.
  • 104. Results – Predicting the Structure of the University URL-citation Network
    • Physical distance between campuses
        – QAP correlation = .005
        – No relationship between physical distance and web-citations
    • Same country
        – QAP correlation = .065
        – Links: 78.4% domestic; 21.6% international
        – No links: 6.1% domestic; 93.9% international
        – Mean link strength: 1,415 domestic; 42.5 international
    • Web-citations tend to be domestic
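The QAP correlations reported on slide 104 compare two matrices over the same set of nodes. A minimal sketch of the general idea is given below, under assumptions: it uses a Pearson correlation over off-diagonal cells and a one-sided permutation p-value obtained by relabelling the rows and columns of one matrix together; the 4×4 matrices are invented and the slides do not describe the software or settings actually used.

```python
import random
from math import sqrt


def offdiag_corr(a, b):
    """Pearson correlation between the off-diagonal cells of two square matrices."""
    n = len(a)
    xs = [a[i][j] for i in range(n) for j in range(n) if i != j]
    ys = [b[i][j] for i in range(n) for j in range(n) if i != j]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def qap_test(a, b, n_permutations=1000, seed=0):
    """Observed correlation plus a one-sided permutation p-value.

    Rows and columns of b are relabelled together, which preserves the
    dependence structure within the matrix under the null hypothesis.
    """
    rng = random.Random(seed)
    n = len(a)
    observed = offdiag_corr(a, b)
    order = list(range(n))
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(order)
        permuted = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if offdiag_corr(a, permuted) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_permutations + 1)


# Hypothetical data: web-citation counts and a same-country indicator.
links = [[0, 5, 1, 0], [4, 0, 0, 1], [1, 0, 0, 6], [0, 2, 7, 0]]
same_country = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
observed, p_value = qap_test(links, same_country)
print(observed, p_value)
```

Permuting node labels rather than individual cells is what distinguishes QAP from an ordinary correlation test, since cells in the same row or column are not independent observations.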
  • 105. Results – Predicting University Centrality in Network --Correlations
  • 106. Results – Predicting University Centrality in Network -- Regression In-degree R2 F P Size (log) English Bandwidth Rating Out-Degree .350 47.94 .000 ß .279 -.025 .268 .465 Betweenness .489 85.16 .000 t 6.49 -.516 5.70 10.53 all p< .001, except English for In-degree ß .123 .356 .302 .323 t 3.22 8.50 7.31 8.25 Eigenvector .579 122.25 .000 ß .282 .185 .336 .502 t 8.13 4.86 8.94 14.12 .310 39.94 .000 ß .150 .214 .208 .348 t 3.36 4.40 4.33 7.65
  • 107. Results – Predicting the Structure of the URL-citation Network-National Level • QAP Correlations with National Level Network – Co-Authorships .772 – Citations .967 – Hyperlinks .545 – Student Flows .270 – Missing Data N = 52 on all except Student Flows, N = 48
  • 108. Results – Predicting Nation’s Centrality in Network --Correlations
  • 109. Results – Predicting National Centrality in the Network -- Regression In-degree .524 33.78 .000 35.12 .670 ß R2 F P Out-Degree ß t Nobels English Population .482 4.80 GDP/capita .722 7.19 GDP .000 t .184 2.27 .398 4.70 .797 9.28 All relations are significant p < .02 Betweenness 22.99 ß .505 .000 t .443 4.33 .720 7.03 Eigenvector .642 31.05 .000 ß t .553 5.07 .183 2.15 .258 2.41
  • 110. Discussion • So where is academic knowledge produced? – Primarily at prestigious English-speaking institutions in the U.S.A. & U.K., but also in Canada & Germany • Distance is unrelated to dissemination & collaboration via the Internet • Universities tend to link to others from the same country • Ten clusters - one composed of the most prestigious institutions, suggesting exchanges of knowledge among this group • Centrality predicted by university size, its prestige (whether it offered doctoral degrees, its U.S. News ranking, the number of its faculty’s Nobel Prizes), language of instruction (English), & national international bandwidth capacity
  • 111. Discussion • At the national level, the countries formed a single group centered about the U.S. & the U.K. • U.S. is the most central, followed by Germany, U.K. & Canada – They accounted for the majority of the universities in the network • The International Network has a core-periphery structure with a few countries accounting for the majority of the links • International co-authorships, citations, student exchanges & the number of links among the individual countries are strongly predictive of the network’s structure • Centrality is predicted by a country’s population & GDP; depending on the measure, it may also be predicted by language of instruction (English) & the number of Nobel Prizes
  • 112. Discussion • Results are consistent with Seeber, et al. (2012) – European university hyperlink network displays a center-periphery structure – centrality a function of the universities’ reputation – This study extends their conclusions to the global academic community
  • 113. Discussion • Consistent with Ortega & Aguillo (2009) – “The world-class university network graph is comprised of national sub-networks that merge in a central core where the principal universities of each country pull their networks toward international link relationships. This network rests on the United States, which dominates the world network in conjunction with the aggregation of the European ones, especially the British and the German subnetworks. This situation may be caused mainly by the technological development of these countries and the production of international content, that is, English web pages. This second reason might explain the apparent backward situation of some East Asian countries.” • World Systems Theory – Telephone (Barnett, 2001, 2012) – Internet (Barnett & Park, 2005, 2012; Park, Barnett & Chung, 2011) – Student flows (Barnett & Wu, 1995; Chen & Barnett, 2000; Jiang, 2013) – Patents, trademarks and copyrights (Nam & Barnett, 2011).
  • 114. Discussion • Global academic community as a self-organizing system – Academic network may be considered an autopoietic or self-replicating system – Evolved from traditional scientific activities (co-authorship, citing the research of others & other behaviors that required the sharing of information among scholars) – Krippendorff defines an autopoietic system as “a network of processes that produces all the components necessary to embody the very process that produces it”. The network recursively produces its components through the interaction in this historical reproductive network of postings on university websites & links among institutions
  • 115. Discussion • There are environmental constraints that limit the possible states into which this system may evolve • issues of information property • policies of individual universities & national governments • scientific funding agencies (U.S. National Science Foundation) • Academic networks co-evolved with other global institutions • Universally, higher education is developing common curricula especially in the sciences (Lechner & Boli, 2005). This seems to be reflected in pattern of universities’ hyperlinks and web-citations
  • 116. Thank you! See: Barnett, G.A. , Park, H.W., Jiang, K, Tang, C, & Aguillo, I.F., (2013), “A multi-level network analysis of web-citations among the world’s universities”, Scientometrics, DOI 10.1007/s11192-013-1070-0
  • 117. Virtual Knowledge Studio (VKS) “Webometrics Studies” Revisited in the Age of “Big Data” Assoc. Prof. Dr. Han Woo PARK CyberEmotions Research Institute Dept. of Media & Communication YeungNam University 214-1 Dae-dong, Gyeongsan-si, Gyeongsangbuk-do 712-749 Republic of Korea www.hanpark.net cerc.yu.ac.kr eastasia.yu.ac.kr asia-triplehelix.org
  • 118. Big data • The term “big data” refers to “analytical technologies that have existed for years but can now be applied faster, on a greater scale and are accessible to more users” (Miller, 2013). • Big data sizes may vary per discipline. • Characteristics: Gartner’s 3Vs plus SAS’s VC and IBM’s Veracity - Volume (amount of data), Velocity (speed of data in and out), Variety (range of data types and sources) - Variability: Data flows can be highly inconsistent with daily, seasonal, and event-triggered peak data loads - Complexity: Multiple data sources requiring cleaning, linking, and matching the data across systems - Veracity: 1 in 3 business leaders don’t trust the information they use to make decisions. http://en.wikipedia.org/wiki/Big_data http://www-01.ibm.com/software/data/bigdata/
  • 119. http://www.emc.com/leadership/digitaluniverse/iview/executive-summary-a-universe-of.htm
  • 120. http://www.emc.com/leadership/digitaluniverse/iview/images/impact-ofconsumers-lg.jpg
  • 121. Data-driven Research that focuses on extracting meaningful data from techno-socio-economic systems to discover some hidden patterns. Today’s “big” is probably tomorrow’s “medium” and next week’s “small” and thus the most effective definition of “big data” may be derived when the size of data itself becomes part of the research problem. Loukides (2012)
  • 122. Introduction • Webometrics is broadly defined as the study of web-based content (e.g., text, images, audio-visual objects, and hyperlinks) with primarily quantitative indicators for social science research goals and visualization techniques derived from information science and social network analysis.
  • 123. • Han Woo Park - “hidden” and “relational” data about lots of people as well as the few individuals, or small groups • Lev Manovich - “surface” data about lots of people (i.e., statistical, mathematical or computational techniques for analyzing data) - “deep” data about the few individuals or small groups (i.e., hermeneutics, participant observation, thick description, semiotics, and close reading)
  • 124. First type of Webometrics • Hyperlink Network Analysis - Inter-linkage: who linked to whom matrix - Co-inlink : a link to two different nodes from a third node - Co-outlink : A link from two different nodes to a third node Björneborn (2003)
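The three link measures on this slide can be illustrated with a small sketch. It assumes a hypothetical who-links-to-whom matrix `adj` in which `adj[i][j] == 1` means site i links to site j, with no self-links; the helper names and the four-site example are made up for illustration and are not taken from the talk.

```python
def transpose(matrix):
    """Swap rows and columns of a square 0/1 matrix."""
    return [list(row) for row in zip(*matrix)]


def matmul(a, b):
    """Plain-Python matrix product, enough for small link matrices."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def co_link_counts(adj):
    """Return (co_inlink, co_outlink) count matrices for an inter-linkage matrix.

    co_inlink[j][k]  = number of third sites that link to both j and k
    co_outlink[i][k] = number of third sites that both i and k link to
    Diagonal entries are simply each site's in-degree / out-degree.
    """
    adj_t = transpose(adj)
    co_inlink = matmul(adj_t, adj)
    co_outlink = matmul(adj, adj_t)
    return co_inlink, co_outlink


# Hypothetical example: sites A, B, C, D (indices 0..3).
# C and D both link to A and B; A links to D.
adj = [
    [0, 0, 0, 1],  # A -> D
    [0, 0, 0, 0],  # B links to nobody
    [1, 1, 0, 0],  # C -> A, B
    [1, 1, 0, 0],  # D -> A, B
]
co_in, co_out = co_link_counts(adj)
print(co_in[0][1])   # A and B share two inlinking sites (C and D)
print(co_out[2][3])  # C and D share two outlink targets (A and B)
```

Keeping the inter-linkage matrix as the single input means both co-link matrices are derived from the same data, so they stay consistent with each other.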
  • 125. Inter-link network analysis diagram among Korean e-science sites within public domain WCU WEBOMETRICS INSTITUTE Mapping the e-science landscape in South Korea using the Webometrics method
  • 126. Co-inlink network analysis WCU WEBOMETRICS INSTITUTE Mapping the e-science landscape In South Korea using the Webometrics method
  • 127. Findings As seen in Figure 4, the network structure shows a clear butterfly pattern. There is one hub (ghism) that belongs to Park Gyun-Hye (Park GH, www.cyworld.com/ghism), the daughter of ex-president Park Jeong-Hee and one of two major GNP candidates (along with president-elect Lee MB) in the 2007 presidential race. Figure 4: Cyworld Mini-hompies of Korean legislators How do social scientists use link data from search engines to understand Internet-based political and electoral communication? WCU WEBOMETRICS INSTITUTE INVESTIGATING INTERNET-BASED POLITICS WITH E-RESEARCH TOOLS Case 2. Cyworld Mini-hompies of Korean Legislators
  • 128. Sociology of Hyperlink Networks of Web 1.0, Web 2.0, and Twitter A Case Study of South Korea
  • 129. Introduction ‣ Online & offline lives ➭ co-constructing (e.g. Beer & Burrows, 2007) ‣ Politicians communicate with their constituencies using different platforms ‣ Questions: - What are the structural similarities and/or differences in South Korean politicians’ networks from Web 1.0 to Web 2.0 (and Twitter)? - Are online structures similar to structures in the physical world? - Are online patterns affected by offline relationships? ‣ Related studies conducted: - online social network analysis - online networks in Web 2.0 - role of Twitter on online politics
  • 130. 2001 2000 ‣ 59 isolated in 2000 ‣ more centralised in 2001 ‣ network of 2001 ➭ a ‘star’ network - might be affected by political events ➭ presidential election in 2001 Web 1.0
  • 131. 2005 2006 ‣hubs disappearing ‣easy use of blogs ‣Clear boundaries between different parties ‣strong presence of GNP Assembly members ➭ party policy on using blogs Web 2.0
  • 132. Politician Twitter Network (Following and Mention Network)
  • 133. Conclusion Politicians Twitter Following-follower Network Politicians Twitter Mention Network
  • 134. Bi-linked network of politically active A-list Korean citizen blogs (July 2005) URI=Centre DLP=Left GNP=Right Just A-list blogs exchanging links with politicians
  • 135. Affiliation network diagram using pages linked to Lee’s and Park’s sites N = 901 (Lee: 215, Park: 692, Shared: 6)
  • 136. Tweets on the name of S. Korea president
  • 137. Viewertariat Networks: A Study of the 2012 South Korean Presidential Debate Park’s network Moon’s network
  • 138. Reply-To Networks of Park’s & Moon’s Facebook page visitors during TV debates
  • 139. “Those studies perpetuate the idea that linking behaviour is not random, and that links are ‘socially significant in some way’. In this perspective, links have an ‘information side-effect’, they can be used to understand other facts even though they were not individually designed to do so: ‘information side-effects are by-products of data intended for one use which can be mined in order to understand some tangential, and possibly larger scale, phenomena’”
  • 140. Park and his colleagues were extensively cited: 9 times! • Barnett GA, Chung CJ and Park HW (2011) Uncovering transnational hyperlink patterns and web mediated contents: a new approach based on cracking .com domain. Social Science Computer Review 29(3): 369–384. • Hsu C and Park HW (2011) Sociology of hyperlink networks of Web 1.0, Web 2.0, and Twitter: a case study of South Korea. Social Science Computer Review 29(3): 354–368. • Park HW (2003) Hyperlink network analysis: a new method for the study of social structure on the web. Connections 25(1): 49–61. • Park HW (2010) Mapping the e-science landscape in South Korea using the webometrics method. Journal of Computer-Mediated Communication 15(2): 211–229. • Park HW and Jankowski NW (2008) A hyperlink network analysis of citizen blogs in South Korean politics. Javnost: The Public 15(2): 5–16. • Park HW and Thelwall M (2003) Hyperlink analyses of the World Wide Web: a review. Journal of Computer-Mediated Communication 8(4). • Park HW and Thelwall M (2008) Developing network indicators for ideological landscapes from the political blogosphere in South Korea. Journal of Computer-Mediated Communication 13(4): 856–879. • Park HW, Kim C and Barnett GA (2004) Socio-communicational structure among political actors on the web in South Korea. New Media & Society 6(3): 403–423. • Park HW, Thelwall M and Kluver R (2005) Political hyperlinking in South Korea: technical indicators of ideology and content. Sociological Research Online 12(3).
  • 141. A comment from those who are NOT doing a hyperlink analysis • In a chapter of The Sage Handbook of Online Research Methods edited by Fielding et al. (2008), Horgan emphasizes that ‘link analysis’ has become an active research domain in examining social behavior online.
  • 142. A threat to Webometrics • The key application in this area is to collect incoming, outgoing, inter-linking, and co-linking data from search engines - AltaVista in the early 2000s - Yahoo renewed AltaVista’s hyperlink commands via “Site Explorer” and its API - Yahoo discontinued its API option for inter-linkage data in April 2011, and finally stopped its popular Site Explorer service in November 2011
  • 143. http://cybermetrics.wlv.ac.uk/QueriesForWebometrics.htm
  • 144. A new proposal • Mike Thelwall - URL citation searches with the Bing search API facilities • Liwen Vaughan - Incoming hyperlinks from Alexa.com Can these "alternative" techniques be acceptable for scientific publishing?
  • 145. A new proposal: SEO Tools • Search Engine Optimization Tools - http://www.majesticseo.com/ - http://www.opensiteexplorer.org/ - https://ahrefs.com/ • Enrique Orduña-Malea & John J. Regazzi (2013). Influence of the academic library on U.S. university reputation: a webometric approach. Technologies, 1, 26-43, http://www.mdpi.com/2227-7080/1/2/26
  • 146. Webometrics Ranking of World Universities The link visibility data is collected from the two most important providers of this information: Majestic SEO and ahrefs. Both use their own crawlers, generating different databases that should be used jointly for filling gaps or correcting mistakes. The indicator is the product of square root of the number of backlinks and the number of domains originating those backlinks, so it is not only important the link popularity but even more the link diversity. The maximum of the normalized results is the impact indicator. http://www.webometrics.info/en/Methodology
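One possible reading of the indicator described on this slide is sketched below. The slide does not say how results are normalized, so dividing each provider's scores by that provider's maximum is an assumption, as are the function names, the provider labels and every number in the example.

```python
import math


def raw_visibility(backlinks, domains):
    """Square root of the backlink count times the number of referring domains."""
    return math.sqrt(backlinks) * domains


def impact_indicator(counts_by_provider):
    """counts_by_provider: {provider: {university: (backlinks, domains)}}.

    Assumption: each provider's raw scores are normalized by that provider's
    top score; the indicator is then the maximum normalized score per university.
    """
    normalized = {}
    for provider, counts in counts_by_provider.items():
        raw = {u: raw_visibility(b, d) for u, (b, d) in counts.items()}
        top = max(raw.values())
        normalized[provider] = {u: (v / top if top else 0.0) for u, v in raw.items()}
    universities = set().union(*(scores.keys() for scores in normalized.values()))
    return {
        u: max(scores.get(u, 0.0) for scores in normalized.values())
        for u in universities
    }


# Made-up counts for two hypothetical universities from two link-data providers.
example = {
    "provider_a": {"U1": (10000, 100), "U2": (400, 50)},
    "provider_b": {"U1": (2500, 80), "U2": (900, 100)},
}
scores = impact_indicator(example)
print(scores["U1"], scores["U2"])
```

Taking the maximum across providers is what lets one database fill gaps in the other: a university missing from one crawler's data still receives the score from the provider that did see its links.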
  • 147. Interlinkage among world universities • Barnett, G.A., Park, H. W., Jiang, K., Tang, C., & Aguillo, I. F. (2013 forthcoming). A Multi-Level Network Analysis of Web-Citations Among The World’s Universities. Scientometrics*. Isidro F. Aguillo “Large interlinking matrix (1000*1000) are no longer possible to obtain. Perhaps national academic systems (200 or 300 institutions)”
  • 148. Intentional inattention among Information Scientists? • Robert Ackland (2013). Web Social Science. - http://voson.anu.edu.au/ • Richard Rogers (2013). Digital Methods. - https://www.issuecrawler.net/index.php - https://www.digitalmethods.net/Dmi/ToolDatabase
  • 149. Let us move to Web Visibility Analysis Frequently occurring key words in e-science webpages in Korea Created on Many Eyes(http://many-eyes.com) Words are larger according to the frequency of their occurrence but their positions are randomly-chosen for the best visualization WCU WEBOMETRICS INSTITUTE
  • 150. Websites retrieved more than two times Note: Websites are larger according to their frequency of retrieval; however, their colors and locations are randomly-chosen for the best visualization WCU WEBOMETRICS INSTITUTE
  • 151. 2nd type of Webometrics: Web Visibility • Web visibility as an indicator of online political power • Presence or appearance of actors or issues being discussed by the public (Internet users) on the web. Tracking web visibility is a powerful way to gain insight into public reactions to actors or issues. • Recent studies indicate a positive relationship between politicians’ web visibility and election results. • Also, the co-occurrence web visibility between two politicians represents their hidden online political relationships based on public perception.
  • 152. Results – Web Visibility (co-occurrence)
  • 153. Results – Correlation & Path Analysis • Correlation (1 Finance, N=278; 2 Web, N=278; 3 Vote, N=234): Finance-Web 0.420**; Finance-Vote 0.101; Web-Vote 0.184** • Spearman Correlation (same variables): Finance-Web 0.513**; Finance-Vote 0.090; Web-Vote 0.163* • Political finance’s indirect effect = .076 • Note. * p<.05, ** p<.01
  • 154. Results – QAP Correlation 1 1 Committee 2 Constituency 2 3 1 0.004 -0.016 1 3 Party 4 Gender 5 Age 6 Incumbent 7 Web 8 Finance Note. * p<.05, ** p<.01 4 0.025 0.097** -0.007 1 0.027 1 5 -0.021 6 7 8 -0.074** 0.045** -0.037** -0.043** -0.064** 0.105** -0.119** -0.045* -0.050* 0.242** -0.094** 0.024 0.031 1 0.179** -0.051* 0.049* 1 0.098** 0.041 -0.060** 1 -0.224** -0.158** 1
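  • 155. Using e-research tools: web visibility analysis • Candidates’ web visibility in the blogosphere was closely correlated with the number of votes they received (Lim & Park, 2010, JKDAS). Chart: actual votes and average number of blogs (29,120; 19,427; 14,218; 3,071; 2,125; 504) for candidates 경대수, 정범구, 정원헌, 박기수, 이태희, 김경회
  • 156. Results of the October 28, 2009 by-elections - all winning candidates had high blog visibility
  • 157. I. Characteristics and influence of social media: the October 26 by-election case • (2) Analysis of a name network built from names mentioned together on Facebook • Early on, the two candidates were mentioned about equally; midway through the campaign, Park Won-soon and his supporters were being mentioned while Na Kyung-won’s supporters dropped out of view; by the end, the network had reorganized around Park Won-soon
  • 158. I. Comparing centrality in the semantic network: the October 26 by-election case (2) • Frequency analysis of the words found in messages about the Seoul mayoral election • Candidate Na Kyung-won’s frequency fell from the start; apart from appearing late in the campaign in talk of her contest with candidate Park Won-soon and of the election result, she stayed on the periphery of the discourse throughout • The Ahn Cheol-soo effect was large early on and faded after the midpoint, but frequent mentions of the Grand National Party (Hannara Party) revealed sentiment against the ruling party, which indicates the character of the election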
  • 159. • As Lim & Park (2011, 2013) claim, the use of web mentions of politicians’ names is particularly useful for hierarchically ranking individual politicians. • However, it may not sufficiently capture the entropy probability of an event (hidden in changing communication structures) resulting from the amount of information conveyed by the occurrence of that event (Shannon, 1948).
  • 160. • Taleb (2012) argues that society can be conceived as a complex fabric consisting of the extended disorder family including uncertainty, chance, entropy, etc. • Therefore, such a disordered system can be better derived from empirical data mining, not obtained by a priori theorem. • Uncertainty exists when three or more events take place simultaneously and is increasingly beyond the control of individual events (Leydesdorff, 2008).
  • 161. • In social and communication sciences, entropy-based indicators have been widely used for exploring entropy values generated from university-industry-government (UIG) relationships. • This “Triple Helix Model” (THM) can be applied to the concurrence of a pair of two or three terms in the public search engine database
  • 162. Mapping Election Campaigns Through Negative Entropy: Triple and Quadruple Helix Approach to Korea’s 2012 Presidential Election • Social media platforms have become a notable venue for Korean voters wishing to share their opinions and predictions with others (Park et al., 2011; Sams & Park, 2013). • Politicians have made increasing use of SNSs to provide updates and communicate with citizens (Hsu & Park, 2012). • With the increasing proliferation of smartphones and portable computers in Korea, SNSs have been widely used for facilitating political discourse. • Prior studies have found that Web 1.0 contents tended to contain the more enduring political and electoral statements of the public in various contexts.
  • 163. Introduction • To better understand the dynamics of the 2012 presidential election in Korea, this study estimates the web visibility of the three major candidates, Geun-Hye Park (PARK), Cheol-Soo Ahn (AHN), and Jae-In Moon (MOON), in the entire digital sphere.
  • 164. Literature Review • The total probabilistic entropy (uncertainty) produced by changes in one or two dimensions is always positive, which is in accordance with the second law of thermodynamics (Theil, 1972, p. 59). • On the other hand, the relative contribution of each event to the summation in three or four dimensions can be positive, zero, or negative (configurational information). • This configurational information provides a measure of synergy within a complex communication system. Network effects occur in a systemic and nonlinear manner when loops in the configuration generate redundancies in relationships between three or four events (Leydesdorff, 2008).
  • 165. Method: Data collection • The number of hits for each search query per media channel (Facebook, Twitter, and Google) was harvested. • The hit counts obtained from Google.com were employed to look primarily at entropies represented on a set of digitally accessible documents (e.g., online versions of newspapers, online word-of-mouth, Web 1.0 contents, etc.). • We measured the occurrence and co-occurrence of the politicians’ names based on their bilateral, trilateral, and quadruple relationships by using Boolean operators. • For example, we measured the number of web and social media mentions referring only to PARK (that is, no mention of AHN, MOON, or the term “president”).
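The Boolean query design described on this slide can be sketched as follows. The slide does not give the operator syntax that was actually used, so the quoted-term and minus-sign exclusion style is an assumption, as are the function name and the use of PARK, AHN, MOON and president as stand-ins for the real search strings. The sketch only builds query strings; it does not call any search service.

```python
from itertools import combinations

# Placeholder terms; the real study would use the actual search strings.
TERMS = ["PARK", "AHN", "MOON", "president"]


def exclusive_queries(terms):
    """Build one query per non-empty combination of terms.

    Each query requires the included terms and excludes all the others, so a
    hit is counted in exactly one combination (e.g. "only PARK").
    """
    queries = {}
    for size in range(1, len(terms) + 1):
        for included in combinations(terms, size):
            excluded = [t for t in terms if t not in included]
            parts = [f'"{t}"' for t in included] + [f'-"{t}"' for t in excluded]
            queries[included] = " ".join(parts)
    return queries


queries = exclusive_queries(TERMS)
print(len(queries))         # one query per non-empty subset of the four terms
print(queries[("PARK",)])   # mentions of PARK alone
```

Because every query excludes the terms it does not include, the resulting hit counts do not overlap, which is what a joint frequency distribution over the four terms requires.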
  • 166. Visualization of centrality by SNS medium
  • 167. Literature Review • Twitter can be very effective at amplifying messages, particularly through its one-to-many mode of communication (Barash & Golder, 2010). • Twitter is viable both as a political news and communication channel (González-Bailón, Borge-Holthoefer, Rivero & Moreno, 2011; Hsu & Park, 2011, 2012; Otterbacher, Shapiro, & Hemphill, 2013) and for citizens who look for platforms for political participation and engagement (Hsu, Park, & Park, 2013; Kim & Park, 2011; Tufekci & Wilson, 2012).
  • 168. Literature Review • The mode of information sharing on Facebook differs from that on Twitter. Facebook functions as a living room where friends talk to one another. • Facebook can be a mixture of interpersonal and mass channels for the sharing of informational as well as social messages in the context of a political campaign (Bond et al., 2012; Effing, van Hillegersberg, & Huibers, 2011; Robertson, Vatrapu, & Medina, 2010; Vitak et al., 2011). • Both Twitter and Facebook communications seem to be biased because the two platforms have been particularly dominated by the “2040 Generation”, who are generally categorized as political liberals in Korea (Kwak et al., 2011).
  • 169. Research questions • Therefore, it is important to examine which (social) media conversations are likely to generate more entropy than others, and which politicians do so: • RQ 1) What (social) media generate (negative) entropy more than others across different periods? • RQ 2) Which politician (or which pair of politicians) generates entropy more than others for bilateral, trilateral, or quadruple relationships across various media and periods?
  • 170. Method: Measuring (negative) entropy • Figure 1. Binary Entropy Plot
  • 171. • Entropy values (expressed as T for transmission) for bilateral relationships are, by definition, non-negative. Here T is defined as the difference in uncertainty when the probability distributions of two incidents (e.g., i and j) are combined. The mutual information transmission capacity, expressed in T values, is measured in “bits” of information (for a more detailed mathematical definition, see Leydesdorff, 2003): • $H_i = -\sum_i p_i \log_2 p_i$; $H_{ij} = -\sum_i \sum_j p_{ij} \log_2 p_{ij}$; $H_{ij} = H_i + H_j - T_{ij}$, so $T_{ij} = H_i + H_j - H_{ij}$ (1) • Here $T_{ij}$ is zero if the two distributions are mutually independent and positive otherwise (Theil, 1972).
  • 172. • On the other hand, T values for trilateral and quadruple relationships can be negative, positive, or zero depending on the size of the contributing terms. Therefore, it is necessary to compare the absolute value of each (negative) entropy value when entropy values are calculated for trilateral and quadruple relationships. In the case of entropy values for trilateral and quadruple relationships, the higher the absolute entropy value, the more balanced the communication system is. Let p denote PARK; a, AHN; and m, MOON, and formulate mutual information in these three dimensions as follows (Abramson, 1963, p. 129): • $T_{pam} = H_p + H_a + H_m - H_{pa} - H_{pm} - H_{am} + H_{pam}$ (2) • Here we are interested not only in information on mutual relationships between these three candidates but also in semantic relationships with respect to the term “president.” Accordingly, we measure the entropy value by using mutual information in these four dimensions (here “r” denotes “president”): • $T_{pamr} = H_p + H_a + H_m + H_r - H_{pa} - H_{pm} - H_{pr} - H_{am} - H_{ar} - H_{mr} + H_{pam} + H_{par} + H_{pmr} + H_{amr} - H_{pamr}$ (3)
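The alternating sums in equations (1) to (3) follow one pattern: add the entropies of odd-sized combinations of dimensions and subtract those of even-sized ones. A minimal sketch of that pattern follows, assuming the data are available as a hypothetical table `counts` that maps presence/absence tuples (one 0 or 1 per term) to frequencies; the function names and toy distributions are illustrative and are not the study's data.

```python
from itertools import combinations
from math import log2


def entropy(counts, dims):
    """Shannon entropy in bits of the marginal distribution over `dims`."""
    total = sum(counts.values())
    marginal = {}
    for cell, n in counts.items():
        key = tuple(cell[d] for d in dims)
        marginal[key] = marginal.get(key, 0) + n
    return -sum((n / total) * log2(n / total) for n in marginal.values() if n > 0)


def transmission(counts, dims):
    """T over `dims`: entropies of odd-sized subsets minus even-sized subsets."""
    t = 0.0
    for size in range(1, len(dims) + 1):
        sign = 1 if size % 2 == 1 else -1
        for subset in combinations(dims, size):
            t += sign * entropy(counts, subset)
    return t


# Toy distributions (made up for illustration).
independent = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 1}
identical = {(0, 0): 1, (1, 1): 1}
# Third term present exactly when one, but not both, of the first two is.
xor_like = {(0, 0, 0): 1, (0, 1, 1): 1, (1, 0, 1): 1, (1, 1, 0): 1}

print(transmission(independent, (0, 1)))   # bilateral T of independent terms
print(transmission(identical, (0, 1)))     # bilateral T of identical terms
print(transmission(xor_like, (0, 1, 2)))   # trilateral T can be negative
```

The third toy distribution shows why the slide compares absolute values in three or more dimensions: no pair of terms is related, yet the three together are fully determined, and the trilateral T comes out negative.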
  • 173. Results • Figure 2. Entropy Values Across Media Channels and Time Periods
  • 174. Results • Figure 3. T Values for Bilateral and Trilateral Relationships on November 3.
  • 175. Results • Figure 4. T Values for Bilateral Relationships between Park and Moon
  • 176. Discussion and conclusions • Twitter scored the most negative entropy values, followed by Facebook. Google came last. This indicates that Twitter is the most open communication system. • The entropy values for the liberal candidates (AHN and MOON) were higher than those of their conservative opponent PARK on social media, but not in the Google sphere. • This may not be surprising because both Twitter and Facebook have particularly appealed to Korean citizens from their late teens to early 40s.
  • 177. Discussion and conclusions • PARK’s entropy has been slightly higher on Google than that of her liberal challenger MOON. • PARK was successful in garnering strong support from senior voters in their 50s and 60s, who accounted for 39% of the population, up from 29% a decade ago (Wall Street Journal, 2012). • Exit polls also revealed that PARK gained support from 62% of voters in their 50s and 72% of voters in their 60s. Indeed, the most significant statistic of the election was turnout: 65.2%, 72.5%, and 78.7% of South Koreans in their 20s, 30s, and 40s voted, respectively, compared with 89.9% of those in their 50s and 78.8% of those over 60.
  • 178. Paper-code Keynote Speech “Creativity and TRIZ for the Knowledge Network Analysis in the Emerging Big Data Research” - DISC 2013 2013. 12. 14. Dr. Jae Ho Park, Ph.D. Managing Director of GRCIOP Professor Emeritus, Yeungnam University
  • 179. Curriculum Vitae December 14, 2013 Professor Emeritus Jae H. Park, Ph.D. - Professor Emeritus, Industrial and Organizational Psychology, Yeungnam University, South Korea - Chairman, Global TRIZ Conference, Organizing Committee - Chairman, Korean Society of Creativity - Managing Director, GRCIOP Research Center - Senior Advisor, ICEDR (International Consortium for Executive Development Research), Boston, USA - Ph.D., Organizational Psychology, Goettingen University, Germany - MA, Social Psychology, Seoul National University - BA, Seoul National University <Academic Career> • Harvard University, Research Professor, USA • University of Michigan, Exchange Professor, Ann Arbor, Michigan, USA • Yokohama National University, Research Fellow Professor, Japan • CSPP (California School of Professional Psychology), Teaching Professor, 1999-2000 • Senior Advisor, ICEDR (International Consortium for Executive Development Research), USA • Visiting Professor, Meio University, Japan, current • Partner, THT Cross-cultural Consulting, Amsterdam, the Netherlands • Partner, SYMLOG Consulting Group, San Diego, USA • Licensee, Center for Creative Leadership (CCL), Greensboro, USA • Partner, Global Integration, UK
  • 180. <International Consulting and Training> • Samsung Electronics; Creativity and Innovation “Change Begins with Me” Samsung New Management, Train the trainers for 6,000 managers. • JMA (Japan Management Association) and FMIC (Future Management and Innovation Consulting, Japan), SYMLOG Diagnosis, Team-building and Coaching, Tokyo, Japan • LG Philips Displays, M & A Process Consultation, Coaching, Diagnosis • LG Electronics, DAC (White electronics Division), Changwon, Korea • Hyundai Motor Company, Creativity and Innovation Program, Korea • Samsung Electronics, Large Scale Change, Korea • BorgWarner, Detroit, USA • Ericsson, Sweden • Applied Materials Korea, Coaching and Consultation, Seoul, Korea • Goldman Sachs, Integration Project Coaching, with THT Consulting Group, 2007 • MetLife, Coaching for Asset Managers, 2007 • Mirae Assets Stock Company, Creativity Coaching, 2010 • Team-building and Innovation, Trondheim University, Norway
  • 181. <International Network> • Center for Creative Leadership, Partner, Licensee, North Carolina, USA • SPGR Consulting, Oslo, Norway • JMAC (Japan Management Association Consulting), Tokyo, Japan • SYMLOG Consulting Group, Researcher and Partner, San Diego, USA • Global Integration, Partner, London, United Kingdom • Japan Creativity Research Center, Partner, Tokyo, Japan • THT Cross-cultural Consulting (Trompenaars & Turner), Amsterdam, Partner, the Netherlands • ICEDR (International Consortium for Executive Development Research), Boston, USA <Consultant and Advisor> • Samsung HRD Center • Samsung Electronics • Samsung SDI • LG Education Center • LG Electronics • POSCO HRD Center <Contact> Phone; 82-53-810-2230 (Office) Fax; 82-53-810-4610 Mobile; 82-10-8751-7579 email; grciop@gmail.com
  • 182. TRIZ Founder G. S. Altshuller (1926~1998) Father of TRIZ Global TRIZ Conference 2013 | www.koreatrizcon.kr Seoul Trade Exhibition & Convention, Seoul, Korea | July 09-11, 2013
  • 183. What is TRIZ? “TRIZ is a tool for thinking, but not instead of thinking.” G. Altshuller
  • 184. Change of major discipline
  • 185. From Tools to Subjects • Labor : Human Robot Creativity
  • 186. TRIZ 6 Sigma CAE
  • 187. Innovation in Global companies
  • 188. 1. 2. 3. 4. 5. 6. 7. Toyota Method QFD TOC TRIZ 6 Sigma Taguchi Method 7 Tools of Product Design
  • 190. Research Areas ◦ Understanding creative cognition and computation ◦ Creativity to stimulate breakthrough in science and engineering ◦ Educational approaches that encourage creativity ◦ Supporting creativity with IT
  • 191. INSA Strasbourg • http://www.insa-strasbourg.fr/en/news/news.html • Advanced Master of Innovative Design • 5 Semesters for Intensive TRIZ • In operation for 9 years
  • 192. 2008. 11. 28
  • 193. Edison and Altshuller • Everybody can be an Inventor • TRIZ Diffusion; No cost • Developed TRIZ in Prison • Benevolent Mentor • (Dialectics; ideal Communist)
  • 194. TRIZ • Analyzed many Patents • By Creative Problem Solving Methods • Inductive Research Methods
  • 196. Various views on TRIZ • From Knowledge Management • From 6 Sigma • From Engineering Design • From Innovation • From Creativity • From R&D • Etc…
  • 198. TRIZ as a Science Technical Systems Social Systems Natural Systems TRIZ N&A Narbut, 2003
  • 199. 5 Levels of Invention ① Apparent Solution (32%) - Simple ② Simple Improvement within current system (45%) ③ Major improvement (18%) - within same science ④ Innovation within current system (4%) - Application of a different science principle ⑤ Pioneer Invention (1%) - New principle and Paradigm Shift
  • 200. Effects in TRIZ Effects Systematized Information funds Trends Su-F Development Models ARIZ, Standards N&A Narbut, 2003
  • 201. Common Approach vs. TRIZ • Common Approach: Innovation involves the creation of new ideas. Trained in the notion of the ‘great idea’. Popular mythology - “Einstein” as model. Belief that ‘six months in the lab beats one hour spent in the library’. • TRIZ: Innovation involves adapting existing ideas. Tap existing solutions. Look outside of discipline and to Nature. Key benefit: reduces perceived risk of innovation (predictable, higher chance of success).
  • 202. Korea: Creative Economy via Creativity: Expansion & Convergence (Pie, Bibimbap)
  • 203. Creativity and TRIZ: Korea Academic TRIZ Association • Industry-academia knowledge sharing • Contributor to industry competitiveness and creative talent through TRIZ • Founded in May 2010 • Participating universities and companies: 32 companies, 29 universities • Homepage: www.katatriz.or.kr
  • 204. Main Activities • Expanded use of TRIZ and social contribution • Nurturing creative talent • MATRIZ & KATA MOU • Problem-solving, patent creation • Biz. TRIZ research • Univ. professor workshop • Anti-school violence program • TRIZ education • Charity fair • TRIZ Youth Academy • Lectures for SMEs • Consulting for SMEs' problem-solving • Technical TRIZ application [Chart axes: Evolution vs. Time, 2010 to 2013]
  • 205. TRIZ Activities in Korea. Company: Development of Innovative Products, Problem-Solving and Patent Creation • Core tech & innovative product • Foundation of TRIZ Univ. • TRIZ Elite • Development of POSCO methodology • TRIZ research group • Internal TRIZ Conference • Mixing DFSS & TRIZ • Strategic R&D patent creation • Patent creation • On-site TRIZ process designed to improve on-site work performance • TRIZ research group
  • 206. TRIZ Activities in Korea. University: Utilizing TRIZ in the subject of “Creative design” • POSTECH: Master course curriculum; TRIZ project organization • YONSEI: Creative engineering education; Inter-discipline activities and courses; Engineering certification program • HANYANG: Creative design education; Business management and creative design curriculum • POLYTECHNIC: Mechanical engineering-focused courses; KOREA/RUSSIA cooperation center ※ TRIZ application supported by the government and research institutions (i.e. Ministry of Trade, Industry and Energy and ETRI)
  • 207. • Systematic innovator • Learn and practice by yourself. • Participate as a member of the TRIZ Association (Daegu-Gyungbuk Regional Association): via Band
  • 208. Recognition that • (technical) systems evolve • Towards the increase of ideality • By overcoming Contradiction • Mostly with minimal introduction of (free) Resources. Thus, for creative problem solving • TRIZ provides a dialectic way of thinking, i.e., • To understand the problem as a system • To imagine the Ideal solution first • And solve the Contradiction
  • 209. GRCIOP Global Network • ICEDR, International Consortium for Executive Development Research (USA) • Global Integration (United Kingdom) • SYMLOG Consulting Group (USA) • Center for Creative Leadership (USA) • THT Consulting (the Netherlands) • Endre Sjovold Association (Norway)
  • 210. The Geopolitics of New Media RANDY KLUVER TEXAS A&M UNIVERSITY
  • 211. The context • The rise of “new media” has transformed politics, economics, and societies. • But “Internet Studies” as a field ignores the geopolitical issues associated with the rise of new media technologies – Lots of emphasis on “politics” and the internet, but little on the relations between states – “Arab Spring” events occur, but the focus remains primarily on a domestic context • Likewise, traditional IR theory focuses primarily on elite-level strategy, and doesn’t have the tools to account for publics
  • 212. The Big Picture
  • 213. Issue 1: The implications of a “networked” globe on geopolitics • Shifting configurations of influence – Networked, rather than hierarchical – Highly transnational – “Foreign” vs “domestic” doesn’t capture the reality • The conversation has become global, especially among elites – Values – Politics – Economics • But influence depends on your connectedness to the global conversation – Thus, dependent on access to technological infrastructure
  • 214. Example: Influential players in discourse surrounding the Egyptian coup weren’t Egyptian!
  • 215. Saudi #2
  • 216. But where was the Muslim Brotherhood?
  • 217. Constraints on global networks • Language • Technological diffusion • Domestic politics/economic priorities • Platforms/applications
  • 218. Should networks follow language groups?
  • 219. English as the dominant carrier of global conversation
  • 220. Internet languages
  • 221. A new bi-polar world?
  • 222. Peer to Peer Diplomacy: Global Social Network Usage
  • 223. Twitter’s global web traffic (not counting sms, im, etc)
  • 224. P2PD: China’s exclusion from “facebook friendships”
  • 225. South Korea’s facebook friendships
  • 226. Russia’s Facebook friendships
  • 227. Iran’s Facebook friendships
  • 228. Public Diplomacy: Twitter targets
  • 229. China’s Twitter outreach
  • 230. Russia’s Twitter Outreach
  • 231. Public Diplomacy: E-diplomacy index
  • 232. How is China doing?
  • 233. South Korea’s E-Diplomacy
  • 234. Issue 2: Information Access/Control • Crowd Sourced • Unprecedented access to sensitive information • Stratified • Customized. “The spread of information networks is forming a new nervous system for our planet. When something happens in Haiti or Hunan, the rest of us learn about it in real time - from real people.” US Sec of State Hillary Clinton, 2010
  • 235. Wikileaks: Crowd-sourced espionage or invaluable public service? • Revealed US war plans and operations, as well as diplomatic secrets • Led to multiple recriminations, including attempted assassination of Saudi ambassador • Snowden: hero or traitor?
  • 236. The value of geographic knowledge
  • 237. Need a drone?
  • 238. Issue Three: Policies • Re-articulation of “national interest” • Alec J. Ross and “21st Century Statecraft” – “addresses new forces propelling change in international relations that are pervasive, disruptive, and difficult to predict.” US Dept of State • Perhaps what we can predict – Publics more important than elites – Don’t assume you can keep secrets – Companies comply with national laws more for reputational reasons than for fear of sanction
  • 239. The Internet Freedom Agenda • “Countries that restrict free access to information or violate the basic rights of internet users risk walling themselves off from the progress of the next century.” Hillary Clinton, January 2010, Remarks on Internet Freedom • “Let’s be clear. This disclosure is not just an attack on America - it’s an attack on the international community.” Hillary Clinton, November 2010, after the Wikileaks release. • Conclusion: no set of easy answers
  • 240. Final thoughts… • We need far more sustained attention to the impact of new media between states, as well as within states. • Unrealistic to simply say “NO,” no matter how loudly we say it. The technology won’t be unmade. • We are in uncharted, and largely unstudied, territory, and our policies are being driven by what is technically feasible, rather than what is desirable.
  • 241. A project from the Social Media Research Foundation: http://www.smrfoundation.org
  • 242. About Me Introductions Marc A. Smith Chief Social Scientist Connected Action Consulting Group Marc@connectedaction.net http://www.connectedaction.net http://www.codeplex.com/nodexl http://www.twitter.com/marc_smith http://delicious.com/marc_smith/Paper http://www.flickr.com/photos/marc_smith http://www.facebook.com/marc.smith.sociologist http://www.linkedin.com/in/marcasmith http://www.slideshare.net/Marc_A_Smith http://www.smrfoundation.org
  • 243. Social Media Research Foundation http://smrfoundation.org
  • 244. Social Media Research Foundation • People: University Faculty, Students, Industry, Independent Researchers, Developers • Disciplines: Computer Science; HCI, CSCW; Machine Learning; Information Visualization; UI/UX; Social Science/Sociology; Network Analysis; Collective Action • Institutions: University of Maryland, Oxford Internet Institute, Stanford University, Microsoft Research, Illinois Institute of Technology, Connected Action, Cornell, Morningside Analytics
  • 245. What we are trying to do: Open Tools, Open Data, Open Scholarship • Build the “Firefox of GraphML” – open tools for collecting and visualizing social media data • Connect users to network analysis – make network charts as easy as making a pie chart • Connect researchers to social media data sources • Archive: Be the “Allen Very Large Telescope Array” for Social Media data – coordinate and aggregate the results of many users’ data collection and analysis • Create open access research papers & findings • Make “collections of connections” easy for users to manage
  • 246. What we have done: Open Tools • NodeXL • Data providers (“spigots”) – ThreadMill Message Board – Exchange Enterprise Email – Voson Hyperlink – SharePoint – Facebook – Twitter – YouTube – Flickr
  • 247. What we have done: Open Data • NodeXLGraphGallery.org – User generated collection of network graphs, datasets and annotations – Collective repository for the research community – Published collections of data from a range of social media data sources to help students and researchers connect with data of interest and relevance
  • 248. What we have done: Open Scholarship
  • 249. What we have done: Open Scholarship
  • 250. Social Media (email, Facebook, Twitter, YouTube, and more) is all about connections from people to people.
  • 251. Patterns are left behind
  • 252. There are many kinds of ties…. Send, Mention, Like, Link, Reply, Rate, Review, Favorite, Friend, Follow, Forward, Edit, Tag, Comment, Check-in… http://www.flickr.com/photos/stevendepolo/3254238329
  • 253. Social Network Theory (http://en.wikipedia.org/wiki/Social_network) • Central tenet – Social structure emerges from the aggregate of relationships (ties) among members of a population • Phenomena of interest – Emergence of cliques and clusters from patterns of relationships – Centrality (core), periphery (isolates), betweenness • Methods – Surveys, interviews, observations, log file analysis, computational analysis of matrices. Source: Richards, W. (1986). The NEGOPY network analysis program. Burnaby, BC: Department of Communication, Simon Fraser University. pp.716 (Hampton & Wellman, 1999; Paolillo, 2001; Wellman, 2001)
  • 254. SNA 101 • Node – “actor” on which relationships act; 1-mode versus 2-mode networks • Edge – Relationship connecting nodes; can be directional • Cohesive Sub-Group – Well-connected group; clique; cluster • Key Metrics – Centrality (group or individual measure) • Number of direct connections that individuals have with others in the group (usually look at incoming connections only) • Measure at the individual node or group level – Cohesion (group measure) • Ease with which a network can connect • Aggregate measure of shortest path between each node pair at network level reflects average distance – Density (group measure) • Robustness of the network • Number of connections that exist in the group out of 100% possible – Betweenness (individual measure) • # shortest paths between each node pair that a node is on • Measure at the individual node level • Node roles – Peripheral – below average centrality – Central connector – above average centrality – Broker – above average betweenness
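The key metrics on this slide can be illustrated with a minimal sketch. It is not NodeXL's implementation; the six-node undirected graph, the function names, and the choice of Brandes' algorithm for betweenness are all assumptions made for illustration.

```python
from collections import deque

def degree(adj):
    """Centrality as the number of direct connections per node."""
    return {v: len(nbrs) for v, nbrs in adj.items()}

def density(adj):
    """Share of all possible undirected ties that actually exist."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return m / (n * (n - 1) / 2)

def betweenness(adj):
    """Brandes' algorithm: shortest paths between other node pairs
    that pass through each node (unnormalised, undirected)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        dist = {s: 0}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate dependencies
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # every unordered pair was counted from both endpoints
    return {v: b / 2 for v, b in score.items()}

# Hypothetical example: two triangles joined by the tie C-D.
adj = {
    "A": ["B", "C"], "B": ["A", "C"], "C": ["A", "B", "D"],
    "D": ["C", "E", "F"], "E": ["D", "F"], "F": ["D", "E"],
}
deg = degree(adj)
btw = betweenness(adj)
mean_btw = sum(btw.values()) / len(btw)
brokers = sorted(v for v in adj if btw[v] > mean_btw)
```

In this toy graph C and D each sit on the six shortest paths that cross between the two triangles, so they are the only nodes with above-average betweenness and come out as the brokers.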
  • 255. NodeXL Free/Open Social Network Analysis add-in for Excel 2007/2010 makes graph theory as easy as a pie chart, with integrated analysis of social media sources. http://nodexl.codeplex.com
  • 256. Now Available
  • 257. Communities in Cyberspace
  • 258. Goal: Make SNA easier • Existing Social Network Tools are challenging for many novice users • Tools like Excel are widely used • Leveraging a spreadsheet as a host for SNA lowers barriers to network data analysis and display
  • 259. http://www.flickr.com/photos/badgopher/3264760070/
  • 260. http://www.flickr.com/photos/druclimb/2212572259/in/photostream/
  • 261. http://www.flickr.com/photos/hchalkley/47839243/
  • 262. http://www.flickr.com/photos/rvwithtito/4236716778
  • 263. http://www.flickr.com/photos/62693815@N03/6277208708/
  • 264. Social Network Maps Reveal Key influencers in any topic. Sub-groups. Bridges.
  • 265. NodeXL Network Overview Discovery and Exploration add-in for Excel 2007/2010 A minimal network can illustrate the ways different locations have different values for centrality and degree
  • 266. Hubs
  • 267. Bridges
  • 268. http://www.flickr.com/photos/storm-crypt/3047698741
  • 269. Welser, Howard T., Eric Gleave, Danyel Fisher, and Marc Smith. 2007. Visualizing the Signatures of Social Roles in Online Discussion Groups. The Journal of Social Structure. 8(2). Experts and “Answer People” Discussion people, Topic setters Discussion starters, Topic setters
  • 270. http://www.flickr.com/photos/library_of_congress/3295494976/sizes/o/in/photostream/
  • 271. http://www.flickr.com/photos/amycgx/3119640267/
  • 272. #teaparty 15 November 2011 #occupywallstreet 15 November 2011 http://www.newscientist.com/blogs/onepercent/2011/11/occupy-vs-tea-party-what-their.html
  • 273. Like MSPaint™ for graphs. — the Community Introduction to NodeXL
  • 274. NodeXL Ribbon in Excel
  • 275. NodeXL data import sources
  • 276. Example NodeXL data importer for Twitter
  • 277. NodeXL imports “edges” from social media data sources
  • 278. NodeXL displays subgraph images along with network metadata NodeXL creates a list of “vertices” from imported social media edges
  • 279. NodeXL Automation makes analysis simple and fast Perform collections of common operations with a single click
  • 280. NodeXL Generates Overall Network Metrics
  • 281. 50
  • 282. 51
  • 283. 52
  • 284. 53
  • 285. 54
  • 286. 55
  • 287. 56
  • 288. 57
  • 289. 58
  • 290. Divided Polarized Unified In-group Fragmented Brand Clustered Communities In-Hub & Spoke Broadcast Out-Hub & Spoke Support
  • 291. 6 kinds of Twitter social media networks
  • 292. #My2K Polarized
  • 293. #CMgrChat In-group / Community
  • 294. Lumia Brand / Public Topic
  • 295. #FLOTUS Bazaar
  • 296. New York Times Article Paul Krugman Broadcast: Audience + Communities
  • 297. Dell Listens/Dellcares Support
  • 298. SNA questions for social media: 1. What does my topic network look like? 2. What does the topic I aspire to be look like? 3. What is the difference between #1 and #2? 4. How does my map change as I intervene? What does #YourHashtag look like?
  • 299. Twitter Network for “Microsoft Research” *BEFORE*
  • 300. Twitter Network for “Microsoft Research” *AFTER*
  • 301. Network Motif Simplification Cody Dunne, University of Maryland
  • 302. Network Motif Simplification • D-connector (glyph on the right) • Fan (glyph on the right) • D-clique (glyphs for 4, 5, and 6 member cliques below). Dr. Cody Dunne
  • 303. NodeXL Graph Gallery
  • 304. Scholars using NodeXL • Communications – Katy Pearce – Itai Himelboim • Business – Scott Dempwolf • Humanities/Classics – Diane Cline
  • 305. C. Scott Dempwolf, PhD Research Assistant Professor & Director UMD - Morgan State Center for Economic Development
  • 306. What is Social Network Analysis? How is it useful for the humanities? 1. New framework for analysis 2. Data visualization allows new perspectives – less linear, more comprehensive Social Network Analysis and Ancient History Diane H. Cline, Ph.D. University of Cincinnati
  • 307. NodeXL calculates metrics about networks and content
  • 308. The Content summary spreadsheet displays the most frequently used URLs, hashtags, and user names within the network as a whole and within each calculated sub-group.
  • 309. NodeXL Graph Gallery
  • 310. NodeXL as a Research Tool
  • 311. NodeXL as a Teaching Tool I. Getting Started with Analyzing Social Media Networks 1. Introduction to Social Media and Social Networks 2. Social media: New Technologies of Collaboration 3. Social Network Analysis II. NodeXL Tutorial: Learning by Doing 4. Layout, Visual Design & Labeling 5. Calculating & Visualizing Network Metrics 6. Preparing Data & Filtering 7. Clustering & Grouping III. Social Media Network Analysis Case Studies 8. Email 9. Threaded Networks 10. Twitter 11. Facebook 12. WWW 13. Flickr 14. YouTube 15. Wiki Networks http://www.elsevier.com/wps/find/bookdescription.cws_home/723354/description
  • 312. What we want to do: (Build the tools to) map the social web • Move NodeXL to the web: (Node[NOT]XL) – Node for Google Doc Spreadsheets? – WebGL Canvas? D3.JS? Sigma.JS • Connect to more data sources of interest: – RDF, MediaWikis, Gmail, NYT, Citation Networks • Solve hard network manipulation UI problems: – Modal transform, Time series, Automated layouts • Grow and maintain archives of social media network data sets for research use. • Improve network science education: – Workshops on social media network analysis – Live lectures and presentations – Videos and training materials
  • 313. NodeXL Results • Easy to learn, yet powerful and insightful • Widely used by both students and researchers • Free and open source software • World-wide team of collaborators. Malik S, Smith A, Papadatos P, Li J, Dunne C, and Shneiderman B (2013), "TopicFlow: Visualizing topic alignment of Twitter data over time", In ASONAM '13. Bonsignore EM, Dunne C, Rotman D, Smith M, Capone T, Hansen DL and Shneiderman B (2009), "First steps to NetViz Nirvana: Evaluating social network analysis with NodeXL", In CSE '09. pp. 332-339. DOI:10.1109/CSE.2009.120 Mohammad S, Dunne C and Dorr B (2009), "Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus", In EMNLP '09. pp. 599-608. Smith M, Shneiderman B, Milic-Frayling N, Rodrigues EM, Barash V, Dunne C, Capone T, Perer A and Gleave E (2009), "Analyzing (social media) networks with NodeXL", In C&T '09. pp. 255-264. DOI:10.1145/1556460.1556497
  • 314. How you can help • Sponsor a feature • Sponsor workshops • Sponsor a student • Schedule training • Sponsor the foundation • Donate your money, code, computation, storage, bandwidth, data or employees’ time • Help promote the work of the Social Media Research Foundation
  • 315. Available Now in NodeXL! • Motif Simplification • Group-in-a-Box Layouts • Data import spigots • Excel functions & macros • Network statistics • Layout algorithms • Filtering • Clustering • Attribute mapping • Automate analyses • Email reporting • Graph Gallery • C# libraries. nodexl.codeplex.com
  • 316. Strategies for social media engagement based on social media network analysis
  • 317. A project from the Social Media Research Foundation: http://www.smrfoundation.org
  • 318. International Collaboration & Green Technology Generation Assessing the East Asian Environmental Regime Matthew A. Shapiro Illinois Institute of Technology matthew.shapiro@iit.edu
  • 319. Impetus • Shapiro and Nugent (2012) “Institutions and the sources of innovation” in IJPP • Total factor productivity is hindered by collaboration if institutions are absent or if not beyond TFP threshold • Shapiro (2013) “Regionalism’s challenge to the pollution haven hypothesis” in Pacific Review • Regional efforts to eliminate pollution are multifaceted • Support • East Asia Institute • Asiatic Research Institute, Korea University
  • 320. International institutions To other regions To other regions Regional institutions Country 2 FDI Country 2 ecologists (+) Pollution haven hypothesis (+) (+) Epistemic community hypothesis (-) Country 1 pollution Country 2 pollution Country 3 pollution Country 1 institutions (-) Country 2 domestic R&D funding Country 3 domestic R&D funding Country 3 ecologists Country 3 FDI Contra-pollution haven hypothesis (-) Country 1 domestic R&D funding Country 1 ecologists Country 1 FDI Country 2 institutions Country 3 institutions
  • 323. Research Questions • Are the Northeast Asian countries key collaborators in pursuit of green R&D? • Yes, particularly in recent years. • Are the Northeast Asian countries collaborating extensively with each other? • Not as much as they collaborate with countries beyond the region. • Implications?
  • 324. Green R&D • Patents • IPC Green Inventory – Alternative energy production – Transportation – Energy conservation – Waste management – Agriculture/forestry – Administrative aspects – Nuclear power generation
  • 325. Alternative energy production • Biofuels • Integrated gasification combined cycle • Fuel cells • Pyrolysis or gasification of biomass • Harnessing energy from manmade waste • Hydro energy • Ocean thermal energy conversion • Wind energy • Solar energy • Geothermal energy • Other production or use of heat not derived from combustion • Using waste heat • Devices for producing mechanical power from muscle energy. Energy conservation • Storage of electrical energy • Power supply circuitry • Measurement of electricity consumption • Storage of thermal energy • Low energy lighting • Thermal building insulation, in general • Recovering mechanical energy
  • 326. Data Collection • Source: USPTO • Collection method: Leydesdorff’s tools • Unit of analysis: country of inventor
  • 327. Data Description • Dates: 1990-2013 • 129,640 total inventors • 242,331 total nodes based on country classification • Assumption: Any collaboration is valued, so proportionate share of patent inventorship is ignored. [Chart labels: US, JP, DE, FR, NL, GB, CA, DK, KR, AU, TW, NZ, CH, CN, IT, IN, BE, IL, all others]
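A minimal sketch of how a country-level collaboration network could be derived under the stated assumption that any collaboration counts equally, regardless of each country's share of the inventors. This is not Leydesdorff's tooling; the function name and the four sample patents are hypothetical.

```python
from collections import Counter
from itertools import combinations

def country_edges(patents):
    """Count, for each pair of countries, the patents they co-invented.

    `patents` is a list of patents, each given as the list of its
    inventors' country codes. A pair is counted once per patent, so a
    country's proportionate share of the inventors is ignored.
    """
    edges = Counter()
    for inventor_countries in patents:
        countries = sorted(set(inventor_countries))  # drop repeats
        for a, b in combinations(countries, 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical sample: inventor countries for four patents.
patents = [
    ["KR", "KR", "US"],   # two Korean inventors still count once
    ["JP", "US"],
    ["KR", "JP", "US"],
    ["CN"],               # single-country patent: no collaboration tie
]
edges = country_edges(patents)
```

Counting each pair once per patent is what makes the network insensitive to team composition; a share-weighted variant would instead divide each patent's weight among its inventors.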
  • 328. Are Northeast Asian countries key collaborators?
  • 329. All years: 1990-2013
  • 330. Longitudinal analysis…
  • 331. 1990-1997
  • 332. 1998-2004
  • 333. 2005-2013
  • 334. Is Northeast Asia a singular research hub?
  • 335. All years: 1990-2013
  • 336. Longitudinal analysis…
  • 337. 1990-1997
  • 338. 1998-2004
  • 339. 2005-2013
  • 340. Small world example
  • 341. Northeast Asia only: 1990-2013
  • 342. Implications • Empirical – R&D collaboration can be beneficial both intra- and extra-regionally. Both are happening extensively for Northeast Asia. • Methodological – Challenges of connecting these results to other variables in the model – Longitudinal concerns: Change in connectedness? – Qualitative, quantitative, mixed?
  • 343. Assessing Social Media Coverage in Japan: Before and After March 11, 2011 Leslie M. Tkach-Kawasaki University of Tsukuba DISC 2013, December 11, 2013
  • 344. Overview 1. Introduction: Social Media in Japan 2010-2011 2. March 11, 2011: Triple Disaster 3. Social Media: Before and After? 4. Method 5. Select Results (6 tables) 6. Conclusion
  • 345. Japan’s Internet Population 2011. Source: 2011 White Paper on Information and Communications in Japan (情報通信白書, Heisei 23 edition)
  • 346. Social Media in Japan 2010-2011. Have used the following at least once: • Blogs: 77.3% • Video-sharing websites: 62.8% • SNS: 53.6% • Microblogs (Twitter): 30.9%. Source: 2010 White Paper on Information and Communications in Japan
  • 347. The Year in Social Media 2010-11 • International diplomacy: Youtube and Chinese fishing vessel (September 2010) • Entertainment: Release of The Social Network (October 2010) • International conflicts: Role of Twitter and Facebook in Tunisia and Egypt (January 2011) • Disasters: New Zealand Earthquake (February 2011)
  • 348. And March 11, 2011….
  • 349. Information Provision/Gathering During 2011 Earthquake Source: 2012 White Paper on Information and Communications in Japan
  • 350. Research question…. Are there perceivable differences in the discourse (phrases) about social media in Japan’s newspaper media before and after March 11, 2011?
  • 351. Method • Content analysis of newspaper articles from September 1, 2010 to March 11, 2011 (6 mos., 11 days), and from March 11, 2011 to July 31, 2011 (4 mos., 20 days). • Main keywords: – Social media (ソーシャルメディア) – Mixi (ミクシー、ミクシィ) – Youtube (ユーチューブ) – Twitter (ツイッター) – Facebook (フェイスブック) • Phrases associated with each of the main keywords
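The article-counting step of such a content analysis could be sketched as follows. This is an illustration only, not the author's procedure: the function name and the three sample articles are hypothetical, articles are assumed to have been filtered to the study window already, and the pre/post split follows the Table 1 column headings (pre-earthquake up to March 11, post-earthquake from March 12).

```python
from datetime import date

# Main keywords and their katakana spellings, as listed on the slide.
KEYWORDS = {
    "Social media": ["ソーシャルメディア"],
    "Mixi": ["ミクシー", "ミクシィ"],
    "Youtube": ["ユーチューブ"],
    "Twitter": ["ツイッター"],
    "Facebook": ["フェイスブック"],
}
PRE_END = date(2011, 3, 11)  # last day of the pre-earthquake period

def count_articles(articles):
    """Count articles mentioning each keyword, split into pre/post periods.

    `articles` is an iterable of (publication date, text) pairs.
    """
    counts = {name: {"pre": 0, "post": 0} for name in KEYWORDS}
    for published, text in articles:
        period = "pre" if published <= PRE_END else "post"
        for name, spellings in KEYWORDS.items():
            if any(s in text for s in spellings):
                counts[name][period] += 1
    return counts

# Hypothetical sample articles (invented text, for illustration only).
articles = [
    (date(2010, 10, 5), "ツイッターとフェイスブックの利用が広がる"),
    (date(2011, 3, 20), "被災地の情報をツイッターで共有"),
    (date(2011, 4, 2), "ミクシィで安否を連絡"),
]
counts = count_articles(articles)
```

An article that mentions several keywords is counted once under each of them, which is one plausible reading of how per-keyword article totals would be tallied.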
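  • 352. Table 1: No. of Articles by keyword. Pre-earthquake = Sept. 1, 2010 to Mar. 11, 2011; post-earthquake = Mar. 12 to Jul. 31, 2011. • Social media (ソーシャルメディア): pre Asahi 18, Nikkei 24; post Asahi 28, Nikkei 8 • Mixi (ミクシー ミクシィ): pre Asahi 0, Nikkei 56; post Asahi 0, Nikkei 10 • Youtube (ユーチューブ): pre Asahi 151, Nikkei 27; post Asahi 55, Nikkei 12 • Twitter (ツイッター): pre Asahi 502, Nikkei 58; post Asahi 561, Nikkei 94 • Facebook (フェイスブック): pre Asahi 115, Nikkei 49; post Asahi 97, Nikkei 74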
  • 353. Table 2 Social media Asahi Pre (N=18) Phrase 情報 (info) Nikkei Post (N=28) Freq. 72 Phrase 情報 (info) Pre (N=24) Freq. Phrase Post (N=8) Freq. 131 99 ネット (Net) 19 123 営業 (operations) 77 情報 (info) 19 117 情報 (info) 62 日本 (Japan) 17 利用 (use) 社員 (comp. employee) 17 56 ネット (Net) 48 取締役 (board) 監査 (investigation) 人 (people) 42 社長 (co. president) 79 ネット (Net) 61 政府 (govt) 40 退任 (resign) 74 メディア (media) 58 メディア (media) 38 据置 (sueoki or defer) 72 世界 (world) 37 常務 (ordinary) 67 使う (use) 36 企業 (company) 35 デモ (demo) 33 損失(cost) Freq. 局長 (branch president) 日本 (Japan) 増配 (inc. in dividends メディア (media) Phrase サイト (website) ビジネス (business) 48 48 募金 (donation) 投資 (investment) 67 推進 (progress) 48 60 代理 (rep.) 48 必要 (necessary) ビジネス (business) 45 震災 (disaster) 54 企業 12 12 11 11 10 10
  • 354. Table 3: Mixi, Nikkei (phrase and frequency). • Pre (N=56): サイト (website) 118; ネット (Net) 104; 企業 (company) 74; 利用 (use) 64; 日本 (Japan) 62; ゲーム (game) 54; 市場 (market) 51; 米 (west) 48; 利益 (benefit) 46; サービス (service) 42 • Post (N=10): サイト (website) 18; メール (email) 18; サービス (service) 17; 相談 (consultation) 16; 東京 (Tokyo) 15; 連絡 (communicate) 15; ギフト (gift) 14; センター (center) 13; 医師 (doctor) 13; 震災 (disaster) 13
  • 355. Table 4 Youtube Asahi Pre (N=151) Phrase Nikkei Post (N=55) Freq. Phrase Freq. Pre (N=27) Phrase Post (N=12) Freq. Phrase Freq. 36 映像 (footage) 490 動画 (video) 98 映像 (video) 143 動画 (video) 265 人 (people) 79 保安 (security) 70 情報 (info) サイト (website) 保安 (security) 251 77 海上 (maritine) 67 ネット (Net) 28 投稿 (post [v.]) 231 69 ネット (Net) 58 日本 (Japan) 24 流出 (outflow) 海保 (maritime safely) 捜査 (investigation) 222 被災 (disaster) サイト (website) 投稿 (post [v.]) 68 投稿 (post [v.]) 58 将棋 (chess) 21 207 ネット (Net) 52 58 動画 (video) 21 206 見る (watch) 50 流出 (outflow) 捜査 (investigation) 53 配信 (reports) 21 中国 (China) 172 50 中国 (China) 52 166 50 50 サイト (website) 165 情報 (info) 48 衝突 (collision) 海保 (maritime safety) 番組 (program) サービス (service) 18 ネット (Net) 震災 (disaster) 大震災 (major disaster) 47 被災 (disaster) 16 29 17
  • 356. Table 5: Twitter, Asahi (phrase and frequency). • Pre (N=502): 人 (people) 659; 日本 (Japan) 649; ネット (Net) 586; 情報 (info) 563; 東京 (Tokyo) 374; 思う (to think) 368; 前 (before) 326; サイト (website) 320; 見る (watch) 320; メディア (media) 296 • Post (N=561): 運行 (travel) 1513; 被災 (disaster) 1333; 相談 (consultation) 1156; 情報 (info) 1144; 仙台 (Sendai) 1137; 人 (people) 1101; 午後 (afternoon) 1030; 福島 (Fukushima) 1019; 支援 (support) 965; 電話 (telephone) 875
  • 357. Table 6: Facebook, Asahi (phrase and frequency). • Pre (N=115): デモ (demonstration) 414; エジプト (Egypt) 233; 大統領 (president) 214; 政権 (govt. administration) 204; ムバラク (Mubarak) 175; 日本 (Japan) 163; 情報 (info) 148; 政府 (govt) 144; 人 (people) 142; ネット (Net) 134 • Post (N=97): 情報 (info) 180; 人 (people) 157; ネット (Net) 136; 支援 (support) 114; 被災 (disaster) 113; 日本 (Japan) 104; 福島 (Fukushima) 86; 避難 (evacuation) 84; 世界 (world) 77; 大震災 (major disaster) 74
  • 358. Conclusion • Newspapers – Similarities in coverage of “social media” as a general phrase – In numbers, more articles about Youtube, Facebook, and Twitter in the Asahi – Mixi as part of the service market in the Nikkei – Implication 1: Social aspects versus business aspects • Timing – Pre-earthquake: Business, world events (focus: China) – Post-earthquake: Japan and the earthquake – Implication 2: Diffusion from “foreign” media to “Japanese” media • Across all types of social media channels – Emphasis on information (情報) and the earthquake (震災), particularly on Youtube, Facebook, and Twitter in the post-earthquake period – Implication 3: Social media “personalization”
  • 359. Thank you for listening….. tkach@japan.email.ne.jp
  • 360. Citation Networks: Analysis of citation networks. Vladimir Batagelj, Monika Cerinšek, University of Ljubljana. Daegu International Social Network Conference DISC 2013, Global Plaza, Daegu, South Korea, December 12-14, 2013
  • 361. Outline 1. Bibliographic networks 2. Acyclic networks 3. Probabilistic flow index 4. SPC weights 5. Coupling and Co-citation 6. Authors’ citations network 7. Other approaches
  • 362. Record from Web of Science: PT J AU Dipple, H Evans, B TI The Leicestershire Huntington’s disease support group: a social network analysis SO HEALTH & SOCIAL CARE IN THE COMMUNITY LA English DT Article C1 Rehabil Serv, Troon Way Business Ctr, Leicester LE4 9HA, Leics, England. RP Dipple, H, Rehabil Serv, Troon Way Business Ctr, Sandringham Suite,Humberstone Lane, Leicester LE4 9HA, Leics, England. CR BORGATTI SP, 1992, UCINET 4 VERSION 1 0 FOLSTEIN S, 1989, HUNTINGTONS DIS DISO SCOTT J, 1991, SOCIAL NETWORK ANAL NR 3 TC 3 PU BLACKWELL SCIENCE LTD PI OXFORD PA P O BOX 88, OSNEY MEAD, OXFORD OX2 0NE, OXON, ENGLAND SN 0966-0410 J9 HEALTH SOC CARE COMMUNITY JI Health Soc. Care Community PD JUL PY 1998 VL 6 IS 4 BP 286 EP 289 PG 4 SC Public, Environmental & Occupational Health; Social Work GA 105UP UT ISI:000075092200008 ER
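A record like this can be read with a small tag-based parser. The sketch below is not WoS2Pajek; it assumes the usual plain-text export layout, in which each field starts with a two-letter tag, continuation lines are indented, and `ER` ends the record. The function name, the abridged sample, and the line break inside the title are assumptions.

```python
def parse_wos_record(text):
    """Parse one field-tagged record into {tag: [values]}.

    Assumes a two-letter tag at the start of a field, indented
    continuation lines, and 'ER' as the end-of-record marker.
    """
    record = {}
    tag = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("ER"):
            break
        if line[:2].strip():            # a new field tag starts here
            tag = line[:2]
            record[tag] = [line[3:].strip()]
        elif tag is not None:           # continuation of the last field
            record[tag].append(line.strip())
    return record

# Abridged version of the record on the slide.
sample = """PT J
AU Dipple, H
   Evans, B
TI The Leicestershire Huntington's disease support group: a social
   network analysis
CR BORGATTI SP, 1992, UCINET 4 VERSION 1 0
   FOLSTEIN S, 1989, HUNTINGTONS DIS DISO
   SCOTT J, 1991, SOCIAL NETWORK ANAL
PY 1998
ER"""

rec = parse_wos_record(sample)
title = " ".join(rec["TI"])
cited = rec["CR"]    # one entry per cited work
```

The `CR` field is what yields the citation network: each entry is a cited work in short ISI-name form, while the record itself describes the citing work.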
  • 363. Networks from Web of Science: For data from the Web of Science (Knowledge) we can obtain the corresponding networks using the program WoS2Pajek: • citation network Ci: works × works; • authorship network WA: works × authors, for works without complete description only the first author is known; • keywords network WK: works × keywords, only for works with complete description; • journals network WJ: works × journals; • partition of works by the publication year; • partition of works – complete description (1) / ISI name only (0). In the following we shall focus on the analysis of citation networks. Other sources/examples of citation networks are the US patents network and the US Supreme Court cases network.
  • 364. Citation networks: In a given set of units/vertices V (articles, books, works, etc.) we introduce a citing relation/set of arcs R ⊆ V × V, where uRv ≡ v cites u, which determines a citation network N = (V, R). The citing relation is usually irreflexive (no loops) and (almost) acyclic: it doesn’t contain any (proper) cycle. The reason for acyclicity is that the cited works are older than the citing work. If a cycle exists it is usually very short.
  • 365. Condensation If we shrink every strong component of a given graph into a vertex, delete all loops and identify parallel arcs, the obtained reduced graph, called the condensation, is acyclic.
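
The shrinking step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `condensation`, the arc-list input format, and the choice of Kosaraju's two-pass method for finding strong components are ours, not something the slides prescribe.

```python
from collections import defaultdict

def condensation(vertices, arcs):
    """Return (component_of, reduced_arcs) for the condensation of a digraph."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in arcs:
        succ[u].append(v)
        pred[v].append(u)

    # Pass 1 (Kosaraju): record vertices in order of DFS completion.
    order, seen = [], set()
    for root in vertices:
        if root in seen:
            continue
        seen.add(root)
        stack = [(root, iter(succ[root]))]
        while stack:
            node, it = stack[-1]
            nxt = next((w for w in it if w not in seen), None)
            if nxt is None:
                order.append(node)
                stack.pop()
            else:
                seen.add(nxt)
                stack.append((nxt, iter(succ[nxt])))

    # Pass 2: sweep the reversed graph in reverse completion order;
    # each sweep collects exactly one strong component.
    component_of = {}
    for root in reversed(order):
        if root in component_of:
            continue
        component_of[root] = root
        todo = [root]
        while todo:
            node = todo.pop()
            for w in pred[node]:
                if w not in component_of:
                    component_of[w] = root
                    todo.append(w)

    # Shrink: arcs inside a component become loops and are dropped;
    # using a set identifies parallel arcs.
    reduced = {(component_of[u], component_of[v])
               for u, v in arcs if component_of[u] != component_of[v]}
    return component_of, reduced
```

Each strong component is represented by one of its own vertices, so the reduced arc set can be fed directly into any routine that expects an acyclic network.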
  • 366. Acyclic networks In real-life acyclic networks we usually have a vertex property p : V → R (most often time) that is compatible with arcs: (u, v) ∈ R ⇒ p(u) < p(v). Besides citation networks, examples of acyclic networks are also genealogies and project networks. Acyclic networks are the most general form of hierarchy. [Figure: example acyclic network on the vertices v1 to v11, file acyclic.paj] Pajek: Network/Create Partition/Components/Strong [2]
  • 367. Basic properties of acyclic networks Let G = (V, R) be acyclic and U ⊆ V; then G|U = (U, R|U), R|U = R ∩ (U × U), is also acyclic. Let G = (V, R) be acyclic; then (V, R⁻¹) is also acyclic (duality). The set of sources Min_R(V) = {v : ¬∃u ∈ V : (u, v) ∈ R} and the set of sinks Max_R(V) = {v : ¬∃u ∈ V : (v, u) ∈ R} are nonempty (in finite networks).
  • 368. Compatible numberings Compatible numberings: depth and topological order. For every acyclic graph G = (V, A) an ordering / level function i : V → N exists s.t. (u, v) ∈ A ⇒ i(u) < i(v). [Figure: the example network on v1 to v11 labeled with two compatible numberings. Depth: v8 = 1, v10 = 1, v4 = 2, v5 = 2, v1 = 3, v9 = 4, v11 = 4, v2 = 5, v7 = 5, v3 = 6, v6 = 6. Topological order: v10 = 1, v8 = 2, v4 = 3, v1 = 4, v11 = 5, v9 = 6, v2 = 7, v5 = 8, v7 = 9, v6 = 10, v3 = 11.] Pajek: Macro Layers.
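
Both numberings can be produced in one sweep. The sketch below uses Kahn's algorithm, which is our choice for illustration (the slides only refer to the Pajek macro); the function name and input format are likewise assumptions.

```python
from collections import defaultdict, deque

def compatible_numberings(vertices, arcs):
    """Return (depth, topo); both satisfy i(u) < i(v) for every arc (u, v)."""
    succ = defaultdict(list)
    indeg = {v: 0 for v in vertices}
    for u, v in arcs:
        succ[u].append(v)
        indeg[v] += 1

    depth = {v: 1 for v in vertices}       # sources sit on level 1
    topo = {}
    queue = deque(v for v in vertices if indeg[v] == 0)
    while queue:
        u = queue.popleft()
        topo[u] = len(topo) + 1            # next free position 1, 2, 3, ...
        for v in succ[u]:
            depth[v] = max(depth[v], depth[u] + 1)
            indeg[v] -= 1
            if indeg[v] == 0:              # all predecessors already numbered
                queue.append(v)
    if len(topo) != len(indeg):
        raise ValueError("graph has a cycle; no compatible numbering exists")
    return depth, topo
```

The depth numbering may repeat values (vertices on the same layer), while the topological order is a permutation of 1..|V|; this matches the two labelings shown on the slide.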
  • 369. . . . Compatible numberings [Figure: drawings of the example acyclic network on the vertices v1 to v11]
  • 370. Compatible numberings and functions on acyclic networks Let the function f : V → R be defined in the following way:
      - f(v) is known in sources v ∈ Min_R(V);
      - f(v) = F({f(u) : uRv}) otherwise.
    If we compute the values of f in a sequence determined by a compatible numbering, we can compute them in one pass, since for each vertex v ∈ V the values of f needed for its computation are already known.
  • 371. Compatible numberings – CPM CPM (Critical Path Method): A project consists of tasks. Vertices of a project network represent states of the project and arcs represent tasks. Every project network is acyclic. For each task (u, v) its execution time t(u, v) is known. A task can start only when all the preceding tasks are finished. We want to know the shortest time in which the project can be completed. Let T(v) denote the earliest time of completion of all tasks entering the state v. Then
      $$T(v) = 0 \quad \text{for } v \in \mathrm{Min}_R(V), \qquad T(v) = \max_{u : uRv} \big( T(u) + t(u, v) \big) \quad \text{otherwise.}$$
    Pajek: Network/Acyclic Network/Critical Path Method-CPM
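
The recurrence for T(v) is an instance of the one-pass scheme: visiting states in a compatible order guarantees that every T(u) on the right-hand side is final when it is used. A minimal sketch, assuming nonnegative execution times and a dictionary input format of our own choosing:

```python
from collections import defaultdict, deque

def earliest_times(vertices, durations):
    """durations maps each task (u, v) to its execution time t(u, v) >= 0."""
    succ = defaultdict(list)
    indeg = {v: 0 for v in vertices}
    for u, v in durations:
        succ[u].append(v)
        indeg[v] += 1

    T = {v: 0 for v in vertices}           # T(v) = 0 in sources
    queue = deque(v for v in vertices if indeg[v] == 0)
    while queue:
        u = queue.popleft()                # T(u) is final at this point
        for v in succ[u]:
            T[v] = max(T[v], T[u] + durations[(u, v)])
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return T

project = {("start", "a"): 3, ("start", "b"): 2, ("a", "b"): 1,
           ("a", "end"): 4, ("b", "end"): 6}
print(earliest_times(["start", "a", "b", "end"], project))
```

The value at the final state is the shortest possible completion time of the whole project.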
  • 372. Dealing with cycles A citing relation is usually irreflexive (no loops) and (almost) acyclic. We shall assume that it has these two properties. Since in real-life citation networks the strong components are small (usually 2 or 3 vertices), we can transform such a network into an acyclic network by shrinking strong components and deleting loops. An alternative approach is the preprint transformation, which replaces each strong component with a bipartite subgraph from works to their preprints. Each paper from a strong component is duplicated with its ’preprint’ version. The papers inside the strong component cite preprints, and preprints cite the works.
  • 373. Standardized citation network It is also useful to transform a citation network to its standardized form by adding a common source vertex s ∉ V and a common sink vertex t ∉ V. The source s is linked by an arc to all minimal elements of R, and all maximal elements of R are linked to the sink t. We also add the ‘feedback’ arc (t, s).
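
As a small illustration, standardization amounts to three list extensions. The function name, the default labels "s" and "t", and the arc-list representation are assumptions made for this sketch.

```python
def standardize(vertices, arcs, s="s", t="t"):
    """Add a common source s, a common sink t and the feedback arc (t, s)."""
    if s in vertices or t in vertices:
        raise ValueError("s and t must be new vertices, not already in V")
    has_pred = {v for _, v in arcs}
    has_succ = {u for u, _ in arcs}
    new_arcs = list(arcs)
    # s points to every minimal element (a vertex without predecessors) ...
    new_arcs += [(s, v) for v in vertices if v not in has_pred]
    # ... and every maximal element (a vertex without successors) points to t.
    new_arcs += [(v, t) for v in vertices if v not in has_succ]
    new_arcs.append((t, s))                # the 'feedback' arc
    return [s, *vertices, t], new_arcs
```

An isolated vertex is both minimal and maximal, so it ends up on its own path s, v, t.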
  • 374. Probabilistic flow in acyclic network Let N = (V, A) be a standardized acyclic network with source s ∈ V and sink t ∈ V. We define the vertex potential p(v) as
      $$p(s) = 1, \qquad p(v) = \sum_{u : (u, v) \in A} \frac{p(u)}{\operatorname{outdeg}(u)}.$$
    Let us denote $\varphi(u, v) = p(u) / \operatorname{outdeg}(u)$. Then we have
      $$p(v) = \sum_{u : (u, v) \in A} \varphi(u, v)$$
    and
      $$\sum_{u : (v, u) \in A} \varphi(v, u) = \sum_{u : (v, u) \in A} \frac{p(v)}{\operatorname{outdeg}(v)} = \frac{p(v)}{\operatorname{outdeg}(v)} \sum_{u : (v, u) \in A} 1 = p(v).$$
  • 375. Probabilistic flow in acyclic network Therefore for each v ∈ V
      $$\sum_{u : (u, v) \in A} \varphi(u, v) = \sum_{u : (v, u) \in A} \varphi(v, u) = p(v),$$
    which says that Kirchhoff’s law holds for the flow φ. The potential p(v) is equal to the probability that a random walk starting in the source s goes through the vertex v, and φ(u, v) is equal to the probability that a random walk starting in the source s goes through the arc (u, v).
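
The potentials and arc flows can be computed in one pass in topological order. The sketch below is an illustration under assumptions of our own: arcs are simple (no parallel arcs), the feedback arc (t, s) is left out of the input, and exact rational arithmetic via `fractions.Fraction` is used so that the conservation law can be checked without rounding.

```python
from collections import defaultdict, deque
from fractions import Fraction

def probabilistic_flow(arcs, s="s"):
    """Return (p, phi) for a standardized acyclic network given by its arcs."""
    succ = defaultdict(list)
    indeg = defaultdict(int)
    for u, v in arcs:
        succ[u].append(v)
        indeg[v] += 1

    p = defaultdict(Fraction)              # Fraction() is 0
    p[s] = Fraction(1)
    phi = {}
    queue = deque([s])                     # s is the only source
    while queue:
        u = queue.popleft()                # p(u) is final here
        for v in succ[u]:
            phi[(u, v)] = p[u] / len(succ[u])   # p(u) / outdeg(u)
            p[v] += phi[(u, v)]
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return dict(p), phi

p, phi = probabilistic_flow(
    [("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")])
print(p["b"], p["t"])
```

In a standardized network every walk from s ends in t, so p(t) should come out as exactly 1.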
  • 376. 50 works with the largest value of probabilistic flow index (·10^6) in SN5 Network SN5 (2008): for "social network*" + most frequent references + around 100 social networkers; |W| = 193376, |C| = 7950, |A| = 75930, |J| = 14651, |K| = 29267

      Rank      Value  Work
      ------------------------------------------
         1  4238.3971  WASSERMA_S(1994):
         2  1993.3290  MITCHELL_J(1969):
         3  1665.0870  WATTS_D(1998)393:440
         4  1610.8547  GRANOVET(1973)78:1360
         5  1300.7664  YAN_Y(1996):
         6  1221.1978  LAUMANN_E(1973):
         7  1215.4193  GOTTLIEB_B(1981):
         8  1139.2607  FREEMAN_L(1979)1:215
         9  1136.0781  BURT_R(1992):
        10  1131.6018  BARABASI_A(1999)286:509
        11  1122.4933  SCOTT_J(1991):
        12  1094.2761  BELLE_D(1989):
        13  1071.6445  ZHANG_L(2001):
        14   925.4650  ROGERS_A(1995):
        15   918.4549  WELLMAN_B(1988):
        16   892.2797  PERRUCCI_R(1982):
        17   888.7358  ANGERMEY_M(1989):
        18   874.9730  ALBERT_R(2002)74:47
        19   874.1589  NEWMAN_M(2003)45:167
        20   865.8406  BIEGEL_D(1985):
        21   854.9899  PHILLIPS_C(2004):
        22   851.8199  CLARKE_S(2002):
        23   850.1594  RUSSELL_G(2002):
        24   813.3185  ROGERS_E(1981):
        25   799.1026  BERKMAN_L(1979)109:186
        26   784.2723  FISCHER_C(1982):
        27   762.5744  COCHRAN_M(1990):
        28   753.3617  PILISUK_M(1986):
        29   743.9300  LOURENCO_I(2002):
        30   711.9562  HOLLAND_P(1979):
        31   708.2055  FRIEDMAN_S(1999):
        32   684.1711  MILARDO_R(1988):
        33   678.3867  PUTNAM_R(2000):
        34   669.8170  GRIECO_M(1987):
        35   659.6933  MAGUIRE_L(1983):
        36   656.1357  BOTT_E(1971):
        37   655.7511  LITWIN_H(1995):
        38   646.8604  HASSINGE_E(1982):
        39   641.1038  GRIECO_M(1996):
        40   626.8148  WATTS_D(1999):
        41   606.7083  COLEMAN_J(1988)94:95
        42   573.8576  HEDIN_A(2001):
        43   557.9572  TAVECCHI_L(1987):
        44   557.0382  COLEMAN_J(1990):
        45   551.1556  DENOOY_W(2005):
        46   535.3469  BERKOWIT_S(1982):
        47   523.1899  MILROY_L(1980):
        48   519.8359  DEGENNE_A(1999):
        49   517.4263  ALBERT_R(1999)401:130
        50   504.7977  PATTISON_P(1993):
  • 377. Citation network analysis Citation network analysis started in 1964 with the paper of Garfield et al. In 1989 Hummon and Doreian proposed three indices, weights of arcs that provide an automatic way to identify the (most) important part of a citation network. For two of these indices we developed algorithms to compute them efficiently.
  • 378. Fast algorithm for SPC The search path count (SPC) method is based on counters n(u, v) that count the number of different paths from s to t through the arc (u, v). To compute n(u, v) we introduce two auxiliary quantities: $n^-(v)$ counts the number of different paths from s to v, and $n^+(v)$ counts the number of different paths from v to t. Then
      $$n(u, v) = n^-(u) \cdot n^+(v), \qquad (u, v) \in R,$$
    where
      $$n^-(u) = \begin{cases} 1 & u = s \\ \sum_{v : vRu} n^-(v) & \text{otherwise} \end{cases}$$
    and
      $$n^+(u) = \begin{cases} 1 & u = t \\ \sum_{v : uRv} n^+(v) & \text{otherwise.} \end{cases}$$
    This is the basis of an efficient algorithm for computing n(u, v): after a topological sort of the graph we can compute the weights, using the above relations in topological order, in time of order O(m), m = |R|. The topological order ensures that all the quantities on the right-hand sides of the above equalities are already computed when needed.
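
The two recurrences translate into one forward and one backward sweep over a topological order. The following is a sketch rather than the Pajek implementation; it assumes a standardized network with simple arcs, given without the feedback arc (t, s), and the function name is ours.

```python
from collections import defaultdict, deque

def spc_counts(arcs, s="s", t="t"):
    """Return n(u, v) for every arc (u, v) of a standardized acyclic network."""
    succ, pred = defaultdict(list), defaultdict(list)
    for u, v in arcs:
        succ[u].append(v)
        pred[v].append(u)

    # Topological order (Kahn's algorithm); s is the only vertex without predecessors.
    indeg = {v: len(us) for v, us in pred.items()}
    order, queue = [], deque([s])
    while queue:
        u = queue.popleft()
        order.append(u)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)

    n_minus = {s: 1}                       # number of paths from s to u
    for u in order:
        if u != s:
            n_minus[u] = sum(n_minus[v] for v in pred[u])
    n_plus = {t: 1}                        # number of paths from u to t
    for u in reversed(order):
        if u != t:
            n_plus[u] = sum(n_plus[v] for v in succ[u])
    return {(u, v): n_minus[u] * n_plus[v] for u, v in arcs}

print(spc_counts([("s", "a"), ("s", "b"), ("a", "b"), ("a", "t"), ("b", "t")]))
```

Every arc is touched a constant number of times, which is where the O(m) bound comes from. Python integers are unbounded, so the counts cannot overflow here; in a fixed-width language this would need care.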
  • 379. Vertex weights The quantities used to compute the arc weights w can also be used to define the corresponding vertex weights t:
      $$t_c(u) = n^-(u) \cdot n^+(u), \qquad t_l(u) = n_l^-(u) \cdot n_l^+(u), \qquad t_p(u) = n_p^-(u) \cdot n_p^+(u).$$
    They count the number of paths of the selected type through the vertex u. Pajek: Network/Acyclic Network/Citation Weights
  • 380. Properties of SPC weights The values of the counters n(u, v) form a flow in the citation network: Kirchhoff’s vertex law holds. For every vertex u in a standardized citation network, incoming flow = outgoing flow:
      $$\sum_{v : vRu} n(v, u) = \sum_{v : uRv} n(u, v) = n^-(u) \cdot n^+(u) = t_c(u).$$
    The weight n(t, s) equals the total flow through the network and provides a natural normalization of weights
      $$w(u, v) = \frac{n(u, v)}{n(t, s)} \;\Rightarrow\; 0 \le w(u, v) \le 1,$$
    and if C is a minimal arc-cut-set, then $\sum_{(u, v) \in C} w(u, v) = 1$. In large networks the values of the weights can grow very large. This should be considered in the implementation of the algorithms.
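
A short sketch of the normalization and the cut property, on a hand-made set of path counts. Two assumptions are ours: the total flow is read off as the outflow of s (the feedback arc is not stored in the dictionary), and exact fractions are used so that the cut sum can be compared with 1 exactly.

```python
from fractions import Fraction

def normalize_spc(counts, s="s"):
    """counts maps each arc (u, v) to its path count n(u, v)."""
    # Assumption: total flow = outflow of the source s, which in the
    # standardized network is the value carried by the arc (t, s).
    total = sum(n for (u, _), n in counts.items() if u == s)
    return {arc: Fraction(n, total) for arc, n in counts.items()}

# Path counts for the toy network s->a, s->b, a->b, a->t, b->t.
counts = {("s", "a"): 2, ("s", "b"): 1, ("a", "b"): 1,
          ("a", "t"): 1, ("b", "t"): 2}
w = normalize_spc(counts)
cut = [("s", "b"), ("a", "b"), ("a", "t")]   # separates {s, a} from {b, t}
print(sum(w[arc] for arc in cut))
```

Working with exact integers and fractions sidesteps the growth problem mentioned on the slide; with floating point one would typically switch to logarithms or normalize early.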
  • 381. SN5: CPM main path [Figure (Pajek): CPM main path in the SN5 citation network; vertex labels range from KENDALL_M{1939}31:324 to SALES-PA_M{2007}104:15224]
  • 382. SN5: Main island in citation network for SPC [Figure (Pajek): main island in the SN5 citation network for SPC weights]
  • 383. Bibliographic Coupling The bibliographic coupling network biCo (Kessler 1963) can be determined as (u Ci v ≡ u cites v):
      $$\mathrm{biCo} = \mathrm{Ci} * \mathrm{Ci}^T$$
    $bico_{pq}$ = # of works cited by both works p and q, and $bico_{pq} = bico_{qp}$. Again we have problems with works with many citations, especially with review papers. To neutralize their impact we can introduce a normalized measure such as ($n\mathrm{Ci}_{pq} = \mathrm{Ci}_{pq} / \max(1, \operatorname{outdeg}_{\mathrm{Ci}}(p))$)
      $$\mathrm{biCon} = \tfrac{1}{2} \big( n(\mathrm{Ci}) * \mathrm{Ci}^T + \mathrm{Ci} * n(\mathrm{Ci})^T \big)$$
    It is easy to verify that $bicon_{pq} \in [0, 1]$ and $bicon_{pq} = bicon_{qp}$ (symmetry). It also holds: $bicon_{pq} = 1$ iff the works p and q are referencing the same works. The $cC_{pq}$ element of the first term represents the ’importance’ of common (p, q)-citations for the work p, and the $Cc_{pq}$ element of the second term represents the ’importance’ of common (p, q)-citations for the work q. It holds that $bicon_{pq} = \tfrac{1}{2}(cC_{pq} + Cc_{pq}) = \tfrac{1}{2}(cC_{pq} + cC_{qp})$.
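
For small examples the two matrices can be computed directly from reference sets, without any matrix library. This is a sketch for illustration only: the dictionary-of-sets representation of Ci and the function name are assumptions, and the quadratic double loop is not meant for networks of the size of SN5.

```python
from fractions import Fraction

def bibliographic_coupling(cites):
    """cites maps each work to the set of works it cites (the rows of Ci)."""
    works = sorted(cites)
    bico, bicon = {}, {}
    for p in works:
        for q in works:
            common = len(cites[p] & cites[q])          # (Ci * Ci^T)[p, q]
            bico[p, q] = common
            cC = Fraction(common, max(1, len(cites[p])))   # share of p's references
            Cc = Fraction(common, max(1, len(cites[q])))   # share of q's references
            bicon[p, q] = (cC + Cc) / 2
    return bico, bicon

cites = {"p": {"x", "y"}, "q": {"x", "y"},
         "r": {"y", "z", "w", "v"}, "e": set()}
bico, bicon = bibliographic_coupling(cites)
print(bico["p", "r"], bicon["p", "r"], bicon["p", "q"])
```

Here p and q have identical reference lists, so their normalized coupling is 1, while the long reference list of r dilutes its coupling with p.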
  • 384. Normalized bibliographic coupling in SN5 In the network biCon(SN5) the larger components with edges with bicon = 1 correspond to papers with a single reference to a book (Wasserman, S., Faust, K.: Social network analysis. Cambridge UP, 1994; Taylor, Howard F.: Balance in small groups. Van Nostrand Reinhold, 1970; Belle, D.: Children’s social networks and social supports. Wiley, 1989; Gottlieb, B. H.: Social networks and social support. Sage, 1981; Yan, Yunxiang: The flow of gifts. Stanford UP, 1996; Zhang, L.: Strangers in the City. Stanford UP, 2001). There are also several pairs of papers with bicon = 1, mostly written by the same author. We can obtain more interesting groups as larger islands with values below 1. We obtain 19 islands of size in [10, 50] on 290 vertices. A selection from this set is displayed on the next slide.
  • 385. Selected biCon(SN5) islands [Figure: selected islands in the network biCon(SN5)]
  • 386. Co-Citation and others The co-citation network coCi (Rosengren 1968, Small 1973) can be determined as
      $$\mathrm{coCi} = \mathrm{Ci}^T * \mathrm{Ci}$$
    $coci_{pq}$ = # of works citing both works p and q, and $coci_{pq} = coci_{qp}$. It holds that $\mathrm{coCi}(N) = \mathrm{biCo}(N^T)$, and also for the corresponding normalized networks $\mathrm{coCin}(N) = \mathrm{biCon}(N^T)$. The weight $aci_{ap}$ in the author citation network
      $$\mathrm{ACi} = \mathrm{AW} * \mathrm{Ci}$$
    counts the number of times author a cited work p. The author co-citation network can be obtained as
      $$\mathrm{ACo} = b(\mathrm{ACi}) * t(b(\mathrm{ACi}))$$
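
Co-citation counts can be accumulated one reference list at a time: every pair of works that appears together in some reference list gains one. A minimal sketch, with the dictionary-of-sets input and the function name as our own assumptions; only off-diagonal entries are produced.

```python
from collections import defaultdict
from itertools import combinations

def co_citation(cites):
    """cites maps each work to the set of works it cites (the rows of Ci)."""
    coci = defaultdict(int)
    for refs in cites.values():
        # Each citing work contributes 1 to every pair of its references.
        for p, q in combinations(sorted(refs), 2):
            coci[p, q] += 1
            coci[q, p] += 1                # the measure is symmetric
    return dict(coci)

cites = {"a": {"x", "y"}, "b": {"x", "y", "z"}, "c": {"z"}}
print(co_citation(cites))
```

Running the same routine on the transposed relation (each work mapped to the set of works citing it) yields bibliographic coupling counts instead, which is the identity coCi(N) = biCo(N^T) from the slide.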
  • 387. Authors’ citations network [Figure: author i → work s (weight $wa_{si}$) → work t (weight $ci_{st}$) → author j (weight $wa_{tj}$), i.e. the chain A → W → W → A through AW, Ci and WA.] $Ca = AW * Ci * WA$ is a network of citations between authors. The weight $ca_{ij}$ counts the number of times a work authored by i is citing a work authored by j. Similarly: $Ck = KW * Ci * WK$ and $Cj = JW * Ci * WJ$.
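A small worked example may make the product $AW * Ci * WA$ concrete. This is a sketch on invented data (three works, two authors); the helper names are hypothetical, and AW is taken to be the transpose of the works × authors matrix WA.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


# Hypothetical citation matrix Ci: ci[p][q] = 1 if work p cites work q.
ci = [
    [0, 0, 0],  # work 0 cites nothing
    [1, 0, 0],  # work 1 cites work 0
    [1, 1, 0],  # work 2 cites works 0 and 1
]

# Hypothetical authorship matrix WA: wa[p][i] = 1 if author i wrote work p.
wa = [
    [1, 0],  # work 0 by author 0
    [0, 1],  # work 1 by author 1
    [1, 1],  # work 2 by authors 0 and 1
]

aw = transpose(wa)                 # AW is assumed to be WA transposed
ca = matmul(matmul(aw, ci), wa)    # Ca = AW * Ci * WA

print(ca)
```

Here `ca[1][0]` is 2: author 1 co-wrote work 1 and work 2, and each of them cites work 0, which author 0 wrote. Note that a citation from a co-authored work is counted once for every (citing author, cited author) pair.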
  • 388. Islands in SN5 authors’ citation network [Figure: Pajek drawing of the islands in the SN5 authors’ citation network; nodes are authors, e.g. WASSERMA_S, FREEMAN_L, NEWMAN_M, BARABASI_A, OSTROM_E.]
  • 389. SN5: Acyclic structure of two main islands [Figure: Pajek drawing of the acyclic structure of the two main islands; nodes are authors.]
  • 390. Normalized version of authors’ citation network We define the normalized authors’ citation network as $Can = n(WA)^T * Ci * n(WA)$. We have $\sum_{i,j \in A} can_{ij} = \sum_{i \in A} \sum_{j \in A} \sum_{p \in W} \sum_{q \in W} \frac{wa_{pi}}{\mathrm{outdeg}(p)} \cdot ci_{pq} \cdot \frac{wa_{qj}}{\mathrm{outdeg}(q)} = \sum_{p \in W} \sum_{q \in W} \frac{ci_{pq}}{\mathrm{outdeg}(p)\,\mathrm{outdeg}(q)} \sum_{i \in A} wa_{pi} \sum_{j \in A} wa_{qj} = \sum_{p \in W} \sum_{q \in W} ci_{pq} = |A_{Ci}|$. The citation arcs are distributed among authors. Similarly $Ckn = n(WK)^T * Ci * n(WK)$ and $Cjn = n(WJ)^T * Ci * n(WJ)$.
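The identity above, that the weights of Can add up to the number of citation arcs, can be verified numerically. The sketch below uses invented data and hypothetical helper names, and it assumes that n(WA) divides each row of WA by the number of authors of that work (outdeg(p) in WA) and that every work has at least one author.

```python
from fractions import Fraction


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def normalize_rows(m):
    """Assumed reading of n(M): divide each row by its sum (rows must be nonzero)."""
    return [[Fraction(x, sum(row)) for x in row] for row in m]


# Hypothetical data: ci[p][q] = 1 if work p cites work q; wa[p][i] = 1 if author i wrote p.
ci = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]
wa = [[1, 0], [0, 1], [1, 1]]

n_wa = normalize_rows(wa)
can = matmul(matmul(transpose(n_wa), ci), n_wa)   # Can = n(WA)^T * Ci * n(WA)

total = sum(sum(row) for row in can)              # sum of all can_ij
arcs = sum(sum(row) for row in ci)                # number of citation arcs

print(can, total, arcs)
```

Exact fractions are used so the comparison between `total` and `arcs` is not affected by floating-point rounding. In this example each citation from the co-authored work 2 is split in half between its two authors, which is what "distributed among authors" means here.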
  • 391. Layering and topics preserving network In analysis of citation networks the operation $N_A$ £ $N_B$ of layering network $N_A$ over network $N_B$ turns out to be useful. Let $N_A = (V, A_A, w_A)$ and $N_B = (V, A_B)$; then $N_C = (V, A_C, w_C)$ = $N_A$ £ $N_B$, where $A_C = A_A \cap A_B$ and $(u, v) \in A_C \Rightarrow w_C(u, v) = w_A(u, v)$ (Networks/Cross-Intersection/First). Using the layering operation we can define the topics preserving network $CiK = (WK * WK^T)$ £ $Ci$. The weight $cik_{pq}$ = number of keywords common to the citing work p and cited work q. For all the introduced networks the normalized versions could/should be developed.
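The layering operation and the topics preserving network can be sketched on dense matrices, where a zero entry stands for "no arc". The data and the function name `layer` are invented for illustration; this is not the Pajek implementation.

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def layer(a, b):
    """Layering of a over b: keep a's weight wherever both a and b have an arc."""
    return [
        [wa if (wa != 0 and wb != 0) else 0 for wa, wb in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


# Hypothetical data: ci[p][q] = 1 if work p cites work q.
ci = [[0, 0, 0], [1, 0, 0], [1, 1, 0]]

# Hypothetical works x keywords matrix: wk[p][k] = 1 if work p has keyword k.
wk = [[1, 1, 0], [0, 1, 1], [1, 1, 1]]

shared = matmul(wk, transpose(wk))  # WK * WK^T: keywords shared by each pair of works
cik = layer(shared, ci)             # CiK: shared-keyword counts on citation arcs only

print(cik)
```

Only pairs connected by a citation survive the layering, so `cik[2][0]` is 2 (work 2 cites work 0 and shares two keywords with it), while the diagonal of `shared` is dropped because no work cites itself in this example.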
  • 392. Triple Helix for Social Innovation: Saemaul Movement for Eradicating Poverty Wha-Joon Rho Emeritus Professor of Seoul National University
  • 393. □ Introduction •There have been many studies of the Saemaul Movement, but only a few have examined it as social innovation; researchers have only recently begun to do so. “The Role of Saemaul Leaders as Social Innovator” (2013) is a good example. When we view the Saemaul Movement as social innovation, we can correctly understand its characteristics and build a solid foundation for theorizing about it. •This study clarifies the characteristics of the Saemaul Movement as social innovation and analyzes how three groups of actors (the chief policymaker and his aides, central and local government officials, and Saemaul leaders) successfully drove the movement through their mutual interactions. Based on this empirical analysis, I develop an actor-based triple helix model to explain how the Saemaul Movement was successfully carried out in Korea during the 1970s, and I argue that an actor-based triple helix model has more persuasive power in explaining how social innovation is successfully promoted.
  • 394. □Policies to eradicate poverty during the 1960s and their lessons ○Structural change and poverty problems during the 1960s •The most important characteristic of the Korean population distribution was that more than 72% of Koreans lived in rural communities in the early 1960s; this share fell rapidly to 50% by 1970. •Per capita income in Korea was $79 in 1960 and $203 in 1970. In this situation, the most crucial social problems Korean society faced were poverty, poor living conditions and the absence of a spirit to overcome hardship. •The incomes of rural people were low relative to those of urban people. •Per capita disposable income of workers in Korea was $83.8 in 1965, while that of rural community workers was $68.4, about 81.6% of the figure for all workers nationwide. This means that poverty in rural society was very severe during the 1960s. An even more severe problem was that farmers' debts were increasing very fast.
  • 395. From 1962 to 1969, farmers’ debts increased by an average of 16.4% annually. As a result, average farmers' debts, which were $19 in 1962, reached $50 in 1969; over the 8 years, debts increased about 2.6 times. Rapidly worsening poverty was therefore the most urgent problem Korean society had to solve at that time. ○Policy endeavors for eradicating poverty and developing rural communities during the 1960s •During the 1960s, the Korean government tried to solve the poverty prevailing in Korean society by introducing policy measures aimed at reducing the burden of rural people’s high-interest loans. The Korean government also adopted the People’s Movement for the National Reconstruction (PMNR).
  • 396. •The policy measures aimed at reducing the burden of rural people’s high-interest loans were evaluated as only partly successful: rural incomes could not keep increasing because the measures were not accompanied by new methods of income generation. •The policy reform aimed at reducing the burden of high-interest loans and raising the incomes of poor people was therefore evaluated as less successful than expected. The main reasons were that the collection of the capital funds used to consolidate high-interest private loans was unsatisfactory and that farmers evaluated the results of the consolidation negatively (Lee, 1984: 355). ○People’s Movement for the National Reconstruction (PMNR) •After the military coup d'état of May 16, 1961, the group leading the coup adopted the PMNR in order to carry out the ideology of the military revolution with the support of the whole nation.
  • 397. •To familiarize everyone with the new image of the nation, the main power group of the military coup d'état energetically promoted educational projects to instill a new image of the nation and its people in people’s minds. They spent about 30 percent of the PMNR's whole budget on educational programs, concentrating on them as the premise for developing all the other PMNR programs. •They established several training institutes. At headquarters, a central training institute was established to train instructors for cities and counties, and in the provinces, regional training institutes were established to train instructors for towns and myeons (sub-county units). The general public was taught spiritual enlightenment and the development of democracy. •However, the PMNR could not accomplish what it set out to do. The main reason was that it tried to read the minds of the powerful government organizations pushing the movement forward and was promoted in a bureaucratic way (Hong, 1965: 164).
  • 398. ○ Lessons learned through the poverty eradication movement •Among the policy measures of the 1960s to eradicate poverty, the measures aimed at reducing the burden of rural people’s high-interest loans (the people’s loan reduction program) were target-oriented. The PMNR, on the other hand, was a social movement approach that sought to solve the poverty problem through the spiritual enlightenment of the people, relying only on spiritual education for the general public. •The people’s loan reduction program could not contribute greatly to reducing farmers’ poverty because it was carried out without any connection to measures for increasing farmers’ incomes. The lesson of the program was that even if a policy measure succeeds in removing one aspect of the poverty problem, it cannot eradicate poverty in the long run unless it is accompanied by measures to increase farmers' incomes.
  • 399. •The PMNR, on the other hand, offered a different lesson: spiritual education of the general public through a bureaucratic top-down approach alone could produce neither the spiritual enlightenment of the mass of people nor interaction resonance with the leaders of the PMNR. A much more important lesson was that for a movement with the characteristics of social innovation, like the PMNR, to succeed, the chief policymaker must present a vision to lead the social movement and front-line leaders must act positively as social innovators.
  • 400. □Approaches to the Saemaul Movement during the 1970s and its characteristics as social innovation ○Why can the Saemaul Movement be regarded as a social innovation? •Social innovation is a novel solution to a social problem that is more effective, efficient, sustainable, or just than existing solutions and for which the value created accrues primarily to society as a whole rather than private individuals. This view of value creation puts a great deal of weight on the difference between the social and private problems to be solved on the one hand, and the social and private value created as a consequence of the novel solution on the other (Phills, Deiglmeier and Miller, 2008: 34-43; Auerswald, 2009: 52). •During the 1960s and early 1970s, when the Saemaul Movement began, the most crucial social problems Korean society faced were poverty, poor living conditions and the absence of a spirit to overcome those hardships. Making a better-off society was therefore the most urgent task in Korea at that time, and fostering the spirit to be well off was the most important task for social innovation. •During the 1970s, therefore, the social innovation task in solving the problems Korean rural villages faced was how to create and build up the new social capital that was lacking in those days.
  • 401. ○ Saemaul spirit as social capital •The Saemaul Movement aims to make a better-off society and a livable community. •The Saemaul spirit is diligence, self-help and cooperation, and the ‘Can Do’ spirit is the social capital for solving social problems and forming the foundation for social development. •Social capital is also needed to make use of farming technologies and to create a clean living environment for community wellbeing. •Therefore, the Saemaul Movement as an antipoverty policy aimed to form and create social capital for the development of Korean society during the 1970s in order to solve the poverty problem. •In order to create social capital, the Saemaul Movement needed to adopt new approaches and new systems for making and carrying out Saemaul-related policies. •The new approach was an integrated system model, and the new carrying-out system was a triple helix model. Because the Saemaul Movement adopted new approaches and new systems to carry out Saemaul-related policies, it can be regarded as a social innovation.
  • 402. ○ An integrated system model of social belief, technology use and environmental improvement •The Saemaul Movement sought to pursue social values that make one's own village the best place to live. These social values include diligence, self-help and cooperation as well as trust and creativity. •The Saemaul Movement model that inspired the emergence of these social values was an integrated system model combining social belief, use of technology, improvement of the living environment and development of agricultural knowledge & technology, as shown in [Figure 1].
  • 403. [Figure 1] Integrated system model of social belief, use of technology and living environment. Elements shown: Better society (affluent rural village out of poverty); Social belief and personal attitude; Building infra for technology use; Living environment; Knowledge & technology for agricultural production; Governance system.
  • 404. •The People’s National Reconstruction Movement (PNRM), which put emphasis on reforming the rural mentality, ended in failure because the government failed to provide economic incentives sufficient to get the movement off the ground. Another attempt, called the Special Project for Rural People’s Income Increase, was made in the latter part of the 1960s. It could not achieve considerable results either, mainly because it put emphasis on economic aspects only, neglecting the spiritual element involved. These two failures provided an empirical foundation on which both spiritual and economic aspects could be integrated into the Saemaul Undong (Goh, 2010: 35).
  • 405. ○ Triple Helix Model to carry out Saemaul Policies • The success of the Saemaul Movement resulted from active interactions and mutual influences among the chief policymaker and his aides, central and local government officials, and Saemaul leaders in rural areas, as shown in [Figure 2]. [Figure 2] Triple Helix Model for Saemaul Movement. Three strands, each set within its environment and context: Transformative & generative leadership of the president; Planning & maintenance by the administrative officials; Leadership of Saemaul leaders as well as other leaders in the local communities.
  • 406. • In the Saemaul Movement, the three entities each played their own role: the president and his aides offered a new vision and strategies; central and local government officials planned the Saemaul projects, developed incentive systems and conducted result-oriented management; and Saemaul leaders and other local community leaders designed new projects to resolve difficult community problems, acting as catalysts, positive deviants (PD) and supporters of new farming work.
  • 407. □Mutual interactions and the roles of the chief policymaker, central and local government officials and Saemaul leaders in carrying out Saemaul policies ○ Major role of the chief policymaker • The president exerted transformative and generative leadership to support the Saemaul Movement in many ways, including: ① Initiating the Saemaul Movement; ② Proposing a vision for transformation; ③ Modifying administrative systems to help drive the Saemaul Movement successfully; ④ Supporting the budget, which was the energy needed to push the movement forward; ⑤ Nurturing Saemaul leaders who would promote the movement at the local community level; ⑥ Holding Saemaul Cabinet meetings on a regular basis to identify and remove obstacles to the Saemaul Movement; and ⑦ Hosting conferences where Saemaul leaders presented their best practices, with the aim of motivating Saemaul leaders.
  • 408. ○ Generative Leadership • The primary objective of generative leadership in facilitating emergence is to foster and amplify novelty generation within an ecology of innovation. • In an organization or a local community, emergence is closely related to the idea of self-organization. • Complexity researchers have found that emergence requires the presence of a substrate order that can be transformed, as well as structures that contain or channel the emergence processes. When these ideas are applied to similar emergent processes in organizations, enabling emergence becomes an active, not a passive, leadership endeavor that requires the right conditions and constraints. Providing these is the role and function of generative leadership. • Therefore, the president as chief policymaker played a key role as a generative leader in carrying out Saemaul policies, facilitating the emergence of the Saemaul spirit.
  • 409. ○ Administrative structure and the role of government officials for Saemaul Movement •During the 1970s, the government set up an organizational arrangement built on a series of committees. A series of councils was formed with government agencies from the central government down to local governments, as shown in [Figure 3]. These organizations developed plans to carry out and support the Saemaul Movement and implemented them. •The Ministry of Home Affairs (MHA) integrated and coordinated government policies. The new Saemaul Movement Central Consultative Council promoted and managed overall planning. The upper-level councils gave the lower-level councils plans and guidance for carrying out the projects. •The lower-level councils reported the results of the Saemaul Movement in their own jurisdictions and had the authority to ask the upper-level councils for government support for projects. This organizational arrangement, which covered all the related government agencies, helped enhance coordination and information sharing among the authorities concerned, as well as efficient planning and implementation of the Saemaul projects (Whang, 1983; Eom, 2012).
  • 410. [Figure 3] Organizational arrangement for the Saemaul Movement
    Ministry of Home Affairs / Saemaul Movement Central Consultative Council (SMCCC): Interagency coordination of Saemaul programs: setting standard procedures for investments and loans to programs; setting guidance and promotion for the movement.
    Provincial Government / Provincial Level Council: Implementation of the SMCCC’s decision; overall long-range planning in the province; interagency coordination for implementation; analysis of program success and failure.
    County Government / County Level Council: General guidance of county programs, training program for residents, area investment plan and evaluation of program implementation.
    Myon Level Office / Myon Level Council: Diagnosis of residents’ wishes, detailed investment plan; interagency coordination for program implementation, examination of area particulars and its linkage to the Saemaul plan.
    Village Saemaul Leader / Village Development Committee: Village development plan; decision on new project implementation; management of village assets; day-to-day consultation on Saemaul projects and village affairs.
    Source: Saemaul Movement: From Beginning to Present, Ministry of Home Affairs, 1973, p.37, and Kim Hae-Dong (et al.), An Evaluation and Field Experimentation of Saemaul Movement, op.cit., pp.20-34; Kim, Young-Pyoung (2013), p.68.
  • 411. •Each level of government was also accountable for coordinating the activities of the lower-level government as well as for delivering feedback from the bottom. In addition, local administrations monitored the results and achievements of the Saemaul projects in the villages in their own jurisdictions. •The central government carried out budget allocation. The government chose the village as the strategic unit of community action; that is, villages rather than individual farmers were chosen as the target of support for rural modernization projects (Goh Kun, 2010: 33-34). •The Korean government’s material support was designed to work as a common resource for the infrastructure projects of a unit village. This collective aid could help generate villagers’ enthusiasm for self-help and voluntary cooperation (Goh Kun, 2010: 34). •Government support was designed to spark and continuously stimulate farmers’ motivation to participate in self-help rural development, not to foster their dependency on the government. The support was provided in a careful, strategic and steady manner for the entire period of the movement (Goh Kun, 2010: 43).
  • 412. •The more successful a village's performance in implementing Saemaul projects, the more support it receives from the government. Villages are classified into three categories: basic, self-help and self-sufficient. They can be promoted on the basis of good evaluation scores. Government authorities evaluate each village's Saemaul performance every year in accordance with predetermined criteria. Each village is open to Saemaul competition with neighboring villages. The more participative the villagers are, the more successful their Saemaul Movement, and the better off are all the villagers (Goh Kun, 2010: 61; Kim Young-Pyung, 2013: 61). •The evaluation criteria for village classification and the standard projects required for promotion are shown in [Table 1]. •The government established three stage goals to attain through the Saemaul Movement: the basis-building stage, the self-help development stage and the self-supporting stage, as shown in [Table 2].
  • 413. [Table 1] Criteria for village classification and required standard projects for promotion
    Projects | Basic | Self-help | Self-sufficient
    Village road | Main village road | Branch village road | -
    Farm road | Village entry farm road | Cultivation farm road | -
    Small river | Small river inside village | Small river between village | Small and medium size river outside village
    Agricultural water | Irrigation 70% | Irrigation 70% | Irrigation 85%
    Agricultural machine | - | Power driven machine for prevention of the breeding | Power tiller, power threshing machine
    Cooperative farming | Cooperative work team | Cooperative production team | Cooperative production team
    Village fund | $1,200 per village | $2,000 per village | $4,000 per village
    Income per household | $2,000 per household | $3,200 per household | $5,600 per household
    Source: Ministry of Home Affairs (1980), p.215.
  • 414. [Table 2] Goal attainment stages of Saemaul Movement
    Baseline formation (1971-1973) | Basic village: 30%; Self-help village: 60%; Self-sufficient village: 10%
    Self-help Development (1974-1976) | Self-help village: 60%; Self-sufficient village: 40%
    Independent (1977-1981) | Self-sufficient village: 100% (Income increase only through the self-help endeavors)
    •With limited resources for development, village competition proved to be a good means for the government to know where to invest. The government concentrated its support on those villages which were competitive and positive toward the Saemaul projects. •At the end of 1972, the number of self-sufficient villages was 2,307 (7%), that of self-help villages was 13,943 (40%) and that of basic villages was 18,415 (53%). At the end of 1975, the number of self-sufficient villages was 10,049 (29%), that of self-help villages was 20,936 (60%) and that of basic villages was 4,046 (11%). Finally, at the end of 1979, the number of self-sufficient villages was 33,893 (97%), and that of self-help villages was 976 (3%) (Oh Ryue-Suck, et al. 2008: 17).
  • 415. ○ Implementation of Saemaul Policies by the integrated model and roles of the Saemaul leaders •The model for the Saemaul Movement and the roles of Saemaul leaders were closely related, so we first need to look at the integrated model in order to analyze the role of Saemaul leaders. Please refer to [Figure 1] above for the integrated model. •Saemaul leaders drove the movement forward following this integrated model. In doing so, they played the roles of catalyst, experimenter and supporter of novel agricultural cultivation methods, and example-setter and illuminator embodying the spirit and attitudes of modernity. •In driving the Saemaul Movement, Saemaul leaders played different roles depending on the four purposes of the movement included in the integrated system model.
  • 416. •The main responsibilities of Saemaul leaders were to coach other village leaders and residents so that they could build their own capability to plan and conduct Saemaul projects, learn and understand the importance of Saemaul projects and their deployment methodologies, translate that knowledge into action, and by doing so take ownership of the Saemaul projects. The coaching activities of Saemaul leaders were four-fold, as shown in [Figure 4]. [Figure 4] The goals of Saemaul project and coaching role. Core goals of Saemaul Movement: Build infrastructure to use technology and Improve living environment system (role as catalyst); Develop new knowledge & technology (role as supporter for new experiments and ideas); Spiritual enlightenment, i.e. Saemaul Spirit (role as positive deviant).
  • 417. • In pursuit of the four goals of Saemaul projects, the coaching strategies of Saemaul leaders aimed to improve the knowledge of other village leaders and residents through learning, and to empower and help them to translate what they learned into action, so that they could take ownership of the Saemaul projects. • In these processes, the key contents of education included: ① ways to recognize needs; ② ways to determine the current situation; ③ ways to recognize issues; ④ ways to develop and pursue plans to solve the recognized issues; ⑤ ways to seek help when they felt a lack of knowledge or capability while pursuing the projects; ⑥ ways to identify strategies, to implement projects, etc.
  • 418. • The key strategy for implementing Saemaul projects was to bring scattered villagers together to cooperate. For example, Saemaul leaders encouraged villagers to exchange information and to support and cooperate with each other in implementing Saemaul projects such as replacing thatched roofs with tiled roofs, building new sewer systems, renovating traditional-style kitchens into western-style ones and setting up orchards. • The roles played in implementing Saemaul projects such as building infrastructure to use technology and improving living environment systems were those of a catalyst. • The roles played in developing knowledge and technology for new agricultural cultivation methods were those of experimenter, introducer and supporter. Vegetable cultivation in vinyl greenhouses tried by Saemaul leaders (Mr. Yu, Young-mo and Ha, Sa-yong) and a new agricultural method of creating a chestnut tree complex tried by a woman Saemaul leader (Ms. Jung, Mun-ja) are good examples.
  • 419. • Saemaul leaders thus played the roles of example-setter and illuminator in transforming the pre-modern mindsets and attitudes of village residents, which the leaders perceived as the largest cause of poverty in rural villages, into modern ones. • We can see that the Saemaul leaders did not force or direct village residents to transform their pre-modern mindsets and behaviors into modern ones. Rather, the leaders set examples as positive deviants with a modern mindset and persuaded people to participate and interact with them, which resulted in interaction resonance.
  • 420. □Inspiration to motivate participation in the Saemaul Movement and lessons learned from the emergence of the Saemaul spirit •The goal of the Saemaul Movement was to construct a good society to live in. This ultimate goal could be achieved only when rural people fully agreed with the goal presented by the chief policymaker and actively participated in the movement. No matter how lofty and grand the vision and values of the Saemaul Movement were, the goal could not be realized if the rural people who were its target did not participate enthusiastically. •How, then, could the core leading group of the Saemaul Movement inspire the rural people who were the target of the movement to take part enthusiastically? We can find a clue in the marketing theory of Simon Sinek. Sinek developed the golden circle model to explain buying behavior in the market. As the media for inspiring people’s desire to buy goods, he took WHY, HOW TO, and WHAT, and established relationships among them as shown in [Figure 5]. A manufacturing company’s general strategy is to induce people to buy its products by publicizing the WHAT or the HOW TO.
  • 421. [Figure 5] The golden circle WHY HOW TO WHAT •Here, WHAT includes products’ functions, properties, uses, etc. HOW TO includes the production processes such as materials used, techniques, etc. Compared to these, WHY means purposes, beliefs, or causes for which people or company exist. •”Marketing messages of most of the companies would move from the outside to inside at the golden circle. It would start with some statement of WHAT the company does or makes, followed by HOW they think they are different or better than the competition, followed by some call to action. With that, the company would expect some behavior in return, in this case a purchase”(Sinek, 2009: 40). •Generally, the communication is organized in an attempt to convince someone of a difference or superior value. 29
  • 422. • In contrast, the golden circle provides compelling evidence of how much more we can achieve if we remind ourselves to start everything we do by first asking why. To support this argument, Sinek gives various examples, including companies such as Apple and Southwest Airlines and great men such as Dr. Martin Luther King Jr. and John F. Kennedy. He argues that the golden circle offers clear insight into how Apple is able to innovate in so many diverse industries and never lose its ability to do so. It provides a clearer understanding not just of how Southwest Airlines created the most profitable airline in history, but of why the things it did worked. It even gives some clarity as to why people followed Dr. Martin Luther King Jr. in a movement that changed a nation and why we took up John F. Kennedy's challenge to put a man on the moon even after he died. The golden circle shows, Sinek insists, how these leaders were able to inspire action by starting with why instead of manipulating people to act (Sinek, 2009: 38). • Golden circle theory was developed mostly as a marketing theory to explain buying behavior. However, if we develop the theory further by modifying some aspects, it can be used to explain why people decide to participate with enthusiasm in a large-scale social innovation movement such as the Saemaul Movement. 30
  • 423. • In the golden circle shown in [Figure 5], WHY is the vision and purpose of the social innovation movement. In a large-scale social innovation movement such as the Saemaul Movement, the chief policymaker and his aides develop and present the vision and purposes of social innovation. That is, they convey the WHY to the other actor groups by presenting the vision and purposes. Central and local government officials, in contrast, develop plans and incentive systems to carry out the vision and purposes presented by the chief policymaker. This means that government officials develop the HOW TO for realizing the vision and values. WHAT, on the other hand, consists of the projects and activities that front-line leaders carry out as social innovators in cooperation with the people. • As mentioned above, the three actor groups involved in a large-scale social innovation movement play different roles, and they lead the movement through their mutual interactions. In the Saemaul Movement, these three actor groups each played their role and led people through the innovation process. • We have previously analyzed the roles and activities of the three actor groups. That is, the chief policymaker and his aides developed and presented the vision and purposes of the Saemaul Movement; central and local government officials developed and carried out plans and incentive systems to realize the vision and purposes; and Saemaul leaders, as social innovators, acted as catalysts for developing and carrying out rural village modernization projects, supporters of new innovative ideas for farming, and positive deviants for the spiritual enlightenment of rural villagers. 31
  • 424. • In a large-scale social innovation movement like the Saemaul Movement, the WHY in the golden circle served not only to inspire the people who were the objects of the movement but also to inspire the other actor groups who carried out the HOW TO and WHAT functions. In marketing theory, the function of WHY is to inspire buyers and enhance their loyalty to the company’s products. In a large-scale social innovation movement, however, WHY should inspire not only the objects of innovation but also the actor groups responsible for HOW TO and WHAT, raising their loyalty to their roles. With this enhanced loyalty, those actor groups can carry out their roles honestly and effectively. • And if the two actor groups in charge of the HOW TO and WHAT functions are inspired to enhance their loyalty to the vision and purposes of the social innovation movement, the movement can succeed in attaining its goals. • At the 「Saemaul Income Increasing Conference」 on April 26, 1972, President Park Chung-Hee set out his philosophy and beliefs concerning the Saemaul Movement in detail in a script he had handwritten himself. “…3. What is Saemaul Undong(purpose and concept)?.....(3) To put it more easily, Saemaul Undong is a campaign to live a better life. (4) What is a better life? A better life is one where- ∙People escape from poverty, ∙Income increases so that rural communities can become affluent and enjoy an elegant and cultural life, ∙Neighbors share friendship and help one another, and ∙A good and beautiful village to live in is created. ⊙Although it is important to have a good life today, it is a bigger ambition to create a better life for tomorrow and for our offspring. (Let’s discover the philosophy of Saemaul Undong)…..”(Gyeongsangbukdo Saemaul Undong Center, 2012: 10-11). 32
  • 425. • The philosophy and beliefs presented by President Park Chung-Hee clarified the vision and purposes of the Saemaul Movement and the reason why the people, as well as the other two actor groups, should actively participate in it. In particular, President Park’s philosophy and beliefs inspired government officials as well as the Saemaul leaders of rural villages (including the leaders of other cooperative rural villages) to work harder and enhanced their loyalty to the Saemaul Movement. As a result, almost all rural villages were evaluated as having reached the level of self-sufficient villages by the end of the 1970s (Ministry of Home Affairs, 1980). • In this way, the WHY at the center of the golden circle inspired the two other actor groups in charge of HOW TO and WHAT, that is, the actors who developed plans and incentive systems and the actors who implemented programs at the front line, and enhanced their loyalty to the movement. These two actor groups served as the generative power that carried out the Saemaul Movement successfully. • WHY is based on belief, while HOW TO and WHAT are based on belief and knowledge. Therefore, the fact that those in charge of HOW TO and WHAT are inspired by WHY means a fusion of belief and knowledge for carrying out the movement. This fusion and sharing of belief and knowledge minimizes uncertainty, so that the probability of carrying out the movement successfully increases. 33
  • 426. • That is, mutual interaction among the three actor groups based on WHY can enhance the fusion and sharing of belief and knowledge among them. This worked to minimize the entropy of driving the social innovation movement. [Figure 6] Sharing of belief and knowledge among the three actor groups: Why? (chief policymaker and aides), How? (central and local government officials), What? (Saemaul leaders and other village leaders), joined by fusion and sharing 34
  • 427. • As we can see in [Figure 6], the fusion and sharing of WHY, HOW TO and WHAT, that is, the sharing of ideologies and knowledge among the three actor groups driving the social innovation movement, creates interaction resonance during their mutual interactions and can raise the probability that the movement is driven successfully. If the degree of fusion and sharing is low, however, the probability of completing the movement successfully can be much lower. This is the lesson learned from the experience of driving the Saemaul Movement during the 1970s. • That is, if the actor groups in charge of HOW TO and WHAT are inspired by the belief of WHY, their loyalty to the ideologies of the social innovation movement is enhanced, and this in turn joins WHY with HOW TO and WHAT. If the three actor groups who play different roles in the social innovation movement share belief and knowledge, and if interaction resonance arises, then the chances of successfully driving a large-scale social innovation movement are enhanced and the entropy of the shared driving force is much lower. These lessons from the experience of the Saemaul Movement during the 1970s can form the logic of an actor-based triple helix theory for successfully driving a large-scale social innovation movement. 35
  • 428. Reference •내무부(1980a). 「새마을운동 10년사」. •노유경(2012). 「새마을운동에서 코칭 리더십과 혁신행동의 창발 에 관한 연구」, 고려대학교 대학원 교육학 과 석사학위논문. •노화준(2012). 정책학원론: 복잡성과학과의 융합학문적 시각, 서울: 박영사. •노화준(2013). “사회적 혁신가로서의 새마을지도자의 역할”.「새마을운동과 지역개발연구」, 제 19권, 경운대 학교 새마을연구소. •노화준(2013). 한국의 새마을운동: 생성적 리더십과 사회적 가치의 창발, 서울: 법문사. •박진환(1981). “새마을사업의 점화과정”.「새마을운동의 이념과 실제」, 서울대학교 새마을운동 종합연구소. •박진환(1982a). “새마을교육의 회고와 방향 – 새마을교육의 결정요인”. 새마을지도자연수원, 「새마을교육논 문집」, pp. 7-103. •박진환(1982b). “새마을교육의 근원”. 유네스코 한국위원회, 「새마을교육의 이론과 실제 – 새마을교육에 관 한 다 학문적 세미나 보고서」, pp. 51-95. •소진광(2007a). “지역사회 거버넌스와 한국의 새마을운동”. 한국지방자치학회, 「한국 지방자치 학회보」, 제 19권 제3호, pp. 93-176. •소진광(2007b). “아시아 개발도상국에서의 새마을운동 시범사업 성과평가 – 라오스와 캄보디아를 중심으로 -”. 한국지역개발학회, 「한국지역개발학회지」, 제19권 제4호. 36
  • 429. •유병용∙최봉대∙오유석(2001), 「근대화 전략과 새마을운동」, 서울: 벡산서당. •윤영수·채승병(2005). 「복잡계 개론」. 서울: 삼성경제연구소. •최외출(2010). “지도자와 리더십”. 「과학대통령 박정희와 리더십」, MSD 미디어 미래를 소유한 사람들. •최상호(1986). 「새마을 형 사회교육의 효과와 그 관련 변인분석」. 서울대학교 대학원 농업교육과 박사학위논문. •최상호(1977). “새마을지도자의 사회적 배경과 역할동기에 관한 연구”. 「한국농업교육학회지」, 제9권 제1호. •한도현(2010a). “1970년대 새마을운동에서 마을 지도자들의 경험세계: 남성 지도자들을 중심으로”. 한국사회사 학회, 「사회와 역사」, 제88호. •한도현(2010b). “박정희 대통령과 새마을운동”. 이지수 엮음, 「박정희 시대를 회고한다」, 도서출판 선인. •황인정(1980). 「한국의 종합농촌개발 : 새마을운동의 평가와 전망」. 한국농촌경제연구원. •Eom, Seok-Jin (2011a), “The Rural Saemaul Undong Revisited from the Perspective of Good Governance”, The Korean Journal of Policy Studies, Vol 26, No.2. pp.17-43. •Eom, Seok-Jin (2011b), “Synergy between State and Rural society for Development: An Analysis of the Governance System of the Rural Saemaul Undong in Korea”, Korean Observer, Vol. 42, No.4, pp. 583-620. •Eoyang, G.(2002). Conditions for Self-organizing in Human Systems. Unpublished doctoral dissertation, Union Institute and University, Cincinnati, Ohio. 37
  • 430. •Fukuyama, Francis(1999). Social Capital and Civil Society. The Institute of Public Policy, George Mason University, October 1, 1999. •Funnell, Sue C. and Rogers, Patricia J.(2011). Purposeful Program Theory: Effective use of Theories of Change and Logic Models, San Francisco: Jossey-Bass. •Goh, Kun(2010), Saemaul(New Village) Undong in Korea, Saemaul Undong 40th Anniversary International Symposium, Seoul, pp.29-47. •Goldstein, Jeffrey, Hazy, James K. and Benyamin B. Lichtenstein(2010). Complexity and the nexus of Leadership. New York: Palgrave Macmillan. •Fonseca, Jose(2002). Complexity and Innovation in Organization. New York: Routledge. •Hayden, F. Gregory(2006). Policymaking for A Good Society: The Social Fabric Matrix Approach to Policy Analysis and Program Evaluation. New York: Springer Science and Business Media, Inc. 38
  • 431. •Hazy, James K., Goldstein, Jeffrey, and Benyamin B. Lichtenstein.(2007). Complex Systems Leadership Theory. ISCE Publishing. •Johnson, Neil(2009), Simply Complexity: A clear guide to complexity theory, Oxford: Oneworld Publications. •Kim, Hae-Dong(1975), “Relationship between the central and the local government”, Korean Development Policy Studies vol 2. •Kim, Hae-Dong et.al(1979), 「An Evaluation and Field Experimentation of Saemaul Movement in the Republic of Korea」, 서울대 행정대학원, pp.67-101. •Kim, Young-Pyoung(2013), “Korean Saemaul Undong and its Implications for a Rural Development of Emerging Countries”, Journal of Saemaul Undong & Community Development, vol.9, pp.37-72 •Mitchell, Melanie(2009), Complexity: A Guided Tour, Oxford University Press. •Park, Chung Hee(1979). Saemaul: Korea's New Community Movement, Seoul:Korea Textbook Co., Ltd. •Sinek, Simon, Start With Why: How great leaders inspire everyone to take action, Penguin Group, 2009. 39
  • 432. Words and Networks DISC 2013 Jana Diesner, PhD Assistant Professor The iSchool, Department of Computer Science University of Illinois at Urbana-Champaign 1 Text Data Network Data Construction & Analysis Scalable, reliable, robust methods & technologies Network data Applications • Answer substantive questions about networks • Fill databases •Explore future what-if scenarios • Design & evaluate interventions •Input to other computations, e.g. machine learning Data analysis Jana Diesner and Team, UIUC
  • 433. Behavioral Data Network Data Construction & Analysis Scalable, reliable, robust methods & technologies Interaction data Network data Text data Applications • Answer substantive questions about networks • Fill databases •Explore future what-if scenarios • Design & evaluate interventions •Input to other computations, e.g. machine learning Data analysis Big Data Mission Thick Data Develop computational solutions to analyze traces of contextualized behavioral data, i.e. structure AND content, at breadth and depth Jana Diesner and Team, UIUC
  • 434. Approach and Methods Social Science Network Analysis Text Mining/ NLP Machine Learning 5 Example: Better Understanding of the Sudan • Task: Develop, evaluate, apply method and technology for extracting socio-technical network data from large-scale text corpora to answer questions about the Sudan. Diesner J, Tamabyong L, Carley KM (2012) Mapping socio-cultural networks of Sudan from opensource, large-scale text data. Computational and Mathematical Organization Theory (CMOT) Jana Diesner and Team, UIUC
  • 435. From Words to Networks: What node classes to consider? • Who? (people, groups) • Where? (places) • What? (tasks, events) • When? (time) • How? (resources, knowledge) • Why? (beliefs, sentiments, mental models) [Figure: example terms: ICT, RoE, Mission, Conflict, NATO, Vessel] How to find and categorize nodes in text data? Recipe for machine-learning based solution • Get some labeled ground-truth data (BBN corpus) • Build a classifier/model (h) that for every sequence of words (x) and label per word (y) predicts one category per word (y = h (x)), incl. for new and unseen text data • Exploit clues from text data (lexical, syntactic, statistical) • Train and validate model • Get good accuracy (compare to human intercoder reliability) • Apply prediction model to text data (~ 80,000 files) • Link nodes (e.g. based on co-occurrence, proximity) • Et voila: network data Diesner J, Carley KM (2008) Conditional Random Fields for Entity Extraction and Ontological Text Coding. Journal of Computational and Mathematical Organization Theory (CMOT), 14(3), 248 – 262. Jana Diesner and Team, UIUC
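The "link nodes" step of the recipe can be sketched in a few lines. This is a minimal illustration only, assuming the classifier has already produced one list of (entity, category) pairs per sentence; the `tagged_sentences` data and the `cooccurrence_edges` helper are invented for this example and are not the project's actual tooling.

```python
from collections import Counter
from itertools import combinations

# Hypothetical classifier output: one list of (entity, category) pairs per sentence.
# The sentences are toy data made up for illustration.
tagged_sentences = [
    [("Omar al-Bashir", "agent"), ("Khartoum", "location"), ("oil", "resource")],
    [("Omar al-Bashir", "agent"), ("Salva Kiir Mayardit", "agent")],
    [("Salva Kiir Mayardit", "agent"), ("oil", "resource")],
    [("Omar al-Bashir", "agent"), ("Salva Kiir Mayardit", "agent")],
]

def cooccurrence_edges(sentences):
    """Link every pair of distinct entities that appear in the same sentence."""
    edges = Counter()
    for entities in sentences:
        names = sorted({name for name, _ in entities})  # de-duplicate within a sentence
        for a, b in combinations(names, 2):
            edges[(a, b)] += 1  # edge weight = number of shared sentences
    return edges

edges = cooccurrence_edges(tagged_sentences)
for (a, b), weight in edges.most_common():
    print(f"{a} -- {b}: {weight}")
```

Using the sentence as the linking unit is one possible choice; a fixed word window (proximity) would be the other option the slide names.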
  • 436. How to find and categorize nodes in text data? • Model relationship among hidden states (y) as Markov Random Field (MRF) conditioned on observed data (x) (Lafferty et al. 2001) • Compute conditional distribution of entity sequence y and observed sequence x as normalized product of potential functions M_i:
$$M_i(y_{i-1}, y_i \mid x) = \exp\Big(\sum_{\alpha} \lambda_{\alpha} f_{\alpha}(y_{i-1}, y_i, x) + \sum_{\beta} \mu_{\beta} g_{\beta}(y_i, x)\Big)$$
$$p_{\theta}(y \mid x) = \frac{\prod_{i=1}^{n+1} M_i(y_{i-1}, y_i \mid x)}{\Big(\prod_{i=1}^{n+1} M_i(x)\Big)_{\text{start},\,\text{stop}}}$$
(λ_α, μ_β: weights; f_α, g_β: features) • Edge and transition features plus node and emission features • f, g: boolean feature vectors with learned weights • Tool: CRF project page, training data: BBN • Accuracy: Precision 89-90%, Recall 87%, F 88-89%
Results: Social Networks Control: Activity: Degree Centrality Omar al-Bashir Ali Osman Taha John Garang Salva Kiir Mayardit Hosni Mubarak Sadiq al-Mahdi Hassan al-Turabi Abdul Wahid al Nur Yoweri Museveni Kofi Annan Deng Alor 03 04 05 3 3 2 1 2 3 2 1 1 8 10 4 4 7 5 6 5 10 5 6 7 10 9 9 7 8 7 9 4 6 11 11 11 06 1 4 3 2 6 9 10 8 6 5 11 07 08 1 1 3 3 3 4 2 2 9 8 5 7 5 8 7 4 11 10 8 11 10 6 09 1 3 6 2 4 8 9 5 7 11 9 10 1 3 8 2 6 4 5 7 8 11 8 03 04 05 1 1 1 2 3 3 3 2 2 7 10 4 7 4 5 4 7 7 10 9 9 7 5 5 6 6 8 5 8 9 10 10 9 06 1 4 2 3 6 7 7 5 9 9 9 07 08 1 1 4 3 2 6 3 2 6 8 6 7 4 5 11 11 9 10 8 9 10 4 09 1 2 7 3 4 7 5 7 6 7 7 10 1 2 7 3 5 3 7 7 5 7 7 Trust: Triads Omar al-Bashir Ali Osman Taha John Garang Salva Kiir Mayardit Hosni Mubarak Sadiq al-Mahdi Abdul Wahid al Nur Kofi Annan Yoweri Museveni Hassan al-Turabi Deng Alor Close to power: Betweenness Centr. 03 04 05 06 07 08 09 10 Omar al-Bashir 1 1 1 1 1 1 1 1 Salva Kiir Mayardit 6 10 2 5 2 2 2 2 Ali Osman Taha 4 3 3 7 6 7 5 4 John Garang 3 6 5 4 4 6 7 7 Sadiq al-Mahdi 2 8 10 2 7 5 6 3 Abdul Wahid al Nur 8 4 7 8 3 4 3 6 Kofi Annan 7 2 4 3 10 11 8 10 Yoweri Museveni 5 5 9 6 5 9 8 10 Deng Alor 8 10 10 9 9 3 8 5 Hosni Mubarak 8 9 8 11 8 8 4 8 Hassan al-Turabi 8 7 6 10 11 10 8 9 Eigenvector Centr. 
Ali Osman Taha Omar al-Bashir Salva Kiir Mayardit John Garang Hosni Mubarak Kofi Annan Yoweri Museveni Hassan al-Turabi Sadiq al-Mahdi Deng Alor Abdul Wahid al Nur • • • • 03 04 05 1 2 3 3 3 5 7 10 4 2 1 1 4 5 6 8 4 7 9 8 8 5 7 10 6 6 9 11 11 1 10 9 11 06 3 2 1 4 5 6 7 8 9 10 11 07 08 3 3 2 2 1 1 4 4 11 5 6 11 9 6 8 10 7 8 5 7 10 9 09 3 2 1 7 4 11 5 8 10 9 6 10 4 3 1 9 7 1 8 5 6 10 11 President North: Known performer President South: Now established Legacy of religious leaders Presence of neighboring presidents 2003 2004 2005 2007 2010 Darfur conflict Continuous civil war (since 1993) Comprehensive Peace Agreement Garang 1st VP, followed by Kiir Autonomous South Sudan SPLA withdraws from government Votum in South Sudan 10 about Separation Jana Diesner and Team, UIUC
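The conditional random field slide above defines the probability of a label sequence as a normalized product of potential functions. The sketch below works through that idea on a three-word toy input. The label set, the two feature functions and their weights are all invented for illustration, and the normalizer is computed by brute-force enumeration of every labeling rather than by the matrix product a real CRF toolkit would use.

```python
import math
from itertools import product

LABELS = ["O", "PER"]               # hypothetical label set
START, STOP = "<start>", "<stop>"   # boundary states y_0 and y_{n+1}

# Hypothetical learned weights: lambda for edge features, mu for node features.
EDGE_WEIGHTS = {("PER", "PER"): 0.5}                     # f: PER follows PER
NODE_WEIGHTS = {("PER", True): 1.0, ("O", False): 1.0}   # g: label vs. capitalization

def potential(prev_label, label, word):
    """M_i(y_{i-1}, y_i | x) = exp(weighted edge features + weighted node features)."""
    edge = EDGE_WEIGHTS.get((prev_label, label), 0.0)
    node = 0.0 if word is None else NODE_WEIGHTS.get((label, word[0].isupper()), 0.0)
    return math.exp(edge + node)

def path_score(labels, words):
    """Product of the n+1 potentials along one label path, including start and stop."""
    path = [START, *labels, STOP]
    observed = [*words, None]       # the stop state has no observed word
    score = 1.0
    for i in range(1, len(path)):
        score *= potential(path[i - 1], path[i], observed[i - 1])
    return score

def probability(labels, words):
    """p(y | x): path score divided by the sum of scores over all labelings."""
    z = sum(path_score(c, words) for c in product(LABELS, repeat=len(words)))
    return path_score(labels, words) / z

words = ["Salva", "Kiir", "spoke"]
best = max(product(LABELS, repeat=len(words)), key=lambda c: probability(c, words))
print(best, round(probability(best, words), 3))
```

Enumeration grows exponentially with sentence length, so it is only suitable for checking the definition on tiny inputs.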
  • 437. Results: Social Networks Degree Centrality United Nations Rebel Groups Military SPLA Security Council Sudan government Nat. Congress Party African Union Inter. Criminal Court Dinka Churches Triads Military United Nations Rebel Groups SPLA Sudan government Nat. Congress Party African Union Security Council Inter. Criminal Court Churches Dinka 0304 05 4 2 1 1 1 2 2 3 3 # 6 5 5 5 4 3 4 6 6 9 9 8 7 8 # 11 7 9 10 11 7 8 10 06 07 08 1 1 1 3 4 3 2 2 2 4 3 4 5 5 5 6 8 8 8 6 7 7 7 9 11 9 6 9 10 10 10 11 11 09 10 1 5 2 3 4 2 3 1 5 6 9 7 10 4 7 10 6 9 8 8 11 11 0304 05 1 1 1 4 3 2 2 2 4 # 5 3 3 4 5 5 9 10 8 6 6 7 7 7 # 11 8 6 8 9 9 10 11 06 07 08 1 2 1 2 1 4 4 4 2 3 3 3 7 5 7 8 6 6 6 7 10 5 8 9 9 10 5 10 9 8 11 11 11 Betweenness Centr. 09 10 6 1 1 2 4 5 2 4 4 6 9 3 7 9 8 8 3 7 10 11 11 10 Military United Nations SPLA Rebel Groups Sudan government Nat. Congress Party Churches Dinka African Union Inter. Criminal Court Security Council 0304 05 1 1 3 3 6 2 # 3 1 4 2 4 2 4 5 6 9 8 5 7 9 8 5 6 7 8 7 # 11 10 9 10 11 06 07 08 3 1 1 2 3 2 1 2 3 4 7 5 8 4 7 5 5 4 10 6 6 6 8 11 11 10 10 9 9 8 7 11 9 09 10 2 1 1 3 5 2 3 4 6 10 8 7 9 9 11 6 10 5 4 11 7 8 Eigenvector Centr. United Nations Military Rebel Groups Security Council SPLA Sudan government African Union Inter. Criminal Court Nat. 
Congress Party Churches Dinka 0304 05 4 2 1 2 3 3 1 1 4 5 5 2 # 6 5 3 4 7 8 7 8 # 10 6 6 9 10 7 8 9 9 11 11 06 07 08 2 1 2 1 2 1 3 4 3 4 5 4 5 3 5 6 8 7 7 6 9 9 9 6 8 7 8 10 10 10 11 11 11 09 10 1 5 5 2 6 3 2 8 7 1 8 6 4 10 3 7 9 4 10 11 11 9 • Strong presence of various types of armed forces • Strong influence of out of state groups • Within top 10 Sudanese groups: – Dinka, Nuer (ethnic groups/ tribes)
Results: Tribal Networks [Figure: tribal network panels for 2003, 2004, 2005, 2006, 2007, 2008]
Year | Number of tribes | Tribes linked to conflict or war | Intertribal links for pairs linked to conflict or war
2003 | 32 | 38% | 32%
2004 | 44 | 45% | 66%
2005 | 33 | 39% | 40%
2006 | 46 | 50% | 83%
2007 | 47 | 62% | 78%
2008 | 50 | 60% | 65%
2009 | 28 | 68% | 95%
2010 | 27 | 56% | 100%
• High and increasing rate of tribes associated with conflict and/ or war Jana Diesner and Team, UIUC
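The "high and increasing rate" reading of the tribal table can be checked with a simple least-squares trend over the yearly percentages. The sketch below assumes the table rows pair each year with the three values listed just before it in the transcript; the `slope` helper is an illustrative stand-in, not the analysis the authors ran.

```python
# (year, number of tribes, % tribes linked to conflict or war,
#  % intertribal links for pairs linked to conflict or war), as read from the slide.
rows = [
    (2003, 32, 38, 32), (2004, 44, 45, 66), (2005, 33, 39, 40), (2006, 46, 50, 83),
    (2007, 47, 62, 78), (2008, 50, 60, 65), (2009, 28, 68, 95), (2010, 27, 56, 100),
]

def slope(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance

years = [r[0] for r in rows]
tribe_trend = slope(years, [r[2] for r in rows])   # percentage points per year
link_trend = slope(years, [r[3] for r in rows])
print(f"tribes linked to conflict: {tribe_trend:+.2f} points/year")
print(f"intertribal conflict links: {link_trend:+.2f} points/year")
```

With only eight yearly values, the slope is a rough descriptive summary and says nothing about statistical significance.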
  • 438. Results: Conflict Networks • Conflict: Agriculture, Livestock (farmers vs. herders) • War: Land Resource (concept of dar) • Conflict and War: Oil, Civic, Transportation 13 14 How accurate are the results? Hmm, Relation Extraction looks like a nice idea. But how accurate are your results? The F values tell me all I need to know. We fine-tuned our method and technology based on the F-values and feedback from SMEs. But the F only shows the increase in accuracy over a baseline or benchmark. Maybe we need to ask a different question… Jana Diesner and Team, UIUC
  • 439. Generalization Abstraction 1. Mental Models (Spreading Activation) (Collins & Loftus 1975) 2. Case Grammar and Frame Semantics (Fillmore 1982, 1986) 3. Discourse Representation Theory (Kamp 1981) 4. Knowledge representation in AI, assertional semantic networks (Shapiro 1971, Woods 1975) 5. Centering Resonance Analysis (Corman et al. 2002) 6. Mind maps (Buzan 1974) 7. Concept maps (Novak & Gowin 1984) 8. Hypertext (Trigg & Weiser 1986) 9. Qualitative text coding (Grounded Theory) (Glaser & Strauss 1967) 10. Definitional semantic networks incl. text coding with ontologies (Fellbaum 1998) 11. Semantic Web (Berners-Lee et al. 2001, Van Atteveldt 2008) 12. Frames (Minsky 1974) 13. Semantic Grammars (Franzosi 1989, Roberts 1997) 14. Network Text Analysis in social science (Carley & Palmquist 1991) 15. Event Coding in pol. science (King & Lowe 2003, Schrodt et al. 2008) 16. Semantic networks in comm. science (Danowski 1993, Doerfel 1998) 17. Probabilistic graphical models (Howard 1989, Pearl 1988) Automation Impact of Methodological Choices on Results 15 Diesner J (2012) Uncovering and Managing the Impact of Methodological Choices for the Computational Construction of Socio-Technical Networks from Texts. Technical Report CMU-ISR- 12-101. 
Measuring Impact of Social Justice Documentaries: Problem Statement • Goal of (Social Justice) Documentaries: Storytelling – Create memories, imagination, sharing (Rose 2012) • Goal of funders and producers (Sundance Institute, Ford Foundation, BritDoc): Impact – Evoke change in people’s knowledge and/or behavior (Barrett & Leddy 2008) • Common Approach and status quo: – – – – Big data: frequency counts of screenings and viewers Thick data: small-scale, in-depth interviews with focus groups Science: psychological effects of media on individuals Media and funding agencies: theoretical and normative frameworks (Clark & Abrash 2011, Figueroa 2002) – Strong need for comprehensive, empirical, rigorous impact assessment 16 Jana Diesner and Team, UIUC
  • 440. Our Questions • How can we know if a documentary has what impact? – Computational, scalable, theoretically grounded – Generalized question: measure impact of information and media in terms of change • At what point/ how early in the life cycle of a production can we answer this question? – Prediction models for trajectory of movies • Usefulness for filmmakers and producers – Strategic allocation of limited resources for outreach and campaign work – Leverage existing social capital and discourse 17 Our Approach: A story of microscopes and telescopes • Assumption: documentaries produced, screened, watched as part of larger, dynamic ecosystems of stakeholders and information flow • Method: identify, map, monitor, analyze social (stakeholders) and semantic (information) networks to study the structure, functioning and dynamics of networks and information 18 Jana Diesner and Team, UIUC
  • 441. A Comprehensive Framework for Measuring the Impact of Documentaries DIMENSION LEVEL INDEX ANALYTICS ITEM Guiding Factor Description Ranking weighing Report by producers or funding agencies Outreach Stats Number of movies, CDs distributed Number of theatrical, Internet release Duration of release; Sales of product MASS MEDIA Mass Media Attention Text Mining Web Analytics Frequency of news coverage weighted by influence (article, opinion/editorial) Domestic, international broadcast USER MEDIA User Media Attention MESSAGE EXPECTED OUTCOME CONTENT EVALUATION PRIORITY RESOURCE OFFLINE RELEASE MEDIUM RESPONSIVE MEDIUM ONLINE MEDIUM PROFESSIONAL MEDIA Prestige INTERPERSONAL INTERACTION Intimate Attention Number of festival acceptance Number of awards Number of professional reviews Conversation, talking on the phone or email, lectures, exchange of letters, etc. AUDIENCE SIZE Reachability HOMOGENEITY TARGET Twitter, Facebook, Blogs, webpages Frequency of talking about, links included, user-created contents Text Mining Web Analytics Survey, Interview Diversity SINKER Passiveness TRANSMITTER Leadership AUDIENCE TYPE Text Mining Web Analytics Archived Data Survey, Interview Number of viewers or visitors Geography & demography: location, age, gender, education, income Number of inactive viewers Text Mining Web Analytics Network Analysis Number of opinion leaders Advocacy Text Mining Web Analytics Survey, Interview Number of advocacy communities, colleges, schools, or NGOs COGNITIVE Awareness Stats, Text Mining Web Analytics, Network Analysis Frequency of names, ideas, thoughts, or concepts appeared in corpus Report of increased awareness ATTITUDINAL Sentiment Sentiment Analysis Frequency of positive, negative, neutral sentiments of comments Personal, critics, mass media, and organizational responses Reaction to calls for action GLOBAL SOCIETAL INDIVIDUAL IMPACT COMMUNAL COLLECTIVE ENTITY BEHAVIORAL TEMPORAL Engagement Enactment Connectedness Capacity Expansiveness 
Centralization Impact Dynamics Text Mining Web Analytics Network Analysis Longitudinal analysis How well connected How much & far disseminated How centralized is the impact The route of diffusion Number of action pledges alliance and allied action of organization Discussion or decision by organizational, governmental, international policy/legislation makers sponsorship of bills, adoption, donation, funding, implementation, social movement or intervention Comparison b/w multiple time points Duration of impact Increase vs. decrease Change vs. stability vs. reinforcement Introduction or shifts of topics Detection of social norm change This is no computational fishing expedition. We have theory: CoMTI Framework Diesner J, Pak S, Kim J, Soltani K, Aleyasen A (2014) Computational Assessment of the Impact of Social Justice Documentaries. iConference, Berlin, Germany 19 Scientific Logic Baseline Ground truth Transcript Content Reality/ Change Social Structure Meta Data Theme Content Social Structure Meta Data Movie Content Social Structure Meta Data Content Theme 20 Jana Diesner and Team, UIUC
  • 442. Technology: ConText (come play with it tomorrow) (http://context.lis.illinois.edu/) • Text Mining & Natural Language Processing: – Summarization • Corpus Statistics • Topic Modeling • Sentiment Analysis – Pre-Processing • Stemming • Stop Word Removal • Parts of Speech Tagging – Codebook construction and Application • Entity Extraction – Relation Extraction • Based on proximity, syntax • Social Structure: Social Networks of agents • Number, type and quality of social agent • Content: Semantic networks of content • Meta Data • Disambiguate data • Create curated dataset • Create meta-data database • Construct semantic networks 21 Case studies & Lessons Learned • House I Live In (Eugene Jarecki 2012) HILI – Mandatory minimum sentencing • One Mile Away (Penny Woolcock 2012) OMA – Transforming inner city youth gangs • Pandora's Promise (Robert Stone 2012) PP – Nuclear and sustainable energy • Solar Mamas (Eldaief & Noujaim ‘12) SOMA – Technical education of women in developing world – Part of “Why Poverty Series” 22 Jana Diesner and Team, UIUC
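Two of the steps listed for ConText, stop word removal and proximity-based relation extraction, can be illustrated together. The following is a rough sketch and not ConText's implementation: the stop word list, the `preprocess` and `proximity_links` helpers, the window size and the sample sentence are all assumptions made for this example, and stemming is left out.

```python
from collections import Counter

# Tiny illustrative stop word list; real lists are much longer.
STOP_WORDS = {"the", "a", "of", "in", "and", "to", "is"}

def preprocess(text):
    """Lowercase, strip surrounding punctuation, and drop stop words."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return [t for t in tokens if t and t not in STOP_WORDS]

def proximity_links(tokens, window=2):
    """Link each token to the next `window` tokens that follow it."""
    links = Counter()
    for i, left in enumerate(tokens):
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                links[tuple(sorted((left, right)))] += 1
    return links

text = "The film is about the education of women in the village."
tokens = preprocess(text)
links = proximity_links(tokens, window=2)
print(tokens)
print(sorted(links))
```

Because the window is applied after stop words are removed, terms separated only by function words count as neighbors, which is a design choice that changes the resulting network.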
  • 443. Opportunity Space (Baseline Model): Semantic Network (Meta Data) of News Coverage of Theme: “education” + “women empowerment” “children” “women” “economic development” “poverty and homelessness” 23 Semantic Network (Meta Data) of News Coverage of “Solar Mamas” “documentary films” “movie industry” “festival” “children” “poverty and homelessness” “women” 24 Jana Diesner and Team, UIUC
  • 444. Summarizing Substance (via Topic Modeling) of Baseline, Ground Truth, Film Coverage (Solar Mamas) BL: Press on theme: poverty in Arabic world & women, health, employment & development 22% health water people areas education government cent food poor 21% development world years economic Arab poverty country time social 17% women children work countries leaders time government world people 16% women education empowerment girls women's gender war school child 13% United Minister Education Development Nations Women SN:a SU:martini 12% President APRC people Oct election Development Jammeh support DU:syn GT: Transcript: storytelling (social conflict) and issue (training & employment for females) 23% back India don't kids won't I'm call husband 21% work make village solar back women girls years 21% husband daughters meeting things stay can't work girls 17% didn't role life world trainees day India problem 17% months mind man mother can't situation sin put Press on documentary: poverty among people in the Arabic world, especially women 93% film poverty people Arab documentary films world women 3% women solar Barefoot India College home back train 2% p.m Free Ave Film National Center a.m Park 25 2% Rafea Solar it's story Mamas mother Jordanian husband Facebook: Co-Commenters (OMA) Visualized in NodeXL 26 Jana Diesner and Team, UIUC
  • 445. Social Media: Facebook • Number of Likes – HILI (21,145) > PP (6,447) > OMA (5,908) > SOMA (884) • Likes per visitor – PP (4.08) > HILI (2.74) > SOMA (2.68) > OMA (2.53) • Number of comments – HILI (2,110) > PP (2,075) > OMA (1,081) > SOMA (63) • Comments per visitor – PP (6.10) > OMA (2.09) > HILI (1.50) > SOMA (1.26) • Difference between quantitative metrics and quality of interactions crucial 27 Facebook: Constructed public image (HILI) Stimulus 28 Jana Diesner and Team, UIUC
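The point that raw counts and per-visitor rates tell different stories can be seen by ranking the four films under each metric. The figures below are copied from the slide; the `ranking` helper is only an illustrative convenience.

```python
# Figures as reported on the slide (HILI, PP, OMA, SOMA are the film abbreviations).
likes = {"HILI": 21145, "PP": 6447, "OMA": 5908, "SOMA": 884}
likes_per_visitor = {"PP": 4.08, "HILI": 2.74, "SOMA": 2.68, "OMA": 2.53}
comments = {"HILI": 2110, "PP": 2075, "OMA": 1081, "SOMA": 63}
comments_per_visitor = {"PP": 6.10, "OMA": 2.09, "HILI": 1.50, "SOMA": 1.26}

def ranking(metric):
    """Film abbreviations ordered from highest to lowest value."""
    return sorted(metric, key=metric.get, reverse=True)

for name, metric in [
    ("likes", likes),
    ("likes per visitor", likes_per_visitor),
    ("comments", comments),
    ("comments per visitor", comments_per_visitor),
]:
    print(f"{name:22s} {' > '.join(ranking(metric))}")
```

The film with the most likes and comments in absolute terms is not the one with the highest rate per visitor, so the choice of metric changes which film looks most engaging.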
  • 446. Social Media: Public response (HILI): Community picking up on stimulus 29 Social Media: Twitter (HILI) Followers (@DrugWarMovie) (3,314, visible if >200 followers) Followees (2,245) Intersection (510) Visualized in NodeXL 30 Jana Diesner and Team, UIUC
  • 447. Social Media: Twitter (HILI): Being followed by not so relevant stakeholders Most other: writers, directors, musicians (comfort zone) 20% > 100K followers Red = important: legal 1, gov 2, media 12, NGO 33 31 Social Media: Twitter Film House I Live In Pandora’s Promise One Mile Away Solar Mamas Followers 2,804 (14) 691 (1) 1,247 (1) 55 (0) Intersection 510 (7) 183 (3) 61 (2) 109 (11) Followees 1,735 (230, 13%) 800 (86, 11%) 112 (18, 16%) 197 (33, 17%) * The number of power users with more than 100k followers are shown in parentheses 32 Jana Diesner and Team, UIUC
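The follower table can be cross-checked with a little arithmetic. The sketch below assumes that the "Followers" and "Followees" columns exclude accounts in the "Intersection" column; under that reading the House I Live In totals match the 3,314 followers and 2,245 followees reported on the earlier Twitter slide. The `table` layout and helper names are invented for this illustration.

```python
# (account count, power users with more than 100k followers), copied from the slide.
table = {
    "House I Live In":   {"followers": (2804, 14), "intersection": (510, 7),  "followees": (1735, 230)},
    "Pandora's Promise": {"followers": (691, 1),   "intersection": (183, 3),  "followees": (800, 86)},
    "One Mile Away":     {"followers": (1247, 1),  "intersection": (61, 2),   "followees": (112, 18)},
    "Solar Mamas":       {"followers": (55, 0),    "intersection": (109, 11), "followees": (197, 33)},
}

def power_user_share(film, column):
    """Percentage of accounts in a column that are power users, rounded."""
    count, power = table[film][column]
    return round(100 * power / count)

def totals(film):
    """Total followers and followees, assuming the columns exclude the intersection."""
    row = table[film]
    both = row["intersection"][0]
    return row["followers"][0] + both, row["followees"][0] + both

for film in table:
    print(film, totals(film), f"{power_user_share(film, 'followees')}% power users among followees")
```

The rounded followee shares reproduce the percentages printed in the table (13%, 11%, 16%, 17%), which suggests the percentages are relative to the followee column itself.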
  • 448. Lessons learned from (Social) Media Analyses and Next Steps • Focus of communication & outreach on art product > theme -> limits possible impact • Social Media != Social Media – Films can succeed in attracting decent numbers of power users (>100K followers) without attracting the relevant types of users (government, NGO, media, legislation (House I Live In), energy (Pandora's Promise)) – Leverage opportunities for linking up with relevant people • Next: – Legal and governmental data – Comparative analysis and prediction models Diesner J, Aleyasen A, Kim J, Mishra S, Soltani K (2013) Using Socio-Semantic Network Analysis for Assessing the Impact of Documentaries. WIN (Workshop on Information in Networks), New York, NY 33 Acknowledgement • Sudan: National Science Foundation (NSF) IGERT 9972762, the Army Research Institute (ARI) W91WAW07C0063, the Army Research Laboratory (ARL/CTA) DAAD19-01-2-0009, the Air Force Office of Scientific Research (AFOSR) MURI FA9550-05-1-0388, the Office of Naval Research (ONR) MURI N00014-08-11186. • Impact: This work is supported by the FORD Foundation, grant 0125-6162, and by IMO, Inc. (Intelligent Medical Objects). We are also grateful for feedback and advice from Dr. Susie Pak from St. John’s University, Orlando Bagwell, former director of JustFilms at the Ford Foundation, and Joaquin Alvarado, Chief Strategy Officer of the Center for Investigative Reporting. 34 Jana Diesner and Team, UIUC
  • 449. Team Jana Diesner Assistant Professor iSchool Computer Science Jinseok Kim PhD Student iSchool Shubhanshu Mishra PhD student GSLIS Sean Wilner PhD Student Informatics Amirhossein Aleyasen Master student Computer Science Kiumars Soltani PhD Student Informatics 35 Thank you! Q&A • For questions, comments, feedback, follow-up: Jana Diesner Email: jdiesner@illinois.edu Phone: (412) 519 7576 Web: http://people.lis.illinois.edu/~jdiesner 36 Jana Diesner and Team, UIUC