Disc2013 keynote speakers

Exploring the Structure of Government
on the Web

Presentation by Robert Ackland at DISC2013,
12-14 December 2013, Daegu, South Korea
Robert Ackland (Australian National University)
Paul Henman (University of Queensland)
Tim Graham (University of Queensland)
Homepage: https://researchers.anu.edu.au/researchers/ackland-rj
Project: http://voson.anu.edu.au

VOSON Project at the ANU (http://voson.anu.edu.au): Teaching,
research and tool development in areas of computational social
science, network science, web science since 2003

Background
● Government use of the Internet has rapidly evolved.
● While this evolution has been examined in terms of the
content, usability and interactivity of sites, the institutional
structure of government on the web is less explored.
● Australian Research Council-funded project titled "The
institutional structure of e-government: a cross-policy,
cross-country comparison" (Henman, Ackland, Margetts)

Overall aims of project
● Aim 1: Assess whether government hyperlink networks reflect
offline institutional structures
  ● Is e-government facilitating joined-up government or are
  jurisdictional boundaries still a significant barrier?
  ● Whalen (2011) studied the hyperlink structure of the US .gov
  domain, assessing correspondence between the online structure of
  US government and its offline hierarchy.
  ● The major difference is that our project compares the UK and Australia, identifying
  both similarities and contrasts in the relationship between institutional
  structure and online presence.

● Aim 2: Use hyperlink data to assess the “nodality” of government (Hood &
Margetts 2007): is government at the centre of informational networks on the
Web?
  ● Nodality affects whether government messages are received by the population.
  ● The Web might increase government nodality, but can also decrease it
  through increased competition from other information providers (who may
  destabilise/confuse/subvert the messages and actions of government).
  Example: anti-vaccination lobby groups.
  ● We ask: is government using the web to enhance its visibility? Are there
  differences in nodality across policy domains and countries (AU and UK)?
● Our approach differs from that used by Escher et al. (2006):
  ● Escher et al. focused only on the UK Foreign Office (and its US and Australian
  counterparts); our analysis includes other sectors of government, allowing
  cross-country and cross-sector comparisons.
  ● We collect more hyperlink data, allowing us to identify the connections between sites
  that link to (or are linked to by) government sites. We can therefore construct nodality
  measures that differ from those used by Escher et al. (e.g. those requiring
  complete network data).

Webometrics (link count analysis)
● focus on egonetworks, rather than complete networks
● typically only know attributes of ego, not alters

Today – some methodological aspects
● Hyperlink network data collection (VOSON)
● Network reduction techniques
● Community structure in government hyperlink networks
● Coding websites (machine learning)

Hyperlink network data collection (VOSON)

● Manually identified AU and UK government seed pages (typically, entry pages
to government websites):
  ● AU – 88 pages
  ● UK – 92 pages
● Used the VOSON software (http://voson.anu.edu.au) to construct hyperlink
network data using a two-stage approach:
  ● Stage 1:
    ● The VOSON in-built crawler crawled the seed sites, finding internal pages linked to from the entry
    page.
    ● Outbound links and text content were collected from each of the internal pages.
    ● The Bing API was used to find all inbound links to each of the internal pages (including the seed page).
  ● Stage 2:
    ● Every new page discovered above (i.e. pages that either link to or are linked to by a government
    web page) was then crawled by the VOSON in-built crawler to find connections among these pages.
● Data collected in 2012

VOSON 2.0 web
interface works with
Firefox, Chrome, Safari,
iPad

VOSON+NodeXL allows
construction and import
of hyperlink networks
from within NodeXL


Network reduction techniques


● Network size (pages):
  ● AU: 1,517,020 nodes (pages)
  ● UK: 1,588,757 nodes (pages)
● First major network reduction technique: construct a network
of websites rather than pages
  ● VOSON has an approach for automatically grouping pages into
  “pagegroups”
  ● e.g. for AU, 6,694 pages from the Australian Taxation Office are all
  included in a single node “ato.gov.au”
● Full network size (pagegroups/sites):
  ● AU: 110,665 nodes (pagegroups), 290,031 edges
  ● UK: 109,161 nodes (pagegroups), 280,580 edges
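
The page-to-site reduction can be sketched roughly as follows. This is an illustration only, not VOSON's actual pagegroup method: it assumes a site is simply the hostname with any leading "www." removed, and the helper names `site_of` and `site_network` and the example links are hypothetical. The edge weight counts the distinct pages on one site that link to another site.

```python
from collections import defaultdict
from urllib.parse import urlparse


def site_of(url):
    """Return a crude site label: the hostname without a leading 'www.'."""
    host = urlparse(url).hostname or ""
    return host[4:] if host.startswith("www.") else host


def site_network(page_links):
    """Collapse page-level links (source_url, target_url) into weighted site edges.

    The weight on edge (i, j) is the number of distinct pages on site i
    that link to site j; links within a site are dropped.
    """
    linking_pages = defaultdict(set)
    for source, target in page_links:
        i, j = site_of(source), site_of(target)
        if i and j and i != j:
            linking_pages[(i, j)].add(source)
    return {edge: len(pages) for edge, pages in linking_pages.items()}


# Illustrative page-level links (made up for this sketch).
links = [
    ("http://www.ato.gov.au/a", "http://www.treasury.gov.au/x"),
    ("http://www.ato.gov.au/a", "http://www.treasury.gov.au/y"),
    ("http://www.ato.gov.au/b", "http://treasury.gov.au/x"),
    ("http://www.ato.gov.au/b", "http://www.ato.gov.au/a"),
]
edges = site_network(links)
print(edges)
```

Grouping by hostname is the simplest possible choice; a real pagegroup rule would also need to handle subdomains and organisations that span several domains.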

● Gephi map of the UK network – only showing the 30K+ nodes with
indegree+outdegree>1 ...not much analytical potential from this
visualisation...

● In future work we will be investigating
approaches for removing edges to reveal
the “backbone” of UK and AU government
hyperlink networks
  ● e.g. Serrano, M., Boguñá, M. and A.
  Vespignani (2009): “Extracting the
  multiscale backbone of complex weighted
  networks,” PNAS, 106(16), 6483-6488.
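
A rough sketch of the disparity filter described by Serrano et al. (2009) is given below. It is adapted here, as an assumption, to a directed network by testing each edge against both the source's out-links and the target's in-links and keeping it if either test is significant. The function name, the `alpha` threshold and the toy weights are illustrative only and are not taken from the project.

```python
from collections import defaultdict


def disparity_backbone(edges, alpha=0.05):
    """Keep edges whose weight is significant for the source's out-links
    or the target's in-links under the disparity filter null model.

    edges: dict mapping (source, target) -> positive weight.
    """
    out_strength, out_degree = defaultdict(float), defaultdict(int)
    in_strength, in_degree = defaultdict(float), defaultdict(int)
    for (i, j), w in edges.items():
        out_strength[i] += w
        out_degree[i] += 1
        in_strength[j] += w
        in_degree[j] += 1

    def p_value(weight, strength, degree):
        # Chance that a uniformly random split of `strength` over `degree`
        # edges gives a single edge a share at least this large.
        if degree < 2:
            return 1.0  # a lone edge says nothing about heterogeneity
        return (1.0 - weight / strength) ** (degree - 1)

    backbone = {}
    for (i, j), w in edges.items():
        p_out = p_value(w, out_strength[i], out_degree[i])
        p_in = p_value(w, in_strength[j], in_degree[j])
        if min(p_out, p_in) < alpha:
            backbone[(i, j)] = w
    return backbone


# Toy example: one dominant link and three weak ones from the same site.
example = {("a", "b"): 100, ("a", "c"): 1, ("a", "d"): 1, ("a", "e"): 1}
backbone = disparity_backbone(example, alpha=0.05)
print(backbone)
```

Lowering `alpha` removes more edges; the appropriate value for the government networks would have to be chosen empirically.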

Community structure in
government hyperlink networks


Some approaches for 'community'
detection in networks
● Modularity maximisation (Lancichinetti &
Fortunato, 2012)
● Edge-Betweenness (Girvan & Newman, 2001)
● Fast-Greedy (Clauset et al, 2004)
● Multi-Level (Blondel et al, 2008)
● Walktrap (Pons & Latapy, 2005)
● Infomap (Rosvall, Axelsson & Bergstrom, 2009)

● The hyperlink networks we have collected
are both directed and weighted (the weight
on the edge from node i to node j is the number of
pages on site i with links to site j)
● Of the above, only Edge-Betweenness
and Infomap support directed and
weighted graphs

Edge-Betweenness
● We found that the Edge-Betweenness
algorithm (as implemented in igraph/R)
does not scale well.
● In a test run with the UK hyperlink network, the
algorithm did not converge after 24 hours
of running...

Infomap
● See: http://www.mapequation.org
● Scales well for large, dense networks
● Information-theoretic approach, appropriate to this network,
where there is flow of information and attention
  ● If site i links to site j, we can think of a flow of information from j to i and
  a flow of attention from i to j.
  ● We do not have data on the flow of web users from site i to site j, i.e.
  'clickstream data'
  ● We therefore make the assumption that the number of pages on site i that
  contain hyperlinks to site j (these are our edge weights) is proportional
  to the flow of attention/information

First attempt...
● Tried Infomap implemented in R/igraph (v. 0.6.5)
● Results: Not good! The algorithm consistently generated a single
massive community (approx. 95% of nodes) and thousands
of tiny communities (1 or 2 nodes per community)
● Results do not pass the ‘sanity test’ (i.e. face validity)
● The problem:
  ● Many nodes in the UK network have no outlinks
  ● Therefore, the effect of teleportation in the Infomap algorithm is
  significant (it randomly connects nodes)
  ● This problem was solved in Lambiotte and Rosvall (2012)

Second attempt...
● Results from Lambiotte and Rosvall (2012) were recently
incorporated into the Infomap algorithm
● This latest code is not yet integrated in R/igraph
● So, next steps:
  ● Download and compile the C++ source code for Infomap (v. 0.12.13)
    ● http://www.mapequation.org/code.html
  ● Run the standalone Infomap algorithm
● Using the Infomap Map Generator, we can examine the community
structure of the UK network at different scales (varying the number of
communities displayed and the number of links between communities)

Figure: 17 out of 4571 communities (44% of all flow)

Figure: 45 out of 4571 communities (70% of all flow)

● Each community is named after the website that has the highest
flow and PageRank in that particular community (i.e. the ‘top
dog’ website)
● Distribution of flow across the network follows a power law
  ● There are many communities, but a very small percentage ‘hog’ most of
  the flow across the network
  ● The top 5% of communities (229 out of 4571) account for about
  86% of all flow in the network
● Infomap uses an implementation of the PageRank algorithm to
calculate the ‘importance’ of each community (aggregate PageRank
of all websites in that community)

Preliminary findings
● Extremely influential communities form around social media
and blogging platforms
● A massive amount of flow is directed through the ‘Twitter’
community (e.g. from Twitter to www.parliament.uk)
● Many UK seed sites form influential communities (i.e. Top
20), but not all.
● Somewhat unexpectedly, two UK Gov ‘business’ websites
each form highly influential communities
  ● http://www.direct.gov.uk (community rank #4, 0.048% of all flow
  throughout network)
  ● http://bis.gov.uk (community rank #8, 0.025% of all flow
  throughout network)

● To understand the structure of government hyperlink networks, we need to
know something about the websites in these networks
  ● Generic top-level domains (.edu, .com, .org etc.) will only give very coarse-grained
  information on who these sites are
  ● What policy domain are they in? (health, education, social security?)
● This is social science research, so we need more information on nodes
● Options:
1. Manually code every site (not feasible, as we have >100K sites)
2. Manually code a subset of sites, e.g. the “most important” sites based on a
centrality measure (scientifically valid?)
3. Manually code a sample of sites (e.g. adaptive sampling). To be explored in
future...
4. Manually code a training dataset and then use machine learning to predict website
type
● The following is a summary of preliminary work on approach 4...

Data collection
● A subset of 'important' websites in the UK network was
coded into discrete policy domains by a human coder
  ● Subset chosen as seed sites plus sites connected to two
  or more seed sites
  ● e.g. coding: ‘Community services’, ‘Health’, ‘Foreign
  Affairs’
● Need to collect and ‘clean’ the HTML data from
websites in the network
  ● While the original VOSON crawl collected text content
  for all websites crawled, for this proof of concept we
  re-collected the text content (in future we will use the
  VOSON-collected text data)

Text processing
● R ‘XML’ package used to clean the HTML
(strip HTML tags, remove white space,
remove strange ASCII characters, convert to
lowercase, extract keyword frequencies)
● 2157 websites were usable (i.e. with ‘clean’
web text and a known policy domain)
● Machine learning using the ‘RTextTools’
package in R (supervised learning for text
classification)
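
The cleaning steps listed above could look roughly like this in Python. The original work used the R ‘XML’ package; this sketch swaps in Python's standard library and treats "strange characters" as anything other than the letters a-z, which is an assumption. `TextExtractor` and `keyword_frequencies` are hypothetical names and the sample page is made up.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text that sits outside <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def keyword_frequencies(html):
    """Strip tags, lowercase the text and count the remaining words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.parts).lower()
    # Keeping only runs of a-z drops digits, punctuation and odd characters,
    # and makes the amount of whitespace irrelevant.
    words = re.findall(r"[a-z]+", text)
    return Counter(words)


page = "<html><body><h1>Health  Services</h1><p>Find a health clinic.</p></body></html>"
freqs = keyword_frequencies(page)
print(freqs.most_common(3))
```

The resulting word counts are the kind of input a supervised text classifier would be trained on.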

Support Vector Machine (SVM)
● Websites with known policy codes = 2157
  ● SVM ‘training sample’ = 2000
  ● SVM ‘test sample’ = 157
● Some example results of classification:

                  PRECISION   RECALL   F-SCORE
Education         0.94        0.83     0.88
Employment        1.00        0.14     0.25
Environment       0.99        0.79     0.88
Foreign Affairs   1.00        0.44     0.61
Health            0.52        0.97     0.68
Housing           0.96        0.79     0.87
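
As a quick check on the table, the F-score can be recomputed from precision and recall, assuming it is the usual harmonic mean F = 2PR / (P + R). The function name `f_score` is illustrative; the figures are those reported above.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F-score) from the classification results above
reported = {
    "Education": (0.94, 0.83, 0.88),
    "Employment": (1.00, 0.14, 0.25),
    "Environment": (0.99, 0.79, 0.88),
    "Foreign Affairs": (1.00, 0.44, 0.61),
    "Health": (0.52, 0.97, 0.68),
    "Housing": (0.96, 0.79, 0.87),
}

for domain, (p, r, f) in reported.items():
    print(f"{domain}: computed {f_score(p, r):.2f}, reported {f:.2f}")
```

The Employment row shows why the F-score matters here: perfect precision with recall of 0.14 means the classifier rarely labels a site as Employment, but is right when it does.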

SVM Conclusion

● Surprising level of accuracy
● Future work will involve:
  ● More data (will use HTML collected via
  VOSON)
  ● Investigating different machine learning
  algorithms

Previous studies
Level: Individual researcher level

Authors                        Result
Newman (2001)                  Small-world effect existed between co-authors, and the degree
                               distribution roughly follows the power law in co-authorship networks
                               in the fields of physics, biomedicine and computer science.
Barabasi et al. (2002)         Co-authorship network in mathematics and neuroscience is scale-free,
                               and the network evolution is characterized by preferential attachment.
Ramasco et al. (2004)          Co-authorship network in the field of condensed matter showed that
                               the degree distribution follows a power law.
Tomassini and Luthi (2007)     Co-authorship network in the field of genetic programming changes
                               in accordance with preferential attachment.
Wagner and Leydesdorff (2005)  International co-authorship grew based on the principle of
                               preferential attachment, although the attachment mechanism was not
                               fitted to a pure power law.
Moody (2004)                   Co-authorship network in sociology does not have a small-world
                               structure.
Brantle and Fallah (2011)      Collaboration network of patent inventors has a scale-free power law
                               property.

Previous studies
Level: Organization level

Authors                        Result
Verspagen and Duysters (2004)  Strategic technology alliances, in the two technology fields of chemicals
                               and food, could be characterized as small worlds.
Powell et al. (2005)           The alliance network among dedicated biotech firms is scale-free.
Gay and Dousset (2005)         The alliance network in the biotechnology industry has a small-world
                               effect with a scale-free property based on preferential attachment.
Barber et al. (2006);          Both studies reported the existence of small-world and scale-free
Breschi and Cusmano (2004)     property in inter-organizational R&D relationships from EU-FP
                               Programmes data.

Brief history of governmental policy for UIG collaboration (‘00~’11)



Methodology
• Network topological analysis

Measures                  Definition
Density
Average degree
Average path length
Diameter                  The largest geodesic path length in the network
Clustering coefficient
Degree centralization
Power law distribution

Methodology
• Centrality measures

Measures                 Definition
Degree centrality        $C_D(i) = \dfrac{A_i}{n-1}$
                         * $A_i$ = the number of direct links of node i
                         * $n$ = the total number of nodes
Closeness centrality     $C_C(i) = \dfrac{n-1}{\sum_{j} D_{ij}}$
                         * $D_{ij}$ = the number of links in the geodesic linking node i
                           and node j
Betweenness centrality   $C_B(i) = \dfrac{\sum_{j<k} g_{jk}(i)/g_{jk}}{(n-1)(n-2)/2}$
                         * $g_{jk}$ = the number of geodesics linking node j and node k
                         * $g_{jk}(i)$ = the number of geodesics linking node j and node k
                           that contain node i
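
The three definitions above can be sketched in plain Python for a small undirected, unweighted network. This is a minimal illustration, assuming the graph is connected and given as an adjacency dictionary; the function name `centralities` and the star-shaped example are hypothetical, and betweenness is accumulated with Brandes' shortest-path counting method rather than anything stated in the slides.

```python
from collections import deque


def centralities(adj):
    """Return (degree, closeness, betweenness) dicts, each normalised as in
    the definitions above. adj maps each node to a list of its neighbours."""
    nodes = list(adj)
    n = len(nodes)
    if n < 3:
        raise ValueError("need at least three nodes for the normalisations")

    degree = {v: len(adj[v]) / (n - 1) for v in nodes}
    closeness = {}
    raw_between = {v: 0.0 for v in nodes}

    for s in nodes:
        # Breadth-first search from s, counting shortest paths (sigma).
        dist = {s: 0}
        sigma = {v: 0 for v in nodes}
        sigma[s] = 1
        preds = {v: [] for v in nodes}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)

        total = sum(dist.values())
        closeness[s] = (n - 1) / total if len(dist) == n else 0.0

        # Back-propagate pair dependencies from the farthest nodes inwards.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                raw_between[w] += delta[w]

    pairs = (n - 1) * (n - 2) / 2
    # Every unordered pair (j, k) was visited from both ends, hence the / 2.
    betweenness = {v: b / 2 / pairs for v, b in raw_between.items()}
    return degree, closeness, betweenness


# Star network: one hub linked to three otherwise unconnected nodes.
star = {"hub": ["x", "y", "z"], "x": ["hub"], "y": ["hub"], "z": ["hub"]}
degree, closeness, betweenness = centralities(star)
print(degree["hub"], closeness["x"], betweenness["hub"])
```

In the star example the hub scores 1 on all three measures, while each outer node has degree centrality 1/3, closeness 0.6 and betweenness 0.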


Methodology
• Block modeling

Data and network construction
• Data collection and network construction
• 75 innovative actors (2010)

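Results
• The number of joint patents

Figure: bar chart of the number of joint patents by period (vertical axis 0 to 30,000; horizontal axis: Year). Two series, labels not recoverable:
Period       Series A   Series B
2000-2003    1,368      4,579
2004-2007    3,535      6,735
2008-2011    5,720      12,659
2000-2011    10,623     23,973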

Results

Period      No. of  No. of  Density  Clustering        Average  Average path      Diameter  Degree          Power-law distribution
            nodes   links            coefficient       degree   length                      centralization  Exponent  KS statistic  p-value
                                     (random network)           (random network)
2000~2003   46      90      0.087    0.323 (0.069)     1.957    2.997 (2.919)     7         0.351           2.768     0.193         0.03
2004~2007   61      209     0.114    0.375 (0.125)     3.410    2.366 (2.310)     5         0.331           2.924     0.138         0.05
2008~2011   60      387     0.219    0.498 (0.213)     6.450    1.933 (1.827)     4         0.493           3.305     0.115         0.23
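
The density column can be checked from the node and link counts, assuming the usual definition for an undirected network, density = 2L / (n(n-1)). The function name `density` is illustrative; the counts are those reported in the table.

```python
def density(nodes, links):
    """Share of possible undirected ties that are actually present."""
    return 2 * links / (nodes * (nodes - 1))


# period: (no. of nodes, no. of links, reported density)
periods = {
    "2000~2003": (46, 90, 0.087),
    "2004~2007": (61, 209, 0.114),
    "2008~2011": (60, 387, 0.219),
}

for period, (n, links, reported) in periods.items():
    print(f"{period}: computed {density(n, links):.3f}, reported {reported:.3f}")
```

Under this assumption all three reported densities are reproduced to three decimal places.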


Results

2000-2003
Organization  Degree      Rank  Closeness   Rank  Betweenness  Rank
              centrality        centrality        centrality
SEC           0.422       1     0.506       2     0.253        2
ETRI          0.378       2     0.479       3     0.252        3
KAIST         0.289       3     0.511       1     0.241        4
KRICT         0.200       4     0.421       7     0.049        -
HMC           0.178       5     0.437       5     0.290        1
POSTECH       0.156       6     0.421       7     0.084        9
LGE           0.156       6     0.446       4     0.078        10
CII           0.156       6     0.402       10    0.013        -
KICT          0.111       9     0.360       -     0.136        5
KIMM          0.111       9     0.395       -     0.104        7
KIST          0.111       9     0.437       5     0.046        -
KT            0.111       9     0.409       9     0.003        -
HMB           0.044       -     0.319       -     0.127        6
KHNP          0.067       -     0.249       -     0.087        8

2004-2007
Organization  Degree      Rank  Closeness   Rank  Betweenness  Rank
              centrality        centrality        centrality
ETRI          0.433       1     0.594       1     0.155        1
SEC           0.400       2     0.583       2     0.104        4
SNU           0.350       3     0.577       3     0.146        2
HYU           0.333       4     0.571       4     0.118        3
KAIST         0.283       5     0.522       10    0.082        8
YSU           0.267       6     0.536       6     0.094        6
HMC           0.250       7     0.526       7     0.097        5
KRU           0.250       7     0.545       5     0.092        7
SKKU          0.217       9     0.526       7     0.051        9
POSTECH       0.217       9     0.526       7     0.031        -
KT            0.183       -     0.517       -     0.010        -
LGE           0.167       -     0.458       -     0.042        10
KHU           0.167       -     0.500       -     0.038        -
KRICT         0.133       -     0.455       -     0.021        -

2008-2011
Organization  Degree      Rank  Closeness   Rank  Betweenness  Rank
              centrality        centrality        centrality
SNU           0.695       1     0.756       1     0.144        1
KAIST         0.593       2     0.702       2     0.112        2
YSU           0.559       3     0.686       3     0.043        5
KRU           0.542       4     0.686       3     0.052        4
HYU           0.492       5     0.656       5     0.076        3
ETRI          0.475       6     0.634       6     0.042        6
SEC           0.458       7     0.634       6     0.037        9
POSTECH       0.424       8     0.615       9     0.029        -
SKKU          0.407       9     0.621       8     0.039        8
HMC           0.373       10    0.602       10    0.034        10
KIST          0.356       -     0.602       10    0.010        -
IHU           0.322       -     0.578       -     0.030        -
CAU           0.322       -     0.590       -     0.018        -
KRICT         0.305       -     0.578       -     0.041        7

Conclusions and discussion
• Conclusions
• Policy implications

Fred Phillips
DISC 2013, Daegu

General Informatics LLC

Perspectives on
Triple Helix

Agenda
1. 3-Helix as a meso-level notion
– Epicycle in a grander tech-psych-inst
cycle

2. Speed (differentials) as high-level
system metric
– Roles of buffering institutions and ICT
– Need for smart engagement

3. Applying 3-helix in the developing
world
4. SUNY Korea’s joint TS/CS research

3-Helix papers published in
Technological Forecasting &
Social Change
• Wilfred Dolfsma, Loet Leydesdorff “Lock-in and break-out from
technological trajectories: Modeling and policy implications,” 76(7),
Sept. 2009, 932-941.
• Raul Gouvea, Sul Kassicieh, M.J.R. Montoya “Using the quadruple
helix to design strategies for the green economy,” 80(2), Feb. 2013,
221-230.
• Øivind Strand, Loet Leydesdorff “Where is synergy indicated in the
Norwegian innovation system? Triple-Helix relations among
technology, organization, and geography,” 80(3), Mar. 2013, 471-484.
• Inga A. Ivanova, Loet Leydesdorff “Rotational symmetry and the
transformation of innovation systems in a Triple Helix of university–
industry–government relations,” In Press, Corrected Proof, Available
online 19 Sept. 2013.

In D.S. Oh & F. Phillips (Eds),
Technopolis: Best Practices for Science
and Technology Cities (Springer, 2014)
• E. Becker, B. Burger and T. Hülsmann,
“Regional Innovation and Cooperation
among Industries, Universities, R&D
Institutes, and Governments”
• F. Phillips, S. Alarakhia and P.
Limprayoon, “The Triple Helix:
International Cases and Critical
Summary”
• José Alberto Sampaio Aranha,
“Arrangement of Actors in the Triple
Helix Innovation”

IC2 Model
• Preceded 3-helix by several years
• But only parts were made mathematical (Bard et al)
Diagram: Academia, Industry, Government, Community; Talent, Technology, Capital, Know-How, Market Needs; Value-Added Economic Development

The math of Academic-Government-Industry
dynamics is interesting,
but...
It is just part of a bigger picture.

The cycle of innovation and change:
Lab to society & back again
Technological
Innovation
New desires
& dreams

New ways to
organize (Public &
private)

Note how this
schema extends
Everett Rogers’
more linear
model.
New Products
& Services

New ways to
Interact socially

New ways of producing
and using
products & services

We might think all the elements
move together in an orderly way.
Social Needs
Institutional Change

Technological Change

Psychological Change

Organizational
Change

But in a free-market economy,
they do not.
• They continually
engage and
disengage.
• Sometimes they
move each other
only by friction.
• 90% of MOT and
Tech Policy
problems stem from
the differing speeds
of the 3 sectors.

Example: Transportation
• Mobile-web rideshare
services
– Gain VC investment
– Start operations
– Get shut down by city
governments trying to
regulate them under old taxi
rules.

• Institutions have changed
slower than technology
and social demand.

Example: Health
• An elderly person dies
because he was too proud
to wear
– A medical bracelet
– or
– An emergency signaller.

• Psychology has changed
slower than technology.

Example: Software
• Record companies and publishers
– Sue student MP3 pirates
– Develop DRM software that further alienates
customers
– Can’t adapt away from paper and CD
publishing.

• Business organizations change more
slowly than technology and social
demand.

Example: More and more often,
social/institutional change outpaces
tech change - or will do so soon.
• In most of the world, an excess of funds
is chasing too few growth investment
opportunities.
• Fewer US companies are making IPOs.
• Small-government activists rail
indiscriminately against direct
government monetary support for new
technologies.
See Phillips (2011).

This can be good.
• Individual creativity
may bloom.
• Mistakes...
– Can be undone
efficiently.
– Don’t necessarily infect
the whole system.

It (disengagement) can be bad.
• Alienation
• Lack of coordination and cooperation
• Little institutional or organizational
creativity
• Waste and pollution
• Lives lost

Speed as the system metric
• Really, speed
differentials among the
sectors.
• A “clutch” and
“transmission” are
needed.
• The question is less
how to engage, but
rather, when.
• The key is not
engagement per se,
but smart (well-timed)
engagement.

Not bridging organizations, but
buffering organizations
• Civic groups
• Workforce training programs
• Economic development agencies
• Technology brokers
• Open innovation integrators
• Accountancies
• Industry associations
• NGOs
• Incubators
• Law firms
• Venture capital
• TTOs

The IC2 Model partially captured this.

3-Helix as meso-level construct: An
epicycle within the Technology-Psychology-Institutional dynamic
• Macro: Tech-Psych-Inst
• Meso: Aca-Gov-Indus
– “Triple Helix”
• Micro:
– Dynamics within people and
within organizations;
– Technology life cycles
• The buffering institutions
span all 3 levels.

What causes TOPI* disengagement?
*Technological-Organizational-Psychological-Institutional

• Bad marketing, bad market research
• Mistrust, bad service
• Technology inaccessible to underserved
populations
• Competition among de facto standards
(e.g., VHS vs Beta)
• Lack of vision
• Poor design of information &
communication products and programs.

“Engaging” doesn’t mean
“attractive nuisance.”

Intrusive
‘engagement’
Update
this app!

Marketing guru Geoffrey Moore says,
• “People have disengaged, for ... self-preservation.”
– With “consequences for consumer and brand marketing,
– “and long-term implications for education, health care,
citizen participation, and workforce involvement.”

• “So engagement is rightfully going to be a big
investment theme.”

Moore: Engagement is taking
center stage in business.
• Off-line retailers are using digital interactions/devices in their
in-store experiences.
– Example: Starbucks.

• “Social marketing foster[s] engagement around topics that ...
reflect well upon the sponsor.”
– Example: Sephora.

• “Big data analytics drive communications that can break
through the wall of detachment.”
– Example: Obama campaign 2012.

Moore is saying
• Advertising used to be
like this.
– Annoying! Consumers
disengaged.

• Now with social media,
mobile web, Yelp.com,
– Consumers share product
reviews & complaints.
– Advertisers have to treat
consumers more gently.
– To make us want to
continually re-engage.

• Engaging doesn’t mean
shouting.

ICT for an Intelligently
Engaged Society?

What kinds of
IT foster
positive,
voluntary
engagement?
Why?

What kinds of IT discourage
it? Why?

People are proud to
participate electronically.
• Fighting crime
– Zapruder film; Rodney King videos

• Supporting favorite businesses, authors
– Amazon reviews

• For post-disaster aid
– Crowd-mapping of post-earthquake Haiti

• Crowd-funding research projects and
entrepreneurs
• Though there are abuses.

Source: Ganti et al, Mobile
Crowdsensing: Current State and
Future Challenges.

Micro Level: Workforce
Engagement
• Definition: The measure of whether
employees merely do the minimum required
of them, versus proactively driving innovation
and new value for the organization.
• Thus, engagement
– “can only ever be partially accounted for by
deploying the latest new collaborative technology,
– “and probably significantly less than many of its
proponents would have you believe.”
Source: Hinchcliffe

Current state of worker
engagement

ICT for engagement? Summary
• ICT alone cannot create/sustain engagement.
– Human intervention, via buffering institutions, can achieve
ICT-aided engagement.

• ICT, especially sensing and crowdsourcing, may
assist in deciding when to engage.
– Thus achieving smart engagement.

• This applies to all 3 levels (macro, meso, micro) of
our multi-level Technology & Society diagram.

For many countries where
central government direction is
the norm, 3-helix thinking is
premature.
• Indonesia, Mongolia
• USA: Industry lobbying government
presents a slightly different problem...

In sum, the problem is not disengagement, but mis-engagement
among governments, people,
organizations and products, due to:
• Speed differentials (i.e., poor timing)
• Lack of vision
• Poor design of information & communication
products and programs.
– Lack of feedback
– Excess complexity, leading to slow comprehension and
adoption
– Excess technology push (solutions without problems)
– Excess demand pull (unrealistic expectations)
– Other factors

SUNY Korea’s research agenda
• Combine social science and computer science...
• To find principles of IT design that more quickly
lead to engagement that is...
– Well-timed
– Smart
– Satisfying

• Among
– Individuals
– Businesses
– Government institutions
– Technology developers

• With secure applications in several techno-policy
domains (health, energy, etc.).

Some Implications
• For IT: Meeting users halfway
• For managers: Engagement plans for
each constituency
• For theorists:
– Modeling the moderating effect of buffering
institutions
– Impact of coalitions on the 3-helix dynamic

An aside: Spatializing
an innovation
diffusion model
F. Phillips, On S-curves and Tipping Points. Tech.
Forecasting & Social Change, 74(6), July 2007,
715-730.
Alan M. Turing, The chemical basis of morphogenesis. Philosophical Transactions of the
Royal Society of London. B 327, 37–72 (1952)
http://www.cgjennings.ca/toybox/turingmorph/

References
• http://davidsasaki.name/2013/01/beyond-technology-fortransparency/
• A. Charnes, S. Littlechild and S. Sorensen, “Core-stem Solutions of
N-person Essential Games.” Socio-Econ. Plan. Sci. Vol. I, pp. 649-660 (1973).
• David Watson The Engaged University. Routledge, 2013.
• Dion Hinchcliffe, “Does technology improve employee engagement?”
Enterprise Web 2.0, Nov. 5, 2013. http://www.zdnet.com/doestechnology-improve-employee-engagement-7000021695/
• Jonathan Bard, Boaz Golany and Fred Phillips, “Bubble Planning
and the Mathematics of Consortia.” Third International Conference
on Technology Policy and Innovation, Austin, Texas, September,
1999.
• F. Phillips, The state of technological and social change:
Impressions. Technological Forecasting & Social Change. 78(6), July
2011, 1072-1078.

감사합니다
Thank you
fred.phillips@stonybrook.edu
fp@generalinformatics.com

A Network Analysis of Web-Citations
Among the World’s Universities
George A. Barnett
Department of Communication
University of California, Davis
gbarnett@ucdavis.edu
Daegu Gyeongbuk International Social Network
Conference
December 12-14, 2013

Research Aims
• Network analysis of URL-citations among
– 1,000 universities with the greatest presence on the WWW (1 million
edges)
– In 58 different countries
– Multi-level analysis (both universities & countries)

• Antecedent factors that determine the network’s
structure
– University level
• Physical distance
• Same country
• Language of instruction
• Size
• Ph.D. granting
• Prestige
• Research Excellence (Nobel Prizes)
– National level
• Hyperlink Connections
• International Bandwidth Capacity
• GDP, GDP/capita
• International Student Flows
• Nobel Prizes
Data—Web-Citations
• Web-citations among universities collected using Google
– 2,100 × 2,100 matrix of universities (4,407,900 cells) generated
– Search query:
“university A webdomain” site:university B webdomain
"harvard.edu" site:stanford.edu
– Not all URL-citations are links, e.g., email addresses in coauthored
papers
– Removed universities with no ties & the smaller of a university’s
multiple domains; retained the 1,000 most interlinked universities
– Matrix of inter-citations aggregated to the national level
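
The query pattern above can be generated for every ordered pair of domains with a few lines of Python. This is only a sketch of how such queries might be built, not the author's collection script; the function names and the third example domain are illustrative. It also shows that 2,100 domains give 2,100 × 2,099 = 4,407,900 ordered pairs, which matches the cell count reported above if the diagonal is excluded.

```python
from itertools import permutations


def url_citation_query(cited_domain, citing_domain):
    """Search string for mentions of one domain on pages of another domain."""
    return f'"{cited_domain}" site:{citing_domain}'


def all_queries(domains):
    """One query per ordered pair of distinct domains."""
    return [url_citation_query(a, b) for a, b in permutations(domains, 2)]


def n_ordered_pairs(n):
    """Number of off-diagonal cells in an n x n matrix."""
    return n * (n - 1)


queries = all_queries(["harvard.edu", "stanford.edu", "ucdavis.edu"])
print(queries[0])
print(n_ordered_pairs(2100))
```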

Data--Antecedents
University Level
Physical Location
− Google Maps
Country
− ccTLD of website (USA: .edu)
Language of Instruction
− Country of University (India & Singapore: English)
Size of University
− Europe: EUMIDA
(http://thedatahub.org/dataset/eumida)
− U.S.: College Handbook 2012
− Asia, Africa, Oceania, Latin America & Canada:
Universities’ Websites
Prestige
− U.S. News, World’s Best Universities 2012
http://www.usnews.com/education/
Nobel Prizes
− http://www.nobelprize.org

Data--Antecedents
National Level
Total Hyperlinks

− Barnett & Park (2012)

International Internet Bandwidth,
GDP & population
− TeleGeography (2012)
(http://www.telegeography.com/)
Student Exchange

− UNESCO (http://stats.uis.unesco.org/unesco)

International Co-authorships

− Leydesdorff & Wagner (2008)

International Citations

− Science Citation Index

Results - Universities
• Over 9.6 million links among 1,000 universities
• Density = .606
• Mean # of Links = 24.0; S.D. = 2,208.6
• Greatest # of links (322,000)
– Universität Trier & Rheinisch-Westfälische
Technische Hochschule Aachen, two German
institutions that host huge & popular bibliographic
systems (DBLP & SunSite)

Results – Clusters of Universities
Cluster

Defining Attributes

1. German, Swiss & Italian, not English, central, low prestige, less bandwidth
connections
2. English (U.S., Canada, U.K., Australia), central, high prestige, strong bandwidth
connections
3. Low prestige, peripheral, less bandwidth connections
4. English, not French, peripheral, no Ph.D.s, strong bandwidth connections
5. Continental Europe, not English
6. Chinese, less bandwidth connections
7. French, not English, peripheral, lower prestige
8. English, primarily (Jesuit Institutions), peripheral, low prestige
9. English, peripheral
10. Japanese & other Asian, peripheral, little bandwidth connections

Results - National
• N = 58 Countries
• Density = .924
• United States most central, followed by Germany, U.K., Canada
– >30% of links ; >4 million outward & 1.9 million inward
– Eigenvector centrality 10 times > Germany

• Gini = .672, a core-periphery structure

– U.S. (359), Germany (67), U.K. (67) & Canada (38) account for 53.1% of the universities
– These four nations account for 68.3% of the links
– Links distributed by power law; concentrated in a few countries

• Cluster Analysis – 1 group of countries centered about U.S. & U.K.

Results – Predicting the Structure of
the University URL-citation Network
• Physical Distance Between Campuses
– QAP Correlation = .005; no relationship between
physical distance and web-citations

• Same Country
– QAP Correlation = .065
– Links: 78.4% domestic; 21.6% international
– No Links: 6.1% domestic; 93.9% international
– Mean Link Strength: 1,415 domestic; 42.5
international

• Web-citations tend to be domestic
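
For readers unfamiliar with QAP, the idea can be sketched as follows: correlate the off-diagonal cells of two matrices, then repeatedly relabel the rows and columns of one matrix together to see how often a correlation at least as strong arises by chance. This is a minimal pure-Python illustration, not the software used in the study; the function names, the number of permutations and the small matrices are all made up.

```python
import math
import random


def off_diagonal(matrix, order=None):
    """Flatten the off-diagonal cells, optionally relabelling rows and columns."""
    n = len(matrix)
    order = order if order is not None else list(range(n))
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(n) if i != j]


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def qap_correlation(a, b, permutations=1000, seed=0):
    """Return (observed correlation, permutation p-value) for matrices a and b."""
    rng = random.Random(seed)
    x = off_diagonal(a)
    observed = pearson(x, off_diagonal(b))
    order = list(range(len(a)))
    at_least_as_strong = 0
    for _ in range(permutations):
        rng.shuffle(order)  # same relabelling applied to rows and columns of b
        if abs(pearson(x, off_diagonal(b, order))) >= abs(observed):
            at_least_as_strong += 1
    return observed, (at_least_as_strong + 1) / (permutations + 1)


# Made-up example: citation counts among four universities, three of which
# share a country (same_country[i][j] = 1 when i and j are in the same country).
citations = [[0, 5, 4, 0], [6, 0, 5, 1], [4, 4, 0, 0], [0, 1, 0, 0]]
same_country = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]]
r, p = qap_correlation(citations, same_country, permutations=200, seed=1)
print(round(r, 3), p)
```

Permuting rows and columns together keeps each matrix's internal structure intact, which is why QAP is preferred over an ordinary significance test for dyadic data.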

Results – Predicting University
Centrality in Network --Correlations

Results – Predicting University
Centrality in Network -- Regression
In-degree
R2
F
P

Size (log)
English
Bandwidth
Rating

Out-Degree

.350
47.94
.000
ß

.279
-.025
.268
.465

Betweenness

.489
85.16
.000
t

6.49
-.516
5.70
10.53

all p< .001, except English for In-degree

ß

.123
.356
.302
.323

t

3.22
8.50
7.31
8.25

Eigenvector

.579
122.25
.000
ß

.282
.185
.336
.502

t

8.13
4.86
8.94
14.12

.310
39.94
.000
ß

.150
.214
.208
.348

t

3.36
4.40
4.33
7.65

Results – Predicting the Structure of the
URL-citation Network-National Level
• QAP Correlations with National Level Network
– Co-Authorships .772
– Citations .967
– Hyperlinks .545
– Student Flows .270
– Missing Data: N = 52 on all except Student Flows,
N = 48

Results – Predicting Nation’s Centrality
in Network --Correlations

Results – Predicting National Centrality
in the Network -- Regression
In-degree
.524
33.78
.000

35.12

.670

ß

R2
F
P

Out-Degree

ß

t

Nobels
English
Population .482 4.80
GDP/capita .722 7.19
GDP

.000

t

.184 2.27
.398 4.70
.797 9.28

All relations are significant p < .02

Betweenness
22.99
ß

.505
.000
t

.443 4.33
.720 7.03

Eigenvector
.642

31.05
.000
ß t
.553 5.07
.183 2.15
.258 2.41

Discussion
• So where is academic knowledge produced?

– Primarily at prestigious English speaking institutions in the U.S.A. &
U.K. , but also in Canada & Germany

• Distance is unrelated to dissemination & collaboration via the
Internet
• Universities tend to link to others from the same country
• Ten clusters- One composed of most prestigious institutions,
suggesting exchanges of knowledge among this group
• Centrality predicted by university size, its prestige (whether it
offered doctoral degrees, its U.S. News ranking, the number of
its faculty’s Nobel Prizes), language of instruction (English), &
the nation’s international bandwidth capacity

Discussion
• At the national level, the countries formed a single group
centered about the U.S. & the U.K.
• U.S. is the most central, followed by Germany, U.K. & Canada
– They accounted for the majority of the universities in the network

• The International Network has a core-periphery structure
with a few countries accounting for the majority of the links
• International co-authorships, citations, student exchanges &
the number of links among the individual countries are
strongly predictive of the network’s structure
• Centrality is predicted by a country’s population & GDP;
depending on the measure, it may also be predicted by
language of instruction (English) & the number of Nobel Prizes

Discussion
• Results are consistent with Seeber, et al. (2012)
– European university hyperlink network displays a
center-periphery structure
– centrality a function of the universities’ reputation
– This study extends their conclusions to the global
academic community

Discussion
• Consistent with Ortega & Aguillo (2009)

– “The world-class university network graph is comprised of national
sub-networks that merge in a central core where the principal
universities of each country pull their networks toward international
link relationships. This network rests on the United States, which
dominates the world network in conjunction with the aggregation of
the European ones, especially the British and the German subnetworks. This situation may be caused mainly by the technological
development of these countries and the production of international
content, that is, English web pages. This second reason might explain
the apparent backward situation of some East Asian countries.”

• World Systems Theory

– Telephone (Barnett, 2001, 2012)
– Internet (Barnett & Park, 2005, 2012; Park, Barnett & Chung, 2011)
– Student flows (Barnett & Wu, 1995; Chen & Barnett, 2000; Jiang,
2013)
– Patents, trademarks and copyrights (Nam & Barnett, 2011).

Discussion
• Global academic community as a self-organizing system
– Academic network may be considered an autopoietic or self-replicating system
– Evolved from traditional scientific activities (co-authorship,
citing the research of others & other behaviors that required
the sharing of information among scholars)
– Krippendorff defines an autopoietic system as “a network of
processes that produces all the components necessary to
embody the very process that produces it”. The network
recursively produces its components through the interaction
in this historical reproductive network of postings on
university websites & links among institutions

Discussion
• There are environmental constraints that limit the
possible states into which this system may evolve
• issues of information property
• policies of individual universities & national governments
• scientific funding agencies (U.S. National Science Foundation)

• Academic networks co-evolved with other global
institutions
• Universally, higher education is developing common
curricula especially in the sciences (Lechner & Boli,
2005). This seems to be reflected in pattern of
universities’ hyperlinks and web-citations

Thank you!
See:
Barnett, G.A. , Park, H.W., Jiang, K, Tang, C, & Aguillo, I.F., (2013),
“A multi-level network analysis of web-citations among the
world’s universities”, Scientometrics, DOI 10.1007/s11192-013-1070-0

Virtual Knowledge Studio (VKS)

“Webometrics Studies” Revisited
in the Age of “Big Data”
Asso. Prof. Dr. Han Woo PARK
CyberEmotions Research Institute
Dept. of Media & Communication
YeungNam University
214-1 Dae-dong, Gyeongsan-si,
Gyeongsangbuk-do 712-749
Republic of Korea
www.hanpark.net
cerc.yu.ac.kr
eastasia.yu.ac.kr
asia-triplehelix.org

Big data
• The term “big data” refers to “analytical technologies that
have existed for years but can now be applied faster, on
a greater scale and are accessible to more users” (Miller,
2013).
• Big data sizes may vary per discipline.
• Characteristics: Gartner’s 3Vs plus SAS’s VC and IBM’s
Veracity
- Volume (amount of data), Velocity (speed of data in and
out), Variety (range of data types and sources)
- Variability: Data flows can be highly inconsistent with
daily, seasonal, and event-triggered peak data loads
- Complexity: Multiple data sources requiring cleaning,
linking, and matching the data across system
- Veracity: 1 in 3 business leaders don’t trust the
information they use to make decisions.
http://en.wikipedia.org/wiki/Big_data
http://www-01.ibm.com/software/data/bigdata/

http://www.emc.com/leadership/digitaluniverse/iview/executive-summary-a-universe-of.htm

http://www.emc.com/leadership/digitaluniverse/iview/images/impact-ofconsumers-lg.jpg

Data-driven research that focuses on
extracting meaningful data from techno-socio-economic
systems to discover hidden patterns.

Today’s “big” is probably tomorrow’s “medium” and
next week’s “small” and thus the most effective definition of “big data” may be derived when the size of data
itself becomes part of the research problem.
Loukides (2012)

Introduction


Webometrics is broadly defined as the study of web-based
content (e.g., text, images, audio-visual objects, and
hyperlinks) with primarily quantitative indicators for
social science research goals and visualization techniques
derived from information science and social network
analysis.

• Han Woo Park
- “hidden” and “relational” data about

lots of people as well as the few
individuals, or small groups

• Lev Manovich
- “surface” data about lots of people (i.e.,
statistical, mathematical or computational
techniques for analyzing data)
- “deep” data about the few individuals or small
groups (i.e., hermeneutics, participant
observation, thick description, semiotics, and
close reading)

First type of Webometrics
• Hyperlink Network Analysis
- Inter-linkage: who linked to whom matrix
- Co-inlink : a link to two different nodes from a third node
- Co-outlink : A link from two different nodes to a third node

Björneborn (2003)
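
The three link types can all be derived from a single who-linked-to-whom matrix. The sketch below assumes a binary adjacency matrix in which `adj[i][j] = 1` means site i links to site j; the function name `co_link_matrices` is hypothetical and not taken from any webometrics tool.

```python
import numpy as np

def co_link_matrices(adj):
    """Return (co_inlink, co_outlink) count matrices for a directed link matrix.

    co_inlink[i, j]  = number of third nodes that link to both i and j
    co_outlink[i, j] = number of third nodes that both i and j link to
    """
    a = (np.asarray(adj) != 0).astype(int)
    np.fill_diagonal(a, 0)          # ignore self-links
    co_inlink = a.T @ a             # shared sources
    co_outlink = a @ a.T            # shared targets
    np.fill_diagonal(co_inlink, 0)  # a node is not co-linked with itself
    np.fill_diagonal(co_outlink, 0)
    return co_inlink, co_outlink

# Node 0 links to nodes 1 and 2, so nodes 1 and 2 are co-inlinked once.
example = [[0, 1, 1],
           [0, 0, 0],
           [0, 0, 0]]
co_in, co_out = co_link_matrices(example)
```

Both results are symmetric, which is why co-link networks are usually drawn as undirected graphs even though the underlying hyperlinks are directed.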

Inter-link network analysis diagram among Korean e-science sites within public domain

WCU
WEBOMETRICS
INSTITUTE

Mapping the e-science landscape
In South Korea using the Webometrics method

Co-inlink network analysis


Findings
As seen in Figure 4, the network structure shows a clear butterfly pattern. There is one hub (ghism)
that belongs to Park Geun-Hye (Park GH, www.cyworld.com/ghism), the daughter of ex-president
Park Jeong-Hee and one of two major GNP candidates (along with president-elect Lee MB) in the
2007 presidential race.

Figure 4: Cyworld Mini-hompies of Korean legislators

How do social scientists use link data
from search engines to understand
Internet-based political and electoral
communication?

INVESTIGATING INTERNET-BASED POLITICS WITH E-RESEARCH TOOLS

Case 2. Cyworld Mini-hompies of Korean Legislators

Sociology of Hyperlink Networks of Web 1.0,
Web 2.0, and Twitter
A Case Study of South Korea

Introduction
‣ Online & offline lives ➭ co-constructing (e.g. Beer & Burrows, 2007)
‣ Politicians communicate with their constituencies using different platforms
‣ Questions:
- What are the structural similarities and/or differences in South Korean
politicians’ networks from Web 1.0 to Web 2.0 (and Twitter)?
- Are online structures similar to structures in the physical world?
- Are online patterns affected by offline relationships?
‣ Related studies conducted:
- online social network analysis
- online networks in Web 2.0
- role of Twitter on online politics

2001

2000

‣ 59 isolated in 2000
‣ more centralised in 2001
‣ network of 2001 ➭ a ‘star’ network
- might be affected by political events
➭ presidential election in 2001

Web 1.0

2005

2006

‣hubs disappearing
‣easy use of blogs
‣Clear boundaries between different parties
‣strong presence of GNP Assembly members
➭ party policy on using blogs
Web 2.0

Politician Twitter Network (Following and Mention
Network)

Conclusion

Politicians Twitter Following-follower Network

Politicians Twitter Mention Network

Bi-linked network of politically active
A-list Korean citizen blogs (July 2005)
URI=Centre
DLP=Left
GNP=Right

Just A-list blogs exchanging links with politicians

Affiliation network diagram using pages
linked to Lee’s and Park’s sites

N = 901 (Lee: 215, Park: 692, Shared: 6)

Tweets on the name of S. Korea president


Viewertariat Networks:

A Study of the 2012 South Korean Presidential Debate

Park’s network

Moon’s network

Reply-To Networks of Park’s & Moon’s
Facebook page visitors during TV debates

“Those studies perpetuate the idea that linking
behaviour is not random, and that links are ‘socially
significant in some way’. In this perspective, links
have an ‘information side-effect’, they can be used
to understand other facts even though they were
not individually designed to do so: ‘information
side-effects are by-products of data intended for
one use which can be mined in order to understand
some tangential, and possibly larger scale,
phenomena’

Park and his colleagues were
extensively cited: 9 times!
• Barnett GA, Chung CJ and Park HW (2011) Uncovering transnational hyperlink patterns
and web mediated contents: a new approach based on cracking .com domain. Social
Science Computer Review 29(3): 369–384.
• Hsu C and Park HW (2011) Sociology of hyperlink networks of Web 1.0, Web 2.0, and
Twitter: a case study of South Korea. Social Science Computer Review 29(3): 354–368.
• Park HW (2003) Hyperlink network analysis: a new method for the study of social
structure on the web. Connections 25(1): 49–61.
• Park HW (2010) Mapping the e-science landscape in South Korea using the
webometrics method. Journal of Computer-Mediated Communication 15(2): 211–229.
• Park HW and Jankowski NW (2008) A hyperlink network analysis of citizen blogs in
South Korean politics. Javnost: The Public 15(2): 5–16.
• Park HW and Thelwall M (2003) Hyperlink analyses of the World Wide Web: a review.
Journal of Computer-Mediated Communication 8(4).
• Park HW and Thelwall M (2008) Developing network indicators for ideological
landscapes from the political blogosphere in South Korea. Journal of Computer-Mediated
Communication 13(4): 856–879.
• Park HW, Kim C and Barnett GA (2004) Socio-communicational structure among political
actors on the web in South Korea. New Media & Society 6(3): 403–423.
• Park HW, Thelwall M and Kluver R (2005) Political hyperlinking in South Korea: technical
indicators of ideology and content. Sociological Research Online 12(3).

A comment from those who are
NOT doing a hyperlink analysis
• In a chapter of The Sage Handbook of
Online Research Methods edited by
Fielding et al. (2008), Hogan emphasizes
that ‘link analysis’ has become an active
research domain in examining social
behavior online.


A threat to Webometrics
• The key application in this area is to collect
incoming, outgoing, inter-linking, and
co-linking data from search engines
- AltaVista in the early 2000s
- Yahoo renewed AltaVista’s hyperlink
commands via “Site Explorer” and its API
- Yahoo discontinued its API option for
interlinkage data in April 2011, and finally
stopped its popular Site Explorer service in
November 2011

http://cybermetrics.wlv.ac.uk/QueriesForWebometrics.htm

A new proposal
• Mike Thelwall
- URL citation searches with the Bing search
API facilities
• Liwen Vaughan
- Incoming hyperlinks from Alexa.com
Can these "alternative" techniques be
acceptable for scientific publishing?

A new proposal : SEO Tools
• Search Engine Optimization Tools
- http://www.majesticseo.com/
- http://www.opensiteexplorer.org/
- https://ahrefs.com/

Enrique Orduña-Malea & John J.
Regazzi (2013). Influence of the academic
Library on U.S. university reputation:
a webometric approach. Technologies. 1,
26-43, http://www.mdpi.com/2227-7080/1/2/26

Webometrics Ranking of
World Universities
The link visibility data is collected from the two
most important providers of this
information: Majestic SEO and ahrefs.
Both use their own crawlers, generating different
databases that should be used jointly for filling
gaps or correcting mistakes.
The indicator is the product of square root of the
number of backlinks and the number of
domains originating those backlinks, so it is not
only important the link popularity but even
more the link diversity.
The maximum of the normalized results is the
impact indicator.
http://www.webometrics.info/en/Methodology

Interlinkage among world universities
• Barnett, G.A., Park, H. W., Jiang, K., Tang, C.,
& Aguillo, I. F. (2013 forthcoming). A Multi-Level
Network Analysis of Web-Citations
Among The World’s Universities.
Scientometrics*.
Isidro F. Aguillo
“Large interlinking matrix (1000*1000) are no
longer possible to obtain. Perhaps national
academic systems (200 or 300 institutions)”

Intentional inattention
among Information Scientists?
• Robert Ackland (2013). Web Social Science.

- http://voson.anu.edu.au/
• Richard Rogers (2013). Digital
Methods.
- https://www.issuecrawler.net/index.php
- https://www.digitalmethods.net/Dmi/ToolDatabase

Let us move to Web Visibility Analysis
Frequently occurring key words in e-science webpages in Korea

Created on Many Eyes(http://many-eyes.com)

Words are larger according to the frequency of their occurrence but their
positions are randomly-chosen for the best visualization


Websites retrieved more than two times

Note: Websites are larger according to their frequency of retrieval; however, their
colors and locations are randomly-chosen for the best visualization


2nd type of Webometrics: Web Visibility


Web visibility as an indicator of online political power



Presence or appearance of actors or issues being discussed by
the public (Internet users) on the web.
Tracking web visibility is a powerful way to gain insight into
public reactions to actors or issues.



Recent studies indicate a positive relationship
between politicians’ web visibility level and election results.



Also, the co-occurrence web visibility between two
politicians represents their hidden online political
relationships based on the public perception.

Results – Web Visibility (co-occurrence)

Results – Correlation & Path Analysis
Correlation
              1 (N=278)   2 (N=278)   3 (N=234)
1 Finance     1           0.420**     0.101
2 Web                     1           0.184**
3 Vote                                1

Spearman Correlation
              1 (N=278)   2 (N=278)   3 (N=234)
1 Finance     1           0.513**     0.090
2 Web                     1           0.163*
3 Vote                                1

Political finance’s indirect effect = .076
Note. * p<.05, ** p<.01

Results – QAP Correlation
1
1 Committee
2 Constituency

2

3

1

0.004

-0.016

1

3 Party
4 Gender
5 Age
6 Incumbent
7 Web
8 Finance

Note. * p<.05, ** p<.01

4
0.025

0.097** -0.007
1

0.027
1

5
-0.021

6

7

8

-0.074**

0.045** -0.037**

-0.043** -0.064**

0.105** -0.119**

-0.045*

-0.050*

0.242** -0.094**

0.024

0.031

1

0.179** -0.051*

0.049*

1

0.098**

0.041

-0.060**
1

-0.224**

-0.158**
1

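Using e-research tools: web visibility analysis


In the blogosphere, candidates’ web visibility levels were closely
correlated with their vote counts (Lim & Park, 2010, JKDAS).
Actual votes
29,120

Average number of blogs
19,427

14,218

3,071 2,125

504

Kyeong Dae-su, Jeong Beom-gu, Jeong Won-heon, Park Gi-su, Lee Tae-hee, Kim Gyeong-hoe

Results of the 28 October 2009 by-elections
- All winners had high blog visibility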

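I. Characteristics and influence of social media
Case of the 26 October by-election (2)
• A network of names mentioned together on Facebook was
constructed and analyzed
• Early on, the two candidates were mentioned about equally;
in the middle phase, Park Won-soon and his supporters were mentioned while
supporters of candidate Na Kyung-won dropped out of sight;
in the final phase, the network reorganized around Park Won-soon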

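I. Comparing centrality in the semantic network
Case of the 26 October by-election (2)
• Frequency analysis of the words found in the content of
messages about the Seoul mayoral election
• Candidate Na Kyung-won’s frequency fell from the start;
except when she appeared late in the campaign in talk about
the contest with candidate Park Won-soon and the election result,
she stayed on the periphery of the discourse throughout
• The Ahn Cheol-soo effect was large early on and faded after
the midpoint, but frequent mentions of the Grand National Party
revealed sentiment against the ruling party,
which indicates the character of the election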





As Lim & Park (2011, 2013)
claim, the use of web
mentions of politicians’
names is particularly useful
for hierarchically ranking
individual politicians.
However, it may not
sufficiently capture the
entropy probability of an
event (hidden in changing
communication structures)
resulting from the amount of
information conveyed by the
occurrence of that event
(Shannon, 1948).



Taleb (2012) argues that society
can be conceived as a complex
fabric consisting of the extended
disorder family including
uncertainty, chance, entropy, etc.



Therefore, such disorder system
can be better derived from
empirical data mining, not
obtained by a priori theorem.



Uncertainty exists when three or
more events take place
simultaneously and is
increasingly beyond the control of
individual events (Leydesdorff,
2008).



In social and communication
sciences, entropy-based
indicators have been widely
used for exploring entropy
values generated from
university-industry-government (UIG)
relationships.



This “Triple Helix Model”
(THM) can be applied to
the co-occurrence of
two or three terms in
a public search engine
database

Mapping Election Campaigns Through Negative Entropy:

Triple and Quadruple Helix Approach
to Korea’s 2012 Presidential Election

Social media platforms have become a notable venue for Korean
voters wishing to share their opinions and predictions with others
(Park et al., 2011; Sams & Park, 2013).
• Politicians have made increasing use of SNSs to provide updates
and communicate with citizens (Hsu & Park, 2012).
 With the increasing proliferation of smartphones and portable
computers in Korea, SNSs have been widely used for facilitating
political discourse.
 Prior studies have found that Web 1.0 contents tended to contain the
more enduring political and electoral statements of the public in
various contexts.


Introduction


To better understand the dynamics of the 2012 presidential election
in Korea, this study estimates the web visibility of the three major
candidates— Geun-Hye Park (PARK), Cheol-Soo Ahn (AHN), and
Jae-In Moon (MOON)—in the entire digital sphere.

Literature Review
The total probabilistic entropy (uncertainty) produced by changes in one or
two dimensions is always positive, which is in accordance with the second
law of thermodynamics (Theil, 1972, p. 59).
 On the other hand, the relative contribution of each event to the
summation in three or four dimensions can be positive, zero, or negative
(configurational information).
 This configurational information provides a measure of synergy within a
complex communication system. Network effects occur in a systemic and
nonlinear manner when loops in the configuration generate redundancies
in relationships between three or four events (Leydesdorff, 2008).


Method: Data collection







The number of hits for each search query per media
channel (Facebook, Twitter, and Google) was harvested.
The hit counts obtained from Google.com were
employed to look primarily at entropies represented on a
set of digitally accessible documents (e.g., online
versions of newspapers, online word-of-mouth, Web 1.0
contents, etc.).
We measured the occurrence and co-occurrence of the
politicians’ names based on their bilateral, trilateral, and
quadruple relationships by using Boolean operators.
For example, we measured the number of web and
social media mentions referring only to PARK (that is, no
mention of AHN, MOON, or the term “president”).
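
The set of Boolean queries described above can be generated mechanically. The sketch below is only an illustration: the `"term"` and `-"term"` syntax is an assumed, generic search-engine convention, the function name `exclusive_queries` is hypothetical, and the study's actual query strings (for example Korean-language names) are not given in the slides.

```python
from itertools import product

TERMS = ["PARK", "AHN", "MOON", "president"]

def exclusive_queries(terms):
    """Build one query per non-empty subset of terms.

    Each query requires every term in the subset and excludes every
    other term, so the hit counts of different queries do not overlap.
    """
    queries = {}
    for mask in product([True, False], repeat=len(terms)):
        if not any(mask):
            continue  # skip the combination that mentions none of the terms
        included = tuple(t for t, keep in zip(terms, mask) if keep)
        parts = [f'"{t}"' if keep else f'-"{t}"' for t, keep in zip(terms, mask)]
        queries[included] = " ".join(parts)
    return queries

queries = exclusive_queries(TERMS)
print(queries[("PARK",)])   # "PARK" -"AHN" -"MOON" -"president"
print(len(queries))         # 15 queries for four terms
```

Four terms give 15 mutually exclusive queries; their hit counts fill every cell of the presence/absence table except the one where none of the terms occurs.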

Visualization of centrality by SNS medium

Literature Review
Twitter can be very effective to amplify messages particularly in terms of their
one-to-many mode of communication (Barash & Golder, 2010).
 Twitter is viable both as a political news and communication channel
(González-Bailón, Borge-Holthoefer, Rivero & Moreno, 2011; Hsu & Park,
2011, 2012; Otterbacher, Shapiro, & Hemphill, 2013)
 and to citizens who look for platforms for political participation and engagement
(Hsu, Park, & Park, 2013; Kim & Park, 2011; Tufekci & Wilson, 2012).


Literature Review





The mode of information sharing on Facebook differs from that on Twitter.
Facebook functions as a living room where friends talk to one another.
Facebook can be a mixture of interpersonal and mass channels for the sharing of
informational as well as social messages in a context of political campaign (Bond
et al., 2012; Effing, van Hillegersberg, & Huibers, 2011; Robertson, Vatrapu, &
Medina, 2010; Vitak et al., 2011).
Both Twitter and Facebook communications seem to be biased because two
platforms have been particularly dominated by the “2040 Generation”, who are
generally categorized as political liberals in Korea (Kwak et al., 2011).

Research questions


Therefore, it is important to examine which (social) media
conversations are likely to generate more entropy than
others, and which politicians do so:



RQ 1) What (social) media generate (negative) entropy more than
others across different periods?



RQ 2) Which politician (or which pair of politicians) generates
entropy more than others for bilateral, trilateral, or quadruple
relationships across various media and periods?

Method: Measuring (negative) entropy


Figure 1. Binary Entropy Plot



Entropy values (expressed as T for transmission)
for bilateral relationships are, by definition,
positive. Here T is defined as the difference in
uncertainty when the probability distributions of
two incidents (e.g., i and j) are combined. The
mutual information transmission capacity,
expressed in T values, is measured by “bits” of
information (for a more detailed mathematical
definition, see Leydesdorff, 2003):



$H_i = -\sum_i p_i \log_2 p_i$; $H_{ij} = -\sum_i \sum_j p_{ij} \log_2 p_{ij}$,
$H_{ij} = H_i + H_j - T_{ij}$,
$T_{ij} = H_i + H_j - H_{ij}$
(1)
Here Tij is zero if the two distributions are mutually
independent and positive otherwise (Theil, 1972).







On the other hand, T values for trilateral and quadruple
relationships can be negative, positive, or zero depending on the
size of contributing terms. Therefore, it is necessary to compare
the absolute value of each (negative) entropy value when entropy
values are calculated for trilateral and quadruple relationships. In
the case of entropy values for trilateral and quadruple
relationships, the higher the absolute entropy value, the more
balanced the communication system is. Let p denote PARK; a,
AHN; and m, MOON and formulate mutual information in these
three dimensions as follows (Abramson, 1963, p. 129):



$T_{pam} = H_p + H_a + H_m - H_{pa} - H_{pm} - H_{am} + H_{pam}$
(2)



Here we are interested not only in information on mutual
relationships between these three candidates but also in semantic
relationships with respect to the term “president.” Accordingly, we
measure the entropy value by using mutual information in these
four dimensions (here “r” denotes “president”):



$T_{pamr} = H_p + H_a + H_m + H_r - H_{pa} - H_{pm} - H_{pr} - H_{am} - H_{ar} - H_{mr} + H_{pam} + H_{par} + H_{pmr} + H_{amr} - H_{pamr}$
(3)
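
The bilateral, trilateral, and quadruple T values share one pattern: an alternating sum of the entropies of every subset of dimensions. The sketch below shows that pattern in Python with NumPy. It assumes the hit counts have already been arranged as an array with one presence/absence axis per term, which in turn assumes a count for the cell where no term occurs (search hit counts do not supply it directly). The function names and the example counts are hypothetical, not data from the study.

```python
import itertools
import numpy as np

def entropy(p):
    """Shannon entropy in bits; zero-probability cells are skipped."""
    p = np.asarray(p, dtype=float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def marginal_entropy(joint, axes):
    """Entropy of the marginal distribution over the chosen axes."""
    drop = tuple(ax for ax in range(joint.ndim) if ax not in axes)
    marginal = joint.sum(axis=drop) if drop else joint
    return entropy(marginal)

def transmission(counts):
    """Mutual information T over all dimensions of a count array.

    Entropies of single dimensions are added, pairs subtracted,
    triples added, and so on.
    """
    joint = np.asarray(counts, dtype=float)
    joint = joint / joint.sum()          # counts -> probabilities
    total = 0.0
    for k in range(1, joint.ndim + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        for axes in itertools.combinations(range(joint.ndim), k):
            total += sign * marginal_entropy(joint, axes)
    return total

# Hypothetical three-term example in which any two terms look independent
# but the three together are fully determined: T comes out negative.
counts = np.zeros((2, 2, 2))
for x in (0, 1):
    for y in (0, 1):
        counts[x, y, x ^ y] = 1
print(transmission(counts))   # -1.0
```

With a two-dimensional array the same function returns the bilateral value, which cannot be negative; a negative result needs three or more dimensions, as in the example.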

Results


Figure 2. Entropy Values Across Media Channels and Time Periods

Results


Figure 3. T Values for Bilateral and Trilateral Relationships on November 3.

Results


Figure 4. T Values for Bilateral Relationships between Park and Moon

Discussion and conclusions






Twitter has scored the most negative entropy
values and Facebook followed. Google came last.
This indicates that Twitter is the most open
communication system.
The entropy values for the liberal candidates (AHN and
MOON) have been higher than those of their conservative
opponent PARK on social media rather than in the Google
sphere.
This may not be surprising because both Twitter
and Facebook have particularly appealed to
Korean citizens from their late teens to
early 40s.

Discussion and conclusions
PARK’s entropy has been slightly higher on
Google than her liberal challenger MOON.
• PARK was successful in garnering strong support
from senior voters in their 50s and 60s, who accounted
for 39% of the population, up from 29% a decade
ago (Wall Street Journal, 2012).
• Exit polls also revealed that PARK gained support
from 62% of voters in their 50s and 72% of voters
in their 60s. Indeed, the most significant statistic of
the election was turnout: 65.2%, 72.5%, and 78.7% of
South Koreans in their 20s, 30s, and 40s voted,
respectively, whereas 89.9% of those in their 50s and 78.8%
of those over 60 went to the polling booth.


Paper-code

Keynote Speech

“Creativity and TRIZ” for the Knowledge Network
Analysis in the Emerging Big Data Research

- DISC 2013, 2013. 12. 14.
Dr. Jae Ho Park, Ph.D.
Managing Director of GRCIOP
Professor Emeritus Jae Ho Park
Yeungnam University

Curriculum Vitae


December 14, 2013

Professor emeritus Jae H. Park, Ph.D
-

Professor Emeritus , Industrial and Organizational Psychology,

Yeungnam University, South Korea
- Chairman, Global TRIZ Conference, Organizing Committee
- Chairman, Korean Society of Creativity
- Managing Director, GRCIOP Research Center
- Senior Advisor, ICEDR (International Consortium for Executive
Development Research), Boston, USA
- Ph.D., Organizational Psychology, Goettingen University, Germany
- MA, Social Psychology, Seoul National University
- BA, Seoul National University

<Academic Career> -

 Harvard University, Research Professor. USA
 University of Michigan, Exchange Professor, Ann Arbor, Michigan, USA
 Yokohama National University, Research Fellow Professor, Japan
 CSPP(California School of Professional Psychology), Teaching Professor, 1999-2000
 Senior Advisor, ICEDR(International Consortium for Executive Development Research), USA
 Visiting Professor, Meio University, Japan, current
Partner, THT Cross-cultural Consulting, Amsterdam, the Netherlands
 Partner, SYMLOG Consulting Group, San Diego, USA
• Licensee, Center for Creative Leadership (CCL), Greensboro, USA
 Partner, Global Integration, UK


<International Consulting and Training>
 Samsung Electronics; Creativity and Innovation “Change Begins with Me”
Samsung New Management, Train the trainers for 6,000 managers.

 JMA(Japan Management Association and FMIC(Future Management and

Innovation Consulting, Japan ), SYMLOG Diagnosis, Team-building and

Coaching, Tokyo, Japan

- LG Philips Displays, M & A Process Consultation, Coaching, Diagnosis

 LG Electronics, DAC(White electronics Division), Changwon, Korea

 Hyundai Motor Company, Creativity and Innovation Program, Korea
 Samsung Electronics, Large Scale Change, Korea
 BorgWarner, Detroit, USA
 Ericsson, Sweden

 Applied Materials Korea, Coaching and Consultation, Seoul, Korea
 Goldman Sachs, Integration Project Coaching, with THT Consulting Group, 2007
 MetLife, Coaching for Asset Managers, 2007
 Mirae Assets Stock Company, Creativity Coaching, 2010

 Team-building and Innovation, Trondheim University, Norway


<International Network>
• Center for Creative Leadership, Partner, Licensee, North Carolina, USA
 SPGR Consulting, Oslo, Norway
 JMAC(Japan Management Association Consulting) Tokyo, Japan
 SYMLOG Consulting Group, Researcher and Partner, San Diego, USA
 Global Integration, Partner, London, United Kingdom

 Japan Creativity Research Center, Partner, Tokyo, Japan

 THT Cross-cultural Consulting(Trompenaars & Turner), Amsterdam, Partner,
the Netherlands

 ICEDR(International Consortium for Executive Development Research) Boston, USA

<Consultant and Advisor >
 Samsung HRD
Center
 Samsung Electronics
 Samsung SDI
 LG Education Center
 LG Electronics
 POSCO HRD Center

<Contact>
Phone; 82-53-810-2230(Office)
Fax; 82-53-810-4610
Mobile; 82-10-8751-7579
email; grciop@gmail.com

TRIZ Founder

G. S. Altshuller
(1926~1998)

Father of TRIZ
Global TRIZ Conference 2013 | www.koreatrizcon.kr
Seoul Trade Exhibition & Convention, Seoul, Korea | July 09-11, 2013


What is TRIZ ?

TRIZ is a tool for thinking,
but not instead of thinking

G. Altshuller

Change of major discipline


From Tools to Subjects
 Labor : Human

Robot

Creativity

Innovation in Global companies


1. Toyota Method
2. QFD
3. TOC
4. TRIZ
5. 6 Sigma
6. Taguchi Method
7. 7 Tools of Product Design


• Research Areas

◦ Understanding creative cognition and
computation
◦ Creativity to stimulate breakthrough in
science and engineering
◦ Educational approaches that encourage
creativity
◦ Supporting creativity with IT


• INSA Strasbourg
http://www.insa-strasbourg.fr/en/news/news.html
Advanced Master of Innovative Design
5 Semesters for Intensive TRIZ
In operation for 9 years

Edison and Altshuller
• Everybody can be an Inventor
• TRIZ Diffusion; No cost
• Developed TRIZ in Prison
• Benevolent Mentor
• (Dialectics; ideal Communist)


• TRIZ
• Analyzed many Patents
• By Creative Problem Solving Methods
• Inductive Research Methods


Various views on TRIZ
• From Knowledge Management
• From 6 Sigma
• From Engineering Design
• From Innovation
• From Creativity
• From R&D
• Etc…


TRIZ as a Science
Technical
Systems

Social
Systems

Natural
Systems

TRIZ
N&A Narbut, 2003


5 Levels of Invention
① Apparent Solution (32%)
- Simple
② Simple Improvement within current system (45%)
③ Major improvement (18%)
- within same science
④ Innovation within current system (4%)
- Application of a different science principle
⑤ Pioneer Invention (1%)
- New principle and Paradigm Shift


Effects in TRIZ
Effects
Systematized
Information funds

Trends

Su-F

Development

Models

ARIZ,
Standards

N&A Narbut, 2003


Common Approach

TRIZ

Innovation involves the
creation of new ideas

Innovation involves
adapting existing ideas

Trained in the notion of the
‘great idea’. Popular
mythology - “Einstein” as
model. Belief that ‘six
months in the lab beats one
hour spent in the library’.

Tap existing solutions. Look
outside of discipline and to
Nature. Key benefit:
reduces perceived risk of
innovation (predictable,
higher chance of success).


Korea; Creative Economy via
Creativity : Expansion & Convergence

Pie

Bibimbap


Creativity and TRIZ


*

Korea Academic TRIZ Association

Industry-Academia Knowledge sharing
Contributor for industry competitiveness and
creative talent by TRIZ
Founded in May 2010
 Participating of Univ. & Co.

 Homepage: www.katatriz.or.kr

32 Co.

29 Univ.


Main Activities
Expanded use of
TRIZ and social
contribution

Evolution

Nurturing
creative talent

MATRIZ & KATA
MOU

Problem-solving,
Patent-creation

Biz. TRIZ research

Univ. professor
Workshop

Anti-school violence
program
TRIZ education
Charity fair

TRIZ Youth Academy
Lectures
for SMEs

Consulting for SMEs
problem-solving

Technical TRIZ
application
2010

2011

2012

2013

Time

TRIZ Activities in Korea


Company : Development of Innovative Products,
Problem-Solving and Patents Creation

 Core tech & innovative product

 Foundation of TRIZ Univ.

 TRIZ Elite

 Development of POSCO methodology

 TRIZ research group

 Internal TRIZ Conference

 Mixing DFSS & TRIZ

 Strategic R&D patent creation

 Patent creation

 On-site TRIZ process designed to

 TRIZ research group

improve on-site work performance

TRIZ Activities in Korea


University : Utilizing TRIZ in subject of “Creative design”
POSTECH
 Master course curriculum
 TRIZ Project organization

YONSEI
 Creative engineering education
 Inter-discipline activities and courses
 Engineering certification program

HANYANG
 Creative design education
 Business management and
creative design curriculum

POLYTECHNIC
 Mechanical engineering-focused courses
 KOREA/RUSSIA cooperation center

※ TRIZ application supported by the government and research institutions
(i.e. Ministry of Trade, Industry and Energy and ETRI)

Systematic innovator
Learn and practice by yourself.
Participate as a member of TRIZ Association (Daegu-Gyungbuk Regional Association): via Band


Recognition that
 (technical) systems evolve
 Towards the increase of ideality
 By overcoming Contradiction
 Mostly with minimal introduction of (free) Resources
Thus, for creative problem solving
• TRIZ provides a dialectic way of thinking, i.e.,
– To understand the problem as a system
– To imagine the Ideal solution first
– And solve the Contradiction


GRCIOP Global Network
ICEDR (International Consortium for Executive Development Research) (USA)
Global Integration (United Kingdom)
SYMLOG Consulting Group (USA)
Center for Creative Leadership (USA)
THT Consulting (the Netherlands)
Endre Sjovold Association (Norway)

The Geopolitics of New Media
RANDY KLUVER
TEXAS A&M UNIVERSITY

The context
• The rise of “new media” has transformed politics, economics, and societies.
• But “Internet Studies” as a field ignores the geopolitical issues associated with the rise of new media technologies.
– Lots of emphasis on “politics” and the internet, but little on the relations between states
– “Arab Spring” events occur, but the focus remains primarily on a domestic context
• Likewise, traditional IR theory focuses primarily on elite-level strategy, and doesn’t have the tools to account for publics.

Issue 1: The implications of a “networked” globe on geopolitics
• Shifting configurations of influence
– Networked, rather than hierarchical
• Highly transnational
– “Foreign” vs. “domestic” doesn’t capture the reality
• The conversation has become global, especially among elites
– Values
– Politics
– Economics
• But influence depends on your connectedness to the global conversation
– Thus, dependent on access to technological infrastructure

Example: Influential players in discourse
surrounding the Egyptian coup weren’t Egyptian!

But where was the Muslim Brotherhood?

Constraints on global networks
 Language
 Technological diffusion
 Domestic politics/economic priorities
 Platforms/applications

Should networks follow language groups?

English as the dominant carrier of global
conversation

Peer to Peer Diplomacy:
Global Social Network Usage

Twitter’s global web traffic
(not counting SMS, IM, etc.)

P2PD:
China’s exclusion from “Facebook friendships”

South Korea’s Facebook friendships

Russia’s Facebook friendships

Public Diplomacy: Twitter targets

Public Diplomacy: E-diplomacy index

Issue 2: Information Access/Control
 Crowd Sourced
 Unprecedented access to sensitive information
 Stratified
 Customized

“The spread of information networks is forming a
new nervous system for our planet. When something
happens in Haiti or Hunan, the rest of us learn about
it in real time – from real people.”
US Secretary of State Hillary Clinton, 2010

Wikileaks: Crowd-sourced espionage or invaluable public service?
• Revealed US war plans and operations, as well as diplomatic secrets
• Led to multiple recriminations, including attempted assassination of Saudi ambassador
• Snowden: hero or traitor?

The value of geographic knowledge

Issue 3: Policies
• Re-articulation of “national interest”
– Alec J. Ross and “21st Century Statecraft”
– “addresses new forces propelling change in international relations that are pervasive, disruptive, and difficult to predict.” US Dept of State
• Perhaps what we can predict
– Publics more important than elites
– Don’t assume you can keep secrets
– Companies comply with national laws more for reputational reasons than for fear of sanction

The Internet Freedom Agenda
• “Countries that restrict free access to information or violate the basic rights of internet users risk walling themselves off from the progress of the next century.” Hillary Clinton, January 2010, Remarks on Internet Freedom
• “Let’s be clear. This disclosure is not just an attack on America – it’s an attack on the international community.” Hillary Clinton, November 2010, after the Wikileaks release.
• Conclusion: no set of easy answers

Final thoughts…
• We need far more sustained attention to the impact of new media between states, as well as within states.
• Unrealistic to simply say “NO,” no matter how loudly we say it. The technology won’t be unmade.
• We are in uncharted, and largely unstudied, territory, and our policies are being driven by what is technically feasible, rather than what is desirable.

A project from the Social Media Research Foundation: http://www.smrfoundation.org

About Me
Introductions
Marc A. Smith
Chief Social Scientist
Connected Action Consulting Group
Marc@connectedaction.net
http://www.connectedaction.net
http://www.codeplex.com/nodexl
http://www.twitter.com/marc_smith
http://delicious.com/marc_smith/Paper
http://www.flickr.com/photos/marc_smith
http://www.facebook.com/marc.smith.sociologist
http://www.linkedin.com/in/marcasmith
http://www.slideshare.net/Marc_A_Smith
http://www.smrfoundation.org

Social Media Research Foundation
http://smrfoundation.org

Social Media Research Foundation
People
• University Faculty
• Students
• Industry
• Independent
• Researchers
• Developers

Disciplines
• Computer Science
• HCI, CSCW
• Machine Learning
• Information Visualization
• UI/UX
• Social Science/Sociology
• Network Analysis
• Collective Action

Institutions
• University of Maryland
• Oxford Internet Institute
• Stanford University
• Microsoft Research
• Illinois Institute of Technology
• Connected Action
• Cornell
• Morningside Analytics

What we are trying to do:

Open Tools, Open Data, Open Scholarship
• Build the “Firefox of GraphML” – open tools for
collecting and visualizing social media data
• Connect users to network analysis – make
network charts as easy as making a pie chart
• Connect researchers to social media data sources
• Archive: Be the “Allen Very Large Telescope Array”
for Social Media data – coordinate and aggregate
the results of many users’ data collection and
analysis
• Create open access research papers & findings
• Make “collections of connections” easy for users
to manage

What we have done: Open Tools
• NodeXL
• Data providers (“spigots”)
– ThreadMill Message Board
– Exchange Enterprise Email
– Voson Hyperlink
– SharePoint
– Facebook
– Twitter
– YouTube
– Flickr

What we have done: Open Data
• NodeXLGraphGallery.org

– User generated collection
of network graphs,
datasets and annotations
– Collective repository for
the research community
– Published collections of
data from a range of social
media data sources to help
students and researchers
connect with data of
interest and relevance

What we have done: Open Scholarship

Social Media
(email, Facebook, Twitter,
YouTube, and more)
is all about
connections
from people
to people.

There are many kinds of ties….

Send, Mention, Like, Link, Reply, Rate, Review, Favorite, Friend, Follow, Forward, Edit, Tag, Comment, Check-in…

http://www.flickr.com/photos/stevendepolo/3254238329

Social Network Theory
http://en.wikipedia.org/wiki/Social_network
• Central tenet
– Social structure emerges from the aggregate of relationships (ties) among members of a population
• Phenomena of interest
– Emergence of cliques and clusters from patterns of relationships
– Centrality (core), periphery (isolates), betweenness
• Methods
– Surveys, interviews, observations, log file analysis, computational analysis of matrices

Source: Richards, W.
(1986). The NEGOPY
network analysis
program. Burnaby, BC:
Department of
Communication, Simon
Fraser University. pp.716

(Hampton & Wellman, 1999; Paolillo, 2001; Wellman, 2001)

SNA 101
• Node
– “Actor” on which relationships act; 1-mode versus 2-mode networks
• Edge
– Relationship connecting nodes; can be directional
• Cohesive Sub-Group
– Well-connected group; clique; cluster
• Key Metrics
– Centrality (group or individual measure)
  • Number of direct connections that individuals have with others in the group (usually look at incoming connections only)
  • Measure at the individual node or group level
– Cohesion (group measure)
  • Ease with which a network can connect
  • Aggregate measure of shortest path between each node pair at network level reflects average distance
– Density (group measure)
  • Robustness of the network
  • Number of connections that exist in the group out of 100% possible
– Betweenness (individual measure)
  • Number of shortest paths between each node pair that a node is on
  • Measure at the individual node level
• Node roles
– Peripheral – below average centrality
– Central connector – above average centrality
– Broker – above average betweenness

[Figure: example network diagram with nodes labelled A to I]
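The metrics and node roles listed on this slide can be made concrete with a small worked example. The sketch below is not how NodeXL computes them; it is a plain-Python illustration on a hypothetical five-node undirected network (the `EDGES` list and all function names are assumptions made for this example). It uses degree as the centrality measure, although the slide notes that incoming connections are usually counted when edges are directed.

```python
from collections import deque
from itertools import combinations

# Hypothetical undirected network; node names are illustrative only.
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]

def build_adjacency(edges):
    """Map each node to the set of its neighbours."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree(adj):
    """Centrality as the number of direct connections per node."""
    return {node: len(nbrs) for node, nbrs in adj.items()}

def density(adj):
    """Existing connections divided by all possible connections."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return m / (n * (n - 1) / 2)

def bfs(adj, source):
    """Return (distance, shortest-path count) dictionaries from source."""
    dist, sigma = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v], sigma[v] = dist[u] + 1, 0
                queue.append(v)
            if dist[v] == dist[u] + 1:
                sigma[v] += sigma[u]
    return dist, sigma

def average_distance(adj):
    """Cohesion: mean shortest-path length over reachable node pairs."""
    lengths = []
    for s, t in combinations(sorted(adj), 2):
        dist, _ = bfs(adj, s)
        if t in dist:
            lengths.append(dist[t])
    return sum(lengths) / len(lengths)

def betweenness(adj):
    """Per node, the (fractional) number of shortest paths it lies on."""
    info = {node: bfs(adj, node) for node in adj}
    score = {node: 0.0 for node in adj}
    for s, t in combinations(sorted(adj), 2):
        dist_s, sigma_s = info[s]
        dist_t, sigma_t = info[t]
        if t not in dist_s:
            continue
        for v in adj:
            if v in (s, t) or v not in dist_s or v not in dist_t:
                continue
            if dist_s[v] + dist_t[v] == dist_s[t]:
                score[v] += sigma_s[v] * sigma_t[v] / sigma_s[t]
    return score

def node_roles(adj):
    """Label nodes using the above/below-average rules from the slide."""
    deg, btw = degree(adj), betweenness(adj)
    mean_deg = sum(deg.values()) / len(deg)
    mean_btw = sum(btw.values()) / len(btw)
    roles = {}
    for node in adj:
        labels = []
        if deg[node] < mean_deg:
            labels.append("peripheral")
        if deg[node] > mean_deg:
            labels.append("central connector")
        if btw[node] > mean_btw:
            labels.append("broker")
        roles[node] = labels
    return roles

if __name__ == "__main__":
    network = build_adjacency(EDGES)
    print(density(network), average_distance(network))
    print(node_roles(network))
```

In this toy network, node C has the most direct connections and lies on the most shortest paths, while D has only average degree but still bridges E to the rest of the network, which is why degree and betweenness are reported separately.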

NodeXL
Free/Open Social Network Analysis add-in for Excel 2007/2010 makes graph
theory as easy as a pie chart, with integrated analysis of social media sources.
http://nodexl.codeplex.com

Goal: Make SNA easier
• Existing Social Network Tools are challenging
for many novice users
• Tools like Excel are widely used
• Leveraging a spreadsheet as a host for SNA
lowers barriers to network data analysis and
display

http://www.flickr.com/photos/badgopher/3264760070/

http://www.flickr.com/photos/druclimb/2212572259/in/photostream/

http://www.flickr.com/photos/hchalkley/47839243/

http://www.flickr.com/photos/rvwithtito/4236716778

http://www.flickr.com/photos/62693815@N03/6277208708/

Social Network Maps Reveal
Key influencers in any topic.
Sub-groups.
Bridges.

NodeXL
Network Overview Discovery and Exploration add-in for Excel 2007/2010

A minimal network can
illustrate the ways different
locations have different values
for centrality and degree

http://www.flickr.com/photos/storm-crypt/3047698741

Welser, Howard T., Eric Gleave, Danyel Fisher,
and Marc Smith. 2007. Visualizing the Signatures
of Social Roles in Online Discussion Groups.
The Journal of Social Structure. 8(2).

Experts and “Answer People”

Discussion people, Topic setters
Discussion starters, Topic setters

http://www.flickr.com/photos/library_of_congress/3295494976/sizes/o/in/photostream/

http://www.flickr.com/photos/amycgx/3119640267/

#teaparty
15 November 2011

#occupywallstreet
15 November 2011

http://www.newscientist.com/blogs/onepercent/2011/11/occupy-vs-tea-party-what-their.html

Like MSPaint™ for graphs.
— the Community

Introduction to NodeXL

Example NodeXL data importer for Twitter

NodeXL imports “edges” from social media data sources

NodeXL displays subgraph images along with network metadata

NodeXL creates a list of “vertices” from imported social media edges
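The step from imported edges to a vertex list can be illustrated in a few lines. This is only a sketch of the idea in plain Python, not NodeXL's own code; the edge list, the account names, and the `vertices_from_edges` helper are hypothetical.

```python
# Hypothetical directed edges, e.g. "source mentions or replies to target".
EDGES = [
    ("alice", "bob"),
    ("bob", "carol"),
    ("alice", "carol"),
    ("dave", "alice"),
]

def vertices_from_edges(edges):
    """Collect each unique vertex (in first-seen order) with its in- and out-degree."""
    table = {}
    for source, target in edges:
        table.setdefault(source, {"in_degree": 0, "out_degree": 0})["out_degree"] += 1
        table.setdefault(target, {"in_degree": 0, "out_degree": 0})["in_degree"] += 1
    return table

if __name__ == "__main__":
    for name, stats in vertices_from_edges(EDGES).items():
        print(name, stats)
```

Every vertex appears exactly once even if it occurs in many edges, so per-vertex attributes and metrics have a single row to attach to.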

NodeXL
Automation
makes analysis
simple and fast

Perform
collections
of common
operations
with a single
click

NodeXL Generates Overall Network Metrics

6 kinds of Twitter social media networks
• Divided: Polarized
• Unified: In-group
• Fragmented: Brand
• Clustered: Communities
• In-Hub & Spoke: Broadcast
• Out-Hub & Spoke: Support

#CMgrChat

In-group / Community

New York Times Article
Paul Krugman

Broadcast: Audience + Communities

Dell Listens/Dellcares

Support

SNA questions for social media:
1. What does my topic network look like?
2. What does the topic I aspire to be look like?
3. What is the difference between #1 and #2?
4. How does my map change as I intervene?

What does #YourHashtag look like?

Twitter Network for “Microsoft Research”
*BEFORE*

Twitter Network for “Microsoft Research”
*AFTER*

Network Motif Simplification

Cody Dunne, University of Maryland

Network Motif Simplification

D-connector (glyph on the right)

Fan (glyph on the right)

D-clique (glyphs for 4, 5, and 6
member cliques below)
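As a rough illustration of what motif simplification looks for, the sketch below detects two of the motifs named here in a small undirected graph. It assumes that a fan is a head node with several neighbours that have no other connections, and that a D-connector is a set of nodes that all link exactly the same D anchor nodes. These definitions, the edge list, and the function names are assumptions made for this example, not the NodeXL implementation.

```python
from collections import defaultdict

# Hypothetical undirected edge list; all node names are illustrative.
EDGES = [
    ("hub", "l1"), ("hub", "l2"), ("hub", "l3"), ("hub", "x"),
    ("x", "s1"), ("x", "s2"), ("y", "s1"), ("y", "s2"), ("y", "z"),
]

def adjacency(edges):
    """Map each node to the set of its neighbours."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj

def find_fans(adj, min_leaves=2):
    """Group degree-1 nodes by the single head node they attach to."""
    leaves_by_head = defaultdict(list)
    for node, nbrs in adj.items():
        if len(nbrs) == 1:
            (head,) = nbrs
            leaves_by_head[head].append(node)
    return {head: sorted(leaves)
            for head, leaves in leaves_by_head.items()
            if len(leaves) >= min_leaves}

def find_connectors(adj, min_span=2):
    """Group nodes that share exactly the same set of two or more neighbours."""
    span_by_anchors = defaultdict(list)
    for node, nbrs in adj.items():
        if len(nbrs) >= 2:
            span_by_anchors[tuple(sorted(nbrs))].append(node)
    return {anchors: sorted(span)
            for anchors, span in span_by_anchors.items()
            if len(span) >= min_span}

if __name__ == "__main__":
    graph = adjacency(EDGES)
    print(find_fans(graph))        # leaves that could collapse into one fan glyph
    print(find_connectors(graph))  # span nodes that could collapse into one connector glyph
```

Each detected group is a candidate for replacement by a single glyph, which is how the simplified drawing ends up with fewer nodes and edges than the original.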

Dr. Cody Dunne

Scholars using NodeXL
• Communications
– Katy Pearce
– Itai Himelboim

• Business
– Scott Dempwolf

• Humanities/Classics
– Diane Cline

C. Scott
Dempwolf,
PhD

Research Assistant
Professor & Director
UMD - Morgan State
Center for Economic
Development

What is Social Network Analysis?
How is it useful for the humanities?

1. New framework for analysis
2. Data visualization allows new perspectives – less linear, more comprehensive

Social Network Analysis and Ancient History
Diane H. Cline, Ph.D.
University of Cincinnati

NodeXL calculates metrics
about networks and content

The Content summary
spreadsheet displays the most
frequently used URLs, hashtags,
and user names within the
network as a whole and within
each calculated sub-group.
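The kind of tally shown in the content summary can be sketched as follows. This is a minimal illustration, not NodeXL's code: the tweets, group numbers, regular expressions, and the `summarize` function are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical tweets, each already assigned to a calculated sub-group.
TWEETS = [
    {"group": 1, "text": "Mapping #sna with @alice http://example.org/a"},
    {"group": 1, "text": "More #SNA and #nodexl from @bob"},
    {"group": 2, "text": "@alice shares http://example.org/a #dataviz"},
]

# Simplified patterns; real tweets need more careful tokenising.
PATTERNS = {
    "urls": re.compile(r"https?://\S+"),
    "hashtags": re.compile(r"#\w+"),
    "users": re.compile(r"@\w+"),
}

def empty_counts():
    return {kind: Counter() for kind in PATTERNS}

def summarize(tweets):
    """Count URLs, hashtags and user names overall ("all") and per group."""
    summary = {"all": empty_counts()}
    for tweet in tweets:
        group_counts = summary.setdefault(tweet["group"], empty_counts())
        for kind, pattern in PATTERNS.items():
            found = pattern.findall(tweet["text"])
            if kind != "urls":
                # Hashtags and user names are treated as case-insensitive.
                found = [item.lower() for item in found]
            summary["all"][kind].update(found)
            group_counts[kind].update(found)
    return summary

if __name__ == "__main__":
    result = summarize(TWEETS)
    print(result["all"]["hashtags"].most_common(3))
    print(result[2]["hashtags"].most_common(3))
```

Comparing the per-group counts with the overall counts shows which hashtags or links are specific to one cluster rather than shared across the whole network.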

NodeXL as a Teaching Tool
I. Getting Started with Analyzing Social Media Networks
1. Introduction to Social Media and Social Networks
2. Social media: New Technologies of Collaboration
3. Social Network Analysis
II. NodeXL Tutorial: Learning by Doing
4. Layout, Visual Design & Labeling
5. Calculating & Visualizing Network Metrics
6. Preparing Data & Filtering
7. Clustering & Grouping
III. Social Media Network Analysis Case Studies
8. Email
9. Threaded Networks
10. Twitter
11. Facebook
12. WWW
13. Flickr
14. YouTube
15. Wiki Networks

http://www.elsevier.com/wps/find/bookdescription.cws_home/723354/description

What we want to do:
(Build the tools to) map the social web
• Move NodeXL to the web: (Node[NOT]XL)
– Node for Google Doc Spreadsheets?
– WebGL Canvas? D3.JS? Sigma.JS

• Connect to more data sources of interest:

– RDF, MediaWikis, Gmail, NYT, Citation Networks

• Solve hard network manipulation UI problems:

– Modal transform, Time series, Automated layouts

• Grow and maintain archives of social media network data sets for
research use.
• Improve network science education:
– Workshops on social media network analysis
– Live lectures and presentations
– Videos and training materials

NodeXL Results
• Easy to learn, yet powerful and insightful
• Widely used by both students and researchers

• Free and open source software
• World-wide team of collaborators
Malik S, Smith A, Papadatos P, Li J, Dunne C and Shneiderman B (2013), "TopicFlow: Visualizing topic alignment of Twitter data over time", In ASONAM '13.
Bonsignore EM, Dunne C, Rotman D, Smith M, Capone T, Hansen DL and Shneiderman B (2009), "First steps to NetViz Nirvana: Evaluating social network analysis with NodeXL", In CSE '09. pp. 332-339. DOI:10.1109/CSE.2009.120
Mohammad S, Dunne C and Dorr B (2009), "Generating high-coverage semantic orientation lexicons from overtly marked words and a thesaurus", In EMNLP '09. pp. 599-608.
Smith M, Shneiderman B, Milic-Frayling N, Rodrigues EM, Barash V, Dunne C, Capone T, Perer A and Gleave E (2009), "Analyzing (social media) networks with NodeXL", In C&T '09. pp. 255-264. DOI:10.1145/1556460.1556497

How you can help
• Sponsor a feature
• Sponsor workshops
• Sponsor a student
• Schedule training
• Sponsor the foundation
• Donate your money, code, computation, storage, bandwidth, data or employees’ time
• Help promote the work of the Social Media Research Foundation

Available Now in NodeXL!
• Motif Simplification
• Group-in-a-Box Layouts
• Data import spigots
• Excel functions & macros
• Network statistics
• Layout algorithms
• Filtering
• Clustering
• Attribute mapping
• Automate analyses
• Email reporting
• Graph Gallery
• C# libraries

nodexl.codeplex.com

Strategies for social media engagement based on
social media network analysis

International Collaboration &
Green Technology Generation
Assessing the East Asian
Environmental Regime
Matthew A. Shapiro
Illinois Institute of Technology
matthew.shapiro@iit.edu

Impetus
• Shapiro and Nugent (2012) “Institutions and the
sources of innovation” in IJPP

• Collaboration hinders total factor productivity (TFP) if institutions are absent or if the TFP threshold has not been passed

• Shapiro (2013) “Regionalism’s challenge to the
pollution haven hypothesis” in Pacific Review
• Regional efforts to eliminate pollution are
multifaceted

• Support

• East Asia Institute
• Asiatic Research Institute, Korea University

[Diagram: conceptual model linking international institutions and regional institutions (with links to other regions) to each of three countries' FDI, ecologists, domestic R&D funding, institutions, and pollution. Labelled pathways: pollution haven hypothesis (+), contra-pollution haven hypothesis (-), epistemic community hypothesis (-).]

Research Questions
• Are the Northeast Asian countries key
collaborators in pursuit of green R&D?
• Yes, particularly in recent years.

• Are the Northeast Asian countries
collaborating extensively with each other?
• Not as much as they collaborate with countries
beyond the region.

• Implications?

Green R&D
• Patents
• IPC Green Inventory
– Alternative energy production
– Transportation
– Energy conservation
– Waste management
– Agriculture/forestry
– Administrative aspects
– Nuclear power generation

Alternative energy production
• Biofuels
• Integrated gasification combined cycle
• Fuel cells
• Pyrolysis or gasification of biomass
• Harnessing energy from manmade waste
• Hydro energy
• Ocean thermal energy conversion
• Wind energy
• Solar energy
• Geothermal energy
• Other production or use of heat not derived from combustion
• Using waste heat
• Devices for producing mechanical power from muscle energy

Energy conservation
• Storage of electrical energy
• Power supply circuitry
• Measurement of electricity consumption
• Storage of thermal energy
• Low energy lighting
• Thermal building insulation, in general
• Recovering mechanical energy

Data Collection
• Source: USPTO
• Collection method: Leydesdorff’s tools
• Unit of analysis: country of inventor

Data Description
• Dates: 1990-2013
• 129,640 total inventors
• Assumption: Any collaboration is valued, so proportionate share of patent inventorship is ignored.
• 242,331 total nodes based on country classification

[Pie chart by country: IL, BE, IN, IT, CN, CH, NZ, TW, all others, AU, KR, DK, CA, GB, NL, FR, US, DE, JP]
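The counting assumption above (any collaboration is valued, so each inventor's proportionate share is ignored) can be sketched as follows. This is an illustration only, not the procedure implemented in Leydesdorff's tools: the patent records, the `NORTHEAST_ASIA` set, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical patents; each lists the country code of every named inventor.
PATENTS = {
    "P1": ["KR", "KR", "US"],
    "P2": ["JP", "US", "DE"],
    "P3": ["KR", "JP"],
    "P4": ["CN"],
    "P5": ["KR", "US", "US"],
}

# Assumed region definition for this example only.
NORTHEAST_ASIA = {"CN", "JP", "KR"}

def country_edges(patents):
    """Count one tie per patent for every pair of distinct inventor countries."""
    weights = Counter()
    for countries in patents.values():
        # A country counts once per patent, however many inventors it has.
        for pair in combinations(sorted(set(countries)), 2):
            weights[pair] += 1
    return weights

def regional_split(weights, region):
    """Return (ties inside the region, ties between the region and elsewhere)."""
    intra = sum(w for (a, b), w in weights.items() if a in region and b in region)
    extra = sum(w for (a, b), w in weights.items() if (a in region) != (b in region))
    return intra, extra

if __name__ == "__main__":
    ties = country_edges(PATENTS)
    print(ties)
    print(regional_split(ties, NORTHEAST_ASIA))
```

Because a patent with two Korean inventors and one US inventor contributes a single KR-US tie, the resulting network measures whether countries collaborate at all rather than how much each contributes to a given patent.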

Are Northeast Asian countries key collaborators?

Is Northeast Asia a singular research hub?

Northeast Asia only: 1990-2013

Implications
• Empirical
• R&D collaboration can be beneficial both intra- and extra-regionally. Both are happening extensively for Northeast Asia.

• Methodological
• Challenges of connecting these results to other
variables in model
• Longitudinal concerns: Change in connectedness?
• Qualitative, quantitative, mixed?

Assessing Social Media Coverage in
Japan: Before and After March 11, 2011

Leslie M. Tkach-Kawasaki
University of Tsukuba

DISC 2013, December 11, 2013

Overview
1. Introduction: Social Media in Japan 2010-2011
2. March 11, 2011: Triple Disaster
3. Social Media: Before and After?
4. Method
5. Select Results (6 tables)
6. Conclusion

Japan’s Internet Population 2011

Source: 2011 White Paper on Information and Communications in Japan (Heisei 23 edition)

Social Media in Japan 2010-2011
Have used the following at least once…..
Blogs  77.3%
Video-sharing websites  62.8%
SNS  53.6%
Microblogs (Twitter)  30.9%

Source: 2010 White Paper on Information and Communications in Japan

The Year in Social Media 2010-11
• International diplomacy: YouTube and Chinese fishing vessel (September 2010)
• Entertainment: Release of The Social Network (October 2010)
• International conflicts: Role of Twitter and Facebook in Tunisia and Egypt (January 2011)
• Disasters: New Zealand Earthquake (February 2011)
Information Provision/Gathering During
2011 Earthquake

Source: 2012 White Paper on Information and Communications in Japan

Research question….

Are there perceivable differences
in the discourse (phrases) about
social media in Japan’s
newspaper media before and after
March 11, 2011?
