Virtual Knowledge Studio (VKS)
Google in 1998
Google and PageRank
Contemporaries
• Pre-Google Guys
- Bill Gates & Steve Jobs: 1955
- Jeff Bezos: 1964
• Google Guys
- Sergey Brin & Larry Page: 1973
- Elon Musk, Evan Williams, & Jack Dorsey: 1971/2/6
• Post-Google Guys
- Kevin Systrom & Mark Zuckerberg: 1983/4
Park (2003)
Bonacich, P. (2004).
The Invasion of the Physicists. Social Networks 26(3): 285-288
Graph structure in the web
Introduction
• Webometrics is broadly defined as the study of web-based content
(e.g., text, images, audio-visual objects, and hyperlinks) with
primarily quantitative indicators for social science research goals
and visualization techniques derived from information science and
social network analysis.
• Han Woo Park
- “hidden” and “relational” data about
lots of people as well as the few
individuals, or small groups
• Lev Manovich
- “surface” data about lots of people (i.e.,
statistical, mathematical or computational
techniques for analyzing data)
- “deep” data about the few individuals or small
groups (i.e., hermeneutics, participant
observation, thick description, semiotics, and
close reading)
First type of Webometrics
• Hyperlink Network Analysis
- Inter-linkage: a who-linked-to-whom matrix
- Co-inlink: links to two different nodes from the same third node
- Co-outlink: links from two different nodes to the same third node
Björneborn (2003)
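The three link measures above can be illustrated with a small sketch. The four sites and their links are made up for illustration, and the helper names (`co_inlinks`, `co_outlinks`, `pair_table`) are hypothetical; this is one simple way to count co-links from a who-linked-to-whom matrix, not the procedure used in the cited studies.

```python
from itertools import combinations

# Hypothetical who-linked-to-whom matrix: links[i][j] == 1 means site i links to site j.
sites = ["A", "B", "C", "D"]
links = [
    [0, 1, 1, 0],  # A -> B, A -> C
    [0, 0, 1, 0],  # B -> C
    [0, 0, 0, 0],  # C links to nobody
    [0, 1, 1, 0],  # D -> B, D -> C
]

def co_inlinks(matrix, j, k):
    """Number of third nodes that link to both j and k."""
    return sum(1 for i, row in enumerate(matrix)
               if i not in (j, k) and row[j] and row[k])

def co_outlinks(matrix, i, j):
    """Number of third nodes that both i and j link to."""
    return sum(1 for k in range(len(matrix))
               if k not in (i, j) and matrix[i][k] and matrix[j][k])

def pair_table(matrix, names, count):
    """Apply a pair-count function to every unordered pair of nodes."""
    return {(names[a], names[b]): count(matrix, a, b)
            for a, b in combinations(range(len(names)), 2)}

co_in = pair_table(links, sites, co_inlinks)    # pairs that share inlinking sources
co_out = pair_table(links, sites, co_outlinks)  # pairs that share outlink targets
print(co_in)
print(co_out)
```

In this toy data, B and C are co-inlinked by A and D, while A and D co-outlink to B and C, so both pairs come out as the most strongly related under their respective measures.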
My First SSCI Research: 44 Websites
• categorically selected sites
• financial sites the most central
• revenue sources: advertising & e.c.
• common payment: credit card
The future of social relations
• The social benefits of internet use will far outweigh
the negatives over the next decade. They say this is
because email, social networks, and other online tools
offer ‘low-friction’ opportunities to create, enhance,
and rediscover social ties that make a difference in
people’s lives.
• Some 85% agreed with the statement:
“In 2020, when I look at the big picture and consider my
personal friendships, marriage and other relationships, I see
that the internet has mostly been a positive force on my social
world. And this will only grow more true in the future.”
• “There's no escaping people anymore, and I believe that will
yield better relationships.” - Jeff Jarvis
M. Castells (2009), Communication Power
• 1) Networking power: the power over who and what is
included in the network. ‘Mass self-communication’ is the use of
new media for private messages that are able to reach masses.
• 2) Network power: the power of the protocols of network
communication. In mass self-communication, the diversity of
formats is the rule, and this amplifies the diffusion of
messages.
• 3) Networked power: the power of certain nodes over other
nodes inside the network. This is the managerial, agenda-setting,
editorial and decision-making power in the
organizations that own or operate networks.
• 4) Network-making power: the capacity of owners and controllers
to set up and program a network (of multimedia or traditional
mass communication).
Given that social media connect individuals in
dramatically different ways, research questions include
these:
What do people talk about?
Who can see what?
Who can reply to whom?
How long is content visible?
What can link to what?
Who can link to whom?
Webometrics and Hyperlink Network Analysis can be
particularly useful for answering these questions!
Big Data and Social Webometrics Network Analysis
Increasing data size in terms of the no. of nodes
Micro: ≦100 nodes → 10K
Meso: ≦1000 nodes → 1000K
Macro: ≦10000 nodes → 100,000K
Super-Macro: ≥10000 nodes → ∽
Source: Park, Han Woo (2014)
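A small sketch of the size categories above. It assumes, as one possible reading that the slide does not state, that the figures after the arrows are the number of cells in an n × n who-linked-to-whom matrix (100² = 10K, 1000² = 1000K, 10000² = 100,000K). The function names are hypothetical, and because the slide lists 10000 nodes under both Macro and Super-Macro, the sketch arbitrarily treats exactly 10000 as Macro.

```python
def network_scale(n_nodes):
    """Label a network by node count, following the thresholds on the slide."""
    if n_nodes <= 100:
        return "Micro"
    if n_nodes <= 1000:
        return "Meso"
    if n_nodes <= 10000:  # boundary case: 10000 itself is treated as Macro here
        return "Macro"
    return "Super-Macro"

def matrix_cells(n_nodes):
    """Cells in an n x n who-linked-to-whom matrix (assumed meaning of the arrows)."""
    return n_nodes * n_nodes

for n in (100, 1000, 10000):
    print(network_scale(n), n, f"{matrix_cells(n) // 1000:,}K")
```

The quadratic growth is the practical point: a tenfold increase in nodes means a hundredfold increase in possible ties to store and analyze.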
“Those studies perpetuate the idea that linking
behaviour is not random, and that links are ‘socially
significant in some way’. In this perspective, links
have an ‘information side-effect’, they can be used
to understand other facts even though they were
not individually designed to do so: ‘information
side-effects are by-products of data intended for
one use which can be mined in order to understand
some tangential, and possibly larger scale,
phenomena’”
Park and his colleagues were
extensively cited: 9 times!
• Barnett GA, Chung CJ and Park HW (2011) Uncovering transnational hyperlink patterns
and web mediated contents: a new approach based on cracking .com domain. Social
Science Computer Review 29(3): 369–384.
• Hsu C and Park HW (2011) Sociology of hyperlink networks of Web 1.0, Web 2.0, and
Twitter: a case study of South Korea. Social Science Computer Review 29(3): 354–368.
• Park HW (2003) Hyperlink network analysis: a new method for the study of social
structure on the web. Connections 25(1): 49–61.
• Park HW (2010) Mapping the e-science landscape in South Korea using the
webometrics method. Journal of Computer-Mediated Communication 15(2): 211–229.
• Park HW and Jankowski NW (2008) A hyperlink network analysis of citizen blogs in
South Korean politics. Javnost: The Public 15(2): 5–16.
• Park HW and Thelwall M (2003) Hyperlink analyses of the World Wide Web: a review.
Journal of Computer-Mediated Communication 8(4).
• Park HW and Thelwall M (2008) Developing network indicators for ideological
landscapes from the political blogosphere in South Korea. Journal of
Computer-Mediated Communication 13(4): 856–879.
• Park HW, Kim C and Barnett GA (2004) Socio-communicational structure among political
actors on the web in South Korea. New Media & Society 6(3): 403–423.
• Park HW, Thelwall M and Kluver R (2005) Political hyperlinking in South Korea: technical
indicators of ideology and content. Sociological Research Online 12(3).
A comment from those who are
NOT doing a hyperlink analysis
• In a chapter of The Sage Handbook of
Online Research Methods edited by
Fielding et al. (2008), Horgan emphasizes
that ‘link analysis’ has become an active
research domain in examining social
behavior online.
http://participatorysociety.org/wiki/index.php?title=Online_Research
2nd type of Webometrics: Web Visibility
• Web mentions as an indicator of
online viral power and reputation
• Presence or appearance of actors or
issues being discussed by the public
(Internet users) on the web.
• Tracking web visibility is a powerful way
to gain insight into public reactions to
actors or issues.
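As a minimal sketch of the web-visibility idea above, the snippet below counts how many pages mention each actor. The page texts and actor names are invented for illustration, and `web_visibility` is a hypothetical helper; a real study would collect the pages from a search engine or crawler and would need to handle name variants and ambiguity.

```python
import re
from collections import Counter

def web_visibility(documents, actors):
    """Count how many documents mention each actor (case-insensitive, whole-word match)."""
    counts = Counter({actor: 0 for actor in actors})
    for text in documents:
        for actor in actors:
            pattern = r"\b" + re.escape(actor) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                counts[actor] += 1  # one count per document, not per occurrence
    return counts

# Made-up page texts standing in for collected web pages.
pages = [
    "Candidate Kim spoke about jobs; candidate Lee replied.",
    "Why Kim's housing plan matters",
    "Lee and Kim meet again on Thursday",
    "Weather update for the weekend",
]
visibility = web_visibility(pages, ["Kim", "Lee"])
print(visibility.most_common())
```

Counting documents rather than raw occurrences keeps a single long page from dominating the indicator; whether that is the right choice depends on the research question.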
Construct validity of webometrics
data
Ackland, R. (2013). Web Social Science:
Concepts, Data and Tools for Social
Scientists in the Digital Age. Sage.
P. 16.
How can the construct validity of web data for social
science research be demonstrated, either empirically
or theoretically?
• By testing whether the online network displays structural signatures
that are consistent with those displayed by real-world actors.
- For example: Does Facebook friendship network data display
homophily on the basis of race, ethnicity, etc.?
• By testing whether variables constructed from web data are
correlated with other accepted measures of the construct.
- For example: If counts of inbound hyperlinks to academic project
websites are correlated with other characteristics of academic
teams (e.g. publications, industry connections) that are used as
proxies of academic authority or performance, then this is
evidence of the construct validity of hyperlink data in the context
of scientometrics.
• By showing that an actor's position in an online network
influences his or her performance or outcomes in a manner that
accords with what is found offline.
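The second test above (correlating a web-derived variable with an accepted offline measure) can be sketched as a plain Pearson correlation. The inlink and publication counts below are invented, and `pearson_r` is a hand-written helper rather than a library call; with real data, a rank correlation may be preferable because link counts are typically highly skewed.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)

# Made-up data: inbound hyperlinks to five project websites and the
# publication counts of the corresponding teams.
inlinks = [12, 40, 7, 55, 23]
publications = [3, 9, 2, 14, 6]

r = pearson_r(inlinks, publications)
print(round(r, 3))
```

A strong positive coefficient, as these made-up numbers produce, would count as evidence for construct validity in the sense described above; a weak or negative one would suggest that inlink counts measure something else.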
How different across disciplines?
WCU
WEBOMETRICS
INSTITUTE
INVESTIGATING INTERNET-BASED POLITICS WITH E-RESEARCH TOOLS
Park, H. W. (2010). Mapping the e-science landscape in South Korea using the webometrics
method. Journal of Computer-Mediated Communication, 15(2), 211–229.
Two major strands exist in computational science (also called e-Science):
• The computational perspective, based on the use of high performance
computing to facilitate high-speed processing of large volumes of digital data
• The networking perspective, based on virtual collaboration through the Grid
e-Science in humanities and social sciences: a third alternative strand?
Computational Social Science (CSS)
A minor but growing approach to
the study of society
Focus on the methodological
perspective based on the use
of new digital tools to manage
the data deluge
Computational (Social) Science
• Focus on the methodological
perspective based on the use of
new digital tools to manage the
data deluge.
• Development of e-science
tools to automate the research
process.
• Experimentation with new
types of data visualization.
Measuring information exposure in dynamic
and dependent networks (ExpoNet)
According to the OECD's Global Science Forum
2013 report, social scientists' inability to anticipate
the Arab Spring was partly due to a failure to
understand 'the new ways in which humans
communicate' via social media and the ways they
are exposed to information. And social media's
mixed record for predicting the results of recent UK
elections suggests better tools and a unified
methodology are needed to analyse and extract
political meaning from this new type of data.
• http://www.ncrm.ac.uk/research/ExpoNet/
Why Data Science?
Savage and Burrows (2007, p.
886) lament, “Fifty years ago,
academic social scientists might
be seen as occupying the apex
of the – generally limited – social
science research ‘apparatus’.
Now they occupy an increasingly
marginal position in the huge
research infrastructure”.
Bonacich, P. (2004).
The Invasion of the Physicists. Social Networks 26(3): 285-288
All models are wrong but some are useful
Emergence of data author on dataverse
Anderson's claims
• Data is everything we need.
• We don't have to settle for models.
• Agnostic statistics.
• Out with every theory of human behavior.
• This approach to science (hypothesize, model,
test) is becoming obsolete.
• Petabytes allow us to say: "Correlation is enough."
We can stop looking for models.
• What can science learn from Google? E-Science.
Big data and the end of theory?
• Does big data have the answers? Maybe some, but not all, says
Mark Graham.
• In 2008, Chris Anderson, then editor of Wired, wrote a
provocative piece titled The End of Theory. Anderson was
referring to the ways that computers, algorithms, and big data can
potentially generate more insightful, useful, accurate, or true
results than specialists or domain experts who traditionally craft
carefully targeted hypotheses and research strategies.
• We may one day get to the point where sufficient quantities of big
data can be harvested to answer all of the social questions that
most concern us. I doubt it though. There will always be digital
divides; always be uneven data shadows; and always be biases in
how information and technology are used and produced.
• And so we shouldn't forget the important role of specialists to
contextualize and offer insights into what our data do, and maybe
more importantly, don't tell us.
http://www.guardian.co.uk/news/datablog/2012/mar/09/big-data-theory
The Coming of a Triple Divide?
There are three main gaps I’d like to emphasize
in the present and future of the Big Data research
community:
1) Developing/transitional vs.
developed/advanced countries,
2) Researchers in academia vs. researchers in
the commercial sector,
3) Researchers with computational skills vs.
less computational scholars.
Method used           Developed         Developing        Mixed Region
                      Country/Region    Country/Region
                      N      %          N      %          N      %
Social-Informetrics   114    74.51      30     83.33      9      52.94
Scientometrics        28     18.30      6      16.67      8      47.06
Webometrics           11     7.19       0      0          0      0
Total                 153    100        36     100        17     100
No. of articles in each category of methods
by the developed/developing division
Skoric, M. M. (2013, Online First). The implications of big data for developing
and transitional economies: Extending the Triple Helix?. Scientometrics.
http://www.oii.ox.ac.uk/research/projects/?id=98
http://www.nature.com/nature/journal/v455/n7209/
4 September 2008 Volume 455 Number 7209 pp1-136
“what Big Data sets mean for
contemporary science”
This approach to science is attributed to the late Jim Gray,
one of the most influential computer scientists, at Microsoft.
Science published a special
issue (February 11, 2011) looking
broadly at increasingly data-driven
research efforts as a scientific
domain (Science staff, 2011).
Data Science is composed of interrelated
clusters of research tasks. For example, the
technologies on data collection, curation, and
access, and the unique skill sets have
increasingly been central to Data Science
(Science staff, 2011).
Phrase map of highly occurring keywords 1999-2005
Halevi, G., & Moed, H. F. (2012).
Phrase map of highly occurring keywords 2006-2012
Halevi, G., & Moed, H. F. (2012).
Park, H. W., & Leydesdorff, L. (2013 Work-In-Progress). Decomposing a Data-Driven Science Using a Scientometric Method.
• However, Halevi and Moed (2012) and Rousseau (2012) are
based on descriptive statistics. Therefore, we intend to add
the network perspective in terms of both social (co-authorship)
and semantic networks.
• Furthermore, we extend search queries to various
terminologies related to Data Science because the term
“big data” is regarded only as one among a list of policy
priority issues.
• We show where the research system in Data Science is
“hot” in terms of international collaborations and
prevailing semantics.
Park, H. W., & Leydesdorff, L. (2013). Decomposing Social and Semantic Networks in
Emerging “Big Data” Research. Journal of Informetrics, 7(3), 756-765.
The Signal and the Noise:
Why Most Predictions Fail but Some Don't. Nate Silver
I do not go as far as a Popper in asserting that such
theories are therefore unscientific or that they lack any
value. However, the fact that the few theories we can
test have produced quite poor results suggests that
many of the ideas we haven’t tested are very wrong as
well. We are undoubtedly living with many delusions
that we do not even realize.
page 15
OECD (2012). OECD Technology Foresight Forum 2012 - Harnessing data as a new source of growth: Big
data analytics and policies. OECD Headquarters, Paris, France, 22 October 2012
http://www.nature.com/news/facebook-experiment-boosts-us-voter-turnout-1.11401
Algorithmic management of socially shared
information: Facebook as a designed social system
Which features should be deployed?
[Ugander-Karrer-Backstrom-Kleinberg 2013]
Which discussions will be most active? [Backstrom-
Kleinberg-Lee-DanescuNiculescuMizil 2013]
Which memes will receive the most reshares?
[Cheng-Adamic-Dow-Kleinberg-Leskovec 2014]
Which links should be emphasized?
[Backstrom-Kleinberg 2014]
http://cdn.oreillystatic.com/en/assets/1/event/119/Computational%20Problems%20in%20Managing%20Social%20Information%20%20Presentation.pdf
A typical FB user writes 60-70% of comments to ≈ 15 people.
[Backstrom-Bakshy-Kleinberg-Lento-Rosenn 2011]
http://www.cs.cornell.edu/home/kleinber/icwsm11-attention.pdf
Economics in the age of big data
http://www.sciencemag.org/content/346/6210/1243089.full
http://bpp.mit.edu/
http://www.unglobalpulse.org/projects
https://www.youtube.com/watch?v=ukWO64L2RAk&feature=youtu.be
Big Data for 2030 SDG
A more recent development was the
establishment of journals that include the term “Data Science”
or a related term in their titles:
• Data Science Journal in 2002
• Journal of Data Science in 2003
• EPJ Data Science in 2012
• GigaScience (gigasciencejournal.com) in 2012
• Big Data & Society in 2015
Ying Huang • Jannik Schuehle • Alan L. Porter • Jan Youtie
http://bigdatasoc.blogspot.kr/2014/11/celebrating-official-launch-of-big-data.html?spref=fb
http://bigdatasoc.blogspot.co.uk/
http://bds.sagepub.com/content/1/1/2053951714540280.full
http://www.bbc.com/news/uk-22007058
http://www.bbc.com/news/uk-22020836
http://www.bbc.com/news/uk-22011732
http://ec.europa.eu/enterprise/policies/innovation/policy/business-innovation-observatory/files/infographics/big-data_en.pdf
http://www.clickz.com/clickz/news/2336996/80-of-marketers-will-run-cross-channel-marketing-campaigns-in-2014-study
http://www.nae.edu/Publications/Bridge/128772.aspx
Winter Bridge: A Global View of Big Data
The chart Tim Cook doesn’t want you to see
http://qz.com/122921/the-chart-tim-cook-doesnt-want-you-to-see/
Kim, G. H., Trimi, S., & Chung, J. H. (2014). Big-data applications in the government sector. Communications of the ACM, 57(3), 78-85
Yet, there still are serious problems to overcome. A trenchant
critique concerning the big data field as it is nowadays came in
the form of six statements intending to temper unbridled
enthusiasm. [42] These six provocative statements are:
• Big data change the definition of knowledge;
• Claims to accuracy and objectivity are misleading;
• More data are not always better data;
• Taken out of context, big data loses its meaning;
• Just because it is accessible, it does not make it ethical; and
• (Limited) access to big data creates a new digital divide.
Rousseau (2012)
Big Data's Slippery Issue of
Causation vs. Correlation
http://www.nae.edu/Publications/Bridge/128772.aspx
Winter Bridge: A Global View of Big Data
http://www.pewinternet.org/2014/02/20/mapping-twitter-topic-networks-from-polarized-crowds-to-community-clusters/
http://www.sciencemag.org/content/347/6227/1243
http://news.sciencemag.org/brain-behavior/2015/03/new-study-questions-trope-conservatives-are-happier-liberals
Kobayashi, T., & Boase, J. (2012). No Such Effect? The Implications of Measurement
Error in Self-Report Measures of Mobile Communication Use. Communication Methods
and Measures, 6, 1–18. DOI: 10.1080/19312458.2012.679243
N. A. Christakis, & J. H. Fowler (2009). Connected: The
Surprising Power of Our Social Networks and How They Shape
Our Lives.
- NY Times
http://www.medicaldaily.com/friends-and-family-share-same-genes-how-trip-coffee-shop-reveals-evolutions-power-292952
Christakis, N. A., & Fowler, J. H. (2014). Friendship and natural selection. Proceedings of the National Academy of Sciences, 111(3), 10796–10801.
https://www.youtube.com/watch?v=6vwg0dJY1NM
Friendship and natural selection
Creativity requires a moderately small world
Financial success of Broadway musicals 1945 to 1989
Small worlds and artistic success
Artistic success of a show
Borgatti et al (2009)
Structural holes
Using Big Data to Fight Range
Anxiety in Electric Vehicles
• The software acquires data from five sources:
Google Maps (for route, terrain, and traffic data),
Wunderground.com (for weather), driver history
(through driving behavior measurements),
vehicle manufacturers (for vehicle modeling data),
and battery manufacturers (for battery modeling data).
http://spectrum.ieee.org/cars-that-think/transportation/sensors/using-big-data-to-fight-range-anxiety-in-electric-vehicles
http://www.nae.edu/Publications/Bridge/128772.aspx
Winter Bridge: A Global View of Big Data
Mike Thelwall: WA 2.0
http://lexiurl.wlv.ac.uk/index.html
Marc Smith: NodeXL
http://nodexl.codeplex.com/
Han Woo PARK
KrKWIC, WeboNaver, WeboDaum
https://wiki.digitalmethods.net/Dmi/ToolDatabase
http://discovertext.com/
https://netlytic.org/
http://chorusanalytics.co.uk/
https://support.google.com/fusiontables/answer/2571232
http://www.tableau.com/
https://pipes.yahoo.com/pipes/
http://www.kapowtech.com/
http://www.qgis.org/ko/site/
An open-data tool built on ArcGIS; World Bank data, etc. Cool.
https://gcn.com/Articles/2013/10/04/GCN-Award-NYC-DataBridge.aspx?Page=2
O'Reilly
10 data trends on our radar for 2016
1. Metadata
2. Systems optimization via deep neural networks
For example, as shown in the screenshot below, a
search on Google for "let it be lyrics" returns the
lyrics of the classic Beatles song at the top of the
search results. But a search for "let it go lyrics"
doesn't return such an interface element, despite
the immense popularity of this Disney song and
the wide availability of its lyrics.
Help users ask good questions, rather than
attempt to answer bad ones.
You can see this in action on LinkedIn, where typing "micr" into a search box triggers
search suggestions like "Jobs at Microsoft" and "People who work at Microsoft":
Artificial Intelligence and Intelligence Augmentation:
Very Different Approaches Yield Very Different Results
“Artificial intelligence” is the idea of a computer system
that, by reproducing human cognition, allows that system
to function autonomously and effectively in a given
domain. An AI system demonstrates a kind of intentionality:
it initiates action in its environment and pursues goals.
“Intelligence augmentation,” on the other hand, is the idea of a
computer system that supplements and supports
human thinking, analysis, and planning, leaving
the intentionality of a human actor at the heart of the
human-computer interaction. Because intelligence
augmentation focuses on the interaction of humans and
computers, rather than on computers alone, it is also
referred to as “HCI.”
http://www.financialsense.com/contributors/guild/artificial-intelligence-vs-intelligence-augmentation-debate
Twitter taught Microsoft’s AI chatbot to
be a racist asshole in less than a day
http://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist
http://boards.4chan.org/pol/
Prof. Han Woo PARK
Department of Media and Communication,
YeungNam University, Korea
hanpark@ynu.ac.kr
http://www.hanpark.net

Mapping big data science

  • 1.
  • 2.
  • 3.
  • 4.
    Contemporaries โ€ข Pre-Google Guys -Bill Gates & Steve Jobs: 1955 - Jeff Bezos: 1964 โ€ข Google Guys - Sergey Brin & Larry Page: 1973 - Elon Musk, Evan Williams, & Jack Dorsey: 1971/2/6 โ€ข Post-Google Guys - Kevin Systrom & Mark Zuckerberg: 1983/4
  • 5.
  • 6.
    Bonacich, P. (2004). TheInvasion of the Physicists. Social Networks 26(3): 285-288 Graph structure in the web
  • 7.
    Introduction ๏ฝ Webometricsis broadlydefined as the study of web- based content (e.g.,text,images,audio-visual objects,and hyperlinks) with primarily quantitative indicatorsfor social science research goals and visualization techniques derived from information science and social network analysis.
  • 8.
    8 โ€ข Han WooPark - โ€œhiddenโ€ and โ€œrelationalโ€ data about lots of people as well as the few individuals, or small groups โ€ข Lev Manovich - โ€œsurfaceโ€ data about lots of people (i.e., statistical, mathematical or computational techniques for analyzing data) - โ€œdeepโ€ data about the few individuals or small groups (i.e., hermeneutics, participant observation, thick description, semiotics, and close reading)
  • 9.
    First type ofWebometrics โ€ข Hyperlink Network Analysis - Inter-linkage: who linked to whom matrix - Co-inlink: a link to two different nodes from a third node - Co-outlink: A link from two different nodes to a third node Bjรถrneborn (2003)
  • 10.
    My First SSCIResearch: 44 Websites ๏ฎ categorically selected sites ๏ฎ financial sites the most central ๏ฎ revenue sources: advertising & e.c. ๏ฎ common payment: credit card
  • 11.
    The future ofsocial relations ๏ฝ The social benefits of internet use will far outweigh the negatives over the next decade.They say this is because email, social networks, and other online tools offer โ€˜lowโ€frictionโ€™ opportunities to create, enhance, and rediscover social ties that make a difference in peopleโ€™s lives. ๏ฝ Some 85%agreed with the statement: โ€œIn 2020,when I look at the bigpicture and consider my personal friendships,marriage and other relationships,I see that the internet has mostly been apositive force on my social world.And this will only grow more true in the future.โ€ ๏ฝ โ€œThere's no escapingpeople anymore,and I believe that will yield better relationships.โ€โ€”Jeff Jarvis,
  • 12.
    M.Castells (2009), CommunicationPower ๏ฝ 1) Networkingpower: the power over who and what is included in the network. โ€˜Mass self-communicationโ€™, the use of new media for private messages that are able to reach masses ๏ฝ 2) Network power: the power of the protocols of network communication. In mass self-communication the diversity of formats is the rule and that this amplifies the diffusion of messages ๏ฝ 3) Networked power: the power of certain nodes over other nodes inside the network.This is the managerial, agenda- setting, editorial and decision makingpower in the organizations that own or operate networks. ๏ฝ 4) Network-makingpower: the capacity to set-up and program a network โ€“ of multimedia or traditional mass communication- by their owners and controllers
  • 13.
    Given that socialmediaconnect individuals in dramatically different ways, research questions are like these: W hat do people talk? W ho can see what? W ho can reply to whom? How longis content visible? W hat can link to what? W ho can link to whom? Webometricsand Hyperlink Network Analysiscan be particularlyuseful to answer these questions!!!
  • 14.
    Big Data andSocial Webometrics Network Analysis Increasing data size in terms of the no. of nodes Micro โ‰ฆ100 nodes โ†’10K Meso โ‰ฆ1000 nodes โ†’1000K Macro โ‰ฆ10000 nodes โ†’100,000K Super- Macro โ‰ฅ10000 nodes โ†’ โˆฝ ์ถœ์ฒ˜: ๋ฐ•ํ•œ์šฐ(2014)
  • 15.
    โ€œThose studies perpetuatethe idea that linking behaviour is not random, and that links are โ€˜socially significant in some wayโ€™. In this perspective, links have an โ€˜information side-effectโ€™, they can be used to understand other facts even though they were not individually designed to do so: โ€˜information side-effects are by-products of data intended for one use which can be mined in order to understand some tangential, and possibly larger scale, phenomenaโ€™
  • 16.
    Park and hiscolleagues were extensively cited: 9 times! โ€ข Barnett GA, Chung CJ and Park HW (2011) Uncovering transnational hyperlink patterns and web mediated contents: a new approach based on cracking.com domain. Social Science Computer Review 29(3): 369โ€“384. โ€ข Hsu C and Park HW (2011) Sociology of hyperlink networks of Web 1.0, Web 2.0, and Twitter: a case study of South Korea. Social Science Computer Review 29(3): 354โ€“368. โ€ข Park HW (2003) Hyperlink network analysis: a new method for the study of social structure on the web. Connections 25(1): 49โ€“61. โ€ข Park HW (2010) Mapping the e-science landscape in South Korea using the webometrics method. Journal of Computer-Mediated Communication 15(2): 211โ€“229. โ€ข Park HW and Jankowski NW (2008) A hyperlink network analysis of citizen blogs in South Korean politics. Javnost: The Public 15(2): 5โ€“16. โ€ข Park HW and Thelwall M (2003) Hyperlink analyses of the World Wide Web: a review. Journal of Computer-Mediated Communication 8(4). โ€ข Park HW and Thelwall M (2008) Developing network indicators for ideological landscapes from the political blogosphere in South Korea. Journal of Computer- Mediated Communication 13(4): 856โ€“879. โ€ข Park HW, Kim C and Barnett GA (2004) Socio-communicational structure among political actors on the web in South Korea. New Media & Society 6(3): 403โ€“423. โ€ข Park HW, Thelwall M and Kluver R (2005) Political hyperlinking in South Korea: technical indicators of ideology and content. Sociological Research Online 12(3).
  • 17.
    A comment fromthose who are NOT doing a hyperlink analysis โ€ข In a chapter of The Sage Handbook of Online Research Methods edited by Fielding et al. (2008), Horgan emphasizes that โ€˜link analysisโ€™ has become an active research domain in examining social behavior online. 17
  • 18.
  • 19.
    2nd type ofWebometrics: Web Visibility ๏ฝ Web mention as an indicator of online viral power and reputation ๏ฝ Presence or appearance of actors or issues beingdiscussed by the public (Internet users) on the web. ๏ฝ Trackingweb visibility is powerful way to get an insight into public reactions to actors or issues.
  • 20.
    Construct validity ofwebometrics data Ackland, R. (2013). Web Social Science: Concepts, Data and Tools for Social Scientists in the Digital Age. Sage. P. 16.
  • 21.
    How to eitherempirically or theoretically demonstrate the construct validity of web data for social science research? โ€ข By testing whether the online network displays structural signatures that are consistent with those displayed by real-world actors. โ€“ For example: Does Facebook friendship network data display homophily on the basis of race, ethnicity, etc.? โ€ข By testing whether variables constructed from web data are correlated with other accepted measures of the construct. โ€“ For example: If counts of inbound hyperlinks to academic project websites are correlated with other characteristics of academic teams (e.g. publications, industry connections) that are used as proxies of academic authority or performance, then this is evidence of the construct validity of hyperlink data in the context of scientometrics. โ€ข If it can be shown that an actor's position in an online network has influence on his or her performance or outcomes in a manner that accords with what is found offline.
  • 22.
  • 24.
    WCU WEBOMETRICS INSTITUTE INVESTIGATING INTERNET-BASED POLITICSWITH E-RESEARCH TOOLS Park, H. W. (2010). Mapping the e-science landscape in South Korea using the webometrics method. Journal of Computer-Mediated Communication, Vol. 15, No. 2. 211 โ€“ 229 Computational perspective based on the use of high performance computing to facilitate high-speed processing of large volumes of digital data e-Science in humanities and social sciences The networking perspective based on virtual collaboration through the Grid Two major strands exist in computational science (also called e-Science) ? A third alternative strand
  • 25.
    Computational Social Science(CSS) A minor but growing approach to the study of society Focus on the methodological perspective based on the use of new digital tools to manage the data deluge
  • 26.
    Computational (Social) Science ๏ฝFocus on the methodological perspective based on the use of new digital tools to manage the data deluge. ๏ฝ D evelopment of e-science tools to automate research process. ๏ฝ Experimentation with new types of data visualization.
  • 27.
    Measuring information exposurein dynamic and dependent networks (ExpoNet) According to the OECD's Global Science Forum 2013 report, social scientists' inability to anticipate the Arab Spring was partly due to a failure to understand 'the new ways in which humans communicate' via social media and the ways they are exposed to information. And social media's mixed record for predicting the results of recent UK elections suggests better tools and a unified methodology are needed to analyse and extract political meaning from this new type of data. โ€ข http://www.ncrm.ac.uk/research/ExpoNet/
  • 28.
    Why Data Science? Savageand Burrows (2007, p. 886) lament, โ€œFifty years ago, academic social scientists might be seen as occupying the apex of the โ€“ generally limited โ€“ social science research โ€˜apparatusโ€™. Now they occupy an increasingly marginal position in the huge research infrastructureโ€. Bonacich, P. (2004). The Invasion of the Physicists. Social Networks 26(3): 285-288
  • 29.
    All models arewrong but some are useful Emergence of data author on dataverse
  • 30.
    Andersons claims ๏ฝ Datais everythingwe need. ๏ฝ We don't have to settle for models. ๏ฝ Agnostic statistics. ๏ฝ Out with every theory of human behavior. ๏ฝ This approach to science โ€” hypothesize, model, test โ€” is becomingobsolete. ๏ฝ Petabytes allow us to say: "Correlation is enough." We can stop lookingfor models. ๏ฝ W hat can science learn from Google? E-Science.
  • 31.
    Big data and the end of theory? • Does big data have the answers? Maybe some, but not all, says Mark Graham. • In 2008, Chris Anderson, then editor of Wired, wrote a provocative piece titled The End of Theory. Anderson was referring to the ways that computers, algorithms, and big data can potentially generate more insightful, useful, accurate, or true results than specialists or domain experts who traditionally craft carefully targeted hypotheses and research strategies. • We may one day get to the point where sufficient quantities of big data can be harvested to answer all of the social questions that most concern us. I doubt it though. There will always be digital divides; always be uneven data shadows; and always be biases in how information and technology are used and produced. • And so we shouldn't forget the important role of specialists to contextualize and offer insights into what our data do, and maybe more importantly, don't tell us. http://www.guardian.co.uk/news/datablog/2012/mar/09/big-data-theory
  • 32.
    The Coming of a Triple Divide? There are three main gaps I'd like to emphasize in the present and future Big Data research community: 1) developing/transitional vs. developed/advanced countries, 2) researchers in academia vs. researchers in the commercial sector, 3) researchers with computational skills vs. less computational scholars.
  • 33.
    No. of articles in each category of methods by the developed/developing division
    Method used           Developed country/region   Developing country/region   Mixed region
                          N       %                  N       %                   N       %
    Social-Informetrics   114     74.51              30      83.33               9       52.94
    Scientometrics        28      18.30              6       16.67               8       47.06
    Webometrics           11      7.19               0       0                   0       0
    Total                 153     100                36      100                 17      100
    Source: Skoric, M. M. (2013, Online First). The implications of big data for developing and transitional economies: Extending the Triple Helix? Scientometrics.
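The percentage columns in the table above can be recomputed from the article counts. The following is a minimal sketch, assuming the counts as transcribed from the slide; the dictionary layout and the function name `column_shares` are illustrative, not part of the cited study.

```python
# Article counts transcribed from the table above (Skoric, 2013).
# The dictionary layout and key names are illustrative assumptions.
counts = {
    "Social-Informetrics": {"developed": 114, "developing": 30, "mixed": 9},
    "Scientometrics": {"developed": 28, "developing": 6, "mixed": 8},
    "Webometrics": {"developed": 11, "developing": 0, "mixed": 0},
}


def column_shares(counts):
    """Return per-region totals and each method's percentage share per region."""
    regions = ["developed", "developing", "mixed"]
    totals = {r: sum(row[r] for row in counts.values()) for r in regions}
    shares = {
        method: {r: round(100 * row[r] / totals[r], 2) for r in regions}
        for method, row in counts.items()
    }
    return totals, shares


totals, shares = column_shares(counts)
for method, row in shares.items():
    print(method, row)
```

Each percentage is a column share (method count divided by the region total), which is why every region column sums to 100.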
  • 35.
  • 37.
    http://www.nature.com/nature/journal/v455/n7209/ 4 September 2008, Volume 455, Number 7209, pp. 1-136: "what Big Data sets mean for contemporary science"
  • 40.
    This approach to science is attributed to the late Jim Gray of Microsoft, one of the most influential computer scientists.
  • 41.
    Science published a special issue (February 11, 2011) looking broadly at increasingly data-driven research efforts as a scientific domain (Science staff, 2011). Data Science is composed of interrelated clusters of research tasks. For example, technologies for data collection, curation, and access, and the unique skill sets they require, have increasingly been central to Data Science (Science staff, 2011).
  • 42.
    Phrase map of highly occurring keywords 1999-2005. Halevi, G., & Moed, H. F. (2012).
  • 43.
    Phrase map of highly occurring keywords 2006-2012. Halevi, G., & Moed, H. F. (2012).
  • 44.
    Park, H. W., & Leydesdorff, L. (2013, Work-In-Progress). Decomposing a Data-Driven Science Using a Scientometric Method. • Halevi and Moed (2012) and Rousseau (2012) are based on descriptive statistics. Therefore, we intend to add the network perspective, in both social networks (in terms of co-authorship) and semantic networks. • Furthermore, we extend search queries to various terminologies related to Data Science because the term "big data" is regarded only as one among a list of policy priority issues. • We show where the research system in Data Science is "hot" in terms of international collaborations and prevailing semantics.
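The co-authorship perspective mentioned above can be illustrated with a small sketch. It assumes each paper is available as a list of (author, country) pairs; the records, author labels, and function names below are hypothetical and do not reproduce the study's data or procedure.

```python
from collections import Counter
from itertools import combinations

# Hypothetical records: each paper is a list of (author, country) pairs.
papers = [
    [("A", "KR"), ("B", "NL")],
    [("A", "KR"), ("C", "US"), ("B", "NL")],
    [("D", "US"), ("C", "US")],
]


def coauthor_edges(papers):
    """Count how often each unordered pair of authors shares a paper."""
    edges = Counter()
    for paper in papers:
        authors = sorted({name for name, _ in paper})
        edges.update(combinations(authors, 2))
    return edges


def country_edges(papers):
    """Count papers that link each pair of distinct countries."""
    edges = Counter()
    for paper in papers:
        countries = sorted({country for _, country in paper})
        edges.update(combinations(countries, 2))
    return edges


print(coauthor_edges(papers).most_common())
print(country_edges(papers).most_common())
```

The weighted pairs form the edge list of a co-authorship network; the country-level counts give one simple view of where international collaboration is concentrated.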
  • 46.
    Park, H. W., & Leydesdorff, L. (2013). Decomposing Social and Semantic Networks in Emerging "Big Data" Research. Journal of Informetrics, 7(3), 756-765.
  • 48.
    The Signal and the Noise: Why Most Predictions Fail but Some Don't. Nate Silver. "I do not go as far as a Popper in asserting that such theories are therefore unscientific or that they lack any value. However, the fact that the few theories we can test have produced quite poor results suggests that many of the ideas we haven't tested are very wrong as well. We are undoubtedly living with many delusions that we do not even realize." (page 15)
  • 49.
    OECD (2012). OECD Technology Foresight Forum 2012 - Harnessing data as a new source of growth: Big data analytics and policies. OECD Headquarters, Paris, France, 22 October 2012.
  • 50.
  • 51.
    Algorithmic management of socially shared information: Facebook as a designed social system • Which features should be deployed? [Ugander-Karrer-Backstrom-Kleinberg 2013] • Which discussions will be most active? [Backstrom-Kleinberg-Lee-Danescu-Niculescu-Mizil 2013] • Which memes will receive the most reshares? [Cheng-Adamic-Dow-Kleinberg-Leskovec 2014] • Which links should be emphasized? [Backstrom-Kleinberg 2014] http://cdn.oreillystatic.com/en/assets/1/event/119/Computational%20Problems%20in%20Managing%20Social%20Information%20%20Presentation.pdf
  • 52.
    A typical FB user writes 60-70% of comments to ≈ 15 people. [Backstrom-Bakshy-Kleinberg-Lento-Rosenn 2011] http://www.cs.cornell.edu/home/kleinber/icwsm11-attention.pdf
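A concentration figure of this kind (the share of a user's comments going to their most frequent recipients) can be sketched as follows. This is an illustrative calculation on a hypothetical comment log, not the measurement procedure of the cited paper; the function name `top_k_share` and the sample data are assumptions.

```python
from collections import Counter


def top_k_share(recipients, k=15):
    """Fraction of comments that go to the k most frequent recipients."""
    if not recipients:
        return 0.0
    counts = Counter(recipients)
    top = sum(n for _, n in counts.most_common(k))
    return top / len(recipients)


# Hypothetical comment log for one user: one recipient id per comment.
log = ["u1"] * 5 + ["u2"] * 3 + ["u3", "u4"]
print(top_k_share(log, k=2))
```

With k set to 15 and a real per-user log, a value between 0.6 and 0.7 would correspond to the pattern reported on the slide.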
  • 54.
    Economics in the age of big data http://www.sciencemag.org/content/346/6210/1243089.full
  • 55.
  • 56.
  • 57.
  • 58.
    A more recent development was the establishment of journals that include the term "Data Science" or related terms in their titles: • Data Science Journal in 2002 • Journal of Data Science in 2003 • EPJ Data Science in 2012 • GigaScience (gigasciencejournal.com) in 2012 • Big Data & Society in 2015
  • 59.
    Ying Huang • Jannik Schuehle • Alan L. Porter • Jan Youtie
  • 60.
  • 61.
  • 62.
  • 63.
  • 64.
  • 65.
  • 67.
  • 68.
    The chart Tim Cook doesn't want you to see http://qz.com/122921/the-chart-tim-cook-doesnt-want-you-to-see/
  • 70.
    Kim, G. H., Trimi, S., & Chung, J. H. (2014). Big-data applications in the government sector. Communications of the ACM, 57(3), 78-85.
  • 71.
    Kim, G. H., Trimi, S., & Chung, J. H. (2014). Big-data applications in the government sector. Communications of the ACM, 57(3), 78-85.
  • 72.
    Kim, G. H., Trimi, S., & Chung, J. H. (2014). Big-data applications in the government sector. Communications of the ACM, 57(3), 78-85.
  • 73.
    Yet, there still are serious problems to overcome. A trenchant critique concerning the big data field as it is nowadays came in the form of six statements intending to temper unbridled enthusiasm. [42] These six provocative statements are: • Big data change the definition of knowledge; • Claims to accuracy and objectivity are misleading; • More data are not always better data; • Taken out of context, big data loses its meaning; • Just because it is accessible, it does not make it ethical; and • (Limited) access to big data creates a new digital divide. Rousseau (2012)
  • 74.
    Big Data's Slippery Issue of Causation vs. Correlation
  • 75.
    Big Data's Slippery Issue of Causation vs. Correlation
  • 76.
  • 77.
  • 78.
  • 79.
  • 80.
  • 81.
  • 82.
  • 83.
    Kobayashi, T., & Boase, J. (2012). No Such Effect? The Implications of Measurement Error in Self-Report Measures of Mobile Communication Use. Communication Methods and Measures, 6, 1-18. DOI: 10.1080/19312458.2012.679243
  • 84.
    Christakis, N. A., & Fowler, J. H. (2009). Connected: The Surprising Power of Our Social Networks and How They Shape Our Lives. - NY Times
  • 85.
  • 86.
    Christakis, N. A., & Fowler, J. H. (2014). Friendship and natural selection. Proceedings of the National Academy of Sciences, 111(Supplement 3), 10796-10801. https://www.youtube.com/watch?v=6vwg0dJY1NM
  • 87.
    Creativity needs a moderately small world. Financial success of Broadway musicals, 1945 to 1989
  • 88.
    Small worlds and artistic success. Artistic success of a show
  • 89.
    Borgatti et al. (2009). Structural holes
  • 90.
    Using Big Data to Fight Range Anxiety in Electric Vehicles • The software acquires data from five sources: Google Maps (for route, terrain, and traffic data), Wunderground.com (for weather), driver history (through driving behavior measurements), vehicle manufacturers (for vehicle modeling data), and battery manufacturers (for battery modeling data). http://spectrum.ieee.org/cars-that-think/transportation/sensors/using-big-data-to-fight-range-anxiety-in-electric-vehicles
  • 91.
  • 96.
    Mike Thelwall: WA2.0 http://lexiurl.wlv.ac.uk/index.html
  • 97.
  • 98.
    Han Woo PARK: KrKWIC, WeboNaver, WeboDaum
  • 100.
  • 101.
  • 102.
  • 103.
  • 104.
  • 105.
  • 106.
  • 107.
  • 108.
  • 109.
    An open-data tool built with ArcGIS. World Bank data, etc. Cool.
  • 110.
  • 111.
    O'Reilly: 10 data trends on our radar for 2016. 1. Metadata 2. Systems optimization via deep neural networks. For example, a search on Google for "let it be lyrics" returns the lyrics of the classic Beatles song at the top of the search results. But a search for "let it go lyrics" doesn't return such an interface element, despite the immense popularity of this Disney song and the wide availability of its lyrics.
  • 113.
    Help users ask good questions, rather than attempt to answer bad ones. You can see this in action on LinkedIn, where typing "micr" into a search box triggers search suggestions like "Jobs at Microsoft" and "People who work at Microsoft".
  • 114.
    Artificial Intelligence and Intelligence Augmentation: Very Different Approaches Yield Very Different Results. "Artificial intelligence" is the idea of a computer system that, by reproducing human cognition, allows that system to function autonomously and effectively in a given domain. An AI system demonstrates a kind of intentionality: it initiates action in its environment and pursues goals. "Intelligence augmentation," on the other hand, is the idea of a computer system that supplements and supports human thinking, analysis, and planning, leaving the intentionality of a human actor at the heart of the human-computer interaction. Because intelligence augmentation focuses on the interaction of humans and computers, rather than on computers alone, it is also referred to as "HCI." http://www.financialsense.com/contributors/guild/artificial-intelligence-vs-intelligence-augmentation-debate
  • 115.
    Twitter taught Microsoft's AI chatbot to be a racist asshole in less than a day http://www.theverge.com/2016/3/24/11297050/tay-microsoft-chatbot-racist
  • 116.
  • 117.
    Prof. Han Woo PARK, Department of Media and Communication, YeungNam University, Korea. hanpark@ynu.ac.kr http://www.hanpark.net WCU WEBOMETRICS INSTITUTE: INVESTIGATING INTERNET-BASED POLITICS WITH E-RESEARCH TOOLS