How to utilize ‘big data’ on SNS for academic purpose?

A Keynote to the Japanese Society of Socio-Informatics Annual Meeting, September 14, 2012, Maebashi

Published in: Technology, Education
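  • Profile photos. Photos that clearly show the user were used more by the American users. Although not shown in the graph above, 69% of the Korean users used a photo of something else. The next two slides collect profile photos from US Facebook (some of the 58 users) and Korean Cyworld (some of the 92 users); these are photos of some of the survey participants who gave permission for their profile content to be analyzed. The photos were collected in fall 2009 (October?).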
  • So we wanted to see whether politicians’ Twitter networks really differ from those in previous studies of politicians’ networks, and we collected data. We collected 189 politicians and divided them into two political groups, the ruling party and the opposition parties, because there were simply too many parties in South Korea.
  • And this is the table of network cohesiveness. (explanation)
  • It could be more intuitive to see this through graphics. We depicted the politicians’ Twitter network, drawing the mention network over the following-follower network (explanation, if necessary).
  • The distribution among politicians confirms this. The following-follower network shows a linear function, whereas the mention network is closer to a power-law distribution. Reciprocity-based connection is basically “I link to people who linked to me”, hence the linear function. The gravity toward popular people is “I mention people who get the most attention”, so the preferential attachment principle of connecting leads to a power-law function.
  • (conclusion)
  • How to utilize ‘big data’ on SNS for academic purpose?

    1. 1. How to utilize ‘big data’ on SNS for academic purpose? Virtual Knowledge Studio (VKS) Asso. Prof. Dr. Han Woo PARK CyberEmotions Research Institute Dept. of Media & Communication YeungNam University 214-1 Dae-dong, Gyeongsan-si, Gyeongsangbuk-do 712-749 Republic of Korea http://www.hanpark.net http://asia-triplehelix.org A Keynote to the Japanese Society of Socio-Informatics Annual Meeting, September 14, 2012, Maebashi
    2. 2. Requoted from http://novaspivack.typepad.com/nova_spivacks_weblog/2007/02/steps_towards_a.html
    3. 3. Big data • Big data usually includes data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process within a tolerable elapsed time. • Big data sizes may vary by discipline. • Characteristics: Gartner’s 3Vs plus SAS’s VC - Volume (amount of data), velocity (speed of data in and out), variety (range of data types and sources) - Variability: data flows can be highly inconsistent, with daily, seasonal, and event-triggered peak data loads - Complexity: multiple data sources require cleaning, linking, and matching the data across systems.
    4. 4. Computational Social Science A minor but growing approach to the study of society Focus on the methodological perspective based on the use of new digital tools to manage the data deluge
    5. 5. CSS Approach 1. Development of webometric tools to automate the social Internet research process (e.g., data collection and analysis from search engines, SNS, and microblogging sites) 2. Experimentation with new types of data visualization (e.g., HNA and dynamic geographical mappings using Google)
    6. 6. Mike Thelwall: WA 2.0 http://lexiurl.wlv.ac.uk/index.html
    7. 7. Marc Smith: NodeXL http://nodexl.codeplex.com/
    8. 8. Research tradition of Webometrics • 1) development of online tools to automate the Internet research process, such as data collection and analysis • 2) experimentation with new types of data visualization, such as social network and hyperlink analysis and multimedia and dynamic mappings
    9. 9. Interface (WCU WEBOMETRICS INSTITUTE: INVESTIGATING INTERNET-BASED POLITICS WITH E-RESEARCH TOOLS) The interface is fairly self-explanatory: - Tick or untick to collect either only the hit number or the title, URL, and description of the results - Select which of the search options you want to include - Click on the '...' button to select the text file that contains the queries you wish to run - Click 'Run Queries'
    10. 10. http://english-webometrics.yu.ac.kr/WebometricsTools/WeboNaver/WeboNaver.html
    11. 11. Cyworld Extractor - Overview Java-based software tool that, given the URL of a politician on Cyworld, extracts comments left by citizens along with related profile attributes. The extracted data, which can amount to thousands of records, is stored in a format suitable for import into statistical software
    12. 12. Twitter Extractor - Overview Sharing a similar interface and extraction mechanism with the Cyworld extractor, this application requires the URL of a user on Twitter. It is then possible to collect all tweets and determine the attributes of the user’s follower / following network
    13. 13. Korean Internet Network Miner: A Korean version of ICTA
    14. 14. http://www.openamplify.com/
    15. 15. OhMyNews vs. Chosun: Emotionality comparison (Jul 2009 - Feb 2010) [Chart: polarity per topic on a scale from -1.00 to 1.00, with one series for OhMyNews and one for Chosun] Topics shown: water, France, EU, Independent, Africa, Kabul, gas, Colombia, Venezuela, Pakistan, press, Hollywood, parliament, American, Italy, police, Hungary, Google, voter, Europe, Russia, Copenhagen, election, Obama, Haiti, India, China, CommunistParty, Afghanistan, PresidentBarackObama, Canada, Korea, Taliban, warming, Poland, Japan, Australia, ban, U.S., climatechange, opposition, H1N1, Authority, Belgium, Dalai, Sweden, Palestinian, pandemic, woman, Israel, oil, UN, Conservative, Asia, Internet, Afghan, journalist, economy, Brazil, Amazon, NorthKorea, Jerusalem, Berlusconi, ASEAN, Uganda, Brussels
    16. 16. • Using sentiment analysis, we try to find differences and similarities in the emotional polarity of the main topics covered in news stories by OhMyNews versus Chosun. • "MEAN POLARITY" represents polarity on a scale from -1 (negative) to 1 (positive) for 78 popular topics covered in both newspapers. • For example, the topic "Uganda" tends to be mentioned in a positive context by OhMyNews but in a negative context by Chosun, while the topic "opposition" tends to be neutral in OhMyNews but positive in Chosun, and so on.
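The comparison described on this slide can be sketched as a per-outlet, per-topic average followed by a difference. This is only an illustration under stated assumptions: the scores below are invented, and the function names `mean_polarity` and `polarity_gaps` are hypothetical rather than part of any tool named in the deck.

```python
from collections import defaultdict
from statistics import mean


def mean_polarity(scored_mentions):
    """Average polarity per (outlet, topic) from (outlet, topic, score) tuples."""
    buckets = defaultdict(list)
    for outlet, topic, score in scored_mentions:
        if not -1.0 <= score <= 1.0:
            raise ValueError("polarity must lie in [-1, 1]")
        buckets[(outlet, topic)].append(score)
    return {key: mean(values) for key, values in buckets.items()}


def polarity_gaps(means, outlet_a, outlet_b):
    """Difference (a minus b) in mean polarity for topics covered by both outlets."""
    topics_a = {topic for (outlet, topic) in means if outlet == outlet_a}
    topics_b = {topic for (outlet, topic) in means if outlet == outlet_b}
    return {
        topic: means[(outlet_a, topic)] - means[(outlet_b, topic)]
        for topic in topics_a & topics_b
    }


# Hypothetical scores, invented for illustration only (not the study's data).
sample = [
    ("OhMyNews", "Uganda", 0.5), ("OhMyNews", "Uganda", 0.25),
    ("Chosun", "Uganda", -0.5), ("Chosun", "Uganda", -0.25),
    ("OhMyNews", "opposition", 0.0), ("Chosun", "opposition", 0.5),
]
means = mean_polarity(sample)
gaps = polarity_gaps(means, "OhMyNews", "Chosun")
```

A positive gap means the first outlet frames the topic more positively than the second; topics covered by only one outlet are left out of the comparison.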
    17. 17. Web archiving of Korean MPs: http://www.web-archive.kr/
    18. 18. Incoming International Hyperlink in 2009 (drawn using ManyEyes.com)
    19. 19. Incoming International Hyperlink in 2009 (drawn using Google Earth)
    20. 20. Frequently occurring key words in e-science webpages in South Korea E-science in Asia: Dreams and realities for social science research Created on Many Eyes(http://many-eyes.com) Results Park, H. W. (2010). Mapping the e-science landscape in South Korea using the webometrics method. Journal of Computer-Mediated Communication, Vol. 15, No. 2. 211 – 229
    21. 21. Websites retrieved more than two times Note: Websites are larger according to their frequency of retrieval; however, their colors and locations are randomly chosen for the best visualization Park, H. W. (2010). Mapping the e-science landscape in South Korea using the webometrics method. Journal of Computer-Mediated Communication, Vol. 15, No. 2. 211 – 229
    22. 22. Why CSS? • Savage and Burrows (2007, p. 886) lament, “Fifty years ago, academic social scientists might be seen as occupying the apex of the – generally limited – social science research ‘apparatus’. Now they occupy an increasingly marginal position in the huge research infrastructure.” • Bonacich, P. (2004). The Invasion of the Physicists. Social Networks 26(3): 285-288
    23. 23. <Table 1> Development stage of e-Science, from traditional science (stage 1) toward e-Science (stage 4), Nentwich (2003). Entries are listed in stage order for each type:
        Information gathering: Libraries; personal conversations -> Offline database -> Online databases; link collections; discussion lists -> Digital libraries; Knowbots
        Data production: Interviews; experiments -> Electronic text analysis; simulation/modeling -> Internet surveys -> Distributed computing; virtual reality
        Data management (three entries given): Card files; lists -> Hypertextual card files; databases -> Networked card files; de-central databases
        Data processing/analysis: With paper and pencil -> Electronic data-processing; expert systems -> Modelling; simulations -> Artificial intelligence
    24. 24. 24
    25. 25. All models are wrong but some are useful - Emergence of data author on dataverse
    26. 26. “Webometrics refers to a set of research methods that illustrates texts and their web linkages as a network and quantitatively examine the spreadable aspects of web- mediated communication activities of social actors and issues (Jenkins, 2011), in comparison to traditional methods (Savage & Burrows, 2007; Salganik & Levy, 2012). ” (by Han Woo Park)
    27. 27. http://participatorysociety.org/wiki/index.php?title=Online_Research
    28. 28. Seminal publications (view real-time citation rates): • Garton, L., Haythornthwaite, C., & Wellman, B. (1997). Studying online social networks. Journal of Computer-Mediated Communication, 3(1). • Wellman, B. (2001). Computer networks as social networks. Science, 293(14), 2031-2034. • Park, H. W. (2003). Hyperlink network analysis: A new method for the study of social structure on the web. Connections, 25(1), 49-61. • Park, H. W., & Thelwall, M. (2003). Hyperlink analyses of the World Wide Web: A review. Journal of Computer-Mediated Communication, 8(4).
    29. 29. Recent special issues related to CSS • Social Science Computer Review, 2011, 29(3), Theme: Social Networking Activities Across Countries • Asian Journal of Communication, 2011, 21(5), Theme: Online Social Capital and Participation in Asia-Pacific • Scientometrics, 2012, 90(2), Theme: Triple Helix and Innovation in Asia using Scientometrics, Webometrics, and Informetrics • Journal of Computer-Mediated Communication, 2012, 17(2), Theme: Hyperlinked Society
    30. 30. Selected recent publications related to CSS • Park, H. W., Barnett, G. A., & Chung, C. J. (2011). Structural changes in the global hyperlink network: Centralization or diversification. Global Networks, 11(4), 522-542. • Lim, Y. S., & Park, H. W. (2011). How do congressional members appear on the web?: Tracking the web visibility of South Korean politicians. Government Information Quarterly, 28(4), 514-521. • González-Bailón, S., Banchs, R. E., & Kaltenbrunner, A. (2012). Emotions, public opinion, and U.S. presidential approval rates: A 5-year analysis of online political discussions. Human Communication Research. • Sams, S., & Park, H. W. (2012, forthcoming). The presence of hyperlinks and messages on social networking sites: A case study of Cyworld in Korea. Journal of Computer-Mediated Communication. • Nam, Y., Lee, Y.-O., & Park, H. W. (2013, March). Can web ecology provide a clearer understanding of people’s information behavior during election campaigns? Social Science Information.
    31. 31. Web Hyperlink Networks as Social Networks
    32. 32. How different across disciplines?
    33. 33. Social media refers to a set of online tools that supports social interaction between users.
    34. 34. Cross-Cultural Analysis of Beehive Status Messages within IBM: Users in high power distance cultures may use status messages more for indicating general career interests and skills than for time-based updates on what one is doing or how one is feeling
    35. 35. Self-oriented profile photo
        Item | Facebook users (N=58) | Cyworld users (N=92) | Total | X2
        Self-oriented photo | 52 (90%) | 17 (26%) | 69 (37%) | 72.55***
    36. 36. • A cross-cultural comparison of Twitter use between Korea and Japan • How do cultural attitudes of users influence their Twitter use?
    37. 37. Participants and Their Twitter Use
        Measure | Korea (N=286) | Japan (N=283)
        Gender | valid no. 165; male 106 (64%); female 59 (36%) | valid no. 204; male 145 (71%); female 59 (29%)
        Reciprocity | 76.40% | 73.80%
        No. of Tweets | 4292 | 9347**
        No. of followers | 1047** | 323
        No. of followings | 980** | 285
        Pieces of geographic information | 166 (58%) | 143 (51%)
        No. of metropolitans | valid no. 154; 111 (72%) | valid no. 143; 68 (48%)
        * The percentages for gender and no. of metropolitans were calculated using only the valid cases.
    38. 38. Differences in Twitter Use between Korea and Japan 1 • No significant difference in the proportion of reciprocal connections. • A high proportion of reciprocal connections • Face negotiation theory (Ting-Toomey, 1988) • In collectivistic cultures, members consider their partners’ face as long as this consideration does not conflict with the members’ individual needs. • Korean and Japanese users might not have wanted to embarrass their followers by providing no response.
    39. 39. Differences in Twitter Use between Korea and Japan 2 • Korean users had more followers and followings, which indicates that Korean users are more tolerant of in-groups than Japanese users. • Simple versus contextual collectivism • Koreans, who reflect simple collectivism, are flexible in defining in-groups depending on the situation, and it is common for Koreans to belong to more than one in-group. • In Japanese culture, which reflects contextual collectivism, members are likely to maintain a few confined in-groups throughout their lives regardless of the situation or context.
    40. 40. Differences in Twitter Use between Korea and Japan 3 • Japanese users posted more messages through their Twitter timeline. • Unexpected result based on cross-cultural theories • The unexpected result may be explained by differences in the history of mobile communication (not in cultural traits) between the two countries. • Japan’s mobile communications industry
    41. 41. Types of Tweets: Korea vs. Japan * IS: Information Sharing; SP: Self-Promotion; OC: Opinions/Complaints; RT: Statements/Random Thoughts; ME: Me Now; QF: Questions for Followers; PM: Presence Maintenance; AM: Anecdotes-Me; AO: Anecdotes-Others. * Blue bar: Korea; red bar: Japan.
    42. 42. Analysis of Tweets (analysis example) • For Korean users, the primary purpose of using Twitter was information sharing, a goal-oriented communication action. • Information sharing is an effective communication strategy both for facilitating faithful interactions and for maintaining individuality. • For Japanese users, it was personal graffiti in which the communication context disappears (or is unnecessary). • Their messages seemed as if they were talking to themselves in public: self-disclosure while excluding out-group members from their personal lives.
    43. 43. Message category frequency [Bar chart: proportion of all messages (axis 0 to 0.45) by category (IS, SP, OC, RT, ME, QF, PM, AM, AO) for Korea, Japan, and US]
    44. 44. Conclusion • Twitter users in Korea tend to embrace their Twitter connections within the in-group boundary, despite some differences between Twitter connections and offline in- group members in terms of the degree of intimacy. • Twitter users in Japan tend to control their content and connections to maintain closed social relationships. • Social media have changed the ways in which individuals socialize and communicate through the Internet across countries and cultures. • By negotiating with local cultures, users develop their own communication strategies for global social media.
    45. 45. SOCIAL MEDIA USE IN THE QUEENSLAND FLOODS
    46. 46. DATA COLLECTION • Messages tweeted from 15:00 March 11, 2011 to 7:00 March 13, 2011 • The earthquake occurred around 14:46 on March 11 in Miyagi, Fukushima, Iwate, northern Ibaraki and, further, Tokyo, Ibaraki, Chiba (South Kanto area) • During the earthquake, people in the area could not use their cell phones for calls but could access Twitter. • Four 10-hour time periods: Beginning 03/11 15:00 to 03/12 01:00; Middle 1 03/12 01:01 to 03/12 11:00; Middle 2 03/12 11:01 to 03/12 21:00; Last 03/12 21:01 to 03/13 07:00
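A minimal sketch of how a tweet timestamp could be assigned to one of the four periods listed on this slide. The function name `period_of` and the choice to give a boundary minute (such as 01:00 on 03/12) to the earlier period are assumptions read off the slide's ranges, not the authors' documented procedure.

```python
from datetime import datetime, timedelta

START = datetime(2011, 3, 11, 15, 0)  # 03/11 15:00, start of the collection window
PERIOD = timedelta(hours=10)
LABELS = ["Beginning", "Middle 1", "Middle 2", "Last"]


def period_of(timestamp):
    """Return the period label for a timestamp, or None if outside the window."""
    offset = timestamp - START
    if offset < timedelta(0) or offset > PERIOD * len(LABELS):
        return None
    # Ceiling division: an offset of exactly 10 h stays in the first period,
    # mirroring "15:00 to 01:00" followed by "01:01 to 11:00" on the slide.
    completed = -(-offset // PERIOD)
    return LABELS[max(0, completed - 1)]
```

For example, `period_of(datetime(2011, 3, 12, 1, 1))` falls in "Middle 1", while a timestamp before 15:00 on March 11 is outside the window.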
    47. 47. VALID CASES
        Beginning | Middle1 | Middle2 | Last | Total
        112 | 196 | 141 | 119 | 568
        Only the earthquake-related messages among the Japanese-language messages posted during the data collection period were analyzed.
    48. 48. TYPE OF TWEET (Heverin, T. & Zach, L., 2010)
        Information-related | Official information from news reports or government notices
        Opinion-related | Opinion or commentary related content on the Japan earthquake
        Technology/Media related | Messages in which any type of media (including television and social media) is mentioned
        Emotion-related | Personal emotional statements and concern about the situation and sufferers
        Action-related | Suggestions and plans for helping sufferers (e.g., suggestions for donation or voluntary service)
        Personal experience-based | Information described by sufferers (personal episodes and surroundings)
        Other | Unrelated tweets (missing data)
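    49. 49. PERCENTAGE OF TWEET TYPE PER 10 HOUR TIME PERIOD
        Time period (hours) | Information (%) | Opinion (%) | Technology/Media (%) | Emotion (%) | Action (%) | Personal-based (%)
        1-10 | 12.9 | 4.0 | 5.0 | 23.8 | 19.8 | 34.7
        11-20 | 16.7 | 9.3 | 6.7 | 16.0 | 26.0 | 25.3
        21-30 | 14.1 | 5.1 | 10.3 | 12.8 | 23.1 | 34.6
        31-40 | 18.1 | 13.8 | 5.3 | 5.3 | 25.5 | 31.9
        • Information based on individuals’ experiences (personal-based) was exchanged more than official information from the media or related organizations (information).
        • Opinion messages were posted more once some time had passed than in the period right around the event.
        • Emotional messages were frequent right after the event (mainly checking on others, worry, anxiety, and fear) and decreased over time.
        • Messages urging donations or participation in recovery (action) were posted steadily, which suggests that the earthquake was perceived as a national crisis rather than as an isolated case of damage in a particular region.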
    50. 50. Percentage of Tweet Type per 10 hour Time Period
        Time period (hours) | Information (%) | Opinion (%) | Technology/Media (%) | Emotion (%) | Action (%) | Personal-based (%)
        1-10 | 12.9 | 4.0 | 5.0 | 23.8 | 19.8 | 34.7
        11-20 | 16.7 | 9.3 | 6.7 | 16.0 | 26.0 | 25.3
        21-30 | 14.1 | 5.1 | 10.3 | 12.8 | 23.1 | 34.6
        31-40 | 18.1 | 13.8 | 5.3 | 5.3 | 25.5 | 31.9
        With official and personal information combined:
        Time period (hours) | Info (official + personal) | Opinion (%) | Technology/Media (%) | Emotion (%) | Action (%)
        1-10 | 47.6 | 4.0 | 5.0 | 23.8 | 19.8
        11-20 | 42.0 | 9.3 | 6.7 | 16.0 | 26.0
        21-30 | 48.7 | 5.1 | 10.3 | 12.8 | 23.1
        31-40 | 50.0 | 13.8 | 5.3 | 5.3 | 25.5
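The combined "Info (official + personal)" column appears to be the sum of the Information and Personal-based shares for each period. A short sketch that reproduces the arithmetic from the slide's own figures; the function name `merge_information` is hypothetical.

```python
# Shares (%) per 10-hour period, copied from the slide.
TWEET_SHARES = {
    "1-10": {"Information": 12.9, "Opinion": 4.0, "Technology/Media": 5.0,
             "Emotion": 23.8, "Action": 19.8, "Personal-based": 34.7},
    "11-20": {"Information": 16.7, "Opinion": 9.3, "Technology/Media": 6.7,
              "Emotion": 16.0, "Action": 26.0, "Personal-based": 25.3},
    "21-30": {"Information": 14.1, "Opinion": 5.1, "Technology/Media": 10.3,
              "Emotion": 12.8, "Action": 23.1, "Personal-based": 34.6},
    "31-40": {"Information": 18.1, "Opinion": 13.8, "Technology/Media": 5.3,
              "Emotion": 5.3, "Action": 25.5, "Personal-based": 31.9},
}


def merge_information(row):
    """Fold official and personal information into one category."""
    combined = round(row["Information"] + row["Personal-based"], 1)
    merged = {"Info (official + personal)": combined}
    for category, share in row.items():
        if category not in ("Information", "Personal-based"):
            merged[category] = share
    return merged


MERGED = {period: merge_information(row) for period, row in TWEET_SHARES.items()}
```

Rounding to one decimal keeps the result in the same precision as the slide and avoids floating-point noise in the sums.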
    51. 51. Comparison between Japanese and U.S. tweets
        U.S.:
        Time period (hours) | Information (%) | Opinion (%) | Technology/Media (%) | Emotion (%) | Action (%) | Other
        1-12 | 90.0 | 6.8 | 1.1 | 5.6 | 1.1 | 0.0
        12-24 | 86.6 | 13.0 | 3.1 | 4.5 | 1.3 | 0.7
        24-36 | 73.9 | 18.3 | 7.0 | 2.7 | 0.5 | 3.7
        36-48 | 74.6 | 21.3 | 1.0 | 3.8 | 0.5 | 2.8
        48-60 | 80.1 | 20.9 | 2.4 | 3.0 | 0.0 | 0.0
        Japan:
        Time period (hours) | Info (official + personal) | Opinion (%) | Technology/Media (%) | Emotion (%) | Action (%)
        1-10 | 47.6 | 4.0 | 5.0 | 23.8 | 19.8
        11-20 | 42.0 | 9.3 | 6.7 | 16.0 | 26.0
        21-30 | 48.7 | 5.1 | 10.3 | 12.8 | 23.1
        31-40 | 50.0 | 13.8 | 5.3 | 5.3 | 25.5
    52. 52. Type of Tweets during Japanese Earthquake (Mar 11 to 13 2011) [Bar chart of tweet type shares (%) by period; data labels in the order IF, OP, TM, EM, AC, PE] Beginning: 12.9, 4.0, 5.0, 23.8, 19.8, 34.7; Middle1: 16.7, 9.3, 6.7, 16.0, 26.0, 25.3; Middle2: 14.1, 5.1, 10.3, 12.8, 23.1, 34.6; Last: 18.1, 13.8, 5.3, 5.3, 25.5, 31.9. Legend: IF: Information-related; OP: Opinion-related; TM: Technology/Media related; EM: Emotion-related; AC: Action-related; PE: Personal information
    53. 53. Type of Tweets during Japanese Earthquake (Mar 11 to 13 2011) [Bar chart of tweet type shares (%) for the total; data labels in the order IF, OP, TM, EM, AC, PE] Total: 15.6, 8.3, 6.6, 14.9, 23.9, 30.7. Legend: IF: Information-related; OP: Opinion-related; TM: Technology/Media related; EM: Emotion-related; AC: Action-related; PE: Personal information
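    54. 54. Vivid situational information delivered by people who directly experienced the event 1
    55. 55. Analysis of URLs included in tweet messages
        TLD | Domains | %
        jp | 22 | 53.7
        com | 11 | 26.8
        net | 4 | 9.8
        tv | 1 | 2.4
        uk | 1 | 2.4
        org | 1 | 2.4
        biz | 1 | 2.4
        Total | 41 | 100
        • Web pages within Japan were the ones most often provided as information.
        • It is worth noting that there were no URLs of government-related institutions (go.jp or gov), which suggests that seeking information from government homepages or related sites was rare during the crisis.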
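A tally like the TLD table on this slide could be produced along the following lines. This is a sketch under assumptions: the URLs are placeholders on reserved example hosts, the function name `tld_shares` is hypothetical, and counting each distinct host once is an interpretation of the "Domains" column.

```python
from collections import Counter
from urllib.parse import urlparse


def tld_shares(urls):
    """Count distinct hosts per top-level domain and their percentage share."""
    hosts = set()
    for url in urls:
        host = urlparse(url).hostname  # None when the URL has no scheme/host
        if host:
            hosts.add(host.lower())
    counts = Counter(host.rsplit(".", 1)[-1] for host in hosts)
    total = sum(counts.values())
    return {tld: (n, round(100 * n / total, 1)) for tld, n in counts.items()}


# Placeholder URLs, invented for illustration only.
sample_urls = [
    "http://news.example.jp/quake/1",
    "http://news.example.jp/quake/2",  # same host as above, counted once
    "http://blog.example.jp/",
    "http://video.example.com/watch",
    "not a url",                       # skipped: no host can be parsed
]
shares = tld_shares(sample_urls)
```

Applied to the slide's own counts, the same percentage formula gives 22 of 41 domains as 53.7% for jp.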
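    56. 56. Category | Type | Number | % | Category total
        News channel | News providing | 15 | 28.4 | 15 (28.4%)
        Official information channel | Weather information | 2 | 3.8 | 8 (15.2%)
        Official information channel | Crisis-specialized site for survival confirmation | 2 | 3.8 |
        Official information channel | Transportation information | 2 | 3.8 |
        Official information channel | Search engine | 1 | 1.9 |
        Official information channel | Disaster information for foreigners | 1 | 1.9 |
        Personal channel | Video sharing | 7 | 13.3 | 20 (38%)
        Personal channel | Personal blog | 7 | 13.3 |
        Personal channel | Photo sharing | 4 | 7.6 |
        Personal channel | Online community | 2 | 3.8 |
        Action | Fundraising | 1 | 1.9 | 1 (1.9%)
        Commodity | Online bookstore | 3 | 5.7 | 5 (9.5%)
        Commodity | Internet shopping mall | 2 | 3.8 |
        Other | | 4 | 7.6 | 4 (7.6%)
        Total | | 53 | 100 | 53 (100%)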
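    57. 57. [Screen capture of a minihompy with a profile photo] The status of minihompy: ① How active ② How famous ③ How friendly. Other labeled fields: Gender, Name, Minihompy, Visitor count
    58. 58. [Screen capture of a minihompy with a profile photo] Labeled fields: Minihompy, Logged in User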
    59. 59. Sentiment Analysis of Korean Politicians’ Cyworld mini-hompy
        Comments | Male | Female | Total
        Positive | 509 | 491 | 1000
        Negative | 247 | 159 | 406
        Total | 756 | 650 | 1406
        Chi-square = 11.472, df = 1, p<.01, two-tailed. The results indicate a significant relationship between gender and online comments.
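The reported statistic can be checked from the four cell counts on this slide. The sketch below assumes a plain Pearson chi-square test without continuity correction (the slide does not say which variant was used); the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) and p-value for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With df = 1 the upper-tail probability has a closed form via erfc.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: positive, negative comments; columns: male, female (counts from the slide).
comments_by_gender = [[509, 491], [247, 159]]
chi2, p = chi_square_2x2(comments_by_gender)
```

Under that assumption the statistic rounds to the 11.472 shown on the slide, with a p-value well below .01.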
    60. 60. To identify the relationship among gender, comment type, and user activity, posters were divided into four groups: females contributing positive comments (FP), males contributing positive comments (MP), females contributing negative comments (FN), and males contributing negative comments (MN). The FP group was the most active group, the FN group’s activity was similar to that of male groups, and the MP group was more active than the MN group.
    61. 61. Cyworld-Target hyperlink screen capture [Screen capture of a minihompy with photos]
    62. 62. Where do Korean users want to take us? - Korea
        Category | Domain | Comments linking to domain | %
        Petition | agora.media.daum.net | 325 | 17.6
        News | news.naver.com | 150 | 8.1
        SNS | cyworld.com | 139 | 7.5
        Forum | cafe.naver.com | 106 | 5.7
        Blog | blog.naver.com | 72 | 3.9
        Blog | blog.daum.net | 69 | 3.7
        Blog | rokp.tistory.com | 61 | 3.3
        NGO | bss.or.kr | 56 | 3
        Forum | cafe.daum.net | 51 | 2.8
        Government | socialenterprise.go.kr | 49 | 2.7
        Total | | 1078 | 58.3
        Based on 1,078 (58.3%) of 1,849 links to Korean services
    63. 63. What makes Korean users hyperlink? Table 6: Comments categorized by link type from the six groups of gender and political affiliation
        Category | Information provision | Network building | Identity/image building | Audience sharing | Message amplification | Spam
        Opposition Female | 1 | 20 | 0 | 0 | 11 | 9
        Opposition Male | 3 | 4 | 1 | 1 | 13 | 8
        Opposition Unknown | 0 | 11 | 1 | 0 | 14 | 2
        Ruling Female | 1 | 6 | 0 | 0 | 29 | 3
        Ruling Male | 1 | 5 | 0 | 0 | 23 | 7
        Ruling Unknown | 0 | 12 | 0 | 1 | 16 | 3
        Total | 6 | 58 | 2 | 2 | 106 | 32
        % | 3% | 28% | 1% | 1% | 51% | 16%
        Based on 206 comments agreed on by both coders from the initial set of 300
    64. 64. Sentiment of Korean users to link [Panels: candlelight protest; suicide of ex-president Roh]
    65. 65. Political role of the Internet • Normalization perspective: the Internet may reflect the traditional power structure among individual politicians. • Equalization (Innovation) perspective: the Internet may reform the offline hierarchical structure of individual politicians.
    66. 66. Web Visibility • Web visibility as an indicator of online political power • Presence or appearance of actors or issues being discussed by the public (Internet users) on the web. • Tracking web visibility is a powerful way to gain insight into public reactions to actors or issues. • Recent studies indicate a positive relationship between politicians’ web visibility level and election results. • Also, the co-occurrence web visibility of two politicians represents their hidden online political relationship based on public perception.
    67. 67. Web Trend Analysis • Jangan district in Suwon City, Gyeonggi Province (Park, CS) (Lee, CY) (Ahn, DS) (Yoon, JY)
    68. 68. Blogs vs. Votes • Jangan district in Suwon City, Gyeonggi Province [Two bar charts, N. of Votes and N. of Blogs, for 박찬숙 (Park, CS), 이찬열 (Lee, CY), 안동섭 (Ahn, DS), and 윤준영 (Yoon, JY); data labels shown in that order: 33,106, 38,187, 5,570, 716]
    69. 69. Results • Correlation Analysis (N. of Blogs & N. of Votes) – Pearson r = .586, p < .01 (N=29) – Spearman rho = .797, p < .01 (N=29) • Simple Regression Analysis – N. of Votes = 1,055.56 + 79.99(N. of Blogs) – R2 = .344 (F = 14.128, p < .01) – ß = .586 (t = 3.759, p < .01)
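For readers who want to see how figures of this kind are obtained, here is a minimal sketch of Pearson correlation and simple least-squares regression in plain Python. The toy data are invented; only the default intercept and slope in `predict_votes` come from the slide, and all function names are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def ols_fit(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_votes(n_blogs, intercept=1055.56, slope=79.99):
    """Apply the fitted equation reported on the slide."""
    return intercept + slope * n_blogs


# Toy data lying exactly on y = 1 + 2x, invented for illustration only.
toy_blogs, toy_votes = [1, 2, 3, 4], [3, 5, 7, 9]
toy_intercept, toy_slope = ols_fit(toy_blogs, toy_votes)
toy_r = pearson_r(toy_blogs, toy_votes)
```

In a simple regression with one predictor, R2 equals the squared Pearson r and the standardized coefficient equals r, which matches the slide: .586 squared is about .343, close to the reported .344, and ß = .586.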
    70. 70. Results – Web Visibility (co-occurrence)
    71. 71. Results – QAP Correlation
        Variable | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8
        1 Committee | 1 | 0.004 | -0.016 | 0.025 | -0.021 | -0.074** | 0.045** | -0.037**
        2 Constituency | | 1 | 0.097** | -0.007 | -0.043** | -0.064** | 0.105** | -0.119**
        3 Party | | | 1 | 0.027 | -0.045* | -0.050* | 0.242** | -0.094**
        4 Gender | | | | 1 | 0.024 | 0.031 | 0.041 | -0.224**
        5 Age | | | | | 1 | 0.179** | -0.051* | 0.049*
        6 Incumbent | | | | | | 1 | -0.060** | 0.098**
        7 Web | | | | | | | 1 | -0.158**
        8 Finance | | | | | | | | 1
        Note. * p<.05, ** p<.01
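QAP (quadratic assignment procedure) correlation compares two actor-by-actor matrices and judges significance by repeatedly relabeling the actors of one matrix. The sketch below shows the general idea in plain Python; it is not the software used for the slide, and the matrix and function names are invented for illustration.

```python
import random


def offdiag(matrix):
    """Flatten a square matrix, skipping the diagonal."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(n) if i != j]


def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def qap_correlation(a, b, permutations=1000, seed=0):
    """Observed matrix correlation plus a two-sided permutation p-value."""
    rng = random.Random(seed)
    n = len(a)
    target = offdiag(b)
    observed = pearson(offdiag(a), target)
    hits = 0
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)
        # Relabel actors: rows and columns are permuted together.
        permuted = [[a[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if abs(pearson(offdiag(permuted), target)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical 5-actor directed network, invented for illustration only.
net = [
    [0, 1, 0, 0, 1],
    [1, 0, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [1, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
]
r_self, p_self = qap_correlation(net, net)
```

Permuting rows and columns together preserves each network's structure while breaking the pairing between the two networks, which is why ordinary significance tests are replaced by this procedure for dyadic data.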
    72. 72. Results – Correlation & Path Analysis
        Correlation:
        Variable | 1 (N=278) | 2 (N=278) | 3 (N=234)
        1 Finance | 1 | 0.420** | 0.101
        2 Web | | 1 | 0.184**
        3 Vote | | | 1
        Note. ** p<.01
        Spearman Correlation:
        Variable | 1 (N=278) | 2 (N=278) | 3 (N=234)
        1 Finance | 1 | 0.513** | 0.090
        2 Web | | 1 | 0.163*
        3 Vote | | | 1
        Note. * p<.05, ** p<.01
        Political finance’s indirect effect = .076
    73. 73. Outlines • Web ecology • Inter-relationship among websites created by the human activity of using the Internet in an information ecology • Observing integration and changes of diverse information behavior during the campaign period of the 2010 regional elections in South Korea • Web ecology • Public opinion & Campaign Issues (Web Ecology - 2011 ICA, 5/29/2011)
    74. 74. Co-occurrences can take place through either web-mentioning or hyperlinking. [Diagram: a user or voter (webpage, news, blog, SNS, etc.) hyperlinks to or mentions both Actor A and Actor B, producing a web-mentioning or hyperlinking co-occurrence]
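Counting web-mentioning co-occurrences can be sketched as follows: for every document, find which actors appear in it and increment the count for each pair. The plain substring matching and the function name `cooccurrence_counts` are simplifying assumptions; real names would need disambiguation.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents, actors):
    """Count, for each pair of actors, the documents that mention both."""
    counts = Counter()
    for text in documents:
        present = sorted(actor for actor in actors if actor in text)
        counts.update(combinations(present, 2))
    return counts


# Placeholder documents, invented for illustration only.
docs = [
    "Actor A met Actor B at the rally",
    "Actor B criticized Actor C",
    "Debate between Actor A, Actor B and Actor C",
]
pairs = cooccurrence_counts(docs, ["Actor A", "Actor B", "Actor C"])
```

The resulting pair counts form a symmetric actor-by-actor matrix that can be analyzed as a network.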
    75. 75. 24May2010 Education Superintendents VS Mayors
    76. 76. 25May2010 Education Superintendents VS Mayors
    77. 77. 26May2010 Education Superintendents VS Mayors
    78. 78. 27May2010 Education Superintendents VS Mayors
    79. 79. 28May2010 Education Superintendents VS Mayors
    80. 80. 30May2010 Education Superintendents VS Mayors
    81. 81. 31May2010 Education Superintendents VS Mayors
    82. 82. 1June2010 Education Superintendents VS Mayors
    83. 83. 2June2010 Education Superintendents VS Mayors
    84. 84. Date (2010) | Link(2010_M) N=44 | Link(2010_E) N=69 | Link(2007_P) N=20 | Date (2007)
        24-May-10 | 3.77 | 0.03 | |
        25-May-10 | 3.82 | 0.04 | |
        26-May-10 | 3.86 | 0.04 | |
        27-May-10 | 3.77 | 0.11 | 869.66 | 02-Dec-07
        28-May-10 | 3.62 | 0.15 | 785.52 | 05-Dec-07
        30-May-10 | 3.87 | 0.63 | 877.92 | 08-Dec-07
        31-May-10 | 3.92 | 0.92 | 940.58 | 11-Dec-07
        01-Jun-10 | 4.03 | 1.24 | 819.72 | 14-Dec-07
        02-Jun-10 | 4.10 | 1.36 | 1129.62 | 17-Dec-07
    85. 85. Results & Discussion: Network Analysis (2/7) Data Collection for Web 1.0 • Official homepages of South Korean Assembly members • Manual collection: Observation • Inter-linkage: Who links to whom matrix • Explicit links excluding links in board • 2-Year tracking of same Assembly members: 2000-2001
    86. 86. Results & Discussion: Network Analysis (3/7) Web 1.0 2000 2001 ‣ 59 isolated in 2000 ‣ more centralised in 2001 ‣ network of 2001 ➭ a ‘star’ network - might have been affected by political events ➭ presidential election in 2001
    87. 87. Results & Discussion: Network Analysis (4/7) •Data collection for Web 2.0 • Personal blogs of South Korean Assembly members • Manual collection: Observation • Blogroll links: Excluding links in postings • Inter-linkage: Who links to whom matrix • 2-Year tracking of same Assembly members: 2005-2006 • Phone interview about usage behaviours
    88. 88. Results & Discussion: Network Analysis (5/7) Web 2.0 2005 2006 ‣ hubs disappearing ‣ easy use of blogs ‣ Clear boundaries between different parties ‣ strong presence of GNP Assembly members ➭ party policy on using blogs
    89. 89. Twitter Results & Discussion: Network Analysis (6/7) ‣ more connection between different parties ‣ the ruling party pays less attention on alternative media
    90. 90. Results & Discussion: Network Analysis (7/7)
        Web Type | Year | Sum of links (Mean) | Density | Centralisation In | Centralisation Out | Gini Coefficient
        Web 1.0 (N=245) | 2000 | 373 (1.52) | 0.006 | 1.84 | 69.33 | 0.984
        Web 1.0 (N=245) | 2001 | 515 (2.10) | 0.009 | 1.19 | 99.55 | 0.996
        Web 2.0 (N=99) | 2005 | 652 (6.59) | 0.067 | 22.07 | 41.66 | 0.759
        Web 2.0 (N=99) | 2006 | 589 (5.95) | 0.061 | 20.67 | 35.10 | 0.763
        Twitter (N=22) | 2009 | 111 (5.05) | 0.240 | 24.72 | 39.68 | 0.408
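The Mean and Density columns are consistent with the usual definitions for a directed network without self-links: mean links per node is L / N, and density is L / (N × (N − 1)). A short check against the slide's figures; the function names are hypothetical.

```python
def directed_density(n_nodes, n_links):
    """Share of possible directed ties (excluding self-links) that are present."""
    return n_links / (n_nodes * (n_nodes - 1))


def mean_links(n_nodes, n_links):
    """Average number of links per node."""
    return n_links / n_nodes


# (label, N, sum of links) taken from the table on this slide.
NETWORKS = [
    ("Web 1.0, 2000", 245, 373),
    ("Web 1.0, 2001", 245, 515),
    ("Web 2.0, 2005", 99, 652),
    ("Web 2.0, 2006", 99, 589),
    ("Twitter, 2009", 22, 111),
]
SUMMARY = {
    label: (round(mean_links(n, links), 2), round(directed_density(n, links), 3))
    for label, n, links in NETWORKS
}
```

Each entry of `SUMMARY` holds the (mean, density) pair for one network, rounded to the precision used on the slide.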
    91. 91. Data collection • Date of collection: from February to April 2010 • Homepage: LexiURL Searcher to retrieve data from the Yahoo! database • Blog: manually collected by visiting Assembly members’ blog pages • Twitter: an automated computer program using Twitter’s API (Application Programming Interface) to retrieve data from Twitter • Analysis & Visualization: UciNet
    92. 92. Homepage Network Blue: GNP Yellow: MDP Purple: Independent Green: DLP Grey: LFP Red: NPP Pink: FHA
    93. 93. Blog Network Blue: GNP Yellow: MDP Purple: Independent
    94. 94. Twitter Network Blue: GNP Yellow: MDP Green: DLP Red: NPP Pink: FHA Purple: Independent
    95. 95. Basic network information
        Network | No. of nodes (isolates excluded) | No. of links (Mean) | Density | Centralization In | Centralization Out
        Homepage (N=281) | 115 (40.92%) | 130 (0.46) | 0.0017 | 2.34% | 9.15%
        Blog (N=173) | 71 (41.01%) | 149 (0.86) | 0.005 | 5.34% | 12.95%
        Twitter (N=72) | 35 (48.61%) | 983 (13.65) | 0.1923 | 57.63% | 54.77%
    96. 96. Overall Use of Homepage, Blog & Twitter – by party
        Party | No. of 18th members | Homepage | (%) | Blog | (%) | Twitter | (%)
        무소속 (Independent) | 8 | 7 | 87.50 | 6 | 75.00 | 3 | 37.50
        민주노동당 (Democratic Labor Party, DLP) | 5 | 4 | 80.00 | 4 | 80.00 | 4 | 80.00
        한나라당 (Grand National Party, GNP) | 169 | 160 | 94.67 | 102 | 60.36 | 26 | 15.38
        민주당 (Merged Democratic Party, MDP) | 87 | 85 | 97.70 | 53 | 60.92 | 32 | 36.78
        자유선진당 (Liberty Forward Party, LFP) | 18 | 17 | 94.44 | 5 | 27.78 | 2 | 11.11
        진보신당 (New Progressive Party, NPP) | 1 | 1 | 100.00 | 1 | 100.00 | 1 | 100.00
        창조한국당 (Creative Korea Party, CKP) | 2 | 2 | 100.00 | 2 | 100.00 | 2 | 100.00
        친박연대 (Future Hope Alliance, FHA) | 8 | 5 | 62.50 | 0 | 0.00 | 2 | 25.00
        Sum | 289 | 281 | 97.23 | 173 | 59.86 | 72 | 24.91
    97. 97. April 2010
    98. 98. Five Politicians’ Following-Based Ego Networks. Diagram 1. The size and color of each node correspond to the number of followers as follows:
        Size of node | Color of node | Number of followers
        1.5 | Yellow | 0 to 10,000
        3.0 | Purple | 10,001 to 100,000
        3.5 | Pink | 100,001 to 1,000,000
        4.0 | Blue | More than 1,000,000
        Politicians shown: GG Kang, HR Won, KW Na, DY Chung, HC Noh. Parties shown: 한나라당 (Grand National Party), 진보신당 (New Progressive Party), 민주노동당 (Democratic Labor Party), 민주당 (Democratic Party)
    99. 99. Five Politicians’ Follower-Based Ego Networks. Diagram 2. The size and color of each node correspond to the number of followers as follows:
        Size of node | Color of node | Number of followers
        1.5 | Yellow | 0 to 10,000
        3.0 | Purple | 10,001 to 100,000
        3.5 | Pink | 100,001 to 1,000,000
        4.0 | Blue | More than 1,000,000
        Politicians shown: GG Kang, KW Na, HR Won, DY Chung, HC Noh. Parties shown: 한나라당 (Grand National Party), 진보신당 (New Progressive Party), 민주노동당 (Democratic Labor Party), 민주당 (Democratic Party)
    100. 100. Overlaps in terms of Twitter Followers [Diagram] Politicians: HR Won, GG Kang, DY Chung, HC Noh, KW Na. Parties: 한나라당 (GNP), 진보신당 (NPP), 민주노동당 (DLP), 민주당 (MDP)
    101. 101. Overlaps in terms of Twitter Followings [Diagram] Politicians: HR Won, GG Kang, DY Chung, HC Noh, KW Na. Parties: 한나라당 (GNP), 진보신당 (NPP), 민주노동당 (DLP), 민주당 (MDP)
    102. 102. Data: Nov 2010, API application, 189 Korean politicians: National Assembly members and political figures (i.e., mayors or governors)
        Party | Total | Twitter account holders | Ruling party vs. opposition parties
        Grand National Party | 173 | 110 | 110 (ruling party)
        Democratic Party | 92 | 56 | 79 (all opposition parties and independents combined)
        Democratic Labor Party | 5 | 5 |
        New Progressive Party | 3 | 3 |
        Liberty Forward Party | 16 | 4 |
        Creative Korea Party | 2 | 2 |
        Future Hope Alliance | 8 | 3 |
        Federation of Citizen-Centered Party | 1 | 0 |
        Citizen Participatory Party | 1 | 1 |
        Independent | 8 | 5 |
        Total | 309 | 189 |
    103. 103. Research Result
    104. 104. Politician Network (Following and Mention Network)
    105. 105. Conclusion Politicians Twitter Following-follower Network Politicians Twitter Mention Network
    106. 106. Findings • The distribution among politicians confirms this. • The following-follower network shows a linear function, whereas the mention network is closer to a power-law distribution. • Reciprocity-based connection is basically “I link to people who linked to me”, hence the linear function. • The gravity toward popular people is “I mention people who get the most attention”, so the preferential attachment principle of connecting leads to a power-law function.
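One rough way to tell a power-law shape from a linear one is to fit a straight line on log-log axes and read off the slope: a power law y = C·x^(−α) gives slope −α, while a linear relation y = c·x gives slope 1. The sketch below uses invented data and a hypothetical function name; log-log least squares is a crude heuristic, and maximum-likelihood fitting is generally preferred for real degree distributions.

```python
import math


def loglog_slope(points):
    """Least-squares slope of log(y) on log(x) for (x, y) pairs with x, y > 0."""
    logs = [(math.log(x), math.log(y)) for x, y in points if x > 0 and y > 0]
    n = len(logs)
    mean_x = sum(lx for lx, _ in logs) / n
    mean_y = sum(ly for _, ly in logs) / n
    sxx = sum((lx - mean_x) ** 2 for lx, _ in logs)
    sxy = sum((lx - mean_x) * (ly - mean_y) for lx, ly in logs)
    return sxy / sxx


# Synthetic examples, invented for illustration only.
power_law_like = [(k, 1000 * k ** -2) for k in range(1, 11)]  # y = 1000 * k^-2
linear_like = [(k, 3 * k) for k in range(1, 11)]              # y = 3k

power_slope = loglog_slope(power_law_like)
linear_slope = loglog_slope(linear_like)
```

With exact synthetic data the fit recovers the exponents; on empirical mention counts the estimate would be noisy, especially in the sparse tail.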
    107. 107. Twitaddons.com
    108. 108. Why Twitaddons.com? • Upfront, self-identified motivation for joining a group • Member list • Relational data of Twitter activities
    109. 109. What Motivation? • Social motivation: attachment to a group identity • Informational motivation: access to information (info overload theory); sharing information (positive self-evaluation and social acceptance) • Interpersonal motivation: friendship and bond-based attachment to members • Twitaddons.com groups: 조폐공사, 국민의 명령, 비정규직당, 희망공화국; 멘토스당, 똘끼주식당; 애플러들의 모임, 아이폰4당; 강남당, 강서당
    110. 110. Network Structure Measure • Whole network, instead of ego-centric network • Degree (popularity), closeness (efficiency), and betweenness (control) centralities • Betweenness centrality as a significant predictor of leadership (Mullen & Johnson, 1991) • Reciprocity of ties • Types of retweeted hyperlinks • Distribution of in-degree in terms of mentions
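Two of the measures listed here, in-degree in terms of mentions and reciprocity of ties, are simple enough to sketch in plain Python. The edge list is invented, the function names are hypothetical, and the edge-based reciprocity definition is one of several in use; closeness and betweenness would normally come from a network analysis package.

```python
from collections import Counter


def in_degree(edges):
    """Number of distinct incoming ties per node, from (sender, receiver) pairs."""
    return dict(Counter(receiver for _, receiver in set(edges)))


def reciprocity(edges):
    """Share of directed ties whose reverse tie also exists (edge-based definition)."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for sender, receiver in edge_set if (receiver, sender) in edge_set)
    return mutual / len(edge_set)


# Hypothetical mention ties (sender, receiver), invented for illustration only.
mentions = [("a", "b"), ("b", "a"), ("a", "c"), ("d", "a")]
degrees = in_degree(mentions)
share_mutual = reciprocity(mentions)
```

Here two of the four ties are reciprocated, and node "a" receives the most mentions.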
    111. 111. Activities by group [Chart series: number of members, number of tweets, mention tweets (sum), mention (reply) senders (id), mention (reply) receivers (id), tweets containing URLs] Strong correlation among the numbers of members, tweets, mention receivers, and tweets with URLs
    112. 112. Network Characteristics [Chart: degree centrality, closeness centrality]
    113. 113. Network Characteristics [Chart: reciprocity]
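    114. 114. 조폐공사당 (Mint Corporation Party), 국민의 명령당 (People's Command Party), 비정규직당 (Irregular Workers' Party), 희망공화국 (Republic of Hope)
    115. 115. 똘끼주식당 (Ttolkki Stock Party), 아이폰4당 (iPhone 4 Party), 애플러들의 모임 (Applers' Gathering), 멘토스당 (Mentos Party)
    116. 116. 강남당 (Gangnam Party), 강서당 (Gangseo Party)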
    117. 117. Mention Sender & Receiver [Chart: mentions / mention senders; mention senders / mention receivers]
    118. 118. Hyperlink Analysis [Chart: tweets with URLs / tweets; plotted values: 68.8%, 38.2%, 69.3%, 54.7%, 79.8%, 61.4%, 54.4%, 40.5%, 49.2%, 9.3%]
    119. 119. Hyperlink Analysis (Domains over 10%) Domain Share Domain Share twtkr.olleh.com 15% finance.naver.com 53% twitpic.com 11% twitpic.com 27% ccdm.or.kr 11% appsapps.net 23% twitpic.com 33% twitpic.com 14% news.mk.co.kr 15% tipb.com 12% powertothepeople.kr 40% appsapps.net 28% twtkr.olleh.com 20% twitpic.com 14% twitpic.com 11% youtube.com 16% yfrog.com 32% yfrog.com 15% nodongnews.or.kr 17% twitpic.com 12% twtkr.olleh.com 14% yfrog.com 44% 똘끼주식당 koreapds.com 61% twitpic.com 11% 애플러들의 모임 강남당 강서당 조폐공사 희망공화국 국민의명령 비정규직당 멘토스당 아이폰4당
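A domain-share table like the one above could be produced along the following lines. This is a minimal sketch, not the procedure used in the study: the function name `domain_shares`, the strict "over 10%" threshold, and the sample URLs are assumptions. Shortened links would have to be expanded first, which is not shown.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_shares(urls, threshold=0.10):
    """Return {domain: share} for domains whose share of all URLs exceeds threshold."""
    hosts = [urlparse(u).hostname for u in urls]
    hosts = [h for h in hosts if h]  # drop strings without a host part
    counts = Counter(hosts)
    total = len(hosts)
    return {d: c / total for d, c in counts.most_common() if c / total > threshold}


if __name__ == "__main__":
    # Hypothetical URLs extracted from one group's tweets
    sample = (
        ["http://twitpic.com/a"] * 3
        + ["http://yfrog.com/b"] * 6
        + ["http://example.org/c"]
    )
    for domain, share in domain_shares(sample).items():
        print(f"{domain}: {share:.0%}")
```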
    120. 120. Tentative Research Result Social motivation (조폐공사, 희망공화국, 국민의 명령, 비정규직) - Higher degree and closeness centralities compared to other groups - Higher mention sender per receiver ratio - URLs: civic association homepage, news services Informational motivation (똘끼주식당, 멘토스 / 아이폰4, 애플러들의 모임) - Access to information: Higher mention sender per receiver ratio Larger number of Tweets including URLs URLs: financial information services - Sharing information: Higher mention per sender ratio URLs: application services Interpersonal motivation (강남당, 강서당) - Higher reciprocity - Higher mention per sender ratio - Smaller number of Tweets including URLs - URLs: video, picture uploading services
    121. 121. Asian Academia-Research-Industry Research Group (아시아 학연산 연구회)
    122. 122. The Mutual Information in Two Dimensions: $T_{ij} = H_i + H_j - H_{ij}$, with $T_{ij} \geq 0$. The Mutual Information in Three Dimensions: $T_{UIG} = H_U + H_I + H_G - H_{UI} - H_{IG} - H_{UG} + H_{UIG}$. $T_{UIG}$ is potentially negative. A negative entropy can be a consequence of the mutual relations at the network level. The configuration then reduces the uncertainty.
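The three-dimensional formula can be checked numerically with a short sketch. It assumes the data arrive as a list of observed (u, i, g) category triples and that entropies are measured in bits; the function names are hypothetical and this is not the authors' implementation.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a frequency distribution."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def mutual_information_3d(observations):
    """T = H_U + H_I + H_G - H_UI - H_IG - H_UG + H_UIG, in bits.

    observations: iterable of (u, i, g) tuples, one per observed case.
    """
    observations = list(observations)
    t = 0.0
    # One-dimensional terms enter with +, two-dimensional with -, three-dimensional with +
    for size, sign in ((1, 1), (2, -1), (3, 1)):
        for dims in combinations(range(3), size):
            marginal = Counter(tuple(obs[d] for d in dims) for obs in observations)
            t += sign * entropy(marginal.values())
    return t


if __name__ == "__main__":
    redundant = [(0, 0, 0), (1, 1, 1)]                   # all three dimensions identical
    xor = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]   # g = u XOR i
    print(mutual_information_3d(redundant))  # positive
    print(mutual_information_3d(xor))        # negative
```

The XOR case shows how the value can become negative: no pair of dimensions shares information, yet the three together are fully determined.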
    123. 123. Triple Helix indicators - Communication Perspective • TH innovation takes place mainly in three KCI spaces where the content of communication (i.e., information-sharing behavior) is transferred. • Knowledge space: Scientometrics • Convergence space: Technometrics • Innovation space: Webometrics
    124. 124. Measuring Twitter-based political participation by using TH indicators • The absolute entropy values were lower when the trilateral relationship included the two conservative politicians: Na and Won. As indicated earlier, the lower the entropy value, the less stable the communication system is. Thus, the communication system became more unbalanced in trilateral relationships that included the two conservative politicians. On the other hand, in those trilateral relationships including only one conservative politician, the entropy values were higher, and the communication system was more stable. These results suggest that the level of political deliberation, expressed in terms of the degree of stability in the communication system, increases when politicians with different political orientations form trilateral relationships.
    125. 125. Who are you going to partner with in terms of TH?
        Politicians (A, B, C)   A       B       C       AB    AC    BC    ABC
        Na, Won, Noh            18000   377     16000   898   118   50    32
        Na, Won, Kang           16000   380     4438    898   1     1     1
        Na, Won, Chung          16000   357     14000   898   63    68    1
        Na, Noh, Kang           18000   15000   3817    118   1     571   0
        Na, Noh, Chung          16000   14000   13000   118   63    737   0
        Na, Kang, Chung         15000   3618    13000   1     63    280   1
        Won, Noh, Kang          9208    19000   10000   50    1     571   0
        Won, Noh, Chung         8353    18000   27000   50    68    737   1
        Won, Kang, Chung        8154    10000   28000   1     68    280   1
        Noh, Kang, Chung        18000   9224    27000   571   737   280   151
        Source: Measuring Twitter-Based Political Participation and Deliberation in the South Korean Context by Using Social Network and Triple Helix Indicators http://www.springerlink.com/content/77w06uv002179062/
    126. 126. A comparison of trilateral relationships of five politicians on Twitter
    127. 127. The Webpage and News Media Categories Figure 2. T-values for the webpage category Figure 3. T-values for the news media category
    128. 128. Using Twitter to explore communication processes within online innovation communities According to Etzkowitz (2008, pp. 20-23), the circulation of individuals belonging to three institutions can provide each organization with a new innovation environment. Circulation of individuals from one organizational sphere to another can stimulate hybridization and the creation of new social formats.
    129. 129. Note: B (Blackberry), A (Android), O.H. (Official HTC), K.H. (Korean HTC) ‘a, b, c’ stand for the actors’ placement, from first to third.
    130. 130. Co-construction of online communities • Based on the Triple Helix analysis, four T values were generated, one for each combination of three groups. Among all negative T values, the T of “Blackberry,” “Android,” and “Official HTC” was the lowest, which implies that this trilateral relation had the strongest synergy effect through shared members. Interestingly, the trilateral relation with the lowest T value does not include “Korean HTC,” while the others do. A closer look at bilateral relations shows that the T values of bilateral relations with “Korean HTC” denote a smaller amount of mutual information shared between groups. Recalling that the organizer of “Korean HTC” shared more information and conversation with members than with followers (see Table 1 and Fig. 3), contrary to other innovation group organizers, it can be speculated that the relative closeness of the members of “Korean HTC” may have led to fewer exchanges of members with other groups, resulting in a higher T, and might ultimately generate less differentiation and innovation.
    131. 131. The Future of Computational Social Science
    132. 132. WCU WEBOMETRICS INSTITUTE Conclusion Mindset shift • Scholars and researchers in social sciences need to recognize and acknowledge the opportunities that are available – E.g. access to vast data and new modes of data collection and analysis • The emerging era of networked research leads to two possible scenarios –Education and training programs have to be put in place to produce a new breed of social scientists with combined expertise and knowledge of computational science and social sciences –What is more actionable in the shorter term is to engender and promote collaborative efforts between these different fields http://blog.jove.com/wp-content/uploads/2012/05/Publishing.png
    133. 133. Conclusion Mindset shift • In Korea, social scientists appear to have little desire for either long-distance international collaboration through the Access Grid or the use of high-performance computing facilities – Little demand, as social scientists’ current choices for their research practices are still shaped by offline facilities rather than online technology capabilities – Policy-makers and technology developers need to involve social scientists in design and application processes, but a change in mindset among researchers is needed to make e-science a reality for social scientists Good role models in the West – Oxford Internet Institute, The Virtual Knowledge Studio for the Humanities and Social Sciences, The Institute for Quantitative Social Science at Harvard University
    134. 134. Big data and the end of theory? • Does big data have the answers? Maybe some, but not all, says Mark Graham • In 2008, Chris Anderson, then editor of Wired, wrote a provocative piece titled The End of Theory. Anderson was referring to the ways that computers, algorithms, and big data can potentially generate more insightful, useful, accurate, or true results than specialists or domain experts who traditionally craft carefully targeted hypotheses and research strategies. • We may one day get to the point where sufficient quantities of big data can be harvested to answer all of the social questions that most concern us. I doubt it though. There will always be digital divides; always be uneven data shadows; and always be biases in how information and technology are used and produced. • And so we shouldn't forget the important role of specialists to contextualise and offer insights into what our data do, and maybe more importantly, don't tell us. http://www.guardian.co.uk/news/datablog/2012/mar/09/big-data-theory
