AUTOMATED DISCOVERY OF SOCIAL NETWORKS IN ONLINE LEARNING COMMUNITIES

Dissertation Defense (April 1, 2009)

To gain greater insight into how online communities operate, this dissertation applies automated text mining techniques to text-based communication to identify, describe, and evaluate the underlying social networks among online community members. The main thrust of the study is to find a way to use computers to automatically discover the social ties that form between community members, using only the digital footprints left behind in their online forum postings. As part of this work, a web-based system for content and network analysis called the Internet Community Text Analyzer (ICTA) is being developed. A prototype of ICTA is available at http://textanalytics.net.

Published in: Technology, Education

    1. AUTOMATED DISCOVERY OF SOCIAL NETWORKS IN ONLINE LEARNING COMMUNITIES. Anatoliy Gruzd [email_address]. Dissertation Defense, April 1, 2009
    2. Online Social Networks
         • Email networks
         • Forum networks
         • Blog networks
         • Friends’ networks on MySpace, Facebook, etc.
         • Networks of like-minded people on
         http://www.visualcomplexity.com/vc
    3. Users’ contributions and networks are growing daily!
         • Usenet newsgroups: 4.6 terabytes of text *daily*
         • Blogs: 900,000 new blogs *daily*
         • Emails: 100 billion emails *daily*
         Source: IDC white paper, “The Diverse and Exploding Digital Universe,” sponsored by EMC, March 2008.
    4. Users’ contributions and networks are growing daily! (cont.)
         • What are the group’s interests and priorities?
         • How and why does one online community emerge and another die?
         • How do people agree on common practices and rules in an online community?
         • How are knowledge and information shared among group members?
    5. Automated Discovery of Social Networks © kelleyw
    6. Automated Discovery of Social Networks
         • Research Goal: Use computers to discover online social networks automatically
         • Case Study: Discussion forums in online classes
    7. Research Questions
         • Extracting Social Networks from Forum Postings
              • Question 1: What content-based features of postings help to uncover nodes and ties between group members?
    8. Extracting Social Networks from Forum Postings. Approach 1: Chain Network (Reply-to)
         • Posting header: FROM: Sam; REFERENCE CHAIN: Gabriel
         • Content: “Nick, Gina and Gabriel: I apologize for not backing this up with a good source, but I know from reading about this topic that …”
         • Method: Connects a sender to the previous poster in the thread
         • Source: Posting header
         • Discovered tie(s): Sam -> Gabriel
         • Possible missing connections: Sam -> Nick; Sam -> Gina; Nick <-> Gina
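The chain-network rule on this slide can be illustrated with a minimal Python sketch. It is not the ICTA implementation. The dictionary layout (`sender`, `reference_chain`) and the assumption that the last entry of the reference chain is the most recent previous poster are hypothetical choices made for illustration.

```python
from collections import Counter

def chain_ties(postings):
    """Count sender -> previous-poster ties using posting headers only."""
    ties = Counter()
    for post in postings:
        chain = post["reference_chain"]
        # Assumption: the last entry in the chain is the most recent poster.
        # Thread-starting postings have an empty chain and yield no tie.
        if chain:
            ties[(post["sender"], chain[-1])] += 1
    return ties

# Hypothetical two-posting thread mirroring the slide's example.
postings = [
    {"sender": "Gabriel", "reference_chain": []},
    {"sender": "Sam", "reference_chain": ["Gabriel"]},
]
print(chain_ties(postings))
```

Because only the header is read, the names Nick and Gina in the message body never produce a tie, which is the gap the slide lists under possible missing connections.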
    9. Extracting Social Networks from Forum Postings. Approach 2: Name Network
         • FROM: Ann
         • Content: “Steve and Natasha, I couldn't wait to see your site. I knew it was going to [be] awesome!”
         • Method: Connect the sender to people mentioned in the message. Discovered tie(s): Ann -> Steve; Ann -> Natasha
         • Method: Connect people whose names co-occur in the same message(s). Discovered tie(s): Steve <-> Natasha
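The two tie-building rules on this slide can be sketched as follows. This is an illustrative sketch, not the author's code. The function name `name_network_ties` is hypothetical, and it assumes the personal names in the message have already been extracted.

```python
from itertools import combinations

def name_network_ties(sender, mentioned_names):
    """Return (sender -> name ties, co-occurrence ties) for one posting."""
    # Deduplicate and ignore self-mentions such as a signature.
    names = sorted(set(mentioned_names) - {sender})
    directed = [(sender, name) for name in names]
    # Undirected ties between every pair of names in the same message.
    cooccurring = list(combinations(names, 2))
    return directed, cooccurring

directed, cooccurring = name_network_ties("Ann", ["Steve", "Natasha"])
print(directed)
print(cooccurring)
```

For the slide's example this yields the ties Ann -> Natasha, Ann -> Steve, and Natasha <-> Steve.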
    10. Extracting Social Networks from Forum Postings. Approach 2: Name Network
         Step 1. Automatically find all personal names in the postings
         • Compare each word from the posting against a dictionary of all names collected from the US Census data
         • Find names that are NOT in the name dictionary (e.g., international names, informal names and nicknames) using contextual and structural information about words such as
              • Capitalization
              • Context words
              • Position in text
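A rough sketch of Step 1, under stated assumptions: `KNOWN_NAMES` is a tiny stand-in for the Census-derived dictionary, and `GREETINGS` is a hypothetical list of context words. The slide names capitalization, context words, and position as cues but does not give the actual rules, so the fallback heuristic below is only one possible reading.

```python
import re

KNOWN_NAMES = {"sam", "steve", "natasha", "ann"}  # stand-in for the Census name list
GREETINGS = {"hi", "hello", "dear", "thanks"}     # hypothetical context words

def find_names(text):
    """Return candidate personal names in order of appearance."""
    tokens = re.findall(r"[A-Za-z']+", text)
    found = []
    for i, token in enumerate(tokens):
        if token.lower() in KNOWN_NAMES:
            # Dictionary hit.
            found.append(token)
        elif token[0].isupper() and i > 0 and tokens[i - 1].lower() in GREETINGS:
            # Fallback: a capitalized word right after a greeting is likely a name.
            found.append(token)
    return found

print(find_names("Hi Dustin, Sam and all, I appreciate your posts"))
```

Here "Sam" is found through the dictionary, while "Dustin" is absent from the stand-in dictionary and is picked up by the capitalization and context rule.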
    11. Extracting Social Networks from Forum Postings. Approach 2: Name Network
         Step 2. Connect a sender of the posting to all names discovered in the previous step
         • EXAMPLE
              • From: wilma@email.net (= Wilma)
              • Reference Chain: tank123@gl.edu, hle@gl.edu
              • Hi Dustin, Sam and all, I appreciate your posts from this and last week […]. I keep thinking of poor Charlie who only wanted information on “dogs“. […] Cheers, Wilma.
         • Discovered ties: Wilma – Dustin; Wilma – Sam; Wilma – Charlie; Dustin – Sam – Charlie
         • Challenges to overcome:
              • One person can have many names
              • Many people can have the same name
              • Names can belong to students in the class and outsiders
         • Solution: Name alias resolution
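The slides do not describe how name alias resolution works, so the following is only a sketch of one plausible approach. It assumes a hypothetical `alias_table` mapping lowercase names to sets of participant email addresses (email is the unique identifier mentioned under Limitations) and uses the thread's participants to break ties between people who share a name. All addresses are made up.

```python
def resolve_alias(name, alias_table, thread_participants):
    """Map a discovered name to one participant ID, or None if unresolved."""
    candidates = alias_table.get(name.lower(), set())
    if len(candidates) == 1:
        return next(iter(candidates))
    # Several people share the name: prefer one who posted in this thread.
    in_thread = candidates & set(thread_participants)
    if len(in_thread) == 1:
        return next(iter(in_thread))
    # Unknown name (possibly an outsider) or still ambiguous.
    return None

# Hypothetical alias table; the addresses are illustrative only.
alias_table = {
    "dustin": {"dustin@example.org"},
    "sam": {"sam.a@example.org", "sam.b@example.org"},
}
print(resolve_alias("Dustin", alias_table, []))
print(resolve_alias("Sam", alias_table, ["sam.b@example.org"]))
print(resolve_alias("Charlie", alias_table, []))
```

Returning `None` for unresolved names keeps outsiders and ambiguous mentions out of the network rather than guessing.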
    12. Research Questions
         • Extracting Social Networks from Forum Postings
         • Evaluating Name Networks
              • Question 2: How are the proposed name networks similar to or different from networks derived from other methods?
    13. Evaluating Name Networks
         • Networks compared: Name Network (forum postings) vs. Chain Network (forum postings) vs. Self-Reported Network (survey)
         • Comparison Procedure:
              • QAP correlations
              • Exponential random graph models (p* models)
              • Manual exploration using network visualization
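For readers unfamiliar with QAP correlation, here is a small standard-library sketch of the general technique: correlate the off-diagonal cells of two adjacency matrices, then estimate significance by relabeling the nodes of one matrix at random. It is not the software used in the study. The one-sided p-value, the permutation count, and the toy matrices are assumptions made for illustration.

```python
import random

def offdiag(m):
    """Flatten a square matrix, skipping the diagonal (self-ties)."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(n) if i != j]

def pearson(x, y):
    # Note: undefined (division by zero) if either list is constant.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def qap(a, b, permutations=1000, seed=0):
    """Return (observed correlation, one-sided permutation p-value)."""
    rng = random.Random(seed)
    observed = pearson(offdiag(a), offdiag(b))
    n = len(a)
    hits = 0
    for _ in range(permutations):
        p = list(range(n))
        rng.shuffle(p)
        # Relabel nodes: permute rows and columns of `a` together.
        permuted = [[a[p[i]][p[j]] for j in range(n)] for i in range(n)]
        if pearson(offdiag(permuted), offdiag(b)) >= observed:
            hits += 1
    return observed, hits / permutations

# Toy 4-node networks (hypothetical data).
name_net = [[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]]
chain_net = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]
print(qap(name_net, chain_net))
```

Permuting rows and columns together preserves each network's structure while breaking the node correspondence, which is why QAP is used instead of an ordinary correlation test on dependent dyads.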
    14. Evaluating Name Networks: Data collection
         Dataset
         • Classes: 6
         • School year: Spring 2008
         • Duration of each class: 15 weeks
         • No. of students per class: 15–28
         • Data source: Bulletin board messages; Online questionnaire
         • Response rate: 54%–86% (63%)
    15. Evaluating Name Networks: Online Questionnaire [Based on C. Haythornthwaite’s 1999 LEEP study protocol]
         • Section 1. Students’ perceived social structures
              • Sample question: I learned a lot about the subject matter from this person …
                   • 0 - never; 1 - rarely; 2 - for some of the course; 3 - during most of the course; 4 - throughout the whole course
         • Section 2. Influential members of the class
              • Indicate five students who you consider most important or influential in this class regarding each of the following types of interaction:
                   • (1) Providing information; (2) Promoting discussion; (3) Giving help; (4) Making class fun
         • Section 3. Interactions in the class as a whole
              • Sample question: I felt that the class worked together …
    16. Evaluating Name Networks. Example: YouTube comments
         • Chain Network (fewer connections)
         • Name Network (more connections)
    17. Evaluating Name Networks: Results from Online Learners Dataset
         • Name networks provide on average 40% more information about social ties in a group than chain networks
         • QAP correlation between the Name Network and the Chain Network: ~0.5
         • “New” info (considering only the 40%):
              • 82%: An addressee has not posted to the thread
              • 18%: An addressee is not the most recent poster
              • 70%: Thread-starting posting
              • 30%: A subsequent posting in the thread
    18. Evaluating Name Networks: Results from Online Learners Dataset
         • Structurally, the name network is far more similar to the self-reported network than the chain network is.
         • Based on p* models, the self-reported network is almost twice as likely to share the same ties with the name network as with the chain network.
         • Figure: Friends’ network for one of the classes (Chain Network, Name Network, Self-Reported Network)
    19. Research Questions
         • Extracting Social Networks from Forum Postings
         • Evaluating Name Networks
         • Identifying Social Relations in Name Networks
              • Question 3: What types of social relations do name networks include?
    20. Identifying Social Relations in Name Networks: Results
         • The following social relations were found by the “name network” method: Learning ● Collaborative Work ● Help
    21. Identifying Social Relations in Name Networks: Results
         • Postings that show attention to subject matter discussed by someone else
         • “… it made me think of the faceted catalogs' display that Karen posted”
    22. Identifying Social Relations in Name Networks: Results
         • Organizing group work, taking a leadership role
         • “Some quick poking around shows that Steve and myself are here in Champaign, [...] and Nicole is in Chicago. [...] does anyone have a strong desire to be our contact person to the administrators”
    23. Identifying Social Relations in Name Networks: Results
         • A reference to an event or interaction that happened outside the bulletin board
         • “Anne and I have been corresponding via e-mail and she reminded me that we should be having discussion here”
    24. Identifying Social Relations in Name Networks: Results
         • Postings that ask others for help
         • “[Instructor’s name] if you see this posting would you please clarify for us”
    25. Using the results in the learning context
         • Identify students who might need extra attention or help from the instructor
         • Discover if lectures or other class materials were unclear
         • Identify peer-help
         • Find active group members who often take a leadership role in a group
         [Diagram labels: Student, Instructor, Group Leader]
    26. Contributions of the Research
         • Development of a novel approach (name network) for content-based, automated discovery of social networks from threaded discussions in online communities and a framework for evaluating this new approach
              • The “name network” method can be used
                   • to transform even unstructured Internet data into social network data;
                   • where more traditional methods for data collection on social networks, such as surveys, are too costly or not possible
    27. Contributions of the Research (cont.)
         • Empirical comparison of name networks to chain and self-reported networks using data collected from 6 online classes
         • Demonstration that the proposed automated approach for collecting social network data is a viable alternative to the costly and time-consuming collection of self-reported networks
         • Demonstration of how name networks can be used to study online classes and assess collaborative learning
         • Development of the ICTA web-based system for content and network analysis (http://textanalytics.net)
    28. http://TextAnalytics.net
    29. Limitations
         • The ‘name network’ method
              • is more expensive computationally than the ‘chain network’ method
              • uses an email address as a unique identifier of a participant
              • relies only on postings that include personal names (on average only about 25-30% of all postings)
    30. Future Research
         • Study other types of online communities
         • Study online communities using multiple data sources such as forums, chats, wikis, etc.
         • Develop automated techniques to identify types of social relations and social roles
    31. AUTOMATED DISCOVERY OF SOCIAL NETWORKS IN ONLINE LEARNING COMMUNITIES. Anatoliy Gruzd [email_address]. April 1, 2009
         • Contributions
         • Developed the Name Network method and evaluated it in the context of e-learning
         • Identified types of social relations in Name Networks
         • Developed ICTA, a web-based system for content and network analysis (http://textanalytics.net)
