A picture is worth a thousand questions
Visualization techniques for social science discovery in computational spaces
Howard T. Welser, Thomas M. Lento, Marc A. Smith, Eric Gleave, and Itai Himelboim
Social life increasingly takes place through computer-mediated interaction systems, and these systems are growing in the diversity of their affordances for action (Gaver, 1991).
The 'Community Views' toolkit integrates multiple information visualizations and populates them with data drawn from multiple years of Usenet message traffic (Smith & Fiore, 2001).
ATTRIBUTES & ANALYSIS OF COMPUTER MEDIATED SOCIAL WORLDS
Attributes of Online Social Systems
The interface is the tool that connects the human and the computer. Interfaces for accessing online spaces increasingly include a wide range of media (text, images, and sound), and today they offer a variety of detailed categories. Most online communities can be reached through the World Wide Web, whether through public pages like Wikipedia or member-only pages like FatSecret. Some spaces also require commercial software, like World of Warcraft or Second Life, or impose hardware restrictions (for example, the commercial program ION requires at least a Pentium 3).
In Usenet, newsgroups act as collections around sets of threaded discussions. Usenet is composed of newsgroups that focus on a range of topics and interests and are interconnected with other newsgroups through shared messages. The Slashdot web community provides a fixed set of general topical classifications. Wikipedia provides a range of page types with specialized functions: articles, discussion pages, community pages, and infrastructure pages are clearly distinguished. Wikipedia participants are themselves clustered into classes based on common behaviors and structural connections.
A thread is built up from connected messages, for example replies attached after an original message. In other words, a thread is a collection of digital objects that refer to one another in a hierarchy. A good example to recall is the Cyworld Karma social rating system.
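The idea of a thread as a hierarchy of objects that refer to one another can be sketched in a few lines of Python. This is only an illustration: the message ids and the (message_id, parent_id) record format are assumptions, not the format of any particular system.

```python
from collections import defaultdict

# Hypothetical messages as (message_id, parent_id) pairs.
# A parent_id of None marks the message that starts a thread.
messages = [
    ("m1", None),
    ("m2", "m1"),
    ("m3", "m1"),
    ("m4", "m2"),
    ("m5", None),
]

def build_threads(messages):
    """Return (roots, children), where children maps a message id to its replies."""
    children = defaultdict(list)
    roots = []
    for msg_id, parent_id in messages:
        if parent_id is None:
            roots.append(msg_id)
        else:
            children[parent_id].append(msg_id)
    return roots, children

def thread_depth(msg_id, children):
    """Number of levels in the reply hierarchy starting at msg_id."""
    replies = children.get(msg_id, [])
    return 1 + max((thread_depth(r, children) for r in replies), default=0)

def thread_size(msg_id, children):
    """Number of messages in the hierarchy starting at msg_id."""
    return 1 + sum(thread_size(r, children) for r in children.get(msg_id, []))

roots, children = build_threads(messages)
```

Depth and size computed this way are two simple structural measures that can later be compared across threads or groups.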
4. Network of relationships (Il-chon or E-chon)
A social network is the pattern of relationships in a population (see Nadel, 1964; Freeman, 2000; Scott, 2000).
5. History of participation (view counts, scrap counts)
All users of online spaces develop reputational signals from their history of participation.
6. Representation of identity (expressing 'self')
These can range from distinctive names, signature files, avatars, and personal pages, tags, biographical statements, affiliations, journals, images and other files uploaded or linked to the personal pages.
7. Dimensions of data contributed (uploading and sharing data)
YouTube supports video, and Flickr is dedicated to sharing and tagging images. MySpace, Facebook, and Wikipedia let users upload and share files that are more personal.
Analysis: general considerations
A systematic study of computer-mediated social interaction spaces must consider the dimensions of behavior they contain. Typically, behavioral variables are measured as counts, like the number of comments or relationships. Wherever appropriate, these measures should be considered in terms of rates of activity, or be standardized or displayed on a log scale. Another consideration is that the boundaries of social action may or may not correspond to boundaries within the online space and may often spread across multiple spaces (Baym, 2007).
Systematic differences in how participants contribute can be conceptualized as social roles (Welser, Smith, Gleave, & Fisher, 2007), and mapping the distribution of these social roles across community boundaries will suggest the appropriate theoretical framework for modeling the social action that occurs within them (Monge & Contractor, 2003).
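The three ways of handling raw counts named here (rates, standardization, log scale) can be illustrated with a small sketch. The function names and the sample counts are invented for illustration, and log10(1 + x) is just one common choice for keeping zero counts defined.

```python
import math
from statistics import mean, pstdev

def activity_rate(count, days_active):
    """Convert a raw count into a rate per day of activity."""
    return count / days_active

def standardize(values):
    """Z-scores: distance from the mean in units of the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def log_scale(values):
    """log10(1 + x), so that authors with zero comments still get a value."""
    return [math.log10(1 + v) for v in values]

# Hypothetical comment counts for four authors, spanning several orders of magnitude.
comment_counts = [0, 9, 99, 999]
scaled = log_scale(comment_counts)
```

On a log scale the four authors are evenly spaced, whereas on a raw scale the most active author would dwarf the rest.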
Two measurement challenges
What temporal and behavioral bounds should we place on the definition of a tie in a given study?
More conceptually, which modes of interaction represent meaningful social relationships within the focal population?
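Both questions become concrete once a tie has to be computed. The sketch below assumes reply events as the mode of interaction, a time window as the temporal bound, and a minimum number of replies as the behavioral bound. All of these are illustrative choices, and the event records are invented.

```python
from collections import Counter

# Hypothetical reply events: (day, replier, author_replied_to).
events = [
    (1, "ann", "bob"),
    (2, "bob", "ann"),
    (3, "ann", "bob"),
    (40, "ann", "cat"),
    (41, "cat", "ann"),
    (90, "dan", "ann"),
]

def infer_ties(events, start_day, end_day, min_interactions):
    """Undirected ties: pairs with at least min_interactions replies
    between start_day and end_day (inclusive)."""
    counts = Counter()
    for day, source, target in events:
        if start_day <= day <= end_day and source != target:
            counts[frozenset((source, target))] += 1
    return {pair for pair, n in counts.items() if n >= min_interactions}

# The same log yields different networks under different definitions.
strict_ties = infer_ties(events, 1, 30, 2)
loose_ties = infer_ties(events, 1, 100, 1)
```

Changing the window or the threshold changes which relationships count as ties, which is why the definition should be stated and justified for each study.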
Agenda for visualization in social scientific discovery
1. Purposes of applying information visualization to social media
Visualization is a key step in a larger process of identifying meaningful dimensions of interaction, aggregating actions, and visualizing distributions and relationships in the population. Given the number and fine-grained detail of logged events, processing aggregations and making descriptions is a critical methodological and theoretical task (see Welser, Smith, Gleave, & Fisher, 2008).
2. Challenges of applying information visualization to social media
Making use of color, shape, size, and orientation to map different data dimensions expands the density of data captured in a single class of image. Although such approaches can present a lot of information in a single visualization, images can eventually become too complex or violate rules of visual perception, obscuring information rather than revealing its real character (Tufte, 1995, 1997).
3. Solutions and approaches to effective information visualization (examples include Many Eyes and treemaps, which resemble mind maps)
Information visualization is a topic of deep complexity (Tufte, 1995; Donath, 1999; Freeman, 2000). A recent critique of network visualizations noted that basic tasks like following the links between any two nodes are often impossible (Shneiderman & Aris, 2006). Promising recent work suggests new approaches to network visualization that combine network data with other nodal data to create more informative images (Brandes, Raab, & Wagner, 2001; Shneiderman & Aris, 2006). Semantic substrates (Shneiderman & Aris, 2006) are a method to project network visualizations into meaningful spatial containers. A semantic substrate could be a non-geographic map like a treemap that clusters nodes by attributes other than their connections to other nodes (Shneiderman, 2004).
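A rough sketch of the semantic substrate idea: node positions come from a node attribute rather than from the links. Here each attribute value gets its own horizontal band. The band layout, the function name, and the role labels are assumptions made for illustration, not the published method's implementation.

```python
from collections import defaultdict

def substrate_layout(node_attrs, region_order, width=1.0, band_height=1.0):
    """Place each node inside the horizontal band reserved for its attribute value.

    node_attrs maps node -> attribute value; region_order lists the bands
    from first to last. Nodes are spread evenly along x inside their band.
    """
    by_region = defaultdict(list)
    for node, region in sorted(node_attrs.items()):
        by_region[region].append(node)
    positions = {}
    for band_index, region in enumerate(region_order):
        members = by_region.get(region, [])
        for i, node in enumerate(members):
            x = width * (i + 1) / (len(members) + 1)
            y = band_height * (band_index + 0.5)
            positions[node] = (x, y)
    return positions

# Hypothetical nodes labeled with an invented role attribute.
node_attrs = {"a": "expert", "b": "expert", "c": "newcomer"}
positions = substrate_layout(node_attrs, ["expert", "newcomer"])
```

Links would then be drawn between these fixed positions, so that a link crossing from one band to another is immediately readable as a tie between different kinds of nodes.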
VISUALIZATIONS FOR DISCOVERY
Mapping Boundaries and Hierarchies: Treemaps and Graphs
Figure 10.1 depicts a treemap (Shneiderman, 2004; Smith & Fiore, 2001) of posts to Usenet newsgroups under the microsoft.public hierarchy [microsoft.public.excel, … excel.programming, etc.] for 2001 (see Turner, Smith, Fisher, & Welser, 2005).
Treemaps like these can be applied to other content-oriented sites like Slashdot, Wikipedia, and most topic-specific web forums. Here we shift our focus to mapping nested invitation relationships in the social network and blogging site Wallop. Figure 10.2 displays invitation relationships both as a hierarchy and as a graph.
In our previous studies of Wallop, we described the rise of different language communities and the contrasts between them (Gu, Johns, Lento, & Smith, 2006; Lento et al., 2006). Figure 10.3 shows the diffusion of invitations across the English and Chinese language communities.
The hyperbolic network graph is an effective method for exploring a large network that highlights adjacent nodes while downplaying distant ones (Schaffer et al., 1996). This tool is useful for exploration and provides a helpful addition to classic network visualization software like Pajek (De Nooy, Mrvar, Batagelj & Granovetter, 2005). We found that, as with the patterns noted in invitation practices, comment interaction in Wallop had occasional language group crossings, but was generally marked by a preference for replying to others in the same language (Lento et al., 2006).
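How a treemap turns a hierarchy of post counts into nested rectangles can be shown with the simple slice-and-dice layout. This is a minimal sketch and not necessarily the layout used for Figure 10.1. The group names and post counts below are placeholders, and only leaf groups are assumed to carry counts.

```python
def tree_weight(node):
    """Total posts under a node (leaves carry the counts)."""
    children = node.get("children", [])
    if children:
        return sum(tree_weight(c) for c in children)
    return node["posts"]

def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Give each leaf a rectangle (x, y, w, h) whose area is proportional to its posts.

    The split direction alternates with depth: side by side at even depths,
    stacked at odd depths.
    """
    if out is None:
        out = {}
    children = node.get("children", [])
    if not children:
        out[node["name"]] = (x, y, w, h)
        return out
    total = sum(tree_weight(c) for c in children)
    offset = 0.0
    for child in children:
        share = tree_weight(child) / total if total else 0.0
        if depth % 2 == 0:
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out

# Placeholder hierarchy with invented post counts.
hierarchy = {
    "name": "root",
    "children": [
        {"name": "topic_a", "children": [
            {"name": "topic_a.sub1", "posts": 30},
            {"name": "topic_a.sub2", "posts": 10},
        ]},
        {"name": "topic_b", "posts": 60},
    ],
}
rects = slice_and_dice(hierarchy, 0.0, 0.0, 1.0, 1.0)
```

Each rectangle's area is its share of all posts, and sub-groups stay inside the region of their parent, which is what lets a reader see both boundaries and relative volume at once.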
The insight that the invitation tree contains more deviations from homophily than one might expect is actually borne out in records of interaction structure (see Lento et al., 2006, figure 5).
Comparing patterns of behavior across different boundaries
The scatter plots in figure 10.4 show the relationship between overall levels of activity (total posts, x-axis) and the size of the community (number of repliers, y-axis) across several different types of newsgroups.
Figure 10.5 is a set of ‘crowd views’ generated from data from a range of Usenet newsgroups. The crowd view is a scatter plot with a few additional attributes mapped to the color and size of each glyph. Each crowd view displays a glyph for each author in a newsgroup or other collection of threaded message conversations.
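One simple way to put a number on homophily is to compare the observed share of same-language ties with the share expected if ties formed at random. This is an illustrative measure, not the one used in the cited study, and the users, languages, and ties are invented.

```python
from collections import Counter

def same_group_share(ties, group_of):
    """Observed share of ties that connect two members of the same group."""
    same = sum(1 for a, b in ties if group_of[a] == group_of[b])
    return same / len(ties)

def expected_same_group_share(group_of):
    """Share of same-group pairs among all distinct pairs of users,
    which is the value expected if ties formed uniformly at random."""
    sizes = Counter(group_of.values())
    n = len(group_of)
    same_pairs = sum(s * (s - 1) // 2 for s in sizes.values())
    return same_pairs / (n * (n - 1) // 2)

# Hypothetical users with a language label, and invitation ties among them.
group_of = {"a": "en", "b": "en", "c": "en", "d": "zh", "e": "zh"}
ties = [("a", "b"), ("b", "c"), ("d", "e"), ("c", "d")]

observed = same_group_share(ties, group_of)
expected = expected_same_group_share(group_of)
```

An observed share well above the expected share indicates homophily, and the ties that cross groups (here the tie between c and d) are the deviations worth inspecting in the graph.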
MAPPING STRUCTURE OF RELATIONSHIPS WITHIN THREADS AND GROUPS
Computer-mediated social systems are rife with ways to infer social ties from interaction records.
The following network graphs were collected through content analysis of edits to a Wikipedia policy discussion page. These data come from the first archived page of the ‘No personal attacks’ policy (see Black, Welser, DeGroot, & Cosley, 2008).
Network graphs like these are valuable at initial steps in exploratory network visualization, but should be augmented with other data, like roles or status (Brandes, Raab, & Wagner, 2001). Egocentric network graphs based on comment relationships among Wallop users are shown in Figure 10.7. These visualizations are consistent with a general theoretical supposition (McAdam & Paulson, 1993).
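An egocentric network can be extracted from a list of ties by keeping ego, ego's direct contacts (alters), and the ties among them. The sketch below uses invented user ids and treats ties as undirected, which is an assumption; the density measure is one common summary and is not taken from the source.

```python
def ego_network(edges, ego):
    """Return (alters, ego_edges): ego's direct contacts and all ties among ego and alters."""
    alters = {b for a, b in edges if a == ego} | {a for a, b in edges if b == ego}
    alters.discard(ego)
    members = alters | {ego}
    ego_edges = {
        frozenset((a, b))
        for a, b in edges
        if a in members and b in members and a != b
    }
    return alters, ego_edges

def alter_density(alters, ego_edges, ego):
    """Share of possible alter-to-alter ties that are actually present."""
    k = len(alters)
    if k < 2:
        return 0.0
    alter_ties = sum(1 for edge in ego_edges if ego not in edge)
    return alter_ties / (k * (k - 1) / 2)

# Hypothetical comment ties between users.
edges = [("u1", "u2"), ("u1", "u3"), ("u2", "u3"), ("u3", "u4"), ("u1", "u5")]
alters, ego_edges = ego_network(edges, "u1")
density = alter_density(alters, ego_edges, "u1")
```

Here u4 is left out because it is two steps away from ego, and the density summarizes how tightly ego's contacts are connected to one another.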
Characterizing types of actors from histories of contributions and relations
Figure 10.8 is a revealing triptych for discerning roles from threaded discussion, especially tailored to distinguish the role of expert (or ‘answer person’) from that of other common participants. The set includes an ‘authorline’, a longitudinal characterization of the amount of contributions to particular threads while distinguishing between those initiated by ego and those initiated by others (Viegas & Smith, 2004). This set of three visualizations allowed us to identify some of the key structural signatures of experts in online discussion spaces (Welser et al., 2007).
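The data behind an authorline can be approximated by counting, for each week, how many posts an author made to each thread, split by who started the thread. This sketch only prepares the counts and does not draw anything. The record format, names, and the weekly time unit are assumptions.

```python
from collections import defaultdict

# Hypothetical posts: (week, author, thread_id, thread_initiator).
posts = [
    (1, "ann", "t1", "ann"),
    (1, "ann", "t2", "bob"),
    (1, "ann", "t2", "bob"),
    (1, "bob", "t2", "bob"),
    (2, "ann", "t3", "cat"),
]

def authorline_counts(posts, ego):
    """For each week, count ego's posts per thread, split by thread initiator."""
    weeks = defaultdict(lambda: {
        "initiated_by_ego": defaultdict(int),
        "initiated_by_others": defaultdict(int),
    })
    for week, author, thread_id, initiator in posts:
        if author != ego:
            continue
        key = "initiated_by_ego" if initiator == ego else "initiated_by_others"
        weeks[week][key][thread_id] += 1
    # Convert nested defaultdicts to plain dicts for easier inspection.
    return {
        week: {key: dict(counts) for key, counts in groups.items()}
        for week, groups in weeks.items()
    }

ann_line = authorline_counts(posts, "ann")
```

An author whose weeks are dominated by small counts in many threads started by others would show the kind of pattern that a role analysis might then test more formally.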
Assessing visualizations: building better pictures and better picture production systems
Tufte (1995, 1997) and many others have pointed out standards for high quality ‘final product’ visualizations. Two of our studies (Lento et al., 2006; Welser et al., 2007) illustrate a three-stage process:
1. Production of visualizations of relationships and behavior to gain insight into patterns and develop hypotheses
2. Operationalization of visualization patterns as metrics and variables for use in a statistical model
3. Communication of model results
Ultimately, the best assessment of exploratory visualizations and systems for producing those visualizations is the predictive power of those models and the theoretical significance of these findings.
Further challenges in extending visualization techniques to complex data
Attempting to squeeze more information into a single visualization becomes counterproductive. By grouping similar measures, eliminating irrelevant information, and recombining different relationships into aggregated measures, it is sometimes possible to represent complex data effectively in a simple visualization.
Another approach is to represent complex data through comparison of clearly related images, either as moving images or as multiple representations of different subsets of the data in a single visualization.
There is much room for progress. First, we recognize limitations that stem from the need for additional methods of inquiry, like ethnographic study, statistical testing, and experimental research, in order to understand social dynamics more fully. We also recognize a need for greater development of tools for the efficient creation of precisely tuned sets of visualizations. Finally, we hope to see visualization strategies extended across wider ranges of comparable situations.