In ongoing work, we are experimenting with data-driven approaches to support communication and socialization processes in online communities. Specifically, our work addresses the problems that often arise due to the use of automated methods for curating participants' contributions. Simple binary constructs such as “helpful,” “like,” or “thumbs up/down” have become dominant mechanisms for organizing user-contributed content. Readers' feedback on collected content is aggregated and used to create information displays, ranking content by attributes such as what is “most helpful” or “most liked.” The result is a curated collection of participants’ contributions.
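The curation mechanism described above can be sketched minimally: binary "helpful" feedback is tallied per contribution, and the collection is ranked by vote count. The function name and data below are illustrative assumptions, not part of the project itself.

```python
from collections import Counter

def rank_by_helpfulness(votes):
    """Rank contributions by "helpful" votes received.

    votes: iterable of contribution ids, one entry per "helpful" vote.
    Returns contribution ids ordered most-voted first, mirroring a
    "most helpful" information display.
    """
    tallies = Counter(votes)
    return [item for item, _ in tallies.most_common()]

# Review "C" received the most "helpful" votes, so it is displayed first.
order = rank_by_helpfulness(["A", "C", "B", "C", "C", "B"])
# → ["C", "B", "A"]
```

Note that the aggregation discards everything except the vote count: two contributions with the same tally are indistinguishable to the display, regardless of content or contributor.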
Given the tendency for people to access items in the order of presentation and to satisfice rather than fully satisfy their information needs, the aforementioned curation algorithms largely determine what information participants are exposed to (i.e., creating a filter bubble). We are undertaking a systematic investigation of communities that employ such mechanisms to better understand how curation algorithms impact users. Existing research suggests social voting mechanisms have unintended consequences (e.g., there is little turnover in what is “most helpful,” even when new, high-quality content is added; some kinds of content are consistently hidden because they receive few votes), and we are studying the effects of those consequences: how do user perceptions and behavior change based on the information shown (or hidden)? How does the information shown (or hidden) influence what information users contribute? Do curation algorithms display homogeneous information that alienates underrepresented users? The many possible combinations of features of users, content, and information displays present a complex problem for automated curation.
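The "little turnover" effect can be illustrated with a toy model (an assumption for exposition, not the project's actual analysis): if readers vote mostly on whatever is already displayed at the top, an item with an early head start keeps its rank even after better content arrives. The split of attention (90/10) and the head start are arbitrary parameters chosen for illustration.

```python
def simulate_turnover(rounds=50, views_per_round=100):
    """Toy model of position bias in a vote-ranked display.

    An early contribution starts with a small vote head start; a later
    (possibly higher-quality) contribution starts at zero. Each round,
    the top-ranked item absorbs most of the attention regardless of its
    underlying quality, so the incumbent compounds its lead.
    """
    votes = {"early_item": 10, "late_item": 0}
    for _ in range(rounds):
        ranking = sorted(votes, key=votes.get, reverse=True)
        # Position bias: 90% of votes go to the item shown first.
        votes[ranking[0]] += int(views_per_round * 0.9)
        votes[ranking[1]] += int(views_per_round * 0.1)
    return sorted(votes, key=votes.get, reverse=True)

# The later item never overtakes the incumbent in this model.
final_ranking = simulate_turnover()
# → ["early_item", "late_item"]
```

In this deterministic sketch the early item always leads; richer models would add content quality and stochastic reader behavior, which is part of what our automated analyses examine empirically.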
The research problem is fundamentally socio-technical, and our project employs a multi-method approach that focuses on four characteristics of the algorithms and users: (1) contributor characteristics (e.g., gender, reputation), (2) content characteristics (e.g., writing style, keywords), (3) the perceived value of curated content (e.g., “helpful” votes received), and (4) the presentation algorithm(s) (e.g., reverse helpfulness rank). We are conducting automated analyses of the content and its presentation and are planning a survey of users. Our study includes multiple communities in three different domains (health, entertainment, and news). We hope to identify the conditions under which contributions and/or contributors that exhibit certain properties are systematically ranked lower (or higher) than others, and how the information displayed impacts user behavior.