Pervasive information streams that document people and their routines have been a boon to social computing research. But the ethics of collecting and analyzing available—but potentially sensitive—online data present challenges to researchers. In response to increasing public and scholarly debate over the ethics of online data research, this paper analyzes the current state of practice among researchers using online data. Qualitative and quantitative responses from a survey of 263 online data researchers document beliefs and practices around which social computing researchers are converging, as well as areas of ongoing disagreement. The survey also reveals that these disagreements are not correlated with disciplinary, methodological, or workplace affiliations. The paper concludes by reflecting on changing ethical practices in the digital age, and discusses a set of emergent best practices for ethical social computing research.
Do Open data badges influence author behaviour? A case study at Springer NatureRebecca Grant
Digital badges have previously been shown to incentivise journal authors to share their data openly. In this paper we introduce an Open data badging project at the Springer Nature journal BMC Microbiology. The development of the Open data badge is described, as well as the challenges of developing standard badging criteria and ensuring authors’ awareness of the badges. Next steps for the badging project are outlined, which are based on the experiences of the team assessing the badges, the number of badges awarded at the journal to date, and the results of an author survey.
Research in the time of Covid: Surveying impacts on Early Career ResearchersRebecca Grant
Based on a survey of over 4,500 researchers published in the white paper The State of Open Data 2020, this session will explore the impacts of the pandemic on early career reearchers (ECRs), their research practice, and how they interact with open data. We will discuss the specific challenges reported by ECRs, as well as the gaps in training and support that they have identified that would encourage their sharing and reuse of research data.
Presentation at the E-ARMA conference 2021.
Gather evidence to demonstrate the impact of your researchIUPUI
This workshop is the 3rd in a series of 4 titled "Maximize your impact" offered by the IUPUI University Library Center for Digital Scholarship. Faculty must provide strong evidence of impact in order to achieve promotion and tenure. Having strong evidence in year 5 is made easier by strategic dissemination early in your tenure track. In this hands-on workshop, we will introduce key sources of evidence to support your case, demonstrate strategies for gathering this evidence, and provide a variety of examples. These sources include citation metrics, article level metrics, and altmetrics as indicators of impact to support your narrative of excellence.
Federated Learning (FL) is a learning paradigm that enables collaborative learning without centralizing datasets. In this webinar, NVIDIA present the concept of FL and discuss how it can help overcome some of the barriers seen in the development of AI-based solutions for pharma, genomics and healthcare. Following the presentation, the panel debate on other elements that could drive the adoption of digital approaches more widely and help answer currently intractable science and business questions.
Do Open data badges influence author behaviour? A case study at Springer NatureRebecca Grant
Digital badges have previously been shown to incentivise journal authors to share their data openly. In this paper we introduce an Open data badging project at the Springer Nature journal BMC Microbiology. The development of the Open data badge is described, as well as the challenges of developing standard badging criteria and ensuring authors’ awareness of the badges. Next steps for the badging project are outlined, which are based on the experiences of the team assessing the badges, the number of badges awarded at the journal to date, and the results of an author survey.
Research in the time of Covid: Surveying impacts on Early Career ResearchersRebecca Grant
Based on a survey of over 4,500 researchers published in the white paper The State of Open Data 2020, this session will explore the impacts of the pandemic on early career reearchers (ECRs), their research practice, and how they interact with open data. We will discuss the specific challenges reported by ECRs, as well as the gaps in training and support that they have identified that would encourage their sharing and reuse of research data.
Presentation at the E-ARMA conference 2021.
Gather evidence to demonstrate the impact of your researchIUPUI
This workshop is the 3rd in a series of 4 titled "Maximize your impact" offered by the IUPUI University Library Center for Digital Scholarship. Faculty must provide strong evidence of impact in order to achieve promotion and tenure. Having strong evidence in year 5 is made easier by strategic dissemination early in your tenure track. In this hands-on workshop, we will introduce key sources of evidence to support your case, demonstrate strategies for gathering this evidence, and provide a variety of examples. These sources include citation metrics, article level metrics, and altmetrics as indicators of impact to support your narrative of excellence.
Federated Learning (FL) is a learning paradigm that enables collaborative learning without centralizing datasets. In this webinar, NVIDIA present the concept of FL and discuss how it can help overcome some of the barriers seen in the development of AI-based solutions for pharma, genomics and healthcare. Following the presentation, the panel debate on other elements that could drive the adoption of digital approaches more widely and help answer currently intractable science and business questions.
Presentation by RIN's Director, Michael Jubb, at the Association of Subscription Agents' annual conference in February 2010. http://www.subscription-agents.org/conferences/asa-conference-2010
In this case study we identify the factors that influence the adoption of a new system in a major company in Saudi Arabia. We develop a theoretical framework to help derive better understanding of system adoption via socio-technical integration.
We formulation of 14 hypotheses that were tested via a survey of 42 system users. Management support and change management were found to be significant factors influencing system adoption. As a result, the 14 null hypotheses were rejected due to their statistical significance (p-value < 0.05). Discussions and recommendations for future research are discussed.
Research process and research data management. Many universities are looking at how they can better serve the needs of researchers. Ken Chad Consulting worked with the University of Westminster to look the needs and attitudes of researchers and admin staff in terms of research data management (RDM). The result led the University to look first at the whole lifecycle and workflows of research administration. This in turn led to the innovative, rapid development of a system to support researchers and admin staff. Presented by Suzanne Enright (University of Westminster) and Ken Chad at the annual UKSG conference in April 2014
La comparsa, pochi decenni fa, di Internet e della connettività globale ha dato origine ad un fenomeno assolutamente nuovo: un accumulo di enormi quantità di dati conservati in banche digitali, la cui quantità raddoppia ogni pochi giorni e in prospettiva ogni poche ore. E’ la realtà dei Big Data, di cui molto si parla e discute, sovente con toni entusiastici. Ma Big Data vuol dire anche problemi di utilizzo, di interpretazione e rischi di distorsioni. Se questo è rilevante per i dati che hanno un valore economico, l’accumulo di informazione e il come viene trattata ha risvolti altrettanto rilevanti sulla formazione di conoscenza.
Per affrontare queste sfide, cruciali sono il rapporto fra etica e scienza, l’analisi critica su come i dati vengono prodotti e proposti, e il coinvolgimento di tutti i soggetti sociali chiamati in causa.
12 settembre 2019 | Torino, Polo del '900
A Model of Decision Support System for Research Topic Selection and Plagiaris...theijes
The paper proposes a model of the decision support system for deciding a research topic in academia. The biggest challenge for a student in the field of research is to identify area and topic of research. The paper explains the model which helps student to identify the most suitable area and/or topic for academic research. The model is also design to assist supervisors to explore latest areas of research as well as to get rid of non intentional plagiarism. The model facilitates the user to select either keyword bases topic search or questionnaire based topic search. The model uses local database and service of a Meta search engine in decision making activity
Researchers are normally expected to adopt risk minimization strategies. Basic ethical principles stress the need to do well and to do no harm because just one research study may cause a range of harm.
Ethical Considerations in Data Analyticsarchijain931
The age of data analytics has ushered in a wealth of opportunities for organizations and individuals to derive valuable insights from data. However, with great power comes great responsibility. Ethical considerations in data analytics have become increasingly important as the potential for misuse and privacy breaches has grown. In this article, we will explore the ethical challenges and principles that guide responsible data analytics, emphasizing the need for transparency, fairness, and accountability.
RDC - Chuck Humphrey and Susan Babcock - Research Ethics Boards and Data Mana...CASRAI
The panel will discuss the context in which requirements around the ethical conduct of human participant research intersect with data management plans.
In the rapidly evolving field of data analytics, ethical considerations are more critical than ever. Responsible data analytics involves not only extracting insights but also respecting privacy, ensuring fairness, and being transparent and accountable for data practices. Data professionals, organizations, and policymakers must collaborate to establish ethical guidelines and regulations that protect individuals' rights and promote responsible data use.
Ethical Priniciples for the All Data RevolutionMelissa Moody
A presentation by Stephanie Shipp, from the Research Highlights session at the 2019 Women in Data Science Charlottesville Conference. Hosted by the UVA Data Science Institute.
Data Analytics Ethics Issues and Questions
Presented at the University of Chicago Booth Big Data & Analytics Roundtable, April 2018
Presenter:
Arnie Aronoff, Ph.D.
Instructor, MScA in Data Analytics
Instructor, School of Social Services Administration
The University of Chicago
Group Concept OD
Organizational Development and Training
(312) 259-4544
aaronoff33@gmail.com
Presented by
Presentation by RIN's Director, Michael Jubb, at the Association of Subscription Agents' annual conference in February 2010. http://www.subscription-agents.org/conferences/asa-conference-2010
In this case study we identify the factors that influence the adoption of a new system in a major company in Saudi Arabia. We develop a theoretical framework to help derive better understanding of system adoption via socio-technical integration.
We formulation of 14 hypotheses that were tested via a survey of 42 system users. Management support and change management were found to be significant factors influencing system adoption. As a result, the 14 null hypotheses were rejected due to their statistical significance (p-value < 0.05). Discussions and recommendations for future research are discussed.
Research process and research data management. Many universities are looking at how they can better serve the needs of researchers. Ken Chad Consulting worked with the University of Westminster to look the needs and attitudes of researchers and admin staff in terms of research data management (RDM). The result led the University to look first at the whole lifecycle and workflows of research administration. This in turn led to the innovative, rapid development of a system to support researchers and admin staff. Presented by Suzanne Enright (University of Westminster) and Ken Chad at the annual UKSG conference in April 2014
La comparsa, pochi decenni fa, di Internet e della connettività globale ha dato origine ad un fenomeno assolutamente nuovo: un accumulo di enormi quantità di dati conservati in banche digitali, la cui quantità raddoppia ogni pochi giorni e in prospettiva ogni poche ore. E’ la realtà dei Big Data, di cui molto si parla e discute, sovente con toni entusiastici. Ma Big Data vuol dire anche problemi di utilizzo, di interpretazione e rischi di distorsioni. Se questo è rilevante per i dati che hanno un valore economico, l’accumulo di informazione e il come viene trattata ha risvolti altrettanto rilevanti sulla formazione di conoscenza.
Per affrontare queste sfide, cruciali sono il rapporto fra etica e scienza, l’analisi critica su come i dati vengono prodotti e proposti, e il coinvolgimento di tutti i soggetti sociali chiamati in causa.
12 settembre 2019 | Torino, Polo del '900
A Model of Decision Support System for Research Topic Selection and Plagiaris...theijes
The paper proposes a model of the decision support system for deciding a research topic in academia. The biggest challenge for a student in the field of research is to identify area and topic of research. The paper explains the model which helps student to identify the most suitable area and/or topic for academic research. The model is also design to assist supervisors to explore latest areas of research as well as to get rid of non intentional plagiarism. The model facilitates the user to select either keyword bases topic search or questionnaire based topic search. The model uses local database and service of a Meta search engine in decision making activity
Researchers are normally expected to adopt risk minimization strategies. Basic ethical principles stress the need to do well and to do no harm because just one research study may cause a range of harm.
Ethical Considerations in Data Analyticsarchijain931
The age of data analytics has ushered in a wealth of opportunities for organizations and individuals to derive valuable insights from data. However, with great power comes great responsibility. Ethical considerations in data analytics have become increasingly important as the potential for misuse and privacy breaches has grown. In this article, we will explore the ethical challenges and principles that guide responsible data analytics, emphasizing the need for transparency, fairness, and accountability.
RDC - Chuck Humphrey and Susan Babcock - Research Ethics Boards and Data Mana...CASRAI
The panel will discuss the context in which requirements around the ethical conduct of human participant research intersect with data management plans.
In the rapidly evolving field of data analytics, ethical considerations are more critical than ever. Responsible data analytics involves not only extracting insights but also respecting privacy, ensuring fairness, and being transparent and accountable for data practices. Data professionals, organizations, and policymakers must collaborate to establish ethical guidelines and regulations that protect individuals' rights and promote responsible data use.
Ethical Priniciples for the All Data RevolutionMelissa Moody
A presentation by Stephanie Shipp, from the Research Highlights session at the 2019 Women in Data Science Charlottesville Conference. Hosted by the UVA Data Science Institute.
Data Analytics Ethics Issues and Questions
Presented at the University of Chicago Booth Big Data & Analytics Roundtable, April 2018
Presenter:
Arnie Aronoff, Ph.D.
Instructor, MScA in Data Analytics
Instructor, School of Social Services Administration
The University of Chicago
Group Concept OD
Organizational Development and Training
(312) 259-4544
aaronoff33@gmail.com
Presented by
Data security issues, ethical issues and challenges to privacy in knowledge-i...Tore Hoel
Presentation at “Finnish-Norwegian Workshop in Learning Analytics”, Helsinki, 21-22 May 2015. Organised by The Research Council of Norway and Academy of Finland
What is e-research?
Enhancing research practice
e-Research Methods, Strategies, and Issues
Tips For Finding Useful Information
Some Search Tools for doing e-research
Research Design
Quantitative Research
Qualitative Research
Ethics & The e-Researcher
How The Net Complicates Ethics?
Privacy, Confidentiality, Autonomy, And The Respect For Persons
Tips For Ethical e-Research
Collaboration Tools
Why Consensus?
Net-based dissemination of E-research results
Dissemination through peer-reviewed articles
Advantages of a peer-reviewed article
Dissemination through email lists or Usenet groups
Dissemination through a virtual conference
"Towards Value-centric Big Data: Community Position Paper" Daniel Bachlechner...e-SIDES.eu
e-SIDES team attended the last edition of the IFIP Summer School and moderated the session "Towards Privacy-Preserving Big Data Analytics".
This presentation on e-SIDES and our Community Position Paper was delivered by Daniel Bachlechner during the session.
This presentation was given at the Blockchain for Social Impact Meetup in Philadelphia on 13 Aug 2018 by Sean Manion. A similar presentation was given at ICCS 2018 Blockchain and Network Effects workshop.
Leveraging Your Audience Through Social MediaJessica Vitak
Presentation at the University of Maryland's Women's Studies Summer Technology Institute (#wmststi) on the powerful role of social media and other sites in creating an online brand and locating & disseminating information.
Leveraging Your Audience Through Social Media Jessica Vitak
Presentation at the University of Maryland's Women's Studies Summer Technology Institute (#wmststi) on the powerful role of social media and other sites in creating an online brand and locating & disseminating information.
Connecting in the Facebook Age: Development and Validation of a New Measure o...Jessica Vitak
Presentation of ICA paper, "Connecting in the Facebook Age: Development and Validation of a New Measure of Relationship Maintenance." This presentation includes details from additional validity and reliability testing using confirmatory factor analysis.
Link to paper: http://vitak.files.wordpress.com/2009/02/ica2014-relmaintenance-toshare.pdf
Understanding Users' Privacy Motivations and Behaviors in Online SpacesJessica Vitak
I’ve spent my career so far studying the social outcomes people derive from their use of new communication systems like Facebook. These sites contain numerous affordances that differentiate them from other forms of communication & create low-cost environments for things like relationship maintenance and exchange of resources. I have found this research to be extremely rewarding, as it is important to understand how these social systems extend our capabilities for human interaction, beyond the more traditional forms of communication we have relied on previously.
But, there's a flip side to this story. Humans, by nature, are very social beings and want to interact, want to disclose information and share it with others. Social network sites and their like facilitate this through a variety of features. However, as individuals have moved their communication from offline spaces, where the interactions tend to be much more ephemeral and audiences are generally known, to online spaces, where the lines between public and private become much more blurred, I believe that thoughts of privacy of personal information are often lost in the novelty of the technologies. Now, as we begin to think about this issue more and more, I believe it’s time to step back and re-evaluate how we conceptualize our privacy in this highly networked world and to integrate that understanding into solutions that will help individuals become more savvy users of the technology.
#cscw2014 -- Facebook Makes the Heart Grow Fonder: Relationship Maintenance S...Jessica Vitak
Read the paper here: http://vitak.files.wordpress.com/2009/02/vitak-cscw2014-distance.pdf
Abstract: The increasing ubiquity of information and communication technologies has dramatically impacted interpersonal communication and relationship maintenance processes. These technologies remove temporal and spatial constraints, enabling communication at a distance for low to no physical costs. Research has established that technologies such as email supplement other forms of communication in relationship maintenance, but to what extent do newer technologies—which contain a unique set of affordances—facilitate these processes? Furthermore, do SNS users engage in different practices through the site and obtain different relational benefits based on specific characteristics of the tie? Findings from a survey of adult Facebook users (N=415) indicate that geographically distant Facebook Friends, as well as those who rely on the site as their primary form of communication, engage in relationship maintenance strategies through the site to a greater extent and perceive the site to have a more positive impact on the quality of their relationships.
Users and Nonusers: Interactions between Levels of Facebook Adoption and Soci...Jessica Vitak
Although Facebook is the largest social network site in the U.S. and attracts an increasingly diverse userbase, some individuals have chosen not to join the site. Using survey data collected from a sample of non-academic staff at a large Midwestern university (N=614), we explore the demographic and cognitive factors that predict whether a person chooses to join Facebook. We find that older adults and those with higher perceived levels of bonding social capital are less likely to use the site. Analyzing open-ended responses from non-users, we find that they express concerns about privacy, context collapse, limited time, and channel effects in deciding to not adopt Facebook. Finally, we compare non-adopters against users who differ on three dimensions of use. We find that light users often have social capital outcomes similar to, or worse than, non-users, and that heavy users report higher perceived bridging and bonding social capital than either group.
To view the paper: http://vitak.files.wordpress.com/2009/02/lampe_vitak_ellison-2013-cscw.pdf
Social Media: Where We Stand, Where We’re HeadingJessica Vitak
Guest lecture at Elon University on 10/19/12 in COM 371, The Future of the Internet, talking about social media research and thoughts on where social media is heading in the coming years.
Managing Privacy and Context Collapse in the Facebook AgeJessica Vitak
The growth of social media—online sites driven by the public sharing on personal information with a wide audience—raises new questions related to how individuals manage their privacy and self-presentation. The technical features of sites such as Facebook, Google Plus, and Twitter lower the transaction costs of connecting and interacting with a large and diverse audience. At the same time, they may raise the costs of managing self-presentation across different contexts and ensuring that private information is not shared with unintended audiences.
Discussions related to self-presentation and privacy have featured prominently in the social sciences for more than half a century. For example, Goffman (1959) argued that individuals’ self-presentation varies based on the audience for whom they are performing. Likewise, Altman (1975) viewed privacy not as a static process, but one of dynamic boundary regulation, in which individuals make decisions regarding which pieces of personal information to share with whom, as well as the context in which that information is disclosed.
In online social networking communities, additional social and technical features make the process of managing privacy and self-presentation more complicated. Unlike anonymous forums, where users can create virtual identities not connected to their “real” selves, SNSs are tied to real identities, and because users often share a significant amount of personal information through these sites (Nosko et al., 2010), privacy becomes a critical element to determining both who to connect with and what to disclose. Boyd (2008) characterizes SNSs as
“networked publics,” and describes three features that differentiate them from other publics: invisible audiences, context collapse, and the blurring of public and private. Each of these factors is critical in evaluating how individuals can regulate boundaries and get the most out of their use of these sites.
Context collapse—the flattening of multiple distinct audiences into a homogeneous group—offers benefits and barriers to individuals. The average American adult has 229 Facebook “friends” (Hampton et al., 2011) who comprise a variety of personal and professional contexts. While Facebook enables users to quickly diffuse information across their entire network, communicating with such a diverse set of others through the same channel (e.g., status updates) may become problematic when it prevents individuals from varying their self-presentation for different audiences or when their full audience is unclear.
When facing these challenges, individuals have a number of options. Bernie Hogan (2010) suggests that users employ a “lowest common denominator” approach, whereby only content appropriate for all audiences is shared on the site. On the other hand, users may employ advanced privacy settings to segregate audiences, so they can still share relevant content with their various connections.
Social Support and Information-Sharing on Facebook by Adult UsersJessica Vitak
These slides are from a presentation at the National Communication Association annual conference on November 16, 2010 in San Francisco. The presentation summarizes findings from a qualitative study of adult Facebook users and focuses on two key constructs of social capital: social support and information-seeking.
As Europe's leading economic powerhouse and the fourth-largest hashtag#economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like hashtag#Russia and hashtag#China, hashtag#Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in hashtag#cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to hashtag#AdvancedPersistentThreats (hashtag#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Show drafts
volume_up
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
1. Beyond the Belmont Principles:
Ethical Challenges, Practices,
and Beliefs in the Online Data
Research Community
Jessica Vitak, Katie Shilton, and Zahra Ashktorab
College of Information Studies, University of Maryland
CSCW 2016 | San Francisco, CA
Twitter: @jvitak, @katieshilton, @zashktorab
#cscw2016
4. Data Free for All?
Not so fast
(research) cowboy…
Let’s talk about data
ethics first.
5. Belmont Report & Common Rule
Belmont Report published in 1979 in response to
unethical studies in psychology & medicine
Further articulated in Common Rule legislation
Three main principles:
1. Respect for persons: participants know about and
consent to research
2. Beneficence: do no harm; maximize benefits, minimize
risks
3. Justice: fair distribution of costs/benefits to all potential
participants
6. Online data create gray area
Is it feasible to collect informed
consent?
Should you be more
transparent about your
research?
Who is being left out by your
data collection strategies?
Isn’t public data public?
Is it possible to truly
anonymize a dataset?
7.
8. Statement from the SIGCOMM 2015 Program Committee: The
SIGCOMM 2015 PC appreciated the technical contributions made in
this paper, but found the paper controversial because some of the
experiments the authors conducted raise ethical concerns. The
controversy arose in large part because the networking research
community does not yet have widely accepted guidelines or rules
for the ethics of experiments that measure online censorship. In
accordance with the published submission guidelines for SIGCOMM
2015, had the authors not engaged with their Institutional Review
Boards (IRBs) or had their IRBs determined that their research was
unethical, the PC would have rejected the paper without review. But the
authors did engage with their IRBs, which did not flag the research as
unethical. The PC hopes that discussion of the ethical concerns these
experiments raise will advance the development of ethical guidelines in
this area. It is the PC’s view that future guidelines should include
as a core principle that researchers should not engage in
experiments that subject users to an appreciable risk of
substantial harm absent informed consent. The PC endorses
9. Research Questions
1. What are the research ethics practices of
researchers using online datasets?
2. What do researchers using online
datasets believe constitutes ethical
research?
3. How do these practices and beliefs vary
among social computing researchers?
10. Method: Survey Instrument
Survey items were created based on findings from
20 interviews with scholars in social-computing
fields1
Items captured:
Data Sources, Methods, and Analyses employed
Data collection, analysis and sharing practices
Attitudes toward appropriateness of various methods
and practices
Items capturing researchers personal codes of ethics
Demographics
1 See Shilton & Sayles (2016) for more information on this initial qualitative work.
11. Method: Identifying population
Focused on researchers in social computing-related
fields where online data analysis is common.
Used databases to identify articles published since 2011
at eight conferences: CSCW, CHI, ICWSM, iConference,
WWW, Ubicomp, CKIM, and KDD.
Search terms included: “trace ethnography,” “big data,”
“twitter,” “forums,” “text mining,” “logs,” “activity traces,”
and “social network”
Also posted survey invite to relevant listservs, including
AoIR, AIS, CITASA, AIS ICA, STS, and NCA
12. Method: Participant Details
Based on database search, we compiled a list of
~2800 unique names.
Sent initial invites plus two reminders through
SurveyGizmo over two-week period
Received 263 completed surveys, along with some
interesting emails from researchers…
13. Variable Mean (SD)/N (%)
Sex Male
Female
159 (60.5%)
93 (35.4%)
Education Bachelor’s
Master’s
PhD
15 (5.7%)
61 (23.2%)
180 (68.4%)
Current Location United States
UK
Canada
Germany
23 other countries (<10 participants)
162 (61.6%)
21 (8.0%)
14 (5.3%)
12 (4.6%)
47 (17.8%)
Degree In Communication & Media
Computer Science/Engineering
HCI
Information
Social Sciences
Other
33 (12.5%)
100 (38%)
10 (3.8%)
50 (19%)
37 (14.1%)
27 (10.3%)
Current Field of
Work
Academia (Research Focus)
Industry
Policy/Government/Non-Profit
207 (78.7%)
35 (13.3%)
12 (4.6%)
Sample Demographics (N=263)
14. Findings: Researcher Code of
Ethics
Analyzed two ways:
1. Iterative coding by authors of open-ended
responses to question, “How would you describe
your personal code of ethics regarding online
data?”*
2. EFA of 35 survey items relating to research
attitudes and practices
*We received 162 (63%) responses, with an average length of 35 words (median=23.5; SD=35).
15. Code Definition Example Statements
Public Data Only using public data / public data
being okay to collect and analyze
In general, I feel that what is posted online is a
matter of public record, though every case needs to
be looked at individually in order to evaluate the
ethical risks.
Do No Harm Comments related to Golden Rule Golden rule, do to others what you’d have them do to
you.
Informed
Consent
Always get informed consent /
stressing importance of informed
consent
I think at this point for any new study I started using
online data, I would try to get informed consent when
collecting identifiable information (e.g. usernames).
Greater
Good
Data collection should have a
social benefit
The work I do should address larger social
challenges, and not just offer incremental
improvements for companies to deploy.
Established
Guidelines
Including Belmont Report, IRBs
Terms of Service, legal
frameworks, community norms
I generally follow the ethical guidelines for human
subjects research as reflected in the Belmont Report
and codified in 45.CFR.46 when collecting online
data.
Risks vs.
Benefits
Discussion of weighing potential
harms and benefits or gains
I think I focus on potential harm, and all the ethical
procedures I put in place work towards minimizing
potential harm.
Protect
Participants
data aggregation, deleting PII,
anonymizing / obfuscating data
I aggregate unique cases into larger categories
rather than removing them from the data set.
Data
Judgments
Efforts to not make inferences or
judge participants or data
Do not expose users to the outside world by inferring
features that they have not personally disclosed.
Transparenc
y
Contact with participants or
methods of informing participants
about research
I prefer to engage individual participants in the data
collection process, and to provide them with explicit
information about data collection practices.
16. Item M SD
...notify participants about why they’re collecting online data1 3.89 0.96
...share research results with research subjects1 3.90 0.80
...Ask colleagues about their research ethics practices1 4.27 0.74
...Ask their IRB/internal reviews for advice about research ethics1 4.03 0.90
...Think about possible edge cases/outliers when designing
studies1
4.33 0.71
...Only collect online data when the benefits outweigh the potential
harms1
3.62 1.10
...Remove individuals from datasets upon their request1 4.56 0.71
Researchers should be held to a higher ethical standard than
others who use online data2
3.46 1.22
I think about ethics a lot when I'm designing a new research
project2
3.96 0.93
Full Scale (α=.71) 4.00 0.49
1 Prompt: “I think researchers should....”
2 Prompt: “To what extent do you agree with the following statements?”
Both sets of items were measured on five point, Likert-type scales (Strongly Agree-Strongly Disagree).
Codification of Ethical Attitudes Measure
17. Independent Variables ß (t-test) p
Sex: Female .01 (.13) .90
Academic Degree
Social Science (1)
Computer Science (1)
-.06 (-.72)
-.13 (-1.45)
.48
.15
Field of Work
Academia (1)
Industry (1)
-.16 (-1.23)
-.31 (-2.43)
.22
.02
Level of Education .01 (.07) .94
Location: U.S. -.01 (-.20) .84
F-test 1.56 .15
Adjusted R2 .02 --
OLS Regression for Codified Ethics
*Industry workers’ average agreement with the codified ethics scale (M=3.75) was significantly lower than
non-profit/policy workers (M=4.11). Academics (M=4.02) were not significantly different from either group.
18. Low variance (<.8) High variance (> 1.2)Agreement(>3.5)
Remove subjects from datasets
upon request1
Ask (1) colleagues or (2) IRBs
about research1
Share results with participants1
Think about edge
cases/outliers1
Use non-representative
samples2
Remove unique individuals
before sharing1
Researchers can’t collect large-
scale online data if consent is
required
Disagreement(<3)
No items.
Notes:
1 “I think researchers should…”
2 “It’s permissible for researchers to…”
Ignore ToS when necessary2
Deceive participants2
Share raw data with key
stakeholders1
It’s possible to obtain informed
consent with large-scale studies
Comparing level of agreement and item variable across the
corpus
19. Key Takeaways
Context matters. A lot.
Dependencies between
ethical principles.
Beliefs & norms are often
in flux.
Disciplinary differences?
The data suggest no.
Such assumptions risk
talking past each other
rather than learning from
others’ experiences
20. Ethics Heuristics for Online Data
Research: Beyond the Belmont
Report
1. Focus on transparency
Openness about data collection
Sharing results with community
leaders or research subjects
2. Data minimization
Collecting only what you need to
answer an RQ
Letting individuals opt out
Sharing data at aggregate levels
3. Increased caution in sharing results
4. Respect the norms of the contexts in which online
data was generated.
21. Future directions for this
research
This is first of three surveys by our research group. Each
survey focuses on a different stakeholder:
1. Researchers (this study)
2. IRBs (paper in progress)
3. Users (study in development)
Other avenues for discussion and action:
workshops (e.g., CSCW 2015, SIGCOM 2015)
Panels (e.g., iConference 2016)
Working groups/meetings with policymakers (e.g., Future of
Privacy Forum, December 2015)