This document discusses the importance of reproducibility in human computation tasks. It argues that for the results of human computation to be meaningful and informative, they must be reproducible by different human contributors working independently. Reproducibility ensures the results are not due to chance and can reflect the underlying properties of the task. The document outlines sources of variability in human judgments and draws similarities between human computation and content analysis in behavioral sciences, where reproducibility is crucial. It suggests ensuring reproducibility through clear task design and measurement.
This document summarizes a research paper that proposes a new adaptive ensemble boosting classifier for handling concept drifting stream data. It introduces an approach that uses adaptive sliding windows and Hoeffding Trees with naive Bayes as the base learner. The results showed that the proposed algorithm worked well in changing environments compared to other ensemble classifiers. It also discussed types of concept drift in streaming data, including noise, blips, abrupt changes, and gradual changes.
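The adaptive sliding-window idea behind such drift detectors can be sketched in a few lines. The toy detector below is an illustration, not the paper's algorithm: it keeps a sliding window of recent values (assumed to lie in [0, 1]) and flags drift when the means of the older and newer halves differ by more than a Hoeffding-style bound, loosely in the spirit of ADWIN.

```python
import math
from collections import deque

class WindowDriftDetector:
    """Toy sliding-window drift detector (illustrative sketch only).

    Flags drift when the means of the older and newer halves of the
    window differ by more than a Hoeffding-style threshold, assuming
    the monitored values lie in [0, 1]."""

    def __init__(self, max_window=100, delta=0.002):
        self.window = deque(maxlen=max_window)
        self.delta = delta

    def add(self, x):
        """Feed one value; return True if drift is detected."""
        self.window.append(x)
        n = len(self.window)
        if n < 10:
            return False
        half = n // 2
        old = list(self.window)[:half]
        new = list(self.window)[half:]
        m_old = sum(old) / len(old)
        m_new = sum(new) / len(new)
        # harmonic window size and Hoeffding-style cut threshold
        m = 1.0 / (1.0 / len(old) + 1.0 / len(new))
        eps = math.sqrt((1.0 / (2.0 * m)) * math.log(4.0 / self.delta))
        if abs(m_old - m_new) > eps:
            self.window.clear()  # restart the window after a detection
            return True
        return False
```

On a stream whose mean jumps abruptly (say from 0.2 to 0.8), the detector stays quiet during the stable phase and fires shortly after the change, mirroring the abrupt-change case listed above.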
Structuring Ideation Map using Oriented Directed Acyclic Graph with Privacy P... (IDES Editor)
E-Brainstorming is a computerized version of sharing ideas that replaces verbal communication. The productivity of the ideas generated is viewed as the dominant measure of E-Brainstorming. In Agent-based E-Brainstorming, an Idea Ontology was used to map users' knowledge with idea names and the relationships between idea instances. In this paper, an Oriented Directed Acyclic Graph (ODAG) method is used to construct the ideation map for diverse ideas and their relationships. A Privacy Preference Ontology is integrated to provide privacy preferences for users' data, such as access control, condition, access space and restriction. Here the Idea Knowledge Base is applied; it enfolds a collection of idea instances from different domains to represent a client's knowledge.
This document discusses knowledge discovery in databases (KDD) through the LON-CAPA online educational system. It defines KDD and data mining, describing the tasks, methods, and applications of KDD. The goals are to obtain predictive models of students, help students and instructors use resources more effectively, and provide information to increase student learning. It then discusses the KDD process and data mining methods like classification, clustering, and dependency modeling that can be applied to discover knowledge from educational data.
It is well-known that SRPT is optimal for minimizing flow time on machines that run one job at a time. However, running one job at a time badly under-utilizes modern systems, where sharing, simultaneous execution, and virtualization-enabled consolidation are a common trend to boost utilization. Such machines, used in modern large data centers and clouds, are powerful enough to run multiple jobs/VMs at a time subject to overall CPU, memory, network, and disk capacity constraints.
Motivated by this pr
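For intuition on the baseline being generalized, single-machine SRPT can be simulated directly: always run the admitted job with the shortest remaining processing time, preempting when a new arrival has less remaining work. This is a minimal sketch of the textbook policy, not code from the paper.

```python
import heapq

def srpt_total_flow_time(jobs):
    """Total flow time (completion time minus arrival time, summed over
    all jobs) under preemptive SRPT on a single one-job-at-a-time machine.

    jobs: list of (arrival_time, processing_time) tuples."""
    events = sorted(jobs)                       # process arrivals in order
    heap, t, i, total, n = [], 0.0, 0, 0.0, len(jobs)
    while i < n or heap:
        if not heap:                            # machine idle: jump ahead
            t = max(t, events[i][0])
        while i < n and events[i][0] <= t:      # admit all arrived jobs
            heapq.heappush(heap, [events[i][1], events[i][0]])
            i += 1
        rem, arr = heap[0]                      # shortest remaining job
        horizon = events[i][0] if i < n else float("inf")
        run = min(rem, horizon - t)             # run until done or arrival
        t += run
        heap[0][0] -= run                       # decreasing the min keeps
        if heap[0][0] <= 1e-12:                 # the heap invariant valid
            heapq.heappop(heap)
            total += t - arr
    return total
```

On jobs [(0, 10), (1, 2)], SRPT preempts the long job for the short one, giving a total flow time of 14, versus 21 for run-to-completion FCFS.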
Optimizing Budget Constrained Spend in Search Advertising (Sunny Kr)
Search engine ad auctions typically have a significant fraction of advertisers who are budget constrained, i.e., if allowed to participate in every auction that they bid on, they would spend more than their budget. This yields an important problem: selecting the ad auctions in which these advertisers participate, in order to optimize different system
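One common baseline for this selection problem is probabilistic throttling: if unconstrained spend is forecast to exceed the budget, enter each auction independently with probability budget / forecast, so expected spend matches the budget. The sketch below illustrates that baseline only; it is not the paper's algorithm, and the function name and parameters are illustrative.

```python
import random

def throttled_spend(auction_costs, budget, forecast_spend, seed=0):
    """Probabilistic throttling sketch (illustrative, not the paper's
    method): participate in each auction with probability
    p = budget / forecast_spend, with a hard cap at the budget.

    auction_costs: cost incurred if the advertiser enters that auction."""
    p = min(1.0, budget / forecast_spend)
    rng = random.Random(seed)                 # deterministic for the demo
    spend = 0.0
    for cost in auction_costs:
        # enter with probability p, and never exceed the budget
        if rng.random() < p and spend + cost <= budget:
            spend += cost
    return spend
```

With 1000 unit-cost auctions, a budget of 300, and a forecast of 1000, each auction is entered with probability 0.3, so spend lands near (and never above) the budget.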
We present a linear regression method for predictions on a small data set making use of a second, possibly biased data set that may be much larger. Our method fits linear regressions to the two data sets while penalizing the difference between predictions made by those two models.
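A one-feature, no-intercept version of such a coupled fit has a closed form: setting the gradients of the penalized objective sum_1 (a x - y)^2 + sum_2 (b x - y)^2 + lam (a - b)^2 to zero gives a 2x2 linear system in the two slopes. This is an illustrative sketch under those simplifying assumptions, not the authors' estimator.

```python
def coupled_slopes(data1, data2, lam):
    """Fit y ~ a*x on data1 and y ~ b*x on data2 (single feature, no
    intercept) while penalizing (a - b)^2, i.e. minimize
        sum_1 (a*x - y)^2 + sum_2 (b*x - y)^2 + lam * (a - b)^2.
    The stationarity conditions are the 2x2 linear system
        (S1 + lam) a - lam b = T1
        -lam a + (S2 + lam) b = T2
    with S = sum(x^2) and T = sum(x*y) per data set; solve by Cramer."""
    S1 = sum(x * x for x, _ in data1); T1 = sum(x * y for x, y in data1)
    S2 = sum(x * x for x, _ in data2); T2 = sum(x * y for x, y in data2)
    det = (S1 + lam) * (S2 + lam) - lam * lam
    a = ((S2 + lam) * T1 + lam * T2) / det
    b = (lam * T1 + (S1 + lam) * T2) / det
    return a, b
```

With lam = 0 the two fits decouple into ordinary least squares on each set; as lam grows, both slopes shrink toward the pooled least-squares slope, which is the trade-off the abstract describes.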
Times after the Concord stagecoach was on display (Sunny Kr)
The document summarizes recent events and developments at the San Diego Historical Society:
1) The Society opened a new exhibition called "Nikkei Youth Culture: Past, Present, Future" in their newly dedicated Youth Gallery, focusing on the experiences of Japanese American youth over time.
2) They also advanced the second phase of their core exhibition "Place of Promise" highlighting the diverse histories that formed the foundation of San Diego. Artifacts like a stagecoach and quilts will be featured.
3) The Balboa Art Conservation Center is partnering with the Society to evaluate collection needs and make recommendations for funding conservation, storage, and management efforts.
Algorithmic entropy can be seen as a special case of entropy as studied in
statistical mechanics. This viewpoint allows us to apply many techniques
developed for use in thermodynamics to the subject of algorithmic information theory. In particular, suppose we fix a universal prefix-free Turing
Assessing Problem Solving In Expert Systems Using Human Benchmarking (Claire Webber)
This document summarizes a study that assessed problem solving ability in an expert scheduling system called GATES by comparing its performance to that of humans on scheduling tasks. The study administered easy and difficult scheduling tasks to undergraduate students, graduate students, and air traffic controllers to represent different levels of problem solving ability. Process measures in the form of metacognitive questionnaires and think-aloud protocols were also used. The goal was to develop a "human benchmarking" scale to evaluate how the problem solving ability of GATES compared to humans. Results from previous studies found GATES outperformed community college students but a wider range of human participants was needed to better define the scale.
Chao Wrote Some trends that influence human resource are, Leade.docx (sleeperharwell)
Chao Wrote:
Some trends that influence human resources are Leadership Development and Learning Opportunities, Data and Analytics, Compliance and Regulation, Controlling and Containing Costs, and More Competition for Talent. The one I consider most important is leadership development and learning opportunities, because companies that give employees the opportunity to learn and grow through leadership training show employees that the company wants them to be more engaged. This kind of program can also help nurture leadership abilities and professional development. The other trend I think plays a very important role is knowing the compliance and regulations, because compliance and regulations change all the time, and companies need to be proactive and make changes whenever new compliance or regulation updates arrive. For this, many companies turn to technology solutions to minimize the costs and resources devoted to this task, freeing up HR professionals to focus on other aspects of their work. Some strategic resource examples include recruitment, learning and development, compensation, and performance appraisal.
Quane Wrote:
Hi Dr. Clark and Classmates,
Through my assigned reading for week 1, I've learned that one-third of large U.S. businesses selected non-Human Resources managers to operate in top-tier executive positions. Nevertheless, the most successful Human Resources executives do have prior Human Resources experience, so for the select few managers without a Human Resources background, the opportunity to serve as a Human Resources executive will increase their probability of successful career progression. The new tentative transition for businesses is to outsource the majority of their Human Resources operational needs to large Human Resources firms that service multiple businesses. Many frequently utilized services will be offered to employees online in order to address the increased demand for specialized Human Resources services as well as shorten response times and increase efficiency.
Strategic Human Resource Management is the process of determining ways to evaluate an organization's unique Human Resources needs and create a plan that facilitates the establishment and maintenance of efficient personnel management systems supporting the short-term and long-term functionality and sustained growth of an organization.
Exercise 8 - Case Study Research
Develop a hypothetical research scenario that would warrant the application of the case study.
What type of approach within the qualitative method would be used? Why or why not?
Exercise 9 - Perspectives in Qualitative Methods
Develop a hypothetical research scenario that would warrant the application of the ethnographic, narrative or phenomenological approach.
What type of design would be best utilized along with this approach?
Exercise 10 - Factors in Mixed Methods Research
What are the strengths.
Meetup 22/2/2018 - Artificiële Intelligentie & Human Resources (Digipolis Antwerpen)
The document discusses how artificial intelligence (AI) is impacting and will continue to impact human resources (HR) and the future of work. It describes how AI can automate certain cognitive processes like communicating, storing and retrieving information, sorting, filtering, pattern recognition, and decision making. The document also discusses using neural networks to generate inferential roles or predictions about what sentences are likely to follow from a given sentence to understand semantics. It provides an example of a model trained on a natural language inference dataset that is able to generate simple inferences. The document argues that understanding language involves drawing inferences, so inference should be at the core of models of semantic cognition.
The document discusses the differences between machine learning (ML), statistical learning, data mining (DM), and automated learning (AL). It argues that while ML and statistical learning developed similar techniques starting in the 1960s, DM emerged in the 1990s from a merging of database research and automated learning. However, industry was much more enthusiastic about adopting DM techniques compared to AL techniques, even though many DM systems are just friendly interfaces of AL systems. The document aims to explain the key differences between DM and AL that led to DM's greater commercial success.
An Instrument To Support Thinking Critically About Critical Thinking In Onlin... (Daniel Wachtel)
The document describes the creation of an instrument to measure critical thinking in online discussions. It reviewed 4 existing models of critical thinking and identified common elements and differences between them. It then analyzed and synthesized the models to develop an instrument that operationalizes critical thinking in a valid way based on key cognitive processes. The instrument was tested on an online discussion transcript and showed promise in identifying critical thinking, though practical issues remain to be addressed.
Intention recognition for dynamic role exchange in haptic... (أحلام انصارى)
1) The paper summarizes an experimental study that investigated the utility of a role exchange mechanism in human-computer collaboration through haptic interaction.
2) In this framework, a human dynamically interacts with a computer partner by communicating through the haptic channel to trade control levels on a shared task.
3) The study examined conditions of equal control, role exchange, and visuo-haptic cues, analyzing effects on task performance, efficiency, and subjective measures.
This document discusses semantic networks and their applications in artificial intelligence. It begins with describing how semantic networks originated in psychology for analyzing relationships between concepts, and were later adopted by AI to analyze textual data. Key applications of semantic networks discussed include idea generation, visual text analysis, and conceptual design. The document also provides historical background on semantic networks and how they are used for knowledge representation.
What is Human Centred Design - The Design Journal (Joseph Giacomin)
1) The document discusses the definition and practice of human-centered design. It defines human-centered design as an approach that focuses on understanding people's needs, desires, and experiences through techniques like empathy, scenarios, and personas in order to design intuitive products and services.
2) It proposes a model of human-centered design as a pyramid with physical and perceptual characteristics at the base and meaning and purpose at the apex. The model suggests that designs addressing higher-level questions can offer more value and opportunities for success.
3) The document argues that while early approaches focused on usability, modern human-centered design also considers emotional engagement and can define new meanings through interactions with people.
Developing of climate data for building simulation with future weather condit... (Rasmus Madsen)
Today, climate models are used frequently to describe past, current or future climate conditions, in particular for building simulation. A research study of how future climate change will affect the indoor environment and buildings' energy use in a Danish context has been conducted. To fulfil this research study, information on how climate models are developed is needed as well. The research study includes an objective descriptive approach drawing on both Danish and global research on the given topic. The information gathered from the publications is evaluated with respect to indicators of the quality of the journals as well as the authors. The method used for development of the Danish design reference year is not clear, and to have full knowledge of how climate change will affect building simulation in a Danish context, further research is needed. The development of a new Danish weather file will require both descriptive and analytical research.
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021... (MITAILibrary)
The document provides a review of machine learning interpretability methods. It begins with an introduction to explainable artificial intelligence and a discussion of key concepts like interpretability and explainability. It then presents a taxonomy of interpretability methods that are divided into four main categories: methods for explaining black-box models, creating white-box models, promoting fairness, and analyzing model sensitivity. Specific machine learning interpretability techniques are summarized within each category.
Rapid Prototyping for Instructional Design over Time (Jean Mullins)
This document provides an overview of the evolution of rapid prototyping as an instructional design methodology from 1994 to 2015. It summarizes key papers that explored using rapid prototyping for instructional design and evaluated its effectiveness. The papers found that rapid prototyping could produce high-quality instructional materials in less time than traditional instructional design models by involving stakeholders throughout the iterative design process. However, rapid prototyping required attention to instructional design principles.
This document provides an overview of the steps involved in quantitative data analysis and applying Partial Least Square Structural Equation Modeling (PLS-SEM). It discusses pretesting questionnaires, preparing raw data through editing and coding, assessing validity through measures like content validity and unidimensionality, and establishing construct validity through convergent and discriminant validity techniques. The goal is to review all the necessary steps for quantitative data analysis using SPSS and applying SEM, from preparing the data to reporting the results.
Knowledge Management in Software Enterprise (IOSR Journals)
This document discusses knowledge management in software enterprises. It begins by defining key terms like data, information, knowledge, tacit knowledge and explicit knowledge. It then discusses how knowledge management is important for software engineering processes. Effective knowledge management can streamline processes and improve quality. The document also examines different approaches to knowledge management, like plan-based versus agile methods. Finally, it presents a framework for knowledge management in enterprises, including knowledge creation, acquisition, organization, distribution, and application to create business value.
A theoretical model of differential social attributions toward computing tech... (UltraUploader)
This document proposes a theoretical model to explain how and why people attribute human characteristics to computing technologies. It discusses how the common practice of anthropomorphism, or ascribing human traits to computers, can have both positive and negative effects. The model suggests that an individual's perspective on the computer exists on a continuum between viewing it as a locally simplex tool or a globally complex entity. One's position is influenced by the social cues of the technology, the person's self-evaluation, the interaction context, and available attributional cues. The goal is to better understand this phenomenon and guide future research on social interactions with technology.
Why big data didn't end causal inference by Totte Harinen at Big Data Spain 2017 (Big Data Spain)
Ten years ago there were rumours of the death of causal inference. Big data was supposed to enable us to rely on purely correlational data to predict and control the world.
https://www.bigdataspain.org/2017/talk/why-big-data-didnt-end-causal-inference
Big Data Spain 2017
November 16th - 17th Kinépolis Madrid
An Iterative Model as a Tool in Optimal Allocation of Resources in University... (Dr. Amarjeet Singh)
In this paper, a study was carried out to aid in the adequate allocation of resources in the College of Natural Sciences, TYZ University (not its real name, for ethical reasons). Questionnaires were administered to the high-ranking officials of one of the Colleges, the College of Pure and Applied Sciences, to examine how resources were allocated over three consecutive sessions (2009/2010, 2010/2011 and 2011/2012). The data gathered and analysed were then used to generate contributory inputs for the three basic output variables formed for the purpose of the study: x1 represents the quality of graduates produced; x2 stands for research papers, seminars, journal articles, etc. published by faculties; and x3 denotes service delivery within the three sessions under study. The Simplex Method of Linear Programming was used to solve the model formulated.
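The Simplex Method itself can be illustrated with a compact tableau implementation for problems in the standard form max c·x subject to A x <= b, x >= 0. This is a sketch of the textbook algorithm, not the paper's specific resource-allocation model.

```python
def simplex_max(c, A, b):
    """Tableau simplex: maximize c.x subject to A x <= b, x >= 0,
    assuming b >= 0 so the all-slack basis is initially feasible.
    Returns (optimal value, optimal x)."""
    m, n = len(A), len(c)
    # tableau columns: n decision variables, m slacks, then the RHS
    T = [list(map(float, A[i])) +
         [1.0 if j == i else 0.0 for j in range(m)] + [float(b[i])]
         for i in range(m)]
    T.append([-float(ci) for ci in c] + [0.0] * (m + 1))  # objective row
    basis = [n + i for i in range(m)]                      # slack basis
    while True:
        col = min(range(n + m), key=lambda j: T[-1][j])    # entering var
        if T[-1][col] >= -1e-9:                            # optimal
            break
        ratios = [(T[i][-1] / T[i][col], i)
                  for i in range(m) if T[i][col] > 1e-9]   # min-ratio test
        if not ratios:
            raise ValueError("problem is unbounded")
        _, row = min(ratios)                               # leaving row
        piv = T[row][col]
        T[row] = [v / piv for v in T[row]]                 # normalize pivot
        for r in range(m + 1):                             # eliminate col
            if r != row and abs(T[r][col]) > 1e-12:
                f = T[r][col]
                T[r] = [a - f * p for a, p in zip(T[r], T[row])]
        basis[row] = col
    x = [0.0] * n
    for i, bv in enumerate(basis):                         # read solution
        if bv < n:
            x[bv] = T[i][-1]
    return T[-1][-1], x
```

On the classic textbook problem max 3x + 5y subject to x <= 4, 2y <= 12, 3x + 2y <= 18, this returns the optimum 36 at (x, y) = (2, 6).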
Invited talk: Using Social Media and Mobile Devices to Mediate Informal, Professional, Work-Based Learning
John Cook
Bristol Centre for Research
in Lifelong Learning and Education (BRILLE)
University of the West of England (UWE)
http://www.uwe.ac.uk/research/brille/
http://people.uwe.ac.uk/Pages/person.aspx?accountname=campus\jn-cook
Invited talk: Centre for Learning, Knowing and Interactive Technologies, Graduate School of Education, University of Bristol
26th February, 12.30 to 13.45
UNIVERSAL ACCOUNT NUMBER (UAN) Manual
Advanced Tactics and Engagement Strategies for Google+ (Sunny Kr)
Succeeding Liam Walsh is Dan Petrovic, the CEO of Dejan SEO and a well-known search engine specialist. A well-travelled and seasoned presenter, Dan brought a myriad of underutilised Google+ capabilities to our attention.
Assessing Problem Solving In Expert Systems Using Human BenchmarkingClaire Webber
This document summarizes a study that assessed problem solving ability in an expert scheduling system called GATES by comparing its performance to that of humans on scheduling tasks. The study administered easy and difficult scheduling tasks to undergraduate students, graduate students, and air traffic controllers to represent different levels of problem solving ability. Process measures in the form of metacognitive questionnaires and think-aloud protocols were also used. The goal was to develop a "human benchmarking" scale to evaluate how the problem solving ability of GATES compared to humans. Results from previous studies found GATES outperformed community college students but a wider range of human participants was needed to better define the scale.
Chao Wrote Some trends that influence human resource are, Leade.docxsleeperharwell
Chao Wrote:
Some trends that influence human resource are, Leadership Development and Learning Opportunities, Data and Analytics, Compliance and Regulation, Controlling and Containing Costs, and More Competition for Talent. But the one that I like and think its much important is leadership development and learning opportunity because in this role, companies give the employees the opportunity to learn and grow with the leadership training and this will show employees that the company wants employee to be more engage. Plus, this kind of program can also help nurture leadership abilities and professional development. The other trend I think that plays a very important role is knowing the compliance and regulations because in this area, compliance and regulation changes all the time and companies need to be more pro-active and make changes as they have updates with any new compliance or regulations. For this, many companies turn to technology solutions to minimize the costs and resources devoted to this task, freeing up HR professionals to focus on other aspects of their work. Some strategic resource examples include recruitment, learning and development, compensation, and performance appraisal.
Quane Wrote:
Hi Dr. Clark and Classmates,
Through my assigned reading for week 1, I've learned that one-third of large U.S. businesses selected non-Human Resources managers to operate in top tier executive positions. Consequently, the most successful Human Resource executive do have prior Human Resources experience so for the select few managers without a Human Resource background that get the opportunity to serve in a Human Resource executive will increase their probability of successful career progression. The new tentative transition for businesses is to outsource the majority of their Human Resource operational needs to large Human Resource firms that service multiple businesses. Many frequently utilized services will be offered to employees online in order to address the increased demand for specialized Human Resource services as well as shorten response times and increase efficiency.
Strategic Human Resource Management is the process of determining ways to evaluate an organization's unique Human Resources need and create a plan that facilitates the establishment and maintenance of efficient personnel management systems that support the short term and long term functionality and sustained growth of an organization.
Exercise 8 - Case Study Research
Develop a hypothetical research scenario that would warrant the application of the case study.
What type of approach within the qualitative method would be used? Why or why not?
Exercise 9 - Perspectives in Qualitative Methods
Develop a hypothetical research scenario that would warrant the application of the ethnographic, narrative or phenomenological approach.
What type of design would be best utilized along with this approach?
Exercise 10 - Factors in Mixed Methods Research
What are the strengths.
Chao Wrote Some trends that influence human resource are, Leade.docxketurahhazelhurst
Chao Wrote:
Some trends that influence human resource are, Leadership Development and Learning Opportunities, Data and Analytics, Compliance and Regulation, Controlling and Containing Costs, and More Competition for Talent. But the one that I like and think its much important is leadership development and learning opportunity because in this role, companies give the employees the opportunity to learn and grow with the leadership training and this will show employees that the company wants employee to be more engage. Plus, this kind of program can also help nurture leadership abilities and professional development. The other trend I think that plays a very important role is knowing the compliance and regulations because in this area, compliance and regulation changes all the time and companies need to be more pro-active and make changes as they have updates with any new compliance or regulations. For this, many companies turn to technology solutions to minimize the costs and resources devoted to this task, freeing up HR professionals to focus on other aspects of their work. Some strategic resource examples include recruitment, learning and development, compensation, and performance appraisal.
Quane Wrote:
Hi Dr. Clark and Classmates,
Through my assigned reading for week 1, I've learned that one-third of large U.S. businesses selected non-Human Resources managers to operate in top-tier executive positions. That said, the most successful Human Resources executives do have prior Human Resources experience, so for the select few managers without a Human Resources background who get the opportunity to serve as a Human Resources executive, the experience will increase their probability of successful career progression. The tentative new transition for businesses is to outsource the majority of their Human Resources operational needs to large Human Resources firms that service multiple businesses. Many frequently utilized services will be offered to employees online in order to address the increased demand for specialized Human Resources services as well as shorten response times and increase efficiency.
Meetup 22/2/2018 - Artificiële Intelligentie & Human ResourcesDigipolis Antwerpen
The document discusses how artificial intelligence (AI) is impacting and will continue to impact human resources (HR) and the future of work. It describes how AI can automate certain cognitive processes like communicating, storing and retrieving information, sorting, filtering, pattern recognition, and decision making. The document also discusses using neural networks to generate inferential roles or predictions about what sentences are likely to follow from a given sentence to understand semantics. It provides an example of a model trained on a natural language inference dataset that is able to generate simple inferences. The document argues that understanding language involves drawing inferences, so inference should be at the core of models of semantic cognition.
The document discusses the differences between machine learning (ML), statistical learning, data mining (DM), and automated learning (AL). It argues that while ML and statistical learning developed similar techniques starting in the 1960s, DM emerged in the 1990s from a merging of database research and automated learning. However, industry was much more enthusiastic about adopting DM techniques compared to AL techniques, even though many DM systems are just friendly interfaces of AL systems. The document aims to explain the key differences between DM and AL that led to DM's greater commercial success.
An Instrument To Support Thinking Critically About Critical Thinking In Onlin...Daniel Wachtel
The document describes the creation of an instrument to measure critical thinking in online discussions. It reviewed 4 existing models of critical thinking and identified common elements and differences between them. It then analyzed and synthesized the models to develop an instrument that operationalizes critical thinking in a valid way based on key cognitive processes. The instrument was tested on an online discussion transcript and showed promise in identifying critical thinking, though practical issues remain to be addressed.
Intention recognition for dynamic role exchange in hapticأحلام انصارى
1) The paper summarizes an experimental study that investigated the utility of a role exchange mechanism in human-computer collaboration through haptic interaction.
2) In this framework, a human dynamically interacts with a computer partner by communicating through the haptic channel to trade control levels on a shared task.
3) The study examined conditions of equal control, role exchange, and visuo-haptic cues, analyzing effects on task performance, efficiency, and subjective measures.
This document discusses semantic networks and their applications in artificial intelligence. It begins with describing how semantic networks originated in psychology for analyzing relationships between concepts, and were later adopted by AI to analyze textual data. Key applications of semantic networks discussed include idea generation, visual text analysis, and conceptual design. The document also provides historical background on semantic networks and how they are used for knowledge representation.
What is Human Centred Design - The Design JournalJoseph Giacomin
1) The document discusses the definition and practice of human-centered design. It defines human-centered design as an approach that focuses on understanding people's needs, desires, and experiences through techniques like empathy, scenarios, and personas in order to design intuitive products and services.
2) It proposes a model of human-centered design as a pyramid with physical and perceptual characteristics at the base and meaning and purpose at the apex. The model suggests that designs addressing higher-level questions can offer more value and opportunities for success.
3) The document argues that while early approaches focused on usability, modern human-centered design also considers emotional engagement and can define new meanings through interactions with people.
Developing of climate data for building simulation with future weather condit...Rasmus Madsen
Today, climate models are frequently used to describe past, current or future climate conditions, in particular for building simulation. A research study of how future climate change will affect the future indoor environment and buildings' energy use in a Danish context has been conducted. To fulfil this research study, information on how climate models are developed is needed as well. The research study includes an objective descriptive approach from both Danish and global research on the given topic. The gathered information from the publications is evaluated with respect to indicators for the quality of the journals as well as the authors. The method used for development of the Danish design reference year is not clear, and to have full knowledge of how climate change will affect building simulation in a Danish context, further research is needed. This research for development of a new Danish weather file will require both descriptive and analytical research.
DALL-E 2 - OpenAI imagery automation first developed by Vishal Coodye in 2021...MITAILibrary
The document provides a review of machine learning interpretability methods. It begins with an introduction to explainable artificial intelligence and a discussion of key concepts like interpretability and explainability. It then presents a taxonomy of interpretability methods that are divided into four main categories: methods for explaining black-box models, creating white-box models, promoting fairness, and analyzing model sensitivity. Specific machine learning interpretability techniques are summarized within each category.
Rapid Prototyping for Instructional Designover TimeJean Mullins
This document provides an overview of the evolution of rapid prototyping as an instructional design methodology from 1994 to 2015. It summarizes key papers that explored using rapid prototyping for instructional design and evaluated its effectiveness. The papers found that rapid prototyping could produce high-quality instructional materials in less time than traditional instructional design models by involving stakeholders throughout the iterative design process. However, rapid prototyping required attention to instructional design principles.
This document provides an overview of the steps involved in quantitative data analysis and applying Partial Least Square Structural Equation Modeling (PLS-SEM). It discusses pretesting questionnaires, preparing raw data through editing and coding, assessing validity through measures like content validity and unidimensionality, and establishing construct validity through convergent and discriminant validity techniques. The goal is to review all the necessary steps for quantitative data analysis using SPSS and applying SEM, from preparing the data to reporting the results.
Knowledge Management in Software EnterpriseIOSR Journals
This document discusses knowledge management in software enterprises. It begins by defining key terms like data, information, knowledge, tacit knowledge and explicit knowledge. It then discusses how knowledge management is important for software engineering processes. Effective knowledge management can streamline processes and improve quality. The document also examines different approaches to knowledge management, like plan-based versus agile methods. Finally, it presents a framework for knowledge management in enterprises, including knowledge creation, acquisition, organization, distribution, and application to create business value.
A theoretical model of differential social attributions toward computing tech...UltraUploader
This document proposes a theoretical model to explain how and why people attribute human characteristics to computing technologies. It discusses how the common practice of anthropomorphism, or ascribing human traits to computers, can have both positive and negative effects. The model suggests that an individual's perspective on the computer exists on a continuum between viewing it as a locally simplex tool or a globally complex entity. One's position is influenced by the social cues of the technology, the person's self-evaluation, the interaction context, and available attributional cues. The goal is to better understand this phenomenon and guide future research on social interactions with technology.
Why big data didn’t end causal inference by Totte Harinen at Big Data Spain 2017Big Data Spain
Ten years ago there were rumours of the death of causal inference. Big data was supposed to enable us to rely on purely correlational data to predict and control the world.
https://www.bigdataspain.org/2017/talk/why-big-data-didnt-end-causal-inference
Big Data Spain 2017
November 16th - 17th Kinépolis Madrid
An Iterative Model as a Tool in Optimal Allocation of Resources in University...Dr. Amarjeet Singh
In this paper, a study was carried out to aid in the adequate allocation of resources in the College of Natural Sciences, TYZ University (not its real name, for ethical reasons). Questionnaires were administered to the high-ranking officials of one of the Colleges, the College of Pure and Applied Sciences, to examine how resources were allocated over three consecutive sessions (2009/2010, 2010/2011 and 2011/2012). The data gathered were then analysed to generate contributory inputs for the three basic outputs (variables) formed for the purpose of the study. These variables are: x1, the quality of graduates produced; x2, research papers, seminars, journal articles, etc. published by faculties; and x3, service delivery within the three sessions under study. The Simplex Method of Linear Programming was used to solve the model formulated.
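The Simplex Method mentioned above is easy to sketch for small problems. Below is a minimal tableau implementation for maximization in standard form; the example coefficients at the bottom are invented for illustration and are not the study's actual model.

```python
def simplex_max(c, A, b):
    """Tableau simplex for: maximize c.x subject to A x <= b, x >= 0,
    assuming all b[i] >= 0 (so the slack basis is feasible).
    No anti-cycling rule -- a sketch, not production code."""
    m, n = len(A), len(c)
    # Tableau rows: constraints with slack columns, then the objective row.
    T = [A[i][:] + [1.0 if j == i else 0.0 for j in range(m)] + [b[i]]
         for i in range(m)]
    T.append([-ci for ci in c] + [0.0] * (m + 1))
    basis = list(range(n, n + m))          # slacks start in the basis
    while True:
        col = min(range(n + m), key=lambda j: T[-1][j])
        if T[-1][col] >= -1e-9:
            break                          # no improving column: optimal
        # Minimum-ratio test picks the leaving row.
        ratios = [(T[i][-1] / T[i][col], i) for i in range(m) if T[i][col] > 1e-9]
        if not ratios:
            raise ValueError("LP is unbounded")
        _, row = min(ratios)
        piv = T[row][col]
        T[row] = [v / piv for v in T[row]]
        for i in range(m + 1):
            if i != row:
                f = T[i][col]
                T[i] = [T[i][j] - f * T[row][j] for j in range(n + m + 1)]
        basis[row] = col
    x = [0.0] * n
    for i, bv in enumerate(basis):
        if bv < n:
            x[bv] = T[i][-1]
    return x, T[-1][-1]

# Hypothetical allocation: maximize 3*x1 + 2*x2 under two resource limits.
x, value = simplex_max([3.0, 2.0], [[1.0, 1.0], [1.0, 3.0]], [4.0, 6.0])
# x == [4.0, 0.0], value == 12.0
```

In practice a vetted solver (e.g. SciPy's linprog) is preferable; the point here is only to show the pivot mechanics behind the method the paper names.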
Invited talk: Using Social Media and Mobile Devices to Mediate Informal, Professional, Work-Based Learning
John Cook
Bristol Centre for Research
in Lifelong Learning and Education (BRILLE)
University of the West of England (UWE)
http://www.uwe.ac.uk/research/brille/
http://people.uwe.ac.uk/Pages/person.aspx?accountname=campus\jn-cook
Invited talk: Centre for Learning, Knowing and Interactive Technologies, Graduate School of Education, University of Bristol
26th February, 12.30 to 13.45
Similar to Human Computation Must Be Reproducible (20)
Advanced Tactics and Engagement Strategies for Google+Sunny Kr
Succeeding Liam Walsh is Dan Petrovic, the CEO of Dejan SEO, and a well-known search engine specialist. A well-traveled and seasoned presenter, Dan brought a myriad of underutilised Google + capabilities to our attention.
A scalable gibbs sampler for probabilistic entity linkingSunny Kr
This document summarizes a research paper that proposes a scalable Gibbs sampling approach for probabilistic entity linking. The approach formulates entity linking as probabilistic inference in a topic model where each topic corresponds to a Wikipedia article. It introduces an efficient Gibbs sampling scheme that exploits the sparsity in the Wikipedia-LDA model to allow inference over millions of topics. Experimental results show it achieves state-of-the-art performance on the Aida-CoNLL dataset.
Through a detailed analysis of logs of activity for all Google employees, this paper shows how the Google Docs suite (documents, spreadsheets and slides) enables and increases collaboration within Google. In particular, visualization and analysis of the evolution of Google's collaboration network show that new employees have started collaborating more quickly and with more people as usage of Docs has grown.
Think Hotels - Book Cheap Hotels WorldwideSunny Kr
This document provides information on booking hotels through an online service. It allows users to choose from over 25,000 hotel locations worldwide and offers the best prices with a single click. Additional details are provided about available accommodations.
Wireframes are basic sketches of an app or website that communicate design ideas without visual elements like fonts or colors. They allow teams to align expectations early on and are used by business decision makers to visualize requirements, developers to understand technical needs, and QA testers to write test scripts. The process involves starting with requirements, creating visual mockups that are iterated based on feedback, building the product while updating designs, and keeping wireframes current throughout development and testing.
comScore Inc. - 2013 Mobile Future in FocusSunny Kr
This document provides an overview and analysis of the mobile and connected device landscape in the United States and internationally. Some of the key points covered include:
- Smartphone and tablet adoption has surged, with over 120 million smartphone owners and nearly 50 million tablet owners in the US. This widespread adoption is ushering in a new "Brave New Digital World" of multi-platform media consumption.
- Mobile channels now account for over 1 in 3 minutes of digital media time spent, demonstrating the rise of multi-platform consumption as a new reality. Leading digital properties are extending their reach by 29% on average through mobile.
- The top mobile platforms are Android and iOS, which combined control nearly 90% of the
Search results clustering (SRC) is a challenging algorithmic problem that requires grouping together the results returned by one or more search engines in topically coherent clusters, and labeling the clusters with meaningful phrases describing the topics of the results included in them.
Many latent (factorized) models have been proposed for recommendation tasks like collaborative filtering and for ranking tasks like document or image retrieval and annotation. Common to all those methods is that during inference the items are scored independently by their similarity to the query in the latent embedding space. The structure of the ranked list (i.e. considering the set of items returned as a whole) is not taken into account. This can be a problem because the set of top predictions can be either too diverse (contain results that contradict each other) or not diverse enough.
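A common way to account for the structure of the ranked list is greedy re-ranking with Maximal Marginal Relevance (MMR), which trades similarity to the query against similarity to already-selected items. A stdlib-only sketch with illustrative scores (the inputs are made up, not taken from the paper):

```python
def mmr_rerank(query_sim, pair_sim, k, lam=0.5):
    """Greedy Maximal Marginal Relevance re-ranking.

    query_sim: dict item -> relevance to the query
    pair_sim:  dict (item, item) -> pairwise similarity (symmetric)
    lam:       trade-off between relevance (1.0) and diversity (0.0)
    """
    def sim(a, b):
        return pair_sim.get((a, b), pair_sim.get((b, a), 0.0))

    selected, remaining = [], set(query_sim)
    while remaining and len(selected) < k:
        # Score = relevance minus redundancy w.r.t. items already picked.
        best = max(
            remaining,
            key=lambda it: lam * query_sim[it]
            - (1 - lam) * max((sim(it, s) for s in selected), default=0.0),
        )
        selected.append(best)
        remaining.remove(best)
    return selected

# 'a' and 'b' are near-duplicates, so MMR picks the less relevant
# but novel 'c' second.
ranking = mmr_rerank({"a": 1.0, "b": 0.9, "c": 0.5}, {("a", "b"): 1.0}, k=2)
# ranking == ["a", "c"]
```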
HyperLogLog in Practice: Algorithmic Engineering of a State of The Art Cardin...Sunny Kr
Cardinality estimation has a wide range of applications and is of particular importance in database systems. Various algorithms have been proposed in the past, and the HyperLogLog algorithm is one of them.
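The core of the textbook HyperLogLog estimator is short: hash each item, use the first p bits to pick one of m = 2^p registers, and keep the maximum leading-zero rank seen per register. This is a simplified sketch of the basic algorithm, not the engineered state-of-the-art variant the paper describes:

```python
import hashlib
import math

class HyperLogLog:
    """Textbook HyperLogLog: m = 2**p registers, each storing the
    maximum leading-zero rank of the hashes routed to it."""

    def __init__(self, p=10):
        self.p, self.m = p, 1 << p
        self.registers = [0] * self.m

    def add(self, item):
        # 64-bit hash: first p bits choose the register,
        # the remaining bits determine the leading-zero rank.
        x = int.from_bytes(hashlib.sha1(str(item).encode()).digest()[:8], "big")
        idx = x >> (64 - self.p)
        rest = x & ((1 << (64 - self.p)) - 1)
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def estimate(self):
        alpha = 0.7213 / (1 + 1.079 / self.m)   # bias correction, m >= 128
        z = sum(2.0 ** -r for r in self.registers)
        e = alpha * self.m * self.m / z
        if e <= 2.5 * self.m:                   # small-range correction
            zeros = self.registers.count(0)
            if zeros:
                e = self.m * math.log(self.m / zeros)
        return e

hll = HyperLogLog(p=10)
for i in range(10000):
    hll.add(f"user-{i}")
# With 1024 registers the typical relative error is about 1.04/sqrt(m) ~ 3%.
```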
Auctions for perishable goods such as internet ad inventory need to make real-time allocation and pricing decisions as the supply of the good arrives in an online manner, without knowing the entire supply in advance. These allocation and pricing decisions get complicated when buyers ...
Approximation Algorithms for the Directed k-Tour and k-Stroll ProblemsSunny Kr
In the Asymmetric Traveling Salesman Problem (ATSP), the input is a directed n-vertex graph G = (V, E) with nonnegative edge lengths, and the goal is to find a minimum-length tour, visiting each vertex at least once. ATSP, along with its undirected counterpart, the Traveling Salesman Problem, is a classical combinatorial optimization problem.
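For intuition about tour problems, the simplest heuristic (which carries no approximation guarantee on directed graphs) is nearest neighbor. A toy sketch, not one of the paper's algorithms:

```python
def nearest_neighbor_tour(dist, start=0):
    """Greedy heuristic for a directed tour: repeatedly move to the
    nearest unvisited vertex, then close the tour back to the start.
    dist[i][j] is the (possibly asymmetric) length of edge i -> j."""
    n = len(dist)
    tour, unvisited = [start], set(range(n)) - {start}
    while unvisited:
        cur = tour[-1]
        nxt = min(unvisited, key=lambda j: dist[cur][j])
        tour.append(nxt)
        unvisited.remove(nxt)
    length = sum(dist[tour[i]][tour[i + 1]] for i in range(n - 1))
    length += dist[tour[-1]][start]            # close the cycle
    return tour, length

# Asymmetric toy instance: the cheap edges form the cycle 0 -> 1 -> 2 -> 0.
tour, length = nearest_neighbor_tour([[0, 1, 5], [5, 0, 1], [1, 5, 0]])
# tour == [0, 1, 2], length == 3
```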
Ivanti’s Patch Tuesday breakdown goes beyond patching your applications and brings you the intelligence and guidance needed to prioritize where to focus your attention first. Catch early analysis on our Ivanti blog, then join industry expert Chris Goettl for the Patch Tuesday Webinar Event. There we’ll do a deep dive into each of the bulletins and give guidance on the risks associated with the newly-identified vulnerabilities.
Introduction of Cybersecurity with OSS at Code Europe 2024Hiroshi SHIBATA
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Best 20 SEO Techniques To Improve Website Visibility In SERPPixlogix Infotech
Boost your website's visibility with proven SEO techniques! Our latest blog dives into essential strategies to enhance your online presence, increase traffic, and rank higher on search engines. From keyword optimization to quality content creation, learn how to make your site stand out in the crowded digital landscape. Discover actionable tips and expert insights to elevate your SEO game.
5th LF Energy Power Grid Model Meet-up SlidesDanBrown980551
5th Power Grid Model Meet-up
It is with great pleasure that we extend to you an invitation to the 5th Power Grid Model Meet-up, scheduled for 6th June 2024. This event will adopt a hybrid format, allowing participants to join us either through an online Microsoft Teams session or in person at TU/e located at Den Dolech 2, Eindhoven, Netherlands. The meet-up will be hosted by Eindhoven University of Technology (TU/e), a research university specializing in engineering science & technology.
Power Grid Model
The global energy transition is placing new and unprecedented demands on Distribution System Operators (DSOs). Alongside upgrades to grid capacity, processes such as digitization, capacity optimization, and congestion management are becoming vital for delivering reliable services.
Power Grid Model is an open source project from Linux Foundation Energy and provides a calculation engine that is increasingly essential for DSOs. It offers a standards-based foundation enabling real-time power systems analysis, simulations of electrical power grids, and sophisticated what-if analysis. In addition, it enables in-depth studies and analysis of the electrical power grid’s behavior and performance. This comprehensive model incorporates essential factors such as power generation capacity, electrical losses, voltage levels, power flows, and system stability.
Power Grid Model is currently being applied in a wide variety of use cases, including grid planning, expansion, reliability, and congestion studies. It can also help in analyzing the impact of renewable energy integration, assessing the effects of disturbances or faults, and developing strategies for grid control and optimization.
What to expect
For the upcoming meetup we are organizing, we have an exciting lineup of activities planned:
-Insightful presentations covering two practical applications of the Power Grid Model.
-An update on the latest advancements in Power Grid -Model technology during the first and second quarters of 2024.
-An interactive brainstorming session to discuss and propose new feature requests.
-An opportunity to connect with fellow Power Grid Model enthusiasts and users.
Monitoring and Managing Anomaly Detection on OpenShift.pdfTosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
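As a flavor of the tutorial's subject matter, a rolling z-score check is about the smallest anomaly detector that fits comfortably on a resource-constrained edge device. This sketch is illustrative and is not taken from the tutorial itself:

```python
import math
from collections import deque

class RollingZScoreDetector:
    """Flag a reading as anomalous when it deviates from the rolling
    mean by more than `threshold` standard deviations."""

    def __init__(self, window=50, threshold=3.0, warmup=10):
        self.values = deque(maxlen=window)
        self.threshold = threshold
        self.warmup = warmup           # readings needed before judging

    def update(self, value):
        anomalous = False
        if len(self.values) >= self.warmup:
            mean = sum(self.values) / len(self.values)
            var = sum((v - mean) ** 2 for v in self.values) / len(self.values)
            std = math.sqrt(var) or 1e-9       # avoid dividing by zero
            anomalous = abs(value - mean) / std > self.threshold
        self.values.append(value)
        return anomalous

det = RollingZScoreDetector()
readings = [10.0] * 50 + [100.0]               # a spike after a flat baseline
flags = [det.update(r) for r in readings]
# flags[-1] is True; all baseline readings are False
```

In the tutorial's architecture the readings would arrive via Kafka and the flags would surface as Prometheus metrics; the detector itself stays this small.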
Main news related to the CCS TSI 2023 (2023/1695)Jakub Marek
An English 🇬🇧 translation of a presentation to the speech I gave about the main changes brought by CCS TSI 2023 at the biggest Czech conference on Communications and signalling systems on Railways, which was held in Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). Attended by around 500 participants and 200 on-line followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
This presentation provides valuable insights into effective cost-saving techniques on AWS. Learn how to optimize your AWS resources by rightsizing, increasing elasticity, picking the right storage class, and choosing the best pricing model. Additionally, discover essential governance mechanisms to ensure continuous cost efficiency. Whether you are new to AWS or an experienced user, this presentation provides clear and practical tips to help you reduce your cloud costs and get the most out of your budget.
Your One-Stop Shop for Python Success: Top 10 US Python Development Providersakankshawande
Simplify your search for a reliable Python development partner! This list presents the top 10 trusted US providers offering comprehensive Python development services, ensuring your project's success from conception to completion.
Letter and Document Automation for Bonterra Impact Management (fka Social Sol...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on automated letter generation for Bonterra Impact Management using Google Workspace or Microsoft 365.
Interested in deploying letter generation automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Tatiana Kojar
Skybuffer AI, built on the robust SAP Business Technology Platform (SAP BTP), is the latest and most advanced version of our AI development, reaffirming our commitment to delivering top-tier AI solutions. Skybuffer AI harnesses all the innovative capabilities of the SAP BTP in the AI domain, from Conversational AI to cutting-edge Generative AI and Retrieval-Augmented Generation (RAG). It also helps SAP customers safeguard their investments into SAP Conversational AI and ensure a seamless, one-click transition to SAP Business AI.
With Skybuffer AI, various AI models can be integrated into a single communication channel such as Microsoft Teams. This integration empowers business users with insights drawn from SAP backend systems, enterprise documents, and the expansive knowledge of Generative AI. And the best part of it is that it is all managed through our intuitive no-code Action Server interface, requiring no extensive coding knowledge and making the advanced AI accessible to more users.
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Taking AI to the Next Level in Manufacturing.pdfssuserfac0301
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
Human Computation Must Be Reproducible
Praveen Paritosh
Google
345 Spear St,
San Francisco, CA 94105.
pkp@google.com
ABSTRACT
Human computation is the technique of performing a computational process by outsourcing some of the difficult-to-automate steps to humans. In the social and behavioral sciences, when using humans as measuring instruments, reproducibility guides the design and evaluation of experiments. We argue that human computation has similar properties, and that the results of human computation must be reproducible, in the least, in order to be informative. We might additionally require the results of human computation to have high validity or high utility, but the results must be reproducible in order to measure the validity or utility to a degree better than chance. Additionally, a focus on reproducibility has implications for design of task and instructions, as well as for the communication of the results. It is humbling how often the initial understanding of the task and guidelines turns out to lack reproducibility. We suggest ensuring, measuring and communicating reproducibility of human computation tasks.

1. INTRODUCTION
Some examples of tasks using human computation are: labeling images [Nowak and Ruger, 2010], conducting user studies [Kittur, Chi, and Suh, 2008], annotating natural language corpora [Snow, O'Connor, Jurafsky and Ng, 2008], annotating images for computer vision research [Sorokin and Forsyth, 2008], search engine evaluation [Alonso, Rose and Stewart, 2008; Alonso, Kazai and Mizzaro, 2011], content moderation [Ipeirotis, Provost and Wang, 2010], entity reconciliation [Kochhar, Mazzocchi and Paritosh, 2010], and conducting behavioral studies [Suri and Mason, 2010; Horton, Rand and Zeckhauser, 2010].

These tasks involve presenting a question, e.g., "Is this image offensive?", to one or more humans, whose answers are aggregated to produce a resolution, a suggested answer for the original question. The humans might be paid contributors. Examples of paid workforces include Amazon Mechanical Turk [www.mturk.com] and oDesk [www.odesk.com]. An example of a community of volunteer contributors is Foldit [www.fold.it], where human computation augments machine computation for predicting protein structure [Cooper et al., 2010]. Another example involving volunteer contributors is games with a purpose [von Ahn, 2006], and the upcoming Duolingo [www.duolingo.com], where the contributors translate previously untranslated web corpora while learning a new language.

The results of human computation can be characterized by accuracy [Oleson et al., 2011], information theoretic measures of quality [Ipeirotis, Provost and Wang, 2010], and utility [Dai, Mausam and Weld, 2010], among others. In order for us to have confidence in any such criteria, the results must be reproducible, i.e., not a result of chance agreement or irreproducible human idiosyncrasies, but a reflection of the underlying properties of the questions and task instructions, on which others could agree as well. Reproducibility is the degree to which a process can be replicated by different human contributors working under varying conditions, at different locations, or using different but functionally equivalent measuring instruments. A total lack of reproducibility implies that the given results could have been obtained merely by chance agreement. If the results are not differentiable from chance, there is little information content in them. Using human computation in such a scenario is wasteful of an expensive resource, as chance is cheap to simulate.

A much stronger claim than reproducibility is validity. For a measurement instrument, e.g., a vernier caliper, a standardized test, or a human coder, the reproducibility is the extent to which a measurement gives consistent results, and the validity is the extent to which the tool measures what it claims to measure. In contrast to reproducibility, validity concerns truths. Validity requires comparing the results of the study to evidence obtained independently of that effort. Reproducibility provides assurances that particular research results can be duplicated, that no (or only a negligible amount of) extraneous noise has entered the process and polluted the data or perturbed the research results; validity provides assurances that claims emerging from the research are borne out in fact.

We might want the results of human computation to have high validity, high utility, low cost, among other desirable characteristics. However, the results must be reproducible in order for us to measure the validity or utility to a degree better than chance.

More than a statistic, a focus on reproducibility offers valuable insights regarding the design of the task and the guidelines for the human contributors, as well as the communication of the results. The output of human computation is thus akin to the result of a scientific experiment, and it can only be considered meaningful if it is reproducible — that is, the same results could be replicated in an independent exercise. This requires clearly communicating the task instructions and the criteria for selecting the human contributors, ensuring that they work independently, and reporting an appropriate measure of reproducibility. Much of this is well established in the methodology of content analysis in the social and behavioral sciences [Armstrong, Gosling, Weinman

Copyright © 2012 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. This volume is published and copyrighted by its editors.
CrowdSearch 2012 workshop at WWW 2012, Lyon, France
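The introduction's point that "chance is cheap to simulate" is easy to demonstrate for a binary task: two contributors who each answer "true" with probability p, independently of the question, agree with probability p² + (1 − p)². A minimal sketch of such a simulation (our illustration, not code from the paper; the function name is ours):

```python
import random

def simulate_chance_agreement(p, n=100_000, seed=0):
    """Fraction of n questions on which two contributors agree when each
    answers 'true' with probability p, independently of the question."""
    rng = random.Random(seed)
    agree = 0
    for _ in range(n):
        a = rng.random() < p  # contributor 1's chance answer
        b = rng.random() < p  # contributor 2's chance answer
        agree += (a == b)
    return agree / n

# Analytically, chance agreement is p^2 + (1 - p)^2: 0.5 for p = 0.5,
# and close to 1.0 when the answer distribution is highly skewed.
```

For p = 0.5 the simulated agreement hovers around 0.5, matching the two-answer case discussed in Section 3.2; as p approaches 1, chance agreement approaches 1 even though the simulated judgments carry no information about the questions.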
and Marteau, 1997; Hayes and Krippendorff, 2007], being required of any publishable result involving human contributors. In Sections 2 and 3, we argue that human computation resembles the coding tasks of behavioral sciences. However, in the human computation and crowdsourcing research community, reproducibility is not commonly reported.

We have collected millions of human judgments regarding entities and facts in Freebase [Kochhar, Mazzocchi and Paritosh, 2010; Paritosh and Taylor, 2012]. We have found reproducibility to be a useful guide for task and guideline design. It is humbling how often the initial understanding of the task and guidelines turns out to lack reproducibility. In Section 4, we describe some of the widely used measures of reproducibility. We suggest ensuring, measuring and communicating reproducibility of human computation tasks.

In the next section, we describe the sources of variability in human computation, which highlight the role of reproducibility.

2. SOURCES OF VARIABILITY IN HUMAN COMPUTATION
There are many sources of variability in human computation that are not present in machine computation. Given that human computation is used to solve problems that are beyond the reach of machine computation, by definition, these problems are incompletely specified. Variability arises due to incomplete specification of the task. This is convolved with the fact that the guidelines are subject to differing interpretations by different human contributors. Some characteristics of human computation tasks are:

• Task guidelines are incomplete: A task can span a wide set of domains, not all of which are anticipated at the beginning of the task. This leads to incompleteness in guidelines, as well as varying levels of performance depending upon the contributor's expertise in that domain. Consider the task of establishing relevance of an arbitrary search query [Alonso, Kazai and Mizzaro, 2011].

• Task guidelines are not precise: Consider, for example, the task of declaring if an image is unsuitable for a social network website. Not only is it hard to write down all the factors that go into making an image offensive, it is hard to communicate those factors to human contributors with vastly different predispositions. The guidelines usually rely upon shared common sense knowledge and cultural knowledge.

• Validity data is expensive or unavailable: An oracle that provides the true answer for any given question is usually unavailable. Sometimes, for a small subset of gold questions, we have answers from another independent source. This can be useful in making estimates of validity, subject to the degree that the gold questions are representative of the set of questions. These gold questions could be very useful for training and feedback to the human contributors; however, we have to ensure their representativeness in order to make warranted claims regarding validity.

Each of the above might be true to a different degree for different human computation tasks. These factors are similar to the concerns of behavioral and social scientists in using humans as measuring instruments.

3. CONTENT ANALYSIS AND CODING IN THE BEHAVIORAL SCIENCES
In the social sciences, content analysis is a methodology for studying the content of communication [Berelson, 1952; Krippendorff, 2004]. Coding of subjective information is a significant source of empirical data in the social and behavioral sciences, as it allows techniques of quantitative research to be applied to complex phenomena. These data are typically generated by trained human observers who record or transcribe textual, pictorial or audible matter in terms suitable for analysis. This task is called coding, which involves assigning categorical, ordinal or quantitative responses to units of communication.

An early example of a content analysis based study is "Do newspapers now give the news?" [Speed, 1893], which tried to show that the coverage of religious, scientific and literary matters was dropped in favor of gossip, sports and scandals between 1881 and 1893 by New York newspapers. Conclusions from such data can only be trusted if the reading of the textual data as well as of the research results are replicable elsewhere, that is, if the coders demonstrably agree on what they are talking about. Hence, the coders need to demonstrate the trustworthiness of their data by measuring their reproducibility. To perform reproducibility tests, additional data are needed: by duplicating the research under various conditions. Reproducibility is established by independent agreement between different but functionally equal measuring devices, for example, by using several coders with diverse personalities. The reproducibility of coding has been used for comparing consistency of medical diagnosis [e.g., Koran, 1975], for drawing conclusions from meta-analysis of research findings [e.g., Morley et al., 1999], for testing industrial reliability [Meeker and Escobar, 1998], and for establishing the usefulness of a clinical scale [Hughes et al., 1982].

3.1 Relationship between Reproducibility and Validity
• Lack of reproducibility limits the chance of validity: If the coding results are a product of chance, they may well include a valid account of what was observed, but researchers would not be able to identify that account to a degree better than chance. Thus, the more unreliable a procedure, the less likely it is to result in data that lead to valid conclusions.

• Reproducibility does not guarantee validity: Two observers of the same event who hold the same conceptual system, prejudice, or interest may well agree on what they see but still be objectively wrong, based on some external criterion. Thus a reliable process may or may not lead to valid outcomes.

In some cases, validity data might be so hard to obtain that one has to contend with reproducibility. In tasks such as interpretation and transcription of complex textual matter, suitable accuracy standards are not easy to find. Because interpretations can only be compared to interpretations, attempts to measure validity presuppose the privileging of some interpretations over others, and this puts any claims regarding validity on epistemologically shaky grounds. In some tasks, like psychiatric diagnosis, even reproducibility is hard to attain for some questions. Aboraya et al.
[2006] review the reproducibility of psychiatric diagnosis. Lack of reproducibility has been reported for judgments of schizophrenia and affective disorder [Goodman et al., 1984], calling such diagnosis into question.

3.2 Relationship with Chance Agreement
In this section, we look at some properties of chance agreement and its relationship to reproducibility. Given two coding schemes for the same phenomenon, the one with fewer categories will have higher chance agreement. For example, in reCAPTCHA [von Ahn et al., 2008], two independent humans are shown an image containing text and asked to transcribe it. Assuming that there is no collusion, chance agreement, i.e., two different humans typing in the same word/phrase by chance, is very small. However, in a task in which there are only two possible answers, e.g., true and false, the probability of chance agreement between two answers is 0.5.

If a disproportionate amount of data falls under one category, then the expected chance agreement is very high, so in order to demonstrate high reproducibility, even higher observed agreement is required [Feinstein and Cicchetti 1990; Di Eugenio and Glass 2004].

Consider a task of rating a proposition as true or false. Let p be the probability of the proposition being true. An implementation of chance agreement is the following: toss a biased coin with the same odds, i.e., p is the probability that it turns heads, and declare the proposition to be true when the coin lands heads. Now, we can simulate n judgments by tossing the coin n times. Let us look at the properties of unanimous agreement between two judgments. The likelihood of a chance agreement on true is p². Independent of this agreement, the probability of the proposition being true is p; therefore, the accuracy of this chance agreement on true is p³. By design, these judgments do not contain any information other than the a priori distribution across answers. Such data has close to zero reproducibility; however, sometimes it can show up in surprising ways when looked at through the lens of accuracy.

For example, consider the task of an airport agent declaring a bag as safe or unsafe for boarding on the plane. A bag can be unsafe if it contains toxic or explosive materials that could threaten the safety of the flight. Most bags are safe. Let us say that one in a thousand bags is potentially unsafe. Random coding would allow two agents to jointly assign "safe" 99.8% of the time, and since 99.9% of the bags are safe, this agreement would be accurate 99.7% of the time! This leads to the surprising result that when data are highly skewed, the coders may agree on a high proportion of items while producing annotations that are accurate, but of low reproducibility. When one category is very common, high accuracy and high agreement can also result from indiscriminate coding. The test for reproducibility in such cases is the ability to agree on the rare categories. In the airport bag classification problem, while chance accuracy on safe bags is high, chance accuracy on unsafe bags is extremely low, 10⁻⁷%. In practice, the cost of errors varies: mistakenly classifying a safe bag as unsafe causes far less damage than classifying an unsafe bag as safe.

In this case, it is dangerous to consider an averaged accuracy score, as different errors do not count equally: a chance process that does not add any information has an average accuracy which is higher than 99.7%, most of which is reflecting the original bias in the distribution of safe and unsafe bags. A misguided interpretation of accuracy or a poor estimate of accuracy can be less informative than reproducibility.

3.3 Reproducibility and Experiment Design
The focus on reproducibility has implications on the design of task and instruction materials. Krippendorff [2004] argues that any study using observed agreement as a measure of reproducibility must satisfy the following requirements:

• It must employ exhaustively formulated, clear, and usable guidelines;

• It must use clearly specified criteria concerning the choice of contributors, so that others may use such criteria to reproduce the data;

• It must ensure that the contributors that generate the data used to measure reproducibility work independently of each other. Only if such independence is assured can covert consensus be ruled out and the observed agreement be explained in terms of the given guidelines and the task.

The last point cannot be stressed enough. There are potential benefits from multiple contributors collaborating, but data generated in this manner neither ensure reproducibility nor reveal its extent. In groups like these, humans are known to negotiate and to yield to each other in quid pro quo exchanges, with prestigious group members dominating the outcome [see for example, Esser, 1998]. This makes the results of collaborative human computation a reflection of the social structure of the group, which is nearly impossible to communicate to other researchers and replicate. The data generated by collaborative work are akin to data generated by a single observer, while reproducibility requires at least two independent observers. To substantiate the contention that collaborative coding is superior to coding by separate individuals, a researcher would have to compare the data generated by at least two such groups and two individuals, each working independently.

A model in which the coders work independently, but consult each other when unanticipated problems arise, is also problematic. A key source of these unanticipated problems is the fact that the writers of the coding instructions did not anticipate all the possible ways of expressing the relevant matter. Ideally, these instructions should include every applicable rule on which agreement is being measured. However, discussing emerging problems could create re-interpretation of the existing instructions in ways that are a function of the group and not communicable to others. In addition, as the instructions become reinterpreted, the process loses its stability: data generated early in the process use instructions that differ from those used later.

In addition to the above, Craggs and McGee Wood [2005] discourage researchers from testing their coding instructions on data from more than one domain. Given that the reproducibility of the coding instructions depends to a great extent on how complications are dealt with, and that every domain displays different complications, the sample should contain sufficient examples from all domains which have to be annotated according to the instructions.

Even the best coding instructions might not specify all possible complications. Besides the set of desired answers, the coders should also be allowed to skip a question. If the coders cannot prove any of the other answers is correct, they skip that question. For any other answer, the instructions define an a priori model of agreement on that answer, while skip represents the unanticipated properties of questions and coders. For instance, some questions might be too difficult for certain coders. Providing the human contributors with an option to skip is a nod to the openness of the task, and can be used to explore the poorly defined parts of the task that were not anticipated at the outset. Additionally, the skip votes can be removed from the analysis for computing reproducibility, as we do not have an expectation of agreement on them [Krippendorff, 2012, personal communication].

4. MEASURING REPRODUCIBILITY
In measurement theory, reliability is the more general guarantee that the data obtained are independent of the measuring event, instrument or person. There are three different kinds of reliability:

Stability: measures the degree to which a process is unchanging over time. It is measured by agreement between multiple trials of the same measuring or coding process. This is also called the test-retest condition, in which one observer does a task, and after some time, repeats the task again. This measures intra-observer reliability. A similar notion is internal consistency [Cronbach, 1951], which is the degree to which the answers on the same task are consistent. Surveys are designed so that the subsets of similar questions are known a priori, and measures for internal consistency metrics are based on correlation between these answers.

Reproducibility: measures the degree to which a process can be replicated by different analysts working under varying conditions, at different locations, or using different but functionally equivalent measuring instruments. Reproducible data, by definition, are data that remain constant throughout variations in the measuring process [Kaplan and Goldsen, 1965].

Accuracy: measures the degree to which the process produces valid results. To measure accuracy, we have to compare the performance of contributors with the performance of a procedure that is known to be correct. In order to generate estimates of accuracy, we need accuracy data, i.e., valid answers to a representative sample of the questions. Estimating accuracy gets harder in cases where the heterogeneity of the task is poorly understood.

The next section focuses on reproducibility.

4.1 Reproducibility
There are two different aspects of reproducibility: inter-rater reliability and inter-method reliability. Inter-rater reliability focuses on the reproducibility by agreement between independent raters, and inter-method reliability focuses on the reliability of different measuring devices. For example, in survey and test design, parallel forms reliability is used to create multiple equivalent tests, of which more than one are administered to the same human. We focus on inter-rater reliability as the measure of reproducibility typically applicable to human computation tasks, where we generate judgments from multiple humans per question. The simplest form of inter-rater reliability is percent agreement; however, it is not suitable as a measure of reproducibility, as it does not correct for chance agreement.

For an extensive survey of measures of reproducibility, refer to Popping [1988] and Artstein and Poesio [2007]. The different coefficients of reproducibility differ in the assumptions they make about the properties of coders, judgments and units. Scott's π [1955] is applicable to two raters and assumes that the raters have the same distribution of responses, whereas Cohen's κ [1960; 1968] allows for a separate distribution of chance behavior per coder. Fleiss' κ [1971] is a generalization of Scott's π for an arbitrary number of raters. All of these coefficients of reproducibility correct for chance agreement similarly. First, they find how much agreement is expected by chance: let us call this value Ae. The data from the coding is a measure of the observed agreement, Ao. The various inter-rater reliabilities measure the proportion of the possible agreement beyond chance that was actually observed:

    S, π, κ = (Ao − Ae) / (1 − Ae)

Krippendorff's α [1970; 2004] is a generalization of many of these coefficients. It is a generalization of the above metrics for an arbitrary number of raters, not all of whom have to answer every question. Krippendorff's α has the following desirable characteristics:

• It is applicable to an arbitrary number of contributors and invariant to the permutation and selective participation of contributors. It corrects itself for varying amounts of reproducibility data.

• It constitutes a numerical scale between at least two points with sensible reproducibility interpretations, 0 representing absence of agreement, and 1 indicating perfect agreement.

• It is applicable to several scales of measurement: ordinal, nominal, interval, ratio, and more.

Alpha's general form is:

    α = 1 − Do / De

where Do is the observed disagreement:

    Do = (1/n) Σc Σk ock · δ²ck

and De is the disagreement one would expect when the answers are attributable to chance rather than to the properties of the questions:

    De = [1 / (n(n − 1))] Σc Σk nc · nk · δ²ck

The δ²ck term is the distance metric for the scale of the answer space. For a nominal scale,

    δ²ck = 0 if c = k, and 1 if c ≠ k.

4.2 Statistical Significance
The goal of measuring reproducibility is to ensure that the data does not deviate too much from perfect agreement,
not that the data is different from chance agreement. In the definition of α, chance agreement is one of the two anchors for the agreement scale, the other, more important, reference point being that of perfect agreement. As the distribution of α is unknown, confidence intervals on α are obtained from an empirical distribution generated by bootstrapping — that is, by drawing a large number of subsamples from the reproducibility data and computing α for each. This gives us a probability distribution of hypothetical α values that could occur within the constraints of the observed data. This can be used to calculate the probability of failing to reach the smallest acceptable reproducibility αmin, q(α < αmin), or a two-tailed confidence interval for a chosen level of significance.

4.3 Sampling Considerations
To generate an estimate of the reproducibility of a population, we need to generate a representative sample of the population. The sampling needs to ensure that we have enough units from the rare categories of questions in the data. Assuming that α is normally distributed, Bloch and Kraemer [1989] provide a suggestion for the minimum number of questions from each category to be included in the sample, Nc:

    Nc = z² · [ (1 + αmin)(3 − αmin) / (4 pc (1 − pc)(1 − αmin)) − αmin ]

where:

• pc is the smallest estimated proportion of values of category c in the population,

• αmin is the smallest acceptable reproducibility below which data will have to be rejected as unreliable, and

• z is the z value for one-tailed tests corresponding to the desired level of statistical significance.

This is a simplification, as it assumes α is normally distributed and the data binary, and it does not account for the number of raters. A general description of sampling requirements is an open problem.

4.4 Acceptable Levels of Reproducibility
Fleiss [1981] and Krippendorff [2004] present guidelines for what should be acceptable values of reproducibility, based on surveying the empirical research using these measures. Krippendorff suggests:

• Rely on variables with reproducibility above α = 0.800. Additionally, do not accept data if the confidence interval reaches below the smallest acceptable reproducibility, αmin = 0.667; or, ensure that the probability, q, of failing to reach the smallest acceptable reproducibility αmin is reasonably small, e.g., q < 0.05.

• Consider variables with reproducibility between α = 0.667 and α = 0.800 only for drawing tentative conclusions.

These are suggestions, and the choice of thresholds of acceptability depends upon the validity requirements imposed on the research results. It is perilous to "game" α by violating the requirements of reproducibility: for example, by removing a subset of data post facto to increase α. Partitioning data by agreement measured after the experiment will not lead to valid conclusions.

4.5 Other Measures of Quality
Ipeirotis, Provost and Wang [2010] present an information theoretic quality score, which measures the quality of a contributor by comparing their score to that of a spammer who is trying to take advantage of chance accuracy. In that regard, it is a similar metric to Krippendorff's alpha, and it additionally models a notion of cost:

    QualityScore = 1 − ExpCost(Contributor) / ExpCost(Spammer)

Turkontrol [Dai, Mausam and Weld, 2010] uses both a model of utility and a model of quality, using decision-theoretic control to make trade-offs between quality and utility for workflow control.

Le, Edmonds, Hester and Biewald [2010] develop a gold standard based quality assurance framework that provides direct feedback to the workers and targets specific worker errors. This approach requires an extensive, manually generated collection of gold data. Oleson et al. [2011] further develop this approach to include pyrite, which are programmatically generated gold questions on which contributors are likely to make an error, for example by mutating data so that it is no longer valid. These are very useful metrics for training, feedback and protection against spammers, but they do not reveal the accuracy of the results. The gold questions, by design, are not representative of the original set of questions. This leads to wide error bars on the accuracy estimates, and it might be valuable to measure reproducibility of results.

5. CONCLUSIONS
We describe reproducibility as a necessary but not sufficient requirement for results of human computation. We might additionally require the results to have high validity or high utility, but our ability to measure validity or utility with confidence is limited if the data are not reproducible. Additionally, a focus on reproducibility has implications for design of task and instructions, as well as for the communication of the results. It is humbling how often the initial understanding of the task and guidelines turns out to lack reproducibility. We suggest ensuring, measuring and communicating reproducibility of human computation tasks.

6. ACKNOWLEDGMENTS
The author would like to thank Jamie Taylor, Reilly Hayes, Stefano Mazzocchi, Mike Shwe, Ben Hutchinson, Al Marks, Tamsyn Waterhouse, Peng Dai, Ozymandias Haynes, Ed Chi and Panos Ipeirotis for insightful discussions on this topic.

7. REFERENCES
[1] Aboraya, A., Rankin, E., France, C., El-Missiry, A., John, C. The Reliability of Psychiatric Diagnosis Revisited, Psychiatry (Edgmont). 2006 January; 3(1): 41-50.
[2] Alonso, O., Rose, D. E., Stewart, B. Crowdsourcing for relevance evaluation. SIGIR Forum, 42(2):9-15, 2008.
[3] Alonso, O., Kazai, G. and Mizzaro, S., 2011, Crowdsourcing for search engine evaluation, Springer 2011.
[4] Armstrong, D., Gosling, A., Weinman, J. and Marteau, T., 1997, The place of inter-rater reliability in qualitative research: An empirical study, Sociology, vol. 31, no. 3, 597-606.
[5] Artstein, R. and Poesio, M. 2007. Inter-coder agreement for computational linguistics. Computational Linguistics.
[6] Bennett, E. M., R. Alpert, and A. C. Goldstein. 1954. Communications through limited questioning. Public Opinion Quarterly, 18(3):303-308.
[7] Berelson, B., 1952. Content analysis in communication research, Free Press, New York.
[8] Bloch, Daniel A. and Helena Chmura Kraemer. 1989. 2 x 2 kappa coefficients: Measures of agreement or association. Biometrics, 45(1):269-287.
[9] Bollacker, K., Evans, C., Paritosh, P., Sturge, T. and Taylor, J., 2008, Freebase: A Collaboratively Created Graph Database for Structuring Human Knowledge. In the Proceedings of the 28th ACM SIGMOD International Conference on Management of Data, Vancouver.
[10] Cohen, Jacob. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20(1):37-46.
[11] Cohen, Jacob. 1968. Weighted kappa: Nominal scale agreement with provision for scaled disagreement or partial credit. Psychological Bulletin, 70(4):213-220.
[12] Cooper, S., Khatib, F., Treuille, A., Barbero, J., Lee, J., Beenen, M., Leaver-Fay, A., Baker, D., and Popovic, Z. Predicting protein structures with a multiplayer online game. Nature, 466:7307, 756-760.
[13] Cronbach, L. J., 1951, Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297-334.
[14] Craggs, R. and McGee Wood, M. 2004. A two-dimensional annotation scheme for emotion in dialogue. In Proc. of AAAI Spring Symposium on Exploring Attitude and Affect in Text, Stanford.
[15] Dai, P., Mausam, Weld, D. S. Decision-Theoretic Control of Crowd-Sourced Workflows. AAAI 2010.
[16] Di Eugenio, Barbara and Michael Glass. 2004. The kappa statistic: A second look. Computational Linguistics, 30(1):95-101.
[25] Krippendorff, K. 2004. Content Analysis: An Introduction to Its Methodology, Second edition. Sage, Thousand Oaks.
[26] Kochhar, S., Mazzocchi, S., and Paritosh, P., 2010, The Anatomy of a Large-Scale Human Computation Engine, In Proceedings of the Human Computation Workshop at the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2010, Washington D.C.
[27] Koran, L. M., 1975, The reliability of clinical methods, data and judgments (parts 1 and 2), New England Journal of Medicine, 293(13/14).
[28] Kittur, A., Chi, E. H. and Suh, B. Crowdsourcing user studies with Mechanical Turk. In Proceedings of the twenty-sixth annual SIGCHI conference on Human factors in computing systems, Florence.
[29] Mason, W., and Suri, S. 2010. Conducting Behavioral Research on Amazon's Mechanical Turk, Working Paper, Social Science Research Network.
[30] Meeker, W. and Escobar, L., 1998, Statistical Methods for Reliability Data, Wiley.
[31] Morley, S., Eccleston, C. and Williams, A., 1999, Systematic review and meta-analysis of randomized controlled trials of cognitive behavior therapy and behavior therapy for chronic pain in adults, excluding headache, Pain, 80, 1-13.
[32] Nowak, S. and Ruger, S. 2010. How reliable are annotations via crowdsourcing: a study about inter-annotator agreement for multi-label image annotation. In Multimedia Information Retrieval, pages 557-566.
[33] Oleson, D., Hester, V., Sorokin, A., Laughlin, G., Le, J., and Biewald, L. Programmatic Gold: Targeted and Scalable Quality Assurance in Crowdsourcing. In HCOMP '11: Proceedings of the Third AAAI Human Computation Workshop.
[34] Paritosh, P. and Taylor, J., 2012, Freebase: Curating Knowledge at Scale, To appear in the 24th Conference on Innovative Applications of Artificial Intelligence, IAAI 2012.
[35] Popping, R., 1988, On agreement indices for nominal
[17] Esser, J.K., 1998, Alive and Well after 25 Years: A data, Sociometric Research: Data Collection and Scaling,
Review of Groupthink Research, Organizational Behavior 90-105.
and Human Decision Processes, Volume 73, Issues 2-3, 116- [36] Scott, W. A. 1955. Reliability of content analysis:
141. The case of nominal scale coding. Public Opinion Quarterly,
[18] Feinstein, Alvan R. and Domenic V. Cicchetti. 1990. 19(3):321-325.
High agreement but low kappa: I. The problems of two para- [37] Snow, R., O’Connor, B., Jurafsky, D., and Ng, A.
doxes. Journal of Clinical Epidemiology, 43(6):543-549. Y. 2008. Cheap and fast, but is it good?: evaluating non-
[19] Fleiss, Joseph L. 1971. Measuring nominal scale agree- expert annotations for natural language tasks. In EMNLP
ment among many raters. Psychological Bulletin, 76(5):378- ’08: Proceedings of the Conference on Empirical Methods
382. in Natural Language Processing.
[20] Goodman AB, Rahav M, Popper M, Ginath Y, Pearl [38] Sorokin, A., and Forsyth, D. 2008. Utility data anno-
E. The reliability of psychiatric diagnosis in Israel’s Psychi- tation with amazon mechanical turk. In First International
atric Case Register. Acta Psychiatr Scand. 1984 May;69(5):391- Workshop on Internet Vision, CVPR 08.
7. [39] Speed, J. G., 1893, Do newspapers now give the news,
[21] Hayes, A. F., and Krippendorff, K. 2007. Answering Forum, August, 1893.
the call for a standard reliability measure for coding data. [40] von Ahn, L. Games With A Purpose. IEEE Computer
Communication Methods and Measures, 1(1):77-89. Magazine, June 2006. Pages 96-98
[22] Hughes CD, Berg L, Danziger L, Coben LA, Martin [41] von Ahn, L., and Dabbish, L. Labeling Images with
RL. A new clinical scale for the staging of dementia. Then a Computer Game. ACM Conference on Human Factors in
British Journal of Psychiatry 1982;140:56. Computing Systems, CHI 2004. Pages 319-326.
[23] Ipeirotis, P.; Provost, F.; and Wang, J. 2010. Quality [42] von Ahn, L., Maurer, B., McMillen, C., Abraham,
management on amazon mechanical turk. In KDD-HCOMP D. and Blum, M. reCAPTCHA: Human-Based Character
’10. Recognition via Web Security Measures. Science, September
[24] Krippendorff, K. 1970. Bivariate agreement coef- 12, 2008. pp 1465-1468.
ficients for reliability of data. Sociological Methodology,
2:139-150