Research talk presented at "Innovations in Online Research" (October 1, 2021)
Event URL: https://web.cvent.com/event/d063e447-1f16-4f70-a375-5d6978b3feea/websitePage:b8d4ce12-3d02-4d24-897d-fd469ca4808a
1) The document presents an approach called Multidimensional Annotation Scaling (MAS) for aggregating complex annotations from multiple annotators.
2) MAS models annotation tasks as distance matrices calculated using task-specific distance functions, rather than modeling the annotations directly.
3) It then applies a Bayesian hierarchical model called multidimensional scaling to learn annotator reliabilities and item difficulties from the distance matrices in order to aggregate the annotations.
4) Experiments on tasks with diverse complex label types like sequences, rankings and translations show MAS outperforms baselines and adapts to different tasks without retraining.
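As a rough illustration of the first step above (not the authors' code), the sketch below builds the per-item distance matrix that a MAS-style aggregator would consume, using a task-specific distance function — here Levenshtein edit distance for sequence annotations. All function and variable names are invented for this example.

```python
# Illustrative sketch only: pairwise distances between annotations of one item,
# using a task-specific distance (edit distance for sequence labels here).

def edit_distance(a, b):
    """Classic Levenshtein distance between two sequences."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[m][n]

def distance_matrix(annotations, dist=edit_distance):
    """Pairwise distances between one item's annotations from k annotators."""
    k = len(annotations)
    return [[dist(annotations[i], annotations[j]) for j in range(k)]
            for i in range(k)]

# Three annotators label the same item with token sequences.
anns = [["B", "I", "O"], ["B", "I", "I"], ["O", "O", "O"]]
D = distance_matrix(anns)
```

Swapping in a different distance function (e.g., a rank correlation for rankings, or a translation-quality metric for translations) adapts the same pipeline to a new task, which is the property the summary highlights.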
This document discusses an upcoming course on using mixed effects models in psychology. The course will cover applying mixed effects models to common research designs, fitting models in R, and addressing issues that arise. Mixed effects models are motivated by research designs involving multiple random effects, nested random effects, crossed random effects, categorical dependent variables, and continuous predictors. Accounting for these complex sampling procedures and non-independent observations is important for making valid statistical inferences.
Learning from Noisy Label Distributions (ICANN 2017), by Yuya Yoshikawa
This document presents a method for learning from noisy label distributions when labeled training data is unavailable. It proposes a probabilistic generative model to:
1) Infer true label distributions of groups from observed noisy distributions, by modeling the noise distortion process.
2) Infer the true label of each instance from the inferred true distributions and which groups it belongs to.
3) Learn a classifier using the inferred true labels. The model outperforms existing methods on synthetic data, especially when noise distortion is large. Future work includes experiments on real-world datasets.
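A toy sketch of the setting described above (invented names, not the paper's generative model): if the noise distortion is summarized by a known confusion matrix, the true group-level label distribution can be recovered by undoing the linear distortion. The paper instead infers the distortion probabilistically; this example only shows the direction of the recovery.

```python
# Toy sketch: observed = C @ true, where C is a column-stochastic distortion
# (confusion) matrix. With C known and invertible, recover the true
# distribution analytically for the 2-label case.

def recover_true_distribution(observed, confusion):
    """Invert a 2x2 confusion matrix to undo the noise distortion."""
    (a, b), (c, d) = confusion
    det = a * d - b * c
    assert abs(det) > 1e-12, "distortion matrix must be invertible"
    o0, o1 = observed
    return [(d * o0 - b * o1) / det,
            (-c * o0 + a * o1) / det]

true = [0.8, 0.2]                    # ground-truth label distribution
C = [[0.9, 0.2],                     # P(observe 0 | true 0), P(observe 0 | true 1)
     [0.1, 0.8]]
observed = [C[0][0] * true[0] + C[0][1] * true[1],
            C[1][0] * true[0] + C[1][1] * true[1]]
recovered = recover_true_distribution(observed, C)
```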
This document provides information about a business course, including that there will be class on Labor Day and lab materials are available on Canvas. It discusses fixed effects in models and how to include them in R scripts. Predicted values from models are covered, along with using residuals to detect outliers and interpreting interactions between variables in models. The document provides examples of adding interaction terms to model formulas and interpreting the results.
1. Post-hoc comparisons allow testing differences between individual levels or cells in an experiment after fitting a linear mixed effects model. The Tukey test, available via the emmeans package, corrects for multiple comparisons.
2. Estimated marginal means (EMMs) report what cell means would be if covariates were equal across conditions, providing a hypothetical adjustment. EMMs can be compared to test effects averaging over other variables.
3. Both post-hoc comparisons and EMMs require fitting an appropriate linear mixed effects model first before making inferences about condition differences.
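The summary above refers to R's emmeans package with a Tukey adjustment; as a language-neutral sketch of the multiple-comparisons idea, here is the more conservative Bonferroni correction applied over all pairwise cell contrasts. Data and names are invented for illustration.

```python
# Sketch of multiple-comparison correction over pairwise cell contrasts.
# (The course uses R/emmeans with Tukey; Bonferroni shown here for brevity.)
from itertools import combinations
from statistics import mean

cells = {
    "A": [5.1, 4.9, 5.3, 5.0],
    "B": [5.2, 5.4, 5.1, 5.3],
    "C": [6.8, 7.1, 6.9, 7.0],
}

pairs = list(combinations(cells, 2))
n_tests = len(pairs)               # 3 pairwise comparisons
alpha = 0.05
adjusted_alpha = alpha / n_tests   # Bonferroni: test each pair at alpha / m

contrasts = {(g1, g2): mean(cells[g1]) - mean(cells[g2]) for g1, g2 in pairs}
```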
The document discusses introducing mixed effects models. It focuses on fixed effects, which are the effects of interest. This week covers sample datasets, theoretical models, fitting models in R, and interpreting parameters like slopes and intercepts. The sample dataset examines factors like newcomers and experience that influence how long teams take to assemble phones. The document uses this example to demonstrate key steps in modeling fixed effects, such as estimating slopes, intercepts, and conducting hypothesis tests on parameters.
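Not the course's R code, but a pure-Python sketch of the fixed-effects step the summary describes: estimating a slope and intercept by ordinary least squares for assembly time against the number of newcomers. The data here are invented.

```python
# OLS slope/intercept for minutes ~ newcomers (illustrative data).
from statistics import mean

newcomers = [0, 1, 2, 3, 4]
minutes = [30.0, 33.1, 35.9, 39.2, 41.8]

xbar, ybar = mean(newcomers), mean(minutes)
slope = (sum((x - xbar) * (y - ybar) for x, y in zip(newcomers, minutes))
         / sum((x - xbar) ** 2 for x in newcomers))
intercept = ybar - slope * xbar  # predicted minutes with zero newcomers
```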
Computational Framework for Generating Visual Summaries of Topical Clusters i..., by Sebastian Alfers
The document describes a computational framework for generating visual summaries of topical clusters in Twitter streams. It involves preprocessing tweets, constructing a word co-occurrence graph, performing hierarchical clustering to group related words into topics, extracting keywords for each topic based on their frequency, and creating visual summaries like treemaps or word clouds to display the results.
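Two of the stages above — counting word co-occurrences within tweets and ranking a topic's keywords by frequency — can be sketched in a few lines. This is an assumption-laden miniature, not the framework's implementation; the clustering and visualization stages are omitted and all names are invented.

```python
# Co-occurrence counting and frequency-based keyword extraction (sketch).
from collections import Counter
from itertools import combinations

tweets = [
    "storm hits coast tonight",
    "coast storm warning issued",
    "new phone released today",
]

cooc = Counter()   # unordered word pairs co-occurring within a tweet
freq = Counter()   # corpus word frequencies
for tweet in tweets:
    words = tweet.split()
    freq.update(words)
    cooc.update(frozenset(p) for p in combinations(sorted(set(words)), 2))

def top_keywords(words_in_topic, k=2):
    """Rank a topic's words by corpus frequency (ties broken alphabetically)."""
    return sorted(words_in_topic, key=lambda w: (-freq[w], w))[:k]

topic = {"storm", "coast", "warning"}   # e.g., one cluster from the graph
keywords = top_keywords(topic)
```

The `cooc` counter corresponds to edge weights in the word co-occurrence graph that the clustering step would operate on.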
Aleksey Yashchenko and Yaroslav Voloshchuk, "False simplicity of front-end applications" (Fwdays)
It’s easy to underestimate a front-end project's complexity, which leads to shallow and thus incorrect implementation. Attempts to fix this problem result in uncontrolled complexity growth and undefined behavior in corner cases.
We'll discuss ways of revealing the inherent complexity of a problem and dealing with it both on theoretical and practical levels.
It Does What You Say, Not What You Mean: Lessons From A Decade of Program Repair, by Claire Le Goues
In this talk we present lessons learned, good ideas, and thoughts on the future, with an eye toward informing junior researchers about the realities and opportunities of a long-running project. We highlight some notions from the original paper that stood the test of time, some that were not as prescient, and some that became more relevant as industrial practice advanced. We place the work in context, highlighting perceptions from software engineering and evolutionary computing, then and now, of how program repair could possibly work. We discuss the importance of measurable benchmarks and reproducible research in bringing scientists together and advancing the area. We give our thoughts on the role of quality requirements and properties in program repair. From testing to metrics to scalability to human factors to technology transfer, software repair touches many aspects of software engineering, and we hope a behind-the-scenes exploration of some of our struggles and successes may benefit researchers pursuing new projects.
The document discusses domain-specific languages (DSLs) and provides examples of internal and external DSLs. It mentions using DSLs to define date operations and search queries. Code snippets demonstrate building DSLs in Ruby and Scala, including using automata and semantic models. The document advises always using a semantic model and provides tips on metaprogramming techniques in Ruby like method_missing and class_eval. It concludes by thanking the reader and providing contact information.
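The talk's examples are in Ruby and Scala; as a rough Python analogue (names invented), the sketch below shows the core idea of an internal DSL backed by a semantic model: the fluent surface syntax only builds up a model object (a `timedelta` here), which does the real work when applied.

```python
# Internal-DSL sketch: days(3).from_(date) reads like the domain language,
# while the semantic model underneath is just a timedelta.
import datetime

class DateBuilder:
    """Holds the semantic model (a timedelta) behind the fluent syntax."""
    def __init__(self, delta):
        self.delta = delta

    def from_(self, date):
        return date + self.delta

def days(n):
    return DateBuilder(datetime.timedelta(days=n))

start = datetime.date(2021, 10, 1)
due = days(3).from_(start)
```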
Interactive Questions and Answers - London Information Retrieval Meetup, by Sease
Answers to some questions about Natural Language Search, Language Modelling (Google BERT, OpenAI GPT-3), Neural Search, and Learning to Rank, asked during our London Information Retrieval Meetup (December).
Waking the Data Scientist at 2am: Detect Model Degradation on Production Mod..., by Chris Fregly
The document discusses Amazon SageMaker Model Monitor and Debugger for monitoring machine learning models in production. SageMaker Model Monitor collects prediction data from endpoints, creates a baseline, and runs scheduled monitoring jobs to detect deviations from the baseline. It generates reports and metrics in CloudWatch. SageMaker Debugger helps debug training issues by capturing debug data with no code changes and providing real-time alerts and visualizations in Studio. Both services help detect model degradation and take corrective actions like retraining.
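SageMaker's actual APIs are not reproduced here; the sketch below only shows, generically, what a scheduled monitoring job conceptually does: compare live feature statistics against a stored baseline and flag large deviations. The threshold and all names are invented.

```python
# Generic baseline-vs-live drift check (conceptual sketch, not SageMaker code).
from statistics import mean, stdev

def build_baseline(feature_values):
    """Summarize training-time feature statistics."""
    return {"mean": mean(feature_values), "stdev": stdev(feature_values)}

def check_drift(baseline, live_values, z_threshold=3.0):
    """Flag drift when the live mean is far from the baseline mean."""
    z = abs(mean(live_values) - baseline["mean"]) / max(baseline["stdev"], 1e-12)
    return z > z_threshold

baseline = build_baseline([10.0, 10.5, 9.5, 10.2, 9.8])
ok = check_drift(baseline, [10.1, 9.9, 10.3])        # near the baseline
drifted = check_drift(baseline, [25.0, 26.0, 24.5])  # far from the baseline
```

A drift flag like `drifted` is the kind of signal that would then be surfaced as a metric and alert, prompting corrective action such as retraining.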
It is easy to measure code coverage when running unit tests.
However, very frequently the following questions come up:
- How can we measure API test coverage and e2e / UI test coverage?
- Does e2e / UI test coverage add value?
- If not, what other data can we look at to know if the e2e tests have good coverage?
This session is about understanding these questions and finding solutions to them.
Building web applications?
Thinking about auto-updater?
Need to document your releases?
Then look at this presentation.
You'll likely discover another point of view on these questions.
KnittingBoar, Toronto Hadoop User Group, Nov 27 2012, by Adam Muise
This document discusses machine learning and parallel iterative algorithms. It provides an introduction to machine learning and Mahout. It then describes Knitting Boar, a system for parallelizing stochastic gradient descent on Hadoop YARN. Knitting Boar partitions data among workers that perform online logistic regression in batches. The workers send gradient updates to a master node, which averages the updates to produce a new global model. Experimental results show Knitting Boar achieves roughly linear speedup. The document concludes by discussing developing YARN applications and the Knitting Boar codebase.
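Not the Knitting Boar codebase, but a single-process sketch of the pattern the summary describes: workers run SGD over their data partitions, and a master averages their parameters into a new global model each round. A least-squares toy problem stands in for logistic regression; all names are invented.

```python
# Parameter-averaging SGD sketch (Knitting Boar-style master/worker pattern).
import random

def sgd_worker(weights, partition, lr=0.1):
    """One SGD pass for least-squares y = w*x over one data partition."""
    w = weights
    for x, y in partition:
        grad = 2 * (w * x - y) * x
        w -= lr * grad
    return w

def master_round(global_w, partitions):
    """Fan the global model out to workers, then average their results."""
    results = [sgd_worker(global_w, p) for p in partitions]
    return sum(results) / len(results)

random.seed(0)
data = [(x, 3.0 * x) for x in [random.uniform(-1, 1) for _ in range(40)]]
partitions = [data[i::4] for i in range(4)]  # 4 simulated workers

w = 0.0
for _ in range(30):   # 30 master/worker rounds
    w = master_round(w, partitions)
```

After a few rounds the averaged model converges toward the true slope of 3.0, which is the "roughly linear speedup without losing accuracy" behavior the experiments report.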
Spaceships, bridges, and buildings have been reduced to rubble, and banking errors worth billions of dollars have occurred, all because of a simple mistake.
We’ll be talking about the importance of automated testing, types of testing, how to make and maintain tests, and ultimately how to use all of this to automatically deploy your project, with a small demo in the end.
The document discusses how to contribute to open source software projects. It recommends getting a GitHub account, registering for Hacktoberfest, and provides tips for making contributions such as starting with small changes, communicating expectations through pull requests, and choosing projects that are familiar or have requested help. The overall message is that contributing to open source can help overcome fears of making mistakes and that maintainers are generally receptive to contributions.
The document discusses Java 8 Lambdas and the Streaming API. Lambdas allow functions to be passed around as method arguments rather than whole objects. The Streaming API allows collections to be processed in a functional way using intermediate and terminal operations on a stream, such as filtering, mapping, reducing, and collecting the results. Examples demonstrate common stream operations like filtering, sorting, mapping elements to different types, and collecting results.
Distributed GLM with H2O - Atlanta Meetup, by Sri Ambati
The document outlines a presentation about H2O's distributed generalized linear model (GLM) algorithm. The presentation includes sections about H2O.ai the company, an overview of the H2O software, a 30 minute section explaining H2O's distributed GLM in detail, a 15 minute demo of GLM, and a question and answer period. The document provides background on H2O.ai and H2O, and outlines the topics that will be covered in the distributed GLM section, including the algorithm, input parameters, outputs, runtime costs, and best practices.
Machine Learning with ML.NET and Azure - Andy Cross, by Andrew Flatters
- The document discusses machine learning and ML.NET. It begins with an introduction of the speaker and their background in machine learning.
- Key topics that will be covered include machine learning, ML.NET, Parquet.NET, using machine learning in production, and relevant Azure tools for data and machine learning.
- Examples provided will demonstrate sentiment analysis, finding patterns in taxi fare data, image recognition, and more to illustrate machine learning algorithms and best practices.
Responsive design is forcing us to reevaluate our design and development practices. It's also forcing us to rethink how we communicate with our clients and what a project's deliverables might be. Pattern Lab helps bridge the gap by providing one tool that allows for the creation of modular systems and gives clients a tool to review the work in the place it's going to be used: the browser.
This talk is a deep dive into how Pattern Lab is organized and how to take advantage of it.
Azure DevOps for managing work, by Chris O'Brien
A presentation I gave at ESPC 2019 (the European SharePoint, Office 365 and Azure Conference) about Azure DevOps for managing both development and support work. The focus is on Azure DevOps boards and task management, but covers some CI/CD aspects too.
This document summarizes a presentation on different approaches to solving natural language processing problems. It discusses rule-based, counting-based, and deep learning-based approaches. For each approach, it provides examples including English tokenization, error correction, dialogue systems, and spam filtering. The presentation covers key stages of an NLP project including domain analysis, data preparation and collection, iterating on solutions, and gathering feedback. It emphasizes the importance of data and discusses techniques for data acquisition such as scraping, annotation, and crowdsourcing.
The document provides an introduction to object oriented design principles with C#. It discusses the SOLID principles, which are single responsibility principle (SRP), open-closed principle (OCP), Liskov substitution principle (LSP), interface segregation principle (ISP) and dependency inversion principle (DIP). For each principle, it provides a definition, real world example and demonstration of implementation. It also discusses other design principles like program to interface, dependency injection and composition over inheritance. The document is intended to help understand and apply good object oriented design in software projects.
Part 2 of the Deep Learning Fundamentals Series, this session discusses Tuning Training (including hyperparameters, overfitting/underfitting), Training Algorithms (including different learning rates, backpropagation), Optimization (including stochastic gradient descent, momentum, Nesterov Accelerated Gradient, RMSprop, Adaptive algorithms - Adam, Adadelta, etc.), and a primer on Convolutional Neural Networks. The demos included in these slides are running on Keras with TensorFlow backend on Databricks.
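As a small sketch (not taken from the slides) of two of the update rules the session covers, here are SGD with momentum and Adam minimizing the 1-D objective f(w) = (w - 4)^2. Hyperparameter values are the commonly used defaults, chosen here only for illustration.

```python
# Momentum vs. Adam on a toy quadratic f(w) = (w - 4)^2.
import math

def grad(w):
    return 2 * (w - 4.0)

def sgd_momentum(w=0.0, lr=0.1, beta=0.9, steps=200):
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(w)     # accumulate a velocity term
        w -= lr * v
    return w

def adam(w=0.0, lr=0.1, b1=0.9, b2=0.999, eps=1e-8, steps=500):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = b1 * m + (1 - b1) * g          # first-moment estimate
        v = b2 * v + (1 - b2) * g * g      # second-moment estimate
        m_hat = m / (1 - b1 ** t)          # bias correction
        v_hat = v / (1 - b2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

w_momentum = sgd_momentum()
w_adam = adam()
```

Both reach the neighborhood of the optimum w = 4; the interesting differences the session discusses (adaptive per-parameter step sizes, sensitivity to the learning rate) show up on less well-behaved objectives.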
Knitting boar - Toronto and Boston HUGs - Nov 2012, by Josh Patterson
1) The document discusses machine learning and parallel iterative algorithms like stochastic gradient descent. It introduces the Mahout machine learning library and describes an implementation of parallel SGD called Knitting Boar that runs on YARN.
2) Knitting Boar parallelizes Mahout's SGD algorithm by having worker nodes process partitions of the training data in parallel while a master node merges their results.
3) The author argues that approaches like Knitting Boar and IterativeReduce provide better ways to implement machine learning algorithms for big data compared to traditional MapReduce.
Explainable Fact Checking with Humans in-the-loop, by Matthew Lease
Invited Keynote at KDD 2021 TrueFact Workshop: Making a Credible Web for Tomorrow, August 15, 2021.
https://www.microsoft.com/en-us/research/event/kdd-2021-truefact-workshop-making-a-credible-web-for-tomorrow/#!program-schedule
Similar to Automated Models for Quantifying Centrality of Survey Responses
Talk given at Delft University speaker series on "Crowd Computing & Human-Centered AI" (https://www.academicfringe.org/). November 23, 2020. Covers two 2020 works:
(1) Anubrata Das, Brandon Dang, and Matthew Lease. Fast, Accurate, and Healthier: Interactive Blurring Helps Moderators Reduce Exposure to Harmful Content. In Proceedings of the 8th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), 2020.
Alexander Braylan and Matthew Lease. Modeling and Aggregation of Complex Annotations via Annotation Distances. In Proceedings of the Web Conference, pages 1807-1818, 2020.
AI & Work, with Transparency & the Crowd, by Matthew Lease
The document discusses designing human-AI partnerships and the role of crowdsourcing in AI systems. It summarizes work on designing AI assistants to work with humans, using crowds to help fact-check information, and explores challenges around protecting crowd workers who review harmful content or do "dirty jobs". It advocates for more research on ethics in AI and using crowds to help check work for ethical issues.
Designing Human-AI Partnerships to Combat Misinformation, by Matthew Lease
The document discusses designing human-AI partnerships to combat misinformation. It describes a prototype partnership where a human and AI work together to fact-check claims. The partnership aims to make the AI more transparent and address user bias by allowing the user to adjust the perceived reliability of news sources, which then changes the AI's political leaning analysis and fact checking results. The discussion wraps up by noting challenges like avoiding echo chambers and assessing potential harms, as well as opportunities for AI to reduce bias and increase trust through explainable, interactive systems.
Designing at the Intersection of HCI & AI: Misinformation & Crowdsourced Anno..., by Matthew Lease
This document summarizes a presentation about designing human-AI partnerships for fact-checking misinformation. It discusses using crowdsourced rationales to improve the accuracy and cost-efficiency of annotation tasks. It also addresses challenges in designing interfaces for automatic fact-checking models, such as integrating human knowledge and reasoning to correct errors and account for bias. The goal is to develop mixed-initiative systems where humans and AI can jointly reason and personalize fact-checking.
Presentation given at the Linguistic Data Consortium (LDC), University of Pennsylvania, April 2019. Based on presentations at the 6th ACM Collective Intelligence Conference, 2018 and the 6th AAAI Conference on Human Computation & Crowdsourcing (HCOMP), 2018. Blog post: https://blog.humancomputation.com/?p=9932.
Believe it or not: Designing a Human-AI Partnership for Mixed-Initiative Fact..., by Matthew Lease
Presented at the 31st ACM User Interface Software and Technology Symposium (UIST), 2018. Paper: https://www.ischool.utexas.edu/~ml/papers/nguyen-uist18.pdf
Mix and Match: Collaborative Expert-Crowd Judging for Building Test Collectio...Matthew Lease
Presentation at the 1st Biannual Conference on Design of Experimental Search & Information Retrieval Systems (DESIRES 2018). August 30, 2018. Paper: https://www.ischool.utexas.edu/~ml/papers/kutlu-desires18.pdf
Talk given August 29, 2018 at the 1st Biannual Conference on Design of Experimental Search & Information Retrieval Systems (DESIRES 2018). Paper: https://www.ischool.utexas.edu/~ml/papers/lease-desires18.pdf
Your Behavior Signals Your Reliability: Modeling Crowd Behavioral Traces to E...Matthew Lease
Presentation at the 6th AAAI Conference on Human Computation and Crowdsourcing (HCOMP), July 7, 2018. Work by Tanya Goyal, Tyler McDonnell, Mucahid Kutlu, Tamer Elsayed, and Matthew Lease. Pages 41-49 in conference proceedings. Online version of paper includes corrections to official version in proceedings: https://www.ischool.utexas.edu/~ml/papers/goyal-hcomp18
What Can Machine Learning & Crowdsourcing Do for You? Exploring New Tools for...Matthew Lease
Invited Talk at the ACM JCDL 2018 WORKSHOP ON CYBERINFRASTRUCTURE AND MACHINE LEARNING FOR DIGITAL LIBRARIES AND ARCHIVES. https://www.tacc.utexas.edu/conference/jcdl18
Deep Learning for Information Retrieval: Models, Progress, & OpportunitiesMatthew Lease
Talk given at the 8th Forum for Information Retrieval Evaluation (FIRE, http://fire.irsi.res.in/fire/2016/), December 10, 2016, and at the Qatar Computing Research Institute (QCRI), December 15, 2016.
Systematic Review is e-Discovery in Doctor’s ClothingMatthew Lease
This document discusses opportunities for collaboration between researchers working in systematic reviews and electronic discovery (e-discovery). It notes similarities in the challenges both fields face, including the need for high recall with bounded costs and reliance on multi-stage review pipelines. The document proposes that technologies developed for semi-automated citation screening and crowdsourcing could help address current limitations. It concludes by encouraging information retrieval researchers to investigate open problems in systematic reviews as opportunities to advance technologies beyond other tasks and help bring together interested parties through forums like the TREC Total Recall track.
Crowd computing utilizes both crowdsourcing and human computation to solve problems. Crowdsourcing enables more efficient and scalable data collection and processing by outsourcing tasks to a large, undefined group of people. Human computation allows software developers to incorporate human intelligence and judgment into applications to provide capabilities beyond current artificial intelligence. Examples discussed include Amazon Mechanical Turk, various crowd-powered applications, and how crowdsourcing has helped label large datasets to train machine learning models.
The Rise of Crowd Computing (December 2015)Matthew Lease
Crowd computing is rising with two waves - the first using crowds to label large amounts of data for artificial intelligence applications. The second wave delivers applications that go beyond AI abilities by incorporating human computation. Open problems remain around ensuring high quality outputs, task design, understanding the worker context and experience, and addressing ethics concerns around opaque platforms and working conditions. The future holds potential for empowering crowd work but also risks like digital sweatshops if worker freedoms and conditions are not considered.
Beyond Mechanical Turk: An Analysis of Paid Crowd Work PlatformsMatthew Lease
The document summarizes a presentation about analyzing paid crowd work platforms beyond Mechanical Turk. It discusses how Mechanical Turk has dominated research on paid crowdsourcing due to its early popularity, but that it has limitations. The presentation conducts a qualitative study of 7 alternative crowd work platforms to identify distinguishing capabilities not found on MTurk, such as different payment models, richer worker profiles, and support for confidential tasks. It aims to increase awareness of other platforms to further inform practice and research on crowdsourcing.
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...Alex Pruden
Folding is a recent technique for building efficient recursive SNARKs. Several elegant folding protocols have been proposed, such as Nova, Supernova, Hypernova, Protostar, and others. However, all of them rely on an additively homomorphic commitment scheme based on discrete log, and are therefore not post-quantum secure. In this work we present LatticeFold, the first lattice-based folding protocol based on the Module SIS problem. This folding protocol naturally leads to an efficient recursive lattice-based SNARK and an efficient PCD scheme. LatticeFold supports folding low-degree relations, such as R1CS, as well as high-degree relations, such as CCS. The key challenge is to construct a secure folding protocol that works with the Ajtai commitment scheme. The difficulty, is ensuring that extracted witnesses are low norm through many rounds of folding. We present a novel technique using the sumcheck protocol to ensure that extracted witnesses are always low norm no matter how many rounds of folding are used. Our evaluation of the final proof system suggests that it is as performant as Hypernova, while providing post-quantum security.
Paper Link: https://eprint.iacr.org/2024/257
Monitoring and Managing Anomaly Detection on OpenShift.pdfTosin Akinosho
Monitoring and Managing Anomaly Detection on OpenShift
Overview
Dive into the world of anomaly detection on edge devices with our comprehensive hands-on tutorial. This SlideShare presentation will guide you through the entire process, from data collection and model training to edge deployment and real-time monitoring. Perfect for those looking to implement robust anomaly detection systems on resource-constrained IoT/edge devices.
Key Topics Covered
1. Introduction to Anomaly Detection
- Understand the fundamentals of anomaly detection and its importance in identifying unusual behavior or failures in systems.
2. Understanding Edge (IoT)
- Learn about edge computing and IoT, and how they enable real-time data processing and decision-making at the source.
3. What is ArgoCD?
- Discover ArgoCD, a declarative, GitOps continuous delivery tool for Kubernetes, and its role in deploying applications on edge devices.
4. Deployment Using ArgoCD for Edge Devices
- Step-by-step guide on deploying anomaly detection models on edge devices using ArgoCD.
5. Introduction to Apache Kafka and S3
- Explore Apache Kafka for real-time data streaming and Amazon S3 for scalable storage solutions.
6. Viewing Kafka Messages in the Data Lake
- Learn how to view and analyze Kafka messages stored in a data lake for better insights.
7. What is Prometheus?
- Get to know Prometheus, an open-source monitoring and alerting toolkit, and its application in monitoring edge devices.
8. Monitoring Application Metrics with Prometheus
- Detailed instructions on setting up Prometheus to monitor the performance and health of your anomaly detection system.
9. What is Camel K?
- Introduction to Camel K, a lightweight integration framework built on Apache Camel, designed for Kubernetes.
10. Configuring Camel K Integrations for Data Pipelines
- Learn how to configure Camel K for seamless data pipeline integrations in your anomaly detection workflow.
11. What is a Jupyter Notebook?
- Overview of Jupyter Notebooks, an open-source web application for creating and sharing documents with live code, equations, visualizations, and narrative text.
12. Jupyter Notebooks with Code Examples
- Hands-on examples and code snippets in Jupyter Notebooks to help you implement and test anomaly detection models.
Skybuffer SAM4U tool for SAP license adoptionTatiana Kojar
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
HCL Notes und Domino Lizenzkostenreduzierung in der Welt von DLAUpanagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-und-domino-lizenzkostenreduzierung-in-der-welt-von-dlau/
DLAU und die Lizenzen nach dem CCB- und CCX-Modell sind für viele in der HCL-Community seit letztem Jahr ein heißes Thema. Als Notes- oder Domino-Kunde haben Sie vielleicht mit unerwartet hohen Benutzerzahlen und Lizenzgebühren zu kämpfen. Sie fragen sich vielleicht, wie diese neue Art der Lizenzierung funktioniert und welchen Nutzen sie Ihnen bringt. Vor allem wollen Sie sicherlich Ihr Budget einhalten und Kosten sparen, wo immer möglich. Das verstehen wir und wir möchten Ihnen dabei helfen!
Wir erklären Ihnen, wie Sie häufige Konfigurationsprobleme lösen können, die dazu führen können, dass mehr Benutzer gezählt werden als nötig, und wie Sie überflüssige oder ungenutzte Konten identifizieren und entfernen können, um Geld zu sparen. Es gibt auch einige Ansätze, die zu unnötigen Ausgaben führen können, z. B. wenn ein Personendokument anstelle eines Mail-Ins für geteilte Mailboxen verwendet wird. Wir zeigen Ihnen solche Fälle und deren Lösungen. Und natürlich erklären wir Ihnen das neue Lizenzmodell.
Nehmen Sie an diesem Webinar teil, bei dem HCL-Ambassador Marc Thomas und Gastredner Franz Walder Ihnen diese neue Welt näherbringen. Es vermittelt Ihnen die Tools und das Know-how, um den Überblick zu bewahren. Sie werden in der Lage sein, Ihre Kosten durch eine optimierte Domino-Konfiguration zu reduzieren und auch in Zukunft gering zu halten.
Diese Themen werden behandelt
- Reduzierung der Lizenzkosten durch Auffinden und Beheben von Fehlkonfigurationen und überflüssigen Konten
- Wie funktionieren CCB- und CCX-Lizenzen wirklich?
- Verstehen des DLAU-Tools und wie man es am besten nutzt
- Tipps für häufige Problembereiche, wie z. B. Team-Postfächer, Funktions-/Testbenutzer usw.
- Praxisbeispiele und Best Practices zum sofortigen Umsetzen
GraphRAG for Life Science to increase LLM accuracyTomaz Bratanic
GraphRAG for life science domain, where you retriever information from biomedical knowledge graphs using LLMs to increase the accuracy and performance of generated answers
Building Production Ready Search Pipelines with Spark and MilvusZilliz
Spark is the widely used ETL tool for processing, indexing and ingesting data to serving stack for search. Milvus is the production-ready open-source vector database. In this talk we will show how to use Spark to process unstructured data to extract vector representations, and push the vectors to Milvus vector database for search serving.
Driving Business Innovation: Latest Generative AI Advancements & Success StorySafe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/temporal-event-neural-networks-a-more-efficient-alternative-to-the-transformer-a-presentation-from-brainchip/
Chris Jones, Director of Product Management at BrainChip , presents the “Temporal Event Neural Networks: A More Efficient Alternative to the Transformer” tutorial at the May 2024 Embedded Vision Summit.
The expansion of AI services necessitates enhanced computational capabilities on edge devices. Temporal Event Neural Networks (TENNs), developed by BrainChip, represent a novel and highly efficient state-space network. TENNs demonstrate exceptional proficiency in handling multi-dimensional streaming data, facilitating advancements in object detection, action recognition, speech enhancement and language model/sequence generation. Through the utilization of polynomial-based continuous convolutions, TENNs streamline models, expedite training processes and significantly diminish memory requirements, achieving notable reductions of up to 50x in parameters and 5,000x in energy consumption compared to prevailing methodologies like transformers.
Integration with BrainChip’s Akida neuromorphic hardware IP further enhances TENNs’ capabilities, enabling the realization of highly capable, portable and passively cooled edge devices. This presentation delves into the technical innovations underlying TENNs, presents real-world benchmarks, and elucidates how this cutting-edge approach is positioned to revolutionize edge AI across diverse applications.
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyScyllaDB
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Fueling AI with Great Data with Airbyte WebinarZilliz
This talk will focus on how to collect data from a variety of sources, leveraging this data for RAG and other GenAI use cases, and finally charting your course to productionalization.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/how-axelera-ai-uses-digital-compute-in-memory-to-deliver-fast-and-energy-efficient-computer-vision-a-presentation-from-axelera-ai/
Bram Verhoef, Head of Machine Learning at Axelera AI, presents the “How Axelera AI Uses Digital Compute-in-memory to Deliver Fast and Energy-efficient Computer Vision” tutorial at the May 2024 Embedded Vision Summit.
As artificial intelligence inference transitions from cloud environments to edge locations, computer vision applications achieve heightened responsiveness, reliability and privacy. This migration, however, introduces the challenge of operating within the stringent confines of resource constraints typical at the edge, including small form factors, low energy budgets and diminished memory and computational capacities. Axelera AI addresses these challenges through an innovative approach of performing digital computations within memory itself. This technique facilitates the realization of high-performance, energy-efficient and cost-effective computer vision capabilities at the thin and thick edge, extending the frontier of what is achievable with current technologies.
In this presentation, Verhoef unveils his company’s pioneering chip technology and demonstrates its capacity to deliver exceptional frames-per-second performance across a range of standard computer vision networks typical of applications in security, surveillance and the industrial sector. This shows that advanced computer vision can be accessible and efficient, even at the very edge of our technological ecosystem.
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
Have you ever been confused by the myriad of choices offered by AWS for hosting a website or an API?
Lambda, Elastic Beanstalk, Lightsail, Amplify, S3 (and more!) can each host websites + APIs. But which one should we choose?
Which one is cheapest? Which one is fastest? Which one will scale to meet our needs?
Join me in this session as we dive into each AWS hosting service to determine which one is best for your scenario and explain why!
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3Data Hops
Free A4 downloadable and printable Cyber Security, Social Engineering Safety and security Training Posters . Promote security awareness in the home or workplace. Lock them Out From training providers datahops.com
FREE A4 Cyber Security Awareness Posters-Social Engineering part 3
Automated Models for Quantifying Centrality of Survey Responses
1. Matt Lease
Associate Professor
School of Information
The University of Texas at Austin
Amazon Scholar
Human-in-the-loop Services
Amazon Web Services (AWS)
Automated Models for Quantifying Centrality of Survey Responses
Lab: ir.ischool.utexas.edu
@mattlease
Slides: slideshare.net/mattlease
8. Caption this image:
When majority voting falls short
Problem: large label space, exact match doesn’t work!
A cat is eating
The cat eats
A beautiful picture
9. What about complex annotations?
Ranked lists
Parse trees
A1: A cat is eating
A2: The cat eats
A3: A beautiful picture
Image captions
Range sequences
10. Modeling and Aggregation of Complex Annotations via Annotation Distance
Alexander Braylan (Dept. of Computer Science) and Matthew Lease (School of Information)
The University of Texas at Austin
Code & Data: https://github.com/Praznat/annotationmodeling
11. Roadmap
• Prior work
• Approach
• Example outputs
• Conclusion
12. Aggregating Simple Labels
• Hundreds of papers
• Multiple benchmarking studies
• Rich body of Bayesian modeling
• General-purpose aggregation models for simple labels don’t support complex labels
Dawid-Skene MACE
Hierarchical Dawid-Skene
Item Difficulty
Logistic Random Effects
Source: Paun et al. 2018, “Comparing Bayesian models of annotation”
13. Task-specific models
• Pros:
– Task specialization maximizes accuracy
• Cons:
– Need new model for every task
– Complicated, difficult to formulate
Nguyen et al 2017 (Sequences)
Lin, Mausam, and Weld 2012 (Math)
14. Our goals
• We want aggregation for complex data types
– Build on ideas from simple label aggregation models
• We want to generalize across many labeling tasks
– Can we reduce problem to common simpler state space?
15. Roadmap
• Prior work
• Approach
• Example outputs
• Conclusion
16. Key Insight
Partial credit matching via task-specific distance function
• Adopt or define a distance function for each annotation task
• Model annotation distances uniformly across tasks
• Distance functions already exist for many task types
– Free-text responses, e.g., survey questions
17. Calculate distances
• Example task: free text answer
• Example distance function: string edit distance
“a cat is eating”  “cat is eating”  “the cat eats”  “a beautiful picture”
18. Calculate distances
• Example task: free text answer
• Example distance function: string edit distance
“a cat is eating”  “cat is eating”  “the cat eats”  “a beautiful picture”
Pairwise distances so far: 0.05, 0.1, 0.1
19. Calculate distances
• Example task: free text answer
• Example distance function: string edit distance
“a cat is eating”  “cat is eating”  “the cat eats”  “a beautiful picture”
Pairwise distances: 0.05, 0.1, 0.1, 0.8, 0.82, 0.82
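As a concrete sketch of the string edit distance used in this example (the slides do not specify the exact variant, so the Levenshtein form and the length normalization below are assumptions):

```python
def edit_distance(a: str, b: str) -> int:
    """Dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def normalized_distance(a: str, b: str) -> float:
    """Scale into [0, 1] so distances are comparable across response lengths."""
    if not a and not b:
        return 0.0
    return edit_distance(a, b) / max(len(a), len(b))

print(round(normalized_distance("a cat is eating", "cat is eating"), 2))       # small
print(round(normalized_distance("a cat is eating", "a beautiful picture"), 2))  # large
```

Similar captions land close together; the off-topic caption lands far from everything, which is the signal the aggregation models exploit.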
22. Distance function properties
Properties of distance functions: non-negativity, symmetry, triangle inequality

Data                  | Free Text  | Rankings
Example evaluation fn | BLEU(x, y) |
Example distance fn   |            |
Non-negativity        | ✓          | ✓
Symmetry              | ✓          | ✓
Triangle inequality   | ✓          | ✓
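For the rankings case, one common choice (an illustrative assumption here, not necessarily the talk's exact function) is Kendall's tau distance: the fraction of item pairs the two rankings order differently. This sketch also spot-checks the three metric properties from the table:

```python
from itertools import combinations, permutations

def kendall_tau_distance(r1, r2):
    """Fraction of item pairs ordered differently by the two rankings."""
    pos1 = {item: i for i, item in enumerate(r1)}
    pos2 = {item: i for i, item in enumerate(r2)}
    pairs = list(combinations(r1, 2))
    discordant = sum(
        (pos1[a] - pos1[b]) * (pos2[a] - pos2[b]) < 0 for a, b in pairs
    )
    return discordant / len(pairs)

# Spot-check the three properties on all rankings of three items:
rankings = [list(p) for p in permutations("abc")]
for x in rankings:
    for y in rankings:
        d = kendall_tau_distance(x, y)
        assert d >= 0.0                                 # non-negativity
        assert d == kendall_tau_distance(y, x)          # symmetry
        for z in rankings:
            assert (kendall_tau_distance(x, z)
                    <= d + kendall_tau_distance(y, z))  # triangle inequality
print(kendall_tau_distance(list("abc"), list("cba")))   # → 1.0
```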
23. Calculate distances
“a cat is eating”  “cat is eating”  “the cat eats”  “a beautiful picture”
Pairwise distances: 0.05, 0.1, 0.1, 0.8, 0.82, 0.82
24. All tasks reduce to matrices of distances
A1: A cat is eating
A2: The cat eats
A3: A beautiful picture
Pairwise distances: 0.1, 0.6, 0.3
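Reducing one item's responses to a distance matrix can be sketched as follows; for illustration the distance is 1 minus Python's difflib similarity ratio, a stand-in for whatever task-specific function is chosen:

```python
from difflib import SequenceMatcher

def text_distance(a: str, b: str) -> float:
    """1 - similarity ratio, so identical strings get distance 0."""
    return 1.0 - SequenceMatcher(None, a, b).ratio()

def distance_matrix(responses):
    """Symmetric matrix of pairwise distances for one item's responses."""
    n = len(responses)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = text_distance(responses[i], responses[j])
    return D

item = ["A cat is eating", "The cat eats", "A beautiful picture"]
D = distance_matrix(item)
for row in D:
    print([round(d, 2) for d in row])
```

Whatever the label type (text, ranking, parse tree), the downstream models only ever see matrices like `D`.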
25. How to aggregate given distances
• Local selection model
• Global selection model
• Combined
26. Local approach: Smallest Avg Distance (SAD)
• For each question: compute average distance between responses
• The response with smallest average distance is locally most normative, generalizing majority vote
• Independence between items
• Local approach does not model respondent agreement
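A minimal sketch of SAD, assuming a precomputed symmetric distance matrix; the numbers below are illustrative, not the exact figures from the slide:

```python
def smallest_avg_distance(responses, D):
    """Return the response whose mean distance to the others is smallest."""
    n = len(responses)
    avg = [sum(D[i][j] for j in range(n) if j != i) / (n - 1) for i in range(n)]
    return responses[min(range(n), key=avg.__getitem__)]

item = ["A cat is eating", "The cat eats", "A beautiful picture"]
D = [[0.0, 0.1, 0.6],    # hypothetical pairwise distances
     [0.1, 0.0, 0.7],
     [0.6, 0.7, 0.0]]
print(smallest_avg_distance(item, D))  # → A cat is eating
```

With exact-match labels this reduces to majority vote; with graded distances it rewards the most central response.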
27. Global approach: Best Available User (BAU)
• Score each participant by their average distance to all other participants across all questions
• The participant with lowest score is globally most normative; treat their response as most normative
• Global approach ignores distance observed on the current item
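A minimal sketch of BAU over multiple items (distances are hypothetical; annotator order is assumed fixed across items):

```python
def best_available_user(matrices):
    """matrices: one distance matrix per item, annotator order fixed.
    Returns the index of the annotator with the lowest average distance
    to all other annotators across all items."""
    n = len(matrices[0])
    score = [0.0] * n
    for D in matrices:
        for i in range(n):
            score[i] += sum(D[i][j] for j in range(n) if j != i) / (n - 1)
    return min(range(n), key=score.__getitem__)

# Two items, three annotators (hypothetical distances):
D1 = [[0.0, 0.1, 0.6], [0.1, 0.0, 0.7], [0.6, 0.7, 0.0]]
D2 = [[0.0, 0.2, 0.5], [0.2, 0.0, 0.6], [0.5, 0.6, 0.0]]
print(best_available_user([D1, D2]))  # → 0
```

Every item then takes annotator 0's response, even on an item where another annotator happened to answer better.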
28. Can we get best of both worlds?
• Want a method that combines:
– Best available user (global)
– Smallest avg distance (local)
• Should build on rich history of work on Bayesian annotation modeling
• Need a principled framework for modeling annotation distance matrices
29. Multidimensional Annotation Scaling (MAS)
• Based on Multidimensional Scaling (Kruskal & Wish 1978)
• Probabilistic model of multi-item distance matrices
• “Hierarchical Bayesian”
– Additional learned parameters represent crowd effects such as worker reliability
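MAS itself is a hierarchical Bayesian model fit to the distance matrices; the classical (non-Bayesian) multidimensional scaling idea it builds on can be sketched via double centering and eigendecomposition. Distances here are hypothetical, and this shows only the embedding step, not the MAS priors or reliability parameters:

```python
import numpy as np

def classical_mds(D, k=2):
    """Embed an (n, n) distance matrix into k dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:k]           # top-k eigenpairs
    scale = np.sqrt(np.clip(vals[order], 0.0, None))
    return vecs[:, order] * scale

D = np.array([[0.0, 0.1, 0.6],
              [0.1, 0.0, 0.7],
              [0.6, 0.7, 0.0]])                  # hypothetical distances
X = classical_mds(D)
# The embedded points reproduce the input distances:
print(round(float(np.linalg.norm(X[0] - X[1])), 2))  # → 0.1
```

Once responses live as points in a latent space, outliers sit far from the cluster, and MAS's extra parameters let systematically distant annotators be down-weighted.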
32. MAS Objective 2: Prior
“a cat is eating”
“cat is eating”
“a beautiful picture”
“the cat eats”
Pseudo-gold
37. Roadmap
• Prior work
• Approach
• Example outputs
• Conclusion
38. Example Output: father
Response SAD MAS
He always speaks ill about his father behind back. 0.78 0.16
He always speaks ill of his father behind his back. 0.71 0.30
He always talks about his father behind his back. 0.74 0.50
He always speaks ill of his father 0.78 0.55
He always speak ill of his father. 0.79 0.62
He is always talking about his father behind his back. 0.82 0.63
He always says behind his father. 0.90 0.72
He always talks about his dad behind his back. 0.83 0.73
39. Example Output: she says
Response SAD MAS
Please be sure to take a note of what she says. 0.77 0.16
Please take a note of what she says. 0.84 0.30
Be sure to take a warning notice what she says. 0.86 0.46
Please be sure to take notes what she says. 0.81 0.48
Please take a note what she say. 0.92 0.73
Please be sure to take instructions for her saying. 0.93 0.76
Make sure to insert disclaimer about what she says. 0.93 0.80
Please make a memo whatever she says. 0.99 0.82
40. Example Output: quiet
Response SAD MAS
As long as you keep quiet you may stay here 0.83 0.26
You can stay here as long as you keep quiet. 0.86 0.39
You may stay here if you keep quiet. 0.81 0.39
You can stay here if you keep quiet. 0.82 0.57
So long as you remain quiet you may stay here. 0.92 0.57
If it is quiet you may stay here 0.90 0.70
If you keep quiet you can stay here. 0.92 0.81
You may be here if you keep quiet. 0.91 0.84
41. Example Output: go ahead
Response SAD MAS
Please go ahead if i am late. 0.83 0.16
Please go ahead if I'm late. 0.79 0.28
Please go ahead if I delayed. 0.82 0.51
Please go without me if I'm late. 0.91 0.62
Please go ahead if I get late 0.83 0.67
Please go ahead and leave if I'm late. 0.88 0.74
If I am late you can go in first. 1.00 0.79
If I should be late go without me. 1.00 0.81
42. Example Output: married
Response SAD MAS
Actually they are not married 0.91 0.18
To tell the truth they are not couple 0.79 0.47
To tell the truth they are not a married couple 0.84 0.62
To tell the truth they're not married 0.89 0.63
In fact they are not couple 0.94 0.69
to telling the truth we're not married 0.97 0.71
Two people are not couples in truth 1.00 0.79
43. Roadmap
• Prior work
• Approach
• Example outputs
• Conclusion
44. Conclusion
• Probabilistic model identifies normative vs. outlier responses by quantifying distance between responses
• Many choices for measuring distance between two texts (e.g., character-based or more semantic NLP)
• 3 models: local (SAD), global (BAU), or combo (MAS)
• Open source: github.com/Praznat/annotationmodeling
45. Future work
• From objective labeling tasks to subjective responses
• Evaluation on survey data
– Collaboration with behavioral science researchers?
– Compare distance functions and model settings for utility
• Automatic detection of consistent biases in a participant’s responses vs. what’s group normative
46. Matt Lease (University of Texas at Austin)
Lab: ir.ischool.utexas.edu
@mattlease
Slides: slideshare.net/mattlease
We thank our many talented crowd workers for their contributions to our research!
Alexander Braylan and Matthew Lease. Aggregating Complex Annotations via Merging and Matching. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 86–94, 2021. [bib | pdf | data | sourcecode | video | slides | tech-report]
Alexander Braylan and Matthew Lease. Modeling and Aggregation of Complex Annotations via Annotation Distances. In Proceedings of the Web Conference, pages 1807–1818, 2020. [bib | pdf | data | sourcecode | video | slides]
48. MTurk: The Early Days
• Artificial Intelligence, With Help From the Humans.
– J. Pontin. NY Times, March 25, 2007
• Is Amazon's Mechanical Turk a Failure? April 9, 2007
– “As of this writing, there are [only] 128 HITs available on Mechanical Turk.”
• Su et al., WWW 2007: “a web-based human data collection system… ‘System M’ ”
49. 2008: the “Gold” Rush Begins
Snow et al., EMNLP (Natural Language Processing)
• Annotating human language for natural language processing (NLP)
• 22,000 labels for only $26 USD
• Crowd’s consensus labels can replace traditional expert labels
“Discovery” sparks rush for “gold” data across areas
• Alonso et al., SIGIR Forum (Information Retrieval)
• Kittur et al., CHI (Human-Computer Interaction)
• Sorokin and Forsyth, CVPR (Computer Vision)
50. 2010-11: Social & Behavioral Sciences
• A Guide to Behavioral Experiments on Mechanical Turk
– W. Mason and S. Suri (2010). SSRN online.
• Crowdsourcing for Human Subjects Research
– L. Schmidt (CrowdConf 2010)
• Crowdsourcing Content Analysis for Behavioral Research: Insights from Mechanical Turk
– Conley & Tosti-Kharas (2010). Academy of Management
• Amazon's Mechanical Turk : A New Source of Inexpensive, Yet High-Quality, Data?
– M. Buhrmester et al. (2011). Perspectives… 6(1):3-5.
– see also: Amazon Mechanical Turk Guide for Social Scientists
51. The Future of Crowd Work (ACM CSCW’13)
by Kittur, Nickerson, Bernstein, Gerber, Shaw, Zimmerman, Lease, and Horton
55. Tasks & datasets
SYNTHETIC DATASETS
• Syntactic parse trees
– Distance function: evalb
• Ranked lists
– Distance function: Kendall’s tau
REAL DATASETS
• Biomedical text sequences
– Distance function: Span F1
• Urdu-English translations
– Distance function: GLEU
Nguyen et al 2017
Zaidan and Callison-Burch 2011
56. Methods
Baselines:
• Random User (RU): pick one label randomly
• ZenCrowd (ZC) (Demartini et al. 2012)
– Weighted voting based on exact match (rare!)
• Crowd Hidden Markov Model (CHMM) (Nguyen et al. 2017)
– Sequence annotation task only
Upper bound: Oracle (OR) (always picks best label)
• Even if 5 workers answer, limited by best answer any of them gave
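A simplified sketch of exact-match weighted voting in the ZenCrowd style described above; fixed weights are used for illustration rather than the EM estimation of Demartini et al. 2012:

```python
from collections import defaultdict

def weighted_vote(labels, weights):
    """labels: worker -> label; weights: worker -> reliability weight.
    Identical labels pool their workers' weights; highest total wins."""
    tally = defaultdict(float)
    for worker, label in labels.items():
        tally[label] += weights[worker]
    return max(tally, key=tally.get)

labels = {"w1": "A cat is eating", "w2": "A cat is eating", "w3": "The cat eats"}
weights = {"w1": 0.5, "w2": 0.6, "w3": 0.9}
print(weighted_vote(labels, weights))  # → A cat is eating
```

With complex labels, exact matches are rare, so the tally usually degenerates to one vote per label, which is why this baseline underperforms distance-based methods.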
57. Results
• Diverse complex label datasets

Task         | Metric | RU    | Oracle
Translations | GLEU   | 0.185 | 0.246
Sequences    | F1     | 0.561 | 0.827
Parses       | EVALB  | 0.812 | 0.939
Rankings     |        | 0.491 | 0.724
60. Results
• Diverse complex label datasets

Task         | Metric | RU    | ZC    | CHMM  | MAS   | Oracle
Translations | GLEU   | 0.185 | 0.188 | -     | 0.217 | 0.246
Sequences    | F1     | 0.561 | 0.569 | 0.702 | 0.709 | 0.827
Parses       | EVALB  | 0.812 | 0.819 | -     | 0.932 | 0.939
Rankings     |        | 0.491 | 0.495 | -     | 0.710 | 0.724

• MAS aggregation is best way to get closer to ground truth with no model alteration between datasets
62. Good Systems: an 8-year, $10M UT Austin Grand Challenge
Goal: Design a future of Artificial Intelligence (AI) technologies to meet society’s needs and values.
http://goodsystems.utexas.edu
63. What’s an Information School?
“The place where people & technology meet” ~ Wobbrock et al., 2009
“iSchools” now exist at over 100 universities around the world
64. Task-specific workflows
• Pros:
– Empower workers for complex tasks
• Cons:
– Need new workflow for every task
– Complicated, difficult to formulate
Noronha et al 2011 (image analysis)
Lasecki et al 2012 (transcription)