SEO & Artificial Intelligence: The new rules to stay on top! - TheFamily
As Artificial Intelligence evolves, SEO strategies change as well: Google has already started to implement some Artificial Intelligence tools within its algorithm, including RankBrain.
It's a challenge but also an opportunity to re-think the way we formulate SEO strategy ;)
Why is AI such a big game-changer when it comes to SEO?
How can you adapt to remain in the top results despite these changes?
Philippe Yonnet is the general manager of Search Foresight, a fast-growing SEO agency that specializes in helping companies define successful search and inbound marketing strategies.
Philippe has more than 10 years of experience and will give all the tips you need to rock your audience ;)
Wimmics Research Team 2015 Activity Report - Fabien Gandon
Extract of the activity report of the Wimmics joint research team between Inria Sophia Antipolis - Méditerranée and I3S (CNRS and Université Nice Sophia Antipolis). Wimmics stands for web-instrumented man-machine interactions, communities and semantics. The team focuses on bridging social semantics and formal semantics on the web.
Ethical Considerations in the Design of Artificial Intelligence - John C. Havens
A presentation for IEEE's Ethics Symposium happening in Vancouver, May 2016. Featuring presentations from John C. Havens, Mike Van der Loos, John P. Sullins, and Alan Mackworth.
A Theory of Knowledge Lecture given by Mark Steed, Director of JESS Dubai on Monday 4th March 2019
The lecture explains how AI works and then looks at some of the ethical implications
Dynamic UXR: Ethical Responsibilities and AI. Carol Smith at Strive in Toronto - Carol Smith
Artificially intelligent (AI) technologies are exciting and with them come a lot of new user experience research (UXR) responsibilities. How do we understand and clarify our users' need for transparency, control, and access (and more) when the system is constantly changing?
These dynamic systems are already part of our everyday lives and quickly becoming part of our jobs. What are our responsibilities with regard to ethics and protecting users from bias?
Presented at Strive, June 7, 2019 in Toronto, Ontario, Canada. Strive is the 2019 UX Research Conference presented by the UX Research Collective Inc.
Demystifying Artificial Intelligence: Solving Difficult Problems at ProductCa... - Carol Smith
Artificially intelligent systems are becoming part of our everyday lives. This session will answer your questions about artificial intelligence, machine learning, and the ethical conflicts and implications inherent in these technologies. Topics covered will include: discussions of bias in data; how to focus on the user experience; what is necessary to build a good cognitive computing system; data needs; levels of accuracy; making safe and secure AIs; and discussions on ethics in AI and our role in leading those conversations. Carol will propose simple models for thinking about these systems and provide time for questions. You will walk away with an awareness of the weaknesses of AI and the knowledge of how these systems work.
Selected by the audience to be presented at ProductCamp Pittsburgh in September 2018
AI, Machine Learning & Deep Learning Risk Management & Controls: Beyond Deep Learning and Generative Adversarial Networks: Model Risk Management in AI, Machine Learning & Deep Learning
The most significant (not purely scientific) results in AI in the last year (2018-2019).
Disclaimer: may be very subjective :)
Slides to the set of lectures given in Feb-Apr 2019.
This one was conducted in Atlas Biomed Group, 2019-04-26
In the era of algorithms and AI, codes of ethics should have an added sense of purpose. But do they? The codes of ethics for ACM, IEEE and ASQ are reviewed in light of these concerns. Several case studies are cited which have grabbed headlines over the past two years. An increasingly software / code-driven universe in which AI is insinuated seemingly everywhere is one in which ethics must be present, part of enterprise decision-making, and traceable.
Semantics, Deep Learning, and the Transformation of Business - Steve Omohundro
Deep learning is likely to have a big impact on business. McKinsey predicts that AI and robotics will create $50 trillion of value over the next 10 years. Over $1 billion of venture investment has gone to 250 deep learning startups over the past year. Deep learning systems have recently broken records in speech recognition, image recognition, image captioning, translation, drug discovery and other tasks. Why is this happening now and how is it likely to play out? We review the development of AI and the pendulum swings between the "neats" and the "scruffies". We describe traditional approaches to semantics through logics and grammars and the new deep learning vector semantics. We relate it to Roger Shepard's cognitive geometry and the structure of biological networks. We also describe limitations of deep learning for safety and regulation. We show how it fits into the rational agent framework and discuss what the next steps may be.
Artificial intelligence is more and more becoming the core of digital products. Designing for Products based on AI requires Designers to know about Machine Learning.
This talk is an easy walk through the most important elements of Machine Learning. It looks at the fundamental principles using practical examples. It showcases applications of the different types of Machine Learning. The use cases range from text categorization to image recognition and on to speech analysis. The goal is to show what is important for designers and why.
The newest buzzword after Big Data is AI. From Google search to Facebook messenger bots, AI is also everywhere.
• Machine learning has gone mainstream. Organizations are trying to build competitive advantage with AI and Big Data.
• But, what does it take to build Machine Learning applications? Beyond the unicorn data scientists and PhDs, how do you build on your big data architecture and apply Machine Learning to what you do?
• This talk will discuss technical options to implement machine learning on big data architectures and how to move forward.
Machine Learning Introduction for Digital Business Leaders - Sudha Jamthe
This is Sudha Jamthe's lecture to the Masters program students of Barcelona Technology School.
Covers an introduction to Machine Learning: the technology foundation, use cases across multiple industries, and the jobs and various business roles involved in creating Machine Intelligence products and services.
Data Science with Human in the Loop @Faculty of Science #Leiden University - Lora Aroyo
Software systems are becoming ever more intelligent and more useful, but the way we interact with these machines too often reveals that they don’t actually understand people. Knowledge Representation and Semantic Web focus on the scientific challenges involved in providing human knowledge in machine-readable form. However, we observe that various types of human knowledge cannot yet be captured by machines, especially when dealing with wide ranges of real-world tasks and contexts. The key scientific challenge is to provide an approach to capturing human knowledge in a way that is scalable and adequate to real-world needs. Human Computation has begun to scientifically study how human intelligence at scale can be used to methodologically improve machine-based knowledge and data management. My research is focusing on understanding human computation for improving how machine-based systems can acquire, capture and harness human knowledge and thus become even more intelligent. In this talk I will show how the CrowdTruth framework (http://crowdtruth.org) facilitates data collection, processing and analytics of human computation knowledge.
Some project links:
- http://controcurator.org/
- http://crowdtruth.org/
- http://diveproject.beeldengeluid.nl/
- http://vu-amsterdam-web-media-group.github.io/linkflows/
Crowdsourcing & Nichesourcing: Enriching Cultural Heritage with Experts & Cr... - Lora Aroyo
Presentation at the "Past, Present and Future of Digital Humanities & Social Sciences in the Netherlands" event, http://www.ehumanities.nl/past-present-and-future-of-digital-humanities-social-sciences-in-the-netherlands-programme-and-abstracts-2/
One of the best things about Flash is its community and the number of available open source frameworks. In this session we will cover a number of the frameworks that make developing Flash games easier, better, and just more fun.
My ESWC 2017 keynote: Disrupting the Semantic Comfort Zone - Lora Aroyo
Ambiguity in interpreting signs is not a new idea, yet the vast majority of research in machine interpretation of signals such as speech, language, images, video, audio, etc., tends to ignore ambiguity. This is evidenced by the fact that metrics for quality of machine understanding rely on a ground truth, in which each instance (a sentence, a photo, a sound clip, etc.) is assigned a discrete label, or set of labels, and the machine’s prediction for that instance is compared to the label to determine if it is correct. This determination yields the familiar precision, recall, accuracy, and f-measure metrics, but clearly presupposes that this determination can be made. CrowdTruth is a form of collective intelligence based on a vector representation that accommodates diverse interpretation perspectives and encourages human annotators to disagree with each other, in order to expose latent elements such as ambiguity and worker quality. In other words, CrowdTruth assumes that when annotators disagree on how to label an example, it is because the example is ambiguous, the worker isn’t doing the right thing, or the task itself is not clear. In previous work on CrowdTruth, the focus was on how the disagreement signals from low quality workers and from unclear tasks can be isolated. Recently, we observed that disagreement can also signal ambiguity. The basic hypothesis is that, if workers disagree on the correct label for an example, then it will be more difficult for a machine to classify that example. The elaborate data analysis to determine if the source of the disagreement is ambiguity supports our intuition that low clarity signals ambiguity, while high clarity sentences quite obviously express one or more of the target relations. In this talk I will share the experiences and lessons learned on the path to understanding diversity in human interpretation and the ways to capture it as ground truth to enable machines to deal with such diversity.
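The conventional evaluation described in this abstract can be sketched as follows. This is a minimal illustration that assumes every instance carries exactly one gold label, which is precisely the presupposition the talk questions. The function name and the toy labels are hypothetical and do not come from the talk.

```python
def precision_recall_f1(gold, predicted, positive):
    """Score predictions against a single-label ground truth for one class."""
    pairs = list(zip(gold, predicted))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (assumed): each instance has one "correct" label.
gold = ["cat", "cat", "dog", "dog"]
predicted = ["cat", "dog", "cat", "dog"]
scores = precision_recall_f1(gold, predicted, positive="cat")
```

Each comparison here is a strict match or mismatch, so an instance that annotators would reasonably label in more than one way is still scored as simply right or wrong.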
The Rijksmuseum Collection as Linked Data - Lora Aroyo
Presentation at ISWC2018: http://iswc2018.semanticweb.org/sessions/the-rijksmuseum-collection-as-linked-data/ of our paper published originally in the Semantic Web Journal: http://www.semantic-web-journal.net/content/rijksmuseum-collection-linked-data-2
Many museums are currently providing online access to their collections. The state of the art research in the last decade shows that it is beneficial for institutions to provide their datasets as Linked Data in order to achieve easy cross-referencing, interlinking and integration. In this paper, we present the Rijksmuseum linked dataset (accessible at http://datahub.io/dataset/rijksmuseum), along with collection and vocabulary statistics, as well as lessons learned from the process of converting the collection to Linked Data. The version of March 2016 contains over 350,000 objects, including detailed descriptions and high-quality images released under a public domain license.
FAIRview: Responsible Video Summarization @NYCML'18 - Lora Aroyo
Presentation at the NYC Media Lab (NYCML2018). There is a growing demand for news videos online, with more consumers preferring to watch the news than read or listen to it. On the publisher side, there is a growing effort to use video summarization technology in order to create easy-to-consume previews (trailers) for different types of broadcast programs. How can we measure the quality of video summaries and their potential to misinform? This workshop will inform participants about automatic video summarization algorithms and how to produce more “representative” video summaries. The research presented is from the FAIRview project and is supported by the Digital News Innovation Fund (DNI Fund), which is part of the Google News Initiative.
DH Benelux 2017 Panel: A Pragmatic Approach to Understanding and Utilising Ev... - Lora Aroyo
Lora Aroyo, Chiel van den Akker, Marnix van Berchum, Lodewijk Petram, Gerard Kuys, Tommaso Caselli, Jacco van Ossenbruggen, Victor de Boer, Sabrina Sauer, Berber Hagedoorn
Crowdsourcing ambiguity aware ground truth - collective intelligence 2017 - Lora Aroyo
The process of gathering ground truth data through human annotation is a major bottleneck in the use of information extraction methods. Crowdsourcing-based approaches are gaining popularity in the attempt to solve the issues related to the volume of data and lack of annotators. Typically these practices use inter-annotator agreement as a measure of quality. However, this assumption often creates issues in practice. Previous experiments we performed found that inter-annotator disagreement is usually not captured, either because the number of annotators is too small to capture the full diversity of opinion, or because the crowd data is aggregated with metrics that enforce consensus, such as majority vote. These practices create artificial data that is neither general nor reflective of the ambiguity inherent in the data.
To address these issues, we propose a method for crowdsourcing ground truth by harnessing inter-annotator disagreement. We present an alternative approach for crowdsourcing ground truth data that, instead of enforcing an agreement between annotators, captures the ambiguity inherent in semantic annotation through the use of disagreement-aware metrics for aggregating crowdsourcing responses. Based on this principle, we have implemented the CrowdTruth framework for machine-human computation, which first introduced the disagreement-aware metrics and built a pipeline to process crowdsourcing data with these metrics.
In this paper, we apply the CrowdTruth methodology to collect data over a set of diverse tasks: medical relation extraction, Twitter event identification, news event extraction and sound interpretation. We prove that capturing disagreement is essential for acquiring a high-quality ground truth. We achieve this by comparing the quality of the data aggregated with CrowdTruth metrics with a majority vote, a method which enforces consensus among annotators. By applying our analysis over a set of diverse tasks we show that, even though ambiguity manifests differently depending on the task, our theory of inter-annotator disagreement as a property of ambiguity is generalizable.
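The contrast between consensus-enforcing aggregation and a disagreement-aware score can be sketched as below. This is a simplified illustration, not the CrowdTruth framework's own code: it assumes each worker's answer is a set of labels, sums those into a vector for the unit, and scores each label by the cosine between that vector and the label's axis. The function names, the relation labels and the toy data are all assumptions made for this example.

```python
import math
from collections import Counter


def majority_vote(annotations):
    """Consensus aggregation: keep only the most frequently chosen label."""
    counts = Counter(label for worker in annotations for label in worker)
    return counts.most_common(1)[0][0]


def unit_label_scores(annotations, labels):
    """Disagreement-aware aggregation (simplified sketch).

    Sums the workers' label choices into one vector for the unit and
    returns, per label, the cosine between that vector and the label axis.
    """
    counts = Counter(label for worker in annotations for label in worker)
    norm = math.sqrt(sum(counts[label] ** 2 for label in labels))
    return {label: (counts[label] / norm if norm else 0.0) for label in labels}


LABELS = ["treats", "causes", "none"]  # hypothetical relation labels

# Ten workers agree: the unit clearly expresses one relation.
clear_unit = [{"treats"}] * 10
# Ten workers split evenly: the unit is plausibly ambiguous.
ambiguous_unit = [{"treats"}] * 5 + [{"causes"}] * 5

clear_scores = unit_label_scores(clear_unit, LABELS)
ambiguous_scores = unit_label_scores(ambiguous_unit, LABELS)
clear_majority = majority_vote(clear_unit)
```

Majority vote returns a single label for both units, so the even split in the second unit is lost, whereas the per-label scores keep it visible (1.0 for one label in the clear unit, roughly 0.71 for each of two labels in the ambiguous one).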
Stitch by Stitch: Annotating Fashion at the Rijksmuseum - Lora Aroyo
https://www.rijksmuseum.nl/en/stitch-by-stitch
http://annotate.accurator.nl/
Fashion can be found everywhere in museums. Fashion heritage collected over centuries: costumes, accessories, paintings, prints and photographs. But while some clothes and accessories are easily found and identified, others are obscure and require a trained eye to describe. What are we looking at? What kind of sleeve is this? Which materials and techniques have been used? More specific descriptions of the images facilitate better use of digital collections and enable users to wander through them in detail.
Museums & the Web 2016 Presentation: Enriching Collections with Expert Knowle... - Lora Aroyo
http://mw2016.museumsandtheweb.com/proposal/accurator-enriching-collections-with-expert-knowledge-from-the-crowd/
Crowdsourcing is not a new phenomenon for museums. There are good examples for museums (e.g., Powerhouse museum, steve.museum). But not all crowdsourcing initiatives are successful. Crowdsourced tagging does not always contribute to a better understanding of art and can even be confusing.
The Rijksmuseum and the VU University Amsterdam developed the Accurator: a visual tool to get experts in domains like birds, bibles, ships, castles, etc. involved in annotating art and enrich the museums’ metadata with expertise that is not available internally.
In this how-to session, we demonstrate the tool and the ways other museums can implement this Open Web application for their own collections.
Achieving Expert-Level Annotation Quality with CrowdTruth: The Case of Medical Relation Extraction. Anca Dumitrache, Lora Aroyo and Chris Welty. ==> http://ceur-ws.org/Vol-1428/
#CrowdTruth: Linked Data for Information Extraction @ISWC2015 - Lora Aroyo
CrowdTruth Measures for Language Ambiguity: The Case of Medical Relation Extraction. Anca Dumitrache, Lora Aroyo and Chris Welty ==> http://oak.dcs.shef.ac.uk/ld4ie2015/LD4IE2015/Program.html
Connector Corner: Automate dynamic content and events by pushing a button - DianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Key Trends Shaping the Future of Infrastructure.pdf - Cheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Neuro-symbolic is not enough, we need neuro-*semantic* - Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf - 91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo... - James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
State of ICS and IoT Cyber Threat Landscape Report 2024 preview - Prayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio's cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Transcript: Selling digital books in 2024: Insights from industry leaders - T... - BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
Harnessing Human Semantics at Scale (updated)
1. http://lora-aroyo.org @laroyo
Harnessing Human Semantics at Scale
Measurable, Reproducible, Engaging, Sustainable
Crowdsourcing & Nichesourcing
Lora Aroyo
Join to participate in the CATS4ML Data Challenge
cats4ml.humancomputation.com
17. http://lora-aroyo.org @laroyo
in the (very near) future
most visitors will be digital-born
not bound by time or location
native to new forms of co-makership
native to new media
Siebe Weide, Max Meijer and Marieke Krabshuis (2012).
Agenda 2026: Study on the Future of the Dutch Museum Sector
25. http://lora-aroyo.org @laroyo
expertise of Rijksmuseum professionals is
in annotating their collection
with art-historical information, e.g. when they
were created, by whom, etc.
29. http://lora-aroyo.org @laroyo
Engage with Games
training the general crowd to be a niche:
game in which players can carry out expert
annotation tasks with some assistance
39. http://lora-aroyo.org @laroyo
Challenges
Low reproducibility rates
Difficult to estimate & control the time to complete
Difficult to assess & compare quality
Demands continuous promotional effort
Active learning (human-in-the-loop) needs different expertise
Difficult to incorporate results into existing content infrastructure
Crowdsourcing typically undertaken in isolation
46. Web & Media Group
http://lora-aroyo.org @laroyo
this all results in
simplification of context
47. Web & Media Group
http://lora-aroyo.org @laroyo
48. http://lora-aroyo.org @laroyo
● Identify Crowdsourcing Goals through user log analysis
○ # queries, #unique queries, #queries of specific type
○ ranked by popularity
○ ranked by popularity and with error, e.g.
■ # queries entered over 50 times with 0 results
■ # queries of specific type with 0 results
○ which will have biggest impact
○ which has biggest urgency
● … or through other user analysis
○ museum visits, external channels
Assess Impact of Results
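The log-based indicators listed on this slide can be sketched as follows. This is a minimal illustration that assumes the search log is available as (query, number of results) pairs and that queries are compared after trimming and lowercasing. The function name, the log format and the sample queries are assumptions, not something the slides specify.

```python
from collections import Counter


def summarize_query_log(log, min_count=50):
    """Summarize a search log given as (query, n_results) pairs.

    Returns the total and unique query counts, queries ranked by
    popularity, and queries that returned 0 results more than
    `min_count` times (ranked by how often that happened).
    """
    counts = Counter()
    zero_hits = Counter()
    for query, n_results in log:
        normalized = query.strip().lower()
        counts[normalized] += 1
        if n_results == 0:
            zero_hits[normalized] += 1
    return {
        "total": sum(counts.values()),
        "unique": len(counts),
        "ranked": counts.most_common(),
        "frequent_failures": [
            q for q, n in zero_hits.most_common() if n > min_count
        ],
    }


# Toy log (assumed data); a low threshold keeps the example small.
log = (
    [("rembrandt", 12)] * 3
    + [("nachtwacht", 0)] * 3
    + [("Nachtwacht ", 0)]
)
summary = summarize_query_log(log, min_count=2)
```

Popular queries that repeatedly return nothing are natural candidates for the crowdsourcing goals with the biggest impact, although how impact and urgency are weighed remains a choice for the institution.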
50. http://lora-aroyo.org @laroyo
for example
in video search
people search for fragments
experts annotate full videos
35% of search queries result in not found
52. http://lora-aroyo.org @laroyo
Measure Quality
“On the Role of User-Generated Metadata in A/V Collections”, Riste Gligorov et al. KCAP2011
time-based annotation
bernhard
88% of the tags useful
for specific genres
describe short segments
often not very specific
don’t describe program as a whole
53. http://lora-aroyo.org @laroyo
for example
in video search
video annotation is time-consuming
5 times the video duration
experts use a specific vocabulary
that is unknown to general audiences
54. http://lora-aroyo.org @laroyo
user vocabulary
8% in professional vocabulary
23% in Dutch lexicon
89% found on Google
locations (7%)
engeland
persons (31%)
objects (57%)
Measure Quality
“On the Role of User-Generated Metadata in A/V Collections”, Riste Gligorov et al. KCAP2011
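Overlap figures like the ones on this slide can be computed along the following lines. This is a minimal sketch that assumes coverage is measured over distinct user tags, compared case-insensitively against a reference vocabulary. The slides do not say how the cited study computed its percentages, and the function name and sample data here are hypothetical.

```python
def vocabulary_coverage(user_tags, vocabulary):
    """Fraction of distinct user tags that also occur in a vocabulary."""
    distinct_tags = {tag.strip().lower() for tag in user_tags}
    reference = {term.strip().lower() for term in vocabulary}
    if not distinct_tags:
        return 0.0
    return len(distinct_tags & reference) / len(distinct_tags)


# Toy data (assumed): three distinct tags, one of them in the vocabulary.
user_tags = ["engeland", "bernhard", "fiets", "Engeland"]
professional_vocabulary = ["Engeland", "Nederland"]
share = vocabulary_coverage(user_tags, professional_vocabulary)
```

Running the same function against several reference vocabularies gives one coverage figure per vocabulary, which is the kind of comparison the slide reports.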
55. Web & Media Group
http://lora-aroyo.org @laroyo
human subjectivity, ambiguity & uncertainty of expression
natural part of human semantics
56. http://lora-aroyo.org @laroyo
measure quality
quality is not just about spam
quality is typically multi-dimensional
understand the diversity in crowd answers
do not ignore multitude of interpretations
understand the variety of contexts
identify cases with high ambiguity, similarity, …
experiment with explicit metrics
experiment with different designs
57. http://lora-aroyo.org @laroyo
Measure Progress
6 months 2 years
340,551 tags 36,981 tags
137,421 matches
602 items 1,782 items
555 registered players 2,017 users (taggers)
thousands of anonymous players
12,279 visits (3+ min online)
44,362 pageviews
Riste Gligorov, Michiel Hildebrand, Jacco van Ossenbruggen, Guus Schreiber, Lora Aroyo (2011).
On the role of user-generated metadata in audio visual collections. International conference
on Knowledge capture K-CAP '11, Pages 145-152
64. http://lora-aroyo.org @laroyo
Your AI model is as good
as your evaluation data
… but is your evaluation
data missing relevant
examples?
… and how can we find
such examples, especially
if they are AI blindspots
(i.e. unknown unknowns)?
CATS4ML Challenge
offers a crowdsourced red
team for finding
blindspots of your AI
models
cats4ml.humancomputation.com
65. http://lora-aroyo.org @laroyo
AI Blindspots
real images with visual patterns that confuse AI models
in ways humans might find meaningful
Lipstick?
Airplane?
Car?
Construction worker? Thanksgiving?
https://opensource.google/projects/open-images-dataset
66. http://lora-aroyo.org @laroyo
Inspired by Bug Bounty
this is a data challenge to find
the blindspots in our AI models
Challenge running
until mid Jan 2021
Join, and start hunting
for AI Blindspots, and
spread the word to other
teams that might be
interested!
cats4ml.humancomputation.com