CIG 2014 - Human-Like Bots Competition 
IEEE Computational Intelligence in Games. Dortmund, Germany. August, 2014 
Organization: 
Manuel G. Bedia 
Juan Peralta 
Joan Marc 
Philip Hingston 
Raúl Arrabales
Organization / Acknowledgements
Players (humans and bots) 
PLAYER | TYPE | TEAM | MEMBERS | AFFILIATION | COUNTRY
BotTracker | BOT | TETRIIS | Hunjoo Lee, Jee-Hyong Lee | ETRI, Sungkyunkwan University | South Korea
MirrorBot | BOT | IHSEV | Mihai Polceanu | ENIB CERV Centre de Réalité Virtuelle | France
NizorBot | BOT | UMAG-BOT | José L. Jiménez López, Antonio J. Fernández-Leiva, Antonio M. Mora | Universidad de Málaga | Spain
OvGUBot | BOT | OvGUBot | Xenija Neufeld, Sanaz Mostaghim | Otto von Guericke University, Magdeburg | Germany
ADANN | BOT | CVC | Juan Peralta Donate, Joan Marc Llargués A. | CVC, UAB | Spain
CCBot | BOT | Conscious-Robots | Jorge Muñoz, Raúl Arrabales | Comaware | Spain
Player | HUMAN | Judge | - | - | -
Tmchojo | HUMAN | Judge | - | - | -
Juan_CVC | HUMAN | Judge | - | - | -
Xenija | HUMAN | Judge | - | - | -
Original BotPrize Testing Protocol (FPA – First-Person Assessment)
[Diagram: Human Judges, Artificial Bots, UT 2004 Server. Real-time, online, anonymized interaction.]
BotPrize 2014 Edition: We add TPA
* FPA – First-Person Assessment
* TPA – Third-Person Assessment
Generation of anonymized TPA video clips featuring human and bot players
[Diagram: FPA Human Judges, Artificial Bots, UT 2004 Server, TPA Judges (crowdsourcing platform). Third-person crowdsourcing judging.]
BotPrize 2014 Edition: Humanness++ (H) 
H = (FPA * FPWF) + (TPA * TPWF) 
FPWF First-Person Weighting Factor = 0,5. 
TPWF Third-Person Weighting Factor = 0,5. 
FPA Human Judges 
Artificial Bots UT 2004 Server 
H = FPA / 2 + TPA / 2
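The combination rule can be sketched in a few lines of Python (the slides themselves use R; Python is used here only for illustration). The function name `humanness_pp` and the sample scores are hypothetical. Only the formula and the 0.5 weighting factors come from the slide.

```python
def humanness_pp(fpa: float, tpa: float, fpwf: float = 0.5, tpwf: float = 0.5) -> float:
    """Combine the two assessments: H = (FPA * FPWF) + (TPA * TPWF)."""
    return fpa * fpwf + tpa * tpwf


# Hypothetical FPA and TPA scores, chosen only to illustrate the call.
example_h = humanness_pp(0.2, 0.6)
print(example_h)
```

With the default factors, the result is the plain average of the two scores. Passing other values for `fpwf` and `tpwf` would shift the balance between first-person and third-person judging.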
Humanness scores based on SDT 
Signal Detection Theory 
Judge SDT Matrix | Vote “Human” | Vote “Bot”
Player is a Human | Hit | False Alarm
Player is a Bot | Miss | Hit
Tanner Jr., Wilson P.; Swets, John A. (November 1954). "A decision-making theory of visual detection". Psychological Review. 61 (6): 401–409.
Humanness scores based on SDT 
Judge Reliability (JR) 
A measure of how well a judge tells humans and bots apart:
$$JR_j = \frac{Hits - (Misses + FalseAlarms)}{N_j}$$
Judge SDT Matrix | Vote “Human” | Vote “Bot”
Player is a Human | Hit | False Alarm
Player is a Bot | Miss | Hit
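A minimal Python sketch of the Judge Reliability ratio follows. It assumes that N_j is the total number of votes cast by judge j, that is, hits plus misses plus false alarms. The slide does not define N_j explicitly. The function name and the tallies are hypothetical.

```python
def judge_reliability(hits: int, misses: int, false_alarms: int) -> float:
    """JR = (Hits - (Misses + FalseAlarms)) / N.

    Assumption: N is the judge's total vote count (hits + misses + false alarms).
    """
    n = hits + misses + false_alarms
    if n == 0:
        raise ValueError("the judge cast no votes")
    return (hits - (misses + false_alarms)) / n


# Hypothetical tallies for one judge: 7 hits, 2 misses, 1 false alarm.
jr_example = judge_reliability(hits=7, misses=2, false_alarms=1)
print(jr_example)
```

Under this assumption the score ranges from -1 (every vote wrong) to 1 (every vote right).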
Judge Reliability can be used to adjust 
Humanness Scores 
Judge Relative Reliability (JRR)
A measure of how good a judge is relative to the other judges:
$$JRR_j = \frac{JR_j}{AvgJR}, \qquad AvgJR = \frac{\sum_{j=1}^{J} JR_j}{J}$$
Judge      JR          JRR        Weight
Player     0.28967254  1.2574682  0.31436704
tmchojo    0.43801653  1.9014292  0.47535731
Juan_CVC   0.16129032  0.7001611  0.17504027
Xenija     0.03246753  0.1409415  0.03523538
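The JRR and Weight columns can be recomputed from the JR column. The Python sketch below (the slides use R) divides each JR by the average JR to get JRR, then divides JRR by the number of judges to get the weight. The function names are hypothetical, and the JR values are copied from the table.

```python
# JR values copied from the table above.
jr = {
    "Player": 0.28967254,
    "tmchojo": 0.43801653,
    "Juan_CVC": 0.16129032,
    "Xenija": 0.03246753,
}


def relative_reliability(jr_by_judge: dict) -> dict:
    """JRR_j = JR_j / AvgJR, where AvgJR is the mean JR over all judges."""
    avg_jr = sum(jr_by_judge.values()) / len(jr_by_judge)
    return {judge: value / avg_jr for judge, value in jr_by_judge.items()}


def judge_weights(jrr_by_judge: dict) -> dict:
    """Weight_j = JRR_j / J, with J the number of judges."""
    count = len(jrr_by_judge)
    return {judge: value / count for judge, value in jrr_by_judge.items()}


jrr = relative_reliability(jr)
weight = judge_weights(jrr)
for judge in jr:
    print(judge, round(jrr[judge], 7), round(weight[judge], 8))
```

Because each weight equals JR_j divided by the sum of all JR values, the four weights add up to 1.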
“tmchojo” is the best FPA judge: 44% correct guesses.
Bots Judging Reliability
“BotTracker” is the best bot at telling bots and humans apart (32%).
Humans & Bots Judging Reliability
[Chart: judging reliability of human and bot players. Player labels: H B H H B H B B B B]
Judge Reliability can be used to adjust 
Humanness Scores 
JRmeasures["Weight"] <- JRmeasures$JRR / nrow(JRmeasures)
Calculating FPA (First-Person Assessment) 
Weighted First-Person Humanness Ratio
$$FPA_i = \frac{\sum_{j=1}^{n} Weight_j \cdot humanness_{i,j}}{J}$$
$$humanness_{i,j} = \frac{Miss_{i,j}}{N_{i,j}}$$
Here humanness_{i,j} is the humanness of player i according to judge j. The sample proportion is an unbiased estimator of p in the population.
Judge j SDT Matrix | Voted “Human” | Voted “Bot”
Player is a Human | Hit | False Alarm
Player is a Bot | Miss | Hit
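The weighted ratio can be sketched in Python as below. The sketch takes the slide's formula literally: it sums Weight_j times humanness_{i,j} over the judges and divides by J. It assumes that the upper limit n equals the number of judges J, which the slide does not state. All names, weights and vote counts are hypothetical.

```python
def humanness(miss: int, n: int) -> float:
    """humanness_ij = Miss_ij / N_ij for one judge and one player."""
    if n == 0:
        raise ValueError("the judge never voted on this player")
    return miss / n


def fpa(weights: dict, humanness_by_judge: dict) -> float:
    """FPA_i = sum_j(Weight_j * humanness_ij) / J.

    Assumption: the sum runs over all J judges (n = J).
    """
    num_judges = len(weights)
    total = sum(weights[j] * humanness_by_judge[j] for j in weights)
    return total / num_judges


# Hypothetical data: two judges with weights 0.75 and 0.25.
example_weights = {"judge_a": 0.75, "judge_b": 0.25}
example_humanness = {
    "judge_a": humanness(miss=2, n=4),  # 0.5
    "judge_b": humanness(miss=1, n=4),  # 0.25
}
example_fpa = fpa(example_weights, example_humanness)
print(example_fpa)
```

Votes from a more reliable judge count more, because each judge's humanness ratio is scaled by that judge's weight before the sum.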
Bot         Humanness  FPA
MirrorBot   0.4996406  0.20164771
BotTracker  0.4231043  0.20070203
OvGUBot     0.3164826  0.10545765
NizorBot    0.2980527  0.11821633
ADANN       0.2432864  0.08351664
CCBot       0.1685606  0.06214746

Human       Humanness  FPA
Player      0.5417464  0.1932813
tmchojo     0.5177169  0.1775752
Xenija      0.3847691  0.1713976
Juan_CVC    0.3172348  0.1237229
[Chart: weighted first-person humanness ratio. Player labels: H H B B H H B B B B]
[Chart: weighted first-person humanness ratio. Player labels: B B H H H H B B B B]
Calculating TPA (Third-Person Assessment) 
Crowdsourcing Judging
$$TPA_{i,j} = \frac{Miss_{i,j}}{N_{i,j}}$$
J = 232 human judges
I = 12 characters (6+6)
Player      FPA         TPA        H++
Xenija      0.17139763  0.8235294  0.4974635
MirrorBot   0.20164771  0.7333333  0.4674905
Player      0.19328127  0.6315789  0.4124301
tmchojo     0.17757519  0.6470588  0.4123170
NizorBot    0.11821633  0.7058824  0.4120493
BotTracker  0.20070203  0.5909091  0.3958056
CCBot       0.06214746  0.7058824  0.3840149
Juan_CVC    0.12372294  0.6190476  0.3713853
OvGUBot     0.10545765  0.6086957  0.3570767
ADANN       0.08351664  0.4761905  0.2798536
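As a consistency check, the H++ column can be recomputed from the FPA and TPA columns with both weighting factors at 0.5. The Python sketch below (the slides use R) copies the rows from the table and takes the human/bot labels from the players list. The variable and function names are hypothetical.

```python
# (name, kind, FPA, TPA, H++) copied from the table above.
ROWS = [
    ("Xenija", "human", 0.17139763, 0.8235294, 0.4974635),
    ("MirrorBot", "bot", 0.20164771, 0.7333333, 0.4674905),
    ("Player", "human", 0.19328127, 0.6315789, 0.4124301),
    ("tmchojo", "human", 0.17757519, 0.6470588, 0.4123170),
    ("NizorBot", "bot", 0.11821633, 0.7058824, 0.4120493),
    ("BotTracker", "bot", 0.20070203, 0.5909091, 0.3958056),
    ("CCBot", "bot", 0.06214746, 0.7058824, 0.3840149),
    ("Juan_CVC", "human", 0.12372294, 0.6190476, 0.3713853),
    ("OvGUBot", "bot", 0.10545765, 0.6086957, 0.3570767),
    ("ADANN", "bot", 0.08351664, 0.4761905, 0.2798536),
]


def humanness_pp(fpa: float, tpa: float, fpwf: float = 0.5, tpwf: float = 0.5) -> float:
    """H = (FPA * FPWF) + (TPA * TPWF)."""
    return fpa * fpwf + tpa * tpwf


# Recompute H++ for every player and measure the largest deviation from the table.
recomputed = {name: humanness_pp(fpa, tpa) for name, _, fpa, tpa, _ in ROWS}
max_error = max(abs(recomputed[name] - h) for name, _, _, _, h in ROWS)

# Rank only the bots by their recomputed H++ score, highest first.
bot_ranking = sorted(
    (name for name, kind, *_ in ROWS if kind == "bot"),
    key=recomputed.get,
    reverse=True,
)

print(max_error)
print(bot_ranking)
```

The recomputed values agree with the published column to within rounding of the printed digits.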
[Chart: third-person assessment (crowdsourcing judging). Character labels: H B B B H H H H H B B B]
Final Results (FPA + TPA)
[Chart: final results per player. Player labels: H B H H B B B H B B]
Final Results (H++)
BOT | H++ | MEMBERS
MirrorBot | 0.467 | Mihai Polceanu
NizorBot | 0.412 | José L. Jiménez López, Antonio J. Fernández-Leiva, Antonio M. Mora
BotTracker | 0.395 | Hunjoo Lee, Jee-Hyong Lee
OvGUBot | 0.357 | Xenija Neufeld, Sanaz Mostaghim
Congratulations on your results!
Hope to see you again next year.
www.botprize.org
human-machine.unizar.es
Raul.Arrabales@Conscious-Robots.com
@ConsciousRobots

BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG
