BotPrize 2014 Results. Human-Like Bots Competition at IEEE CIG

Accenture Analytics: Big Data Analytics | Data Scientist | Machine Learning
CIG 2014 - Human-Like Bots Competition 
IEEE Computational Intelligence in Games. Dortmund, Germany. August, 2014 
Organization: 
Manuel G. Bedia 
Juan Peralta 
Joan Marc 
Philip Hingston 
Raúl Arrabales
Organization / Acknowledgements
Players (humans and bots)
PLAYER | TYPE | TEAM | MEMBERS | AFFILIATION | COUNTRY
BotTracker | BOT | TETRIIS | Hunjoo Lee, Jee-Hyong Lee | ETRI, Sungkyunkwan University | SOUTH KOREA
MirrorBot | BOT | IHSEV | Mihai Polceanu | ENIB CERV Centre de Réalité Virtuelle | FRANCE
NizorBot | BOT | UMAG-BOT | José L. Jiménez López, Antonio J. Fernández-Leiva, Antonio M. Mora | Universidad de Málaga | SPAIN
OvGUBot | BOT | OvGUBot | Xenija Neufeld, Sanaz Mostaghim | Otto von Guericke University, Magdeburg | GERMANY
ADANN | BOT | CVC | Juan Peralta Donate, Joan Marc Llargués A. | CVC. UAB | SPAIN
CCBot | BOT | Conscious-Robots | Jorge Muñoz, Raúl Arrabales | Comaware | SPAIN
Player | HUMAN | Judge | - | - | -
Tmchojo | HUMAN | Judge | - | - | -
Juan_CVC | HUMAN | Judge | - | - | -
Xenija | HUMAN | Judge | - | - | -
Original BotPrize Testing Protocol (FPA – First-Person Assessment)
[Diagram: human judges and artificial bots connect to a UT 2004 server; real-time, online, anonymized interaction]
BotPrize 2014 Edition: We add TPA
* FPA – First-Person Assessment
* TPA – Third-Person Assessment
[Diagram: FPA human judges and artificial bots play on the UT 2004 server; anonymized TPA video clips featuring human and bot players are generated and rated by TPA judges on a crowdsourcing platform (third-person crowdsourcing judging)]
BotPrize 2014 Edition: Humanness++ (H)
H = (FPA * FPWF) + (TPA * TPWF)
FPWF (First-Person Weighting Factor) = 0.5
TPWF (Third-Person Weighting Factor) = 0.5
With both factors at 0.5: H = FPA / 2 + TPA / 2
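The combination rule can be sketched in Python as follows. This is only an illustration: the function name `humanness_pp` is hypothetical, and the input values are the FPA and TPA figures reported for MirrorBot in the results table of this deck.

```python
# Weighting factors stated on the slide; both are 0.5 in the 2014 edition.
FPWF = 0.5  # First-Person Weighting Factor
TPWF = 0.5  # Third-Person Weighting Factor


def humanness_pp(fpa: float, tpa: float) -> float:
    """Combine first- and third-person assessments: H = FPA*FPWF + TPA*TPWF."""
    return fpa * FPWF + tpa * TPWF


# FPA and TPA values reported for MirrorBot in the results table.
mirrorbot_h = humanness_pp(0.20164771, 0.7333333)
print(round(mirrorbot_h, 7))
```

The result agrees with the H++ value listed for MirrorBot (0.4674905) up to rounding.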
Humanness scores based on SDT (Signal Detection Theory)
Judge SDT Matrix | Vote “Human” | Vote “Bot”
Player is a Human | Hit | False Alarm
Player is a Bot | Miss | Hit
Tanner Jr., Wilson P.; Swets, John A. (November 1954). "A decision-making theory of visual detection". Psychological Review. 61 (6): 401–409.
Humanness scores based on SDT
Judge Reliability (JR): a measure of how good a judge is at telling humans and bots apart.
$$JR_j = \frac{\mathit{Hits} - (\mathit{Misses} + \mathit{FalseAlarms})}{N_j}$$
Judge SDT Matrix | Vote “Human” | Vote “Bot”
Player is a Human | Hit | False Alarm
Player is a Bot | Miss | Hit
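A minimal Python sketch of the JR formula, reading it as (Hits - (Misses + FalseAlarms)) / N_j. The slide does not define N_j; the sketch assumes it is the total number of votes cast by judge j. The function name and the vote counts are hypothetical and are not competition data.

```python
def judge_reliability(hits: int, misses: int, false_alarms: int) -> float:
    """JR_j = (Hits - (Misses + FalseAlarms)) / N_j.

    Assumption: N_j is the total number of votes cast by the judge,
    i.e. hits + misses + false_alarms.
    """
    n_votes = hits + misses + false_alarms
    if n_votes == 0:
        raise ValueError("the judge cast no votes")
    return (hits - (misses + false_alarms)) / n_votes


# Hypothetical judge: 30 correct votes, 6 misses, 4 false alarms.
example_jr = judge_reliability(hits=30, misses=6, false_alarms=4)
print(example_jr)  # (30 - 10) / 40 = 0.5
```

Under this reading JR ranges from -1 (every vote wrong) to 1 (every vote right), and a judge who is right half of the time scores 0.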
Judge Reliability can be used to adjust Humanness Scores
Judge Relative Reliability (JRR): a measure of how good a judge is in relation to the other judges.
$$JRR_j = \frac{JR_j}{AvgJR}, \qquad AvgJR = \frac{\sum_{j=1}^{J} JR_j}{J}$$
Judge Reliability can be used to adjust Humanness Scores
Judge Relative Reliability (JRR): a measure of how good a judge is in relation to the other judges.
Judge | JR | JRR | Weight
Player | 0.28967254 | 1.2574682 | 0.31436704
tmchojo | 0.43801653 | 1.9014292 | 0.47535731
Juan_CVC | 0.16129032 | 0.7001611 | 0.17504027
Xenija | 0.03246753 | 0.1409415 | 0.03523538
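The JRR and Weight columns can be recomputed from the JR column. The Python sketch below assumes JRR is each judge's JR divided by the average JR, and that Weight is JRR divided by the number of judges (as in the deck's R line `JRmeasures$JRR / nrow(JRmeasures)`). Variable names are illustrative.

```python
# JR values for the four human judges, copied from the table above.
jr = {
    "Player": 0.28967254,
    "tmchojo": 0.43801653,
    "Juan_CVC": 0.16129032,
    "Xenija": 0.03246753,
}

n_judges = len(jr)
avg_jr = sum(jr.values()) / n_judges

# Relative reliability: how each judge compares with the average judge.
jrr = {judge: value / avg_jr for judge, value in jr.items()}

# Weight = JRR / number of judges, so the weights sum to 1.
weight = {judge: value / n_judges for judge, value in jrr.items()}

for judge in jr:
    print(f"{judge:10s} JRR={jrr[judge]:.7f} Weight={weight[judge]:.8f}")
```

Because Weight_j reduces to JR_j divided by the sum of all JR values, the four weights add up to 1.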
Judge Reliability can be used to adjust Humanness Scores
Judge Relative Reliability (JRR)
[Chart: “tmchojo” is the best FPA judge, with 44% correct guesses]
Bots Judging Reliability
[Chart: “BotTracker” is the best bot at telling bots and humans apart (32%)]
Humans & Bots Judging Reliability
[Chart: judging reliability of human (H) and bot (B) players; bar labels in order: H B H H B H B B B B]
Judge Reliability can be used to adjust 
Humanness Scores 
JRmeasures["Weight"] <- JRmeasures$JRR / nrow(JRmeasures)
Calculating FPA (First-Person Assessment)
Weighted First-Person Humanness Ratio
$$FPA_i = \frac{\sum_{j=1}^{n} Weight_j \cdot humanness_{i,j}}{J}$$
$$humanness_{i,j} = \frac{Miss_{i,j}}{N_{i,j}}$$
Here humanness_{i,j} is the humanness of player i according to judge j. The sample proportion is an unbiased estimator of p in the population.
Judge j SDT Matrix | Voted “Human” | Voted “Bot”
Player is a Human | Hit | False Alarm
Player is a Bot | Miss | Hit
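One possible reading of the weighted ratio, sketched in Python: each judge's humanness ratio for a player (votes of “Human” divided by total votes) is multiplied by that judge's weight, the products are summed, and the sum is divided by the number of judges J. The slide's formula is partly garbled, so the division by J is an assumption. The function names and the vote counts are hypothetical; only the weights come from the judge table in this deck.

```python
from typing import Dict, Tuple


def humanness_ratio(human_votes: int, total_votes: int) -> float:
    """Share of a judge's votes on one player that said 'Human' (Miss / N)."""
    if total_votes == 0:
        raise ValueError("the judge cast no votes on this player")
    return human_votes / total_votes


def fpa(weights: Dict[str, float], votes: Dict[str, Tuple[int, int]]) -> float:
    """Weighted sum of per-judge humanness ratios, divided by J (assumed)."""
    weighted_sum = sum(
        weights[judge] * humanness_ratio(human, total)
        for judge, (human, total) in votes.items()
    )
    return weighted_sum / len(weights)


# Weights from the judge table; vote counts are made up for illustration.
weights = {"Player": 0.31436704, "tmchojo": 0.47535731,
           "Juan_CVC": 0.17504027, "Xenija": 0.03523538}
hypothetical_votes = {"Player": (3, 6), "tmchojo": (2, 8),
                      "Juan_CVC": (4, 8), "Xenija": (5, 10)}

example_fpa = fpa(weights, hypothetical_votes)
print(round(example_fpa, 6))
```

Votes from a reliable judge such as tmchojo move the score more than votes from a judge with a low weight, which is the purpose of the adjustment.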
Calculating FPA (First-Person Assessment)
Weighted First-Person Humanness Ratio
Bot | Humanness | FPA
MirrorBot | 0.4996406 | 0.20164771
BotTracker | 0.4231043 | 0.20070203
OvGUBot | 0.3164826 | 0.10545765
NizorBot | 0.2980527 | 0.11821633
ADANN | 0.2432864 | 0.08351664
CCBot | 0.1685606 | 0.06214746

Human player | Humanness | FPA
Player | 0.5417464 | 0.1932813
tmchojo | 0.5177169 | 0.1775752
Xenija | 0.3847691 | 0.1713976
Juan_CVC | 0.3172348 | 0.1237229
Calculating FPA (First-Person Assessment)
Weighted First-Person Humanness Ratio
[Chart: human (H) / bot (B) labels in order: H H B B H H B B B B; this matches the ranking by Humanness in the table above]
[Chart: human (H) / bot (B) labels in order: B B H H H H B B B B; this matches the ranking by FPA in the table above]
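The two label sequences can be checked against the Humanness and FPA tables. The Python sketch below sorts the ten players by each column and prints the resulting H/B sequence; the helper name `type_sequence` is illustrative, and the figures are copied from the tables in this deck.

```python
# (name, type, Humanness, FPA) rows copied from the bot and human tables.
players = [
    ("MirrorBot", "B", 0.4996406, 0.20164771),
    ("BotTracker", "B", 0.4231043, 0.20070203),
    ("OvGUBot", "B", 0.3164826, 0.10545765),
    ("NizorBot", "B", 0.2980527, 0.11821633),
    ("ADANN", "B", 0.2432864, 0.08351664),
    ("CCBot", "B", 0.1685606, 0.06214746),
    ("Player", "H", 0.5417464, 0.1932813),
    ("tmchojo", "H", 0.5177169, 0.1775752),
    ("Xenija", "H", 0.3847691, 0.1713976),
    ("Juan_CVC", "H", 0.3172348, 0.1237229),
]


def type_sequence(rows, column: int) -> str:
    """Rank rows by the given column (descending) and join their H/B labels."""
    ranked = sorted(rows, key=lambda row: row[column], reverse=True)
    return " ".join(row[1] for row in ranked)


by_humanness = type_sequence(players, 2)
by_fpa = type_sequence(players, 3)
print(by_humanness)  # H H B B H H B B B B
print(by_fpa)        # B B H H H H B B B B
```

Weighting by judge reliability moves MirrorBot and BotTracker ahead of all four human players.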
BotPrize 2014 Edition: We add TPA
* TPA – Third-Person Assessment
[Diagram: anonymized TPA video clips featuring human and bot players are generated from the matches on the UT 2004 server and rated by TPA judges on a crowdsourcing platform (third-person crowdsourcing judging)]
Calculating TPA (Third-Person Assessment)
Crowdsourcing Judging
$$TPA_{i,j} = \frac{Miss_{i,j}}{N_{i,j}}$$
J = 232 human judges
I = 12 characters (6+6)
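A minimal Python sketch of the third-person ratio, assuming each crowdsourced vote on a character's video clip is recorded as the string "human" or "bot", and that the score is the share of "human" votes. The function name, the vote encoding, and the vote counts are hypothetical; the slides do not give per-character vote counts.

```python
from collections import Counter
from typing import List


def tpa_ratio(votes: List[str]) -> float:
    """Share of crowd votes that judged the character to be human."""
    if not votes:
        raise ValueError("no votes recorded for this character")
    counts = Counter(votes)
    return counts["human"] / len(votes)


# Hypothetical clip: 11 of 15 crowd judges voted "human".
example_votes = ["human"] * 11 + ["bot"] * 4
example_tpa = tpa_ratio(example_votes)
print(round(example_tpa, 7))
```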
Calculating TPA (Third-Person Assessment)
Crowdsourcing Judging
Player | FPA | TPA | H++
Xenija | 0.17139763 | 0.8235294 | 0.4974635
MirrorBot | 0.20164771 | 0.7333333 | 0.4674905
Player | 0.19328127 | 0.6315789 | 0.4124301
tmchojo | 0.17757519 | 0.6470588 | 0.4123170
NizorBot | 0.11821633 | 0.7058824 | 0.4120493
BotTracker | 0.20070203 | 0.5909091 | 0.3958056
CCBot | 0.06214746 | 0.7058824 | 0.3840149
Juan_CVC | 0.12372294 | 0.6190476 | 0.3713853
OvGUBot | 0.10545765 | 0.6086957 | 0.3570767
ADANN | 0.08351664 | 0.4761905 | 0.2798536
Calculating TPA (Third-Person Assessment)
Crowdsourcing Judging
[Chart: TPA scores of the 12 characters; human (H) / bot (B) labels in order: H B B B H H H H H B B B]
Final Results (FPA + TPA)
[Chart: combined scores of the 10 players; human (H) / bot (B) labels in order: H B H H B B B H B B]
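The combined ranking can be reproduced from the FPA and TPA columns of the results table. The Python sketch below averages the two scores with the 0.5 weighting factors, sorts the players, and prints the H/B sequence together with the order of the bots. Variable names are illustrative; the figures are copied from the table.

```python
# name: (type, FPA, TPA), copied from the FPA / TPA / H++ table.
results = {
    "Xenija": ("H", 0.17139763, 0.8235294),
    "MirrorBot": ("B", 0.20164771, 0.7333333),
    "Player": ("H", 0.19328127, 0.6315789),
    "tmchojo": ("H", 0.17757519, 0.6470588),
    "NizorBot": ("B", 0.11821633, 0.7058824),
    "BotTracker": ("B", 0.20070203, 0.5909091),
    "CCBot": ("B", 0.06214746, 0.7058824),
    "Juan_CVC": ("H", 0.12372294, 0.6190476),
    "OvGUBot": ("B", 0.10545765, 0.6086957),
    "ADANN": ("B", 0.08351664, 0.4761905),
}

# H++ = FPA / 2 + TPA / 2
h_pp = {name: 0.5 * fpa + 0.5 * tpa for name, (_, fpa, tpa) in results.items()}

ranked = sorted(h_pp, key=h_pp.get, reverse=True)
sequence = " ".join(results[name][0] for name in ranked)
bots_ranked = [name for name in ranked if results[name][0] == "B"]

print(sequence)     # H B H H B B B H B B
print(bots_ranked)
```

The three highest-scoring bots under this computation are MirrorBot, NizorBot and BotTracker.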
Final Results (H++)
Bot | H++ | Team members
MirrorBot | 0.467 | Mihai Polceanu
NizorBot | 0.412 | José L. Jiménez López, Antonio J. Fernández-Leiva, Antonio M. Mora
BotTracker | 0.395 | Hunjoo Lee, Jee-Hyong Lee
OvGUBot | 0.357 | Xenija Neufeld, Sanaz Mostaghim
Congratulations on your results!
Hope to see you again next year.
www.botprize.org
human-machine.unizar.es
Raul.Arrabales@Conscious-Robots.com
@ConsciousRobots