OLIVAW: reaching superhuman strength at Othello

https://www.meetup.com/it-IT/Machine-Learning-Data-Science-Meetup/events/258491250/

Published in: Technology

  1. 1. OLIVAW: reaching superhuman strength at Othello using the AlphaGo Zero paradigm with zero budget. Antonio Norelli, 19-02-19. From my Master's thesis "Deep learning for Othello", Computer Science, Sapienza University of Rome. Thesis advisor: Prof. Alessandro Panconesi
  2. 2. Why games? Shannon 1949, Turing 1950, McCarthy 1956
  3. 3. "A satisfactory solution of [chess] will act as a wedge in attacking other problems of a similar nature and of greater significance." Shannon 1949, "Programming a Computer for Playing Chess", Philosophical Magazine, Vol. 41, No. 314
  4. 4. Games as a micro-world: clear objectives, small set of rules, still interesting complexity
  5. 5. Choosing a move: naive approach
  6. 6. Games as trees
  9. 9. Looking into the future: Tic-tac-toe, Connect 4, Checkers, Othello, Chess, Go
  11. 11. Exploring the full tree is impossible. [Bar chart: game tree complexity, log10(b^d), for Tic-tac-toe, Connect 4, Checkers, Othello, Chess and Go; axis from 0 to 400]
  13. 13. What does the future hold for us?
  15. 15. Oracle to evaluate intermediate positions
  16. 16. Oracle to evaluate intermediate positions, e.g. -0.42
  17. 17. Oracle using game knowledge
  18. 18. Deep Blue 1997
  19. 19. AlphaGo 2016
  20. 20. AlphaGo Zero 2017
  24. 24. The AlphaGo Zero paradigm: is it universal? Does it scale DOWN in terms of resources? This thesis: Can we reach superhuman strength at Othello with the same paradigm and "normal" computing power?
  25. 25. AlphaGo Zero vs. my solution. TPUs: 5000 vs. ≈1. Days of training: 40 (3) vs. 30. Estimated hardware cost: $25 million vs. $500. Budget: 0
  30. 30. Why Othello? Simple but still interesting
  36. 36. Why Othello? Well known. Simpler but still interesting. Easy to implement
  37. 37. OLIVAW algorithm
  41. 41. OLIVAW training process
  42. 42. OLIVAW training process: +0.24
      [[0.   0.   0.   0.   0.   0.   0.   0.  ]
       [0.   0.   0.03 0.   0.   0.   0.   0.  ]
       [0.15 0.   0.   0.   0.   0.   0.   0.  ]
       [0.64 0.   0.   0.   0.   0.   0.   0.  ]
       [0.   0.   0.   0.   0.   0.   0.   0.  ]
       [0.   0.   0.   0.   0.   0.   0.   0.  ]
       [0.   0.   0.   0.   0.   0.   0.   0.  ]
       [0.   0.   0.18 0.   0.   0.   0.   0.  ]]
  45. 45. OLIVAW training process: -0.03
      [[0.   0.   0.11 0.11 0.   0.   0.   0.12]
       [0.   0.13 0.9  0.   0.   0.   0.   0.  ]
       [0.10 0.11 0.   0.   0.   0.   0.   0.  ]
       [0.12 0.   0.   0.   0.   0.   0.   0.  ]
       [0.   0.   0.   0.   0.   0.   0.   0.  ]
       [0.   0.   0.   0.   0.   0.   0.   0.  ]
       [0.   0.   0.   0.   0.   0.   0.   0.  ]
       [0.   0.   0.11 0.   0.   0.   0.11 0.  ]]
  46. 46. OLIVAW training process VS
  54. 54. OLIVAW training process: 1. Self-play games generation; 2. Neural net training; 3. Evaluation
  56. 56. Pseudocode
  57. 57. Reaching superhuman strength
  59. 59. OLIVAW vs Alessandro Di Mattei (2016-2017 Italian champion), 27-11-2018: 2-3 (Draw, Draw, Defeat, Victory, Defeat)
  61. 61. OLIVAW vs Alessandro Di Mattei, 3-12-2018: 4-0
  62. 62. OLIVAW strength. [Chart: Elo (axis from 0 to 2400) by generation (1 to 18), with Alessandro Di Mattei marked for reference]
  63. 63. OLIVAW vs Di Mattei 03-12 Game 4
  64. 64. OLIVAW vs Michele Borassi 2008 World Othello champion
  65. 65. OLIVAW is still training. [Video trailer]
  66. 66. OLIVAW vs Michele Borassi, best of 3: 19 January 2019, 16:30, Dipartimento di Matematica Guido Castelnuovo, Sapienza, piazzale Aldo Moro 5
  67. 67. OLIVAW vs Michele Borassi, Game 1: OLIVAW wins with black (1-0)
  68. 68. OLIVAW vs Michele Borassi, Game 2: Michele Borassi wins with black (1-1)
  69. 69. OLIVAW vs Michele Borassi, Game 3: Michele Borassi wins with black (1-2)
  70. 70. Thanks! Any questions? You can find me at noranta4@gmail.com. OLIVAW: reaching superhuman strength at Othello using the AlphaGo Zero paradigm with zero budget. Antonio Norelli
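
The "game tree complexity" chart in slide 11 plots log10(b^d), where b is the average branching factor and d the typical game length in plies. A minimal sketch of that calculation follows. The (b, d) pairs are rough, commonly quoted estimates and are assumptions for illustration, not values read off the slides; the function and dictionary names are hypothetical.

```python
import math


def log10_tree_size(branching_factor: float, depth: int) -> float:
    """Return log10(b**d), computed as d * log10(b) to avoid huge integers."""
    return depth * math.log10(branching_factor)


# Rough (average branching factor b, typical game length d in plies).
# ASSUMPTION: illustrative order-of-magnitude estimates, not taken from the talk.
ROUGH_ESTIMATES = {
    "Tic-tac-toe": (4, 9),
    "Othello": (10, 58),
    "Chess": (35, 80),
    "Go": (250, 150),
}


def complexity_table(estimates=ROUGH_ESTIMATES):
    """Map each game to its approximate log10 game tree complexity."""
    return {game: log10_tree_size(b, d) for game, (b, d) in estimates.items()}


if __name__ == "__main__":
    for game, exponent in complexity_table().items():
        print(f"{game:12s} about 10^{exponent:.0f} leaves")
```

Under these assumptions Othello sits well below chess and Go but far above anything that can be searched exhaustively, which is the point of the slide.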

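The three steps named in the training-process slides (self-play games generation, neural net training, evaluation) can be sketched as a loop. This is a toy under loud assumptions, not OLIVAW's code: the "network" is a single Elo-like number, training is a random perturbation, and the 55% acceptance threshold is borrowed from the AlphaGo Zero paper rather than stated in the slides. All function names and parameters are hypothetical.

```python
import random


def self_play(strength, n_games, rng):
    """Step 1 stand-in: produce fake game records from the current champion."""
    return [(strength, rng.random()) for _ in range(n_games)]


def train(strength, games, rng):
    """Step 2 stand-in: return a candidate with a randomly perturbed strength."""
    return strength + rng.gauss(20.0, 60.0)  # ASSUMPTION: Elo-like points


def evaluate(candidate, champion, n_games, rng):
    """Step 3: candidate VS champion; return the candidate's win rate."""
    # Standard Elo expected score for the candidate.
    p_win = 1.0 / (1.0 + 10 ** ((champion - candidate) / 400.0))
    wins = sum(rng.random() < p_win for _ in range(n_games))
    return wins / n_games


def training_loop(generations, games_per_gen=50, eval_games=40,
                  threshold=0.55, seed=0):
    """Run the generate-train-evaluate cycle; return champion strength per step."""
    rng = random.Random(seed)
    champion = 0.0
    history = [champion]
    for _ in range(generations):
        games = self_play(champion, games_per_gen, rng)
        candidate = train(champion, games, rng)
        # ASSUMPTION: the candidate replaces the champion only above the threshold.
        if evaluate(candidate, champion, eval_games, rng) >= threshold:
            champion = candidate
        history.append(champion)
    return history


if __name__ == "__main__":
    print(training_loop(generations=18))
```

The evaluation gate is what keeps a noisy training step from replacing a stronger network with a weaker one most of the time; with few evaluation games a lucky weaker candidate can still slip through.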