Published on

Talk given at NUTN in 2011


Published in:
Education


- 1. SOME TOOLS FOR ARTIFICIAL INTELLIGENCE. Olivier Teytaud (olivier.teytaud@gmail.com). NUTN, Tainan, 2011
- 2. Tao (Inria, Cnrs, Lri, Paris-Sud). People: 11 permanent staff, ~15 Ph.D. students. In Université Paris-Sud: largest campus in France; faculty of sciences: mathematics, computer science, physics, chemistry, biology, earth and space sciences ==> 12000 students. Inria affiliation: around 50 years old, devoted to research in computer science.
- 3. Tao (Inria, Cnrs, Lri, Paris-Sud). Reservoir computing; optimal decision making under uncertainty; optimization; autonomic computing; machine learning.
- 4. Communication not always so easy: ● Many of you speak Chinese + Taiwanese. So English = third language. I am French. English = second language. ● I work mainly in mathematical aspects of computer science, more than computer science. Difficulties might also be an enrichment. Feel free to interrupt me as much as useful.
- 6. Vita in a nutshell: 1) First research: mathematical logic. 2) I had fun, but I wanted to be “directly” useful. I switched to Statistics. 3) I had fun, but I wanted to be “more directly” useful. Switched to Operational Research, in industry. - Many applications. - My favorite: electricity generation. 4) Now (40 dangerously approaching), Artificial Intelligence: - Mathematics. - Challenges (in particular games). - Applications.
- 8. Vita in a nutshell: 1) First research: mathematical logic. 2) I had fun, but I wanted to be “directly” useful. I switched to Statistics. 3) I had fun, but I wanted to be “more directly” useful. Switched to Operational Research, in industry. - Many applications. - My favorite: electricity generation. 4) Now (40 dangerously approaching), Artificial Intelligence: - Mathematics. - Challenges (in particular games). - Applications. Note on Operational Research: it goes back to military applications around World War II, when the UK resisted Hitler thanks to optimized radars. Now essentially civil applications.
- 9. Vita in a nutshell: 1) First research: mathematical logic. 2) I had fun, but I wanted to be “directly” useful. I switched to Statistics. 3) I had fun, but I wanted to be “more directly” useful. Switched to Operational Research, in industry. - Many applications. - My favorite: electricity generation. 4) Now (40 years old soon...), Artificial Intelligence: - Mathematics. - Beautiful challenges (in particular games). - Applications.
- 10. Outline of what I'll discuss: 1) Some concepts: - simplified problems - toolboxes for these problems. 2) Principle: - reducing real problems to groups of artificial problems - small problems might be considered as artificial and useless when considered alone - but when you solve a clearly stated small problem, usually you can find an application for this solution - we will see applications as well. ==> For the moment let's see “big” applications. 3) I'll also show some works on which contributors are welcome.
- 11. EXAMPLES OF APPLICATIONS
- 12. ELECTRICITY GENERATION
- 13. ELECTRICITY GENERATION. The case of France. Data: - climate model (stochastic) - model of electricity demand (stochastic) - model of power plants. Each day we receive: - electricity consumption - weather information - info on faults. Each day, we decide how to distribute the production among the power plants (also: schedule long-term investments).
- 14. Data: - climate model (stochastic) - model of electricity demand (stochastic) - model of power plants (PP): nuclear PP (NPP), thermal PP (TPP), hydroelectric PP (HPP)... Each day we receive: - electricity consumption - weather information - info on faults. Each day, we decide how to distribute the production among the power plants. [Diagram labels: DATA (climate, plants, economy); PROGRAM; STRATEGY; Daily information; Electric system; Decisions.]
- 15. One of the most important industrial problems you can imagine: how to produce energy ? France has specific elements: - heavily nuclearized (most nuclearized country in the world) - plants often cooled by rivers (do not work in case of droughts ==> hard to predict) - we must schedule maintenance - we must take long-term decisions (building new NPP ? Removing ?) - also hydroelectricity: - should we use water now ? - should we keep it for winter (in France, high consumption is in winter) ? [Diagram labels: DATA (climate, plants, economy); PROGRAM; STRATEGY; Daily information; Electric system; Decisions.]
- 16. Problem 1: Taiwan is very different from France :-) ● Almost no nuclear power plant ? Cooled by sea ? ● Electrically connected to other countries ? (France might be connected to Africa) ● Sun sufficient for massive photovoltaic units ? ● Wind much stronger than in France: can it be used ? ● Electricity consumption dominated by air conditioning ? ● Maybe electric cars in the future ? ● Climate maybe more regular ? Problem easier than in France ? ● Other questions ? ==> I don't know ==> I'd like to work on it (energy is an important concern, in Taiwan as well; lack of independence ?) ==> Need Chinese-reading persons ==> Other (Taiwan-independent) concern: tackling partial observation in energy generation problems
- 17. GAME OF GO (with NUTN). GOOD NEWS: we had a lot of progress with **generic** algorithms (algorithms which can be used for many things). The revolution in Go which occurred in 2007-2009 is a major breakthrough in Artificial Intelligence. We'll see that in detail. I am a little bit tired of the game of Go, because I have no recent progress, and recent progress in the community comes from Go expertise, which is only useful for Go...
- 18. Problem 2: Solving unsolved situations in Go● Now computers are much stronger than in the past.● However, they still misunderstand some trivial situations (in particular, liberty races).● You have an idea ? Tell me :-)● We have a solver in France (not for playing Go; aimed at provably solving), that we would like to test on various situations. We do not play Go. If you are 5kyu or better, you can contribute.
- 19. URBAN RIVALS. 17 million registered users. Important company.
- 20. URBAN RIVALS - Choose 4 cards, your opponent chooses 4 cards - Each player gets 12 “Pilz” (i.e. strength points) - Each player gets health points. - Each turn: - each player chooses a card - each player uses pilz (each used pilz is lost forever, but it gives strength) - read cards, apply rules ==> no more health points ? ==> you're dead.
- 21. Urban Rivals ==> Partial information because you don't observe your opponent's decisions ==> There are “off-the-shelf” algorithms and programs for full-information games, but not for partial-information games. ==> We used a (provable) combination of MCTS and EXP3 ==> Immediately human-level performance ==> suggests that maths can help ==> still possible works: - automatic choice of cards ? - reducing comp. cost ?
- 22. POKEMONS 皮卡丘 (Pikachu). Second most lucrative videogame. Meta-gaming: choosing your deck.
- 23. POKEMONS: Problem 3. Second most lucrative videogame. Meta-gaming: choosing your deck. In-gaming: playing with your set of cards.
- 24. Problem 4: Solving MineSweeper. Find an optimal move ? ● Looks like a trivial, boring problem. It certainly is not. ● Many papers with the same approach (so-called CSP technique) ● We could outperform these algorithms thanks to a probabilistic approach. ● But my approach only works on small boards (or at huge computational cost) ==> we want to extend. ● Quite similar to electricity generation (yes, I believe in this)
- 25. Game applications can be considered as childish. Shouldn't we focus on more important things ? However: - If you have a breakthrough in an important game, people will trust you. Doors will be opened when you propose new algorithms for real-world applications. - Testing ideas on a nuclear power plant is more dangerous than testing ideas on a game of Go. - It's easier to compare approaches in games than in electricity generation.
- 26. INTRODUCTION IS OVER. NOW TECHNICAL STUFF. REMARKS, QUESTIONS ?
- 27. TODAY, GAMES. 1) HOW TO SOLVE THEM 2) C IMPLEMENTATION
- 28. ONE FUNDAMENTAL TOOL: ZERMELO. Consider the following game: - there are 5 sticks; - in turn, each player removes 1 or 2 sticks; - the player who removes the last stick loses. Example: Player I: IIIII; Player II: III; Player I: I ==> loses! How should I play ?
- 29. ONE FUNDAMENTAL TOOL: ZERMELO. Zermelo proposed a solution (for full-information games). Born in 1871. 1900-1905: major contributions in logic. 1913: major contribution to games. 1931: Optimized navigation (from games to applications). Resigned in 1935 (he did not like Hitler). Died in 1953.
- 30. ONE FUNDAMENTAL TOOL: ZERMELO. [Figure: game tree of the 5-sticks game. Each node shows the number of remaining sticks (root: 5; its children: 4 and 3; and so on) and is labeled WIN! or LOSS!]
- 31. ZERMELO: I HAVE THE OPTIMAL STRATEGY! [Figure: game tree of the 5-sticks game, each node showing the number of remaining sticks and labeled WIN! or LOSS!]
- 32. ZERMELO: not limited to win/loss games. Can work on games with continuous rewards. New rule: if the game contains 4, reward is multiplied by 2. YELLOW NODES: LABEL = MINIMUM OF CHILDREN'S LABELS. BLUE NODES: LABEL = MAXIMUM OF CHILDREN'S LABELS. [Figure: the 5-sticks game tree with numeric labels (0, 1 or 2) on the nodes.]
- 33. ZERMELO: C CODE

      #include <float.h>  /* DBL_MAX */

      struct gameState {
          int *descriptionOfState;
          int numberOfLegalMoves;
          int *legalMoves;
          int turn;    // 1 if player 1 plays, -1 otherwise
          int result;  // final reward, if numberOfLegalMoves==0
      };

      // RULES: game-specific; returns the state reached from s by playing move
      struct gameState next(struct gameState s, int move);

      double zermeloValue(struct gameState s)
      {
          int i;
          double value;
          double maxValue = -DBL_MAX;
          if (s.numberOfLegalMoves == 0) return s.turn * s.result;
          for (i = 0; i < s.numberOfLegalMoves; i++) {
              // s.turn flips the sign so that the player to move always maximizes
              value = s.turn * zermeloValue(next(s, s.legalMoves[i]));
              if (value > maxValue) maxValue = value;
          }
          return s.turn * maxValue;  // we return the value for player 1
      }
- 35. Last week: Zermelo algorithm. What is Zermelo ? = Simplest algorithm for solving 1-player or 2-player games. = Recursive algorithm = Conveniently (but slowly) implemented with “struct”. This week = a bit more on Zermelo algorithm = C development: “static” random variables. Future weeks: Still some C implementation (or other languages ? as you wish). Still some (not always easy) algorithms. Models of applications. I hope I can convince you that operational research / artificial intelligence are useful and fun.
- 36. Zermelo again. What does the “zermeloValue()” function return ? ===> The reward in case of perfect play. ===> A perfect strategy. ===> Gods can run Zermelo algorithms: perfect play. ==> humans have no time for this. ==> Can we design a new version in case it is too slow ?
- 37. Let's see a pseudo-code, instead of a code.

      double zermeloValue(struct gameState s)
      {
          if (s is end of game) then return score.
          else {
              if (player 1 plays) then return max(zermeloValue(children))
              else return min(zermeloValue(children))
          }
      }
- 38. ZERMELO: A NATURAL CONCEPT, THE DEPTH. [Figure: the 5-sticks game tree, each node annotated with its depth in parentheses: 5 (0); 4 (1), 3 (1); 3 (2), 2 (2), 2 (2), 1 (2); 1 (3), 2 (3). Nodes are labeled WIN! or LOSS!]
- 39. ZERMELO: C CODE FOR THE DEPTH

      double zermeloValue(struct gameState s)
      {
          static int depth = 0;  // current depth, shared by all recursive calls
          int i;
          double value;
          double maxValue = -DBL_MAX;  // DBL_MAX is defined in <float.h>
          if (s.numberOfLegalMoves == 0) return s.turn * s.result;
          depth++;
          for (i = 0; i < s.numberOfLegalMoves; i++) {
              value = s.turn * zermeloValue(next(s, s.legalMoves[i]));
              if (value > maxValue) maxValue = value;
          }
          depth--;
          return s.turn * maxValue;  // we return the value for player 1
      }
- 40. Sometimes it is too slow. Then, what can I do ?
- 41. Etc... too big!
- 42. We will not go below this depth.
- 43. We will not go below this depth. But what should zermeloFunction return ?
- 44. Should we return a random number ?

      double zermeloValue(struct gameState s)
      {
          static int depth = 0;
          int i;
          double value;
          double maxValue = -DBL_MAX;  // DBL_MAX is defined in <float.h>
          if (s.numberOfLegalMoves == 0) return s.turn * s.result;
          if (depth > 5) return drand48();  // drand48() is declared in <stdlib.h> (POSIX)
          depth++;
          for (i = 0; i < s.numberOfLegalMoves; i++) {
              value = s.turn * zermeloValue(next(s, s.legalMoves[i]));
              if (value > maxValue) maxValue = value;
          }
          depth--;
          return s.turn * maxValue;  // we return the value for player 1
      }

- 45. heuristicValue: a function written by some expert of the game.

      double zermeloValue(struct gameState s)
      {
          static int depth = 0;
          int i;
          double value;
          double maxValue = -DBL_MAX;  // DBL_MAX is defined in <float.h>
          if (s.numberOfLegalMoves == 0) return s.turn * s.result;
          if (depth > 5) return heuristicValue(s);  // too deep: ask the expert
          depth++;
          for (i = 0; i < s.numberOfLegalMoves; i++) {
              value = s.turn * zermeloValue(next(s, s.legalMoves[i]));
              if (value > maxValue) maxValue = value;
          }
          depth--;
          return s.turn * maxValue;  // we return the value for player 1
      }
- 46. SHANNON and games. This idea is a main contribution by Shannon (for European chess). Shannon 1916-2001. Noble prize (not Nobel!). Works in: - Logic - Games (also: artificial mouse for mazes) - Financial analysis
- 47.

      double heuristicValue(struct gameState s)
      {
          if (!strcmp(gameName, "chineseChess")) {  // strcmp() is declared in <string.h>
              // weighted material difference between the two sides
              return 0.1  * (nbOfBlackElephants(s) - nbOfRedElephants(s))
                   + 0.1  * (nbOfBlackGuards(s) - nbOfWhiteGuards(s))
                   + 0.03 * (nbOfBlackPieces(s) - nbOfWhitePieces(s))
                   + 0.01 * (nbOfBlackPawns(s) - nbOfWhitePawns(s));
          } else {
              assert(0);  // no heuristic for other games; assert() is declared in <assert.h>
          }
      }

- 48. Zermelo's algorithm is too slow. MINIMAX: an approximation of Zermelo's algo. Thanks to Wikipedia
- 49. ALPHA-BETA (thks WIKIPEDIA)
- 50. ALPHA-BETA. PRINCIPLE OF ALPHA-BETA: In zermeloFunction, considering an opponent node, if I know: - THAT AT PREVIOUS DEPTH, I CAN REACH SCORE ALPHA=6, - THAT IN CURRENT STATE MY OPPONENT CAN ENSURE SCORE BETA<6, I CAN STOP STUDYING THIS BRANCH. ==> THIS IS AN “ALPHA-CUTOFF” ==> OTHER PLAYER: “BETA-CUTOFF” (just exchange players)
- 51. ALPHA-BETA (thks WIKIPEDIA)
- 52. EXAMPLE OF GAME (we can discuss why it is a good game) - Randomly generate a 4x4 matrix with 0 and 1 (K=4): 0011 / 1001 / 0111 / 1000 - Player one removes top part or bottom part: 0111 / 1000 - Player two removes left part or right part: 01 / 10 - Player one removes top part or bottom part: 01 - Player two removes left part or right part: 0 ==> Player one wins if 1, player two wins if 0!
- 53. POSSIBLE HOMEWORK 1) ZERMELO: can you implement it on a simple game ? 2) MINIMAX: can you add a heuristic function ? Which heuristic function ? Experiments: plot a graph: X(depth) = computation time of minimax (divided by Zermelo's computation time), Y(depth) = win rate against Zermelo. 3) ALPHA-BETA: Can you modify it ==> alpha-beta pruning ? Plot a graph for various sizes: X = number of visited nodes, Y = average winning rate of alpha-beta vs minimax. Or X = depth, Y = average winning rate of a-b vs a-b with depth -1.
- 54. APPLICATION OF ZERMELO. WE HAVE SEEN THE 5-STICKS GAME. CAN WE FIND A REALLY USEFUL APPLICATION ?
- 55. APPLICATION OF ZERMELO. WE HAVE SEEN THE 5-STICKS GAME. CAN WE FIND A REALLY USEFUL APPLICATION ? I have: - water
- 56. APPLICATION OF ZERMELO. WE HAVE SEEN THE 5-STICKS GAME. CAN WE FIND A REALLY USEFUL APPLICATION ? I have: - water - plants (which need water during summer's heat wave)
- 57. APPLICATION OF ZERMELO. WE HAVE SEEN THE 5-STICKS GAME. CAN WE FIND A REALLY USEFUL APPLICATION ? I have: - water - plants (which need water during summer's heat wave). Actions = giving water to plants, or not.
- 58. APPLICATION OF ZERMELO. I have: - water - plants (which need water during summer's heat wave). Each day, I choose an action. State = { date + water level in stock + water level in plants }. Reward = quality / quantity of production.
- 59. Zermelo ==> optimal sequence of actions ==> optimal stock level.
- 60. IMPORTANT REMARK: - Maybe this does not look serious. - But heat waves are a serious problem. - Here the problem is simplified, but the concepts for the real application are the same. - Applying this just requires a computer and data/models about plants/water resources. ==> if you can apply Zermelo variants correctly, you can help make a better world.
- 61. However, the “nextState” function is randomized ==> we need a Zermelo for this case
- 62. s.turn == 0: action is randomly chosen. This is Zermelo, adapted to stochastic games. References: - Massé - Bellman

      double zermeloValue(struct gameState s)
      {
          int i;
          double value;
          static int depth = 0;
          if (s.turn == 0) {  // chance node: average over the equally likely moves
              value = 0;
              for (i = 0; i < s.numberOfLegalMoves; i++)
                  value += zermeloValue(next(s, s.legalMoves[i]));
              return value / s.numberOfLegalMoves;
          }
          double maxValue = -DBL_MAX;  // DBL_MAX is defined in <float.h>
          if (s.numberOfLegalMoves == 0) return s.turn * s.result;
          if (depth > 5) return heuristicValue(s);
          depth++;
          for (i = 0; i < s.numberOfLegalMoves; i++) {
              value = s.turn * zermeloValue(next(s, s.legalMoves[i]));
              if (value > maxValue) maxValue = value;
          }
          depth--;
          return s.turn * maxValue;  // we return the value for player 1
      }
- 63. ONE MORE TOOL: MATRIX GAMES. The problem: Solving Matrix Games. A solution: EXP3.
- 64. What is a (0-sum) Matrix Game ? Example: M = [1 0 0; 0 1 1; 1 0 1] (rows separated by semicolons). - You choose (privately) a row (i is 1, 2 or 3). - At the same time, I choose (privately) a column (j=1, 2 or 3). - My reward: M(i,j) - Your reward: -M(i,j). I want a 1, you want a 0. Given M, how should I play ?
- 65. What is a (0-sum) Matrix Game ? Example: rock-paper-scissor. Rows and columns are Rock, Paper, Scissor: M = [Rock: 0 -1 1; Paper: 1 0 -1; Scissor: -1 1 0]. - You choose (privately) a row (i is 1, 2 or 3). - At the same time, I choose (privately) a column (j=1, 2 or 3). - My reward: M(i,j) - Your reward: -M(i,j). I want a 1, you want a -1. Given M, how should I play ?
- 66. Given M, how should I play ? Nash (diagnosed with paranoid schizophrenia) got a Nobel prize for his work around that. Principle of a Nash equilibrium: - pure strategy = “fixed” strategy (e.g. “play scissor”) - mixed strategy = randomized strategy (e.g. “play scissor with probability ½ and play rock with probability ½”) - choose the mixed strategy such that “the worst possible score against any opponent strategy is maximum” ==> “Nash” strategy ==> EXP3: algorithm for finding Nash strategies.
- 67. IMPORTANT FACTS ON GAMES: - Turn-based, full-information games: solvers exist: - Too slow for chess, Go. - Ok for 8x8 checkers. ==> Zermelo ==> variants: Minimax, Alpha-beta, play many games reasonably well - Matrix games: - Nash strategies = worst-case optimal - Nash strategies = randomized strategies
- 68. A BETTER EXAMPLE ? POKEMON. Each player chooses 2 pokemons among the 3 possible ones (real life: 3 or 4 among hundreds).
- 69. A BETTER EXAMPLE ? POKEMON.Three possibilities:
- 70. A BETTER EXAMPLE ? POKEMON. Three possibilities (the same as choosing a row in a 3x3 matrix game). [Figure: 3x3 grid, Player 1's choices against Player 2's choices.] Check who wins (by some full-observation game-solver).
- 71. A BETTER EXAMPLE ? POKEMON. Three possibilities (the same as choosing a row in a 3x3 matrix game). Winner for each pair of choices (Player 1 against Player 2): [P1 P2 P2; P2 P1 P1; P1 P2 P1] (rows separated by semicolons).
- 72. A BETTER EXAMPLE ? POKEMON. Three possibilities (the same as choosing a row in a 3x3 matrix game). Matrix for Player 1 against Player 2: [1 0 0; 0 1 1; 1 0 1] (rows separated by semicolons).
- 73. EXP3 principle for Nash equilibrium of KxK matrix M:

      - choose a number N of iterations
      - S1 = null vector
      - S2 = null vector
      - at each iteration t=1, ..., t=N:
        {
          - compute p1 as a function of S1   // we will see how
          - compute p2 as a function of S2   // we will see how
          - randomly draw i according to probability distribution p1
          - randomly draw j according to probability distribution p2
          - define r = M(i,j) in the matrix
          - S1(i) += r / p1(i)
          - S2(j) += (1-r) / p2(j)
          - Player1Nash(i) += (1/N);
          - Player2Nash(j) += (1/N);
        }
- 75. ==> see C source code
- 76. Q&A: (my questions, and also yours) Q: Who cares about matrix games ? A: Useful for many things. Unfortunately, it's usually a building block inside more complex algorithms. We will see examples, but later. Q: Is a Nash strategy optimal ? A: It depends for what... It is optimal in a worst-case sense (i.e. against a very strong opponent). Not necessarily very good against a weak opponent.
