Decision Theory Research at FRI

Johannes Treutlein & Caspar Oesterheld (both researchers at the Foundational Research Institute), EA Global X Berlin 2017, Oct 14/15 2017

Published in: Government & Nonprofit


  1. Johannes Treutlein, Foundational Research Institute: Decision theory research at FRI
  2. Johannes Treutlein, Foundational Research Institute: A wager for evidential decision theory
  3. Altruistic Newcomb problem (figure: predictor Ω; one box holds one wish; the opaque box "?" holds two wishes if Ω predicts one-boxing and nothing if Ω predicts two-boxing)
  4. Altruistic Newcomb problem

            S1   S2
       A1    2    0
       A2    3    1

     ● A1: One-box; A2: Two-box
     ● S1: opaque box contains two wishes; S2: opaque box empty
  5. Evidential decision theory
  6. Causal decision theory
  7. Meta decision theory (Nozick 1993; MacAskill 2016)
  8. Altruistic Newcomb problem in a large universe (figure: many instances of Ω)
  9. Altruistic Newcomb problem in a large universe
  10. EDT Wager
     ● Large universe
     ● Caring about the gains of our copies
     ● Non-zero credence in EDT
     ● Meta decision theory
     Conclusion: Wager for evidential decision theory (and all other theories that take impact of copies into account)
  11. Relevance
     ● AI Safety
     ● Macrostrategy
     ● Multiverse-wide superrationality (Oesterheld 2017a)
  12. Caspar Oesterheld, Foundational Research Institute: Decision theory and approval-directed agents
  13. Implementing decision theories in AIs
     • Two problems of decision theory in AI safety:
       • What is the right decision theory for an AI?
       • How do we implement decision theories in AI?
     • Decision theory not explicit in AI architecture
       • Example: Doing what has worked well in the past (Oesterheld 2017b)
       • Exception: Gödel machine (Schmidhuber 2006)
  14. Approval-directed agency (Christiano 2014)
  15. Two decision theories
  16. Two decision theories
  17. Example
  18. Two decision theories
  19. Example
  20. In the paper…
     • If overseer only looks at the world, the agent’s DT is decisive.
     • If overseer only looks at the agent’s action, the overseer’s DT is decisive.
  21. Thank you. {johannes,caspar}@foundational-research.org
  22. References
     • Ahmed, A. (2014): Evidence, Decision and Causality. Cambridge University Press.
     • Almond, P. (2010): On Causation and Correlation. Part 2: Implications of Evidential Decision Theory. https://casparoesterheld.files.wordpress.com/2017/03/correlation2.pdf
     • Bostrom, N. (2014b): Superintelligence: Paths, Dangers, Strategies. Oxford University Press.
     • Christiano, P. (2014): Model-free decisions. https://ai-alignment.com/model-free-decisions-6e6609f5d99e
     • MacAskill, W. (2016): Smokers, Psychos, and Decision-Theoretic Uncertainty. The Journal of Philosophy.
     • Nozick, R. (1993): The Nature of Rationality. Princeton: Princeton University Press.
  23. References
     • Oesterheld, C. (2017a): Multiverse-wide Cooperation via Correlated Decision Making. https://foundational-research.org/files/Multiverse-wide-Cooperation-via-Correlated-Decision-Making.pdf
     • Oesterheld, C. (2017b): Doing what has worked well in the past leads to evidential decision theory. https://casparoesterheld.files.wordpress.com/2017/09/learningdt.pdf
     • Schmidhuber, J. (2006): Gödel Machines: Self-Referential Universal Problem Solvers Making Provably Optimal Self-Improvements. ftp://ftp.idsia.ch/pub/juergen/gm6.pdf
     • Soares, N. and Fallenstein, B. (2014a): Aligning Superintelligence with Human Interests: A Technical Research Agenda. MIRI Tech. rep. 2014-8. https://intelligence.org/files/TechnicalAgenda.pdf
     • Soares, N. and Fallenstein, B. (2014b): Toward Idealized Decision Theory. MIRI Tech. rep. 2014-7. https://arxiv.org/abs/1507.01986
     • Soares and Levinstein (2017): Cheating Death in Damascus. https://intelligence.org/files/DeathInDamascus.pdf
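
As a rough illustration of the payoff table on slide 4 and the wager outlined on slide 10, the following Python sketch compares EDT-style and CDT-style expected values and then mixes them by credence. The predictor accuracy, the prior that the opaque box is full, the number of copies, and the simple credence-weighted sum are all assumptions made for illustration. The slides do not specify them, and the talk's actual argument may differ.

```python
# Payoffs from slide 4, assumed to be counted in wishes.
# Rows are actions (A1, A2), columns are states (S1 = "full", S2 = "empty").
PAYOFF = {
    "one-box": {"full": 2, "empty": 0},
    "two-box": {"full": 3, "empty": 1},
}


def expected_payoff(action, p_full):
    """Expected payoff of `action` if the opaque box is full with probability p_full."""
    row = PAYOFF[action]
    return p_full * row["full"] + (1 - p_full) * row["empty"]


def edt_value(action, accuracy):
    """EDT-style value: the action is treated as evidence about the prediction.

    `accuracy` is a hypothetical probability that the predictor is right.
    """
    p_full = accuracy if action == "one-box" else 1 - accuracy
    return expected_payoff(action, p_full)


def cdt_value(action, p_full):
    """CDT-style value: the state probability does not depend on the action."""
    return expected_payoff(action, p_full)


def meta_value(action, credence_edt, accuracy, p_full, copies):
    """Credence-weighted value of `action` (a simplified meta decision theory).

    Assumption: under EDT the choice is evidence about all `copies` correlated
    agents, so the EDT term scales with `copies`; under CDT only the agent's
    own box is causally affected, so the CDT term does not scale.
    """
    edt_term = copies * edt_value(action, accuracy)
    cdt_term = cdt_value(action, p_full)
    return credence_edt * edt_term + (1 - credence_edt) * cdt_term


def best_action(credence_edt, accuracy, p_full, copies):
    """Return the action with the highest credence-weighted value."""
    return max(
        PAYOFF,
        key=lambda a: meta_value(a, credence_edt, accuracy, p_full, copies),
    )


if __name__ == "__main__":
    for n in (1, 1000):
        choice = best_action(credence_edt=0.01, accuracy=0.9, p_full=0.5, copies=n)
        print(f"copies={n}: {choice}")
```

In this toy model, two-boxing is always worth exactly one wish more under the CDT-style calculation, while the EDT-style advantage of one-boxing grows with the number of copies, so a small credence in EDT can decide the outcome once the assumed number of copies is large enough.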
