
Playing Around with a Reinforcement Learning Algorithm Called A3C


Published on

2015/07/23 PFI Seminar presentation material https://www.youtube.com/watch?v=uiEtfyBAAHQ

Published in: Technology


  1. 1. $d\theta_v = \frac{\partial (R - V(s_i; \theta_v))^2}{\partial \theta_v}$, $d\theta = \nabla_\theta \log \pi(a_i \mid s_i; \theta)\,(R - V(s_i; \theta_v))$
  2. 2. $g = \alpha g + (1 - \alpha)\,\Delta\theta^2$, $\theta \leftarrow \theta - \eta \frac{\Delta\theta}{\sqrt{g + \epsilon}}$
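
The two slides above appear to give the A3C gradient terms (a squared-error gradient for the value function and an advantage-weighted log-probability gradient for the policy) and an RMSProp-style parameter update. The following is a minimal pure-Python sketch of those formulas on a toy problem, not the presenter's implementation. The function names, the linear value function, the state-independent softmax policy, and the hyperparameter values (`alpha`, `eta`, `eps`) are all assumptions made for illustration.

```python
import math


def value_grad(R, v, dv_dtheta):
    """Gradient of (R - V)^2 w.r.t. the value parameters: -2 (R - V) dV/dtheta_v."""
    return [-2.0 * (R - v) * d for d in dv_dtheta]


def policy_grad(R, v, dlogpi_dtheta):
    """grad log pi(a_i | s_i; theta) * (R - V). This is an ascent direction."""
    return [(R - v) * d for d in dlogpi_dtheta]


def rmsprop_step(theta, dtheta, g, alpha=0.99, eta=0.01, eps=1e-8):
    """g = alpha g + (1 - alpha) dtheta^2; theta <- theta - eta dtheta / sqrt(g + eps).

    dtheta is treated as the gradient of a quantity to minimize.
    The hyperparameter defaults are illustrative assumptions.
    """
    g_new = [alpha * gi + (1.0 - alpha) * d * d for gi, d in zip(g, dtheta)]
    theta_new = [
        t - eta * d / math.sqrt(gi + eps)
        for t, d, gi in zip(theta, dtheta, g_new)
    ]
    return theta_new, g_new


# Toy value function (assumed): V(s; theta_v) = theta_v . s, so dV/dtheta_v = s.
s = [1.0, 2.0]
theta_v = [0.0, 0.0]
R = 1.0
v = sum(t * x for t, x in zip(theta_v, s))

d_theta_v = value_grad(R, v, s)
theta_v_new, g_v = rmsprop_step(theta_v, d_theta_v, [0.0, 0.0])
v_new = sum(t * x for t, x in zip(theta_v_new, s))

# Toy policy (assumed): softmax over two logits, ignoring the state.
# For a softmax, grad log pi(a) w.r.t. the logits is onehot(a) - pi.
theta_pi = [0.0, 0.0]
pi = [0.5, 0.5]
action = 0
dlogpi = [(1.0 if k == action else 0.0) - p for k, p in enumerate(pi)]
d_theta_pi = policy_grad(R, v, dlogpi)

# The policy term is an ascent direction, so negate it before the descent-style step.
theta_pi_new, g_pi = rmsprop_step(theta_pi, [-d for d in d_theta_pi], [0.0, 0.0])
```

In this toy setup the value estimate moves toward the return `R`, and the logit of the chosen action rises relative to the other, because the advantage `R - V` is positive. Note that the policy term is negated before the descent-style step; whether the original slides fold that sign into the update is not recoverable from the text.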
