[study] Long Text Generation via Adversarial Training with Leaked Information
1. Long Text Generation via Adversarial Training with Leaked Information
Jiaxian Guo, Sidi Lu, Han Cai, Weinan Zhang, Yong Yu, Jun Wang
AAAI 2018, pp.5141-5148
https://arxiv.org/pdf/1709.08624.pdf
Presenter: 남규현 (Natural Language Processing Lab, Kookmin University)
Natural Language Processing Lab. @Kookmin University
7. SeqGAN
• Generator
- The generator produces a sentence using GRU cells with attention.
T     1   2       3        4          5
Word  I   a meal  eat-and  to school  go
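• Generator update
- J(θ): the reward function of the model parameters θ
- J(θ) is the sum, over the actions a available in state s at time step t, of the generation probability G_θ(a | s) multiplied by the action value Q(s, a)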
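The sum described above can be illustrated with a small numeric sketch. The candidate words, the probabilities, and the Q values below are hypothetical numbers chosen for illustration; they do not come from the slides or the paper.

```python
def expected_reward(gen_probs, q_values):
    """J = sum over actions a of G(a | s) * Q(s, a) for one fixed state s."""
    if set(gen_probs) != set(q_values):
        raise ValueError("G and Q must be defined over the same actions")
    return sum(gen_probs[a] * q_values[a] for a in gen_probs)


# Hypothetical values for the state "I / a meal / eat-and".
gen_probs = {"to school": 0.5, "to home": 0.3, "to the library": 0.2}
q_values = {"to school": 0.9, "to home": 0.4, "to the library": 0.6}

J = expected_reward(gen_probs, q_values)
print(round(J, 2))
```

Raising the probability of an action with a high Q value increases J, which is the direction the generator update pushes in.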
8. SeqGAN
http://www.aaai.org/Conferences/AAAI/2017/PreliminaryPapers/12-Yu-L-14344.pdf
• Generator update
- State-action value function: Q
T     1   2       3        4          5
Word  I   a meal  eat-and  to school  go
- State: the words generated so far. Action: the next word, e.g. "to school", "to home …", "to the library …"
- Monte Carlo rollouts of the full sequence: $Y^1_{1:T}, Y^2_{1:T}, \ldots, Y^N_{1:T}$
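The rollout idea can be sketched as follows: to score an action taken in a partial sentence, complete the sentence N times with a rollout policy and average the discriminator's score. The uniform rollout policy and the toy discriminator below are hypothetical stand-ins, not the models used in SeqGAN.

```python
import random


def rollout_q(prefix, action, rollout_policy, discriminator, max_len, n_rollouts, rng):
    """Estimate Q(prefix, action) by averaging D over N completed rollouts."""
    total = 0.0
    for _ in range(n_rollouts):
        seq = list(prefix) + [action]
        # Complete the sentence up to length T = max_len.
        while len(seq) < max_len:
            seq.append(rollout_policy(seq, rng))
        total += discriminator(seq)
    return total / n_rollouts


VOCAB = ["I", "a meal", "eat-and", "to school", "go"]


def uniform_policy(seq, rng):
    # Hypothetical rollout policy: pick the next word uniformly at random.
    return rng.choice(VOCAB)


def toy_discriminator(seq):
    # Hypothetical D: "real" only if the sentence ends with the verb "go".
    return 1.0 if seq[-1] == "go" else 0.0


rng = random.Random(0)
q_mid = rollout_q(["I", "a meal", "eat-and"], "to school",
                  uniform_policy, toy_discriminator, 5, 200, rng)
q_end = rollout_q(["I", "a meal", "eat-and", "to school"], "go",
                  uniform_policy, toy_discriminator, 5, 3, rng)
```

When the action already completes the sentence (`q_end`), no rollout is needed and the estimate is simply the discriminator's score for the finished sequence.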
9. SeqGAN
• Generator update
- Derivative of J
- Gradient update
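The equations for this slide did not survive extraction. As a reference sketch, the policy-gradient form given in the SeqGAN paper is reproduced below; the notation ($Y_{1:t-1}$ for the state, $y_t$ for the action, $\alpha_h$ for the learning rate) follows that paper and may differ from what the slide showed.

```latex
\nabla_\theta J(\theta) \simeq \sum_{t=1}^{T}
  \mathbb{E}_{y_t \sim G_\theta(y_t \mid Y_{1:t-1})}
  \left[ \nabla_\theta \log G_\theta(y_t \mid Y_{1:t-1}) \,
         Q^{G_\theta}_{D_\phi}(Y_{1:t-1}, y_t) \right]

\theta \leftarrow \theta + \alpha_h \nabla_\theta J(\theta)
```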
10. SeqGAN
• Discriminator
- Classifies sequences with a CNN
> Concatenation
> Convolution
> Pooling
• Discriminator update
- Real data distribution p_data, generated data distribution G_θ
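A minimal sketch of the two ideas on this slide, under stated assumptions: `conv_max_pool` applies a single convolution filter over the stacked (concatenated) word embeddings and max-pools over time, and `discriminator_loss` is the usual binary cross-entropy between real samples (from p_data) and generated samples (from G_θ). The function names, the single filter, and the ReLU are illustrative choices, not details taken from the slides.

```python
import math


def conv_max_pool(embeddings, kernel):
    """One filter (window x dim) slid over the word embeddings, then max-over-time pooling."""
    window = len(kernel)
    scores = []
    for start in range(len(embeddings) - window + 1):
        s = 0.0
        for i in range(window):
            for e, k in zip(embeddings[start + i], kernel[i]):
                s += e * k
        scores.append(max(s, 0.0))  # ReLU, an assumption of this sketch
    return max(scores)


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """-E_real[log D(Y)] - E_fake[log(1 - D(Y))], where d_* are D's outputs in (0, 1)."""
    real_term = sum(math.log(max(p, eps)) for p in d_real) / len(d_real)
    fake_term = sum(math.log(max(1.0 - p, eps)) for p in d_fake) / len(d_fake)
    return -(real_term + fake_term)


feature = conv_max_pool([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
                        [[1.0, 0.0], [0.0, 1.0]])
undecided = discriminator_loss([0.5], [0.5])
confident = discriminator_loss([0.9], [0.1])
```

A discriminator that separates real from generated text well (`confident`) gets a lower loss than one that outputs 0.5 for everything (`undecided`).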
11. SeqGAN
12. SeqGAN
T     1   2       3        4          5
Word  I   a meal  eat-and  to school  go
• Generator update
- State-action value function: Q
15. LeakGAN
https://arxiv.org/abs/1709.08624
• Leaked feature from D as guiding signals
- s: input, φ: model parameters, F: CNN, f: feature vector (leaked information)
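The formula behind these symbols is missing from the extracted text. As given in the LeakGAN paper, the discriminator splits into a CNN feature extractor F and a final sigmoid layer, and the feature f is what gets leaked to the generator. The split of φ into $\phi_f$ and $\phi_l$ follows the paper's notation and is not visible in the slide text.

```latex
D_\phi(s) = \mathrm{sigmoid}\!\left( \phi_l^{\top} F(s; \phi_f) \right)
          = \mathrm{sigmoid}\!\left( \phi_l^{\top} f \right),
\qquad f = F(s; \phi_f)
```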
• Hierarchical structure of G
- G has a hierarchical Manager-Worker structure so that it can use the information leaked from D
- Manager: at each time step t, uses the leaked information f_t to produce a goal vector g_t
- Worker: generates the next word based on the Manager's g_t
16. LeakGAN
• Generation process (Manager)
- The Manager must turn the leaked information into a goal vector (a guideline for the Worker).
- h^M: hidden state, θ: model parameters, M: LSTM
- The goal vectors of the previous time steps and the current vector are embedded.
- ψ: model parameters
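A minimal sketch of the goal pipeline, assuming the form used in the LeakGAN paper: each raw goal is normalized to unit length, the most recent goals are summed, and a linear map (weights W_ψ) turns the sum into a goal embedding w_t. The LSTM M itself is not modeled here; the raw goal values and the matrix `w_psi` are hypothetical.

```python
import math


def normalize(v):
    """g_t = g_hat_t / ||g_hat_t||."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def goal_embedding(recent_goals, w_psi):
    """w_t = W_psi applied to the sum of the recent goal vectors."""
    dim = len(recent_goals[0])
    summed = [sum(g[j] for g in recent_goals) for j in range(dim)]
    return [sum(row[j] * summed[j] for j in range(dim)) for row in w_psi]


# Hypothetical raw goals, standing in for the Manager LSTM's outputs.
raw_goals = [[3.0, 4.0], [0.0, 2.0]]
goals = [normalize(g) for g in raw_goals]
# Hypothetical W_psi mapping 2-dim goals to a 3-dim embedding.
w_psi = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_t = goal_embedding(goals, w_psi)
```

Normalizing first means each goal contributes a direction rather than a magnitude, so the summed goal is not dominated by a single large step.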
17. LeakGAN
• Generation process (Worker)
- The Worker must predict the next word from the Manager's goal vector and the current word.
- x_t: current word, h: hidden state, θ: model parameters, W: LSTM, α: temperature parameter
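The Worker's output step can be sketched as a temperature softmax over the product of the Worker's output matrix O_t and the goal embedding w_t, which is the form given in the LeakGAN paper. The Worker LSTM is not modeled; `o_t` and `w_t` below are hypothetical values.

```python
import math


def worker_distribution(o_t, w_t, alpha):
    """softmax(O_t . w_t / alpha): one probability per vocabulary word."""
    logits = [sum(o * w for o, w in zip(row, w_t)) / alpha for row in o_t]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical O_t with one row per vocabulary word (|V| = 3, k = 2).
o_t = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
w_t = [2.0, 1.0]

probs = worker_distribution(o_t, w_t, alpha=1.0)
sharp = worker_distribution(o_t, w_t, alpha=0.5)
```

A smaller α sharpens the distribution around the word whose output row aligns best with the goal embedding; a larger α flattens it.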
18. LeakGAN
• Training of G
- Because every step of G is differentiable, the Manager's gradient is computed with a policy gradient.
- Q: state value function; the reward is estimated by Monte Carlo search from the current state s_t and the goal vector g_t.
- d_cos: cosine similarity between two vectors
- f_{t+c}: information leaked after c steps
- g_t(θ): goal vector produced under the parameters θ
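The gradient these symbols belong to is missing from the extracted text. In the LeakGAN paper the Manager's gradient takes the form below, where $\theta_m$ denotes the Manager's parameters (written θ on the slide) and $Q_F(s_t, g_t)$ is the expected reward under the current policy:

```latex
\nabla^{\mathrm{adv}}_{\theta_m} g_t =
  -\, Q_F(s_t, g_t)\,
  \nabla_{\theta_m} d_{\cos}\!\left( f_{t+c} - f_t,\; g_t(\theta_m) \right)
```

Read this way, the Manager is rewarded when the change in the leaked feature over the next c steps points in the same direction as the goal it issued.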
19. LeakGAN
• Training of G
- Reward gradient of the Worker
- r_t^I: intrinsic reward
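A minimal sketch of the intrinsic reward, assuming the definition in the LeakGAN paper: the average cosine similarity between the feature change f_t - f_{t-i} and the goal g_{t-i} issued i steps earlier, for i = 1..c. The feature and goal vectors below are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def intrinsic_reward(features, goals, t, c):
    """r_t^I = (1/c) * sum_{i=1..c} d_cos(f_t - f_{t-i}, g_{t-i})."""
    if t < c:
        raise ValueError("need at least c earlier steps")
    total = 0.0
    for i in range(1, c + 1):
        diff = [a - b for a, b in zip(features[t], features[t - i])]
        total += cosine(diff, goals[t - i])
    return total / c


# Hypothetical leaked features f_0..f_2 and goals g_0..g_2.
features = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
goals = [[1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]

r = intrinsic_reward(features, goals, t=2, c=2)
opposed = intrinsic_reward(features, [[-1.0, -1.0], [0.0, -1.0], [1.0, 0.0]], t=2, c=2)
```

The reward is highest when the Worker's words move the leaked feature in the directions the Manager asked for (`r`), and lowest when they move it the opposite way (`opposed`).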