Convolutional Neural Networks for
Sentence Classification
์ „ํฌ์„ 
http://www.people.fas.harvard.edu/~yoonkim/
0. Abstract
(ex: ๊ธ์ •/๋ถ€์ •)(+word2vec)
pre-trained๋œ word vectors๋ฅผ ๊ฐ€์ง€๊ณ  ํ•™์Šต๋œ CNN์„ ํ†ตํ•ด ๋ฌธ์žฅ ๋ถ„๋ฅ˜๋ฅผ ํ•œ ์—ฌ๋Ÿฌ ์—ฐ๊ตฌ์— ๋Œ€ํ•ด ๋ณด๊ณ 
- ์•ฝ๊ฐ„์˜ hyperparameter tuning์ด ๋“ค์–ด๊ฐ„ ๊ฐ„๋‹จํ•œ CNN๋„ ์ข‹์€ ๊ฒฐ๊ณผ๋ฅผ ๋‚˜ํƒ€๋ƒ„
- ์ด 7๊ฐœ task ์ค‘ 4๊ฐœ์—์„œ ๋‹ค๋ฅธ ๋ชจ๋ธ๋“ค ์ค‘ ๊ฐ€์žฅ ์ข‹์€ ๊ฒฐ๊ณผ๋ฅผ ๋ณด์ž„
Word Embedding : representing a word as a low-dimensional vector
(e.g., Word2Vec, GloVe)
1. Introduction - Word2Vec
Dense representation : a low-dimensional vector that encodes each word's features
Sparse representation : one ID per word (high-dimensional — one dimension per vocabulary word; one-hot encoding)
Drawbacks : high dimensionality, and no way to express similarity between words
(1) CBOW (Continuous Bag of Words) : predict the center word from its surrounding words
Example sentence:
"The fat cat sat on the mat"
window size = 2
1. Introduction - Word2Vec
input : surrounding words (one-hot)
fat → [0, 1, 0, 0, 0, 0, 0]
cat → [0, 0, 1, 0, 0, 0, 0]
on → [0, 0, 0, 0, 1, 0, 0]
the → [0, 0, 0, 0, 0, 1, 0]
output : center word
sat → [0, 0, 0, 1, 0, 0, 0]
(1) CBOW (Continuous Bag of Words) : predict the center word from its surrounding words
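The one-hot setup above can be sketched in code. This is a toy illustration (not from the paper), with the vocabulary indexed in order of first appearance so the vectors match those shown on the slide:

```python
import numpy as np

sentence = "The fat cat sat on the mat".split()
vocab = list(dict.fromkeys(sentence))        # 7 tokens, all distinct here ("The" != "the")
word2id = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    # 7-dimensional one-hot vector for a word
    v = np.zeros(len(vocab))
    v[word2id[word]] = 1.0
    return v

center = 3                                   # index of the center word "sat"
window = 2
context = [sentence[j]
           for j in range(center - window, center + window + 1)
           if j != center]                   # ['fat', 'cat', 'on', 'the']
inputs = np.stack([one_hot(w) for w in context])   # CBOW input: surrounding words
target = one_hot(sentence[center])                 # CBOW output: center word "sat"
```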
1. Introduction - Word2Vec
m : window size
V : number of words in the sentence (= 7)
N : hidden layer size (= 5)
(1) CBOW (Continuous Bag of Words) : predict the center word from its surrounding words
1. Introduction - Word2Vec
softmax : ŷᵢ = exp(zᵢ) / Σⱼ exp(zⱼ)
(1) CBOW (Continuous Bag of Words) : predict the center word from its surrounding words
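The output layer turns the scores z into probabilities with the softmax above. A minimal sketch (the score values are made up):

```python
import numpy as np

def softmax(z):
    # softmax_i = exp(z_i) / sum_j exp(z_j); subtracting the max is a standard
    # numerical-stability trick that leaves the result unchanged
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # illustrative logits
probs = softmax(scores)              # probabilities summing to 1
```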
1. Introduction - Word2Vec
- ํ•™์Šต๋˜๋Š” mechanism์€ CBOW์™€ ๋™์ผ
(2) skip-gram : ์ค‘๊ฐ„ ๋‹จ์–ด๋กœ ์ฃผ๋ณ€์— ์žˆ๋Š” ๋‹จ์–ด ์˜ˆ์ธก
1. Introduction - Word2Vec
1. Introduction โ€“ CNN
Filter
1 0 1
0 1 0
1 0 1
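A sketch of applying a 3×3 filter like the one above to a toy 5×5 input, using the "valid" sliding-window product that CNN conv layers compute (the input values are illustrative):

```python
import numpy as np

filt = np.array([[1, 0, 1],
                 [0, 1, 0],
                 [1, 0, 1]])

image = np.arange(25).reshape(5, 5)    # toy 5x5 input

def conv2d_valid(x, k):
    # "valid" cross-correlation: slide the filter over every position where
    # it fits fully, multiply element-wise, and sum
    H, W = x.shape
    h, w = k.shape
    out = np.zeros((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = (x[i:i + h, j:j + w] * k).sum()
    return out

fmap = conv2d_valid(image, filt)       # 3x3 output feature map
```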
2. Model โ€“ data(word vector) ์ค€๋น„
Google News์˜ 1000์–ต ๊ฐœ ๋‹จ์–ด๋กœ ๊ตฌ์„ฑ๋œ ๋ฐ์ดํ„ฐ๋กœ pre-trained๋œ Word2Vec(CBOW)์„ word vector๋กœ ์ด์šฉ
(pre-trained๋œ word vector์— ์—†๋Š” ๋‹จ์–ด๋Š” randomํ•˜๊ฒŒ ๋ฒกํ„ฐ ๊ฐ’ ์ดˆ๊ธฐํ™”)
n: ๋‹จ์–ด ๊ฐœ์ˆ˜
k: word vector ์ฐจ์›
window size๊ฐ€ ๋‹ค๋ฅธ
์—ฌ๋Ÿฌ filter๋“ค ์ ์šฉํ•œ
conv layer ๋ณ‘๋ ฌ์ ์œผ๋กœ ์ƒ์„ฑ
(filter ํฌ๊ธฐ : window size * k)
window size = 2
window size = 3
2. Model โ€“ static | non-static | multi-channel
non-static (backprop goes all the way into the input word vectors)
static (backprop stops at the conv layer; word vectors stay fixed)
multi-channel :
static channel
non-static channel
2. Model โ€“ step1 : concatenation
๐‘ฅ๐‘– โˆˆ โ„ ๐‘˜
: ๋ฌธ์žฅ ๐‘–๋ฒˆ์งธ์— ์žˆ๋Š” ๋‹จ์–ด์˜ k์ฐจ์› word vector
๐‘ฅ1
๐‘ฅ2
๐‘ฅ9
.
.
.
.
.
(โŠ• : concatenate operator
h : window size)
๐‘ฅ๐‘–:๐‘–+โ„Žโˆ’1 = ๐‘ฅ๐‘– โŠ• ๐‘ฅ๐‘–+1 โŠ• โ€ฆ โŠ• ๐‘ฅ๐‘–+โ„Žโˆ’1
: ๐‘–๋ฒˆ์งธ๋ถ€ํ„ฐ โ„Ž๊ฐœ์˜ ๋‹จ์–ด concatenate
๐‘ฅ1
๐‘ฅ2
๐‘ฅ9
.
.
.
.
.
๐‘ฅ1:2
๐‘ฅ8:9
๐‘ฅ4:6
2. Model โ€“ step1 : concatenation
2. Model โ€“ step2 : conv layer ์ƒ์„ฑ
๐‘๐‘– = ๐‘“ ๐‘ค โ‹… ๐‘ฅ๐‘–:๐‘–+โ„Žโˆ’1 + ๐‘ (๐‘“ : non-linear function(tanh ๋“ฑ)
โ„Ž : window size
๐‘ : bias term)
๐‘ฅ1
๐‘ฅ2
๐‘ฅ9
.
.
.
.
.
๐‘ฅ1:2
๐‘ฅ8:9
๐‘ฅ4:6
2. Model โ€“ step3 : max-over-time pooling
๐‘ฅ1
๐‘ฅ2
๐‘ฅ9
.
.
.
.
.
๐‘ฅ1:2
๐‘ฅ8:9
๐‘ฅ4:6
๐‘ = ๐‘1, โ€ฆ , ๐‘ ๐‘›โˆ’โ„Ž+1
: feature map
ฦธ๐‘ = max{๐‘}
๊ฐ conv layer๋งˆ๋‹ค feature map ๊ฐœ์ˆ˜๊ฐ€ ๋‹ฌ๋ผ์ง
โ†’ ๊ฐ conv layer๋งˆ๋‹ค feature map ไธญ ๊ฐ€์žฅ ํฐ ๊ฐ’๋งŒ ์‚ฌ์šฉ
๊ฐ window size๋งˆ๋‹ค ์ƒ์„ฑ
2. Model โ€“ step4 : softmax
๐‘ฅ1
๐‘ฅ2
๐‘ฅ9
.
.
.
.
.
๐‘ฅ1:2
๐‘ฅ8:9
๐‘ฅ4:6
softmax function ํ†ตํ•ด
์ตœ์ข… output ๋‚˜์˜ด
ิฆ๐‘ง = ฦธ๐‘1, โ‹ฏ , ฦธ๐‘ ๐‘š
m : filter ๊ฐœ์ˆ˜
2.1 Regularization
- Dropout (keep probability p = 0.5)
: half of the features are removed by dropout → 2–4% accuracy improvement
(used only at train time; no dropout at test time)
- Since dropout is not applied at test time, the weights are rescaled as ŵ = p·w
Without dropout : y = w · z + b, where z = [ĉ₁, ⋯, ĉₘ]
With dropout : y = w · (z ∘ r) + b
r : masking vector (Bernoulli random variables; 1 = keep, 0 = drop)
m : number of filters
∘ : element-wise multiplication
- If ‖w‖₂ > s, rescale so that ‖w‖₂ = s
3. Datasets and Experimental Setup
์˜๋ฏธ label ์ˆ˜ ๋ฌธ์žฅ ํ‰๊ท  ๊ธธ์ด
dataset
ํฌ๊ธฐ
๋‹จ์–ด ์ˆ˜
pre-trained word
vector์— ํฌํ•จ๋œ ๋‹จ์–ด ์ˆ˜
test set ํฌ๊ธฐ
MR
(Movie Review)
์˜ํ™” ๋ฆฌ๋ทฐ ๋ฌธ์žฅ 2 (๊ธ์ •/๋ถ€์ •) 20 10662 18765 16448 10-fold CV ์‚ฌ์šฉ
SST-1
(Stanford Sentiment
Treebank-1)
MR์—์„œ test set ์ œ๊ณต
+ label 5๊ฐœ๋กœ
5 (๋งค์šฐ ๊ธ์ •/
๊ธ์ •/๋ณดํ†ต/๋ถ€์ •
/๋งค์šฐ ๋ถ€์ •)
18 11855 17836 16262 2210
SST-2
(Stanford Sentiment
Treebank-1)
SST-1์—์„œ ๋ณดํ†ต ์ œ๊ฑฐํ•˜๊ณ 
binary label๋กœ
2 (๊ธ์ •/๋ถ€์ •) 19 9613 16185 14838 1821
Subj Subjectivity dataset 2 (์ฃผ๊ด€/๊ฐ๊ด€) 23 10000 21323 17913 10-fold CV ์‚ฌ์šฉ
TREC ์˜๋ฌธ๋ฌธ dataset 6 (์งˆ๋ฌธ ์ข…๋ฅ˜) 10 5952 9592 9125 500
CR
(Consumer Review)
์†Œ๋น„์ž ๋ฆฌ๋ทฐ ๋ฌธ์žฅ
2 (๊ธ์ •/๋ถ€์ •)
19 3775 5340 5046 10-fold CV ์‚ฌ์šฉ
MPQA ์˜๊ฒฌ 2 (์˜๊ฒฌ ๊ทน์„ฑ) 3 10606 6246 6083 10-fold CV ์‚ฌ์šฉ
3.1 Hyperparameters and Training
SST-2์˜ validation set์œผ๋กœ grid search ํ†ตํ•ด ์„ค์ •
- Activation function : ReLU
- Filter windows : h = 3, 4, 5
- Feature map : 100๊ฐœ
- Dropout rate : p = 0.5
- L2 constraint : s = 3
- Mini-batch size : 50
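Putting the steps together, a minimal end-to-end forward pass with these settings (windows 3/4/5, 100 feature maps each, ReLU). The word vectors and weights here are random stand-ins rather than trained word2vec values, so the output probabilities are meaningless; only the shapes and the data flow match the model:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 9, 300                        # 9-word sentence, 300-dim word vectors
x = rng.standard_normal((n, k))      # stand-in for the sentence's word2vec rows

def relu(a):
    return np.maximum(a, 0.0)

pooled = []
for h in (3, 4, 5):                  # the three filter window sizes
    W = rng.standard_normal((100, h * k)) * 0.01   # 100 filters of size h*k
    windows = np.stack([np.concatenate(x[i:i + h])  # all x_{i:i+h-1}
                        for i in range(n - h + 1)])
    C = relu(windows @ W.T)          # (n-h+1, 100): c_i for every filter
    pooled.append(C.max(axis=0))     # max-over-time pooling per filter
z = np.concatenate(pooled)           # 3 * 100 = 300 pooled features

W_out = rng.standard_normal((2, z.size)) * 0.01    # binary task, e.g. pos/neg
logits = W_out @ z
probs = np.exp(logits - logits.max())
probs = probs / probs.sum()          # softmax over the two classes
```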
3.2 Pre-trained Word Vectors
(์•ž์—์„œ ์–ธ๊ธ‰ํ•œ ๋‚ด์šฉ)
Google News์˜ 1000์–ต ๊ฐœ ๋‹จ์–ด๋กœ ๊ตฌ์„ฑ๋œ ๋ฐ์ดํ„ฐ๋กœ pre-trained๋œ
Word2Vec์„ word vector๋กœ ์ด์šฉ
* Train ์‹œ CBOW ์‚ฌ์šฉ / word vector ์ฐจ์›: 300์ฐจ์›
* pre-trained๋œ word vector์— ์—†๋Š” ๋‹จ์–ด๋Š” randomํ•˜๊ฒŒ ๋ฒกํ„ฐ ๊ฐ’ ์ดˆ๊ธฐํ™”
3.3 Model Variations
- CNN-rand : word vectors initialized randomly
- CNN-static : word2vec + static
- CNN-non-static : word2vec + non-static
- CNN-multichannel : two sets (also called channels) of word2vec vectors
- one static, the other non-static
4. Results and Discussion
CNN-rand (random initialization) : poor performance
4.1 Multichannel vs. Single Channel Models
→ The multichannel model is not always better than the single channel models!
4.2 Static vs. Non-static Representations
→ Non-static vectors capture even syntactic aspects that static vectors miss
์ฐธ๊ณ  ์‚ฌ์ดํŠธ
https://wikidocs.net/33520 word2vec
https://www.youtube.com/watch?v=EAJoRA0KX7I&list=PLoROMvodv4rOhcuXMZkNm7j3fVwBBY42z&index=11
CS224n โ€“ Lec11
https://www.youtube.com/watch?v=IRB2vXSet2E PR-015
https://zyint.tistory.com/575 ๋…ผ๋ฌธ์š”์•ฝ1
https://arclab.tistory.com/149 ๋…ผ๋ฌธ์š”์•ฝ2
https://ratsgo.github.io/natural%20language%20processing/2017/08/16/deepNLP/ NLP+Deep Learning
Usually, NLP ➔ RNN
But RNNs have problems of their own...
* Why not RNN, but CNN?
CNNs can make up for the weaknesses of RNNs!
* Why not RNN, but CNN? (Korean-language version of the slide)
Why word2vec made little difference:
- word2vec does provide new features, but the data is large (over 650K examples), so the data size alone already plays a big role
- a variety of categories was tested, so the same word was often used with several different senses
(Korean-language version of the slide)
More Related Content

What's hot

ใƒ‡ใ‚ฃใƒผใƒ—ใƒฉใƒผใƒ‹ใƒณใ‚ฐใ‚’็”จใ„ใŸ็‰ฉไฝ“่ช่ญ˜ใจใใฎๅ‘จ่พบ ๏ฝž็พ็Šถใจ่ชฒ้กŒ๏ฝž (Revised on 18 July, 2018)
ใƒ‡ใ‚ฃใƒผใƒ—ใƒฉใƒผใƒ‹ใƒณใ‚ฐใ‚’็”จใ„ใŸ็‰ฉไฝ“่ช่ญ˜ใจใใฎๅ‘จ่พบ ๏ฝž็พ็Šถใจ่ชฒ้กŒ๏ฝž (Revised on 18 July, 2018)ใƒ‡ใ‚ฃใƒผใƒ—ใƒฉใƒผใƒ‹ใƒณใ‚ฐใ‚’็”จใ„ใŸ็‰ฉไฝ“่ช่ญ˜ใจใใฎๅ‘จ่พบ ๏ฝž็พ็Šถใจ่ชฒ้กŒ๏ฝž (Revised on 18 July, 2018)
ใƒ‡ใ‚ฃใƒผใƒ—ใƒฉใƒผใƒ‹ใƒณใ‚ฐใ‚’็”จใ„ใŸ็‰ฉไฝ“่ช่ญ˜ใจใใฎๅ‘จ่พบ ๏ฝž็พ็Šถใจ่ชฒ้กŒ๏ฝž (Revised on 18 July, 2018)
Masakazu Iwamura
ย 
[่ซ–ๆ–‡็ดนไป‹] LSTM (LONG SHORT-TERM MEMORY)
[่ซ–ๆ–‡็ดนไป‹] LSTM (LONG SHORT-TERM MEMORY)[่ซ–ๆ–‡็ดนไป‹] LSTM (LONG SHORT-TERM MEMORY)
[่ซ–ๆ–‡็ดนไป‹] LSTM (LONG SHORT-TERM MEMORY)
Tomoyuki Hioki
ย 
์ œ 14ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [TweetVizํŒ€] : ์นดํ”„์นด์™€ ์ŠคํŒŒํฌ๋ฅผ ํ†ตํ•œ tweetdeck ๊ฐœ๋ฐœ
์ œ 14ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [TweetVizํŒ€] : ์นดํ”„์นด์™€ ์ŠคํŒŒํฌ๋ฅผ ํ†ตํ•œ tweetdeck ๊ฐœ๋ฐœ์ œ 14ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [TweetVizํŒ€] : ์นดํ”„์นด์™€ ์ŠคํŒŒํฌ๋ฅผ ํ†ตํ•œ tweetdeck ๊ฐœ๋ฐœ
์ œ 14ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [TweetVizํŒ€] : ์นดํ”„์นด์™€ ์ŠคํŒŒํฌ๋ฅผ ํ†ตํ•œ tweetdeck ๊ฐœ๋ฐœ
BOAZ Bigdata
ย 
LLM แ„†แ…ฉแ„ƒแ…ฆแ†ฏ แ„€แ…ตแ„‡แ…กแ†ซ แ„‰แ…ฅแ„‡แ…ตแ„‰แ…ณ แ„‰แ…ตแ†ฏแ„Œแ…ฅแ†ซ แ„€แ…กแ„‹แ…ตแ„ƒแ…ณ
LLM แ„†แ…ฉแ„ƒแ…ฆแ†ฏ แ„€แ…ตแ„‡แ…กแ†ซ แ„‰แ…ฅแ„‡แ…ตแ„‰แ…ณ แ„‰แ…ตแ†ฏแ„Œแ…ฅแ†ซ แ„€แ…กแ„‹แ…ตแ„ƒแ…ณLLM แ„†แ…ฉแ„ƒแ…ฆแ†ฏ แ„€แ…ตแ„‡แ…กแ†ซ แ„‰แ…ฅแ„‡แ…ตแ„‰แ…ณ แ„‰แ…ตแ†ฏแ„Œแ…ฅแ†ซ แ„€แ…กแ„‹แ…ตแ„ƒแ…ณ
LLM แ„†แ…ฉแ„ƒแ…ฆแ†ฏ แ„€แ…ตแ„‡แ…กแ†ซ แ„‰แ…ฅแ„‡แ…ตแ„‰แ…ณ แ„‰แ…ตแ†ฏแ„Œแ…ฅแ†ซ แ„€แ…กแ„‹แ…ตแ„ƒแ…ณ
Tae Young Lee
ย 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
ไฝณ่“‰ ๅ€ช
ย 
Sampling-Importance-Sampling์„ ์ด์šฉํ•œ ์„ ์ˆ˜ ๊ฒฝ๊ธฐ๋Šฅ๋ ฅ ์ธก์ •
Sampling-Importance-Sampling์„ ์ด์šฉํ•œ ์„ ์ˆ˜ ๊ฒฝ๊ธฐ๋Šฅ๋ ฅ ์ธก์ •Sampling-Importance-Sampling์„ ์ด์šฉํ•œ ์„ ์ˆ˜ ๊ฒฝ๊ธฐ๋Šฅ๋ ฅ ์ธก์ •
Sampling-Importance-Sampling์„ ์ด์šฉํ•œ ์„ ์ˆ˜ ๊ฒฝ๊ธฐ๋Šฅ๋ ฅ ์ธก์ •
Mad Scientists
ย 
Tutorial on People Recommendations in Social Networks - ACM RecSys 2013,Hong...
Tutorial on People Recommendations in Social Networks -  ACM RecSys 2013,Hong...Tutorial on People Recommendations in Social Networks -  ACM RecSys 2013,Hong...
Tutorial on People Recommendations in Social Networks - ACM RecSys 2013,Hong...
Anmol Bhasin
ย 
Learning to forget continual prediction with lstm
Learning to forget continual prediction with lstmLearning to forget continual prediction with lstm
Learning to forget continual prediction with lstm
Fujimoto Keisuke
ย 
ๅ‹•็”ป่ช่ญ˜ใ‚ตใƒผใƒ™ใ‚คv1๏ผˆใƒกใ‚ฟใ‚ตใƒผใƒ™ใ‚ค ๏ผ‰
ๅ‹•็”ป่ช่ญ˜ใ‚ตใƒผใƒ™ใ‚คv1๏ผˆใƒกใ‚ฟใ‚ตใƒผใƒ™ใ‚ค ๏ผ‰ๅ‹•็”ป่ช่ญ˜ใ‚ตใƒผใƒ™ใ‚คv1๏ผˆใƒกใ‚ฟใ‚ตใƒผใƒ™ใ‚ค ๏ผ‰
ๅ‹•็”ป่ช่ญ˜ใ‚ตใƒผใƒ™ใ‚คv1๏ผˆใƒกใ‚ฟใ‚ตใƒผใƒ™ใ‚ค ๏ผ‰
cvpaper. challenge
ย 
์ œ10ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - ๋ฐ‘๋ฐ”๋‹ฅ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜๋Š” trivago ์ถ”์ฒœ์‹œ์Šคํ…œ
์ œ10ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - ๋ฐ‘๋ฐ”๋‹ฅ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜๋Š” trivago ์ถ”์ฒœ์‹œ์Šคํ…œ์ œ10ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - ๋ฐ‘๋ฐ”๋‹ฅ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜๋Š” trivago ์ถ”์ฒœ์‹œ์Šคํ…œ
์ œ10ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - ๋ฐ‘๋ฐ”๋‹ฅ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜๋Š” trivago ์ถ”์ฒœ์‹œ์Šคํ…œ
BOAZ Bigdata
ย 
Introduction To Tensorflow
Introduction To TensorflowIntroduction To Tensorflow
Introduction To Tensorflow
Rayyan Khalid
ย 
Text Classification in Python โ€“ using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python โ€“ using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python โ€“ using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python โ€“ using Pandas, scikit-learn, IPython Notebook ...
Jimmy Lai
ย 
Deep neural networks cnn rnn_ae_some practical techniques
Deep neural networks cnn rnn_ae_some practical techniquesDeep neural networks cnn rnn_ae_some practical techniques
Deep neural networks cnn rnn_ae_some practical techniques
Kang Pilsung
ย 
ใ€DL่ผช่ชญไผšใ€‘Novel View Synthesis with Diffusion Models
ใ€DL่ผช่ชญไผšใ€‘Novel View Synthesis with Diffusion Modelsใ€DL่ผช่ชญไผšใ€‘Novel View Synthesis with Diffusion Models
ใ€DL่ผช่ชญไผšใ€‘Novel View Synthesis with Diffusion Models
Deep Learning JP
ย 
ๆ–‡็Œฎ็ดนไป‹๏ผšVideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
ๆ–‡็Œฎ็ดนไป‹๏ผšVideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understandingๆ–‡็Œฎ็ดนไป‹๏ผšVideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
ๆ–‡็Œฎ็ดนไป‹๏ผšVideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Toru Tamaki
ย 
2017:10:20่ซ–ๆ–‡่ชญใฟไผš"Image-to-Image Translation with Conditional Adversarial Netwo...
2017:10:20่ซ–ๆ–‡่ชญใฟไผš"Image-to-Image Translation with Conditional Adversarial Netwo...2017:10:20่ซ–ๆ–‡่ชญใฟไผš"Image-to-Image Translation with Conditional Adversarial Netwo...
2017:10:20่ซ–ๆ–‡่ชญใฟไผš"Image-to-Image Translation with Conditional Adversarial Netwo...
ayaha osaki
ย 
์ œ 16ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [WHY ํŒ€] : ๋‚˜๋งŒ์˜ ์›นํˆฐ์ผ๊ธฐ Toonight
์ œ 16ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [WHY ํŒ€] : ๋‚˜๋งŒ์˜ ์›นํˆฐ์ผ๊ธฐ Toonight์ œ 16ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [WHY ํŒ€] : ๋‚˜๋งŒ์˜ ์›นํˆฐ์ผ๊ธฐ Toonight
์ œ 16ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [WHY ํŒ€] : ๋‚˜๋งŒ์˜ ์›นํˆฐ์ผ๊ธฐ Toonight
BOAZ Bigdata
ย 
[DL่ผช่ชญไผš]Depth Prediction Without the Sensors: Leveraging Structure for Unsuper...
[DL่ผช่ชญไผš]Depth Prediction Without the Sensors: Leveraging Structure for Unsuper...[DL่ผช่ชญไผš]Depth Prediction Without the Sensors: Leveraging Structure for Unsuper...
[DL่ผช่ชญไผš]Depth Prediction Without the Sensors: Leveraging Structure for Unsuper...
Deep Learning JP
ย 
Image classification using cnn
Image classification using cnnImage classification using cnn
Image classification using cnn
Debarko De
ย 
Text MIning
Text MIningText MIning
Text MIning
Prakhyath Rai
ย 

What's hot (20)

ใƒ‡ใ‚ฃใƒผใƒ—ใƒฉใƒผใƒ‹ใƒณใ‚ฐใ‚’็”จใ„ใŸ็‰ฉไฝ“่ช่ญ˜ใจใใฎๅ‘จ่พบ ๏ฝž็พ็Šถใจ่ชฒ้กŒ๏ฝž (Revised on 18 July, 2018)
ใƒ‡ใ‚ฃใƒผใƒ—ใƒฉใƒผใƒ‹ใƒณใ‚ฐใ‚’็”จใ„ใŸ็‰ฉไฝ“่ช่ญ˜ใจใใฎๅ‘จ่พบ ๏ฝž็พ็Šถใจ่ชฒ้กŒ๏ฝž (Revised on 18 July, 2018)ใƒ‡ใ‚ฃใƒผใƒ—ใƒฉใƒผใƒ‹ใƒณใ‚ฐใ‚’็”จใ„ใŸ็‰ฉไฝ“่ช่ญ˜ใจใใฎๅ‘จ่พบ ๏ฝž็พ็Šถใจ่ชฒ้กŒ๏ฝž (Revised on 18 July, 2018)
ใƒ‡ใ‚ฃใƒผใƒ—ใƒฉใƒผใƒ‹ใƒณใ‚ฐใ‚’็”จใ„ใŸ็‰ฉไฝ“่ช่ญ˜ใจใใฎๅ‘จ่พบ ๏ฝž็พ็Šถใจ่ชฒ้กŒ๏ฝž (Revised on 18 July, 2018)
ย 
[่ซ–ๆ–‡็ดนไป‹] LSTM (LONG SHORT-TERM MEMORY)
[่ซ–ๆ–‡็ดนไป‹] LSTM (LONG SHORT-TERM MEMORY)[่ซ–ๆ–‡็ดนไป‹] LSTM (LONG SHORT-TERM MEMORY)
[่ซ–ๆ–‡็ดนไป‹] LSTM (LONG SHORT-TERM MEMORY)
ย 
์ œ 14ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [TweetVizํŒ€] : ์นดํ”„์นด์™€ ์ŠคํŒŒํฌ๋ฅผ ํ†ตํ•œ tweetdeck ๊ฐœ๋ฐœ
์ œ 14ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [TweetVizํŒ€] : ์นดํ”„์นด์™€ ์ŠคํŒŒํฌ๋ฅผ ํ†ตํ•œ tweetdeck ๊ฐœ๋ฐœ์ œ 14ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [TweetVizํŒ€] : ์นดํ”„์นด์™€ ์ŠคํŒŒํฌ๋ฅผ ํ†ตํ•œ tweetdeck ๊ฐœ๋ฐœ
์ œ 14ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [TweetVizํŒ€] : ์นดํ”„์นด์™€ ์ŠคํŒŒํฌ๋ฅผ ํ†ตํ•œ tweetdeck ๊ฐœ๋ฐœ
ย 
LLM แ„†แ…ฉแ„ƒแ…ฆแ†ฏ แ„€แ…ตแ„‡แ…กแ†ซ แ„‰แ…ฅแ„‡แ…ตแ„‰แ…ณ แ„‰แ…ตแ†ฏแ„Œแ…ฅแ†ซ แ„€แ…กแ„‹แ…ตแ„ƒแ…ณ
LLM แ„†แ…ฉแ„ƒแ…ฆแ†ฏ แ„€แ…ตแ„‡แ…กแ†ซ แ„‰แ…ฅแ„‡แ…ตแ„‰แ…ณ แ„‰แ…ตแ†ฏแ„Œแ…ฅแ†ซ แ„€แ…กแ„‹แ…ตแ„ƒแ…ณLLM แ„†แ…ฉแ„ƒแ…ฆแ†ฏ แ„€แ…ตแ„‡แ…กแ†ซ แ„‰แ…ฅแ„‡แ…ตแ„‰แ…ณ แ„‰แ…ตแ†ฏแ„Œแ…ฅแ†ซ แ„€แ…กแ„‹แ…ตแ„ƒแ…ณ
LLM แ„†แ…ฉแ„ƒแ…ฆแ†ฏ แ„€แ…ตแ„‡แ…กแ†ซ แ„‰แ…ฅแ„‡แ…ตแ„‰แ…ณ แ„‰แ…ตแ†ฏแ„Œแ…ฅแ†ซ แ„€แ…กแ„‹แ…ตแ„ƒแ…ณ
ย 
Seq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) modelSeq2Seq (encoder decoder) model
Seq2Seq (encoder decoder) model
ย 
Sampling-Importance-Sampling์„ ์ด์šฉํ•œ ์„ ์ˆ˜ ๊ฒฝ๊ธฐ๋Šฅ๋ ฅ ์ธก์ •
Sampling-Importance-Sampling์„ ์ด์šฉํ•œ ์„ ์ˆ˜ ๊ฒฝ๊ธฐ๋Šฅ๋ ฅ ์ธก์ •Sampling-Importance-Sampling์„ ์ด์šฉํ•œ ์„ ์ˆ˜ ๊ฒฝ๊ธฐ๋Šฅ๋ ฅ ์ธก์ •
Sampling-Importance-Sampling์„ ์ด์šฉํ•œ ์„ ์ˆ˜ ๊ฒฝ๊ธฐ๋Šฅ๋ ฅ ์ธก์ •
ย 
Tutorial on People Recommendations in Social Networks - ACM RecSys 2013,Hong...
Tutorial on People Recommendations in Social Networks -  ACM RecSys 2013,Hong...Tutorial on People Recommendations in Social Networks -  ACM RecSys 2013,Hong...
Tutorial on People Recommendations in Social Networks - ACM RecSys 2013,Hong...
ย 
Learning to forget continual prediction with lstm
Learning to forget continual prediction with lstmLearning to forget continual prediction with lstm
Learning to forget continual prediction with lstm
ย 
ๅ‹•็”ป่ช่ญ˜ใ‚ตใƒผใƒ™ใ‚คv1๏ผˆใƒกใ‚ฟใ‚ตใƒผใƒ™ใ‚ค ๏ผ‰
ๅ‹•็”ป่ช่ญ˜ใ‚ตใƒผใƒ™ใ‚คv1๏ผˆใƒกใ‚ฟใ‚ตใƒผใƒ™ใ‚ค ๏ผ‰ๅ‹•็”ป่ช่ญ˜ใ‚ตใƒผใƒ™ใ‚คv1๏ผˆใƒกใ‚ฟใ‚ตใƒผใƒ™ใ‚ค ๏ผ‰
ๅ‹•็”ป่ช่ญ˜ใ‚ตใƒผใƒ™ใ‚คv1๏ผˆใƒกใ‚ฟใ‚ตใƒผใƒ™ใ‚ค ๏ผ‰
ย 
์ œ10ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - ๋ฐ‘๋ฐ”๋‹ฅ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜๋Š” trivago ์ถ”์ฒœ์‹œ์Šคํ…œ
์ œ10ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - ๋ฐ‘๋ฐ”๋‹ฅ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜๋Š” trivago ์ถ”์ฒœ์‹œ์Šคํ…œ์ œ10ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - ๋ฐ‘๋ฐ”๋‹ฅ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜๋Š” trivago ์ถ”์ฒœ์‹œ์Šคํ…œ
์ œ10ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - ๋ฐ‘๋ฐ”๋‹ฅ๋ถ€ํ„ฐ ์‹œ์ž‘ํ•˜๋Š” trivago ์ถ”์ฒœ์‹œ์Šคํ…œ
ย 
Introduction To Tensorflow
Introduction To TensorflowIntroduction To Tensorflow
Introduction To Tensorflow
ย 
Text Classification in Python โ€“ using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python โ€“ using Pandas, scikit-learn, IPython Notebook ...Text Classification in Python โ€“ using Pandas, scikit-learn, IPython Notebook ...
Text Classification in Python โ€“ using Pandas, scikit-learn, IPython Notebook ...
ย 
Deep neural networks cnn rnn_ae_some practical techniques
Deep neural networks cnn rnn_ae_some practical techniquesDeep neural networks cnn rnn_ae_some practical techniques
Deep neural networks cnn rnn_ae_some practical techniques
ย 
ใ€DL่ผช่ชญไผšใ€‘Novel View Synthesis with Diffusion Models
ใ€DL่ผช่ชญไผšใ€‘Novel View Synthesis with Diffusion Modelsใ€DL่ผช่ชญไผšใ€‘Novel View Synthesis with Diffusion Models
ใ€DL่ผช่ชญไผšใ€‘Novel View Synthesis with Diffusion Models
ย 
ๆ–‡็Œฎ็ดนไป‹๏ผšVideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
ๆ–‡็Œฎ็ดนไป‹๏ผšVideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understandingๆ–‡็Œฎ็ดนไป‹๏ผšVideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
ๆ–‡็Œฎ็ดนไป‹๏ผšVideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
ย 
2017:10:20่ซ–ๆ–‡่ชญใฟไผš"Image-to-Image Translation with Conditional Adversarial Netwo...
2017:10:20่ซ–ๆ–‡่ชญใฟไผš"Image-to-Image Translation with Conditional Adversarial Netwo...2017:10:20่ซ–ๆ–‡่ชญใฟไผš"Image-to-Image Translation with Conditional Adversarial Netwo...
2017:10:20่ซ–ๆ–‡่ชญใฟไผš"Image-to-Image Translation with Conditional Adversarial Netwo...
ย 
์ œ 16ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [WHY ํŒ€] : ๋‚˜๋งŒ์˜ ์›นํˆฐ์ผ๊ธฐ Toonight
์ œ 16ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [WHY ํŒ€] : ๋‚˜๋งŒ์˜ ์›นํˆฐ์ผ๊ธฐ Toonight์ œ 16ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [WHY ํŒ€] : ๋‚˜๋งŒ์˜ ์›นํˆฐ์ผ๊ธฐ Toonight
์ œ 16ํšŒ ๋ณด์•„์ฆˆ(BOAZ) ๋น…๋ฐ์ดํ„ฐ ์ปจํผ๋Ÿฐ์Šค - [WHY ํŒ€] : ๋‚˜๋งŒ์˜ ์›นํˆฐ์ผ๊ธฐ Toonight
ย 
[DL่ผช่ชญไผš]Depth Prediction Without the Sensors: Leveraging Structure for Unsuper...
[DL่ผช่ชญไผš]Depth Prediction Without the Sensors: Leveraging Structure for Unsuper...[DL่ผช่ชญไผš]Depth Prediction Without the Sensors: Leveraging Structure for Unsuper...
[DL่ผช่ชญไผš]Depth Prediction Without the Sensors: Leveraging Structure for Unsuper...
ย 
Image classification using cnn
Image classification using cnnImage classification using cnn
Image classification using cnn
ย 
Text MIning
Text MIningText MIning
Text MIning
ย 

Similar to CNN for sentence classification

LeNet & GoogLeNet
LeNet & GoogLeNetLeNet & GoogLeNet
DP ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž.pdf
DP ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž.pdfDP ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž.pdf
DP ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž.pdf
Ho Jeong Im
ย 
์ธ๊ณต์‹ ๊ฒฝ๋ง
์ธ๊ณต์‹ ๊ฒฝ๋ง์ธ๊ณต์‹ ๊ฒฝ๋ง
์ธ๊ณต์‹ ๊ฒฝ๋ง
์ข…์—ด ํ˜„
ย 
๊น€๋ฏผ์šฑ, (๋‹ฌ๋น›์กฐ๊ฐ์‚ฌ) ์—˜๋ฆญ์„œ๋ฅผ ์ด์šฉํ•œ mmorpg ์„œ๋ฒ„ ๊ฐœ๋ฐœ, NDC2019
๊น€๋ฏผ์šฑ, (๋‹ฌ๋น›์กฐ๊ฐ์‚ฌ) ์—˜๋ฆญ์„œ๋ฅผ ์ด์šฉํ•œ mmorpg ์„œ๋ฒ„ ๊ฐœ๋ฐœ, NDC2019๊น€๋ฏผ์šฑ, (๋‹ฌ๋น›์กฐ๊ฐ์‚ฌ) ์—˜๋ฆญ์„œ๋ฅผ ์ด์šฉํ•œ mmorpg ์„œ๋ฒ„ ๊ฐœ๋ฐœ, NDC2019
๊น€๋ฏผ์šฑ, (๋‹ฌ๋น›์กฐ๊ฐ์‚ฌ) ์—˜๋ฆญ์„œ๋ฅผ ์ด์šฉํ•œ mmorpg ์„œ๋ฒ„ ๊ฐœ๋ฐœ, NDC2019min woog kim
ย 
I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)
Susang Kim
ย 
Image Deep Learning ์‹ค๋ฌด์ ์šฉ
Image Deep Learning ์‹ค๋ฌด์ ์šฉImage Deep Learning ์‹ค๋ฌด์ ์šฉ
Image Deep Learning ์‹ค๋ฌด์ ์šฉ
Youngjae Kim
ย 
History of Vision AI
History of Vision AIHistory of Vision AI
History of Vision AI
Tae Young Lee
ย 
Mylab
MylabMylab
Mylab
Lee Gyeong Hoon
ย 
[226]แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„แ…ฆแ†จแ„‰แ…ณแ„แ…ณแ„†แ…กแ„‹แ…ตแ„‚แ…ตแ†ผ แ„€แ…ตแ„‰แ…ฎแ†ฏ แ„’แ…กแ„Œแ…ฅแ†ผแ„‹แ…ฎ
[226]แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„แ…ฆแ†จแ„‰แ…ณแ„แ…ณแ„†แ…กแ„‹แ…ตแ„‚แ…ตแ†ผ แ„€แ…ตแ„‰แ…ฎแ†ฏ แ„’แ…กแ„Œแ…ฅแ†ผแ„‹แ…ฎ[226]แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„แ…ฆแ†จแ„‰แ…ณแ„แ…ณแ„†แ…กแ„‹แ…ตแ„‚แ…ตแ†ผ แ„€แ…ตแ„‰แ…ฎแ†ฏ แ„’แ…กแ„Œแ…ฅแ†ผแ„‹แ…ฎ
[226]แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„แ…ฆแ†จแ„‰แ…ณแ„แ…ณแ„†แ…กแ„‹แ…ตแ„‚แ…ตแ†ผ แ„€แ…ตแ„‰แ…ฎแ†ฏ แ„’แ…กแ„Œแ…ฅแ†ผแ„‹แ…ฎ
NAVER D2
ย 
Convolutional rnn
Convolutional rnnConvolutional rnn
Convolutional rnn
Lee Gyeong Hoon
ย 
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ฑ—๋ด‡ ์„œ๋น„์Šค ๊ฐœ๋ฐœ 3์ผ์ฐจ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ฑ—๋ด‡ ์„œ๋น„์Šค ๊ฐœ๋ฐœ 3์ผ์ฐจํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ฑ—๋ด‡ ์„œ๋น„์Šค ๊ฐœ๋ฐœ 3์ผ์ฐจ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ฑ—๋ด‡ ์„œ๋น„์Šค ๊ฐœ๋ฐœ 3์ผ์ฐจ
Taekyung Han
ย 
Deep learningwithkeras ch3_1
Deep learningwithkeras ch3_1Deep learningwithkeras ch3_1
Deep learningwithkeras ch3_1
PartPrime
ย 
์ž์—ฐ์–ด4 | 1์ฐจ๊ฐ•์˜
์ž์—ฐ์–ด4 | 1์ฐจ๊ฐ•์˜์ž์—ฐ์–ด4 | 1์ฐจ๊ฐ•์˜
์ž์—ฐ์–ด4 | 1์ฐจ๊ฐ•์˜
๊น€์šฉ๋ฒ” | ๋ฌด์˜์ธํ„ฐ๋‚ด์‡ผ๋‚ 
ย 
Scalability
ScalabilityScalability
ScalabilityDongwook Lee
ย 
Scala, Scalability
Scala, ScalabilityScala, Scalability
Scala, Scalability
Dongwook Lee
ย 
Multiple vector encoding (KOR. version)
Multiple vector encoding (KOR. version)Multiple vector encoding (KOR. version)
Multiple vector encoding (KOR. version)
์ƒ๊ทผ ์ •
ย 
Bidirectional attention flow for machine comprehension
Bidirectional attention flow for machine comprehensionBidirectional attention flow for machine comprehension
Bidirectional attention flow for machine comprehension
Woodam Lim
ย 
Attention is all you need ์„ค๋ช…
Attention is all you need ์„ค๋ช…Attention is all you need ์„ค๋ช…
Attention is all you need ์„ค๋ช…
Junho Lee
ย 
NDC11_์Šˆํผํด๋ž˜์Šค
NDC11_์Šˆํผํด๋ž˜์ŠคNDC11_์Šˆํผํด๋ž˜์Šค
NDC11_์Šˆํผํด๋ž˜์Šค
noerror
ย 
Adversarial Attack in Neural Machine Translation
Adversarial Attack in Neural Machine TranslationAdversarial Attack in Neural Machine Translation
Adversarial Attack in Neural Machine Translation
HyunKyu Jeon
ย 

Similar to CNN for sentence classification (20)

LeNet & GoogLeNet
LeNet & GoogLeNetLeNet & GoogLeNet
LeNet & GoogLeNet
ย 
DP ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž.pdf
DP ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž.pdfDP ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž.pdf
DP ์•Œ๊ณ ๋ฆฌ์ฆ˜์— ๋Œ€ํ•ด ์•Œ์•„๋ณด์ž.pdf
ย 
์ธ๊ณต์‹ ๊ฒฝ๋ง
์ธ๊ณต์‹ ๊ฒฝ๋ง์ธ๊ณต์‹ ๊ฒฝ๋ง
์ธ๊ณต์‹ ๊ฒฝ๋ง
ย 
๊น€๋ฏผ์šฑ, (๋‹ฌ๋น›์กฐ๊ฐ์‚ฌ) ์—˜๋ฆญ์„œ๋ฅผ ์ด์šฉํ•œ mmorpg ์„œ๋ฒ„ ๊ฐœ๋ฐœ, NDC2019
๊น€๋ฏผ์šฑ, (๋‹ฌ๋น›์กฐ๊ฐ์‚ฌ) ์—˜๋ฆญ์„œ๋ฅผ ์ด์šฉํ•œ mmorpg ์„œ๋ฒ„ ๊ฐœ๋ฐœ, NDC2019๊น€๋ฏผ์šฑ, (๋‹ฌ๋น›์กฐ๊ฐ์‚ฌ) ์—˜๋ฆญ์„œ๋ฅผ ์ด์šฉํ•œ mmorpg ์„œ๋ฒ„ ๊ฐœ๋ฐœ, NDC2019
๊น€๋ฏผ์šฑ, (๋‹ฌ๋น›์กฐ๊ฐ์‚ฌ) ์—˜๋ฆญ์„œ๋ฅผ ์ด์šฉํ•œ mmorpg ์„œ๋ฒ„ ๊ฐœ๋ฐœ, NDC2019
ย 
I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)I3D and Kinetics datasets (Action Recognition)
I3D and Kinetics datasets (Action Recognition)
ย 
Image Deep Learning ์‹ค๋ฌด์ ์šฉ
Image Deep Learning ์‹ค๋ฌด์ ์šฉImage Deep Learning ์‹ค๋ฌด์ ์šฉ
Image Deep Learning ์‹ค๋ฌด์ ์šฉ
ย 
History of Vision AI
History of Vision AIHistory of Vision AI
History of Vision AI
ย 
Mylab
MylabMylab
Mylab
ย 
[226]แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„แ…ฆแ†จแ„‰แ…ณแ„แ…ณแ„†แ…กแ„‹แ…ตแ„‚แ…ตแ†ผ แ„€แ…ตแ„‰แ…ฎแ†ฏ แ„’แ…กแ„Œแ…ฅแ†ผแ„‹แ…ฎ
[226]แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„แ…ฆแ†จแ„‰แ…ณแ„แ…ณแ„†แ…กแ„‹แ…ตแ„‚แ…ตแ†ผ แ„€แ…ตแ„‰แ…ฎแ†ฏ แ„’แ…กแ„Œแ…ฅแ†ผแ„‹แ…ฎ[226]แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„แ…ฆแ†จแ„‰แ…ณแ„แ…ณแ„†แ…กแ„‹แ…ตแ„‚แ…ตแ†ผ แ„€แ…ตแ„‰แ…ฎแ†ฏ แ„’แ…กแ„Œแ…ฅแ†ผแ„‹แ…ฎ
[226]แ„ƒแ…ขแ„‹แ…ญแ†ผแ„…แ…ฃแ†ผ แ„แ…ฆแ†จแ„‰แ…ณแ„แ…ณแ„†แ…กแ„‹แ…ตแ„‚แ…ตแ†ผ แ„€แ…ตแ„‰แ…ฎแ†ฏ แ„’แ…กแ„Œแ…ฅแ†ผแ„‹แ…ฎ
ย 
Convolutional rnn
Convolutional rnnConvolutional rnn
Convolutional rnn
ย 
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ฑ—๋ด‡ ์„œ๋น„์Šค ๊ฐœ๋ฐœ 3์ผ์ฐจ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ฑ—๋ด‡ ์„œ๋น„์Šค ๊ฐœ๋ฐœ 3์ผ์ฐจํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ฑ—๋ด‡ ์„œ๋น„์Šค ๊ฐœ๋ฐœ 3์ผ์ฐจ
ํŒŒ์ด์ฌ์„ ํ™œ์šฉํ•œ ์ฑ—๋ด‡ ์„œ๋น„์Šค ๊ฐœ๋ฐœ 3์ผ์ฐจ
ย 
Deep learningwithkeras ch3_1
Deep learningwithkeras ch3_1Deep learningwithkeras ch3_1
Deep learningwithkeras ch3_1
ย 
์ž์—ฐ์–ด4 | 1์ฐจ๊ฐ•์˜
์ž์—ฐ์–ด4 | 1์ฐจ๊ฐ•์˜์ž์—ฐ์–ด4 | 1์ฐจ๊ฐ•์˜
์ž์—ฐ์–ด4 | 1์ฐจ๊ฐ•์˜
ย 
Scalability
ScalabilityScalability
Scalability
ย 
Scala, Scalability
Scala, ScalabilityScala, Scalability
Scala, Scalability
ย 
Multiple vector encoding (KOR. version)
Multiple vector encoding (KOR. version)Multiple vector encoding (KOR. version)
Multiple vector encoding (KOR. version)
ย 
Bidirectional attention flow for machine comprehension
Bidirectional attention flow for machine comprehensionBidirectional attention flow for machine comprehension
Bidirectional attention flow for machine comprehension
ย 
Attention is all you need ์„ค๋ช…
Attention is all you need ์„ค๋ช…Attention is all you need ์„ค๋ช…
Attention is all you need ์„ค๋ช…
ย 
NDC11_์Šˆํผํด๋ž˜์Šค
NDC11_์Šˆํผํด๋ž˜์ŠคNDC11_์Šˆํผํด๋ž˜์Šค
NDC11_์Šˆํผํด๋ž˜์Šค
ย 
Adversarial Attack in Neural Machine Translation
Adversarial Attack in Neural Machine TranslationAdversarial Attack in Neural Machine Translation
Adversarial Attack in Neural Machine Translation
ย 

More from KyeongUkJang

Photo wake up - 3d character animation from a single photo
Photo wake up - 3d character animation from a single photoPhoto wake up - 3d character animation from a single photo
Photo wake up - 3d character animation from a single photo
KyeongUkJang
ย 
YOLO
YOLOYOLO
YOLO
KyeongUkJang
ย 
AlphagoZero
AlphagoZeroAlphagoZero
AlphagoZero
KyeongUkJang
ย 
GoogLenet
GoogLenetGoogLenet
GoogLenet
KyeongUkJang
ย 
GAN - Generative Adversarial Nets
GAN - Generative Adversarial NetsGAN - Generative Adversarial Nets
GAN - Generative Adversarial Nets
KyeongUkJang
ย 
Distilling the knowledge in a neural network
Distilling the knowledge in a neural networkDistilling the knowledge in a neural network
Distilling the knowledge in a neural network
KyeongUkJang
ย 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
KyeongUkJang
ย 
Gaussian Mixture Model
Gaussian Mixture ModelGaussian Mixture Model
Gaussian Mixture Model
KyeongUkJang
ย 
Visualizing data using t-SNE
Visualizing data using t-SNEVisualizing data using t-SNE
Visualizing data using t-SNE
KyeongUkJang
ย 
Playing atari with deep reinforcement learning
Playing atari with deep reinforcement learningPlaying atari with deep reinforcement learning
Playing atari with deep reinforcement learning
KyeongUkJang
ย 
Chapter 20 - GAN
Chapter 20 - GANChapter 20 - GAN
Chapter 20 - GAN
KyeongUkJang
ย 
Chapter 20 - VAE
Chapter 20 - VAEChapter 20 - VAE
Chapter 20 - VAE
KyeongUkJang
ย 
Chapter 20 Deep generative models
Chapter 20 Deep generative modelsChapter 20 Deep generative models
Chapter 20 Deep generative models
KyeongUkJang
ย 
Chapter 19 Variational Inference
Chapter 19 Variational InferenceChapter 19 Variational Inference
Chapter 19 Variational Inference
KyeongUkJang
ย 
Natural Language Processing(NLP) - basic 2
Natural Language Processing(NLP) - basic 2Natural Language Processing(NLP) - basic 2
Natural Language Processing(NLP) - basic 2
KyeongUkJang
ย 
Natural Language Processing(NLP) - Basic
Natural Language Processing(NLP) - BasicNatural Language Processing(NLP) - Basic
Natural Language Processing(NLP) - Basic
KyeongUkJang
ย 
Chapter 17 monte carlo methods
Chapter 17 monte carlo methodsChapter 17 monte carlo methods
Chapter 17 monte carlo methods
KyeongUkJang
ย 
Chapter 16 structured probabilistic models for deep learning - 2
Chapter 16 structured probabilistic models for deep learning - 2Chapter 16 structured probabilistic models for deep learning - 2
Chapter 16 structured probabilistic models for deep learning - 2
KyeongUkJang
ย 
Chapter 16 structured probabilistic models for deep learning - 1
Chapter 16 structured probabilistic models for deep learning - 1Chapter 16 structured probabilistic models for deep learning - 1
Chapter 16 structured probabilistic models for deep learning - 1
KyeongUkJang
ย 
Chapter 15 Representation learning - 2
Chapter 15 Representation learning - 2Chapter 15 Representation learning - 2
Chapter 15 Representation learning - 2
KyeongUkJang
ย 

More from KyeongUkJang (20)

Photo wake up - 3d character animation from a single photo
Photo wake up - 3d character animation from a single photoPhoto wake up - 3d character animation from a single photo
Photo wake up - 3d character animation from a single photo
ย 
YOLO
YOLOYOLO
YOLO
ย 
AlphagoZero
AlphagoZeroAlphagoZero
AlphagoZero
ย 
GoogLenet
GoogLenetGoogLenet
GoogLenet
ย 
GAN - Generative Adversarial Nets
GAN - Generative Adversarial NetsGAN - Generative Adversarial Nets
GAN - Generative Adversarial Nets
ย 
Distilling the knowledge in a neural network
Distilling the knowledge in a neural networkDistilling the knowledge in a neural network
Distilling the knowledge in a neural network
ย 
Latent Dirichlet Allocation
Latent Dirichlet AllocationLatent Dirichlet Allocation
Latent Dirichlet Allocation
ย 
Gaussian Mixture Model
Gaussian Mixture ModelGaussian Mixture Model
Gaussian Mixture Model
ย 
Visualizing data using t-SNE
Visualizing data using t-SNEVisualizing data using t-SNE
Visualizing data using t-SNE
ย 
Playing atari with deep reinforcement learning
Playing atari with deep reinforcement learningPlaying atari with deep reinforcement learning
Playing atari with deep reinforcement learning
ย 
Chapter 20 - GAN
Chapter 20 - GANChapter 20 - GAN
Chapter 20 - GAN
ย 
Chapter 20 - VAE
Chapter 20 - VAEChapter 20 - VAE
Chapter 20 - VAE
ย 
Chapter 20 Deep generative models
Chapter 20 Deep generative modelsChapter 20 Deep generative models
Chapter 20 Deep generative models
ย 
Chapter 19 Variational Inference
Chapter 19 Variational InferenceChapter 19 Variational Inference
Chapter 19 Variational Inference
ย 
Natural Language Processing(NLP) - basic 2
Natural Language Processing(NLP) - basic 2Natural Language Processing(NLP) - basic 2
Natural Language Processing(NLP) - basic 2
ย 
Natural Language Processing(NLP) - Basic
Natural Language Processing(NLP) - BasicNatural Language Processing(NLP) - Basic
Natural Language Processing(NLP) - Basic
ย 
Chapter 17 monte carlo methods
Chapter 17 monte carlo methodsChapter 17 monte carlo methods
Chapter 17 monte carlo methods
ย 
Chapter 16 structured probabilistic models for deep learning - 2
Chapter 16 structured probabilistic models for deep learning - 2Chapter 16 structured probabilistic models for deep learning - 2
Chapter 16 structured probabilistic models for deep learning - 2
ย 
Chapter 16 structured probabilistic models for deep learning - 1
Chapter 16 structured probabilistic models for deep learning - 1Chapter 16 structured probabilistic models for deep learning - 1
Chapter 16 structured probabilistic models for deep learning - 1
ย 
Chapter 15 Representation learning - 2
Chapter 15 Representation learning - 2Chapter 15 Representation learning - 2
Chapter 15 Representation learning - 2
ย 

CNN for sentence classification

  • 1. Convolutional Neural Networks for Sentence Classification ์ „ํฌ์„ 
  • 2. http://www.people.fas.harvard.edu/~yoonkim/ 0. Abstract (ex: ๊ธ์ •/๋ถ€์ •)(+word2vec) pre-trained๋œ word vectors๋ฅผ ๊ฐ€์ง€๊ณ  ํ•™์Šต๋œ CNN์„ ํ†ตํ•ด ๋ฌธ์žฅ ๋ถ„๋ฅ˜๋ฅผ ํ•œ ์—ฌ๋Ÿฌ ์—ฐ๊ตฌ์— ๋Œ€ํ•ด ๋ณด๊ณ  - ์•ฝ๊ฐ„์˜ hyperparameter tuning์ด ๋“ค์–ด๊ฐ„ ๊ฐ„๋‹จํ•œ CNN๋„ ์ข‹์€ ๊ฒฐ๊ณผ๋ฅผ ๋‚˜ํƒ€๋ƒ„ - ์ด 7๊ฐœ task ์ค‘ 4๊ฐœ์—์„œ ๋‹ค๋ฅธ ๋ชจ๋ธ๋“ค ์ค‘ ๊ฐ€์žฅ ์ข‹์€ ๊ฒฐ๊ณผ๋ฅผ ๋ณด์ž„
  • 3. Word Embedding : ๋‹จ์–ด๋ฅผ ์ €์ฐจ์›์˜ ๋ฒกํ„ฐ๋กœ ํ‘œํ˜„ (ex: Word2Vec, GloVe ๋“ฑ) 1. Introduction - Word2Vec ๋ฐ€์ง‘ ํ‘œํ˜„(Dense Representation) : ๊ฐ ๋‹จ์–ด์˜ ํŠน์ง•์ด ํ‘œํ˜„๋œ ๋ฒกํ„ฐ (์ €์ฐจ์›) ํฌ์†Œ ํ‘œํ˜„(Sparse Representation) : ๋‹จ์–ด๋ณ„ ID ์ƒ์„ฑ (๊ณ ์ฐจ์›(๋‹จ์–ด ๊ฐœ์ˆ˜ ๋งŒํผ์˜ ์ฐจ์›), one-hot encoding) ๋‹จ์  : ๊ณ ์ฐจ์› + ๋‹จ์–ด ๊ฐ„ ์œ ์‚ฌ์„ฑ์„ ํ‘œํ˜„ํ•  ์ˆ˜ X
  • 4. (1) CBOW(Continuous Bag of Words) : ์ฃผ๋ณ€์— ์žˆ๋Š” ๋‹จ์–ด๋กœ ์ค‘๊ฐ„ ๋‹จ์–ด ์˜ˆ์ธก ex) ๋ฌธ์žฅ โ€œThe fat cat sat on the matโ€ window size = 2 1. Introduction - Word2Vec
  • 5. [0, 1, 0, 0, 0, 0, 0] [0, 0, 1, 0, 0, 0, 0] [0, 0, 0, 0, 1, 0, 0] [0, 0, 0, 0, 0, 1, 0] [0, 0, 0, 1, 0, 0, 0] output : ์ค‘์‹ฌ ๋‹จ์–ด input : ์ฃผ๋ณ€ ๋‹จ์–ด (1) CBOW(Continuous Bag of Words) : ์ฃผ๋ณ€์— ์žˆ๋Š” ๋‹จ์–ด๋กœ ์ค‘๊ฐ„ ๋‹จ์–ด ์˜ˆ์ธก 1. Introduction - Word2Vec
  • 6. 1. Introduction - Word2Vec (1) CBOW: predict the center word from its surrounding words. m: window size, V: number of words in the sentence (= 7), N: hidden layer size (= 5)
  • 7. 1. Introduction - Word2Vec (1) CBOW: the output layer converts the score vector z into probabilities with softmax: softmax(z)_i = exp(z_i) / Σ_j exp(z_j)
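CBOW's output layer turns scores into a probability distribution over the vocabulary via softmax; a minimal sketch (with the standard max-shift for numerical stability, an implementation detail not on the slide):

```python
import math

def softmax(z):
    """softmax(z)_i = exp(z_i) / sum_j exp(z_j), shifted by max(z) for stability."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    s = sum(exps)
    return [e / s for e in exps]

probs = softmax([2.0, 1.0, 0.1])   # probabilities summing to 1
```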
  • 8. 1. Introduction - Word2Vec (2) skip-gram: predict the surrounding words from the center word. - The training mechanism is the same as CBOW.
  • 9. 1. Introduction – CNN Filter (3×3): [[1, 0, 1], [0, 1, 0], [1, 0, 1]]
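Sliding this 3×3 filter over an input and summing element-wise products at each position is the convolution the slide illustrates. A pure-Python sketch with a hypothetical 4×4 input (toy values, not from the slide):

```python
# The slide's 3x3 filter.
FILTER = [[1, 0, 1],
          [0, 1, 0],
          [1, 0, 1]]

def conv2d(image, filt):
    """Valid convolution (stride 1, no padding): output is (H-fh+1) x (W-fw+1)."""
    fh, fw = len(filt), len(filt[0])
    out_h = len(image) - fh + 1
    out_w = len(image[0]) - fw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            out[i][j] = sum(image[i + a][j + b] * filt[a][b]
                            for a in range(fh) for b in range(fw))
    return out

image = [[1, 1, 1, 0],      # hypothetical 4x4 binary input
         [0, 1, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
fmap = conv2d(image, FILTER)   # 2x2 feature map
```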
  • 10. 2. Model – preparing the data (word vectors) Word2Vec (CBOW) pre-trained on the 100-billion-word Google News corpus is used for the word vectors (words missing from the pre-trained vectors are randomly initialized). n: number of words in the sentence, k: word-vector dimension. Conv layers with filters of several different window sizes are built in parallel (filter size: window size × k), e.g. window size = 2 and window size = 3.
  • 11. 2. Model – static | non-static | multi-channel non-static: backprop runs all the way into the input word vectors. static: backprop runs only up to the conv layer (the word vectors stay fixed). multi-channel: one static channel and one non-static channel.
  • 12. 2. Model – step 1: concatenation x_i ∈ ℝ^k: the k-dimensional word vector of the i-th word in the sentence (x_1, x_2, …, x_9).
  • 13. (โŠ• : concatenate operator h : window size) ๐‘ฅ๐‘–:๐‘–+โ„Žโˆ’1 = ๐‘ฅ๐‘– โŠ• ๐‘ฅ๐‘–+1 โŠ• โ€ฆ โŠ• ๐‘ฅ๐‘–+โ„Žโˆ’1 : ๐‘–๋ฒˆ์งธ๋ถ€ํ„ฐ โ„Ž๊ฐœ์˜ ๋‹จ์–ด concatenate ๐‘ฅ1 ๐‘ฅ2 ๐‘ฅ9 . . . . . ๐‘ฅ1:2 ๐‘ฅ8:9 ๐‘ฅ4:6 2. Model โ€“ step1 : concatenation
  • 14. 2. Model – step 2: building the conv layer c_i = f(w · x_{i:i+h−1} + b) (f: non-linear function such as tanh, h: window size, b: bias term)
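One feature c_i is just a dot product of the filter with the concatenated window, plus a bias, passed through the non-linearity. A minimal sketch with made-up weights (the values are assumptions for illustration, not the paper's):

```python
import math

def conv_feature(w, window_vec, b):
    """c_i = f(w . x_{i:i+h-1} + b) with f = tanh, as on the slide."""
    dot = sum(wi * xi for wi, xi in zip(w, window_vec))
    return math.tanh(dot + b)

w = [0.5, -0.2, 0.1, 0.3]          # filter weights, length h*k (toy values)
x_window = [1.0, 0.0, 2.0, -1.0]   # concatenated window x_{i:i+h-1} (toy values)
c_i = conv_feature(w, x_window, b=0.1)
```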
  • 15. 2. Model – step 3: max-over-time pooling c = [c_1, …, c_{n−h+1}]: feature map, ĉ = max{c}. A feature map is produced per window size, so each conv layer yields a feature map of a different length → only the largest value in each feature map is kept.
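The point of max-over-time pooling is that it collapses variable-length feature maps (length n − h + 1 depends on the window size h) to one value per filter; a short sketch:

```python
# Each filter with window size h over an n-word sentence yields n - h + 1
# features; max-over-time pooling keeps a single value per filter.
def feature_map_length(n, h):
    return n - h + 1

def max_over_time(feature_map):
    return max(feature_map)

n = 9                                    # sentence length
assert feature_map_length(n, 2) == 8     # window size 2 -> 8 features
assert feature_map_length(n, 3) == 7     # window size 3 -> 7 features
c_hat = max_over_time([0.1, 0.9, -0.3, 0.4])   # one pooled value per filter
```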
  • 16. 2. Model – step 4: softmax z = [ĉ_1, …, ĉ_m] (m: number of filters); the final output is produced by passing z through the softmax function.
  • 17. 2.1 Regularization - Dropout (keep probability p = 0.5): half of the features are removed by dropout → 2–4% accuracy gain. Dropout is used only during training, not at test time. Training: instead of y = w · z + b, compute y = w · (z ∘ r) + b, where z = [ĉ_1, …, ĉ_m] (m: number of filters), r is a masking vector of Bernoulli random variables (0/1 indicating dropout), and ∘ is element-wise multiplication. - Since dropout is not applied at test time, the weights are rescaled as ŵ = p·w. - L2-norm constraint: whenever ‖w‖₂ > s, rescale w so that ‖w‖₂ = s.
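The three regularization pieces above (Bernoulli masking in training, test-time rescaling ŵ = p·w, and the L2-norm constraint) can be sketched as follows; this is an illustrative pure-Python sketch, not the paper's implementation:

```python
import random

def dropout_train(z, p=0.5, rng=random):
    """Training: mask each feature z_i with a Bernoulli(p) keep-variable r_i."""
    r = [1 if rng.random() < p else 0 for _ in z]
    return [zi * ri for zi, ri in zip(z, r)], r

def rescale_for_test(w, p=0.5):
    """Test time: dropout is off, so the learned weights become w_hat = p * w."""
    return [p * wi for wi in w]

def l2_constrain(w, s=3.0):
    """If ||w||_2 > s, rescale w so that ||w||_2 = s; otherwise leave it alone."""
    norm = sum(wi * wi for wi in w) ** 0.5
    if norm > s:
        return [wi * s / norm for wi in w]
    return w
```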
  • 18. 3. Datasets and Experimental Setup

dataset | description | labels | avg. sentence length | dataset size | vocab size | words in pre-trained vectors | test set size
MR (Movie Review) | movie-review sentences | 2 (pos/neg) | 20 | 10662 | 18765 | 16448 | 10-fold CV
SST-1 (Stanford Sentiment Treebank-1) | MR with a provided test set and 5 labels | 5 (very pos / pos / neutral / neg / very neg) | 18 | 11855 | 17836 | 16262 | 2210
SST-2 (Stanford Sentiment Treebank-2) | SST-1 with neutral removed, binary labels | 2 (pos/neg) | 19 | 9613 | 16185 | 14838 | 1821
Subj | subjectivity dataset | 2 (subjective/objective) | 23 | 10000 | 21323 | 17913 | 10-fold CV
TREC | question dataset | 6 (question types) | 10 | 5952 | 9592 | 9125 | 500
CR (Consumer Review) | consumer-review sentences | 2 (pos/neg) | 19 | 3775 | 5340 | 5046 | 10-fold CV
MPQA | opinion-polarity dataset | 2 (polarity) | 3 | 10606 | 6246 | 6083 | 10-fold CV
  • 19. 3.1 Hyperparameters and Training Chosen by grid search on the SST-2 validation set: - Activation function: ReLU - Filter windows: h = 3, 4, 5 - Feature maps: 100 per window size - Dropout rate: p = 0.5 - L2 constraint: s = 3 - Mini-batch size: 50 3.2 Pre-trained Word Vectors (as noted earlier) Word2Vec pre-trained on the 100-billion-word Google News corpus is used for the word vectors. * Trained with CBOW / word-vector dimension: 300 * Words missing from the pre-trained vectors are randomly initialized.
  • 20. 3.3 Model Variations - CNN-rand: word vectors randomly initialized - CNN-static: word2vec + static - CNN-non-static: word2vec + non-static - CNN-multichannel: two sets (channels) of word2vec: one static, one non-static 4. Results and Discussion (results table from the paper; annotated on the slide: poor performance)
  • 21. 4.1 Multichannel vs. Single Channel Models → the multichannel model is not always better than the single-channel models! 4.2 Static vs. Non-static Representations → non-static also captures syntactic aspects that static does not.
  • 22. Reference sites https://wikidocs.net/33520 word2vec https://www.youtube.com/watch?v=EAJoRA0KX7I&list=PLoROMvodv4rOhcuXMZkNm7j3fVwBBY42z&index=11 CS224n – Lec11 https://www.youtube.com/watch?v=IRB2vXSet2E PR-015 https://zyint.tistory.com/575 paper summary 1 https://arclab.tistory.com/149 paper summary 2 https://ratsgo.github.io/natural%20language%20processing/2017/08/16/deepNLP/ NLP + Deep Learning
  • 23. * Why not RNN, but CNN? Usually, NLP → RNN, but RNN had its own problems...
  • 26. Why word2vec made little difference here: - word2vec does add new features, but the dataset is large (650K+ examples), so data size alone already plays a big enough role. - Many different categories were tested, so the same word was often used with multiple meanings. * Korean version