15-381/781
Fall 2016
Deep Learning
Instructors: Ariel Procaccia and Emma Brunskill
Slides courtesy of Zico Kolter
Popularization of backprop
for training neural networks
Academic papers on unsupervised
pre-training for deep networks
“AlexNet” deep neural network
wins ImageNet 2012 contest
Facebook launches AI research
center, Google buys DeepMind
Figure 2: An illustration of the architecture of our CNN, explicitly showing the delineation of responsibilities
between the two GPUs. One GPU runs the layer-parts at the top of the figure while the other runs the layer-parts
at the bottom. The GPUs communicate only at certain layers. The network’s input is 150,528-dimensional, and
the number of neurons in the network’s remaining layers is given by 253,440–186,624–64,896–64,896–43,264–
4096–4096–1000.
neurons in a kernel map). The second convolutional layer takes as input the (response-normalized
and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5 × 5 × 48.
The third, fourth, and fifth convolutional layers are connected to one another without any intervening
pooling or normalization layers. The third convolutional layer has 384 kernels of size 3 × 3 ×
256 connected to the (normalized, pooled) outputs of the second convolutional layer. The fourth
Figure 4: (Left) Eight ILSVRC-2010 test images and the five la
The correct label is written under each image, and the probabili
with a red bar (if it happens to be in the top 5). (Right) Five ILSV
A Neural Conversational Model
Figure 1. Using the seq2seq framework for modeling conversations.
and train to map “ABC” to “WXYZ” as shown in Figure 1
above. The hidden state of the model when it receives the
end of sequence symbol “<eos>” can be viewed as the
thought vector because it stores the information of the
sentence, or thought, “ABC”.
The strength of this model lies in its simplicity and generality.
We can use this model for machine translation,
question/answering, and conversations without major changes
4.2. OpenSubtitles dataset
We also tested our model on the OpenSubtitles
dataset (Tiedemann, 2009). This dataset consists of
movie conversations in XML format. It contains
sentences uttered by characters in movies. We applied a
simple processing step removing XML tags and obvious
non-conversational text (e.g., hyperlinks) from the
dataset. As turn taking is not clearly indicated, we treated
consecutive sentences assuming they were uttered by
different characters. We trained our model to predict the
next sentence given the previous one, and we did this for
every sentence (noting that this doubles our dataset size,
as each sentence is used both for context and as target).
Our training and validation split has 62M sentences (923M
tokens) as training examples, and the validation set has
26M sentences (395M tokens). The split is done in such a
way that each sentence in a pair of sentences either appear
together in the training set or test set but not both. Unlike
the previous dataset, the OpenSubtitles is quite large, and
rather noisy because consecutive sentences may be uttered
by the same character. Given the broad scope of movies,
this is an open-domain conversation dataset, contrasting
with the technical troubleshooting dataset.
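The pairing step described above (predict the next sentence from the previous one, so that interior sentences serve once as context and once as target) can be sketched as follows. The helper name `make_pairs` and the toy sentences are assumptions for illustration only; the excerpt does not show the authors' preprocessing code.

```python
def make_pairs(sentences):
    """Pair every sentence with the one that follows it: (context, target)."""
    return [(sentences[i], sentences[i + 1]) for i in range(len(sentences) - 1)]


# Toy stand-in for consecutive subtitle lines (invented for illustration).
subtitles = ["hi", "hello", "how are you ?", "fine , thanks"]
pairs = make_pairs(subtitles)

for context, target in pairs:
    print(f"{context!r} -> {target!r}")

# Interior sentences appear twice: once as a target, once as a context.
contexts = [c for c, _ in pairs]
targets = [t for _, t in pairs]
```

Because consecutive lines are paired without speaker information, some pairs will join two utterances by the same character, which is the noise source the excerpt mentions.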
5. Experiments
In this section, we describe the experimental results with
the two datasets and show some samples of the interactions
with the system that we trained. We also compare
the performance of our system against a popular rule-based
bot (CleverBot) using human evaluations on a set of 200
questions.
5.1. IT Helpdesk Troubleshooting experiments
In this experiment, we trained a single layer LSTM with
1024 memory cells using stochastic gradient descent with
gradient clipping. The vocabulary consists of the most
common 20K words, which includes special tokens
indicating turn taking and actor.
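As a rough illustration of "stochastic gradient descent with gradient clipping", the sketch below rescales a gradient whose Euclidean norm exceeds a threshold before taking the step. Norm-based clipping is only one common variant, and the excerpt does not say which form or threshold was used; the function names, learning rate, and numbers here are all assumptions.

```python
import math


def clip_by_norm(grad, max_norm):
    """Rescale grad so its Euclidean norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def sgd_step(params, grad, lr, max_norm):
    """One SGD update using the clipped gradient."""
    clipped = clip_by_norm(grad, max_norm)
    return [p - lr * g for p, g in zip(params, clipped)]


params = [1.0, -2.0]
grad = [30.0, 40.0]  # Euclidean norm 50, well above the threshold below
new_params = sgd_step(params, grad, lr=0.1, max_norm=5.0)
print(new_params)
```

Clipping keeps the direction of the gradient and only limits the step length, which is why it is often used with recurrent networks whose gradients can occasionally become very large.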
At convergence, this model achieved a perplexity of 8,
whereas an n-gram model achieved 18. Below is a few
samples of simulated troubleshooting sessions for most three
of the most common IT problems (remote access, software
Machine: hi
Human: hello
Machine: could you please let me know what are the operating systems you are using ?
Human: i am using linux
Machine: ok it was in the ssl vpn on which you stated it was broken
Machine: please check if you are logged in to corp network
Machine: what is the error that you are running please
Human: i am seeing an error related to vpn
Machine: what is the error message that you are getting when connecting to vpn using network connect ?
Human: connection refused or something like that
Machine: may i know the version of network connect you connect ?
Human: i am not sure i know that
Machine: is the network connect prompting for any user certificate
Machine: ?
Human: yes
Machine: are you at home or at work at the moment ?
Machine: office ?
Human: i am at home
Machine: try this
Machine: goto <URL> page and enter your name and password
Human: done
Machine: try logging in with <NAME> and check if its working fine now
Human: yes , now it works !
Machine: great
Machine: anything else that i can help
Human: no , that was very useful
Machine: you are other than <NAME> : )
Machine: have a nice night
(The <URL> indeed contains information about vpn access.)
Conversation 2: Browser issues.
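Inputs: $x^{(i)} \in \mathbb{R}^n$
Outputs: $y^{(i)} \in \mathcal{Y}$ (e.g., $\mathbb{R}$, $\{-1, +1\}$, $\{1, \ldots, p\}$)
Parameters: $\theta \in \mathbb{R}^k$
Hypothesis function: $h_\theta : \mathbb{R}^n \to \mathbb{R}$
Loss function: $\ell : \mathbb{R} \times \mathcal{Y} \to \mathbb{R}_+$
$$\operatorname*{minimize}_{\theta} \; \sum_{i=1}^{m} \ell\big(h_\theta(x^{(i)}), y^{(i)}\big)$$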
$h_\theta(x^{(i)}) = \theta^T \phi(x^{(i)})$
$\phi : \mathbb{R}^n \to \mathbb{R}^k$
$x^{(i)} = [\,\ldots\, i\,]$
$$\phi(x^{(i)}) = \begin{bmatrix} 1 \\ x^{(i)} \\ (x^{(i)})^2 \end{bmatrix}$$
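A minimal sketch of the linear hypothesis with a fixed feature map, assuming a scalar input and the quadratic features $[1, x, x^2]$ shown above. The parameter values are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def phi(x):
    """Feature map phi(x) = [1, x, x^2] for a scalar input x."""
    return [1.0, x, x * x]


def h(theta, x):
    """Linear hypothesis h_theta(x) = theta^T phi(x)."""
    return sum(t * f for t, f in zip(theta, phi(x)))


theta = [2.0, -1.0, 0.5]  # hypothetical parameters, not fitted to any data
print(phi(3.0))
print(h(theta, 3.0))
```

The hypothesis is nonlinear in $x$ but linear in $\theta$, so fitting $\theta$ is still a linear problem; the limitation is that $\phi$ has to be designed by hand.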
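$h_\theta(x) = W_2 \phi(x) + b_2 = W_2 (W_1 x + b_1) + b_2$
$\theta = \{W_1 \in \mathbb{R}^{k \times n},\; b_1 \in \mathbb{R}^k,\; W_2 \in \mathbb{R}^{1 \times k},\; b_2 \in \mathbb{R}\}$
[Figure: network diagram with inputs $x_1, x_2, \ldots, x_n$, hidden units $z_1, z_2, \ldots, z_k$, and output $y$; weights $W_1, b_1$ connect inputs to hidden units and $W_2, b_2$ connect hidden units to the output]
$h_\theta(x) = W_2 (W_1 x + b_1) + b_2 = \tilde{W} x + \tilde{b}$, with $\tilde{W} = W_2 W_1$ and $\tilde{b} = W_2 b_1 + b_2$, so stacking linear layers still gives a linear function of $x$.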
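The collapse of two stacked linear layers into a single linear map can be checked numerically. The sketch below uses small hand-picked matrices (arbitrary values, with $W_1$ stored as $k \times n$ so that $W_1 x$ is defined) and plain-Python helpers whose names are assumptions.

```python
def matvec(A, x):
    """Matrix-vector product A x for a list-of-rows matrix."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def matmul(A, B):
    """Matrix product A B for list-of-rows matrices."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def add(u, v):
    """Elementwise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]  # k x n = 3 x 2
b1 = [0.5, -0.5, 1.0]
W2 = [[2.0, 1.0, -1.0]]                     # 1 x k
b2 = [0.25]

W_tilde = matmul(W2, W1)             # W2 W1
b_tilde = add(matvec(W2, b1), b2)    # W2 b1 + b2

x = [1.5, -2.0]
two_layer = add(matvec(W2, add(matvec(W1, x), b1)), b2)
collapsed = add(matvec(W_tilde, x), b_tilde)
print(two_layer, collapsed)
```

Both expressions agree for any input, which is why a nonlinearity between layers is needed for depth to add expressive power.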
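$h_\theta(x) = f_2(W_2 f_1(W_1 x + b_1) + b_2)$
$f_1, f_2 : \mathbb{R} \to \mathbb{R}$, applied elementwise
Common choices of $f_i$:
$\tanh(x) = (e^{2x} - 1)/(e^{2x} + 1)$, $\sigma(x) = 1/(1 + e^{-x})$
$f(x) = \max\{0, x\}$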
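The three activation functions above translate directly into code. This is a plain transcription of the formulas for scalar inputs; the function names are assumptions, and the naive `tanh` formula overflows for large positive `x`, so `math.tanh` would be the safer choice in practice.

```python
import math


def tanh(x):
    """Hyperbolic tangent, written exactly as (e^{2x} - 1) / (e^{2x} + 1)."""
    e2x = math.exp(2 * x)
    return (e2x - 1) / (e2x + 1)


def sigmoid(x):
    """Logistic sigmoid 1 / (1 + e^{-x})."""
    return 1 / (1 + math.exp(-x))


def relu(x):
    """Rectified linear unit max{0, x}."""
    return max(0.0, x)


for v in (-2.0, 0.0, 0.5):
    print(v, tanh(v), sigmoid(v), relu(v))
```

The first two are closely related: $\tanh(x) = 2\sigma(2x) - 1$, so tanh is a rescaled and shifted sigmoid.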
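[Figure: network diagram with inputs $x_1, x_2, \ldots, x_n$, hidden units $z_1, z_2, \ldots, z_k$, and output $y$; weights $W_1, b_1$ and $W_2, b_2$]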
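Hidden units $z$, hypothesis $h_\theta$, parameters $\theta = \{W_i, b_i\}$
[Figure: five-layer network $z_1 = x \to z_2 \to z_3 \to z_4 \to z_5 = h_\theta(x)$, with weights $(W_1, b_1), (W_2, b_2), (W_3, b_3), (W_4, b_4)$ between consecutive layers]
For a $k$-layer network:
$z_{i+1} = f_i(W_i z_i + b_i), \quad i = 1, \ldots, k - 1, \quad z_1 = x$
$h_\theta(x) = z_k$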
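The layer recursion $z_{i+1} = f_i(W_i z_i + b_i)$ can be sketched as a short loop. The representation of a layer as a `(W, b, f)` tuple, the helper names, and the example weights are assumptions made for illustration.

```python
def matvec(W, z):
    """Matrix-vector product W z for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, z)) for row in W]


def forward(x, layers):
    """Return [z_1, ..., z_k] for layers given as (W_i, b_i, f_i), i = 1..k-1."""
    zs = [x]  # z_1 = x
    for W, b, f in layers:
        pre = [s + bi for s, bi in zip(matvec(W, zs[-1]), b)]  # W_i z_i + b_i
        zs.append([f(a) for a in pre])                         # f_i applied elementwise
    return zs


def relu(a):
    return max(0.0, a)


def identity(a):
    return a


# A three-layer example (k = 3): two inputs, two hidden units, one output.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0], relu),
    ([[2.0, 1.0]], [0.5], identity),
]
zs = forward([1.0, 3.0], layers)
print(zs[-1])  # h_theta(x) = z_k
```

Keeping every intermediate $z_i$, rather than only the final output, is what later makes the backward pass cheap.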
$z_i$
$(2^n)$
$O(n)$, $O(\log n)$
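$$\operatorname*{minimize}_{\theta} \; \sum_{i=1}^{m} \ell\big(h_\theta(x^{(i)}), y^{(i)}\big)$$
function SGD($\{(x^{(i)}, y^{(i)})\}$, $h_\theta$, $\ell$, $\alpha$)
  Initialize $W_j, b_j$ (e.g., randomly), $j = 1, \ldots, k - 1$
  Repeat until convergence:
    For $i = 1, \ldots, m$:
      Compute $\nabla_{W_j, b_j} \ell(h_\theta(x^{(i)}), y^{(i)})$, $j = 1, \ldots, k - 1$
      $W_j \leftarrow W_j - \alpha \nabla_{W_j} \ell(h_\theta(x^{(i)}), y^{(i)})$, $j = 1, \ldots, k - 1$
      $b_j \leftarrow b_j - \alpha \nabla_{b_j} \ell(h_\theta(x^{(i)}), y^{(i)})$, $j = 1, \ldots, k - 1$
  return $\{W_j, b_j\}$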
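The per-example update loop can be sketched on a deliberately small problem. The example below swaps the neural network for a one-dimensional linear model $h(x) = wx + b$ with squared loss, and replaces the convergence test with a fixed number of passes; both simplifications, along with all names and constants, are assumptions made to keep the sketch short.

```python
import random


def sgd(data, alpha, epochs, seed=0):
    """Stochastic gradient descent for h(x) = w*x + b with loss 0.5*(h - y)^2."""
    rng = random.Random(seed)
    w, b = rng.uniform(-0.1, 0.1), rng.uniform(-0.1, 0.1)  # small random initialization
    for _ in range(epochs):           # fixed pass count stands in for "until convergence"
        for x, y in data:             # one example at a time
            err = (w * x + b) - y     # derivative of the loss with respect to h
            w -= alpha * err * x      # gradient step for w
            b -= alpha * err          # gradient step for b
    return w, b


# Noise-free synthetic data from y = 2x + 1 (invented for illustration).
data = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w, b = sgd(data, alpha=0.1, epochs=200)
print(w, b)
```

For a multi-layer network the structure of the loop is the same; only the gradient computation inside it changes.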
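$\nabla_{W_j, b_j} \ell(h_\theta(x^{(i)}), y^{(i)})$
For $f : \mathbb{R}^n \to \mathbb{R}^m$, the Jacobian is the $m \times n$ matrix
$$\left(\frac{\partial f(x)}{\partial x}\right) \in \mathbb{R}^{m \times n} = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \frac{\partial f_1(x)}{\partial x_2} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \frac{\partial f_2(x)}{\partial x_1} & \frac{\partial f_2(x)}{\partial x_2} & \cdots & \frac{\partial f_2(x)}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_m(x)}{\partial x_1} & \frac{\partial f_m(x)}{\partial x_2} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}$$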
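For $f : \mathbb{R}^n \to \mathbb{R}$: $\left(\frac{\partial f(x)}{\partial x}\right)^T = \nabla_x f(x)$
Chain rule: $\frac{\partial f(g(x))}{\partial x} = \frac{\partial f(g(x))}{\partial g(x)} \frac{\partial g(x)}{\partial x}$
Linear function, $A \in \mathbb{R}^{m \times n}$: $\frac{\partial A x}{\partial x} = A$
Elementwise function $f$: $\frac{\partial f(x)}{\partial x} = \mathrm{diag}(f'(x))$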
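The rule that an elementwise function has a diagonal Jacobian can be checked with central finite differences. The sketch below uses tanh as the elementwise function; the helper name `numerical_jacobian`, the step size, and the test point are assumptions.

```python
import math


def numerical_jacobian(f, x, eps=1e-6):
    """Approximate the m x n Jacobian of f at x by central differences."""
    m, n = len(f(x)), len(x)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        xp, xm = list(x), list(x)
        xp[j] += eps
        xm[j] -= eps
        fp, fm = f(xp), f(xm)
        for i in range(m):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J


def f(x):
    """tanh applied elementwise."""
    return [math.tanh(v) for v in x]


def f_prime(x):
    """Elementwise derivative of tanh: 1 - tanh(v)^2."""
    return [1 - math.tanh(v) ** 2 for v in x]


x = [0.3, -1.2, 2.0]
J = numerical_jacobian(f, x)
d = f_prime(x)
D = [[d[i] if i == j else 0.0 for j in range(3)] for i in range(3)]  # diag(f'(x))
max_err = max(abs(J[i][j] - D[i][j]) for i in range(3) for j in range(3))
print(max_err)
```

Because the Jacobian is diagonal, multiplying by it reduces to an elementwise product, so it never needs to be stored as a full matrix.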
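$$\frac{\partial \ell(h_\theta(x), y)}{\partial b_1} = \frac{\partial \ell(z_k, y)}{\partial b_1} = \frac{\partial \ell(z_k, y)}{\partial z_k} \frac{\partial z_k}{\partial b_1} = \frac{\partial \ell(z_k, y)}{\partial z_k} \frac{\partial z_k}{\partial z_{k-1}} \frac{\partial z_{k-1}}{\partial z_{k-2}} \cdots \frac{\partial z_3}{\partial z_2} \frac{\partial z_2}{\partial b_1}$$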
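For each $i$:
$$\frac{\partial z_{i+1}}{\partial z_i} = \frac{\partial f_i(W_i z_i + b_i)}{\partial (W_i z_i + b_i)} \frac{\partial (W_i z_i + b_i)}{\partial z_i} = \mathrm{diag}(f_i'(W_i z_i + b_i)) \, W_i$$
$$\frac{\partial z_{i+1}}{\partial b_i} = \frac{\partial f_i(W_i z_i + b_i)}{\partial (W_i z_i + b_i)} \frac{\partial (W_i z_i + b_i)}{\partial b_i} = \mathrm{diag}(f_i'(W_i z_i + b_i))$$
$b_i$, $W_i$, $\frac{\partial z_{i+1}}{\partial z_i}$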
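$$g_i^T = \frac{\partial \ell(z_k, y)}{\partial z_k} \frac{\partial z_k}{\partial z_{k-1}} \cdots \frac{\partial z_{i+1}}{\partial z_i}$$
$g_k = \frac{\partial \ell(z_k, y)}{\partial z_k}$
$g_i = W_i^T \big(g_{i+1} \circ f_i'(W_i z_i + b_i)\big)$
where $\circ$ denotes the elementwise product
$\nabla_{b_i} \ell(h_\theta(x), y) = g_{i+1} \circ f_i'(W_i z_i + b_i)$
$\nabla_{W_i} \ell(h_\theta(x), y) = \big(g_{i+1} \circ f_i'(W_i z_i + b_i)\big) z_i^T$
$z_i$, $g_i$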
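function Backpropagation($x$, $y$, $\{W_i, b_i, f_i\}_{i=1}^{k-1}$, $\ell$)
  $z_1 \leftarrow x$
  For $i = 1, \ldots, k - 1$:
    $z_{i+1}, z'_{i+1} \leftarrow f_i(W_i z_i + b_i), \; f_i'(W_i z_i + b_i)$
  $L \leftarrow \ell(z_k, y)$
  $g_k \leftarrow \frac{\partial \ell(z_k, y)}{\partial z_k}$
  For $i = k - 1, \ldots, 1$:
    $g_i \leftarrow W_i^T (g_{i+1} \circ z'_{i+1})$
    $\nabla_{b_i} \leftarrow g_{i+1} \circ z'_{i+1}$
    $\nabla_{W_i} \leftarrow (g_{i+1} \circ z'_{i+1}) z_i^T$
  return $L$, $\{\nabla_{b_i}, \nabla_{W_i}\}_{i=1}^{k-1}$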
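The forward and backward passes can be sketched in plain Python and checked against a finite-difference estimate. The layer representation `(W, b, f, f_prime)`, the squared loss, the example weights, and all helper names are assumptions chosen for illustration; a practical implementation would use a matrix library.

```python
import math


def matvec(W, z):
    """Matrix-vector product W z for a list-of-rows matrix."""
    return [sum(w * v for w, v in zip(row, z)) for row in W]


def backprop(x, y, layers, loss, dloss):
    """layers: (W_i, b_i, f_i, f_i') for i = 1..k-1. Returns L and [(grad_b_i, grad_W_i)]."""
    zs, dzs = [x], []
    for W, b, f, fp in layers:                          # forward pass
        pre = [s + bi for s, bi in zip(matvec(W, zs[-1]), b)]
        zs.append([f(a) for a in pre])                  # z_{i+1}
        dzs.append([fp(a) for a in pre])                # z'_{i+1}
    L = loss(zs[-1], y)
    g = dloss(zs[-1], y)                                # g_k
    grads = [None] * len(layers)
    for i in reversed(range(len(layers))):              # backward pass
        W = layers[i][0]
        delta = [gi * di for gi, di in zip(g, dzs[i])]  # g_{i+1} times z'_{i+1}, elementwise
        grad_W = [[d * zj for zj in zs[i]] for d in delta]  # outer product delta z_i^T
        grads[i] = (delta, grad_W)
        g = [sum(W[r][c] * delta[r] for r in range(len(W)))  # g_i = W_i^T delta
             for c in range(len(W[0]))]
    return L, grads


def tanh_prime(a):
    return 1 - math.tanh(a) ** 2


def squared_loss(z, y):
    return 0.5 * (z[0] - y) ** 2


def squared_loss_grad(z, y):
    return [z[0] - y]


layers = [
    ([[0.5, -0.3], [0.8, 0.1]], [0.1, -0.2], math.tanh, tanh_prime),
    ([[1.0, -2.0]], [0.3], lambda a: a, lambda a: 1.0),
]
x, y = [1.0, 2.0], 0.5
L, grads = backprop(x, y, layers, squared_loss, squared_loss_grad)


def loss_with_shifted_b1(shift):
    """Loss after perturbing the first bias entry of layer 1 (used for the check)."""
    (W1, b1, f1, fp1), second = layers
    shifted = [(W1, [b1[0] + shift, b1[1]], f1, fp1), second]
    return backprop(x, y, shifted, squared_loss, squared_loss_grad)[0]


eps = 1e-6
numeric = (loss_with_shifted_b1(eps) - loss_with_shifted_b1(-eps)) / (2 * eps)
print(grads[0][0][0], numeric)
```

Comparing an analytic gradient entry against a finite-difference estimate like this is a cheap way to catch indexing mistakes in a hand-written backward pass.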
http://deeplearning.net/software/theano/
http://torch.ch/
http://www.tensorflow.org/
https://github.com/HIPS/autograd
[Figure panels: $f(x) = \max\{0, x\}$ and $f(x) = \tanh(x)$]

More Related Content

Similar to nlp dl 1.pdf

Project 2: Baseband Data Communication
Project 2: Baseband Data CommunicationProject 2: Baseband Data Communication
Project 2: Baseband Data CommunicationDanish Bangash
 
Camp IT: Making the World More Efficient Using AI & Machine Learning
Camp IT: Making the World More Efficient Using AI & Machine LearningCamp IT: Making the World More Efficient Using AI & Machine Learning
Camp IT: Making the World More Efficient Using AI & Machine LearningKrzysztof Kowalczyk
 
D3, TypeScript, and Deep Learning
D3, TypeScript, and Deep LearningD3, TypeScript, and Deep Learning
D3, TypeScript, and Deep LearningOswald Campesato
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Julien SIMON
 
Deep Learning: R with Keras and TensorFlow
Deep Learning: R with Keras and TensorFlowDeep Learning: R with Keras and TensorFlow
Deep Learning: R with Keras and TensorFlowOswald Campesato
 
Image De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural NetworkImage De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural Networkaciijournal
 
D3, TypeScript, and Deep Learning
D3, TypeScript, and Deep LearningD3, TypeScript, and Deep Learning
D3, TypeScript, and Deep LearningOswald Campesato
 
Scalable Deep Learning Using Apache MXNet
Scalable Deep Learning Using Apache MXNetScalable Deep Learning Using Apache MXNet
Scalable Deep Learning Using Apache MXNetAmazon Web Services
 
Java and Deep Learning (Introduction)
Java and Deep Learning (Introduction)Java and Deep Learning (Introduction)
Java and Deep Learning (Introduction)Oswald Campesato
 
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI ProjectsDiscovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI ProjectsWee Hyong Tok
 
Deep Learning And Business Models (VNITC 2015-09-13)
Deep Learning And Business Models (VNITC 2015-09-13)Deep Learning And Business Models (VNITC 2015-09-13)
Deep Learning And Business Models (VNITC 2015-09-13)Ha Phuong
 
TypeScript and Deep Learning
TypeScript and Deep LearningTypeScript and Deep Learning
TypeScript and Deep LearningOswald Campesato
 
Image De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural NetworkImage De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural Networkaciijournal
 

Similar to nlp dl 1.pdf (20)

Project 2: Baseband Data Communication
Project 2: Baseband Data CommunicationProject 2: Baseband Data Communication
Project 2: Baseband Data Communication
 
Camp IT: Making the World More Efficient Using AI & Machine Learning
Camp IT: Making the World More Efficient Using AI & Machine LearningCamp IT: Making the World More Efficient Using AI & Machine Learning
Camp IT: Making the World More Efficient Using AI & Machine Learning
 
D3, TypeScript, and Deep Learning
D3, TypeScript, and Deep LearningD3, TypeScript, and Deep Learning
D3, TypeScript, and Deep Learning
 
Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)Deep Dive on Deep Learning (June 2018)
Deep Dive on Deep Learning (June 2018)
 
Deep Learning: R with Keras and TensorFlow
Deep Learning: R with Keras and TensorFlowDeep Learning: R with Keras and TensorFlow
Deep Learning: R with Keras and TensorFlow
 
Image De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural NetworkImage De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural Network
 
C++ and Deep Learning
C++ and Deep LearningC++ and Deep Learning
C++ and Deep Learning
 
Sparse autoencoder
Sparse autoencoderSparse autoencoder
Sparse autoencoder
 
Java and Deep Learning
Java and Deep LearningJava and Deep Learning
Java and Deep Learning
 
Angular and Deep Learning
Angular and Deep LearningAngular and Deep Learning
Angular and Deep Learning
 
D3, TypeScript, and Deep Learning
D3, TypeScript, and Deep LearningD3, TypeScript, and Deep Learning
D3, TypeScript, and Deep Learning
 
Scala and Deep Learning
Scala and Deep LearningScala and Deep Learning
Scala and Deep Learning
 
Scalable Deep Learning Using Apache MXNet
Scalable Deep Learning Using Apache MXNetScalable Deep Learning Using Apache MXNet
Scalable Deep Learning Using Apache MXNet
 
Java and Deep Learning (Introduction)
Java and Deep Learning (Introduction)Java and Deep Learning (Introduction)
Java and Deep Learning (Introduction)
 
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI ProjectsDiscovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects
Discovering Your AI Super Powers - Tips and Tricks to Jumpstart your AI Projects
 
Log polar coordinates
Log polar coordinatesLog polar coordinates
Log polar coordinates
 
Deep Learning And Business Models (VNITC 2015-09-13)
Deep Learning And Business Models (VNITC 2015-09-13)Deep Learning And Business Models (VNITC 2015-09-13)
Deep Learning And Business Models (VNITC 2015-09-13)
 
Cnn
CnnCnn
Cnn
 
TypeScript and Deep Learning
TypeScript and Deep LearningTypeScript and Deep Learning
TypeScript and Deep Learning
 
Image De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural NetworkImage De-Noising Using Deep Neural Network
Image De-Noising Using Deep Neural Network
 

More from nyomans1

PPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
PPT-UEU-Keamanan-Informasi-Pertemuan-5.pptPPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
PPT-UEU-Keamanan-Informasi-Pertemuan-5.pptnyomans1
 
Template Pertemuan 1 All MK - Copy.pptx
Template Pertemuan 1 All MK - Copy.pptxTemplate Pertemuan 1 All MK - Copy.pptx
Template Pertemuan 1 All MK - Copy.pptxnyomans1
 
Clustering_hirarki (tanpa narasi) (1).pptx
Clustering_hirarki (tanpa narasi) (1).pptxClustering_hirarki (tanpa narasi) (1).pptx
Clustering_hirarki (tanpa narasi) (1).pptxnyomans1
 
slide 7_olap_example.ppt
slide 7_olap_example.pptslide 7_olap_example.ppt
slide 7_olap_example.pptnyomans1
 
PPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
PPT-UEU-Keamanan-Informasi-Pertemuan-5.pptPPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
PPT-UEU-Keamanan-Informasi-Pertemuan-5.pptnyomans1
 
Security Requirement.pptx
Security Requirement.pptxSecurity Requirement.pptx
Security Requirement.pptxnyomans1
 
Minggu_1_Matriks_dan_Operasinya.pptx
Minggu_1_Matriks_dan_Operasinya.pptxMinggu_1_Matriks_dan_Operasinya.pptx
Minggu_1_Matriks_dan_Operasinya.pptxnyomans1
 
Matriks suplemen.ppt
Matriks suplemen.pptMatriks suplemen.ppt
Matriks suplemen.pptnyomans1
 
fdokumen.com_muh1g3-matriks-dan-ruang-vektor-3-312017-muh1g3-matriks-dan-ruan...
fdokumen.com_muh1g3-matriks-dan-ruang-vektor-3-312017-muh1g3-matriks-dan-ruan...fdokumen.com_muh1g3-matriks-dan-ruang-vektor-3-312017-muh1g3-matriks-dan-ruan...
fdokumen.com_muh1g3-matriks-dan-ruang-vektor-3-312017-muh1g3-matriks-dan-ruan...nyomans1
 
10-Image-Enhancement-Bagian3-2021.pptx
10-Image-Enhancement-Bagian3-2021.pptx10-Image-Enhancement-Bagian3-2021.pptx
10-Image-Enhancement-Bagian3-2021.pptxnyomans1
 
08-Image-Enhancement-Bagian1.pptx
08-Image-Enhancement-Bagian1.pptx08-Image-Enhancement-Bagian1.pptx
08-Image-Enhancement-Bagian1.pptxnyomans1
 
03-Pembentukan-Citra-dan-Digitalisasi-Citra.pptx
03-Pembentukan-Citra-dan-Digitalisasi-Citra.pptx03-Pembentukan-Citra-dan-Digitalisasi-Citra.pptx
03-Pembentukan-Citra-dan-Digitalisasi-Citra.pptxnyomans1
 
04-Format-citra-dan-struktur-data-citra-2021.pptx
04-Format-citra-dan-struktur-data-citra-2021.pptx04-Format-citra-dan-struktur-data-citra-2021.pptx
04-Format-citra-dan-struktur-data-citra-2021.pptxnyomans1
 
02-Pengantar-Pengolahan-Citra-Bag2-2021.pptx
02-Pengantar-Pengolahan-Citra-Bag2-2021.pptx02-Pengantar-Pengolahan-Citra-Bag2-2021.pptx
02-Pengantar-Pengolahan-Citra-Bag2-2021.pptxnyomans1
 
03spatialfiltering-130424050639-phpapp02.pptx
03spatialfiltering-130424050639-phpapp02.pptx03spatialfiltering-130424050639-phpapp02.pptx
03spatialfiltering-130424050639-phpapp02.pptxnyomans1
 
Q-Step_WS_02102019_Practical_introduction_to_Python.pptx
Q-Step_WS_02102019_Practical_introduction_to_Python.pptxQ-Step_WS_02102019_Practical_introduction_to_Python.pptx
Q-Step_WS_02102019_Practical_introduction_to_Python.pptxnyomans1
 
BAB 2_TIPE DATA, VARIABEL, DAN OPERATOR (1) (1).pptx
BAB 2_TIPE DATA, VARIABEL, DAN OPERATOR (1) (1).pptxBAB 2_TIPE DATA, VARIABEL, DAN OPERATOR (1) (1).pptx
BAB 2_TIPE DATA, VARIABEL, DAN OPERATOR (1) (1).pptxnyomans1
 
Support-Vector-Machines_EJ_v5.06.pptx
Support-Vector-Machines_EJ_v5.06.pptxSupport-Vector-Machines_EJ_v5.06.pptx
Support-Vector-Machines_EJ_v5.06.pptxnyomans1
 
06-Image-Histogram-2021.pptx
06-Image-Histogram-2021.pptx06-Image-Histogram-2021.pptx
06-Image-Histogram-2021.pptxnyomans1
 
05-Operasi-dasar-pengolahan-citra-2021 (1).pptx
05-Operasi-dasar-pengolahan-citra-2021 (1).pptx05-Operasi-dasar-pengolahan-citra-2021 (1).pptx
05-Operasi-dasar-pengolahan-citra-2021 (1).pptxnyomans1
 

More from nyomans1 (20)

PPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
PPT-UEU-Keamanan-Informasi-Pertemuan-5.pptPPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
PPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
 
Template Pertemuan 1 All MK - Copy.pptx
Template Pertemuan 1 All MK - Copy.pptxTemplate Pertemuan 1 All MK - Copy.pptx
Template Pertemuan 1 All MK - Copy.pptx
 
Clustering_hirarki (tanpa narasi) (1).pptx
Clustering_hirarki (tanpa narasi) (1).pptxClustering_hirarki (tanpa narasi) (1).pptx
Clustering_hirarki (tanpa narasi) (1).pptx
 
slide 7_olap_example.ppt
slide 7_olap_example.pptslide 7_olap_example.ppt
slide 7_olap_example.ppt
 
PPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
PPT-UEU-Keamanan-Informasi-Pertemuan-5.pptPPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
PPT-UEU-Keamanan-Informasi-Pertemuan-5.ppt
 
Security Requirement.pptx
Security Requirement.pptxSecurity Requirement.pptx
Security Requirement.pptx
 
Minggu_1_Matriks_dan_Operasinya.pptx
Minggu_1_Matriks_dan_Operasinya.pptxMinggu_1_Matriks_dan_Operasinya.pptx
Minggu_1_Matriks_dan_Operasinya.pptx
 
Matriks suplemen.ppt
Matriks suplemen.pptMatriks suplemen.ppt
Matriks suplemen.ppt
 
fdokumen.com_muh1g3-matriks-dan-ruang-vektor-3-312017-muh1g3-matriks-dan-ruan...
fdokumen.com_muh1g3-matriks-dan-ruang-vektor-3-312017-muh1g3-matriks-dan-ruan...fdokumen.com_muh1g3-matriks-dan-ruang-vektor-3-312017-muh1g3-matriks-dan-ruan...
fdokumen.com_muh1g3-matriks-dan-ruang-vektor-3-312017-muh1g3-matriks-dan-ruan...
 
10-Image-Enhancement-Bagian3-2021.pptx
10-Image-Enhancement-Bagian3-2021.pptx10-Image-Enhancement-Bagian3-2021.pptx
10-Image-Enhancement-Bagian3-2021.pptx
 
08-Image-Enhancement-Bagian1.pptx
08-Image-Enhancement-Bagian1.pptx08-Image-Enhancement-Bagian1.pptx
08-Image-Enhancement-Bagian1.pptx
 
03-Pembentukan-Citra-dan-Digitalisasi-Citra.pptx
03-Pembentukan-Citra-dan-Digitalisasi-Citra.pptx03-Pembentukan-Citra-dan-Digitalisasi-Citra.pptx
03-Pembentukan-Citra-dan-Digitalisasi-Citra.pptx
 
04-Format-citra-dan-struktur-data-citra-2021.pptx
04-Format-citra-dan-struktur-data-citra-2021.pptx04-Format-citra-dan-struktur-data-citra-2021.pptx
04-Format-citra-dan-struktur-data-citra-2021.pptx
 
02-Pengantar-Pengolahan-Citra-Bag2-2021.pptx
02-Pengantar-Pengolahan-Citra-Bag2-2021.pptx02-Pengantar-Pengolahan-Citra-Bag2-2021.pptx
02-Pengantar-Pengolahan-Citra-Bag2-2021.pptx
 
03spatialfiltering-130424050639-phpapp02.pptx
03spatialfiltering-130424050639-phpapp02.pptx03spatialfiltering-130424050639-phpapp02.pptx
03spatialfiltering-130424050639-phpapp02.pptx
 
Q-Step_WS_02102019_Practical_introduction_to_Python.pptx
Q-Step_WS_02102019_Practical_introduction_to_Python.pptxQ-Step_WS_02102019_Practical_introduction_to_Python.pptx
Q-Step_WS_02102019_Practical_introduction_to_Python.pptx
 
BAB 2_TIPE DATA, VARIABEL, DAN OPERATOR (1) (1).pptx
BAB 2_TIPE DATA, VARIABEL, DAN OPERATOR (1) (1).pptxBAB 2_TIPE DATA, VARIABEL, DAN OPERATOR (1) (1).pptx
BAB 2_TIPE DATA, VARIABEL, DAN OPERATOR (1) (1).pptx
 
Support-Vector-Machines_EJ_v5.06.pptx
Support-Vector-Machines_EJ_v5.06.pptxSupport-Vector-Machines_EJ_v5.06.pptx
Support-Vector-Machines_EJ_v5.06.pptx
 
06-Image-Histogram-2021.pptx
06-Image-Histogram-2021.pptx06-Image-Histogram-2021.pptx
06-Image-Histogram-2021.pptx
 
05-Operasi-dasar-pengolahan-citra-2021 (1).pptx
05-Operasi-dasar-pengolahan-citra-2021 (1).pptx05-Operasi-dasar-pengolahan-citra-2021 (1).pptx
05-Operasi-dasar-pengolahan-citra-2021 (1).pptx
 

Recently uploaded

B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxStephen266013
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档208367051
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样vhwb25kk
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Colleen Farrelly
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一fhwihughh
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxBoston Institute of Analytics
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDRafezzaman
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...limedy534
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一F La
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Cantervoginip
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFAAndrei Kaleshka
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxEmmanuel Dauda
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAbdelrhman abooda
 

Recently uploaded (20)

B2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docxB2 Creative Industry Response Evaluation.docx
B2 Creative Industry Response Evaluation.docx
 
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
原版1:1定制南十字星大学毕业证(SCU毕业证)#文凭成绩单#真实留信学历认证永久存档
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
1:1定制(UQ毕业证)昆士兰大学毕业证成绩单修改留信学历认证原版一模一样
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024Generative AI for Social Good at Open Data Science East 2024
Generative AI for Social Good at Open Data Science East 2024
 
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
办理学位证纽约大学毕业证(NYU毕业证书)原版一比一
 
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptxNLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
NLP Project PPT: Flipkart Product Reviews through NLP Data Science.pptx
 
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTDINTERNSHIP ON PURBASHA COMPOSITE TEX LTD
INTERNSHIP ON PURBASHA COMPOSITE TEX LTD
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
Effects of Smartphone Addiction on the Academic Performances of Grades 9 to 1...
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
办理(UWIC毕业证书)英国卡迪夫城市大学毕业证成绩单原版一比一
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
ASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel CanterASML's Taxonomy Adventure by Daniel Canter
ASML's Taxonomy Adventure by Daniel Canter
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
How we prevented account sharing with MFA
How we prevented account sharing with MFAHow we prevented account sharing with MFA
How we prevented account sharing with MFA
 
Customer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptxCustomer Service Analytics - Make Sense of All Your Data.pptx
Customer Service Analytics - Make Sense of All Your Data.pptx
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptxAmazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
Amazon TQM (2) Amazon TQM (2)Amazon TQM (2).pptx
 

nlp dl 1.pdf

  • 1. 15-381/781 Fall 2016 Deep Learning Instructors: Ariel Procaccia and Emma Brunskill Slides courtesy of Zico Kolter
  • 2.
  • 3.
  • 4. Popularization of backprop for training neural networks Academic papers on unsupervised pre-training for deep networks “AlexNet” deep neural network wins ImageNet 2012 contest Facebook launches AI research center, Google buys DeepMind
  • 5. Figure 2: An illustration of the architecture of our CNN, explicitly showing the delineation of responsibilities between the two GPUs. One GPU runs the layer-parts at the top of the figure while the other runs the layer-parts at the bottom. The GPUs communicate only at certain layers. The network’s input is 150,528-dimensional, and the number of neurons in the network’s remaining layers is given by 253,440–186,624–64,896–64,896–43,264– 4096–4096–1000. neurons in a kernel map). The second convolutional layer takes as input the (response-normalized and pooled) output of the first convolutional layer and filters it with 256 kernels of size 5 ⇥ 5 ⇥ 48. The third, fourth, and fifth convolutional layers are connected to one another without any intervening pooling or normalization layers. The third convolutional layer has 384 kernels of size 3 ⇥ 3 ⇥ 256 connected to the (normalized, pooled) outputs of the second convolutional layer. The fourth
  • 6. Figure 4: (Left) Eight ILSVRC-2010 test images and the five la The correct label is written under each image, and the probabili with a red bar (if it happens to be in the top 5). (Right) Five ILSV
  • 7.
  • 8. A Neural Conversational Model
    Figure 1. Using the seq2seq framework for modeling conversations.
    and train to map “ABC” to “WXYZ” as shown in Figure 1 above. The hidden state of the model when it receives the end of sequence symbol “<eos>” can be viewed as the thought vector because it stores the information of the sentence, or thought, “ABC”. The strength of this model lies in its simplicity and generality. We can use this model for machine translation, question/answering, and conversations without major changes
    4.2. OpenSubtitles dataset
    We also tested our model on the OpenSubtitles dataset (Tiedemann, 2009). This dataset consists of movie conversations in XML format. It contains sentences uttered by characters in movies. We applied a simple processing step removing XML tags and obvious non-conversational text (e.g., hyperlinks) from the dataset. As turn taking is not clearly indicated, we treated consecutive sentences assuming they were uttered by different characters. We trained our model to predict the next sentence given the previous one, and we did this for every sentence (noting that this doubles our dataset size, as each sentence is used both for context and as target). Our training and validation split has 62M sentences (923M tokens) as training examples, and the validation set has 26M sentences (395M tokens). The split is done in such a way that each sentence in a pair of sentences either appears together in the training set or test set but not both. Unlike the previous dataset, OpenSubtitles is quite large, and rather noisy because consecutive sentences may be uttered by the same character. Given the broad scope of movies, this is an open-domain conversation dataset, contrasting with the technical troubleshooting dataset.
    5. Experiments
    In this section, we describe the experimental results with the two datasets and show some samples of the interactions with the system that we trained. We also compare the performance of our system against a popular rule-based bot (CleverBot) using human evaluations on a set of 200 questions.
    5.1. IT Helpdesk Troubleshooting experiments
    In this experiment, we trained a single layer LSTM with 1024 memory cells using stochastic gradient descent with gradient clipping. The vocabulary consists of the most common 20K words, which includes special tokens indicating turn taking and actor. At convergence, this model achieved a perplexity of 8, whereas an n-gram model achieved 18. Below are a few samples of simulated troubleshooting sessions for three of the most common IT problems (remote access, software
    Machine: hi
    Human: hello
    Machine: could you please let me know what are the operating systems you are using ?
    Human: i am using linux
    Machine: ok it was in the ssl vpn on which you stated it was broken
    Machine: please check if you are logged in to corp network
    Machine: what is the error that you are running please
    Human: i am seeing an error related to vpn
    Machine: what is the error message that you are getting when connecting to vpn using network connect ?
    Human: connection refused or something like that
    Machine: may i know the version of network connect you connect ?
    Human: i am not sure i know that
    Machine: is the network connect prompting for any user certificate
    Machine: ?
    Human: yes
    Machine: are you at home or at work at the moment ?
    Machine: office ?
    Human: i am at home
    Machine: try this
    Machine: goto <URL> page and enter your name and password
    Human: done
    Machine: try logging in with <NAME> and check if its working fine now
    Human: yes , now it works !
    Machine: great
    Machine: anything else that i can help
    Human: no , that was very useful
    Machine: you are other than <NAME> : )
    Machine: have a nice night
    (The <URL> indeed contains information about vpn access.)
    Conversation 2: Browser issues.
  • 9.
  • 10.
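  • 11. Inputs $x^{(i)} \in \mathbb{R}^n$; outputs $y^{(i)} \in \mathcal{Y}$ (e.g. $\mathbb{R}$, $\{-1, +1\}$, $\{1, \ldots, p\}$); parameters $\theta \in \mathbb{R}^k$; hypothesis function $h_\theta : \mathbb{R}^n \to \mathbb{R}$; loss function $\ell : \mathbb{R} \times \mathcal{Y} \to \mathbb{R}_+$. Optimization problem: $\operatorname{minimize}_\theta \; \sum_{i=1}^{m} \ell(h_\theta(x^{(i)}), y^{(i)})$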
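  • 12. $h_\theta(x^{(i)}) = \theta^T \phi(x^{(i)})$ with feature map $\phi : \mathbb{R}^n \to \mathbb{R}^k$. For a one-dimensional input $x^{(i)}$: $\phi(x^{(i)}) = \begin{bmatrix} 1 \\ x^{(i)} \\ (x^{(i)})^2 \end{bmatrix}$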
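The feature-map hypothesis on slide 12 can be sketched in a few lines of NumPy. This is only an illustration: the function names `phi` and `h`, the `degree` argument, and the parameter values are assumptions made for the example, not part of the slides.

```python
import numpy as np

def phi(x, degree=2):
    """Map a scalar input x to polynomial features [1, x, x^2, ..., x^degree]."""
    return np.array([x ** d for d in range(degree + 1)], dtype=float)

def h(theta, x):
    """Linear-in-features hypothesis h_theta(x) = theta^T phi(x)."""
    return theta @ phi(x, degree=len(theta) - 1)

# Hypothetical parameter vector with k = 3 entries.
theta = np.array([1.0, -2.0, 0.5])

features = phi(3.0)          # [1, 3, 9]
prediction = h(theta, 3.0)   # 1*1 + (-2)*3 + 0.5*9
print(features, prediction)
```

The hypothesis stays linear in $\theta$ even though it is nonlinear in $x$, which is why the features have to be chosen by hand in this setting.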
  • 13.
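  • 14. $h_\theta(x) = W_2 \phi(x) + b_2 = W_2 (W_1 x + b_1) + b_2$, with $\theta = \{W_1 \in \mathbb{R}^{k \times n},\ b_1 \in \mathbb{R}^k,\ W_2 \in \mathbb{R}^{1 \times k},\ b_2 \in \mathbb{R}\}$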
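  • 15. [Figure: network diagram with inputs $x_1, x_2, \ldots, x_n$, hidden units $z_1, z_2, \ldots, z_k$, and output $y$; the two layers are labeled $W_1, b_1$ and $W_2, b_2$] $h_\theta(x) = W_2 (W_1 x + b_1) + b_2 = \tilde{W} x + \tilde{b}$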
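The identity on slide 15 says that stacking two linear layers gives nothing beyond a single linear layer. A quick numerical check, assuming random weights and hypothetical sizes `n = 4` and `k = 3`, with $\tilde{W} = W_2 W_1$ and $\tilde{b} = W_2 b_1 + b_2$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 3  # hypothetical input and hidden sizes

W1, b1 = rng.normal(size=(k, n)), rng.normal(size=k)
W2, b2 = rng.normal(size=(1, k)), rng.normal(size=1)

# Collapse the two linear layers into one.
W_tilde = W2 @ W1          # shape (1, n)
b_tilde = W2 @ b1 + b2     # shape (1,)

x = rng.normal(size=n)
two_layer = W2 @ (W1 @ x + b1) + b2
one_layer = W_tilde @ x + b_tilde

print(np.allclose(two_layer, one_layer))
```

This is the motivation for inserting a nonlinear function between the layers.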
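  • 16. $h_\theta(x) = f_2(W_2 f_1(W_1 x + b_1) + b_2)$, where $f_1, f_2 : \mathbb{R} \to \mathbb{R}$ are applied elementwise. Common choices of $f_i$: $\tanh(x) = (e^{2x} - 1)/(e^{2x} + 1)$, $\sigma(x) = 1/(1 + e^{-x})$, $f(x) = \max\{0, x\}$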
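The three activation functions on slide 16 translate directly into NumPy. The function names below are chosen for this sketch; the formulas are the ones on the slide, and the direct exponential form of tanh is only suitable for moderate input values (large `x` would overflow `np.exp`).

```python
import numpy as np

def tanh(x):
    """Hyperbolic tangent written as on the slide: (e^{2x} - 1) / (e^{2x} + 1)."""
    return (np.exp(2 * x) - 1) / (np.exp(2 * x) + 1)

def sigmoid(x):
    """Logistic sigmoid 1 / (1 + e^{-x})."""
    return 1 / (1 + np.exp(-x))

def relu(x):
    """Rectified linear unit max{0, x}, applied elementwise."""
    return np.maximum(0, x)

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tanh(x))
print(sigmoid(x))
print(relu(x))
```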
  • 18. $h_\theta$, $\theta = \{W_i, b_i\}$
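  • 19. [Figure: network diagram with layers $z_1 = x$, $z_2$, $z_3$, $z_4$, $z_5 = h_\theta(x)$, connected by $(W_1, b_1)$, $(W_2, b_2)$, $(W_3, b_3)$, $(W_4, b_4)$] $k$-layer network: $z_{i+1} = f_i(W_i z_i + b_i)$, $i = 1, \ldots, k - 1$, $z_1 = x$; $h_\theta(x) = z_k$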
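The recursion $z_{i+1} = f_i(W_i z_i + b_i)$ on slide 19 is a short loop in code. A minimal sketch, assuming hypothetical layer widths `[3, 4, 2, 1]` (so $k = 4$ layers and three weight matrices), tanh hidden layers, and an identity output layer:

```python
import numpy as np

def forward(x, weights, biases, activations):
    """Compute h_theta(x) = z_k via z_{i+1} = f_i(W_i z_i + b_i), starting at z_1 = x."""
    z = x
    for W, b, f in zip(weights, biases, activations):
        z = f(W @ z + b)
    return z

rng = np.random.default_rng(0)
sizes = [3, 4, 2, 1]  # hypothetical widths of z_1, ..., z_4
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [rng.normal(size=m) for m in sizes[1:]]
activations = [np.tanh, np.tanh, lambda a: a]  # identity on the output layer

x = rng.normal(size=3)
out = forward(x, weights, biases, activations)
print(out)
```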
  • 21.
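  • 23. Stochastic gradient descent, with inputs $(\{(x^{(i)}, y^{(i)})\}, h_\theta, \ell, \alpha)$:
    Initialize $W_j, b_j \leftarrow$ initial values, $j = 1, \ldots, k - 1$
    For $i = 1, \ldots, m$:
        Compute $\nabla_{W_j, b_j} \ell(h_\theta(x^{(i)}), y^{(i)})$, $j = 1, \ldots, k - 1$
        $W_j \leftarrow W_j - \alpha \nabla_{W_j} \ell(h_\theta(x^{(i)}), y^{(i)})$, $j = 1, \ldots, k - 1$
        $b_j \leftarrow b_j - \alpha \nabla_{b_j} \ell(h_\theta(x^{(i)}), y^{(i)})$, $j = 1, \ldots, k - 1$
    Return $\{W_j, b_j\}$
    Required quantity: the gradients $\nabla_{W_j, b_j} \ell(h_\theta(x^{(i)}), y^{(i)})$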
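The update loop on slide 23 can be sketched independently of how the gradients are obtained. To keep the example small, the sketch below makes several assumptions that are not on the slide: a single linear layer instead of a deep network, the squared loss $\tfrac{1}{2}(Wx + b - y)^2$, noise-free synthetic data, zero initialization, and a fixed number of passes over the data. The names `sgd` and `squared_loss_grad` are hypothetical.

```python
import numpy as np

def sgd(data, W, b, grad_fn, alpha=0.1, epochs=100):
    """Plain SGD: one parameter update per training example."""
    for _ in range(epochs):
        for x, y in data:
            gW, gb = grad_fn(W, b, x, y)
            W = W - alpha * gW
            b = b - alpha * gb
    return W, b

def squared_loss_grad(W, b, x, y):
    """Gradients of 0.5 * (W x + b - y)^2 for a single linear layer (assumed loss)."""
    err = W @ x + b - y           # shape (1,)
    return np.outer(err, x), err  # gradient w.r.t. W and w.r.t. b

# Synthetic, noise-free data generated from known parameters.
rng = np.random.default_rng(0)
W_true, b_true = np.array([[2.0, -1.0]]), np.array([0.5])
X = rng.uniform(-1, 1, size=(50, 2))
data = [(x, W_true @ x + b_true) for x in X]

W, b = sgd(data, np.zeros((1, 2)), np.zeros(1), squared_loss_grad)
print(W, b)
```

For a multi-layer network the only change is that `grad_fn` must return one gradient per layer, which is what backpropagation provides.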
  • 24.
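  • 25. For $f : \mathbb{R}^n \to \mathbb{R}^m$, the Jacobian is the $m \times n$ matrix $\left( \frac{\partial f(x)}{\partial x} \right) \in \mathbb{R}^{m \times n} = \begin{bmatrix} \frac{\partial f_1(x)}{\partial x_1} & \frac{\partial f_1(x)}{\partial x_2} & \cdots & \frac{\partial f_1(x)}{\partial x_n} \\ \frac{\partial f_2(x)}{\partial x_1} & \frac{\partial f_2(x)}{\partial x_2} & \cdots & \frac{\partial f_2(x)}{\partial x_n} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{\partial f_m(x)}{\partial x_1} & \frac{\partial f_m(x)}{\partial x_2} & \cdots & \frac{\partial f_m(x)}{\partial x_n} \end{bmatrix}$. For $f : \mathbb{R}^n \to \mathbb{R}$: $\left( \frac{\partial f(x)}{\partial x} \right)^T = \nabla_x f(x)$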
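  • 26. Chain rule: $\frac{\partial f(g(x))}{\partial x} = \frac{\partial f(g(x))}{\partial g(x)} \frac{\partial g(x)}{\partial x}$. For $A \in \mathbb{R}^{m \times n}$: $\frac{\partial A x}{\partial x} = A$. For $f$ applied elementwise: $\frac{\partial f(x)}{\partial x} = \operatorname{diag}(f'(x))$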
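The three rules on slide 26 combine to give $\frac{\partial f(Ax)}{\partial x} = \operatorname{diag}(f'(Ax))\,A$ for an elementwise $f$. The sketch below checks this numerically for $f = \tanh$ (whose derivative is $1 - \tanh^2$) against a central-difference Jacobian. The helper `numerical_jacobian`, the matrix sizes, and the step size are assumptions made for this example.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Central-difference estimate of the m x n Jacobian of f at x."""
    fx = f(x)
    J = np.zeros((fx.size, x.size))
    for j in range(x.size):
        e = np.zeros_like(x)
        e[j] = eps
        J[:, j] = (f(x + e) - f(x - e)) / (2 * eps)
    return J

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 4))
x = rng.normal(size=4)

# Chain rule: d tanh(Ax) / dx = diag(tanh'(Ax)) A
analytic = np.diag(1 - np.tanh(A @ x) ** 2) @ A
numeric = numerical_jacobian(lambda v: np.tanh(A @ v), x)

print(np.max(np.abs(analytic - numeric)))
```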
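  • 27. $\frac{\partial \ell(h_\theta(x), y)}{\partial b_1} = \frac{\partial \ell(z_k, y)}{\partial b_1} = \frac{\partial \ell(z_k, y)}{\partial z_k} \frac{\partial z_k}{\partial b_1} = \frac{\partial \ell(z_k, y)}{\partial z_k} \frac{\partial z_k}{\partial z_{k-1}} \frac{\partial z_{k-1}}{\partial z_{k-2}} \cdots \frac{\partial z_3}{\partial z_2} \frac{\partial z_2}{\partial b_1}$, where for each $i$: $\frac{\partial z_{i+1}}{\partial z_i} = \frac{\partial f_i(W_i z_i + b_i)}{\partial (W_i z_i + b_i)} \frac{\partial (W_i z_i + b_i)}{\partial z_i} = \operatorname{diag}(f_i'(W_i z_i + b_i))\, W_i$ and $\frac{\partial z_{i+1}}{\partial b_i} = \frac{\partial f_i(W_i z_i + b_i)}{\partial (W_i z_i + b_i)} \frac{\partial (W_i z_i + b_i)}{\partial b_i} = \operatorname{diag}(f_i'(W_i z_i + b_i))$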
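  • 28. Gradients with respect to $b_i$ and $W_i$ reuse the products of $\frac{\partial z_{i+1}}{\partial z_i}$. Define $g_i^T = \frac{\partial \ell(z_k, y)}{\partial z_k} \frac{\partial z_k}{\partial z_{k-1}} \cdots \frac{\partial z_{i+1}}{\partial z_i}$, so that $g_k = \frac{\partial \ell(z_k, y)}{\partial z_k}$ and $g_i = W_i^T (g_{i+1} \circ f_i'(W_i z_i + b_i))$, where $\circ$ denotes elementwise multiplication. Then $\nabla_{b_i} \ell(h_\theta(x), y) = g_{i+1} \circ f_i'(W_i z_i + b_i)$ and $\nabla_{W_i} \ell(h_\theta(x), y) = (g_{i+1} \circ f_i'(W_i z_i + b_i))\, z_i^T$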
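  • 29. Backpropagation, with inputs $(x, y, \{W_i, b_i, f_i\}_{i=1}^{k-1}, \ell)$:
    $z_1 \leftarrow x$
    For $i = 1, \ldots, k - 1$: $z_{i+1},\ z'_{i+1} \leftarrow f_i(W_i z_i + b_i),\ f_i'(W_i z_i + b_i)$
    $L \leftarrow \ell(z_k, y)$
    $g_k \leftarrow \frac{\partial \ell(z_k, y)}{\partial z_k}$
    For $i = k - 1, \ldots, 1$: $g_i \leftarrow W_i^T (g_{i+1} \circ z'_{i+1})$; $\nabla_{b_i} \leftarrow g_{i+1} \circ z'_{i+1}$; $\nabla_{W_i} \leftarrow (g_{i+1} \circ z'_{i+1})\, z_i^T$
    Return $L$, $\{\nabla_{b_i}, \nabla_{W_i}\}_{i=1}^{k-1}$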
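The procedure on slide 29 maps onto NumPy almost line by line. The sketch below is one possible rendering under stated assumptions: activations and their derivatives are passed as separate lists (`fs`, `dfs`), the loss is the squared error $\tfrac{1}{2}\lVert z_k - y \rVert^2$ with gradient $z_k - y$, and the two-layer example network and its sizes are hypothetical. Lists are 0-based, so `zs[i]` holds $z_{i+1}$.

```python
import numpy as np

def backprop(x, y, Ws, bs, fs, dfs, loss, dloss):
    """Return the loss and per-layer gradients for one example (x, y)."""
    # Forward pass: store every z and the activation derivative z'.
    zs, dzs = [x], [None]
    for W, b, f, df in zip(Ws, bs, fs, dfs):
        a = W @ zs[-1] + b
        zs.append(f(a))
        dzs.append(df(a))
    L = loss(zs[-1], y)

    # Backward pass: g starts as g_k and is propagated toward the input.
    g = dloss(zs[-1], y)
    grads_W, grads_b = [None] * len(Ws), [None] * len(Ws)
    for i in reversed(range(len(Ws))):
        delta = g * dzs[i + 1]               # g_{i+1} elementwise-times z'_{i+1}
        grads_b[i] = delta
        grads_W[i] = np.outer(delta, zs[i])  # (g_{i+1} o z'_{i+1}) z_i^T
        g = Ws[i].T @ delta                  # g_i
    return L, grads_W, grads_b

# Hypothetical two-layer network: tanh hidden layer, identity output.
fs = [np.tanh, lambda a: a]
dfs = [lambda a: 1 - np.tanh(a) ** 2, lambda a: np.ones_like(a)]
loss = lambda z, y: 0.5 * np.sum((z - y) ** 2)  # assumed squared loss
dloss = lambda z, y: z - y

rng = np.random.default_rng(0)
Ws = [rng.normal(size=(3, 2)), rng.normal(size=(1, 3))]
bs = [rng.normal(size=3), rng.normal(size=1)]
x, y = rng.normal(size=2), np.array([0.3])

L, gW, gb = backprop(x, y, Ws, bs, fs, dfs, loss, dloss)
print(L, [g.shape for g in gW], [g.shape for g in gb])
```

Comparing a few gradient entries against finite differences of the loss is a cheap way to catch indexing mistakes in an implementation like this.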
  • 31.
  • 32. $f(x) = \max\{0, x\}$, $f(x) = \tanh(x)$