Representation learning

Final Report:
Construction through Deep RNN
Representation Learning
of Knowledge Bases
Korea University,
Department of Computer Science & Radio Communication Engineering
MASSIVE DATA MANAGEMENT
Professor Jaewoo Kang
2015010661
2015011155
2016010646
Minhwan Yu
Yonghwa Choi
Bumsoo Kim

Contents
01. Introduction
1-1. What do we want?
1-2. TransE
1-3. TransH
1-4. TransR
1-5. PTransE
02. PTransE
2-1. What is different?
2-2. Relation Path Representation
2-3. Recurrent Neural Network
03. Our algorithm
3-1. Objective function
3-2. Activation function
3-3. Long Short Term Memory
04. Evaluation
4-1. Results
4-2. Comparison & Analysis


1. Introduction
1-1. What do we want?
Knowledge Base (KB)
Large-scale knowledge bases such as Freebase, DBpedia, and Yago store real-world triples.
∴ KBs are very incomplete
Misrelation
1. No relation found
2. Wrong relation found

Q. How do we address misrelations?
A1. Add external sources for completion
Manual and intuition-driven
Requires too much labor
Inefficient and time-consuming
Performance drops as the KB gets larger
A2. Reference other tuples for completion
Powerful & efficient
Largely expandable
More powerful as the KB gets larger

We tried out…
TransE
TransH
TransR
PTransE
Completion of the KB by addressing misrelations through co-entity and co-relation references

1-2. TransE
TransE
Represents direct relations as differences between entity vectors
Simple & effective
relation #1: Boy, Girl
relation #1: Queen, King
reference: TransE (Bordes et al., NIPS 2013)
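The idea of a relation as a difference between entity vectors can be sketched as follows. This is a minimal illustration, not the authors' code: the toy 2-d embeddings and the helper name `transe_score` are assumptions made for the example.

```python
import numpy as np

def transe_score(h, r, t):
    """Distance ||h + r - t||_2; lower means the triple is more plausible."""
    return float(np.linalg.norm(h + r - t))

# Hypothetical toy embeddings: one relation vector is shared by both pairs.
boy, girl = np.array([1.0, 0.0]), np.array([1.0, 1.0])
queen, king = np.array([3.0, 2.0]), np.array([3.0, 3.0])
relation_1 = girl - boy  # the relation as a difference of entity vectors

good = transe_score(queen, relation_1, king)  # pair that fits the relation
bad = transe_score(queen, relation_1, boy)    # pair that does not fit
```

With these made-up vectors the fitting pair scores lower than the mismatched one, which is the ranking behaviour a translation model relies on.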

However…
TransE can't represent more than one relationship between entities, so 1-to-N and M-to-M relations are modeled poorly.
In the real world, we form many relationships with many subjects.

1-3. TransH
TransH
Represents relations between entities projected onto a relation-specific hyperplane
Able to represent M-to-M relations
Entities and relations have different characteristics
However, they are represented in the same space
reference: TransH (Wang et al., AAAI 2014)
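A rough sketch of the hyperplane projection behind TransH is given below, assuming the usual formulation in which each relation has a normal vector and a translation vector lying on the hyperplane. All vectors and the names `project_to_hyperplane` and `transh_score` are hypothetical.

```python
import numpy as np

def project_to_hyperplane(e, w):
    """Project entity vector e onto the hyperplane with normal w."""
    w = w / np.linalg.norm(w)
    return e - np.dot(w, e) * w

def transh_score(h, d_r, t, w_r):
    """Distance between projected head + translation and projected tail."""
    return float(np.linalg.norm(
        project_to_hyperplane(h, w_r) + d_r - project_to_hyperplane(t, w_r)))

w_r = np.array([0.0, 0.0, 1.0])   # hypothetical hyperplane normal
d_r = np.array([1.0, 0.0, 0.0])   # hypothetical translation on the plane
h = np.array([0.0, 0.0, 0.0])
t1 = np.array([1.0, 0.0, 2.0])    # two distinct tails that differ only
t2 = np.array([1.0, 0.0, -5.0])   # along the normal direction

s1 = transh_score(h, d_r, t1, w_r)
s2 = transh_score(h, d_r, t2, w_r)
```

Both tails get a perfect score even though their embeddings differ, which is how one head can relate to many tails under the same relation.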

1-4. TransR
TransR
Represents entities by projecting them and mapping them into the relation space 'r'
Able to represent the different characteristics of entities and relations
However, none of these were good enough for detecting and addressing misrelations!
reference: TransR (AAAI 2015)
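The mapping of entities into a relation space might look like the sketch below. It assumes one projection matrix per relation; the dimensions, the matrix values, and the name `transr_score` are illustrative only.

```python
import numpy as np

def transr_score(h, r, t, M_r):
    """Map entities into the relation space with M_r, then translate by r."""
    return float(np.linalg.norm(M_r @ h + r - M_r @ t))

entity_dim, relation_dim = 3, 2     # hypothetical sizes
M_r = np.array([[1.0, 0.0, 0.0],
                [0.0, 1.0, 0.0]])   # hypothetical projection (drops 3rd axis)
h = np.array([1.0, 2.0, 9.0])
t = np.array([2.0, 2.0, -4.0])
r = np.array([1.0, 0.0])

score = transr_score(h, r, t, M_r)
```

Entities live in a 3-d space and the relation in a 2-d space here, so the two kinds of object no longer have to share one representation.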

1-5. PTransE
PTransE
Represents relations through the composition of relations between entity vectors
Widely expandable & powerful method
Path: h --r1--> e1 --r2--> t
r = r1 ⋅ r2
r1: father, r2: father
r: grandfather
reference: EMNLP
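Two simple ways to compose relation vectors along a path are addition and element-wise multiplication, the ADD and MUL variants that appear in the evaluation. The sketch below assumes those two definitions; the 'father' embedding is made up.

```python
import numpy as np

def compose_add(relations):
    """ADD composition: sum of the relation vectors along the path."""
    return np.sum(relations, axis=0)

def compose_mul(relations):
    """MUL composition: element-wise product of the relation vectors."""
    return np.prod(relations, axis=0)

father = np.array([0.5, -1.0, 2.0])   # hypothetical 'father' embedding
path = np.stack([father, father])     # father followed by father

grandfather_add = compose_add(path)
grandfather_mul = compose_mul(path)
```

Either result can then be compared with a learned 'grandfather' vector to decide whether the path supports that relation.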


2. PTransE
2-1. What is different?
TransR
Each relation needs its own projection map: 50,000 relations need 50,000 enormous projection maps!
If a relation doesn't have enough training data, it will suffer from low performance.
Learns only the relations in the training dataset.
PTransE
Learns only a single matrix.
Learns n-step relational paths.
Learns even relations that don't have enough training data.
Learns relations that are not in the training dataset through relation paths (zero-shot KB inference).
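The size gap between per-relation projection maps and a single shared matrix can be estimated with a quick count. The embedding size of 100 is an assumption (the slides do not state one), and the shared matrix is taken to map a concatenated pair of vectors back to one vector.

```python
def transr_projection_params(num_relations, entity_dim, relation_dim):
    """One (relation_dim x entity_dim) projection matrix per relation."""
    return num_relations * relation_dim * entity_dim

def shared_composition_params(dim):
    """One shared matrix mapping a concatenated pair [c1; c2] back to dim."""
    return dim * 2 * dim

dim = 100  # hypothetical embedding size, not stated in the slides
per_relation = transr_projection_params(50_000, dim, dim)
shared = shared_composition_params(dim)
```

Under these assumptions the per-relation maps need 500,000,000 parameters against 20,000 for the shared matrix.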

2-2. Relation Path Representation
Microsoft is based in Seattle.
Which country is Microsoft located in?
Microsoft --IsBasedIn--> Seattle --CountryIn--> ???
Relation NOT FOUND! (Misrelation)

'Microsoft' is based in 'Seattle'.
'Seattle' is located in state 'Washington'.
'Washington' is located in country 'USA'.
Microsoft --IsBasedIn--> Seattle --StateIn--> Washington --CountryIn--> USA
Therefore, 'Microsoft' is located in 'USA'.
Microsoft --CountryLocatedIn--> USA
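Finding the chain of relations that links two entities can be sketched with a breadth-first search over the known triples. The triples come from the example above; the function name `find_relation_path` and the `max_steps` limit are assumptions.

```python
from collections import deque

triples = [
    ("Microsoft", "IsBasedIn", "Seattle"),
    ("Seattle", "StateIn", "Washington"),
    ("Washington", "CountryIn", "USA"),
]

def find_relation_path(triples, head, tail, max_steps=3):
    """Breadth-first search for a chain of relations linking head to tail."""
    edges = {}
    for h, r, t in triples:
        edges.setdefault(h, []).append((r, t))
    queue = deque([(head, [])])
    visited = {head}
    while queue:
        entity, path = queue.popleft()
        if entity == tail:
            return path
        if len(path) == max_steps:
            continue  # do not extend paths beyond the step limit
        for relation, nxt in edges.get(entity, []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [relation]))
    return None

path = find_relation_path(triples, "Microsoft", "USA")
```

The returned list of relations is the path whose composition is meant to stand in for the missing direct relation.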

2-3. Recurrent Neural Network
Path: Microsoft --IsBasedIn--> Seattle --StateIn--> Washington --CountryIn--> USA
Target: Microsoft --CountryLocatedIn--> USA
The RNN learns a matrix $W_r$ for constructing the composition vector.
$c_1, c_2, c_4$: relation vectors along the path
$c_3 = f(W_r [c_1; c_2])$ (intermediate composition, StateLocatedIn)
$p = f(W_r [c_3; c_4]) = \tanh(W_r [c_3; c_4])$
reference: Compositional Vector Space Models for Knowledge Base Inference
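The two composition steps above can be written as a left-to-right fold over the path. This is a sketch under assumptions: the embedding size, the random initial values, and the name `compose_path` are not from the slides.

```python
import numpy as np

def compose_path(relation_vectors, W_r):
    """Fold the path left to right: c <- tanh(W_r [c; next])."""
    composed = relation_vectors[0]
    for nxt in relation_vectors[1:]:
        composed = np.tanh(W_r @ np.concatenate([composed, nxt]))
    return composed

rng = np.random.default_rng(0)
dim = 4                                       # hypothetical embedding size
W_r = rng.normal(scale=0.5, size=(dim, 2 * dim))
c1, c2, c4 = rng.normal(size=(3, dim))        # hypothetical relation vectors

c3 = np.tanh(W_r @ np.concatenate([c1, c2]))  # first composition step
p = compose_path([c1, c2, c4], W_r)           # full path composition
```

Because the same matrix is reused at every step, paths of any length produce a vector of the same size as a single relation.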


3. Our Algorithm
3-1. Objective function
Path: Me --FatherOf--> Father --FatherOf--> Grandfather
Target: Me --GrandfatherOf--> Grandfather
Composition vector (RNN): $p = f(W_r [c_1; c_2])$
n-step relation vector: GrandfatherOf
Forward propagation: compute the loss between the composition vector and the n-step relation vector
Back propagation: propagate the loss back to update $W_r$
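One forward and backward pass for updating the matrix could look like the sketch below. The loss used on the slides is not recoverable, so a squared-error loss is assumed here purely for illustration; the vectors, learning rate, and the name `train_step` are hypothetical as well.

```python
import numpy as np

def train_step(W_r, c1, c2, target, lr=0.5):
    """One gradient step on 0.5 * ||tanh(W_r [c1; c2]) - target||^2 (assumed loss)."""
    x = np.concatenate([c1, c2])
    p = np.tanh(W_r @ x)                 # forward propagation
    error = p - target
    loss = 0.5 * float(error @ error)
    grad_pre = error * (1.0 - p ** 2)    # back-propagate through tanh
    grad_W = np.outer(grad_pre, x)       # gradient with respect to W_r
    return W_r - lr * grad_W, loss

dim = 3
W_r = np.zeros((dim, 2 * dim))
father_of = np.array([0.4, -0.2, 0.1])        # hypothetical FatherOf vector
grandfather_of = np.array([0.5, -0.3, 0.2])   # hypothetical target vector

losses = []
for _ in range(200):
    W_r, loss = train_step(W_r, father_of, father_of, grandfather_of)
    losses.append(loss)
```

After the loop the composed vector for FatherOf followed by FatherOf sits close to the target, so the recorded loss shrinks over the iterations.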

3-2. Activation function
However, a neural network layer is basically a linear operation.
$p = f(W_r [c_1; c_2])$
Applying a non-linearity after each operation lets the model learn a non-linear decision boundary.
tanh is the conventional default non-linear activation function for RNN models.

Then, what's wrong with tanh?
When gradients are back-propagated to update the matrix, they tend to converge to 0, which cuts off the gradient flow.
Therefore, no gradient flows backward, and the parameters stay unmodified.
This is the vanishing gradient problem.
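The shrinking of the gradient can be seen by multiplying the tanh derivative over several steps. The sketch below isolates the tanh factor only (the weight matrix is ignored), and the pre-activation value of 2.0 is an arbitrary assumption.

```python
import numpy as np

def tanh_grad(z):
    """Derivative of tanh at z: 1 - tanh(z)^2, never larger than 1."""
    return 1.0 - np.tanh(z) ** 2

pre_activation = 2.0   # hypothetical value, taken to be the same at every step
steps = [1, 5, 10, 20]
# Gradient factor contributed by tanh alone after back-propagating n steps.
factors = {n: tanh_grad(pre_activation) ** n for n in steps}
```

Each extra step multiplies the gradient by a number below 1, so the factor for long paths becomes vanishingly small.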

3-3. Long Short Term Memory
Addresses the vanishing gradient problem
Handles long-term dependencies
Forget gate
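A standard LSTM cell applied to the relation vectors of a path is sketched below. The slides do not say how the LSTM is wired into the model, so the gate layout, the sizes, and the names `lstm_step` and `compose_path_lstm` are all assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM step; W maps [h_prev; x] to four stacked gate pre-activations."""
    dim = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    f = sigmoid(z[0 * dim:1 * dim])   # forget gate
    i = sigmoid(z[1 * dim:2 * dim])   # input gate
    o = sigmoid(z[2 * dim:3 * dim])   # output gate
    g = np.tanh(z[3 * dim:4 * dim])   # candidate cell state
    c = f * c_prev + i * g            # additive cell-state update
    h = o * np.tanh(c)
    return h, c

def compose_path_lstm(relation_vectors, W, b):
    """Run the LSTM over the relations on a path; return the final hidden state."""
    dim = relation_vectors[0].shape[0]
    h, c = np.zeros(dim), np.zeros(dim)
    for x in relation_vectors:
        h, c = lstm_step(x, h, c, W, b)
    return h

rng = np.random.default_rng(0)
dim = 4                                   # hypothetical embedding size
W = rng.normal(scale=0.5, size=(4 * dim, 2 * dim))
b = np.zeros(4 * dim)
path = list(rng.normal(size=(3, dim)))    # hypothetical relation vectors
p = compose_path_lstm(path, W, b)
```

The cell state is updated additively and scaled by the forget gate, which gives gradients a route that does not pass through a tanh at every step.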


4. Evaluation
4-1. Results
We used the FB15K and FB40K datasets.
                    FB15K     FB40K
# Relations         1,345     1,336
# Entities         14,951    39,528
# Training set    483,142   370,648
# Validation set   50,000    67,948
# Test set         59,071    96,678

4-2. Comparisons & Analysis
PTransE, RNN composition
Model                             Hits@10 (raw)   Hits@10 (filtered)
TransE                            34.9%           47.1%
TransH                            45.7%           64.4%
TransR                            43.8%           65.5%
PTransE (RNN) (original model)    50.6%           82.2%
PTransE (LSTM) (our model)        53.1%           86.6%
Compared with the TransR baseline, our model improves Hits@10 by +9.3 points (raw) and +21.1 points (filtered).
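The difference between the raw and filtered settings can be illustrated with a small ranking example. It assumes the common convention that the filtered setting ignores candidates that are themselves known correct answers; the scores and entity names are made up.

```python
def rank_of(target, scores, exclude=()):
    """1-based rank of target when candidates are sorted by ascending score."""
    target_score = scores[target]
    better = [e for e, s in scores.items()
              if s < target_score and e not in exclude]
    return len(better) + 1

def hits_at_k(ranks, k=10):
    """Fraction of test triples whose correct entity ranks in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

# Hypothetical scores (lower is better) for candidate tails of one test triple.
scores = {f"e{i}": float(i) for i in range(20)}
target = "e12"
known_true = {"e3", "e5", "e7"}   # other correct tails already in the KB

raw_rank = rank_of(target, scores)
filtered_rank = rank_of(target, scores, exclude=known_true)
```

Here the target misses the top 10 in the raw ranking but enters it once the other correct tails are filtered out, which is why filtered numbers are higher than raw ones.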

PTransE, composition comparison
Composition                       Hits@10 (raw)   Hits@10 (filtered)
PTransE (ADD) (original model)    51.8%           83.4%
PTransE (ADD) (our model)         52.1%           84.1%
PTransE (MUL) (original model)    47.4%           77.7%
PTransE (MUL) (our model)         47.1%           77.2%
PTransE (RNN) (original model)    50.6%           82.2%
PTransE (LSTM) (our model)        53.1%           86.6%

Thank you for your attention!
