2. • Semi-supervised Learning?
• Scarcity of Training Data
• What are constraints?
• How/why do they help?
3. Supervised learning
Labelled Data: (X1 → Y1) (X2 → Y2) (X3 → Y3) …… (Xn → Yn).
What if n is small? Obtaining training data is costly, and it can be inefficient.
Example: fraud detection / anomaly detection. Domain expertise helps……
4. Definitions
• X = (X1, X2, X3, X4, …, Xn)
• Y = (Y1, Y2, Y3, Y4, …, Yn)
• H : X → Y is a classifier.
• f : X × Y → R (the set of real numbers)
• The output of the classifier is the y that maximizes the value of the function f
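In symbols, this decision rule is the standard structured argmax (the notation below is mine, not the slide's):

y^* = \arg\max_{y \in Y} f(x, y)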
5. • Classification function..
• It's a linear sum of feature functions
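Written out, a linear sum of feature functions is the familiar linear scoring model (the weights λ_i and feature functions φ_i are my notation; this is the standard form used in, e.g., CRFs and structured perceptrons):

f(x, y) = \sum_i \lambda_i \, \phi_i(x, y)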
7. Can we exploit knowledge of constraints in the Inference Phase?
• Let's assume n items (observations) in a sequence and p labels, i.e., n tokens and p parts of speech, or n tokens and p tags in an NER task.
Brute force: O(p^n). Viterbi: O(n·p^2).
Can we go down further? Can we reduce our search space even more?
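As a concrete reference point, here is a minimal Viterbi decoder; the emission and transition score matrices are assumed inputs, not part of the deck:

import numpy as np

def viterbi(emit, trans):
    # emit: (n, p) per-token label scores; trans: (p, p) transition scores.
    # Finds the highest-scoring tag sequence in O(n * p^2) time,
    # versus O(p^n) for brute-force enumeration of all sequences.
    n, p = emit.shape
    score = emit[0].copy()               # best score ending in each tag so far
    back = np.zeros((n, p), dtype=int)   # backpointers
    for t in range(1, n):
        cand = score[:, None] + trans + emit[t][None, :]   # (p, p) candidates
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]         # recover the best path via backpointers
    for t in range(n - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]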
8. Introducing constraints into the Model
• Let C1, C2, ……, CK be the constraints
• C : X × Y → {0, 1}
• Constraints are of two types:
• Hard (MUST be satisfied)
• Soft (Can be relaxed)
• 1C(x) is the set of label sequences that DON'T violate the constraints
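A hard constraint is just an indicator function, and 1C(x) is the set it carves out of the search space. A sketch, with a made-up BIO-style constraint ("no I without a preceding B or I"):

from itertools import product

def no_orphan_I(x, y):
    # C(x, y) -> {0, 1}: an 'I' tag may not follow an 'O' or start the sequence.
    prev = "O"
    for tag in y:
        if tag == "I" and prev == "O":
            return 0
        prev = tag
    return 1

def feasible_set(x, labels=("B", "I", "O")):
    # 1C(x): all label sequences over x that do NOT violate the constraint.
    return [y for y in product(labels, repeat=len(x)) if no_orphan_I(x, y)]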
9. Constraints come to the rescue
• Let's say x out of the X possible tag sequences violate the constraints.
• The search space shrinks from X to X - x.
• How do we infer?
• Does Viterbi help us?
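One simple answer (an illustrative choice, not necessarily the deck's): generate candidates in score order, e.g. with k-best Viterbi or beam search, and keep the first one that satisfies every constraint:

def constrained_decode(candidates, constraints):
    # candidates: iterable of (score, y) pairs, best first.
    # constraints: list of indicator functions C(y) -> {0, 1}.
    for score, y in candidates:
        if all(c(y) for c in constraints):
            return y, score
    return None, float("-inf")   # no feasible sequence among the candidates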
10. Example

      A    B    C    D    E    F    G
S1    X1   X1   X1   X1   X1   X1   X1
S2    X10  X10  X10  X10  X10  X10  X10
S3    X11  X11  X11  X11  X11  X11  X11

Motivational Interviewing: At least ONE reflection
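The Motivational Interviewing rule above is a hard constraint of exactly the C(x, y) → {0, 1} form; a one-line sketch (the label name REFLECTION is a placeholder):

def at_least_one_reflection(y):
    # 1 if at least one utterance in the sequence is tagged as a reflection.
    return 1 if any(tag == "REFLECTION" for tag in y) else 0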
11. Soft constraints
How do we calculate distance here?
How do we learn the parameters?
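One standard formulation (from Constrained Conditional Models; the penalty weights ρ_k and the distance d are that framework's notation, not the slide's) subtracts a weighted violation distance from the score:

y^* = \arg\max_y \Big( f(x, y) - \sum_{k=1}^{K} \rho_k \, d\big(y, 1_{C_k}(x)\big) \Big)

A common choice for d is the minimal Hamming distance from y to any sequence in 1_{C_k}(x); the ρ_k can be set by hand or estimated from data.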
12. Lars Ole Andersen. Program Analysis and Specialization for the C Programming Language. PhD Thesis, DIKU, University of Copenhagen, May 1994.

This is the ground truth segmentation into citation fields (author, title, institution, date). But the HMM segments the same string incorrectly, splitting the fields at the wrong boundaries:

Lars Ole Andersen. Program Analysis and Specialization for the C Programming Language. PhD Thesis, DIKU, University of Copenhagen, May 1994.
15. Top-k inference
We choose only the top few possible sequences and add ALL of them to the training data. The author used beam-search decoding, but this can be done with any inference procedure.
From the unlabeled sample, we label the examples and include them in the training data.
Choice: we may include only the high-confidence samples.
Pitfall: then we don't really learn properly and miss out on some characteristics.
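A sketch of this top-k self-training loop (the model.fit / model.kbest interfaces are placeholders, not a specific library):

def self_train(model, labeled, unlabeled, k=5, rounds=3):
    # Repeatedly label the unlabeled pool with the current model, keep the
    # top-k label sequences per example (ALL of them, not just the 1-best),
    # and retrain on the augmented data.
    data = list(labeled)
    for _ in range(rounds):
        model.fit(data)            # train on the current (pseudo-)labeled data
        data = list(labeled)       # restart from the gold labels...
        for x in unlabeled:
            # k-best decoding; the author used beam search, but any
            # inference procedure that yields k candidates works.
            for score, y in model.kbest(x, k):
                data.append((x, y))   # ...plus fresh top-k pseudo-labels
    return model

Keeping all k candidates, rather than only the most confident ones, is exactly the hedge against the pitfall above: the model keeps seeing the characteristics that low-confidence sequences carry.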