Guiding Semi-Supervision with Constraint-Driven Learning

Venkata Vineel Yalamarthi
u0881808

•  Semi-supervised Learning?
•  Scarcity of Training Data
•  What are constraints?
•  How/why do they help?

Supervised learning

(X1 → Y1)     Labelled Data
(X2 → Y2)
(X3 → Y3) ... (Xn → Yn)

What if n is small? Obtaining training data is costly, and it can be
inefficient.

Example: (Fraud detection / Anomaly detection)

Domain expertise helps...

Definitions

•  X = (X1, X2, X3, X4, ..., Xn)
•  Y = (Y1, Y2, Y3, Y4, ..., Yn)

•  H : X → Y is a classifier.

•  f : (X × Y) → R, the set of real numbers
•  The output of the classifier is the y that maximizes the value of
   the function f.
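
A minimal sketch of this decision rule, assuming the label set is small
enough to enumerate; the scorer `toy_f` is a made-up placeholder, not
the f from these slides:

```python
def classify(x, labels, f):
    """h(x) = argmax over y of f(x, y)."""
    return max(labels, key=lambda y: f(x, y))

toy_f = lambda x, y: -abs(len(x) - len(y))  # made-up scorer for the demo
print(classify("hello", ["hi", "howdy", "hey"], toy_f))  # -> howdy
```
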
•  Classification function:
•  It's a linear sum of feature functions, i.e.
   f(x, y) = Σ_i w_i · f_i(x, y)
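
A sketch of such a linear scorer with two illustrative indicator
features; the features and weights below are invented for this example,
and in practice the weights would be learned:

```python
def f1(x, y):
    # Fires when an utterance ending in "?" is labeled QUESTION.
    return 1.0 if x.endswith("?") and y == "QUESTION" else 0.0

def f2(x, y):
    # Fires when a very short utterance is labeled FACILITATE.
    return 1.0 if len(x.split()) <= 2 and y == "FACILITATE" else 0.0

def score(x, y, weights=(1.5, 0.7), features=(f1, f2)):
    """f(x, y) as a weighted linear sum of feature functions."""
    return sum(w * g(x, y) for w, g in zip(weights, features))

print(score("How did that feel?", "QUESTION"))  # -> 1.5
```
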
Motivational Interviewing

Labels: Support, Reflection, Confrontation, Facilitate, Question

Can we exploit knowledge of constraints in the inference phase?

•  Let's assume n items (observations) in a sequence and p labels,
   i.e., n tokens and p parts of speech, or
   n tokens and p tags in an NER task.

Brute force: O(p^n) (score each of the p^n possible label sequences)

Viterbi: O(n · p^2)

Can we go down further? Can we further reduce our search space?
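
For reference, a minimal Viterbi sketch for a first-order model,
assuming hypothetical emission and transition score tables; it
implicitly searches all p^n sequences in O(n · p^2) time:

```python
def viterbi(tokens, labels, emit, trans):
    """emit[i][y]: score of label y at position i; trans[a][b]: score of a -> b."""
    best = {y: emit[0][y] for y in labels}   # best score of a path ending in y
    back = []                                # backpointers for positions 1..n-1
    for i in range(1, len(tokens)):
        new_best, ptr = {}, {}
        for y in labels:
            prev = max(labels, key=lambda q: best[q] + trans[q][y])
            new_best[y] = best[prev] + trans[prev][y] + emit[i][y]
            ptr[y] = prev
        best, back = new_best, back + [ptr]
    y = max(labels, key=lambda q: best[q])   # best final label
    seq = [y]
    for ptr in reversed(back):               # walk backpointers to recover the path
        y = ptr[y]
        seq.append(y)
    return list(reversed(seq))

labels = ["A", "B"]
emit = [{"A": 1.0, "B": 0.0}, {"A": 0.0, "B": 2.0}]
trans = {"A": {"A": 0.0, "B": 1.0}, "B": {"A": 0.0, "B": 0.0}}
print(viterbi(["t1", "t2"], labels, emit, trans))  # -> ['A', 'B']
```
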
Introducing constraints into the model

•  Let C1, C2, ..., CK be the constraints.
•  Each C : (X × Y) → {0, 1}
•  Constraints are of two types:
•  Hard (MUST be satisfied)
•  Soft (can be relaxed)
•  1_C(x) is the set of label sequences that DON'T violate the
   constraints.
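
A small sketch of constraints as {0, 1}-valued predicates, plus a
helper that computes 1_C(x) by filtering candidate sequences; the
example constraint anticipates the Motivational Interviewing slide and
is illustrative:

```python
def at_least_one_reflection(x, y):
    """Hard constraint: 1 if the label sequence contains a REFLECTION."""
    return 1 if "REFLECTION" in y else 0

def one_c_x(x, candidates, constraints):
    """1_C(x): the candidate label sequences that violate no constraint."""
    return [y for y in candidates if all(c(x, y) == 1 for c in constraints)]

cands = [("SUPPORT", "QUESTION"), ("SUPPORT", "REFLECTION")]
print(one_c_x(None, cands, [at_least_one_reflection]))
# -> [('SUPPORT', 'REFLECTION')]
```
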
Constraints come to the rescue

•  Let's say x out of the X possible tag sequences violate the
   constraints.

•  The search space shrinks from X to X − x.
•  How do we infer?
•  Does Viterbi help us?
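
Plain Viterbi can't directly enforce a non-local constraint such as "at
least one Reflection", because such a constraint doesn't decompose over
adjacent labels. Beam search (which these slides later mention the
author used) prunes violating partial sequences instead. A minimal
sketch, with caller-supplied `score` and `violates` functions (both
hypothetical here); constraints that can only be judged on complete
sequences would be applied to the final beams:

```python
def constrained_beam_search(tokens, labels, score, violates, beam_size=10):
    """Keep the top-scoring partial label sequences, pruning violations."""
    beams = [((), 0.0)]                          # (partial labels, score)
    for i in range(len(tokens)):
        expanded = [
            (seq + (y,), s + score(tokens, i, seq, y))
            for seq, s in beams
            for y in labels
            if not violates(tokens, seq + (y,))  # drop violating prefixes
        ]
        expanded.sort(key=lambda b: b[1], reverse=True)
        beams = expanded[:beam_size]
    return beams                                 # top-k complete sequences
```
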
Example

         A       B       C       D       E       F       G

S1       X1      X1      X1      X1      X1      X1      X1

S2       X10     X10     X10     X10     X10     X10     X10

S3       X11     X11     X11     X11     X11     X11     X11

Motivational Interviewing: at least ONE Reflection

Soft constraints

How do we calculate distance here?

How do we learn the parameters?
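
One way to make the distance concrete, in the spirit of
constraint-driven learning, is to penalize the model score by a
weighted distance to constraint satisfaction:

    score(x, y) = w · f(x, y) − Σ_k ρ_k · d(y, 1_{C_k}(x))

A minimal sketch using a per-position violation count as the distance;
the constraint and the weights ρ are illustrative:

```python
def penalized_score(x, y, base_score, distances, rho):
    """Soft-constraint score: base model score minus weighted distances."""
    penalty = sum(r * d(x, y) for d, r in zip(distances, rho))
    return base_score(x, y) - penalty

# Illustrative distance: how many SUPPORT -> CONFRONTATION transitions
# occur, treating "no confrontation right after support" as a soft
# constraint with a per-violation cost.
def d_confront_after_support(x, y):
    return sum(1 for a, b in zip(y, y[1:])
               if a == "SUPPORT" and b == "CONFRONTATION")
```
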
  	
  
Lars	
  Ole	
  Andersen.	
  Program	
  Analysis	
  and	
  SpecializaNon	
  for	
  the	
  C	
  
programming	
  Language	
  .	
  PhD	
  Thesis	
  ,	
  DIKU	
  ,	
  University	
  of	
  
Copenhagen,	
  May	
  1994.	
  
This	
  is	
  Ground	
  Truth	
  .	
  	
  
	
  
	
  
But	
  HMM	
  gives	
  this.	
  	
  
Lars	
  Ole	
  Andersen.	
  Program	
  Analysis	
  and	
  SpecializaNon	
  for	
  the	
  C	
  
Programming	
  Language	
  .	
  PhD	
  Thesis	
  ,	
  DIKU	
  ,	
  University	
  of	
  
Copenhagen,	
  May	
  1994.	
  
	
  
	
  
	
  
	
  
Top-k inference

We choose only the top few possible sequences and add ALL of them to
the training data.

The author used beam search decoding, but this can be done with any
inference procedure.

From the unlabeled sample, we label the examples and include them in
the training data.

Choice: we may include only the high-confidence samples.

Pitfall: then we don't really learn properly and miss out on some
characteristics.

Algorithm:
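
The algorithm figure did not survive the text export. Below is a sketch
of the constraint-driven self-training loop the preceding slides
describe: train on the labeled seed, then repeatedly pseudo-label
unlabeled data with constrained top-K inference and retrain. All helper
functions and the smoothing weight `gamma` are caller-supplied
assumptions of this sketch, not the author's exact procedure:

```python
def codl_train(labeled, unlabeled, train, top_k_infer, interpolate,
               K=5, gamma=0.9, iterations=10):
    """Train on the seed set, then self-train with constrained top-K inference."""
    seed = train(labeled)                  # supervised model from scarce labels
    model = seed
    for _ in range(iterations):
        pseudo = []
        for x in unlabeled:
            # Keep the top-K constrained labelings of x, not just the best
            # one, so early mistakes don't dominate the pseudo-labels.
            for y in top_k_infer(model, x, K):
                pseudo.append((x, y))
        # Smooth the re-trained model toward the seed so the reliable
        # labeled data keeps its influence.
        model = interpolate(seed, train(pseudo), gamma)
    return model
```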
  
