Introduction to Machine Learning

NHM Tanveer Hossain Khan (Hasan)
  
About Me

•  I “work for fun” and mostly work with Ruby.
•  Love programming and learning.
•  Skilled in Ruby, Java, PHP, Node.js and Go.
•  Love to take on challenges.
•  I am working with Tweek.tv (one of the Berlin startups).
  
What’s inside?

•  What is machine learning?
•  Getting rid of fear
•  Where to use it?
•  Who is using it?
•  Discussion of a few machine learning algorithms
•  A few books and references
•  Q/A
  
What is Machine Learning?
  
Definition?

“Field of study that gives computers the ability to learn without being explicitly programmed”
By Arthur Samuel (collected from Wikipedia)
  
What is Machine Learning?

1.  Train the machine with examples.
2.  The algorithm stores what it learned from the training data in an internal mathematical model.
3.  Predict results for new data based on the trained model.
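
The three steps above can be sketched in a few lines of Python. This is only an illustration, assuming a straight-line model fitted by least squares; the function names `train` and `predict` and the flat-size/rent numbers are made up for the example.

```python
def train(examples):
    """Step 2: turn (x, y) examples into a model, here a line y = slope * x + intercept."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in examples)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept  # the "internal mathematical model"


def predict(model, x):
    """Step 3: use the trained model on data it has not seen."""
    slope, intercept = model
    return slope * x + intercept


# Step 1: training examples (hypothetical data: flat size in m^2 -> monthly rent)
examples = [(30, 300), (50, 500), (70, 700)]
model = train(examples)
print(predict(model, 60))  # 600.0
```

The model here is just two numbers. Real algorithms keep richer models, but the train/predict split stays the same.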
  
Getting rid of fear
  
Where to use it?

•  Automatic categorization
•  Preparing recommendations
•  Analyzing sentiment and behavior
•  Recognizing patterns
•  Grouping unrecognized patterns
•  OCR, voice recognition, image recognition
•  Discovering likelihood, and many more.
  	
  
Who is using it?

•  Facebook (image tagging, Newsfeed)
•  Gmail (spam detection, important email detection)
•  YouTube (video recommendation, What to watch)
•  Google search (preparing search results)
•  Amazon (suggesting similar products)
•  Many more…
  
Let’s introduce ML algorithms
  
ML in Action

•  Supervised learning
  – Classification
  – Regression
•  Unsupervised learning
  – Clustering
•  Recommendation
  – Content based
  – Collaborative filtering
  
Supervised Learning

•  Machines don’t have a cognitive system like humans do, so they need human-guided feature extraction!
•  Classification & Regression
  – Naïve Bayes
  – Decision Tree
    •  ID3 Algorithm
  – k-NN (k nearest neighbors)
  – SVM (Support Vector Machine)
  – Many more…
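
To make "feature extraction" concrete, here is a minimal sketch, assuming a simple bag-of-words approach: a human picks the vocabulary, and each text becomes a vector of word counts that a classifier can work with. The function name and the vocabulary are hypothetical.

```python
from collections import Counter


def extract_features(text, vocabulary):
    """Turn raw text into a fixed-length vector of word counts."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Hypothetical hand-picked vocabulary for a spam filter
vocabulary = ["free", "meeting", "offer", "report"]
print(extract_features("Free offer free gift", vocabulary))  # [2, 0, 1, 0]
```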
  
Naïve Bayes

•  Multi-class classification
•  Based on Bayes’ theorem
•  Text categorization
•  Works with small training data
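
A minimal sketch of Naïve Bayes for text categorization, assuming the multinomial variant with add-one (Laplace) smoothing, which is one common choice and not the only one. The function names and the tiny training set are made up for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(documents):
    """documents: list of (text, label). Collect the counts the classifier needs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in documents:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict_nb(model, text):
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Bayes' theorem in log space: log P(label) + sum of log P(word | label)
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # skip words never seen during training
            count = word_counts[label][word]
            score += math.log((count + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical training data: three classes, only five documents
training = [
    ("goal scored in the match", "sports"),
    ("team wins the match", "sports"),
    ("new phone released", "tech"),
    ("phone software update", "tech"),
    ("election vote result", "politics"),
]
model = train_nb(training)
print(predict_nb(model, "the team scored"))  # sports
```

The "naïve" part is the assumption that words are independent of each other given the class, which is why the per-word probabilities can simply be multiplied (added in log space).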
  
Support Vector Machine (SVM)

•  Binary classification
•  Non-probabilistic binary linear classification
•  Represents examples as points in space
•  Linear classifier
•  Text categorization
•  Uses a loss function
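
The bullets above can be illustrated with a rough sketch of a linear SVM, assuming the hinge loss with a small L2 penalty, trained by plain subgradient descent. Production libraries use more refined solvers; the learning rate, penalty, epoch count and the 2D points are arbitrary values chosen for the example.

```python
def train_linear_svm(points, labels, epochs=200, lr=0.01, reg=0.01):
    """Minimise reg/2 * |w|^2 + hinge loss. Labels must be +1 or -1."""
    w = [0.0] * len(points[0])
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(points, labels):
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:
                # Misclassified or too close to the boundary: the hinge loss is active.
                w = [wi - lr * (reg * wi - y * xi) for wi, xi in zip(w, x)]
                b += lr * y
            else:
                # Far enough on the correct side: only the penalty shrinks w.
                w = [wi - lr * reg * wi for wi in w]
    return w, b


def classify(model, x):
    """Which side of the separating line does the point fall on?"""
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else -1


# Hypothetical examples represented as points in 2D space
points = [(-2.0, -1.0), (-1.0, -2.0), (-1.5, -1.5), (2.0, 1.0), (1.0, 2.0), (1.5, 1.5)]
labels = [-1, -1, -1, 1, 1, 1]
model = train_linear_svm(points, labels)
print(classify(model, (-1.2, -1.4)), classify(model, (1.8, 1.2)))  # -1 1
```

The output is a class label only, not a probability, which is what "non-probabilistic" refers to.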
  
ID3

•  Decision tree
•  Predictive model
•  Iterative
•  Used in Information Retrieval (IR) technologies
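
A compact sketch of ID3, assuming categorical attributes and no pruning: at each step it picks the attribute with the highest information gain, splits the data on it, and repeats on each subset. The weather-style rows are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(rows, attribute, target):
    """How much splitting on `attribute` reduces the entropy of `target`."""
    base = entropy([row[target] for row in rows])
    remainder = 0.0
    for value in {row[attribute] for row in rows}:
        subset = [row[target] for row in rows if row[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


def id3(rows, attributes, target):
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:
        return labels[0]  # pure node: stop with a leaf
    if not attributes:
        return Counter(labels).most_common(1)[0][0]  # nothing left to split on
    best = max(attributes, key=lambda a: information_gain(rows, a, target))
    remaining = [a for a in attributes if a != best]
    tree = {best: {}}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        tree[best][value] = id3(subset, remaining, target)
    return tree


# Hypothetical training rows
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]
tree = id3(rows, ["outlook", "windy"], "play")
print(tree)
```

Here "outlook" gives the larger information gain, so it becomes the root; only the "rainy" branch needs a second split on "windy".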
  
Unsupervised Learning

•  Clustering
  – k-means
  – Many more…
  
k-means

•  Signal processing
•  Data mining
•  Iterative
•  Feature learning
•  Cluster analysis
•  Color quantization (reduce the number of distinct colors in an image)
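
A small sketch of k-means applied to the color quantization case, assuming pixels are RGB tuples and the first k pixels serve as starting centroids (a naive choice made only to keep the example reproducible). The pixel values are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    centroids = list(points[:k])  # naive initialisation, assumed for this sketch
    clusters = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: each centroid moves to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: nothing moved
        centroids = new_centroids
    return centroids, clusters


# Hypothetical pixels: three reddish and three bluish colors
pixels = [(250, 10, 10), (10, 10, 250), (240, 20, 20), (20, 20, 240), (255, 0, 0), (0, 0, 255)]
palette, clusters = kmeans(pixels, k=2)
print(palette)
```

Replacing every pixel by the centroid of its cluster would reduce the six colors to a two-color palette. No labels are needed, which is what makes this unsupervised.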
  
Recommendations

•  Content based
  – Natural language processing
  – Named Entity Recognition
  – Disambiguation (VW Golf or Sports Golf)
•  Collaborative Filtering
  – Using SVM, Naïve Bayes
  – Implicit or explicit feedback
  – Distance calculation & k-NN based filtering
  – User or item based
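
A minimal sketch of user-based collaborative filtering with explicit feedback, assuming cosine similarity as the distance measure and the k most similar users as neighbors. The users, titles and ratings are made up, and the weighted-average scoring is one simple option among several.

```python
import math


def cosine_similarity(a, b):
    """a, b: dicts of item -> rating. Items a user did not rate count as 0."""
    dot = sum(a[item] * b[item] for item in set(a) & set(b))
    norm_a = math.sqrt(sum(r ** 2 for r in a.values()))
    norm_b = math.sqrt(sum(r ** 2 for r in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(ratings, user, k=2):
    # Find the k users whose ratings look most like this user's.
    others = [(cosine_similarity(ratings[user], ratings[o]), o) for o in ratings if o != user]
    neighbours = sorted(others, reverse=True)[:k]
    scores, weights = {}, {}
    for similarity, other in neighbours:
        for item, rating in ratings[other].items():
            if item in ratings[user]:
                continue  # only suggest items the user has not rated yet
            scores[item] = scores.get(item, 0.0) + similarity * rating
            weights[item] = weights.get(item, 0.0) + similarity
    predicted = {item: scores[item] / weights[item] for item in scores if weights[item] > 0}
    return sorted(predicted, key=predicted.get, reverse=True)


# Hypothetical explicit feedback (ratings from 1 to 5)
ratings = {
    "alice": {"Matrix": 5, "Inception": 4},
    "bob": {"Matrix": 5, "Inception": 5, "Interstellar": 5, "Titanic": 1},
    "carol": {"Matrix": 4, "Inception": 5, "Interstellar": 4},
    "dave": {"Titanic": 5, "Notebook": 5},
}
print(recommend(ratings, "alice"))  # ['Interstellar', 'Titanic']
```

An item-based variant would compare items by the users who rated them instead of comparing users by the items they rated.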
  
A few pointers

•  http://guidetodatamining.com/
  – Very easy to learn from and programmer focused
•  Introduction to Machine Learning, Ethem Alpaydin (The MIT Press)
•  Mahout in Action
•  MLbase documentation
  
Learn by practicing

•  Apache Mahout - https://mahout.apache.org/
•  MLbase - http://www.mlbase.org/
•  Easyrec - http://www.easyrec.org
•  Weka - http://www.cs.waikato.ac.nz/ml/weka/
  
You can use these in production (without coding)

•  http://prediction.io/ - For a collaborative filtering based recommendation engine.
•  Google Prediction API - https://developers.google.com/prediction/
•  Algorithm.io - http://www.algorithms.io/ (Not sure about it)
  
	
  
That’s it, thanks all :)

Q/A
  	
  
	
  
	
  
	
  
	
  
