Ladle Patel
TCS R&D Innovation Labs
ladlepatelr@gmail.com
Mob:+91-9742123444
Machine Learning Examples
• Spam or non-spam
• Clustering
• Recommendations
• Market basket analysis
What is Machine Learning?
• Machine learning is a field of artificial intelligence, itself a sub-field of
computer science, in which we teach computers by example and ask them to
make predictions for new examples automatically.
Ex: 1) Is an email spam or not spam?
2) Product recommendation.
3) What will tomorrow's temperature be?
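To make "teach by example, then predict for a new example" concrete, here is a minimal sketch in plain Python for the temperature case. The data pairs are made up for illustration, and the least-squares line is only one assumed choice of model, not a method taken from the slides.

```python
# Made-up training examples: (today's temperature, next day's temperature) in Celsius.
examples = [(20.0, 21.0), (22.0, 23.0), (24.0, 25.0), (26.0, 27.0)]

def fit_line(pairs):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov / var
    return slope, mean_y - slope * mean_x

# "Teach" the computer from the examples.
slope, intercept = fit_line(examples)

def predict(today):
    """Predict tomorrow's temperature for a new, unseen value of today's."""
    return slope * today + intercept

print(predict(28.0))
```

The learning step only ever sees the example pairs; the prediction is made for an input (28.0) that was not among them.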
Types Of Machine Learning
Terminology
• Observations: Items or entities used for learning or evaluation.
In spam detection, the observations are emails.
• Features: Attributes used to represent an observation.
Ex: In housing price prediction, size, area, number of floors, etc.
• Labels: Values or categories assigned to observations.
In spam detection, the label marks an email as spam or not spam.
• Training and test data: Observations used to train and
evaluate a learning algorithm.
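The four terms above can be seen together in a small sketch. Everything here is an illustrative assumption: the toy emails, the two features (number of links, count of the word "free"), the 2/3 split, and the 1-nearest-neighbour rule used as the learning algorithm.

```python
# Each observation is one email; the toy data below is made up.
# Features: (number of links, count of the word "free"). Label: "spam" or "not spam".
observations = [
    ((5, 3), "spam"),
    ((0, 0), "not spam"),
    ((7, 2), "spam"),
    ((1, 0), "not spam"),
    ((6, 4), "spam"),
    ((0, 1), "not spam"),
]

# Training data: first two thirds. Test data: the remaining third.
split = len(observations) * 2 // 3
train, test = observations[:split], observations[split:]

def predict(features):
    """1-nearest-neighbour: copy the label of the closest training observation."""
    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = min(train, key=lambda obs: sq_dist(obs[0], features))
    return nearest[1]

# Evaluate on test data only, which the algorithm never saw during training.
correct = sum(predict(f) == label for f, label in test)
accuracy = correct / len(test)
print(accuracy)
```

Keeping the test observations out of training is what makes the accuracy figure an estimate of how the model behaves on new emails.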
Tools or Programming Languages
• Matlab
• Octave
• R
• SAS
• SPSS
• Python
• etc.
What is the Problem?
• Most traditional analytical tools run on a single
machine.
Example
• Spam or non-spam
TF-IDF
Sentence            i   work   on   spark   hadoop
I work on spark     1   1      1    1       0
I work on hadoop    1   1      1    0       1
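The rows of the table above are per-sentence term counts over the vocabulary. A minimal plain-Python sketch reproduces them and then applies an inverse-document-frequency weight. The formula idf = ln(N / df) is an assumption chosen for simplicity; the slide does not say which variant is meant, and libraries commonly use smoothed versions.

```python
import math

docs = ["I work on spark", "I work on hadoop"]
vocab = ["i", "work", "on", "spark", "hadoop"]  # column order taken from the table

tokenized = [d.lower().split() for d in docs]

# Term counts per sentence: these reproduce the rows of the table.
counts = [[tokens.count(term) for term in vocab] for tokens in tokenized]

# Inverse document frequency, assuming the unsmoothed form idf = ln(N / df).
n_docs = len(docs)
idf = [
    math.log(n_docs / sum(term in tokens for tokens in tokenized))
    for term in vocab
]

# TF-IDF weight = term count * idf.
tfidf = [[c * w for c, w in zip(row, idf)] for row in counts]

for row in tfidf:
    print([round(v, 3) for v in row])
```

With this weighting, words that appear in every sentence ("i", "work", "on") get weight 0, and only the distinguishing words ("spark", "hadoop") keep a non-zero weight.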
Cross-Industry Standard Process for Data Mining (CRISP-DM)
ML Use Cases
• Marketing
Ex: Customer segmentation, product mix, recommendation
• Sales
Ex: Demand forecasting
• Risk
Ex: Fraud detection
• Customer support
Ex: Call centers
ML Use Cases (cont.)
• Healthcare
Ex: Survival analysis
• Consumer finance
Ex: Credit card fraud
• Retail
Ex: Market basket analysis
• Insurance
• Manufacturing
Thanks

Apache spark with Machine learning
