Dwdm naive bayes_ankit_gadgil_027

  • Data Warehousing and Data Mining: "Naïve Bayes" Classification
    Ankit Gadgil: 11030142027, MSc(CA), SICSR, Pune
  • Contents
    1. Introduction: Classification.
    2. What is Naïve-Bayes classification.
    3. Theory.
    4. Conclusion.
    5. Advantages and Disadvantages.
  • Introduction
    Classification: In machine learning and statistics, classification is the problem of identifying to which of a set of categories a new observation belongs.
    Each observation is described by a set of quantifiable properties, known variously as explanatory variables, features, etc.
    These properties may be categorical (e.g. "A", "B", "AB" or "O", for blood type) or ordinal (e.g. "large", "medium" or "small"), among others.
  • Naïve-Bayes Classifier
    An algorithm that implements classification, especially in a concrete implementation, is known as a classifier.
    A Naïve-Bayes classifier is a simple probabilistic classifier based on applying Bayes' theorem with strong (naive) independence assumptions.
    It is named after Thomas Bayes (1702-1761), who proposed Bayes' theorem.
    In simple terms, a Naïve-Bayes classifier assumes that the presence (or absence) of a particular feature of a class is unrelated to the presence (or absence) of any other feature, given the class variable.
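Written out, the independence assumption says that the joint likelihood of the features factorizes once the class is fixed. The symbols x_1, ..., x_n (the feature values of one observation) and C (the class) are notation introduced here for illustration; the slides state the assumption only in words.

```latex
P(x_1, x_2, \dots, x_n \mid C) = \prod_{i=1}^{n} P(x_i \mid C)
\quad\Longrightarrow\quad
P(C \mid x_1, \dots, x_n) \;\propto\; P(C) \prod_{i=1}^{n} P(x_i \mid C)
```

The right-hand side is what a Naïve-Bayes classifier evaluates for each class before picking the largest value.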
  • Explanation: Naïve-Bayes Let, X : Data sample whose class label is unknown. H : Some hypothesis, such that X belongs to some class C. P(H|X) : Probability that the hypothesis holds given the observed data sample X. P(H|X) is the posterior probability, of H conditioned on X. In simple words, Data samples consists of fruits depending upon their color and shape. Suppose that , X : Red and round H : Hypothesis that X is and apple. P(H|X) reflects confidence that X is an apple having seen that X is Round and Red.
  • Explanation: Naïve-Bayes P(H) is the prior probability of H.For the data sample, this is the probability that it is an Apple.(Regardless of how the data looks.) P(X|H) is the posterior probability of X conditioned on H. P(X) is the prior probability of X.For the data sample, this is the probability that it is Red and Round. Bayes’ Theorem is useful in determining the posterior probability, P(H|X).from P(H),P(X)and P(X|H). Bayes Rule: P( X | H ) P( H ) Likelihood× Priorp( H | X )  Posterior= Evidence P( X )
  • Example
  • Learning Phase
    Outlook      Play=Yes  Play=No
    Sunny        2/9       3/5
    Overcast     4/9       0/5
    Rain         3/9       2/5
    Temperature  Play=Yes  Play=No
    Hot          2/9       2/5
    Mild         4/9       2/5
    Cool         3/9       1/5
    Humidity     Play=Yes  Play=No
    High         3/9       4/5
    Normal       6/9       1/5
    Wind         Play=Yes  Play=No
    Strong       3/9       3/5
    Weak         6/9       2/5
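The learning phase amounts to counting. The sketch below estimates the class priors and the conditional probabilities by relative frequency. The slides show only the resulting tables, so the 14 training rows here are an assumption: they are the widely used "play tennis" data set, which reproduces the fractions above. The names `DATA`, `FEATURES` and `learn` are illustrative.

```python
from collections import Counter, defaultdict
from fractions import Fraction

FEATURES = ("Outlook", "Temperature", "Humidity", "Wind")

# Assumed training data (standard play-tennis set); last column is Play.
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def learn(rows):
    """Return (priors, likelihoods) estimated by relative frequency."""
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (feature, label) -> value counts
    for *values, label in rows:
        for feature, value in zip(FEATURES, values):
            value_counts[(feature, label)][value] += 1
    priors = {c: Fraction(n, len(rows)) for c, n in class_counts.items()}
    likelihoods = {
        (feature, value, label): Fraction(count, class_counts[label])
        for (feature, label), counter in value_counts.items()
        for value, count in counter.items()
    }
    return priors, likelihoods


priors, likelihoods = learn(DATA)
print(priors["Yes"], priors["No"])
print(likelihoods[("Outlook", "Sunny", "Yes")])
```

Combinations that never occur in the data, such as Outlook=Overcast with Play=No, have no entry in `likelihoods`; looking them up with `likelihoods.get(key, Fraction(0))` gives the 0/5 shown in the table. `Fraction` keeps the estimates exact, though it prints them in lowest terms (3/9 appears as 1/3).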
  • Test Phase
    Given a new instance x’ = (Outlook=Sunny, Temperature=Cool, Humidity=High, Wind=Strong), look up the learned probabilities:
    P(Outlook=Sunny|Play=Yes) = 2/9       P(Outlook=Sunny|Play=No) = 3/5
    P(Temperature=Cool|Play=Yes) = 3/9    P(Temperature=Cool|Play=No) = 1/5
    P(Humidity=High|Play=Yes) = 3/9       P(Humidity=High|Play=No) = 4/5
    P(Wind=Strong|Play=Yes) = 3/9         P(Wind=Strong|Play=No) = 3/5
    P(Play=Yes) = 9/14                    P(Play=No) = 5/14
    Compute a score for each class. The scores are proportional to the posteriors, because the common factor P(x’) is omitted:
    P(Yes|x’) ∝ [P(Sunny|Yes) P(Cool|Yes) P(High|Yes) P(Strong|Yes)] P(Play=Yes) ≈ 0.0053
    P(No|x’) ∝ [P(Sunny|No) P(Cool|No) P(High|No) P(Strong|No)] P(Play=No) ≈ 0.0206
    Since P(Yes|x’) < P(No|x’), we label x’ as “No”.
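The test-phase arithmetic can be checked with a short sketch that multiplies the looked-up probabilities by the class prior. All numbers come from the slide; the names `PRIOR`, `LIKELIHOOD` and `scores`, and the final normalization step, are additions for illustration.

```python
from fractions import Fraction

PRIOR = {"Yes": Fraction(9, 14), "No": Fraction(5, 14)}

# Conditional probabilities for the feature values of x', per class.
LIKELIHOOD = {
    "Yes": {
        "Outlook=Sunny": Fraction(2, 9),
        "Temperature=Cool": Fraction(3, 9),
        "Humidity=High": Fraction(3, 9),
        "Wind=Strong": Fraction(3, 9),
    },
    "No": {
        "Outlook=Sunny": Fraction(3, 5),
        "Temperature=Cool": Fraction(1, 5),
        "Humidity=High": Fraction(4, 5),
        "Wind=Strong": Fraction(3, 5),
    },
}


def scores(instance):
    """Unnormalized posterior per class: prior times product of likelihoods."""
    result = {}
    for label, prior in PRIOR.items():
        score = prior
        for feature_value in instance:
            score *= LIKELIHOOD[label][feature_value]
        result[label] = score
    return result


x_new = ["Outlook=Sunny", "Temperature=Cool", "Humidity=High", "Wind=Strong"]
s = scores(x_new)
total = sum(s.values())
for label, score in s.items():
    # Raw score (as on the slide) and the normalized posterior.
    print(label, round(float(score), 4), round(float(score / total), 3))

prediction = max(s, key=s.get)
print(prediction)
```

The raw scores match the slide (about 0.0053 for Yes and 0.0206 for No). Dividing each by their sum gives posteriors of roughly 0.205 and 0.795, which does not change the decision: the predicted label is "No".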
  • Conclusion Naive Bayes is one of the simplest density estimation methods from which we can form one of the standard classification methods in machine learning. Very easy to program and intuitive. Fast to train and to use as a classifier. Very easy to deal with missing attributes. Very popular in fields such as computational linguistics/NLP. Many successful applications, e.g., spam mail filtering