This presentation introduces naive Bayesian classification. It begins with an overview of Bayes' theorem and defines a naive Bayes classifier as one that assumes conditional independence between predictor variables given the class. The document provides examples of text classification using naive Bayes and discusses its advantages of simplicity and accuracy, as well as its limitation of assuming independence. It concludes that naive Bayes is a commonly used and effective classification technique.
Naive Bayes
1. PRESENTATION ON
NAÏVE BAYESIAN CLASSIFICATION
Presented By:
Ashraf Uddin
Sujit Singh
Chetanya Pratap Singh
South Asian University
(Master of Computer Application)
http://ashrafsau.blogspot.in/
2. OUTLINE
Introduction to Bayesian Classification
Bayes Theorem
Naïve Bayes Classifier
Classification Example
Text Classification – an Application
Comparison with other classifiers
Advantages and disadvantages
Conclusions
3. CLASSIFICATION
Classification:
predicts categorical class labels
classifies data (constructs a model) based on the training set and the values (class labels) of a classifying attribute, and uses the model to classify new data
Typical Applications
credit approval
target marketing
medical diagnosis
treatment effectiveness analysis
4. A TWO-STEP PROCESS
Model construction: describing a set of predetermined classes
Each tuple/sample is assumed to belong to a predefined class, as determined by the class label attribute
The set of tuples used for model construction is the training set
The model is represented as classification rules, decision trees, or mathematical formulae
Model usage: classifying future or unknown objects
Estimate the accuracy of the model
The known label of each test sample is compared with the model's classified result
The accuracy rate is the percentage of test set samples that are correctly classified by the model
The test set must be independent of the training set; otherwise over-fitting will occur
5. INTRODUCTION TO BAYESIAN CLASSIFICATION
What is it?
A statistical method for classification.
A supervised learning method.
Assumes an underlying probabilistic model, based on the Bayes theorem.
Can solve problems involving both categorical and continuous-valued attributes.
Named after Thomas Bayes, who proposed the Bayes theorem.
7. THE BAYES THEOREM
The Bayes theorem:
P(H|X) = P(X|H) P(H) / P(X)
P(H|X): probability that the customer will buy a computer, given that we know his age, credit rating and income (posterior probability of H)
P(H): probability that the customer will buy a computer, regardless of age, credit rating and income (prior probability of H)
P(X|H): probability that the customer is 35 yrs old, has a fair credit rating and earns $40,000, given that he has bought a computer (likelihood of X given H)
P(X): probability that a person from our set of customers is 35 yrs old, has a fair credit rating and earns $40,000 (prior probability of X)
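As a quick check of the formula, the buy-a-computer scenario can be worked through with made-up numbers (the 60%, 20% and 15% below are purely illustrative, not taken from the slides):

```python
# Hypothetical numbers for illustration only: suppose 60% of customers
# buy a computer, 20% of buyers match profile X (35 yrs old, fair
# credit rating, $40,000 income), and 15% of all customers match X.
p_h = 0.60           # P(H): prior probability of buying
p_x_given_h = 0.20   # P(X|H): probability of profile X among buyers
p_x = 0.15           # P(X): probability of profile X overall

# Bayes theorem: P(H|X) = P(X|H) * P(H) / P(X)
p_h_given_x = p_x_given_h * p_h / p_x
print(p_h_given_x)  # approximately 0.8: the profile raises P(buy) from 0.60 to 0.80
```

The posterior exceeds the prior here because the profile X is more common among buyers (20%) than in the customer base as a whole (15%).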
11. “ZERO” PROBLEM
What if there is a class, Ci, and X has an attribute value, xk, such that none of the samples in Ci has that attribute value?
In that case P(xk|Ci) = 0, which results in P(X|Ci) = 0, even though the estimates for all the other attributes in X may be large.
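The effect is easy to demonstrate: in the product over attributes, a single zero factor erases everything else. A minimal sketch with made-up likelihoods:

```python
# Hypothetical per-attribute likelihoods P(xk|Ci) for one class.
# The third attribute value never occurred with class Ci in training.
likelihoods = [0.9, 0.8, 0.0, 0.7]

product = 1.0
for p in likelihoods:
    product *= p

print(product)  # 0.0 -- the single zero wipes out the other, large factors
```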
12. NUMERICAL UNDERFLOW
p(x|Y) is often a very small number: the probability of observing any particular high-dimensional vector is small.
Multiplying many such small probabilities can lead to numerical underflow.
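A common remedy, not spelled out on the slide, is to compare classes by sums of log-probabilities instead of products of probabilities; a small sketch with made-up values:

```python
import math

# 2000 features, each with a small conditional probability (made-up value).
probs = [0.01] * 2000

# Multiplying directly underflows to exactly 0.0 in double precision:
direct = 1.0
for p in probs:
    direct *= p
print(direct)  # 0.0

# Summing logs stays representable; classes are then ranked by log-score.
log_prob = sum(math.log(p) for p in probs)
print(log_prob)  # roughly -9210, i.e. 2000 * ln(0.01)
```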
14. BASIC ASSUMPTION
The Naïve Bayes assumption is that all the features are conditionally independent given the class label.
This assumption is usually false (features are usually dependent), yet the classifier often performs well in practice.
18. USES OF NAÏVE BAYES CLASSIFICATION
Text Classification
Spam Filtering
Hybrid Recommender Systems
Recommender systems apply machine learning and data mining techniques for filtering unseen information and can predict whether a user would like a given resource
Online Applications
Simple Emotion Modeling
19. TEXT CLASSIFICATION – AN APPLICATION OF THE NAIVE BAYES CLASSIFIER
20. WHY TEXT CLASSIFICATION?
Learning which articles are of interest
Classify web pages by topic
Information extraction
Internet filters
22. EXAMPLES OF TEXT CLASSIFICATION
Classify news stories as world, business, SciTech, sports, health, etc.
Classify email as spam / not spam
Classify business names by industry
Classify email about tech stuff as Mac, Windows, etc.
Classify PDF files as research / other
Classify movie reviews as favorable, unfavorable, or neutral
Classify technical papers as interesting or uninteresting
Classify jokes as funny or not funny
Classify web sites of companies by Standard Industrial Classification (SIC)
23. NAÏVE BAYES APPROACH
Build the vocabulary as the list of all distinct words that appear in the documents of the training set.
Remove stop words and markings.
The words in the vocabulary become the attributes, assuming that classification is independent of the positions of the words.
Each document in the training set becomes a record with frequencies for each word in the vocabulary.
Train the classifier on the training set by computing the prior probabilities for each class and the conditional probabilities of the attributes.
Evaluate the results on test data.
24. REPRESENTING TEXT: A LIST OF WORDS
Common refinements: remove stop words and symbols
25. TEXT CLASSIFICATION ALGORITHM: NAÏVE BAYES
Tct – number of occurrences of a particular word in a particular class
Tct’ – total number of words in a particular class
B’ – number of distinct words in all classes
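The three counts above are the ingredients of the add-one (Laplace) estimate P(t|c) = (Tct + 1) / (Tct' + B'). A minimal sketch of that estimate in Python, using a made-up toy training set:

```python
from collections import Counter

# Toy training documents (hypothetical), each with a class label.
docs = [("chinese beijing chinese", "yes"),
        ("chinese chinese shanghai", "yes"),
        ("chinese macao", "yes"),
        ("tokyo japan chinese", "no")]

word_counts = {}          # Tct: per-class word counts
total_words = Counter()   # Tct': total number of words per class
vocab = set()             # B' = len(vocab): distinct words in all classes
for text, c in docs:
    word_counts.setdefault(c, Counter())
    for w in text.split():
        word_counts[c][w] += 1
        total_words[c] += 1
        vocab.add(w)

def p_word_given_class(w, c):
    # Add-one smoothed estimate: (Tct + 1) / (Tct' + B')
    return (word_counts[c][w] + 1) / (total_words[c] + len(vocab))

print(p_word_given_class("chinese", "yes"))  # (5+1)/(8+6) = 3/7
print(p_word_given_class("tokyo", "yes"))    # (0+1)/(8+6): no zero for the unseen pair
```

Note how the unseen word "tokyo" in class "yes" still gets a small non-zero probability, which is exactly what the later "zero problem" slides call for.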
27. EXAMPLE CONT…
The probability of the Yes class is greater than that of the No class; hence this document is assigned to the Yes class.
28. Advantages and Disadvantages of Naïve Bayes
Advantages:
Easy to implement
Requires a small amount of training data to estimate the parameters
Good results obtained in most cases
Disadvantages:
Assumes class conditional independence, which causes a loss of accuracy
In practice, dependencies exist among variables
E.g., in hospitals: patient profile (age, family history, etc.), symptoms (fever, cough, etc.), disease (lung cancer, diabetes, etc.)
Dependencies among these cannot be modelled by a naïve Bayesian classifier
29. An extension of Naive Bayes for delivering robust classifications
• Naive assumption (statistical independence of the features given the class)
• NBC computes a single posterior distribution.
• However, the most probable class might depend on the chosen prior, especially on small data sets.
• Prior-dependent classifications might be weak.
Solution via a set of probabilities:
• Robust Bayes Classifier (Ramoni and Sebastiani, 2001)
• Naive Credal Classifier (Zaffalon, 2001)
30. • Violation of the independence assumption
• Zero conditional probability problem
31. VIOLATION OF INDEPENDENCE ASSUMPTION
Naive Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is called class conditional independence. It is made to simplify the computations involved and, in this sense, is considered “naive.”
32. IMPROVEMENT
Bayesian belief networks are graphical models which, unlike naïve Bayesian classifiers, allow the representation of dependencies among subsets of attributes.
Bayesian belief networks can also be used for classification.
33. ZERO CONDITIONAL PROBABILITY PROBLEM
If a given class and feature value never occur together in the training set, then the frequency-based probability estimate will be zero.
This is problematic, since it will wipe out all information in the other probabilities when they are multiplied.
It is therefore often desirable to incorporate a small-sample correction into all probability estimates so that no probability is ever set to exactly zero.
34. CORRECTION
To eliminate zeros, we use add-one or Laplace smoothing, which simply adds one to each count.
35. EXAMPLE
Suppose that for the class buys_computer = yes, a training database D contains 1000 tuples:
we have 0 tuples with income = low,
990 tuples with income = medium, and
10 tuples with income = high.
The probabilities of these events, without the Laplacian correction, are 0, 0.990 (from 990/1000), and 0.010 (from 10/1000), respectively.
Using the Laplacian correction for the three quantities, we pretend that we have 1 more tuple for each income value. In this way, we instead obtain the probabilities 1/1003 ≈ 0.001, 991/1003 ≈ 0.988, and 11/1003 ≈ 0.011, respectively. The “corrected” probability estimates are close to their “uncorrected” counterparts, yet the zero probability value is avoided.
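The arithmetic from this slide (1000 tuples; income counts of 0 / 990 / 10) can be checked directly; a short sketch:

```python
# Counts from the slide: 1000 buys_computer = yes tuples, with income
# low / medium / high observed 0 / 990 / 10 times.
counts = {"low": 0, "medium": 990, "high": 10}
n = sum(counts.values())  # 1000

uncorrected = {k: v / n for k, v in counts.items()}
# Laplacian correction: pretend one extra tuple per income value.
corrected = {k: (v + 1) / (n + len(counts)) for k, v in counts.items()}

print(uncorrected)  # {'low': 0.0, 'medium': 0.99, 'high': 0.01}
print(corrected)    # low 1/1003 ~ 0.001, medium 991/1003 ~ 0.988, high 11/1003 ~ 0.011
```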
36. Remarks on the Naive Bayesian Classifier
• Studies comparing classification algorithms have found the naive Bayesian classifier to be comparable in performance with decision tree and selected neural network classifiers.
• Bayesian classifiers have also exhibited high accuracy and speed when applied to large databases.
37. Conclusions
The naive Bayes model is tremendously appealing because of its simplicity, elegance, and robustness.
It is one of the oldest formal classification algorithms, and yet even in its simplest form it is often surprisingly effective.
It is widely used in areas such as text classification and spam filtering.
A large number of modifications have been introduced by the statistical, data mining, machine learning, and pattern recognition communities in an attempt to make it more flexible.
However, one has to recognize that such modifications are necessarily complications, which detract from its basic simplicity.