Chapter 6 (II) Alternative Classification Technologies
Instance-Based Approach
Instance-Based Method
Nearest Neighbor Classifiers [Figure: given the training records and a test record, compute the distance (similarity) and choose the k “nearest” (i.e., most similar) records]
Nearest-Neighbor Classifiers
Definition of Nearest Neighbor: the k-nearest neighbors of a record x are the data points that have the k smallest distances to x.
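The definition above can be sketched in a few lines of Python. This is a minimal illustration, not the slides' own code: it assumes numeric records, Euclidean distance, and a majority vote among the k neighbors, and the names `euclidean`, `k_nearest` and `knn_classify` are hypothetical.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length numeric records (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(train, x, k):
    """Return the k (record, label) pairs in `train` with the smallest distance to x."""
    return sorted(train, key=lambda pair: euclidean(pair[0], x))[:k]

def knn_classify(train, x, k=3):
    """Predict the most common label among the k nearest neighbors (assumed voting rule)."""
    votes = Counter(label for _, label in k_nearest(train, x, k))
    return votes.most_common(1)[0][0]

# Toy training records: (features, class label)
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"), ((0.9, 1.1), "A"),
         ((5.0, 5.0), "B"), ((5.2, 4.9), "B")]

print(knn_classify(train, (1.1, 1.0), k=3))
```

Sorting the whole training set is the simplest way to get the k smallest distances; for large data sets an index structure or a partial sort would usually be preferred.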
Key to kNN Approach
Distance-based Similarity Measure
Boolean Type (comparing Object i and Object j)
Distance-based Measure for Categorical (Nominal) Data
Distance-based Measure for Mixed Types of Data
K-Nearest Neighbor Algorithm
Measure for Other Types of Data
Similarity Measure for Textual Data
Other Similarity Measure: Cosine measure
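The slide's formula for the cosine measure is not in this transcript. As a hedged illustration, the usual definition is the dot product of two vectors divided by the product of their lengths. The function name `cosine_similarity` and the choice to return 0.0 for a zero vector are assumptions made for this sketch.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumed convention: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)

# Term-count vectors for two short documents over the same vocabulary
doc1 = [3, 0, 1, 2]
doc2 = [6, 0, 2, 4]
print(cosine_similarity(doc1, doc2))
```

Because the measure depends only on direction, a document and a copy of it repeated twice score as maximally similar, which is why it is popular for text represented as term-count vectors.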
Discussion on the k-NN Algorithm
Chapter 6 (II) Alternative Classification Technologies
Ensemble Methods
General Idea
Examples of Ensemble Approaches
Bagging
Bagging Algorithm
  Let k be the number of bootstrap samples
  For i = 1 to k do
    Create a bootstrap sample D_i of size N
    Train a (base) classifier C_i on D_i
  End for
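The loop above can be sketched in Python as follows. Everything beyond the loop itself is an assumption for illustration: the base learner `train_stump` is a hypothetical one-dimensional threshold classifier with labels +1 and -1, and the final prediction by majority vote over the k classifiers is the usual bagging rule rather than something stated on the slide.

```python
import random
from collections import Counter

def train_stump(sample):
    """Hypothetical base learner: pick the threshold/sign with fewest errors on 1-D data."""
    best = None
    for t in sorted({x for x, _ in sample}):
        for sign in (1, -1):
            errors = sum(1 for x, y in sample if (sign if x <= t else -sign) != y)
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    _, t, sign = best
    return lambda x: sign if x <= t else -sign

def bootstrap_sample(data, rng):
    """Draw N records uniformly with replacement, where N = len(data)."""
    return [rng.choice(data) for _ in data]

def bagging(data, train_base, k, seed=0):
    """Train k base classifiers, each on its own bootstrap sample D_i."""
    rng = random.Random(seed)
    return [train_base(bootstrap_sample(data, rng)) for _ in range(k)]

def predict(ensemble, x):
    """Assumed combination rule: majority vote over the base classifiers."""
    return Counter(clf(x) for clf in ensemble).most_common(1)[0][0]

# Toy 1-D data: (feature, label)
data = [(1, 1), (2, 1), (3, 1), (6, -1), (7, -1), (8, -1)]
ensemble = bagging(data, train_stump, k=11)
print(predict(ensemble, 1), predict(ensemble, 8))
```

An odd k avoids tied votes in a two-class problem.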
Boosting
Boosting
Boosting [Figure: the process of generating classifiers C_1, C_2, …, C_m from training sets D_1, D_2, …, D_m]
Boosting
AdaBoost Algorithm: The error rate of a base classifier C_i: [equation not preserved], where I(p) = 1 if p is true, and 0 otherwise. The importance of a classifier C_i: [equation not preserved]
AdaBoost Algorithm: The weight update mechanism (Equation): [equation not preserved], which involves a normalization factor and the weight for example (x_i, y_i) during a given boosting round.
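The equations on the two AdaBoost slides above were images and are not in this transcript. As a hedged reference, the standard AdaBoost definitions that fit the surviving descriptions are given here. The symbols ε, α and Z, and the use of j for the boosting round with i for the example (so the base classifier of round j is written C_j), are assumptions; the slides' exact notation and scaling may differ.

```latex
% Standard AdaBoost quantities, assuming the weights w_i^{(j)} sum to 1 over the N examples.

% Weighted error rate of the base classifier trained in round j:
\epsilon_j = \sum_{i=1}^{N} w_i^{(j)} \, I\big(C_j(x_i) \neq y_i\big)

% Importance of that classifier:
\alpha_j = \frac{1}{2} \ln \frac{1 - \epsilon_j}{\epsilon_j}

% Weight update for example (x_i, y_i):
w_i^{(j+1)} = \frac{w_i^{(j)}}{Z_j} \times
\begin{cases}
  e^{-\alpha_j} & \text{if } C_j(x_i) = y_i \\
  e^{\alpha_j}  & \text{if } C_j(x_i) \neq y_i
\end{cases}

% Z_j is the normalization factor, chosen so that
\sum_{i=1}^{N} w_i^{(j+1)} = 1
```

Correctly classified examples lose weight and misclassified ones gain weight, so later rounds concentrate on the examples that earlier classifiers got wrong.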
AdaBoost Algorithm
  Let k be the number of boosting rounds and D the set of all N examples
  Initialize the weights W for all N examples
  For i = 1 to k do
    Create training set D_i by sampling from D according to W
    Train a base classifier C_i on D_i
    Apply C_i to all examples in the original set D
    Update the weight of each example according to the weight update equation
  End for
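A minimal Python sketch of this resampling form of AdaBoost follows. Several pieces are assumptions rather than content from the slides: labels are +1 and -1, `train_stump` is a hypothetical one-dimensional base learner, the error, importance and update formulas are the standard AdaBoost ones, the weights are reset when a round's error exceeds 0.5 (a common textbook convention), and the final prediction is an importance-weighted vote.

```python
import math
import random

def train_stump(sample):
    """Hypothetical base learner: pick the threshold/sign with fewest errors on 1-D data."""
    best = None
    for t in sorted({x for x, _ in sample}):
        for sign in (1, -1):
            errors = sum(1 for x, y in sample if (sign if x <= t else -sign) != y)
            if best is None or errors < best[0]:
                best = (errors, t, sign)
    _, t, sign = best
    return lambda x: sign if x <= t else -sign

def adaboost(data, train_base, k, seed=0):
    """Run k boosting rounds; return [(importance, classifier), ...] and the final weights."""
    rng = random.Random(seed)
    n = len(data)
    weights = [1.0 / n] * n                       # initialize the weights for all N examples
    ensemble = []
    for _ in range(k):
        sample = rng.choices(data, weights=weights, k=n)   # D_i sampled according to W
        clf = train_base(sample)                            # train C_i on D_i
        missed = [clf(x) != y for x, y in data]             # apply C_i to all of D
        eps = sum(w for w, m in zip(weights, missed) if m)  # weighted error rate
        if eps > 0.5:
            weights = [1.0 / n] * n               # assumed convention: reset and try again
            continue
        eps = max(eps, 1e-10)                     # guard against division by zero
        alpha = 0.5 * math.log((1 - eps) / eps)   # importance of this classifier
        weights = [w * math.exp(alpha if m else -alpha) for w, m in zip(weights, missed)]
        z = sum(weights)                          # normalization factor
        weights = [w / z for w in weights]
        ensemble.append((alpha, clf))
    return ensemble, weights

def predict(ensemble, x):
    """Assumed combination rule: sign of the importance-weighted vote."""
    score = sum(alpha * clf(x) for alpha, clf in ensemble)
    return 1 if score >= 0 else -1

# Toy 1-D data that no single threshold separates: (feature, label)
data = [(1, 1), (2, 1), (3, -1), (4, -1), (5, 1), (6, 1)]
ensemble, weights = adaboost(data, train_stump, k=10)
print([predict(ensemble, x) for x, _ in data])
```

Sampling D_i according to W, as the slide describes, lets any unweighted base learner be used; an alternative design passes the weights directly to a learner that supports them.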
Increasing Classifier Accuracy [Figure: Data → classifiers C_1, C_2, …, C_T → Combine Votes; New data sample → Class prediction]
Chapter 6 (II) Alternative Classification Technologies
Unlabeled Data
Co-training Approach
Co-Training Approach [Figure: the feature set X = (X1, X2) is split into subset X1 and subset X2. Classification Model One is trained on subset X1 of labeled example set L and Classification Model Two on subset X2 of L. Each model classifies unlabeled data, producing new labeled data set 1 and new labeled data set 2.]
Two views
Co-training algorithm. For instance, p = 1, n = 3, k = 30, and u = 75.
Co-training: Example
Co-training: Experimental Results
Chapter 6 (II) Alternative Classification Technologies
Learning from Positive & Unlabeled Data
Positive and Unlabeled
Direct Marketing
Novel 2-step strategy
Two-Step Process
Existing 2-step strategy [Figure: Step 1 takes the positive set P and the unlabeled set U and identifies the Reliable Negative (RN) examples, leaving Q = U - RN. Step 2 either uses P, RN and Q to build the final classifier iteratively, or uses only P and RN to build a classifier.]
Step 1: The Spy technique
Step 2: Running a classification algorithm iteratively
PU-Learning
Data.Mining.C.6(II).classification and prediction

Editor's Notes

  1. The smaller the distance between two points, the more similar they are.