This document discusses classification methods in machine learning, including decision trees, Bayes classification, neural networks, k-nearest-neighbor classification, and support vector machines. It provides examples of how to build a decision tree classifier using information gain or the Gini index to select attributes. It also explains the basics of Bayes' theorem and how to apply the naive Bayes classifier to predict class membership probabilities from attribute values in the training data. The document contains sample classification problems and step-by-step computations to demonstrate these machine learning techniques.
4. Introduction
Supervised vs. Unsupervised Learning
Supervised learning (classification)
Supervision: the training data (observations, measurements, …) are accompanied by labels indicating the class of each observation
New data are classified based on the training set
Unsupervised learning (clustering)
The class labels of the training data are unknown
Given a set of measurements, observations, etc., the aim is to establish the existence of classes or clusters in the data
5. Introduction
Classification
Predicts categorical class labels
Classifies data (constructs a model) based on the training set and the values (class labels) of a classifying attribute, and uses the model to classify new data
Model:
Training: use the training data (xi, yi), where xi is the i-th object and yi its class label, to identify a classifier F(X)
Testing: the class of a new object x is predicted with the classifier F(x)
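As a minimal illustration of this two-step train/classify workflow (not part of the slides), a scikit-learn decision tree can be fit on labeled training pairs and then used to classify new data; the toy feature vectors and labels below are invented for the example.

```python
# Sketch of the two-step classification process: train on (xi, yi), then classify new data.
from sklearn.tree import DecisionTreeClassifier

# Training data: (xi, yi) pairs, xi = feature vector, yi = class label (toy values)
X_train = [[0, 0], [0, 1], [1, 0], [1, 1]]
y_train = ["No", "No", "Yes", "Yes"]

clf = DecisionTreeClassifier()      # the classifier F(X)
clf.fit(X_train, y_train)           # training step

# Testing / usage step: classify a new, unlabeled object
print(clf.predict([[1, 0]]))        # -> ['Yes']
```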
9. Introduction
K-fold cross-validation: Evaluating Classifier Accuracy
Randomly partition the data into k mutually exclusive subsets D1, …, Dk, each of approximately equal size
At the i-th iteration, use Di as the test set and the remaining subsets as the training set
Commonly k = 10
Leave-one-out: k folds where k = # of tuples, for small-sized data
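A possible sketch of 10-fold cross-validation with scikit-learn (illustrative only; the iris dataset and the decision-tree classifier stand in for any data and classifier):

```python
# 10-fold cross-validation: average accuracy over the 10 held-out folds.
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)
clf = DecisionTreeClassifier(random_state=0)

scores = cross_val_score(clf, X, y, cv=10)   # accuracy on each of the 10 folds
print(scores.mean())                         # estimated classifier accuracy
```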
10. Introduction
Confusion matrix: given m classes, the entry CMi,j of the confusion matrix is the number of tuples of (actual) class i that are labeled by the classifier as class j
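A small illustration of such a confusion matrix computed with scikit-learn (the label vectors below are made up for the example):

```python
# Rows = actual class i, columns = predicted class j
from sklearn.metrics import confusion_matrix

y_true = ["Yes", "Yes", "No", "No", "Yes", "No"]
y_pred = ["Yes", "No",  "No", "Yes", "Yes", "No"]

print(confusion_matrix(y_true, y_pred, labels=["Yes", "No"]))
# [[2 1]
#  [1 2]]
```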
14. Question
❑ Duration: 1 min.
❑ Question: Why did you decide to study this course, "Data Mining"?
15. Should we play baseball today?
foutlook : {sunny, overcast, rainy}
ftemperature : {hot, mild, cool}
fhumidity : {high, normal}
fwind : {weak, strong}
Playball = {Yes, No}
flearning : {foutlook, ftemperature, fhumidity, fwind} → Playball
Query: {sunny, mild, normal, strong}
[Decision tree figure: Outlook at the root; Sunny → Humidity (Normal: Yes, High: No); Overcast → Yes; Rainy → Wind (Weak: Yes, Strong: No)]
16. Should we play baseball today?
Conditions: {Outlook = Sunny, Temperature = Hot, Humidity = Normal, Wind = Strong}
[Decision tree figure as above: Sunny → Humidity = Normal → Yes]
The answer: Yes, today we should play baseball.
17. Decision tree
Description: a decision tree is a tree consisting of a root node, branch nodes (each representing a choice among alternatives), and leaf nodes (each representing a decision).
[Decision tree figure: Outlook is the root node; its branches lead to the branch nodes Humidity and Wind and to leaf nodes labeled Yes/No]
18. Algorithm for Decision Tree
Basic algorithm (a greedy algorithm)
The tree is constructed in a top-down, recursive, divide-and-conquer manner
At the start, all the training examples are at the root
Attributes are categorical (if continuous-valued, they are discretized in advance)
Examples are partitioned recursively based on selected attributes
Test attributes are selected on the basis of a heuristic or statistical measure (e.g., information gain, Gini index, …)
Conditions for stopping partitioning:
All samples for a given node belong to the same class
There are no remaining attributes for further partitioning – majority voting is employed for classifying the leaf
There are no samples left
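A compact sketch of this greedy, top-down procedure (my own illustration, not code from the slides; the entropy and information-gain helpers follow the definitions given on slides 21 and 23, and the rows/attrs names are assumptions):

```python
import math
from collections import Counter

def entropy(rows, target):
    """E(S): entropy of the class-label distribution of `rows` (a list of dicts)."""
    counts = Counter(r[target] for r in rows)
    total = len(rows)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def info_gain(rows, attr, target):
    """G(S, A): reduction in entropy from splitting `rows` on attribute `attr`."""
    total = len(rows)
    gain = entropy(rows, target)
    for value in set(r[attr] for r in rows):
        subset = [r for r in rows if r[attr] == value]
        gain -= (len(subset) / total) * entropy(subset, target)
    return gain

def build_tree(rows, attrs, target):
    """Top-down, greedy, recursive divide-and-conquer construction."""
    classes = [r[target] for r in rows]
    if len(set(classes)) == 1:             # all samples in one class -> leaf
        return classes[0]
    if not attrs:                          # no attributes left -> majority vote
        return Counter(classes).most_common(1)[0][0]
    best = max(attrs, key=lambda a: info_gain(rows, a, target))
    tree = {best: {}}
    for value in set(r[best] for r in rows):
        subset = [r for r in rows if r[best] == value]
        remaining = [a for a in attrs if a != best]
        tree[best][value] = build_tree(subset, remaining, target)
    return tree
```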
19. Algorithm for Decision Tree
Generate rules based on the decision tree:
IF (condition1) [AND (condition2) AND …] THEN conclusion
IF outlook = sunny AND humidity = high THEN playball = no
IF outlook = overcast THEN playball = yes
IF outlook = rainy AND wind = weak THEN playball = yes
[Decision tree figure as on slide 17]
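A small helper that walks a tree in the nested-dict form produced by the sketch after slide 18 and emits such IF-THEN rules (illustrative only):

```python
def tree_to_rules(tree, conditions=None):
    """Turn a nested-dict decision tree (as built by build_tree above) into IF-THEN rules."""
    conditions = conditions or []
    if not isinstance(tree, dict):                     # leaf node -> emit one rule
        body = " AND ".join(conditions) if conditions else "TRUE"
        return [f"IF {body} THEN playball = {tree}"]
    rules = []
    (attr, branches), = tree.items()                   # one test attribute per node
    for value, subtree in branches.items():
        rules.extend(tree_to_rules(subtree, conditions + [f"{attr} = {value}"]))
    return rules

# Usage sketch: rules = tree_to_rules(build_tree(data, attrs, "Play"))
```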
20. Group Discussion
❑ Members: 3-5 students; Duration: 10 mins.
❑ Question: Is the data below ready for the decision tree algorithm? Why? Propose your solution.
21. Entropy
Entropy: a measure of the uncertainty associated with a random variable
Entropy is used to build the tree
Calculation: the entropy of a set S is
E(S) = - Σj=1..N FS(Aj) · log2 FS(Aj)
S: sample set
N: number of different class values among the samples in S
Aj: number of samples corresponding to class value j
FS(Aj): ratio of Aj to |S|
Example: S is a 14-sample set in which 9 samples belong to class Yes and 5 samples belong to class No
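For the 14-sample set just described, the entropy evaluates as follows (a worked check added here; it matches the gain computations used later):

```latex
E(S) = -\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.940
```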
23. Information Gain
Information gain of a set of samples S based on attribute A:
G(S, A) = E(S) - Σi=1..m FS(Ai) · E(SAi)
G(S,A): information gain of set S based on attribute A
E(S): entropy of S
m: number of different values of attribute A
Ai: the i-th value of attribute A
FS(Ai): ratio of the number of samples having value Ai to |S|
SAi: subset of S including all samples having value Ai
24.
Information Gain
Day Outlook Temperature Humidity Wind Play ball
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rainy Mild High Weak Yes
D5 Rainy Cool Normal Weak Yes
D6 Rainy Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rainy Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rainy Mild High Strong No
25. Information Gain
G(S, Wind) = ?
S has 14 samples and 2 classes: 9 Yes, 5 No
Wind has 2 different values: Weak and Strong
Wind = Weak (8 samples: 6 Yes, 2 No); Wind = Strong (6 samples: 3 Yes, 3 No)
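The computation the slide is asking for, reconstructed here from the table on slide 24:

```latex
E(S_{\text{Weak}}) = -\tfrac{6}{8}\log_2\tfrac{6}{8} - \tfrac{2}{8}\log_2\tfrac{2}{8} \approx 0.811, \qquad
E(S_{\text{Strong}}) = -\tfrac{3}{6}\log_2\tfrac{3}{6} - \tfrac{3}{6}\log_2\tfrac{3}{6} = 1.0

G(S,\text{Wind}) = E(S) - \tfrac{8}{14}\,E(S_{\text{Weak}}) - \tfrac{6}{14}\,E(S_{\text{Strong}})
\approx 0.940 - 0.463 - 0.429 = 0.048
```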
28. Decision Tree
Outlook has 3 different values: sunny, overcast, and rainy → the root has 3 branches
Which attribute should be chosen at the Sunny branch? (Humidity, Temperature, or Wind)
➢ Ssunny = {D1, D2, D8, D9, D11}, i.e., the 5 samples with Outlook = sunny
➢ Gain(Ssunny, Humidity) = 0.970
➢ Gain(Ssunny, Temperature) = 0.570
➢ Gain(Ssunny, Wind) = 0.019
➢ Select Humidity
Keep going until all samples are classified or there are no remaining attributes for further partitioning
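These Sunny-branch numbers can be reproduced with the info_gain helper sketched after slide 18; the data literal below is just slide 24's table re-typed (names and structure are my own choices):

```python
# Reproducing the Sunny-branch gains. Assumes entropy() and info_gain() from the
# sketch after slide 18 are defined.
data = [
    {"Outlook": "Sunny",    "Temperature": "Hot",  "Humidity": "High",   "Wind": "Weak",   "Play": "No"},
    {"Outlook": "Sunny",    "Temperature": "Hot",  "Humidity": "High",   "Wind": "Strong", "Play": "No"},
    {"Outlook": "Overcast", "Temperature": "Hot",  "Humidity": "High",   "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Mild", "Humidity": "High",   "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Cool", "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Cool", "Humidity": "Normal", "Wind": "Strong", "Play": "No"},
    {"Outlook": "Overcast", "Temperature": "Cool", "Humidity": "Normal", "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Sunny",    "Temperature": "Mild", "Humidity": "High",   "Wind": "Weak",   "Play": "No"},
    {"Outlook": "Sunny",    "Temperature": "Cool", "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Mild", "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Sunny",    "Temperature": "Mild", "Humidity": "Normal", "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Overcast", "Temperature": "Mild", "Humidity": "High",   "Wind": "Strong", "Play": "Yes"},
    {"Outlook": "Overcast", "Temperature": "Hot",  "Humidity": "Normal", "Wind": "Weak",   "Play": "Yes"},
    {"Outlook": "Rainy",    "Temperature": "Mild", "Humidity": "High",   "Wind": "Strong", "Play": "No"},
]

s_sunny = [r for r in data if r["Outlook"] == "Sunny"]
for attr in ("Humidity", "Temperature", "Wind"):
    print(attr, round(info_gain(s_sunny, attr, "Play"), 3))
# ≈ Humidity 0.97, Temperature 0.57, Wind 0.02  -> Humidity is selected
```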
29.
Information Gain
Day Outlook Temperature Humidity Wind Play ball
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rainy Mild High Weak Yes
D5 Rainy Cool Normal Weak Yes
D6 Rainy Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rainy Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rainy Mild High Strong No
30. Gini index
Gini index of a data set D:
Gini(D) = 1 - Σj pj(D)²
with pj(D): the relative frequency of class j in D
Example: with the data set above (14 samples: 9 Yes, 5 No):
Gini(D) = 1 - (9/14)² - (5/14)² = 0.459
31. Gini index
If a data set D is split on attribute A into k subsets D1, D2, …, Dk, the Gini index GiniA(D) is defined as:
GiniA(D) = Σi=1..k (ni / n) · Gini(Di)
with:
➢ ni: number of samples in subset Di
➢ n: total number of samples in D
Select the attribute with the minimal Gini index for partitioning
32.
Gini index
Day Outlook Temperature Humidity Wind Play ball
D1 Sunny Hot High Weak No
D2 Sunny Hot High Strong No
D3 Overcast Hot High Weak Yes
D4 Rainy Mild High Weak Yes
D5 Rainy Cool Normal Weak Yes
D6 Rainy Cool Normal Strong No
D7 Overcast Cool Normal Strong Yes
D8 Sunny Mild High Weak No
D9 Sunny Cool Normal Weak Yes
D10 Rainy Mild Normal Weak Yes
D11 Sunny Mild Normal Strong Yes
D12 Overcast Mild High Strong Yes
D13 Overcast Hot Normal Weak Yes
D14 Rainy Mild High Strong No
35. Gini index
Gini(D) = 1 - (9/14)² - (5/14)² = 0.459
1. GiniOutlook(D) = 0.343
2. GiniTemperature(D) = 0.440
3. GiniHumidity(D) = 0.367
4. GiniWind(D) = 0.428
→ Outlook is selected as the root (its Gini index is the minimal value)
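A small verification sketch for these values (my own code; it reuses the data list typed out after slide 28):

```python
from collections import Counter

def gini(rows, target):
    """Gini(D) = 1 - sum of squared class frequencies."""
    total = len(rows)
    counts = Counter(r[target] for r in rows)
    return 1 - sum((c / total) ** 2 for c in counts.values())

def gini_split(rows, attr, target):
    """Weighted Gini index of the subsets obtained by splitting `rows` on `attr`."""
    total = len(rows)
    value = 0.0
    for v in set(r[attr] for r in rows):
        subset = [r for r in rows if r[attr] == v]
        value += (len(subset) / total) * gini(subset, target)
    return value

for attr in ("Outlook", "Temperature", "Humidity", "Wind"):
    print(attr, round(gini_split(data, attr, "Play"), 3))
# ≈ Outlook 0.343, Temperature 0.440, Humidity 0.367, Wind 0.429 -> Outlook has the smallest Gini
```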
40. Bayes classification
Introduction
A statistical classifier: performs probabilistic prediction, i.e., predicts class membership probabilities
Foundation: based on Bayes' theorem (1763)
Incremental: each training example can incrementally increase/decrease the probability that a hypothesis is correct — prior knowledge can be combined with observed data
41. Bayes' Theorem: Basics
Let X be a data sample ("evidence"): the class label is unknown
Let H be the hypothesis that X belongs to class C
Classification is to determine P(H|X) (the posterior probability): the probability that the hypothesis holds given the observed data sample X
P(H) (prior probability): the initial probability, e.g., that X will play baseball regardless of humidity, wind, outlook, …
P(X) (prior probability of the evidence): the probability that the sample data is observed
P(H|X) = P(X|H) · P(H) / P(X)
42. Bayes' Theorem: Basics
P(X|H) (likelihood): the probability of observing the sample X, given that the hypothesis holds
Informally, this can be viewed as: posterior = likelihood × prior / evidence
P(H|X) = P(X|H) · P(H) / P(X)
Predict that X belongs to class Ci iff the probability P(Ci|X) is the highest among all the P(Ck|X) for the k classes
43. Bayes' Theorem: Basics
Naïve Bayes classifier: the attributes are conditionally independent (i.e., no dependence relation among attributes)
P(X|H) with X = (x1, x2, …, xk):
P(x1, …, xk|H) = P(x1|H) · … · P(xk|H)
P(H|X) = P(X|H) · P(H) / P(X)
44. Bayes’ Classifier – Example
Outlook Temperature Humidity Wind Play ball
Sunny Hot High Weak No
Sunny Hot High Strong No
Overcast Hot High Weak Yes
Rainy Mild High Weak Yes
Rainy Cool Normal Weak Yes
Rainy Cool Normal Strong No
Overcast Cool Normal Strong Yes
Sunny Mild High Weak No
Sunny Cool Normal Weak Yes
Rainy Mild Normal Weak Yes
Sunny Mild Normal Strong Yes
Overcast Mild High Strong Yes
Overcast Hot Normal Weak Yes
Rainy Mild High Strong No
45. Bayes' Classifier – Example
Let X = (Outlook = Rainy, Temp = Cool, Humidity = Normal, Wind = Weak) → does X belong to class Yes or No?
Compute → Predict, using P(H|X) = P(X|H) · P(H) / P(X):
1. P(Play=Yes) · P(X|Play=Yes) = P(Play=Yes) · P(Outlook=Rainy|Play=Yes) · P(Temp=Cool|Play=Yes) · P(Humidity=Normal|Play=Yes) · P(Wind=Weak|Play=Yes)
2. P(Play=No) · P(X|Play=No) = P(Play=No) · P(Outlook=Rainy|Play=No) · P(Temp=Cool|Play=No) · P(Humidity=Normal|Play=No) · P(Wind=Weak|Play=No)
46. Bayes' Classifier – Example
Let X = (Outlook = Rainy, Temp = Cool, Humidity = Normal, Wind = Weak) → does X belong to class Yes or No?
Compute:
✓ P(Play=Yes) = 9/14; P(Play=No) = 5/14
✓ P(Outlook=Rainy|Play=Yes) = 3/9
✓ P(Outlook=Rainy|Play=No) = 2/5
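The slide stops here; the sketch below carries the computation through on the table from slide 44 (it assumes the data list from the decision-tree example above, minus the Day column). The remaining conditional probabilities and the final comparison are derived from the data, not copied from the slides.

```python
# Naive Bayes score for one class: P(class) * product of P(attribute value | class).
def nb_score(x, cls, rows, target="Play"):
    cls_rows = [r for r in rows if r[target] == cls]
    score = len(cls_rows) / len(rows)                  # prior P(class)
    for attr, value in x.items():
        match = sum(1 for r in cls_rows if r[attr] == value)
        score *= match / len(cls_rows)                 # likelihood P(value | class)
    return score

x = {"Outlook": "Rainy", "Temperature": "Cool", "Humidity": "Normal", "Wind": "Weak"}
for cls in ("Yes", "No"):
    print(cls, nb_score(x, cls, data))
# Yes ≈ 0.0317, No ≈ 0.0023  -> predict Play = Yes
```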
50. Group Discussion
❑ Members: 3-5 students
❑ Duration: 5 mins.
❑ Let X = (Outlook = Sunny, Temp = Hot, Humidity = High, Wind = Weak); predict the class of X.
51. Group Discussion
❑ Members: 3-5 students
❑ Duration: 5 mins.
❑ Let X = (Outlook = Overcast, Temp = Hot, Humidity = High, Wind = Weak); predict the class of X.
Note: Naïve Bayesian prediction requires each conditional probability to be non-zero.
52. Bayes' Classifier
Need to avoid the zero-probability problem
Use the Laplacian correction (Laplacian estimator):
P(Ci) = (|Ci,D| + 1) / (|D| + m)
P(xk|Ci) = (#{samples in Ci,D having value xk} + 1) / (|Ci,D| + r)
with:
- m: #classes
- r: #different values of the attribute
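As a quick illustration on slide 51's query (my own example, with m = 2 classes and r = 3 distinct Outlook values): without correction, P(Outlook=Overcast | Play=No) = 0/5, which would zero out the whole product; with the Laplacian correction it becomes

```latex
P(\text{Outlook}=\text{Overcast} \mid \text{Play}=\text{No}) = \frac{0 + 1}{5 + 3} = \frac{1}{8}
```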
54. Comments
Advantages
Easy to implement
Good results obtained in most of the cases
Disadvantages
Assumption of class conditional independence, therefore loss of accuracy
Practically, dependencies exist among variables
E.g., hospital patients: profile (age, family history, etc.), symptoms (fever, cough, etc.), disease (lung cancer, diabetes, etc.)
Dependencies among these cannot be modeled by a Naïve Bayes classifier
64. Comments (neural networks)
Advantages
High tolerance to noisy data
Ability to classify untrained patterns
Well-suited for continuous-valued inputs and outputs
Successful on an array of real-world data, e.g., hand-written letters, …
Techniques have recently been developed for very complicated topics
65. Comments (neural networks)
Disadvantages
Long training time
Require a number of parameters typically best determined empirically, e.g., the network topology or "structure"
Poor interpretability: difficult to interpret the symbolic meaning behind the learned weights and the "hidden units" in the network
66. CONTENT
1. Introduction
2. Decision Tree
3. Bayes Classification Methods
4. Neural network
5. K - Nearest Neighbor Classifier
6. Support Vector Machine
68. K - Nearest Neighbor Classifier
1. Introduction
2. K - Nearest Neighbor Classifier
3. Comments
69. Introduction
The k-nearest-neighbor method was first described in the early 1950s
The idea is to search for the closest match(es) of the test data in the feature space
All instances correspond to points in the n-dimensional space
The nearest neighbors are defined through a distance function dist(X1, X2) (e.g., Euclidean distance)
70. K - Nearest Neighbor Classifier
The training tuples are described by n attributes
Each tuple represents a point in an n-dimensional space → all the training tuples are stored in an n-dimensional pattern space
When given an unknown tuple, a k-nearest-neighbor classifier searches the pattern space for the k training tuples that are closest to the unknown tuple
These k training tuples are the k "nearest neighbors" of the unknown tuple
71. K - Nearest Neighbor Classifier
"Closeness" is defined in terms of a distance metric, such as the Euclidean distance between X1 = (x11, x12, …, x1n) and X2 = (x21, x22, …, x2n):
dist(X1, X2) = sqrt( Σi=1..n (x1i − x2i)² )
For a discrete-valued target, k-NN returns the most common value among the k training examples nearest to xq
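A bare-bones sketch of this procedure (my own illustration, with Euclidean distance and majority vote; the tiny training set is invented):

```python
import math
from collections import Counter

def knn_predict(query, train_X, train_y, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    dists = [(math.dist(query, x), y) for x, y in zip(train_X, train_y)]
    dists.sort(key=lambda pair: pair[0])
    k_labels = [y for _, y in dists[:k]]
    return Counter(k_labels).most_common(1)[0][0]

# Tiny illustrative usage
train_X = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.8)]
train_y = ["A", "A", "B", "B"]
print(knn_predict((1.1, 1.0), train_X, train_y, k=3))   # -> 'A'
```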
72. K - Nearest Neighbor Classifier
K = 1, 3, 4, or 7?
K should be an odd number
Should all neighbours have equal importance?
Weighted kNN: each neighbour is weighted depending on its distance to the new-comer, e.g. w = 1 / d(xq, xi)²
https://docs.opencv.org/3.4/d5/d26/tutorial_py_knn_understanding.html
73. Comments
K = ?
Extremely slow when classifying test tuples
"Learning" involves only memorizing (storing) the training data before testing and classifying
Distance metric?
Robust to noisy data
74. CONTENT
1. Introduction
2. Decision Tree
3. Bayes Classification Methods
4. Neural network
5. K - Nearest Neighbor Classifier
6. Support Vector Machine
76. Introduction
A relatively new classification method for both linear and nonlinear data
It uses a nonlinear mapping to transform the original training data into a higher dimension
Within the new dimension, it searches for the linear optimal separating hyperplane (i.e., "decision boundary")
With an appropriate nonlinear mapping to a sufficiently high dimension, data from two classes can always be separated by a hyperplane
SVM finds this hyperplane using support vectors ("essential" training tuples) and margins (defined by the support vectors)
77. SVM—History and Applications
Vapnik and colleagues (1992)—groundwork from Vapnik & Chervonenkis' statistical learning theory in the 1960s
Features: training can be slow, but accuracy is high owing to their ability to model complex nonlinear decision boundaries (margin maximization)
Used for: classification and numeric prediction
Applications: handwritten digit recognition, object recognition, speaker identification, benchmarking time-series prediction tests
79. SVM—Margins and Support Vectors
80. SVM—When Data Is Linearly Separable
Let the data D be (X1, y1), …, (X|D|, y|D|), where Xi is the set of training tuples with associated class labels yi
There are infinitely many lines (hyperplanes) separating the two classes, but we want to find the best one (the one that minimizes classification error on unseen data)
SVM searches for the hyperplane with the largest margin, i.e., the maximum marginal hyperplane (MMH)
81. SVM—Linearly Separable
◼ A separating hyperplane can be written as W ● X + b = 0, where W = {w1, w2, …, wn} is a weight vector and b a scalar (bias)
◼ For 2-D data it can be written as w0 + w1 x1 + w2 x2 = 0
◼ The hyperplanes defining the sides of the margin:
H1: w0 + w1 x1 + w2 x2 ≥ 1 for yi = +1, and
H2: w0 + w1 x1 + w2 x2 ≤ –1 for yi = –1
◼ Any training tuples that fall on hyperplanes H1 or H2 (i.e., the sides defining the margin) are support vectors
◼ This becomes a constrained (convex) quadratic optimization problem: quadratic objective function and linear constraints → Quadratic Programming (QP) → Lagrangian multipliers
82. Why Is SVM Effective on High-Dimensional Data?
◼ The complexity of the trained classifier is characterized by the number of support vectors rather than the dimensionality of the data
◼ The support vectors are the essential or critical training examples — they lie closest to the decision boundary (MMH)
◼ If all other training examples were removed and the training repeated, the same separating hyperplane would be found
◼ The number of support vectors found can be used to compute an (upper) bound on the expected error rate of the SVM classifier, which is independent of the data dimensionality
◼ Thus, an SVM with a small number of support vectors can have good generalization, even when the dimensionality of the data is high
83. SVM: Different Kernel Functions
◼ Instead of computing the dot product on the transformed data, it is mathematically equivalent to apply a kernel function K(Xi, Xj) to the original data, i.e., K(Xi, Xj) = Φ(Xi) · Φ(Xj)
◼ Typical kernel functions
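The kernel formulas themselves did not survive extraction; the standard definitions usually listed at this point are reproduced below (a reconstruction, not copied from the slide):

```latex
\text{Polynomial of degree } h:\quad K(X_i, X_j) = (X_i \cdot X_j + 1)^h
\text{Gaussian RBF}:\quad K(X_i, X_j) = e^{-\lVert X_i - X_j\rVert^2 / 2\sigma^2}
\text{Sigmoid}:\quad K(X_i, X_j) = \tanh(\kappa\, X_i \cdot X_j - \delta)
```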
84. SVM Related Links
SVM website: http://www.kernel-machines.org/
SVM practical guide: library for SVM
Representative implementations:
LIBSVM: an efficient implementation of SVM, multi-class classification, nu-SVM, one-class SVM, including various interfaces with Java, Python, etc.
SVM-light: simpler, but performance is not better than LIBSVM; supports only binary classification and only in C
SVM-torch: another recent implementation, also written in C
85. CONTENT
1. Introduction
2. Decision Tree
3. Bayes Classification Methods
4. Neural network
5. K - Nearest Neighbor Classifier
6. Support Vector Machine