This document provides an overview of lectures on machine learning topics including classification, overfitting, support vector machines, data projection, and regression. It discusses evaluating models, controlling overfitting through cross-validation, precision vs recall, and implementing classification and regression in Python using Scikit-Learn. Examples are provided on linear classification with SVM, handling non-linearly separable data, and using data projection techniques like LDA.
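A rough sketch of the Scikit-Learn workflow these lectures describe: fitting a linear SVM and evaluating it on held-out data. The synthetic blob dataset and the `C` value are illustrative assumptions, not taken from the lectures.

```python
# Minimal sketch: linear SVM classification in Scikit-Learn.
# The blob dataset and C value are illustrative, not from the lectures.
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Two Gaussian blobs: a (nearly) linearly separable binary problem.
X, y = make_blobs(n_samples=200, centers=2, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = SVC(kernel="linear", C=1.0).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
# For data that is not linearly separable, swapping in kernel="rbf"
# lets the SVM learn a non-linear decision boundary.
```

When tuning `C` to control overfitting, cross-validation (e.g. `sklearn.model_selection.cross_val_score`) would replace the single train/test split used here.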
This document provides an overview of machine learning concepts covered in an Introduction to Machine Learning course. It discusses topics like binary and multiclass classification, evaluation metrics like precision and recall, imbalanced datasets, and algorithms like k-nearest neighbors, decision trees, support vector machines, and data projection techniques. Examples and illustrations are provided to explain key concepts in classification and how different algorithms work.
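Precision and recall, the evaluation metrics mentioned above, can be computed directly with Scikit-Learn. The tiny label vectors below are made up purely so the counts are easy to check by hand.

```python
# Precision vs. recall on a small, hand-checkable example.
from sklearn.metrics import precision_score, recall_score

y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]  # imbalanced: 2 positives out of 10
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]  # 1 TP, 2 FP, 1 FN

precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 1 / 3
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 1 / 2
```

On imbalanced datasets like this one, accuracy alone (8/10 here) hides how poorly the rare positive class is handled, which is why both metrics are reported.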
This document provides an overview of an introduction to machine learning course, including:
- A description of the course content which covers Python programming, data visualization, supervised learning algorithms, regression, and unsupervised learning.
- An example of predicting bike share usage at different stations and the importance of understanding the problem and data.
- Guidance on exploring and visualizing data in Python to gain insights before applying machine learning algorithms.
This document provides an overview of a course on machine learning. It discusses topics that will be covered, including data visualization, descriptive statistics, the central limit theorem, correlation, classification, and confusion matrices. Classification examples include binary classification of emails as spam or not spam based on multiple features, as well as digit recognition from images. Trade-offs between types of errors in predictive models and optimizing goals like profit are also mentioned.
This document provides an overview of a course on machine learning. It discusses topics that will be covered in the course including data visualization, descriptive statistics, the central limit theorem, correlation, classification algorithms for binary and multiclass problems, and confusion matrices. Examples are provided for correlation, linear classification of handwritten digits, and how different types of classification errors can impact domains like medical diagnosis or airline overbooking policies. The goal is to introduce foundational machine learning concepts.
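The confusion matrix for linear classification of handwritten digits can be reproduced in a few lines. This sketch uses Scikit-Learn's bundled digits dataset, with logistic regression standing in for whatever linear classifier the course uses (an assumption).

```python
# Sketch: confusion matrix for linear classification of handwritten digits.
# Logistic regression stands in for the course's classifier (an assumption).
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=2000).fit(X_train, y_train)
cm = confusion_matrix(y_test, clf.predict(X_test))
# cm[i, j] counts test images of true digit i predicted as digit j,
# so the off-diagonal entries are exactly the classification errors.
```

Reading the matrix row by row makes the error trade-offs discussed above concrete: two models with the same overall accuracy can distribute their off-diagonal mass very differently.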
This document outlines the course details for an Introduction to Machine Learning module. The module will cover the basics of machine learning including supervised learning techniques like classification and regression, as well as unsupervised learning techniques like clustering and dimensionality reduction. Students will learn to prepare data, visualize it, evaluate models, and communicate results to both technical and non-technical audiences. The course will involve lectures, labs, homework assignments, and a final written test. The goal is for students to understand the basics of machine learning and be able to apply the techniques to analyze real-world data.
Ordinal Regression and Machine Learning: Applications, Methods, Metrics (Francesco Casalegno)
What do movie recommender systems, disease progression evaluation, and sovereign credit ranking have in common?
→ ordinal regression sits between classification and regression
→ target values are categorical and discrete, but ordered
→ many challenges to face when training and evaluating models
What will you find in this presentation?
→ real-life, clear examples of ordinal regression you see every day
→ learning to rank: predict user preferences and items relevance
→ best solution methods: naïve, binary decomposition, threshold
→ how to measure performance: understand & choose metrics
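The binary-decomposition method listed above (one binary classifier per ordered threshold, in the style of Frank and Hall) can be sketched as follows. The synthetic one-feature dataset and the choice of logistic regression are illustrative assumptions, not taken from the presentation.

```python
# Sketch of the binary-decomposition method: K ordered classes become
# K-1 binary problems "is y > k?". Dataset and model are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 1))
# Ordinal target (0 < 1 < 2) from thresholding a noisy latent score.
latent = X[:, 0] + rng.normal(scale=0.3, size=300)
y = np.digitize(latent, bins=[-0.5, 0.5])

# One binary classifier per threshold k, estimating P(y > k).
models = [LogisticRegression().fit(X, (y > k).astype(int)) for k in (0, 1)]

def predict_ordinal(X_new):
    # P(y=0) = 1 - P(y>0); P(y=1) = P(y>0) - P(y>1); P(y=2) = P(y>1)
    p_gt = np.column_stack([m.predict_proba(X_new)[:, 1] for m in models])
    probs = np.column_stack(
        [1 - p_gt[:, 0], p_gt[:, 0] - p_gt[:, 1], p_gt[:, 1]])
    return probs.argmax(axis=1)

pred = predict_ordinal(X)
```

Because class probabilities are assembled from the ordered thresholds, this decomposition exploits the ordering that a plain multiclass classifier would ignore.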
This document proposes two active learning methods, SVM-CC and SVM-CCMS, for hyperspectral image classification that focus on identifying and sampling from critical classes. The methods use a shifting hyperplane model to identify critical class pairs with high probability of being difficult to classify. SVM-CC randomly samples from the critical class set, while SVM-CCMS samples points closest to the decision margin within critical classes. Experimental results on two hyperspectral datasets show the proposed methods outperform random sampling and concentrate samples on support vectors, particularly improving performance for hard classes.
This document appears to be lecture slides for a course on deriving knowledge from data at scale. It covers many topics related to building machine learning models including data preparation, feature selection, classification algorithms like decision trees and support vector machines, and model evaluation. It provides examples applying these techniques to a Titanic passenger dataset to predict survival. It emphasizes the importance of data wrangling and discusses various feature selection methods.
Machine learning in science and industry — day 1 (arogozhnikov)
A course on machine learning in science and industry.
- notions and applications
- nearest neighbours: search and machine learning algorithms
- ROC curve
- optimal classification and regression
- density estimation
- Gaussian mixtures and EM algorithm
- clustering, with an example of clustering in the OPERA experiment
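The density estimation and EM topics in the outline above can be illustrated with Scikit-Learn's `GaussianMixture`, which fits a mixture model by EM. The one-dimensional two-component dataset here is an invented example.

```python
# Illustrative density estimation: a Gaussian mixture fitted by EM.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Two 1-D Gaussian components centred at -2 and +2.
X = np.concatenate([rng.normal(-2.0, 0.5, 500),
                    rng.normal(2.0, 0.5, 500)]).reshape(-1, 1)

gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
means = sorted(gmm.means_.ravel())  # EM should recover roughly [-2, 2]
labels = gmm.predict(X)             # hard clustering from responsibilities
```

The same fitted mixture serves both roles mentioned in the outline: `gmm.score_samples` gives the estimated log-density, while `gmm.predict` gives a clustering.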
The document discusses text categorization and compares several machine learning algorithms for this task, including Support Vector Machines (SVM), Transductive SVM (TSVM), and SVM combined with K-Nearest Neighbors (SVM-KNN). It provides an overview of text categorization and challenges. It then describes SVM, TSVM which uses unlabeled data to improve classification, and SVM-KNN which combines SVM with KNN to better handle unlabeled data. Pseudocode is presented for the algorithms.
This document provides an overview of machine learning techniques for classification and anomaly detection. It begins with an introduction to machine learning and common tasks like classification, clustering, and anomaly detection. Basic classification techniques are then discussed, including probabilistic classifiers like Naive Bayes, decision trees, instance-based learning like k-nearest neighbors, and linear classifiers like logistic regression. The document provides examples and comparisons of these different methods. It concludes by discussing anomaly detection and how it differs from classification problems, noting challenges like having few positive examples of anomalies.
This document presents cluster forests, a clustering ensemble method that aggregates multiple clustering instances. It consists of two stages: generating clustering instances using a random forest approach, and aggregating the results. The method is evaluated on eight datasets, demonstrating strong performance compared to baseline clustering algorithms under two evaluation metrics. Cluster forests provide a clustering analogy to random forests and a unifying view of clustering and classification.
[RecSys 2014] Deviation-Based and Similarity-Based Contextual SLIM Recommendation Algorithms (Yong Zheng)
Yong Zheng. "Deviation-Based and Similarity-Based Contextual SLIM Recommendation Algorithms". ACM RecSys Doctoral Symposium, Proceedings of the 8th ACM Conference on Recommender Systems (ACM RecSys 2014), pp. 437-440, Silicon Valley, CA, USA, Oct 2014 [Doctoral Symposium, Acceptance rate: 47%]
IMPROVING SUPERVISED CLASSIFICATION OF DAILY ACTIVITIES LIVING USING NEW COST... (csandit)
The growing population of elders in society calls for a new approach to caregiving. By inferring what activities the elderly are performing in their houses, it is possible to determine their physical and cognitive capabilities. In this paper we show the potential of important discriminative classifiers, namely Soft-Support Vector Machines (C-SVM), Conditional Random Fields (CRF), and k-Nearest Neighbors (k-NN), for recognizing activities from sensor patterns in a smart home environment. We also address the class imbalance problem in the activity recognition field, which is known to hinder the learning performance of classifiers. Cost-sensitive learning is attractive under most imbalanced circumstances, but it is difficult to determine the precise misclassification costs in practice. We introduce a new criterion for selecting a suitable cost parameter C for the C-SVM method. Through our evaluation on four real-world imbalanced activity datasets, we demonstrate that C-SVM based on our proposed criterion outperforms state-of-the-art discriminative methods in activity recognition.
Netflix uses a variety of techniques to provide personalized recommendations to users. Some key aspects include:
1. Netflix recommendations are generated using both offline and online techniques. Offline techniques allow for more complex computations but results may become stale, while online techniques can respond quickly but have stricter time constraints.
2. Recommendations are generated using a variety of data sources and machine learning models, including SVD, RBMs, gradient boosted trees, and other techniques. Both the data and models are important for generating high quality recommendations.
3. Netflix tests recommendations using both offline and online A/B testing techniques. Offline testing is used to evaluate new models and ideas before launching online tests involving real users.
Supervised learning involves using a training dataset to learn a target function that can be used to predict class labels or attribute values. The document discusses supervised learning and classification, including types of supervised learning problems like classification and regression. It provides examples of classification algorithms like K-nearest neighbors, decision trees, naive Bayes, and support vector machines. It also gives examples of how to implement classification algorithms using scikit-learn and discusses evaluating classification model performance based on accuracy.
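As a concrete instance of the scikit-learn usage the document describes, here is a minimal k-nearest-neighbors classifier evaluated by accuracy. The Iris dataset and `k=5` are illustrative choices, not taken from the document.

```python
# Minimal k-NN classification with accuracy, in the spirit of the
# scikit-learn examples the document mentions. Iris and k=5 are assumptions.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)  # fraction of correct test predictions
```

The same `fit`/`score` pattern applies unchanged to the other classifiers named above (decision trees, naive Bayes, SVMs), which is what makes scikit-learn convenient for comparing them.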
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib... (MLAI2)
While tasks in realistic settings can vary in the number of instances and classes, existing meta-learning approaches for few-shot classification assume that the number of instances per task and class is fixed. Due to this restriction, they learn to utilize the meta-knowledge equally across all tasks, even when the number of instances per task and class varies widely. Moreover, they do not consider distributional differences in unseen tasks, on which the meta-knowledge may be less useful depending on task relatedness. To overcome these limitations, we propose a novel meta-learning model that adaptively balances the effect of meta-learning and task-specific learning within each task. Through learning the balancing variables, we can decide whether to obtain a solution by relying on meta-knowledge or on task-specific learning. We formulate this objective in a Bayesian inference framework and tackle it using variational inference. We validate our Bayesian Task-Adaptive Meta-Learning (Bayesian TAML) on two realistic task- and class-imbalanced datasets, on which it significantly outperforms existing meta-learning approaches. A further ablation study confirms the effectiveness of each balancing component and the Bayesian learning framework.
This document summarizes a talk on scaling machine learning algorithms to big data settings using a divide-and-conquer approach. It discusses three converging trends of big data, distributed computing, and machine learning. The goal is to extend machine learning to big data, but traditional ML algorithms do not scale well. The proposed approach divides data into subsets, applies existing ML algorithms to each subset in parallel, and then combines the results. Matrix factorization is provided as an example application, where the Divide-Factor-Combine framework allows preserving theoretical guarantees while enabling scalability.
Localization and classification. Overfeat: class-agnostic versus class-specific localization, fully convolutional neural networks, greedy merge strategy. Multi-object detection. Region proposals and selective search. R-CNN, Fast R-CNN, Faster R-CNN, and YOLO. Image segmentation. Semantic segmentation and transposed convolutions. Instance segmentation and Mask R-CNN. Image captioning. Recurrent Neural Networks (RNNs). Language generation. Long Short-Term Memory (LSTMs). DeepImageSent, Show and Tell, and Show, Attend and Tell algorithms.
Probability density estimation using Product of Conditional Experts (Chirag Gupta)
This document discusses probability density estimation using a product of conditional experts model. It summarizes that density estimation constructs a probability distribution function from observed data to understand the underlying pattern. A product of conditional experts model is proposed, where simple classification models like logistic regression are used as experts to estimate the conditional probability. The experts are combined by multiplying their probabilities. The model is trained using gradient ascent to maximize the log probability. When evaluated on artificial and real datasets, the product of conditional experts model is shown to learn distributions close to the true distributions and generalize better than linear and non-linear baseline models. The document also explores applying the model to outlier detection.
This document summarizes a 2010 tutorial on metric learning given by Brian Kulis at the University of California, Berkeley. The tutorial introduces metric learning problems and algorithms. It discusses how metric learning can learn feature weights or linear/nonlinear transformations from data to improve distance metrics for tasks like clustering and classification. Key topics covered include Mahalanobis distance metrics, linear and nonlinear metric learning methods, and applications. The tutorial aims to explain both theoretical concepts and practical considerations for metric learning.
Flavours of Physics Challenge: Transfer Learning approach (Alexander Rakhlin)
Presentation for the "Heavy Flavour Data Mining workshop", February 18-19, University of Zurich. I discuss the solution that won the Physics Prize of the Flavours of Physics challenge organized by CERN, Yandex, and Intel on Kaggle.
Application of combined support vector machines in process fault diagnosis (Dr. Pooja Jain)
This document discusses applying combined support vector machines (C-SVM) for process fault diagnosis and compares its performance to other classifiers. The authors test C-SVM, k-nearest neighbors, and simple SVM on data from the Tennessee Eastman process simulator and a three tank system. Their results show C-SVM achieves the lowest classification error compared to the other methods, though its complexity increases with the number of faults. Principal component analysis did not improve performance over the other classifiers. Selecting important variables using contribution charts significantly enhanced classifier performance on the Tennessee Eastman data.
This document provides an overview of supervised machine learning algorithms, including linear regression, naive bayesian classification, and their applications. It discusses basic concepts like training a classification model on labeled data and testing it on new unlabeled data. Linear regression finds the best linear relationship between variables, while naive bayes assumes conditional independence between attributes. The document uses examples to illustrate classification of loan applications and text documents. It explains the mathematical process and advantages of the naive bayes approach.
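The text-classification use of naive Bayes described above can be sketched with a bag-of-words pipeline. The four tiny documents are invented purely to keep the word counts small enough to check by hand.

```python
# Hedged sketch: naive Bayes text classification with bag-of-words counts.
# The four-document loan corpus is invented for illustration.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = ["approve the loan", "loan approved low risk",
        "reject the loan", "loan rejected high risk"]
labels = ["approve", "approve", "reject", "reject"]

model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(docs, labels)
pred = model.predict(["low risk loan"])[0]  # "low" occurs only in approve docs
```

The conditional-independence assumption mentioned above is what lets the model multiply a per-word probability for each token instead of modeling word combinations.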
Hands-on Tutorial of Machine Learning in Python (Chun-Ming Chang)
This document provides an overview of a hands-on tutorial on machine learning in Python. It discusses various machine learning algorithms including linear regression, logistic regression, and regularization. It explains key concepts such as model selection, cross-validation, preprocessing, and evaluation metrics. Examples are provided to illustrate linear regression, regularization techniques like Ridge and Lasso regression, and logistic regression. The document encourages participants to practice these techniques on exercises.
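A minimal sketch of the tutorial's Ridge-versus-Lasso comparison under cross-validation, on synthetic data. The `alpha` values and dataset parameters are illustrative assumptions.

```python
# Sketch: Ridge vs. Lasso under 5-fold cross-validation on synthetic data.
# alpha=1.0 and the dataset parameters are illustrative assumptions.
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.model_selection import cross_val_score

X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=0)

results = {}
for name, model in [("ridge", Ridge(alpha=1.0)), ("lasso", Lasso(alpha=1.0))]:
    # cross_val_score defaults to the estimator's score, i.e. R^2 per fold.
    results[name] = cross_val_score(model, X, y, cv=5).mean()
```

In practice the regularization strength would itself be selected by cross-validation (e.g. `RidgeCV`/`LassoCV`), which is the model-selection workflow the tutorial covers.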
- Quiz 1 will be on Wednesday, covering lecture material with an emphasis on topics not covered in the projects. It will contain around 20 multiple-choice or short-answer questions to be completed in class.
- The document previews a machine learning lecture covering topics like clustering strategies, classifiers, generalization, bias-variance tradeoff, and support vector machines. It provides slides and summaries of key concepts.
- It summarizes techniques for reducing error in machine learning models, such as choosing simpler classifiers, collecting more training data, and regularizing parameters.
- Quiz 1 will be on Wednesday covering material from lecture with an emphasis on topics not covered in projects. It will contain around 20 multiple choice or short answer questions to be completed in class.
- The document previews a machine learning lecture covering topics like clustering strategies, classifiers, generalization, bias-variance tradeoff, and support vector machines. It provides slides and summaries of key concepts.
- Summarizing techniques for reducing error in machine learning models like choosing simpler classifiers, collecting more training data, and regularizing parameters.
Temple of Asclepius in Thrace. Excavation resultsKrassimira Luka
The temple and the sanctuary around were dedicated to Asklepios Zmidrenus. This name has been known since 1875 when an inscription dedicated to him was discovered in Rome. The inscription is dated in 227 AD and was left by soldiers originating from the city of Philippopolis (modern Plovdiv).
A Visual Guide to 1 Samuel | A Tale of Two HeartsSteve Thomason
These slides walk through the story of 1 Samuel. Samuel is the last judge of Israel. The people reject God and want a king. Saul is anointed as the first king, but he is not a good king. David, the shepherd boy is anointed and Saul is envious of him. David shows honor while Saul continues to self destruct.
How to Setup Default Value for a Field in Odoo 17Celine George
In Odoo, we can set a default value for a field during the creation of a record for a model. We have many methods in odoo for setting a default value to the field.
Gender and Mental Health - Counselling and Family Therapy Applications and In...PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
CapTechTalks Webinar Slides June 2024 Donovan Wright.pptxCapitolTechU
Slides from a Capitol Technology University webinar held June 20, 2024. The webinar featured Dr. Donovan Wright, presenting on the Department of Defense Digital Transformation.
🔥🔥🔥🔥🔥🔥🔥🔥🔥
إضغ بين إيديكم من أقوى الملازم التي صممتها
ملزمة تشريح الجهاز الهيكلي (نظري 3)
💀💀💀💀💀💀💀💀💀💀
تتميز هذهِ الملزمة بعِدة مُميزات :
1- مُترجمة ترجمة تُناسب جميع المستويات
2- تحتوي على 78 رسم توضيحي لكل كلمة موجودة بالملزمة (لكل كلمة !!!!)
#فهم_ماكو_درخ
3- دقة الكتابة والصور عالية جداً جداً جداً
4- هُنالك بعض المعلومات تم توضيحها بشكل تفصيلي جداً (تُعتبر لدى الطالب أو الطالبة بإنها معلومات مُبهمة ومع ذلك تم توضيح هذهِ المعلومات المُبهمة بشكل تفصيلي جداً
5- الملزمة تشرح نفسها ب نفسها بس تكلك تعال اقراني
6- تحتوي الملزمة في اول سلايد على خارطة تتضمن جميع تفرُعات معلومات الجهاز الهيكلي المذكورة في هذهِ الملزمة
واخيراً هذهِ الملزمة حلالٌ عليكم وإتمنى منكم إن تدعولي بالخير والصحة والعافية فقط
كل التوفيق زملائي وزميلاتي ، زميلكم محمد الذهبي 💊💊
🔥🔥🔥🔥🔥🔥🔥🔥🔥
2. Trinity College Dublin, The University of Dublin
Overview of previous lectures
• Classification
• Evaluation
• Overfitting and cross-validation
• Chance level
• K-nearest neighbour (KNN)
• Decision tree
3. Overview of this lecture
• Cross-validation
• Overfitting
• More about Support Vector Machines (SVM)
• Data projection (introduction)
• Introduction to regression
4. Support Vector Machine (SVM)
Linear Binary SVM Classification
- Scenario where the two classes are linearly separable
- The solid line in the plot on the right represents the decision boundary of an SVM classifier
- This line separates the two classes and stays as far away from the closest training instances as possible
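The linearly separable case can be sketched with scikit-learn's `LinearSVC`. This is not the slide's own code; the two well-separated blobs below are made up for illustration.

```python
import numpy as np
from sklearn.svm import LinearSVC

# Two linearly separable blobs (illustrative data, not from the slides)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)),   # class 0 around (0, 0)
               rng.normal(3.0, 0.5, (50, 2))])  # class 1 around (3, 3)
y = np.array([0] * 50 + [1] * 50)

# Fit a linear maximum-margin classifier
clf = LinearSVC(C=1.0)
clf.fit(X, y)
print(clf.score(X, y))  # separable data, so accuracy is close to 1.0
```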
5. Support Vector Machine (SVM)
A more realistic scenario: we are going to get some errors, so we can choose. Do we prefer having higher precision or higher recall? We can’t have both, but we can move the decision boundary to make the solution as good as possible for our goals.
6. Overfitting
https://towardsdatascience.com/techniques-for-handling-underfitting-and-overfitting-in-machine-learning-348daa2380b9
Overfitted model: it does not generalise well!
Maybe some datapoints were bad measurements or mislabelled
7. Overfitting
Controlling for overfitting
- We want to make sure that our model is working for real, i.e. that it generalises, not that it works (good classification) because we are overfitting
- To do so, we fit the model on one portion of the data and test it on a separate portion of the data. This approach controls for overfitting, as the model is evaluated on unseen data (cross-validation)
Preventing overfitting
- More complex models tend to overfit more
- There are strategies to reduce the amount of overfitting (e.g., regularisation, early stopping)
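The "fit on one portion, test on another" idea can be sketched with a hold-out split. The synthetic noisy dataset and the unconstrained decision tree below are illustrative choices: the tree memorises the training set perfectly, and the held-out score exposes the overfitting.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Noisy two-class data (illustrative): labels depend on x1 plus heavy noise
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + rng.normal(scale=1.5, size=200) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=1)

tree = DecisionTreeClassifier()  # unconstrained depth: prone to overfitting
tree.fit(X_train, y_train)

print("train accuracy:", tree.score(X_train, y_train))  # memorises the training data
print("test accuracy:", tree.score(X_test, y_test))     # noticeably lower on unseen data
```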
8. Cross-validation (controlling for overfitting)
https://towardsdatascience.com/cross-validation-k-fold-vs-monte-carlo-e54df2fc179b
[Figure: two-class dataset (Class 1, Class 2) shown as ground truth, training set, and test set]
9. Cross-validation
[Figure: two-class dataset (Class 1, Class 2) shown as ground truth, training set, and test set]
10. Cross-validation
[Figure: two-class dataset (Class 1, Class 2) shown as ground truth, training set, and test set]
The model is overfitting! It is too complex. At least the cross-validation is controlling for that, i.e., prediction on the test set is not very good.
11. k-fold Cross-validation
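k-fold cross-validation can be sketched with scikit-learn's `cross_val_score`; the iris dataset and the KNN classifier below are illustrative choices, not taken from the slides.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# 5-fold CV: the data is split into 5 parts; each part is used as the
# test set exactly once while the model is trained on the other 4
scores = cross_val_score(KNeighborsClassifier(n_neighbors=5), X, y, cv=5)
print(scores)         # one accuracy per fold
print(scores.mean())  # averaged estimate of generalisation performance
```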
12. Baseline – real vs. ideal
- Coin flip:
  - 2 classes (head vs. tail)
  - 50-50 chance
  - Random
- Is that a zero or a one digit?
  - 2 classes
  - Let’s use a simple linear classifier. We definitely want this classifier to perform better than chance.
  - What is chance? Well, 2 classes... isn’t that a 50-50 chance to get it right?
  - Nope. That depends on the probability of encountering a 1 or a 0.
  - So, let’s say that we have an equal number of zeros and ones in the dataset. That means that a random classifier has a 50-50 chance of getting it right.
  - Yes... with infinite data
13. Baseline – real vs. ideal
- Small datasets have a higher chance that a random classifier would get it right by chance
- So, classification results should be compared to a baseline (or chance level) that is calculated by taking into account the sample size (N)
- We will see that in the coming lectures
- Things get more complicated with multiclass and imbalanced datasets
https://www.discovermagazine.com/mind/machine-learning-exceeding-chance-level-by-chance
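The sample-size effect can be simulated directly: a random guesser on balanced binary data averages 50% accuracy regardless of N, but the spread around 50% is much wider for small N, so "beating chance" on a small dataset is easier to do by luck. This simulation is an illustration added here, not from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_classifier_accuracy(n, trials=10_000):
    """Accuracy of guessing labels at random on balanced binary data of size n."""
    y = rng.integers(0, 2, size=(trials, n))        # true labels
    guesses = rng.integers(0, 2, size=(trials, n))  # random predictions
    return (y == guesses).mean(axis=1)              # one accuracy per trial

for n in (10, 1000):
    acc = random_classifier_accuracy(n)
    # The mean stays at 0.5, but small n gives a much wider spread
    print(n, round(acc.mean(), 3), round(acc.std(), 3))
```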
14. Precision vs. recall
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, 2019
Trade-off
16. Classification – evaluation metrics
F1-score = harmonic mean of precision and recall: F1 = 2 · (precision · recall) / (precision + recall)
Precision, recall, and F1-score apply to binary balanced, binary imbalanced, and multiclass classification.
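These metrics are available in `sklearn.metrics`; the toy labels below are made up so the counts are easy to check by hand.

```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Toy binary problem (illustrative): 4 positives, 6 negatives
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]  # 3 TP, 1 FP, 1 FN

p = precision_score(y_true, y_pred)  # TP / (TP + FP) = 3/4
r = recall_score(y_true, y_pred)     # TP / (TP + FN) = 3/4
f1 = f1_score(y_true, y_pred)        # harmonic mean of p and r
print(p, r, f1)                      # 0.75 0.75 0.75
```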
17. Classification in Python
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, 2019
X is the data matrix (features)
y is the class (‘five’ or ‘not a five’)
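The slide's code itself is not captured in this transcript. A minimal sketch in the spirit of Géron's MNIST "five-detector", with scikit-learn's small digits dataset swapped in to keep it self-contained (the scaling step and random seeds are my additions):

```python
from sklearn.datasets import load_digits
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

digits = load_digits()
X = digits.data               # data matrix: one row of pixel features per image
y = (digits.target == 5)      # binary target: 'five' vs 'not a five'

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Linear classifier trained with stochastic gradient descent
clf = make_pipeline(StandardScaler(), SGDClassifier(random_state=42))
clf.fit(X_train, y_train)
print(clf.score(X_test, y_test))
```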
18. Support Vector Machine (SVM)
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, 2019
19. Support Vector Machine (SVM)
- Some datasets are not even close to being linearly separable.
- One approach is to use polynomial features, e.g., x2 = (x1)^2, x3 = (x1)^3
20. Support Vector Machine (SVM)
- Some datasets are not even close to being linearly separable.
- One approach is to use polynomial features, e.g., x2 = (x1)^2, x3 = (x1)^3
- Kernel methods
https://towardsdatascience.com/the-kernel-trick-c98cdbcaeb3f
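Both routes can be sketched on a non-linearly-separable dataset. The moons data, degree, and C values below are illustrative choices (similar to Géron's examples), not the slide's own code: option 1 materialises polynomial features and fits a linear SVM; option 2 gets the same effect through the kernel trick without ever computing those features.

```python
from sklearn.datasets import make_moons
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import SVC, LinearSVC

# Two interleaving half-moons: not linearly separable in (x1, x2)
X, y = make_moons(n_samples=200, noise=0.1, random_state=0)

# Option 1: explicitly add polynomial features, then fit a linear SVM
poly_svm = make_pipeline(PolynomialFeatures(degree=3), StandardScaler(),
                         LinearSVC(C=10, max_iter=10_000))
poly_svm.fit(X, y)

# Option 2: the kernel trick computes the same similarities implicitly
kernel_svm = SVC(kernel="poly", degree=3, coef0=1, C=5)
kernel_svm.fit(X, y)

print(poly_svm.score(X, y), kernel_svm.score(X, y))  # both fit the moons well
```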
21. LDA: Linear Discriminant Analysis and Data projection
[Figure: two-class data (Y ∈ {green, blue}, X = [x1, x2]) viewed in two different axis systems]
Sometimes it is easier to look at things from a different angle, instead of searching for a complicated solution
22. Data projection
[Figure: Y ∈ {green, blue}, X = [x1, x2], plotted before and after the shift onto axes (xproj1, xproj2)]
Xproj = X − [2, 0]
Xproj = [x1, x2] − [2, 0]
Xproj = [x1 − 2, x2]
23. Data projection
[Figure: Y ∈ {green, blue}, X = [x1, x2], plotted before and after the shift onto axes (xproj1, xproj2)]
Xproj = X − [2, 3]
Xproj = [x1, x2] − [2, 3]
Xproj = [x1 − 2, x2 − 3]
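The shift above is a single vectorised subtraction in numpy; the point coordinates below are made up for illustration, while the offset [2, 3] is the slide's.

```python
import numpy as np

# Points described by features [x1, x2] (illustrative values)
X = np.array([[2.0, 3.0],
              [3.0, 4.0],
              [1.0, 3.5]])

# Shifting the origin: Xproj = X - [2, 3], applied to every row at once
X_proj = X - np.array([2.0, 3.0])
print(X_proj)  # the first point lands on the new origin [0, 0]
```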
24. Data projection
A projection is a transformation of data points from one axis system to another
[Figure: the same points shown in the original axes (x1, x2) and in the projected axes (xproj1, xproj2)]
25. Data projection
[Figure: a bad projection vs. a good projection of the same two-class data in (x1, x2)]
26. Data projection
LDA: Linear Discriminant Analysis
[Figure: a good projection axis for the two-class data in (x1, x2)]
Find the axis that:
- Maximises the variance of the class means (between-class)
- Minimises the within-class variance
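This axis can be found with scikit-learn's `LinearDiscriminantAnalysis`; the two Gaussian classes below are illustrative data, not the slide's.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Two Gaussian classes with different means (illustrative data)
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 1.0, (100, 2)),
               rng.normal([4, 4], 1.0, (100, 2))])
y = np.array([0] * 100 + [1] * 100)

# LDA finds the 1-D axis that best separates the classes:
# large spread between class means, small spread within each class
lda = LinearDiscriminantAnalysis(n_components=1)
X_proj = lda.fit_transform(X, y)

print(X_proj.shape)   # (200, 1): each point reduced to one coordinate
print(lda.score(X, y))  # classification accuracy along the learned axis
```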
27. Data projection
[Figure: the data projected onto the single axis xproj]
Perfect separability between classes
30. Discussion
• How could we design a pothole detector that can map the potholes in Dublin? What would be the data? How would we use this data to perform classification and detect the potholes?
[Diagram: Problem/question → Data collection → Preprocessing/cleaning → Analysing (ML) → Interpretation/outcome → Improve, with visualisation supporting each stage]
31. Supervised Learning
y = f(X)
Model training (learning or fit): X known, y known, f unknown (learned from the data)
Using the model (test): Xnew known, f known, ynew unknown (predicted)
Classification: y is a category/class
Regression: y is a number
32. Regression
Classification – find a decision boundary, e.g.:
  combination of X > boundary → y is class A
  combination of X < boundary → y is class B
Regression – find a function, e.g.:
  y = combination of X
33. Regression
[Figure: y = avg cost of a house as a function of X1: cost of materials and X2: inflation]
Using the past (of X) to predict the future (of y)
34. Regression
y: dependent variable
X: independent variables
35. Classification in Python
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, 2019
X is the data matrix (features)
y is the class (‘five’ or ‘not a five’)
36. Regression in Python
“Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow”, Aurélien Géron, 2019
X is the data matrix (features)
y is the target (a number, not a class)
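The regression code itself is not captured in this transcript. A minimal sketch with scikit-learn's `LinearRegression` on made-up data, where y is a number generated from a known linear rule so the fit can be checked:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data: y ≈ 3*x1 + 2*x2 plus a little noise
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 2))
y = 3 * X[:, 0] + 2 * X[:, 1] + rng.normal(scale=0.1, size=100)

reg = LinearRegression()
reg.fit(X, y)                      # same fit/predict interface as the classifiers
print(reg.coef_)                   # recovers roughly [3, 2]
print(reg.predict([[1.0, 1.0]]))   # a numeric prediction, not a class
```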
Editor's Notes
Mention that the main challenge is always to determine those axes (features). Not just 2D, multidimensional. It could be age, height,