Machine Learning presentation. It gives a brief idea of what machine learning is and points you toward going deeper into it. It covers supervised and unsupervised learning, with examples of how to use different models.
6. www.nicesoftwaresolutions.com
Goals of the course:
1) Identify a machine learning problem.
2) Use basic machine learning techniques.
3) Think about your data/results.
What is Machine Learning?
Machine learning explores the construction and use of algorithms
that can learn from data.
A machine that learns from data improves its performance as it
receives more information. This experience typically comes in the
form of observations of how particular instances of a problem were
solved before.
1) Construct and use algorithms that learn from data
2) More information -> higher performance
3) Previous solutions -> experience
Input Knowledge:
Example: pre-labeled squares

Size     Edge      Color
Small    Dotted    Green
Big      Striped   Yellow
Medium   Normal    Green

Features: Size and Edge. Label: Color. Observations: each row is one square.

In R, use data.frame():
> squares <- data.frame(
    size = c("small", "big", "medium"),
    edge = c("dotted", "striped", "normal"),
    color = c("green", "yellow", "green"))
Class Functions:
In a data frame, rows correspond to observations and columns correspond to
variables.
To find the dimensions of a data set, use the dim function.
> dim(squares)      # number of observations, number of variables
> str(squares)      # structured overview
> summary(squares)  # distribution measures
11. www.nicesoftwaresolutions.com
How does Machine Learning work?
Machine Learning is a method in which the machine is
trained on input data and then tested on how well it
predicts the output for new inputs.
[Diagram labels: Input Data, Training Data, Machine, Learned Machine, Test Data, Predicted Output, Compare, Model Accuracy, New Input, Predicted Output]
14. www.nicesoftwaresolutions.com
Machine Learning Work Flow:
Starting point: the data set.
1) Cleaning the data set: The data needs to be cleaned into a format that is suitable for performing machine learning and getting the desired output.
2) Data shuffling: This step is very important and will be discussed in further slides.
3) Divide the data in two parts: The data needs to be divided into a training data set and a testing data set.
4) Categorizing the inputs and the output: The output can generally be of two types, i.e. Classification and Regression.
5) Input the training data set into the machine: By providing the training data set to the machine, we are getting the model ready for prediction.
6) Provide the inputs of the test data to the model and get the output.
7) Compare the predicted output with the test data set output: This will help us to get the model accuracy.
8) The model is now ready for prediction.
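The workflow steps can be sketched in base R as follows. This is only a minimal illustration, not the presenter's own code: the built-in iris data, the 70/30 split, the seed, and the use of logistic regression (glm) as the "machine" are all assumptions made for this example.

```r
set.seed(42)  # assumed seed, used only to make the shuffle reproducible

# Data set: keep two iris species so the output is a two-class label
flowers <- iris[iris$Species != "setosa", ]
flowers$Species <- droplevels(flowers$Species)

# Cleaning: drop rows with missing values (iris has none; shown for completeness)
flowers <- na.omit(flowers)

# Shuffling: put the rows in random order
flowers <- flowers[sample(nrow(flowers)), ]

# Divide the data in two parts (assumed 70% training, 30% testing)
n_train <- round(0.7 * nrow(flowers))
train <- flowers[seq_len(n_train), ]
test <- flowers[-seq_len(n_train), ]

# Input the training data set into the machine (logistic regression here)
model <- glm(Species ~ Petal.Length, data = train, family = binomial)

# Provide the test inputs to the model and get the predicted output
prob <- predict(model, newdata = test, type = "response")
classes <- levels(flowers$Species)
predicted <- ifelse(prob > 0.5, classes[2], classes[1])

# Compare predicted output with the test data set output
accuracy <- mean(predicted == test$Species)
print(accuracy)
```

The accuracy is computed only on rows the model never saw during training, which is why the data is divided before the model is fitted.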
18. www.nicesoftwaresolutions.com
Interactions:-
An interaction occurs when variables that behave one way individually have a different effect on the predicted
output when they are combined.
For example, drinking alcohol before driving can lead to an accident, and using a mobile phone while driving can
also lead to an accident. Using a mobile phone while driving drunk compounds the risk beyond what either factor
contributes on its own.
On the contrary, drug A and drug B might each be harmful when taken individually, but taken together they can
nullify each other's effect.
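In an R model formula, an interaction can be expressed with the `*` operator. The sketch below uses simulated data; the variable names and effect sizes are hypothetical and chosen only to illustrate the idea.

```r
set.seed(1)  # assumed seed for reproducibility
n <- 200

# Hypothetical 0/1 indicators for each risk factor
alcohol <- rbinom(n, 1, 0.5)
phone <- rbinom(n, 1, 0.5)

# Simulated risk score: each factor adds risk, and the combination adds extra
risk <- 1 + 2 * alcohol + 1.5 * phone + 3 * alcohol * phone + rnorm(n, sd = 0.5)

# alcohol * phone expands to alcohol + phone + alcohol:phone
fit <- lm(risk ~ alcohol * phone)
print(coef(fit))
```

The `alcohol:phone` coefficient estimates the extra effect that appears only when both factors are present.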
19. www.nicesoftwaresolutions.com
Automatic feature Selection:-
Sometimes we do not know which predictors to select, because we cannot judge
their impact on the prediction (possibly due to a lack of subject matter
knowledge).
In this case, we can use a method called stepwise regression.
There are two types of stepwise regression: forward and backward.
20. www.nicesoftwaresolutions.com
Forward Stepwise:
Even though the two models show the same output here, this is not always the case.
It is possible for the two to come to completely different conclusions about the most important predictors.
21. www.nicesoftwaresolutions.com
The two can create completely different models, and there is no guarantee as to which one is better.
Statisticians raise concerns that stepwise regression violates some of the principles that allow a
regression model to explain data as well as predict.
But using stepwise regression does not mean that the model's predictions are worthless. It simply means that
the model may overstate or understate the importance of predictors.
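Stepwise selection can be tried in base R with the step() function. The sketch below is an illustration only: the mtcars data and the four candidate predictors are assumptions made for this example.

```r
# Candidate predictors are an assumption for this example
full_model <- lm(mpg ~ wt + hp + disp + qsec, data = mtcars)
null_model <- lm(mpg ~ 1, data = mtcars)

# Backward: start with all predictors and drop them one at a time
backward <- step(full_model, direction = "backward", trace = 0)

# Forward: start with no predictors and add them one at a time
forward <- step(null_model, scope = formula(full_model),
                direction = "forward", trace = 0)

# The two searches may or may not select the same predictors
print(formula(backward))
print(formula(forward))
```

Comparing the two printed formulas shows directly whether the forward and backward searches agree on this data.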
22. www.nicesoftwaresolutions.com
Dummy Variables, Missing data and interactions:-
Dummy Variables:-
All the predictors used in regression analysis must be numeric. (That is, categorical data should be represented as
numbers.)
Category   Numeric value
Good       1
Bad        0
Missing Data:-
Missing data creates problems when building a predictive model, so it should be replaced by a value
that can be used for predictive analysis.
Missing value   Possible replacement
NA/Blank        mean()
NA              sd()
NA              0
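A minimal R sketch of both ideas, using a small made-up data frame (the column names and values are hypothetical). It codes Good/Bad as 1/0 and replaces a missing numeric value with the column mean, one of the options listed above.

```r
# Hypothetical example data
ratings <- data.frame(
  quality = c("Good", "Bad", "Good", NA),
  score = c(4, NA, 3, 5)
)

# Dummy variable: Good -> 1, Bad -> 0 (a missing category stays NA here)
ratings$quality_code <- ifelse(ratings$quality == "Good", 1, 0)

# Missing data: replace NA in score with the mean of the observed scores
score_mean <- mean(ratings$score, na.rm = TRUE)
ratings$score_filled <- ifelse(is.na(ratings$score), score_mean, ratings$score)

print(ratings)
```

Which replacement value is appropriate depends on the data; the mean is used here only as an example.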
24. www.nicesoftwaresolutions.com
Classification-
Applications
Common Applications of Classification:
1. Medical Diagnosis
2. Animal Recognition
3. ?
Important:
Qualitative Output
Predefined Classes
The classes in Medical diagnosis may be:
Sick
Very Sick
Normal
Healthy
The classes for Animal Recognition might be:
Cats
Dogs
Horses
30. www.nicesoftwaresolutions.com
Clustering
• Clustering : grouping objects
• Similar within cluster
• Dissimilar between Clusters
• Example: Grouping similar animal photos
No Labels
No right or wrong
Many possible clusterings
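As a small illustration, the base R kmeans() function groups observations without using any labels. k-means is only one clustering technique and is chosen here as an assumption, as are the iris measurements and the number of clusters.

```r
set.seed(123)  # assumed seed; k-means starts from random centers

# Use only the numeric measurements, no labels
measurements <- iris[, 1:4]

# Ask for 3 groups (the number of clusters is our choice)
clusters <- kmeans(measurements, centers = 3, nstart = 10)

# Cluster assignment for each observation, and the size of each cluster
head(clusters$cluster)
print(table(clusters$cluster))
```

A different number of centers or a different seed can give a different grouping, which reflects the point that there is no single right clustering.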
33. www.nicesoftwaresolutions.com
Supervised Vs Unsupervised:
[Diagram labels: Classification, Regression, Clustering, Similar]
Supervised Learning:
1. Predefined labels.
2. A function predicts the label or value for new input.
3. Measuring performance is easier, as there are predicted and real labels
for comparison.
Un-Supervised Learning:
1. Does not require labeled observations.
2. Clustering: Finds groups of observations that are similar.
3. Techniques to assess such models will be discussed further.
37. www.nicesoftwaresolutions.com
Decision Tree
Split a complex decision into small decisions, applying if…else conditions to get the desired output.
For example, a bank wants to decide whether or not to give a loan to a particular person based on his
credit score.
In this case, the bank collects information such as income, credit score, required loan amount, etc. The
bank must then quickly decide whether or not to give the loan.
38. www.nicesoftwaresolutions.com
On the basis of the information given, we can create a decision tree using the divide-and-conquer method.
This method helps to analyze whether the customer will be able to repay the loan amount or not.
Suppose the tree considers two aspects of each applicant: the credit score and the requested loan amount.
The algorithm looks for an initial split that creates the two most homogeneous groups.
First, it splits the applicants into groups of high and low credit scores, then it splits the requested loan amount into high and
low. Each of these splits results in an if…else decision in a tree structure, described below:
39. www.nicesoftwaresolutions.com
If the credit score is low, the tree predicts loan default. If the credit score is high and the loan amount is large, it
predicts default as well. Otherwise, it predicts that the loan is repaid.
Real scenarios will be much more complex than this. The example only illustrates the basic process of how such a tree
might be built.
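The loan rules can be written by hand as nested if…else conditions in R. This is a sketch of the finished tree, not of the algorithm that learns it; the function name and both cutoff values are hypothetical.

```r
# Hand-written version of the example tree; cutoffs are made-up values
predict_loan <- function(credit_score, loan_amount,
                         score_cutoff = 600, amount_cutoff = 10000) {
  if (credit_score < score_cutoff) {
    "default"   # low credit score
  } else if (loan_amount > amount_cutoff) {
    "default"   # high credit score, but large loan
  } else {
    "repaid"    # high credit score and small loan
  }
}

predict_loan(credit_score = 500, loan_amount = 5000)
predict_loan(credit_score = 700, loan_amount = 20000)
predict_loan(credit_score = 700, loan_amount = 5000)
```

A decision tree algorithm learns such cutoffs from the data instead of having them fixed in advance.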
One of the most widely used packages is "rpart".
Its rpart() function takes a formula; method = "class" builds a classification tree.