Machine learning uses data to let computers learn without being explicitly programmed. There are three main types of machine learning problems: supervised learning, unsupervised learning, and reinforcement learning. The typical machine learning process involves five steps: 1) data gathering, 2) data preprocessing, 3) feature engineering, 4) algorithm selection and training, and 5) making predictions. Generalization describes how well a model trained on one dataset predicts outcomes on an unseen dataset. Underfitting (high bias) and overfitting (high variance) both lead to poor generalization.
Overview of machine learning concepts: overfitting and train/test splits. Types of machine learning: supervised, unsupervised, and reinforcement learning. Introduction to Bayes' theorem. Linear regression: model assumptions and regularization (lasso, ridge, elastic net). Classification and regression algorithms: Naïve Bayes, K-Nearest Neighbors, logistic regression, support vector machines (SVM), decision trees, and random forest. Classification errors.
2. What is Machine Learning?
www.skillslash.com
The subfield of computer science that “gives computers the ability to learn without being explicitly programmed”.
(Arthur Samuel, 1959)
“A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E.”
(Tom Mitchell, 1997)
Using data for answering questions
40. High Bias and Low Variance (Low Flexibility)
Low Bias and High Variance (Too Much Flexibility)
Low Bias and Low Variance (Balanced Flexibility)
41. Bias Error:
Bias is the difference between the model's average prediction and the correct value. High bias gives a large error on both training and test data.
Variance Error:
Variance is the amount by which the estimate of the target function would change if different training data were used.
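The two definitions can be written out as a worked formula. This is a sketch under stated assumptions that are not on the slide: the data follow y = f(x) + ε with zero-mean noise of variance σ², D denotes a randomly drawn training set, and f̂_D is the model fitted on D. All of these symbols are introduced here for illustration.

```latex
\begin{align}
\operatorname{Bias}\big[\hat{f}(x)\big] &= \mathbb{E}_D\big[\hat{f}_D(x)\big] - f(x) \\
\operatorname{Var}\big[\hat{f}(x)\big] &= \mathbb{E}_D\Big[\big(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)]\big)^2\Big] \\
\mathbb{E}_{D,\varepsilon}\Big[\big(y - \hat{f}_D(x)\big)^2\Big] &= \operatorname{Bias}\big[\hat{f}(x)\big]^2 + \operatorname{Var}\big[\hat{f}(x)\big] + \sigma^2
\end{align}
```

Under these assumptions the expected squared error splits into a bias term, a variance term, and noise that no model can remove.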
50. Types of Supervised ML
Supervised / Unsupervised / Reinforcement
Classification
Output is a discrete variable (e.g., defaulter vs. non-defaulter, spam vs. non-spam, purchaser vs. non-purchaser)
Regression
Output is continuous (e.g., price of a house, temperature)
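The difference between a discrete and a continuous output can be illustrated with one algorithm used both ways. The following is a minimal sketch, assuming a k-nearest-neighbors model on a single feature; the toy datasets, the function names, and the choice of k = 3 are all hypothetical and not taken from the deck.

```python
from collections import Counter

def k_nearest(train, x, k):
    """Return the k training pairs (feature, target) closest to x."""
    return sorted(train, key=lambda pair: abs(pair[0] - x))[:k]

def knn_classify(train, x, k=3):
    """Classification: the output is a discrete label (majority vote)."""
    labels = [label for _, label in k_nearest(train, x, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(train, x, k=3):
    """Regression: the output is a continuous value (neighbor average)."""
    values = [value for _, value in k_nearest(train, x, k)]
    return sum(values) / len(values)

# Hypothetical toy data: one feature per example.
# Classification: number of suspicious words -> "spam" / "not spam".
emails = [(0, "not spam"), (1, "not spam"), (2, "not spam"),
          (7, "spam"), (8, "spam"), (9, "spam")]
# Regression: floor area in square meters -> price (arbitrary units).
houses = [(50, 100.0), (60, 120.0), (70, 140.0), (80, 160.0)]

label = knn_classify(emails, 8)   # discrete output
price = knn_regress(houses, 65)   # continuous output
print(label, price)
```

The neighbor search is identical in both cases; only the final step differs (a vote for classification, an average for regression).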
55. Types of Machine Learning Problems
Supervised
Learn through examples of which we know the desired output (what we want to predict).
• Is this a cat or a dog?
• Are these emails spam or not?
• Predict the market value of houses, given the square meters, number of rooms, neighborhood, etc.
56. Types of Machine Learning Problems
Unsupervised
There is no desired output. Learn something about the data. Latent relationships.
• I want to find anomalies in the credit card usage patterns of my customers.
• I have photos and want to put them in 20 groups.
57. Types of Machine Learning Problems
Unsupervised
Useful for learning structure in the data (clustering), finding hidden correlations, reducing dimensionality, etc.
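Clustering, the first use named on this slide, can be sketched with k-means, which the deck later lists among the unsupervised algorithms. This is a minimal illustration on one-dimensional points; the data, the starting centers, and the function name `kmeans_1d` are assumptions made for the example.

```python
def kmeans_1d(points, centers, iterations=10):
    """Plain k-means on 1-D points, starting from the given centers."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: attach every point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters

# Hypothetical unlabeled data with two visible groups.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 10.5]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

No labels are supplied at any point; the grouping comes only from distances between the points.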
58. Types of Machine Learning Problems
Reinforcement
An agent interacts with an environment and watches the result of the interaction.
The environment gives feedback via a positive or negative reward signal.
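The agent, environment, and reward loop can be sketched with tabular Q-learning, one of the reinforcement algorithms the deck lists. Everything concrete here is an assumption for illustration: a five-state corridor, a reward of 1 for reaching the last state, purely random exploration, and the values chosen for `ALPHA` and `GAMMA`.

```python
import random

N_STATES = 5          # states 0..4; state 4 is the goal (terminal)
ACTIONS = (-1, +1)    # move left or move right
ALPHA, GAMMA = 0.5, 0.9

def step(state, action):
    """Environment: returns (next_state, reward, done)."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

rng = random.Random(0)
q = [[0.0, 0.0] for _ in range(N_STATES)]   # q[state][action_index]

for _ in range(500):                         # episodes
    state, done = 0, False
    while not done:
        a = rng.randrange(2)                 # explore with random actions
        next_state, reward, done = step(state, ACTIONS[a])
        # Q-learning update toward reward + discounted best future value.
        target = reward + (0.0 if done else GAMMA * max(q[next_state]))
        q[state][a] += ALPHA * (target - q[state][a])
        state = next_state

# Greedy policy read off the learned table for the non-terminal states.
policy = ["left" if qs[0] > qs[1] else "right" for qs in q[:-1]]
print(policy)
```

The agent is never told that moving right is correct. It only observes rewards, and the learned values end up favoring the actions that lead toward the goal.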
60. Data Gathering
Might depend on human work
• Manual labeling for supervised learning.
• Domain knowledge. Maybe even experts.
May come for free, or “sort of”
• E.g., Machine Translation.
The more the better: some algorithms need large amounts of data to be useful (e.g., neural networks).
The quantity and quality of the data dictate the model's accuracy.
61. Data Preprocessing
Is there anything wrong with the data?
• Missing values
• Outliers
• Bad encoding (for text)
• Wrongly-labeled examples
• Biased data
• Do I have many more samples of one class than the rest?
Need to fix/remove data?
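Three of the checks in this list (missing values, outliers, class imbalance) are easy to automate. The sketch below uses only the Python standard library; the records, the use of `None` for a missing value, and the outlier cutoff of 5 median absolute deviations are assumptions chosen for the example, not rules from the deck.

```python
from collections import Counter
from statistics import median

# Hypothetical records: (age, label); None marks a missing value.
records = [(25, "ok"), (31, "ok"), (None, "ok"), (29, "ok"),
           (27, "ok"), (30, "fraud"), (240, "ok"), (28, "ok")]

# Missing values: count records where the feature is absent.
missing = sum(1 for age, _ in records if age is None)

# Outliers: flag values far from the median, measured in units of the
# median absolute deviation (MAD). The cutoff of 5 is an arbitrary choice.
ages = [age for age, _ in records if age is not None]
center = median(ages)
mad = median(abs(a - center) for a in ages)
outliers = [a for a in ages if abs(a - center) > 5 * mad]

# Class imbalance: compare how often each label occurs.
class_counts = Counter(label for _, label in records)

print(missing, outliers, dict(class_counts))
```

Whether a flagged record should be fixed or removed is still a judgment call; the code only surfaces candidates.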
62. Feature Engineering
What is a feature?
A feature is an individual measurable property of a phenomenon being observed.
Our inputs are represented by a set of features.
To classify spam email, features could be:
• Number of words that have been ch4ng3d like this.
• Language of the email
Example: “Buy ch34p drugs from the ph4rm4cy now :) :) :)” → feature engineering → (2, 0, 3)
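The mapping from the example email to the vector (2, 0, 3) can be sketched as a small function. The first two features follow the bullets on the slide. The third is not named there, so counting ":)" smileys is an assumption that happens to match the 3 in the vector. The language check is a hypothetical stub, not a real language detector.

```python
def extract_features(email):
    """Map raw email text to a numeric feature vector."""
    tokens = email.split()
    # Feature 1: words that mix letters and digits, such as "ch34p".
    altered = sum(1 for t in tokens
                  if any(c.isalpha() for c in t) and any(c.isdigit() for c in t))
    # Feature 2: language code. Hypothetical stub: 0 if common English
    # words appear, 1 otherwise. A real system would use a language detector.
    english_words = {"the", "from", "now", "and"}
    language = 0 if english_words & {t.lower() for t in tokens} else 1
    # Feature 3 (assumed): number of ":)" smileys in the text.
    smileys = email.count(":)")
    return (altered, language, smileys)

features = extract_features("Buy ch34p drugs from the ph4rm4cy now :) :) :)")
print(features)  # (2, 0, 3)
```

A learning algorithm never sees the raw text, only the numbers produced by a function like this one.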
63. Feature Engineering
Extract more information from existing data, not adding “new” data per se
• Making it more useful
• With good features, most algorithms can learn faster
It can be an art
• Requires thought and knowledge of the data
Two steps:
• Variable transformation (e.g., dates into weekdays, normalizing)
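Both variable transformations named on the slide fit in a few lines. This is a minimal sketch; the helper names, the sample dates, and the choice of min-max scaling as the normalization method are assumptions for illustration.

```python
from datetime import date

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def to_weekday(d):
    """Transform a calendar date into a weekday category."""
    return WEEKDAYS[d.weekday()]   # date.weekday(): Monday is 0

def min_max_normalize(values):
    """Rescale values linearly to [0, 1]. Assumes they are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

dates = [date(2024, 1, 1), date(2024, 1, 6)]
weekdays = [to_weekday(d) for d in dates]
scaled = min_max_normalize([10.0, 20.0, 30.0])
print(weekdays, scaled)
```

No new data is added in either case; the same information is re-expressed in a form a model can use more easily.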
64. Algorithm Selection & Training
Supervised
• Linear classifier
• Naive Bayes
• Support Vector Machines (SVM)
• Decision Tree
• Random Forests
• k-Nearest Neighbors
• Neural Networks (Deep learning)
Unsupervised
• PCA
• t-SNE
• k-means
• DBSCAN
Reinforcement
• SARSA–λ
• Q-Learning
65. THE MACHINE LEARNING FRAMEWORK
y = f(x)
(y: output, f: prediction function, x: image feature)
● Training: given a training set of labeled examples {(x_1, y_1), …, (x_N, y_N)}, estimate the prediction function f by minimizing the prediction error on the training set
● Testing: apply f to a never-before-seen test example x and output the predicted value y = f(x)
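The training and testing steps can be made concrete with the simplest possible prediction function. The sketch below assumes f is a straight line and that "prediction error" means squared error, which gives the ordinary least-squares fit; both choices, the function name `fit_line`, and the data are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Training: least-squares estimate of f(x) = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

# Labeled training examples (x_i, y_i), here generated by y = 2x + 1.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [1.0, 3.0, 5.0, 7.0]

f = fit_line(train_x, train_y)   # training: estimate f from the examples
prediction = f(10.0)             # testing: apply f to an unseen x
print(prediction)
```

The input x = 10 never appears in the training set, so the call to `f` is a prediction in the sense used on the slide.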
66. Algorithm Selection & Training
Goal of training: making the correct prediction as often as possible
• Incremental improvement: predict, then adjust, and repeat
• Use of metrics for evaluating performance and comparing solutions
• Hyperparameter tuning: more an art than a science
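The predict and adjust cycle can be sketched with gradient descent on a one-parameter model. This is one possible instance of incremental improvement, not the only one. The model y = w * x, the mean squared error metric, the toy data, and the learning rate of 0.05 are all assumptions for illustration.

```python
def mse(w, xs, ys):
    """Metric: mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

xs, ys = [1.0, 2.0, 3.0], [2.0, 4.0, 6.0]   # toy data generated by y = 2x
learning_rate = 0.05                         # hyperparameter, chosen by hand
w = 0.0
history = [mse(w, xs, ys)]

for _ in range(100):
    # Predict: run the current model on the training inputs.
    predictions = [w * x for x in xs]
    # Adjust: step against the gradient of the mean squared error.
    gradient = sum(2 * (p - y) * x
                   for p, x, y in zip(predictions, xs, ys)) / len(xs)
    w -= learning_rate * gradient
    history.append(mse(w, xs, ys))

print(round(w, 3), history[0], history[-1])
```

The learning rate is the kind of hyperparameter the slide refers to: too small and training is slow, too large and the updates overshoot and diverge.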
67. Summary
• Machine Learning is intelligent use of data to answer questions
• Enabled by an exponential increase in computing power and data availability
• Three big types of problems: supervised, unsupervised, reinforcement
• 5 steps to every machine learning solution:
1. Data Gathering
2. Data Preprocessing
3. Feature Engineering
4. Algorithm Selection & Training
5. Making Predictions
68. Generalization
● How well does a learned model generalize from the data it was trained on to a new test set?
Training set (labels known); test set (labels unknown)
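In practice the test set is often simulated by holding out part of the labeled data, so that generalization can be measured before deployment. A minimal sketch, assuming a random 75/25 split; the function name, the fraction, and the seed are illustrative choices.

```python
import random

def train_test_split(examples, test_fraction=0.25, seed=0):
    """Shuffle the examples and hold out a fraction as the test set."""
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

examples = [(x, 2 * x) for x in range(20)]   # hypothetical (input, label) pairs
train_set, test_set = train_test_split(examples)
print(len(train_set), len(test_set))
```

The model is fitted on `train_set` only. The held-out labels in `test_set` are used solely to score its predictions.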
69. Generalization
● Components of generalization error
○ Bias: how much does the average model over all training sets differ from the true model?
■ Error due to inaccurate assumptions/simplifications made by the model
■ Using too few features
○ Variance: how much do models estimated from different training sets differ from each other?
● Underfitting: model is too “simple” to represent all the relevant class characteristics
○ High bias and low variance
○ High training error and high test error
● Overfitting: model is too “complex” and fits irrelevant characteristics (noise) in the data
○ Low bias and high variance
○ Low training error and high test error
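The two error patterns can be reproduced with two deliberately extreme models. This is a sketch under assumptions: the true relation y = 3x, the noise level, the sample sizes, and the seed are invented for the example, and the two models (a constant predictor and a 1-nearest-neighbor memorizer) stand in for "too simple" and "too complex".

```python
import random

rng = random.Random(1)

def make_data(n):
    """Noisy samples of the (assumed) true relation y = 3x."""
    xs = [rng.uniform(0, 1) for _ in range(n)]
    return [(x, 3 * x + rng.gauss(0, 0.5)) for x in xs]

def mse(model, data):
    """Mean squared error of a model on a list of (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

train, test = make_data(30), make_data(30)

# Too simple: always predict the mean training label, ignoring x.
mean_y = sum(y for _, y in train) / len(train)
underfit = lambda x: mean_y

# Too complex: memorize the training set and return the label of the
# single closest training point (1-nearest neighbor), noise included.
overfit = lambda x: min(train, key=lambda p: abs(p[0] - x))[1]

for name, model in [("underfit", underfit), ("overfit", overfit)]:
    print(name, round(mse(model, train), 3), round(mse(model, test), 3))
```

The memorizing model reproduces its training labels exactly, so its training error is zero while its test error is not. The constant model cannot follow the trend at all, so it typically shows a high error on both sets.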
72. Bias-Variance Trade-off
• Models with too few parameters are inaccurate because of a large bias (not enough flexibility).
• Bias can also come from wrong assumptions.
• This leads to high training error.
• Models with too many parameters are inaccurate because of a large variance (too much sensitivity to the sample).
• This leads to high test error.
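The trade-off can be estimated numerically by refitting a model on many freshly drawn training sets and looking at its predictions at one query point. The sketch below is a simulation under assumptions: the true function 4x, the noise level, the query point 0.9, and the two models (a constant predictor with one parameter, and a nearest-neighbor predictor that follows each sample closely) are all chosen for illustration.

```python
import random

rng = random.Random(0)
XS = [i / 10 for i in range(11)]        # fixed inputs 0.0, 0.1, ..., 1.0
X0 = XS[9]                              # query point x = 0.9

def true_f(x):
    return 4 * x                        # assumed ground-truth function

def sample_training_set():
    """Draw a new noisy training set at the fixed inputs."""
    return [(x, true_f(x) + rng.gauss(0, 0.5)) for x in XS]

def constant_model(train):
    """Few parameters: predict the mean label everywhere."""
    return sum(y for _, y in train) / len(train)

def nearest_neighbor_model(train):
    """Very flexible: predict the label of the closest training point."""
    return min(train, key=lambda p: abs(p[0] - X0))[1]

def bias2_and_variance(model, trials=500):
    """Estimate squared bias and variance of the prediction at X0."""
    preds = [model(sample_training_set()) for _ in range(trials)]
    mean_pred = sum(preds) / trials
    bias2 = (mean_pred - true_f(X0)) ** 2
    variance = sum((p - mean_pred) ** 2 for p in preds) / trials
    return bias2, variance

simple = bias2_and_variance(constant_model)
flexible = bias2_and_variance(nearest_neighbor_model)
print("constant model   bias^2=%.3f variance=%.3f" % simple)
print("nearest neighbor bias^2=%.3f variance=%.3f" % flexible)
```

In this setup the constant model is stable across training sets but systematically wrong at the query point, while the nearest-neighbor model is right on average but changes with every sample.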