MachineLearning_AishwaryaCR
5. What is Machine Learning
Definitions
“The field of study that gives computers/machines the ability to learn
without being explicitly programmed” - Arthur Samuel
“A computer program is said to learn from experience E with respect to
some class of tasks T and performance measure P, if its performance at
tasks in T, as measured by P, improves with experience E” – Tom Mitchell
7. Supervised Learning
Regression
• Used for continuous value responses
• When the output is not bound to a defined set of values
• Example: predicting the price of a house given its size in square feet
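The house-price example can be sketched as a one-variable least-squares fit. This is a minimal illustration only: the function name `fit_line` and the size/price figures are made up for the sketch and do not come from the slides.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalised).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: house size in square feet and sale price.
sizes = [1000, 1500, 2000, 2500]
prices = [200000, 300000, 400000, 500000]

slope, intercept = fit_line(sizes, prices)

# The response is continuous: any size maps to some price on the fitted line.
predicted_price = slope * 1800 + intercept
print(predicted_price)
```

The output is not restricted to a fixed set of values, which is what makes this a regression problem rather than a classification problem.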
Classification
• Used for categorical response values, where the data can be separated
into specific “classes”
• When the output is bound to a discrete set of values
• Example: deciding whether an incoming email should go to the inbox or the spam folder
8. Regression vs. Classification
Linear Regression: algorithm used to predict continuous values
Logistic Regression: algorithm used to predict outputs that belong to predefined classes
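To make the contrast concrete, here is a minimal sketch of logistic regression trained with plain gradient descent on a single feature. The feature (a count of suspicious words per email), the data, the learning rate and the function names are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(xs, labels, lr=0.1, epochs=2000):
    """Fit weight w and bias b by gradient descent on the log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, labels):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def classify(x, w, b):
    """Return 1 (spam) when the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


# Hypothetical data: suspicious-word count per email, label 1 = spam.
word_counts = [0, 1, 2, 6, 7, 8]
is_spam = [0, 0, 0, 1, 1, 1]

w, b = train_logistic(word_counts, is_spam)
print(classify(1, w, b), classify(7, w, b))
```

Unlike the linear regression case, the final output is one of a predefined set of classes; the continuous probability is only an intermediate step.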
9. Unsupervised Learning
Clustering
• Example: Google News, where news stories on the same topic are grouped into one category
• Deriving structure from chaotic or unstructured data
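One common clustering algorithm is k-means; the slides do not name a specific algorithm, so it is used here only as a representative sketch. The points and starting centroids are made up, and no labels are supplied at any point.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                # Mean of each coordinate across the cluster.
                new_centroids.append(
                    tuple(sum(c) / len(cluster) for c in zip(*cluster)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups.
points = [(1, 1), (1.5, 2), (1, 0.5), (8, 8), (9, 9.5), (8.5, 9)]
centroids, clusters = kmeans(points, [points[0], points[3]])
print(clusters)
```

The grouping emerges from the data alone, which is the defining feature of unsupervised learning.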
Non-Clustering
• Extracting a particular piece of data from a set of unstructured data
• Example: removing the voices of males above the age of 25 from voice
samples collected at a cocktail party
10. What is Deep Learning then?
• Deep Learning is a subset of Machine Learning
• Deep Learning is most often used to solve supervised Machine Learning
problems, although it is also applied to unsupervised problems
14. Deep Learning Frameworks
“Deep learning framework offers building blocks for designing, training and validating deep neural network through a high-level programming interface”
• A Tensor Object
• Operations on the Tensor Object
• A Computational Graph
• Optimizations
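The idea of a computational graph can be sketched in a few lines. This toy version works on single numbers rather than tensors and is not the API of any real framework; the class name `Node` and its methods are invented for illustration.

```python
class Node:
    """A scalar value in a computational graph, together with its gradient."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each entry is (parent_node, d(this node) / d(parent_node)).
        self.parents = parents

    def __add__(self, other):
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))

    def backward(self):
        """Propagate gradients from this node back to every input."""
        order, seen = [], set()

        def visit(node):
            # Depth-first walk so that parents appear before children.
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Apply the chain rule from the output towards the inputs.
        for node in reversed(order):
            for parent, local_grad in node.parents:
                parent.grad += local_grad * node.grad


# Build the graph for y = w * x + b, then ask for the gradients.
x, w, b = Node(3.0), Node(2.0), Node(1.0)
y = w * x + b
y.backward()
print(y.value, w.grad, x.grad, b.grad)
```

An optimizer would then use gradients such as `w.grad` to adjust the parameters; real frameworks apply the same recording-and-replaying idea to whole tensors.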
21. Microsoft solutions for solving an ML problem
[Diagram: data sources (Social, LOB, Graph, IoT, Image, CRM) pass through four stages, Ingest → Store → Prep & Train → Model & Serve, and end in Apps + insights. Components named in the diagram: data orchestration and monitoring; data lake and storage; Hadoop/Spark/SQL and ML; IoT; Azure Machine Learning.]
22. Data Wrangling / Data Preparation / Data Cleaning
• Advanced Analytics
• Visualization
• Observation
• Statistics
24. Evaluating a model
• A model is created from the training data and then tested against the test
data
• Divide the data into 2 sets
• Training Data
• Test Data
• How good is the model against real data? Score the model against the
test data: y_predict vs. y_test
• What is the accuracy of the prediction?
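The evaluation steps above can be sketched as follows. The helper names, the 25% test fraction and the toy threshold "model" are assumptions for illustration; only `y_test` and `y_predict` follow the naming used on the slide.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy(y_test, y_predict):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for actual, predicted in zip(y_test, y_predict)
                  if actual == predicted)
    return correct / len(y_test)


# Hypothetical labeled data: (feature, label), label is 1 when feature >= 10.
data = [(x, 1 if x >= 10 else 0) for x in range(20)]
train, test = train_test_split(data)

# "Train" a one-parameter model using the training rows only:
# a threshold halfway between the two classes.
largest_negative = max(x for x, label in train if label == 0)
smallest_positive = min(x for x, label in train if label == 1)
threshold = (largest_negative + smallest_positive) / 2

# Score the model on rows it has never seen.
y_test = [label for _, label in test]
y_predict = [1 if x >= threshold else 0 for x, _ in test]
test_accuracy = accuracy(y_test, y_predict)
print(test_accuracy)
```

Keeping the test rows out of training is the point of the split: accuracy measured on the training data would overstate how well the model handles new data.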
25. Deploying a model
• Deploying a model includes creating a web service that your applications can
consume
• This can be achieved with the model management service
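As a rough picture of what "a web service that your applications can consume" means, here is a minimal sketch using only the Python standard library. It is not the Azure model management service; the model coefficients, the JSON field names and the port are all hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical "trained model": price = SLOPE * size + INTERCEPT.
SLOPE, INTERCEPT = 150.0, 20000.0


def predict(size_sqft):
    """Apply the stand-in model to one input."""
    return SLOPE * size_sqft + INTERCEPT


def handle_request(body):
    """Turn a JSON request body (bytes) into a JSON response body (bytes)."""
    payload = json.loads(body)
    price = predict(float(payload["size_sqft"]))
    return json.dumps({"predicted_price": price}).encode("utf-8")


class ScoringHandler(BaseHTTPRequestHandler):
    """Answers POST requests with the model's prediction."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        response = handle_request(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(response)


def serve(port=8000):
    """Start the scoring service; blocks until interrupted."""
    HTTPServer(("localhost", port), ScoringHandler).serve_forever()
```

Calling `serve()` starts the service; a client application would then POST a body such as `{"size_sqft": 1000}` and read the prediction from the JSON reply. A managed service adds what this sketch leaves out, such as versioning, scaling and authentication.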
27. Azure Machine Learning Studio
Platform for emerging data scientists to
graphically build and deploy experiments
• Rapid experiment composition
• More than 100 easily configured modules for data
prep, training, evaluation
• Extensibility through R & Python
• Serverless training and deployment
Some numbers
• Hundreds of thousands of deployed models
serving billions of requests
Editor's Notes
R-squared is a statistical measure of how close the data are to the fitted regression line. It is also known as the coefficient of determination, or the coefficient of multiple determination for multiple regression. 0% indicates that the model explains none of the variability of the response data around its mean; 100% indicates that it explains all of it.
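Root mean squared error (RMSE) is the square root of the mean of the squared differences between the predicted values and the actual values: the squared differences are summed, divided by the number of examples, and the square root of the result is taken.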
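Both measures can be computed directly from their definitions. The sketch below uses made-up values for `y_test` and `y_predict`, and the function names are chosen for illustration.

```python
import math


def rmse(y_true, y_pred):
    """Square root of the mean squared difference between truth and prediction."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r_squared(y_true, y_pred):
    """1 minus (unexplained variation / total variation around the mean)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted values for four test examples.
y_test = [3.0, 5.0, 7.0, 9.0]
y_predict = [2.0, 5.0, 8.0, 9.0]

print(rmse(y_test, y_predict))       # about 0.707
print(r_squared(y_test, y_predict))  # about 0.9, i.e. 90%
```

RMSE is expressed in the same units as the response, while R-squared is unitless, so the two are usually reported together.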