Automated machine learning (AutoML) uses algorithms to automate the machine learning workflow, including data preprocessing, model selection, hyperparameter tuning, and evaluation, to build an optimal machine learning model with little or no human involvement. It saves time by automating repetitive tasks and helps identify the best-performing models for various types of machine learning problems, such as classification, regression, and clustering. AutoML tools provide an end-to-end experience for building, deploying, and managing machine learning models at scale, with minimal coding or machine learning expertise required.
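As an illustration (not tied to any particular AutoML product), the core loop of automated model selection can be sketched in a few lines of Python. The threshold "models" and the data here are invented for the example; real AutoML tools search over whole algorithms and hyperparameter spaces in the same try-score-keep-best way.

```python
# A minimal sketch of what an AutoML loop does: try several candidate
# model configurations, score each one, and keep the best.
# The "models" here are simple threshold rules on one feature.

def make_threshold_model(threshold):
    """Classify a value as 1 (positive) if it exceeds the threshold."""
    return lambda x: 1 if x > threshold else 0

def accuracy(model, data):
    """Fraction of (feature, label) pairs the model gets right."""
    return sum(model(x) == y for x, y in data) / len(data)

# Toy labeled data: (feature, label)
train = [(1, 0), (2, 0), (3, 0), (6, 1), (7, 1), (9, 1)]

# "Hyperparameter search": evaluate every candidate threshold
candidates = [make_threshold_model(t) for t in range(0, 10)]
best = max(candidates, key=lambda m: accuracy(m, train))
print(accuracy(best, train))  # 1.0 (a threshold between 3 and 6 separates the classes)
```

The same pattern scales up: swap the candidate list for a grid of algorithms and hyperparameters, and the scoring function for cross-validated accuracy.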
The proposed system overcomes the above-mentioned issue in an efficient way. It aims at analyzing the number of fraudulent transactions present in the dataset.
Innovations in technology have revolutionized financial services to such an extent that large financial institutions like Goldman Sachs claim to be technology companies! It is no secret that technological innovations like data science and AI are fundamentally changing how financial products are created, tested, and delivered. While it is exciting to learn about the technologies themselves, there is very little guidance available on how companies and financial professionals should retool and gear themselves for the upcoming revolution.
In this master class, we will discuss key innovations in Data Science and AI and connect applications of these novel fields in forecasting and optimization. Through case studies and examples, we will demonstrate why now is the time you should invest to learn about the topics that will reshape the financial services industry of the future!
AI in Finance
Types of Machine Learning - Tanvir Siddike Moin
Machine learning can be broadly categorized into four main types based on how they learn from data:
Supervised Learning: Imagine a teacher showing you labeled examples (like classifying pictures of cats and dogs). Supervised learning algorithms learn from labeled data, where each data point has a corresponding answer or label. The algorithm analyzes the data and learns to map the inputs to the desired outputs. This is commonly used for tasks like spam filtering, image recognition, and weather prediction.
Unsupervised Learning: Unlike supervised learning, unsupervised learning deals with unlabeled data. It's like being given a pile of toys and asked to organize them however you see fit. The algorithm finds hidden patterns or structures within the data. This is useful for tasks like customer segmentation, anomaly detection, and recommendation systems.
Reinforcement Learning: This is inspired by how humans learn through trial and error. The algorithm interacts with its environment and receives rewards for good decisions and penalties for bad ones. Over time, it learns to take actions that maximize the rewards. This is used in applications like training self-driving cars and playing games like chess.
Semi-Supervised Learning: This combines aspects of supervised and unsupervised learning. It leverages a small amount of labeled data along with a larger amount of unlabeled data to improve the learning process. This is beneficial when labeled data is scarce or expensive to obtain.
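As a concrete illustration of the supervised case above, here is a minimal, self-contained 1-nearest-neighbour classifier in Python: it "learns" from labelled examples simply by storing them, then predicts the label of the closest stored example. The animal data is invented for the example.

```python
# A minimal supervised-learning sketch: a 1-nearest-neighbour classifier
# learns from labelled examples and predicts the label of an unseen point.

def nearest_neighbour(train, point):
    """Return the label of the training example closest to `point`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    features, label = min(train, key=lambda ex: dist(ex[0], point))
    return label

# Labelled data: ((weight_kg, ear_length_cm), animal)
train = [((4.0, 7.0), "cat"), ((3.5, 6.5), "cat"),
         ((30.0, 12.0), "dog"), ((25.0, 11.0), "dog")]

print(nearest_neighbour(train, (5.0, 7.5)))   # cat
print(nearest_neighbour(train, (28.0, 11.5))) # dog
```

Unsupervised learning, by contrast, would receive only the feature pairs with no animal labels and have to discover the two groups on its own.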
Machine intelligence data science methodology 060420 - Jeremy Lehman
Machine learning and artificial intelligence project methodology that focuses on business results, builds alignment across the entire business, and forms enduring capabilities.
Machine Learning 2 Deep Learning: An Intro - Si Krishan
Provides a brief introduction to machine learning, reasons for its popularity, a simple walk-through example, and then the need for deep learning and some of its characteristics. This is an updated version of an earlier presentation.
Machine Learning has become a must to improve insight, quality and time to market. But it's also been called the 'high interest credit card of technical debt' with challenges in managing both how it's applied and how its results are consumed.
Building Highly Available and Scalable Machine Learning Applications - Yalçın Yenigün
The slides contain some high-level information about machine learning algorithms, cross-validation, and feature-extraction techniques, as well as high-level approaches to building highly available and scalable ML products.
Doing Analytics Right - Designing and Automating Analytics - Tasktop
There is no “one size fits all” for development analytics. It is not as simple as “here are the measures you need, go implement them.” The world of software delivery is too complex, and software organizations differ too significantly, to make it that simple. As discussed in the first webinar, the analytics you need depend on your unique business goals and environment.
That said, the design of your analytics solution will still require:
* the dashboards,
* the required data, and
* an appropriate choice of analytical techniques and statistics to apply to the data.
This webinar will describe a straightforward method for finding your analytic solution. In particular, we will explain how to adapt the Goal, Question, Metric (GQM) method to development processes. In addition, we will explain how to avoid “the light is brighter here” analytics anti-pattern: the idea that organizations tend to design metrics programs around the data they can easily get, rather than figuring out how to get the data they really need.
What is data mining? The process of analyzing data to discover hidden patterns and relationships that can help you manage and improve your business.
Check out: www.eleaderstochange.com
Follow #eleaders2change
Supervised learning is a fundamental concept in machine learning, where a computer algorithm learns from labeled data to make predictions or decisions. It is a type of machine learning paradigm that involves training a model on a dataset where both the input data and the corresponding desired output (or target) are provided. The goal of supervised learning is to learn a mapping or relationship between inputs and outputs so that the model can make accurate predictions on new, unseen data.
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ... - Subhajit Sahu
Abstract: Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It does, however, come with the precondition that the input graph contain no dead ends. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large number of small workload submissions, and is expected to be a non-issue when the computation is performed on massive graphs.
As Europe's leading economic powerhouse and the fourth-largest economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like Russia and China, Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to Advanced Persistent Threats (APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Techniques to optimize the PageRank algorithm usually fall into two categories. One tries to reduce the work per iteration, and the other tries to reduce the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, which share the same in-links, helps reduce duplicate computations and thus could reduce iteration time. Road networks often have chains which can be short-circuited before PageRank computation to improve performance, since the final ranks of chain nodes can be easily calculated; this could reduce both the iteration time and the number of iterations. If a graph has no dangling nodes, the PageRank of each strongly connected component can be computed in topological order. This could help reduce the iteration time and the number of iterations, and also enables multi-iteration concurrency in the PageRank computation. The combination of all of the above methods is the STICD algorithm [sticd]. For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
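For reference, all of these optimizations start from the standard (monolithic) power-iteration method, which can be sketched as follows. This is a plain Python illustration assuming a dead-end-free graph (the same precondition noted above), not the STICD implementation itself.

```python
# Standard ("monolithic") PageRank power iteration: every vertex is
# recomputed each iteration until the ranks converge.
# `graph` maps each vertex to its list of out-links.

def pagerank(graph, damping=0.85, tol=1e-10, max_iter=100):
    n = len(graph)
    ranks = {v: 1.0 / n for v in graph}
    for _ in range(max_iter):
        # Teleport term, then distribute each vertex's rank to its out-links
        new = {v: (1.0 - damping) / n for v in graph}
        for v, outs in graph.items():
            share = damping * ranks[v] / len(outs)
            for u in outs:
                new[u] += share
        if sum(abs(new[v] - ranks[v]) for v in graph) < tol:
            return new
        ranks = new
    return ranks

graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}  # no dead ends
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # c
```

The optimizations described above (skipping converged vertices, short-circuiting chains, processing components in topological order) all reduce how much of this inner loop actually has to run.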
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... - John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
2. What machine learning does
Find patterns in data
Uses those patterns to predict the future
Examples:
- Detecting credit card fraud
- Determining whether a customer is likely to switch to a competitor
- Deciding when to do preventive maintenance on a factory robot
3. What does it mean to learn?
How did you learn to read?
Learning requires:
- Identifying patterns
- Recognizing those patterns when you see them again
“This is what machine learning does”
4. Finding Patterns: A Simple Example

Name    Amount  Fraudulent
Amit    $3200   No
Rahul   $1300   Yes
Ramesh  $5700   Yes
Vinay   $5700   No

What's the pattern for fraudulent transactions?
5. Finding Patterns: Another Example

Name    Amount  Where Issued  Where Used  Age  Fraudulent
Sameer  $2500   USA           USA         23   No
Payal   $2394   USA           RUS         27   Yes
Piyush  $1009   USA           RUS         25   Yes
Amit    $8488   FRA           USA         63   No
Peter   $298    AUS           JAP         59   No
Jones   $3150   USA           RUS         43   No
Harry   $8155   USA           RUS         25   Yes
Mark    $7475   UK            GER         31   No
Nancy   $550    USA           RUS         26   No
Eli     $7345   USA           RUS         19   Yes
7. Relationship between AI and ML
Artificial Intelligence
An umbrella term, which we can loosely describe as: “all the various techniques we might use to make a computer do something smart”
Machine Learning
A successful subset of AI that focuses on learning from data, not on programming explicit rules to follow
9. Rule-based Programming vs ML
Rule-based programming: we write explicit rules to follow
ML: we provide data to learn from
10. Rule-based Analysis
Problem statement is fairly simple
Rules are straightforward and can be easily codified
Rules can change frequently
Few problem instances to train an ML model
11. ML-based Analysis
Problem statement is reasonably complex
Hard to find patterns using visualizations and other exploratory tools
Decision variables sensitive to data, need to change as new information is received
Large corpus available to train models
12. ML-based and Rule-based Models
ML-based
• Dynamic: alters output based on patterns in data
• Expert skill not needed, but you need an intuition for how models work
• To update the model, update the corpus
• Requires a large, high-quality data corpus
• Cannot operate on a single problem instance
Rule-based
• Static: rules are applied independent of the data
• Experts are vital for formulating rules, and depend on the problem
• To update the model, you need to update the rules, i.e. recode the model
• No corpus required
• Can operate on isolated problem instances
14. Machine Learning Types
Supervised Learning
• Classification: classify input data into categories
• Regression: predict continuous numeric values
Unsupervised Learning
• Clustering: self-discover patterns and groupings in data
15. Classification
• Classify input data into categories
• Make predictions in a non-continuous form
• Classification is binary (either A or B) or multiclass (A or B or C, etc.)
• Supervised learning
Examples: Binary classification
• Email: spam or not spam?
• Identify sentiment as positive or negative.
• Determine whether a patient's lab sample is cancerous.
Examples: Multiclass classification
• Email filters: spam, junk, or good.
• Stocks: buy, sell, or hold?
• Images: cat, dog, or mouse?
• Positive, negative, or neutral sentiment?
17. Regression
• Predict a numeric value, typically in continuous form
• Learn from labeled historical data to predict or forecast new values
• Supervised learning
Examples:
• Given past stock data, predict tomorrow's price
• Given the location and attributes of a home, predict its price
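A minimal illustration of regression: fitting a straight line y = a*x + b by ordinary least squares and using it to predict a new value. The home-price data is invented for the example.

```python
# A minimal regression sketch: fit a line to labelled historical data
# by ordinary least squares, then predict a new value.

def fit_line(xs, ys):
    """Return slope a and intercept b of the least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

# Toy home-price data: size in square metres -> price in $1000s
sizes = [50, 70, 90, 110]
prices = [150, 210, 270, 330]
a, b = fit_line(sizes, prices)
print(a * 100 + b)  # predicted price for a 100 m^2 home: 300.0
```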
19. Clustering
• Find cases with similar characteristics in an unlabeled dataset and group them together
• Unsupervised learning
Examples:
• Document discovery: find all documents related to homicide cases
• Social media ad targeting: find all users who are interested in sports
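A minimal illustration of clustering: a one-dimensional k-means sketch that groups unlabelled values around k centroids, with no labels involved. The purchase amounts are invented for the example.

```python
import random

# A minimal unsupervised-learning sketch: 1-D k-means clustering.
# Repeatedly assign each value to its nearest centroid, then move each
# centroid to the mean of its assigned values.

def kmeans_1d(values, k, iters=20, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(values, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            i = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[i].append(v)
        # Keep the old centroid if a cluster ends up empty
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two obvious groups of purchase amounts
amounts = [1, 2, 3, 50, 51, 52]
print(kmeans_1d(amounts, 2))  # [2.0, 51.0]
```

Note that the algorithm never sees a label; it discovers the two groups purely from the structure of the data, which is what distinguishes clustering from classification.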
22. Figure out the right machine learning type?
Predicting the online sales volume for the next financial quarter?
Analyzing X-ray images to detect whether a person has pneumonia?
Grouping together online shoppers with similar traits for targeted marketing?
Forecasting stock market index values based on macro economic changes?
Processing new tweets to categorize them as positive or negative?
Predict ice cream sales based on the weather forecast?
Determine the amount of credit to give to a customer?
Determine if a social media post has positive or negative sentiment?
Approve or reject a customer's application for credit?
23. Questions: Figure out the right machine learning type?
Predicting the online sales volume for the next financial quarter? -> Regression
Analyzing X-ray images to detect whether a person has pneumonia? -> Classification
Grouping together online shoppers with similar traits for targeted marketing? -> Clustering
Forecasting stock market index values based on macro economic changes? -> Regression
Processing new tweets to categorize them as positive or negative? -> Classification
Predict ice cream sales based on the weather forecast? -> Regression
Determine the amount of credit to give to a customer? -> Regression
Determine if a social media post has positive or negative sentiment? -> Classification
Approve or reject a customer's application for credit? -> Classification
26. Feature Selection
Feature selection is the process where you automatically or manually select the features which contribute most to the prediction variable or output you are interested in.
• Feature selection has a huge impact on the performance of the model in ML
• You need to identify and remove irrelevant or partially relevant features
• Principle: minimum redundancy and maximum relevance
Benefits of selecting features:
• Reduces overfitting: less redundant data means less opportunity to make decisions based on noise.
• Improves accuracy: less misleading data means modeling accuracy improves.
• Reduces training time: fewer data points reduce algorithm complexity, and algorithms train faster.
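One simple way to illustrate the "maximum relevance" half of the principle is to rank features by their absolute Pearson correlation with the target and keep the strongest. This Python sketch uses invented data and ignores the redundancy half, which real selectors also account for.

```python
# A minimal feature-selection sketch: score each feature by its absolute
# correlation with the target (a crude "maximum relevance" filter).

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Invented feature columns and a binary target (e.g. fraudulent or not)
features = {
    "amount": [100, 200, 300, 400],
    "age":    [25, 61, 22, 58],
    "misc":   [7, 9, 3, 8],
}
target = [0, 1, 0, 1]

scores = {name: abs(pearson(col, target)) for name, col in features.items()}
best = max(scores, key=scores.get)
print(best)  # age (it correlates most strongly with the target here)
```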
27. Feature Engineering
• Feature engineering is the process of creating new features from raw data to increase the predictive power of the machine learning model.
• Engineered features capture additional information that is not available in the original feature set.
• Examples of feature engineering are aggregating data, calculating a moving average, and calculating the difference over time.
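The moving-average and difference-over-time examples above can be sketched directly in Python; the sales series is invented for the example.

```python
# A minimal feature-engineering sketch: derive new columns (a moving
# average and a period-over-period difference) from a raw series.

def moving_average(values, window):
    """Mean of each `window`-length run of consecutive values."""
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]

def differences(values):
    """Change from each value to the next."""
    return [b - a for a, b in zip(values, values[1:])]

sales = [10, 12, 11, 15, 18, 17]
print(moving_average(sales, 3))  # [11.0, 12.666..., 14.666..., 16.666...]
print(differences(sales))        # [2, -1, 4, 3, -1]
```

Either derived series could then be fed to a model as an extra feature alongside the raw values.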
28. Question: Feature Engineering
Question: Splitting the address field into country, city, and street number can be appropriately mapped to which machine learning task?
A. Feature engineering
B. Feature selection
Question: You need to map the right learning task for a given scenario: “Picking temperature and pressure to train a weather model”
A. Feature engineering
B. Feature selection
32. Training vs Validation vs Testing Dataset
• The training dataset is the sample of data used to train the model. It is the largest sample of data used when creating a machine learning model.
• The validation dataset is a second sample of data used to evaluate whether the model can correctly predict, or classify, using data it has not seen before. The validation dataset is used to tune the model, and helps to get an unbiased evaluation of the model while tuning its hyperparameters.
• The testing dataset is a set of data used to provide a final unbiased evaluation of the model. A test dataset is an independent sample of data and is used once a model has been completely trained with the training and validation datasets.
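A minimal sketch of such a three-way split in Python. The 70/15/15 ratios are a common convention, not a requirement, and the row data is invented for the example.

```python
import random

# A minimal train/validation/test split: shuffle the rows, then carve
# off the test and validation portions, leaving the rest for training.

def split_dataset(rows, val_frac=0.15, test_frac=0.15, seed=42):
    rows = rows[:]  # don't mutate the caller's list
    random.Random(seed).shuffle(rows)
    n = len(rows)
    n_test = int(n * test_frac)
    n_val = int(n * val_frac)
    test = rows[:n_test]
    val = rows[n_test:n_test + n_val]
    train = rows[n_test + n_val:]
    return train, val, test

rows = list(range(100))
train, val, test = split_dataset(rows)
print(len(train), len(val), len(test))  # 70 15 15
```

The key property is that the three samples are disjoint: the model never trains on the rows used to tune it or to score it at the end.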
33. Question
Which two datasets do you use to build a machine learning model? Each correct answer presents part of the solution. Choose the correct answers.
1. Training dataset
2. Azure Open Dataset
3. Validation dataset
4. Testing dataset
39. Workspace: Associated resources
• Azure Storage account: used as the default datastore for the workspace. Jupyter notebooks that are used with your Azure Machine Learning compute instances are stored here as well.
• Azure Container Registry: registers Docker containers that you use during training and when you deploy a model.
• Azure Application Insights: stores monitoring information about your models.
• Azure Key Vault: stores secrets that are used by compute targets and other sensitive information that's needed by the workspace.
41. Machine Learning Studio
• Experiments are training runs you use to build your models.
• Pipelines are reusable workflows for training and retraining your model.
• Datasets aid in management of the data you use for model training and pipeline creation.
• Once you have a model you want to deploy, you create a registered model.
• Use the registered model and a scoring script to create a deployment endpoint.
• Compute targets are used to run your experiments.
Compute Instances: a compute instance is used as a compute target for authoring and training models for development and testing purposes.
Compute Clusters: scalable clusters of virtual machines for on-demand processing of experiment code, for example running batch inference on large amounts of data.
Inference Clusters: deployment targets for predictive services that use your trained models.
Attached Compute: links to existing Azure compute resources, such as Virtual Machines or Azure Databricks clusters.
42. Metric: Regression
• Mean Absolute Error (MAE): the average difference between predicted values and true values. This value is based on the same units as the label, in this case dollars. The lower this value is, the better the model is predicting.
• Root Mean Squared Error (RMSE): the square root of the mean squared difference between predicted and true values. The result is a metric based on the same unit as the label (dollars). When compared to the MAE (above), a larger difference indicates greater variance in the individual errors (for example, with some errors being very small, while others are large).
• Relative Squared Error (RSE): a relative metric between 0 and 1 based on the square of the differences between predicted and true values. The closer to 0 this metric is, the better the model is performing. Because this metric is relative, it can be used to compare models where the labels are in different units.
• Relative Absolute Error (RAE): a relative metric between 0 and 1 based on the absolute differences between predicted and true values. The closer to 0 this metric is, the better the model is performing. Like RSE, this metric can be used to compare models where the labels are in different units.
• Coefficient of Determination (R2): a measure of the variance from the mean captured in the model's predictions. Its value varies between 0 and 1, where 1 typically indicates a perfectly fit model, while 0 indicates a random one. This metric is more commonly referred to as R-squared.
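The first of these metrics are straightforward to compute by hand. A Python sketch with invented predictions, covering MAE, RMSE, and R2:

```python
# A minimal sketch of the regression metrics above, computed directly
# from lists of predicted and true values.

def mae(pred, true):
    """Mean Absolute Error: average size of the errors."""
    return sum(abs(p - t) for p, t in zip(pred, true)) / len(true)

def rmse(pred, true):
    """Root Mean Squared Error: penalizes large errors more than MAE."""
    return (sum((p - t) ** 2 for p, t in zip(pred, true)) / len(true)) ** 0.5

def r2(pred, true):
    """Coefficient of determination: 1 is a perfect fit."""
    mean_t = sum(true) / len(true)
    ss_res = sum((t - p) ** 2 for p, t in zip(pred, true))
    ss_tot = sum((t - mean_t) ** 2 for t in true)
    return 1 - ss_res / ss_tot

true = [100.0, 200.0, 300.0, 400.0]
pred = [110.0, 190.0, 310.0, 390.0]
print(mae(pred, true))   # 10.0
print(rmse(pred, true))  # 10.0
print(r2(pred, true))    # 0.992
```

Here every error is exactly 10, so MAE and RMSE coincide; with unequal errors RMSE would be the larger of the two, reflecting the greater variance mentioned above.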
44. Questions: ML Studio
Question: A dataset contains attributes that have values in different units with different ranges of values. Which data preprocessing method is used to transform the values into a common scale?
• Normalization
• Binning
• Substitution
• Sampling
Question: Split data
• You can divide a dataset using a regular expression.
• You can split a dataset for training/testing by rows.
• You can split a dataset for training/testing by columns.
Question: You need to hold back a dataset from model training so it can be used to estimate a model's prediction error while tuning its hyperparameters. Which dataset should you use?
• Training dataset
• Testing dataset
• Raw data
• Validation dataset
45. Questions: ML Studio
Question: What should you do to measure the accuracy of a trained machine learning model?
• Score the model.
• Summarize the data.
• Normalize the data.
• Create features.
Question: What should you do to measure the accuracy of the predictions and assess model fit?
• Evaluate the model.
• Detect languages.
• Score the model.
• Evaluate the probability function.
48. Question: Which modules should you use to complete the pipeline? To answer, drag the appropriate module to the relevant slots. A module may be used once, more than once, or not at all.
49. Metric: Classification
• Accuracy: the ratio of correct predictions (true positives + true negatives) to the total number of predictions. In other words, what proportion of diabetes predictions did the model get right?
• Precision: a measure of the correct positive results: the number of true positives divided by the sum of true positives and false positives. Precision is scored between 0 and 1; values closer to 1 are better.
• Recall: the fraction of the actual positive cases that the model classified as positive: the number of true positives divided by the number of true positives plus false negatives. In other words, out of all the patients who actually have diabetes, how many did the model identify?
• F1 Score: a measure combining precision and recall, computed as their harmonic mean. The F1 score is scored between 0 and 1; values closer to 1 are better.
• AUC: measures the area under the curve of the true positive rate plotted against the false positive rate. AUC ranges between 0 and 1; the higher the value, the better the performance of the classification model. An AUC value below 0.5, such as 0.4, means that the model is performing worse than a random guess.
50. Metric: Classification
Question: Which two metrics can you use to evaluate classification machine learning models? Each correct answer presents a complete
solution. Choose the correct answers
A. Mean Absolute Error (MAE)
B. Precision
C. Recall
D. Average Distance to Cluster Center
Question: In the evaluation of the classification model you get a value of 0.3 for the area under the curve (AUC) metric. What does it mean?
A. 40 percent of data is allocated to training and 60 percent to testing.
B. 60 percent of data is allocated to training and 40 percent to testing.
C. The model is performing worse than a random guess.
D. The model is performing better than a random guess.
51. Metric: Classification
Question: Which two metrics can you use to evaluate a classification machine learning model? Each correct answer presents a complete
solution.
A. Coefficient of determination
B. Maximal Distance to Cluster Center
C. F-score
D. Root mean squared error (RMSE)
E. Precision
52. Metric: Classification
Question: You evaluate a machine learning model and it generates the matrix shown in the exhibit?
For each of the following statements, select Yes if the statement is true. Otherwise, select No
A. The machine learning model is a classification model
B. There are 879 false positives
C. There are 2015 true negatives
53. Question: You create a real-time inference pipeline from a training pipeline in Azure Machine Learning designer. You need to complete the inference pipeline. Which modules should you use to complete the pipeline? To answer, drag the appropriate module to the relevant slots. A module may be used once, more than once, or not at all.