Machine Learning: What Assurance Professionals Need to Know
Andrew Clark, Principal, Capital One
About me
● B.S. in Business Administration with a concentration in Accounting, Summa Cum
Laude, from the University of Tennessee at Chattanooga.
● M.S. in Data Science from Southern Methodist University
● Ph.D. student in Economics with a concentration in International Monetary
Policy at the University of Reading.
● American Statistical Association Graduate Statistician (GStat), INFORMS
Certified Analytics Professional (CAP), and AWS Certified Solutions Architect -
Associate.
● Has designed, built, and deployed numerous machine learning and continuous
auditing solutions using open source technologies.
● Avid conference presenter and writer on the topic of machine learning
Overview
● What is machine learning?
● Why is it important?
● What do all of the buzzwords mean?
● What are the two broad types of machine learning?
● Non-technical introduction
● How does it pertain to auditors?
● Security Issues
● Case studies
● What would a machine learning audit entail?
● Where can I learn more about machine learning?
Kong, Qingkai. "Machine Learning 1 - What is machine learning and real world example." Qingkai's Blog (web log), October 4, 2016. Accessed September 3, 2018.
http://qingkaikong.blogspot.com/2016/10/machine-learning-1-what-is-machine.html?showComment=1484689212391#c4748865641151946089.
Learning Objectives
● Recall the different types of machine learning
● Understand how machine learning works at a high level
● Explain where machine learning can be applied
● Define common machine learning terminology
What is Machine Learning?
A computer recognizing patterns without having to be explicitly programmed.
What many businesses get wrong
Why is Machine Learning Important?
● Disrupting business. Example: ML-powered businesses disrupted Blockbuster, taxis, etc.
● Revolutionizing existing business models. Predictive maintenance in
manufacturing, retailing, credit card fraud detection, loan underwriting
● One of the key technologies in driving economic growth
What Machine Learning is not
● Magic
● Going to take your job (for the majority of professionals)
● Always the best tool for the job
What do all these buzzwords mean?
“Machine Learning based, artificial intelligent, Big Data spewing, Deep Learning,
Neural Network touting, Cognitive Computing, Virtual Reality Natural Language
Processing,…Chat Bot.”
Two broad types of machine learning
● Supervised
● Unsupervised
Supervised Learning
● Given a labeled dataset (‘fraud’ / ‘not fraud’), the algorithm is ‘trained’ to recognize which items are fraud and which items are not fraud.
● Examples:
○ Transaction fraud detection
○ Classifying images: dog/not dog
● Common techniques include:
○ Logistic Regression
○ Support Vector Machines
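The supervised idea above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the eight transactions, their two features (amount in thousands and a foreign-transaction flag), and the names `train` and `predict` are all hypothetical, and a real project would normally use a library such as scikit-learn rather than a hand-written logistic regression.

```python
import math

# Hypothetical labeled transactions: (amount in thousands, is_foreign flag).
# Labels: 1 = fraud, 0 = not fraud. The data is made up for illustration.
X = [(0.2, 0), (0.5, 0), (0.3, 1), (4.0, 1), (5.5, 1), (6.0, 0), (0.4, 0), (4.8, 1)]
y = [0, 0, 0, 1, 1, 1, 0, 1]

def sigmoid(z):
    """Squash any number into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression weights with simple stochastic gradient descent."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi  # how far the prediction is from the label
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def predict(w, b, x):
    """Return 1 (fraud) if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

w, b = train(X, y)
print(predict(w, b, (5.0, 1)), predict(w, b, (0.3, 0)))
```

The "training" step is nothing more than repeatedly nudging the weights so that predictions move closer to the known labels.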
Unsupervised Learning
● Given some cleaned data, the algorithm divides the data into like groups.
● Examples:
○ Pattern recognition
○ Anomaly detection
○ Clustering
● Popular models:
○ K-means
○ Gaussian mixture models
○ DBSCAN
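As a rough illustration of "dividing the data into like groups", here is a minimal one-dimensional K-means sketch in plain Python. The invoice amounts and the function name `kmeans_1d` are hypothetical, and the evenly spaced starting centers are a simplification chosen to keep the result deterministic; production implementations usually use randomized starts.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Group numbers into k clusters (k must be at least 2)."""
    # Deterministic start: spread the initial centers evenly from min to max.
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins the group of its nearest center.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Update step: each center moves to the mean of its group.
        centers = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
    return centers, groups

# Hypothetical invoice amounts with no labels attached.
invoice_amounts = [12, 15, 14, 13, 98, 102, 95, 16]
centers, groups = kmeans_1d(invoice_amounts)
print(centers)
print(groups)
```

No labels are supplied; the algorithm finds the small and large invoices as separate groups purely from the values themselves.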
A non-technical introduction
● The steps, when strung together, are called a pipeline
○ Business Understanding
○ Data Understanding
○ Data Preparation
○ Modeling
○ Evaluation
○ Deployment
Kearn, Martin. "Machine Learning is for Muggles too!" Microsoft Developer (web log), March 1, 2016. Accessed February 21, 2017.
https://blogs.msdn.microsoft.com/martinkearn/2016/03/01/machine-learning-is-for-muggles-too/.
Business Understanding
● The most important step – ‘The Why’
● Why is this needed and what is the desired outcome?
Data understanding
● An understanding of where the data is coming from is key to good modeling
● SQL relational database? NoSQL database? CSV, TXT, webpage, Tweets?
● What scale is the data on? For example, Celsius or Fahrenheit?
Data Preparation
● Currently, close to 90% of what Data Scientists do
● ‘Munging’
● Data scaling
● Select variables
● Divide into test and train sets
● “I’m a data janitor. That’s the sexiest job of the 21st century. It’s very
flattering, but it’s also a little baffling.” – Josh Wills, Head of Data
Engineering @ Slack
Press, Gil. "Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says." Forbes. March 23, 2016. Accessed March 13, 2017.
https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#21e789136f63.
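Two of the data preparation steps listed above, data scaling and dividing into test and train sets, can be sketched with the standard library alone. The amounts and the helper names `train_test_split` and `fit_min_max` are hypothetical. One design choice here is an assumption rather than something the slides state: the split happens first and the scaling range is learned from the training rows only, so that nothing about the test rows leaks into preparation.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_min_max(column):
    """Learn the min and max from a column and return a scaling function."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # avoid dividing by zero for a constant column
    return lambda v: (v - lo) / span

# Hypothetical transaction amounts on very different magnitudes.
amounts = [120.0, 80.0, 4000.0, 55.0, 990.0, 310.0, 47.0, 2600.0]

train_raw, test_raw = train_test_split(amounts)
scale = fit_min_max(train_raw)            # learned from training data only
train = [scale(v) for v in train_raw]     # training values land in [0, 1]
test = [scale(v) for v in test_raw]       # test values may fall slightly outside
print(train)
print(test)
```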
Modeling
Evaluation
● Accuracy
● Precision
● Recall
● Does the model solve the problem?
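The three metrics above are simple ratios of correct and incorrect predictions. The sketch below computes them for a made-up set of ten fraud labels; the data and the function name `evaluate` are hypothetical.

```python
def evaluate(actual, predicted):
    """Return accuracy, precision, and recall for 0/1 labels (1 = fraud)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # fraud caught
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # legitimate, left alone
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # false alarm
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # fraud missed
    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0  # of items flagged, share that were fraud
    recall = tp / (tp + fn) if tp + fn else 0.0     # of actual fraud, share that was flagged
    return accuracy, precision, recall

# Hypothetical ground truth and model output for ten transactions.
actual    = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
predicted = [1, 0, 0, 0, 0, 1, 0, 1, 0, 0]
accuracy, precision, recall = evaluate(actual, predicted)
print(accuracy, precision, recall)
```

Accuracy alone can mislead when fraud is rare: a model that never flags anything would still score 70% on this data, with a recall of zero.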
Deployment
● Integrated into existing infrastructure or application?
● Separate web application?
● Scheduled job?
● Run ad hoc?
How Machine Learning Works
The animal version
As an auditor, what does this mean for you?
● New opportunities and risks
● Machine Learning control frameworks
● Catch-22 of businesses accepting the risk of black boxes or becoming irrelevant
● Use cases in audit analytics
● More complicated environment, new skills required to understand business
implications and audit algorithms
Machine Learning Security issues
Mikhailov, Emil, and Roman Trusov. "How Adversarial Attacks Work." Y Combinator. November 02, 2017. Accessed January 17, 2018.
http://blog.ycombinator.com/how-adversarial-attacks-work/?imm_mid=0f81cc&cmp=em-data-na-na-newsltr_20171115.
Machine Learning Security issues cont.
Mikhailov, Emil, and Roman Trusov. "How Adversarial Attacks Work." Y Combinator. November 02, 2017. Accessed January 17, 2018.
http://blog.ycombinator.com/how-adversarial-attacks-work/?imm_mid=0f81cc&cmp=em-data-na-na-newsltr_20171115.
Use cases in Assurance and Compliance
● Anomaly detection
○ Unsupervised journal entry anomaly detection
○ Clustering on invoice and AP data for outliers
○ Outlier user access
● ‘Auditor sense’ investigation
○ Supervised model for expense report investigation
○ Supervised model for journal entries
○ AP transactions, customer transactions, etc.
● Document review
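To make the journal entry anomaly detection use case more concrete, here is a deliberately simple stand-in: a robust outlier score based on the median and the median absolute deviation (MAD). It is not one of the clustering models named earlier, just a small unsupervised check that needs no labels. The amounts, the function name `flag_outliers`, and the 3.5 cutoff (a commonly used rule of thumb) are assumptions for illustration.

```python
import statistics

def flag_outliers(amounts, threshold=3.5):
    """Return the positions of amounts that sit far from the typical value."""
    med = statistics.median(amounts)
    mad = statistics.median([abs(a - med) for a in amounts])
    if mad == 0:
        return []  # all values (nearly) identical, nothing to compare against
    flagged = []
    for i, a in enumerate(amounts):
        # 0.6745 rescales the MAD so the score is comparable to a z-score.
        score = 0.6745 * abs(a - med) / mad
        if score > threshold:
            flagged.append(i)
    return flagged

# Hypothetical journal entry amounts; one entry is far larger than the rest.
journal_amounts = [1020.0, 980.0, 1005.0, 995.0, 1010.0, 990.0, 25000.0, 1000.0]
suspicious = flag_outliers(journal_amounts)
print(suspicious)
```

Flagged entries are candidates for follow-up, not findings; the auditor still decides whether each one is a genuine exception.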
The Machine Learning Algorithm Audit
● With algorithms increasingly dictating our lives, how do we know that they are
operating as intended?
○ Weapons of Math Destruction by Cathy O'Neil
● Unfilled role for assurance professionals.
● Review assumptions, and when available, look at the weighting for features in the
model.
○ Decision tree, logistic regression, etc.
● Can provide a lot of value even when using only SDLC audit methodologies
● ISACA Paper
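For interpretable models such as logistic regression, "looking at the weighting for features" can be as simple as ranking the coefficients. The sketch below uses made-up coefficients and feature names; in a real audit they would come from the model under review. Raw coefficients are only comparable across features that were put on a similar scale before training, which is an assumption here.

```python
import math

# Hypothetical logistic regression coefficients for a fraud model.
coefficients = {
    "amount_thousands": 0.8,
    "is_weekend": 0.3,
    "customer_tenure_years": -0.5,
}

def rank_features(coefs):
    """Sort features by the size of their weight and attach the odds ratio."""
    ranked = []
    for name, weight in coefs.items():
        # exp(weight) is how the odds of the outcome multiply
        # when the feature increases by one unit.
        ranked.append((name, weight, math.exp(weight)))
    ranked.sort(key=lambda item: abs(item[1]), reverse=True)
    return ranked

ranked = rank_features(coefficients)
for name, weight, odds_ratio in ranked:
    print(f"{name}: weight={weight:+.2f}, odds ratio={odds_ratio:.2f}")
```

A reviewer can then ask whether the most influential features, and the direction of their effect, make business sense.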
Where can I learn more about Machine Learning?
● Visual Intro, highly recommended, short and sweet
○ http://www.r2d3.us/visual-intro-to-machine-learning-part-1/
● Wikipedia
○ https://en.wikipedia.org/wiki/Machine_learning
● Good beginning article with some fantastic books
○ http://machinelearningmastery.com/4-steps-to-get-started-in-machine-learning/
● Weka
○ http://www.cs.waikato.ac.nz/ml/weka/
● Scikit-Learn
○ http://scikit-learn.org/
Conclusion
● Definition of Machine Learning
● Buzzword breakdown
● Broad algorithm overview
● Machine Learning process
● Real world use cases
● The Machine Learning Audit
● Where to learn more about Machine Learning
Questions?
Thank you!

Machine Learning: What Assurance Professionals Need to Know

  • 1. Machine Learning: What Assurance professionals need to know Andrew Clark, Principal, Capital One
  • 2. About me ● B.S. in Business Administration with a concentration in Accounting, Summa Cum Laude, from the University of Tennessee at Chattanooga. ● M.S. in Data Science from Southern Methodist University ● Ph.D. student in Economics with a concentration in International Monetary Policy at the University of Reading. ● American Statistical Association Graduate Statistician (GStat), INFORMS Certified Analytics Professional (CAP), and AWS Certified Solutions Architect - Associate. ● Has designed, built, and deployed numerous machine learning and continuous auditing solutions using open source technologies. ● Avid conference presenter and writer on the topic of machine learning
  • 3. Overview ● What is machine learning? ● Why is it important? ● What do all of the buzzwords mean? ● What are the two broad types of machine learning? ● Non-technical introduction ● How does it pertain to auditors? ● Security Issues ● Case studies ● What would a machine learning audit entail? ● Where can I learn more about machine learning? Kong, Qingkai. "Machine Learning 1 - What is machine learning and real world example." Qingkai's Blog (web log), October 4, 2016. Accessed September 3, 2018. http://qingkaikong.blogspot.com/2016/10/machine-learning-1-what-is-machine.html?showComment=1484689212391#c4748865641151946089.
  • 4. Learning Objectives ● Recall the different types of machine learning ● Understand how machine learning works at a high level ● Explain where machine learning can be applied ● Define common machine learning terminology
  • 5. What is Machine Learning? A computer recognizing patterns without having to be explicitly programmed.
  • 7. Why is Machine Learning Important? ● Disrupting business. For example, ML-powered businesses disrupted Blockbuster, taxis, etc. ● Revolutionizing existing business models: predictive maintenance in manufacturing, retailing, credit card fraud detection, loan underwriting ● One of the key technologies driving economic growth
  • 8. What Machine Learning is not ● Magic ● Going to take your job (for the majority of professionals) ● Always the best tool for the job
  • 9. What do all these buzzwords mean? “Machine Learning based, artificial intelligent, Big Data spewing, Deep Learning, Neural Network touting, Cognitive Computing, Virtual Reality Natural Language Processing,…Chat Bot.”
  • 10. Two broad types of machine learning ● Supervised ● Unsupervised
  • 11. Supervised Learning ● Given a labeled dataset (for example, ‘fraud’ / ‘not fraud’), the algorithm is ‘trained’ to recognize which items are fraud and which are not. ● Examples: ○ Transaction fraud detection ○ Classifying images: dog/not dog ● Common techniques include: ○ Logistic Regression ○ Support Vector Machines
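To make the idea of ‘training’ on labeled data concrete, here is a minimal sketch of logistic regression fitted by plain gradient descent, using only the Python standard library. The toy transactions, feature choices, learning rate, and function names are all assumptions made for illustration; they are not from the presentation, and a real project would more likely use a library such as Scikit-Learn.

```python
import math

def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this labeled example
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def predict(w, b, x):
    """Return 1 (fraud) if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Hypothetical labeled data: [amount in $1000s, hour of day / 24]; 1 = fraud
X = [[0.2, 0.5], [0.3, 0.6], [0.1, 0.4], [0.4, 0.55],
     [4.0, 0.1], [5.0, 0.05], [4.5, 0.12], [3.8, 0.08]]
y = [0, 0, 0, 0, 1, 1, 1, 1]

w, b = train_logistic(X, y)
predictions = [predict(w, b, x) for x in X]
new_large = predict(w, b, [4.2, 0.1])   # unseen large, late-night transaction
new_small = predict(w, b, [0.25, 0.5])  # unseen small, midday transaction
```

The learned weights in `w` are what an auditor could later inspect to see which features drive the model's decisions.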
  • 14. Unsupervised Learning ● Given some cleaned data, the algorithm divides the data into groups of similar items. ● Examples: ○ Pattern recognition ○ Anomaly detection ○ Clustering ● Popular models: ○ K-means ○ Gaussian mixture models ○ DBSCAN
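As a rough illustration of dividing unlabeled data into like groups, the sketch below implements the basic K-means loop (assign each point to its nearest centroid, then move each centroid to the mean of its members). The data points are invented, and seeding the centroids with the first k points is an assumption made to keep the example repeatable; production implementations usually use smarter initialization.

```python
import math

def kmeans(points, k, iterations=20):
    """Basic K-means (Lloyd's algorithm), seeded with the first k points."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical, unlabeled 2-D data with two visibly separate groups
points = [[1.0, 1.2], [8.0, 8.5], [0.8, 1.0],
          [8.3, 7.9], [1.2, 0.9], [7.9, 8.1]]

labels, centroids = kmeans(points, k=2)
```

No labels were supplied; the grouping in `labels` comes purely from distances between the points.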
  • 16. A non-technical introduction ● The steps, when strung together, are called a pipeline ○ Business Understanding ○ Data Understanding ○ Data Preparation ○ Modeling ○ Evaluation ○ Deployment Kearn, Martin. "Machine Learning is for Muggles too!" Microsoft Developer (web log), March 1, 2016. Accessed February 21, 2017. https://blogs.msdn.microsoft.com/martinkearn/2016/03/01/machine-learning-is-for-muggles-too/.
  • 17. Business Understanding ● The most important step – ‘The Why’ ● Why is this needed and what is the desired outcome?
  • 18. Data understanding ● An understanding of where the data is coming from is key to good modeling ● SQL relational database? NoSQL database? CSV, TXT, webpage, Tweets? ● What scale is the data on? For example, Celsius or Fahrenheit?
  • 19. Data Preparation ● Currently, close to 90% of what Data Scientists do ● ‘Munging’ ● Data scaling ● Select variables ● Divide into test and train sets ● “I’m a data janitor. That’s the sexiest job of the 21st century. It’s very flattering, but it’s also a little baffling.” – Josh Wills, Head of Data Engineering @ Slack Press, Gil. "Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says." Forbes. March 23, 2016. Accessed March 13, 2017. https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#21e789136f63.
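The preparation steps named on this slide can be sketched in a few lines. The example below converts hypothetical Fahrenheit readings to Celsius (the unit question from the data understanding step), divides them into train and test sets, and then applies min-max scaling. All values, function names, the 25% test fraction, and the random seed are assumptions for illustration. The scaling range is learned from the training set only, so that nothing about the test set leaks into preparation.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_min_max(values):
    """Learn the scaling range from the training data only."""
    return min(values), max(values)

def apply_min_max(values, lo, hi):
    """Rescale values so that the training range maps onto [0, 1]."""
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

# Hypothetical sensor readings recorded in Fahrenheit
temps_f = [32.0, 50.0, 68.0, 86.0, 104.0, 122.0, 140.0, 212.0]

# Put everything on one known scale before modeling
temps_c = [(f - 32.0) * 5.0 / 9.0 for f in temps_f]

train, test = train_test_split(temps_c)
lo, hi = fit_min_max(train)
train_scaled = apply_min_max(train, lo, hi)
test_scaled = apply_min_max(test, lo, hi)  # may fall outside [0, 1]
```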
  • 21. Evaluation ● Accuracy ● Precision ● Recall ● Does the model solve the problem?
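The three metrics on this slide follow directly from counting correct and incorrect predictions. The sketch below computes them for a small set of made-up labels (1 = fraud, 0 = not fraud); the data and function names are illustrative assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def evaluate(y_true, y_pred):
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)                 # share of all predictions that are right
    precision = tp / (tp + fp) if (tp + fp) else 0.0   # of items flagged, share truly fraud
    recall = tp / (tp + fn) if (tp + fn) else 0.0      # of true fraud, share that was flagged
    return accuracy, precision, recall

# Hypothetical ground truth and model output for ten transactions
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1, 1, 0, 0, 0]

accuracy, precision, recall = evaluate(y_true, y_pred)
```

When fraud is rare, accuracy alone can look high even if the model misses most fraud, which is why precision and recall are reported alongside it.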
  • 22. Deployment ● Integrated into existing infrastructure or application? ● Separate web application? ● Scheduled job? ● Run ad hoc?
  • 30. As an auditor, what does this mean for you? ● New opportunities and risks ● Machine Learning control frameworks ● Catch-22 of businesses accepting the risk of black boxes or becoming irrelevant ● Use cases in audit analytics ● More complicated environment, new skills required to understand business implications and audit algorithms
  • 31. Machine Learning Security issues Mikhailov, Emil, and Roman Trusov. "How Adversarial Attacks Work." Y Combinator. November 02, 2017. Accessed January 17, 2018. http://blog.ycombinator.com/how-adversarial-attacks-work/?imm_mid=0f81cc&cmp=em-data-na-na-newsltr_20171115.
  • 32. Machine Learning Security issues cont. Mikhailov, Emil, and Roman Trusov. "How Adversarial Attacks Work." Y Combinator. November 02, 2017. Accessed January 17, 2018. http://blog.ycombinator.com/how-adversarial-attacks-work/?imm_mid=0f81cc&cmp=em-data-na-na-newsltr_20171115.
  • 33. Use cases in Assurance and Compliance ● Anomaly detection ○ Unsupervised journal entry anomaly detection ○ Clustering on invoice and AP data for outliers ○ Outlier user access ● ‘Auditor sense’ investigation ○ Supervised model for expense report investigation ○ Supervised model for journal entries ○ AP transactions, customer transactions, etc. ● Document review
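As one possible starting point for the journal entry anomaly use case, the sketch below flags unusual amounts with a modified z-score based on the median and the median absolute deviation (MAD). This is a simple statistical stand-in rather than one of the clustering models named earlier, and the amounts are invented. The constant 0.6745 and the cutoff of 3.5 are the conventional choices for this score, but they are still assumptions that should be tuned to the data.

```python
import statistics

def mad_outliers(amounts, threshold=3.5):
    """Return indices of amounts whose modified z-score exceeds the threshold."""
    med = statistics.median(amounts)
    mad = statistics.median([abs(a - med) for a in amounts])
    if mad == 0:
        return []  # no spread to measure against; nothing can be flagged
    flagged = []
    for i, a in enumerate(amounts):
        score = 0.6745 * abs(a - med) / mad
        if score > threshold:
            flagged.append(i)
    return flagged

# Hypothetical journal entry amounts for one account
amounts = [120.0, 135.5, 128.0, 119.0, 131.0, 125.0, 9800.0, 122.5]

flagged = mad_outliers(amounts)
```

Flagged entries are candidates for follow-up, not findings; an auditor would still need to investigate each one.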
  • 34. The Machine Learning Algorithm Audit ● With algorithms increasingly dictating our lives, how do we know that they are operating as intended? ○ Weapons of Math Destruction by Cathy O'Neil ● Unfilled role for assurance professionals. ● Review assumptions, and when available, look at the weighting for features in the model. ○ Decision tree, logistic regression, etc. ● Can provide a lot of value when using only SDLC audit methodologies ● ISACA Paper
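To illustrate what looking at the weighting for features might involve for a logistic regression, the sketch below ranks a set of hypothetical coefficients by magnitude and converts each to an odds ratio. The feature names and weight values are invented for illustration. Comparing magnitudes this way is only meaningful when the features are on comparable scales.

```python
import math

# Hypothetical coefficients taken from a fitted logistic regression
weights = {
    "amount_thousands": 1.8,
    "is_weekend": 0.4,
    "vendor_age_years": -0.6,
}

# Rank features by the absolute size of their weight
ranked = sorted(weights.items(), key=lambda kv: abs(kv[1]), reverse=True)

# exp(weight) is the multiplicative change in the odds per one-unit increase
odds_ratios = {name: math.exp(w) for name, w in weights.items()}

for name, w in ranked:
    direction = "raises" if w > 0 else "lowers"
    print(f"{name}: weight {w:+.2f} {direction} the odds (x{odds_ratios[name]:.2f})")
```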
  • 35. Where can I learn more about Machine Learning? ● Visual Intro, highly recommended, short and sweet ○ http://www.r2d3.us/visual-intro-to-machine-learning-part-1/ ● Wikipedia ○ https://en.wikipedia.org/wiki/Machine_learning ● Good beginning article with some fantastic books ○ http://machinelearningmastery.com/4-steps-to-get-started-in-machine-learning/ ● Weka ○ http://www.cs.waikato.ac.nz/ml/weka/ ● Scikit-Learn ○ http://scikit-learn.org/
  • 36. Conclusion ● Definition of Machine Learning ● Buzzword breakdown ● Broad algorithm overview ● Machine Learning process ● Real world use cases ● The Machine Learning Audit ● Where to learn more about Machine Learning