Machine Learning for Auditors: What you need to know - ISACA North America CACS 2017
Copyright © 2017 Information Systems Audit and Control Association, Inc. All rights reserved.
Andrew Clark, IT Auditor / Data Scientist
Astec Industries, Inc., M.S. Data Science Candidate
Overview
• What is machine learning?
• Why is it important?
• What do all of the buzzwords mean?
• Non-technical introduction
• What are the two broad types of machine learning?
• How does it pertain to auditors?
• Case studies
• What would a machine learning audit entail?
• Where can I learn more about machine learning?
Kong, Qingkai. "Machine Learning 1 - What is machine learning and real world example." Qingkai's Blog (web log), October 4, 2016. Accessed February 21, 2017. http://qingkaikong.blogspot.com/2016/10/machine-learning-1-what-is-machine.html?showComment=1484689212391#c4748865641151946089.
What is Machine Learning?
A computer recognizing patterns without having to be explicitly programmed.
Why is Machine Learning Important?
• Disrupting business. Example: ML-powered businesses disrupted Blockbuster, taxis, etc.
• Revolutionizing existing business models. Predictive maintenance, retailing,
credit card fraud detection.
• One of the key technologies in driving economic growth
What Machine Learning is not:
• Magic
• Going to take your job (for the majority of professionals)
• Always the best tool for the job
What do all the buzzwords mean?
• Machine Learning based artificial intelligent - Big Data spewing - Deep
Learning - Neural Network touting - Cognitive Computing - Virtual Reality -
Natural Language Processing - Chat Bot.
A non-technical introduction
• The process steps, when strung together, are called a pipeline
• Business Understanding
• Data Understanding
• Data Preparation
• Modeling
• Evaluation
• Deployment
Kearn, Martin. "Machine Learning is for Muggles too!" Microsoft Developer (web log), March 1, 2016. Accessed February 21, 2017. https://blogs.msdn.microsoft.com/martinkearn/2016/03/01/machine-learning-is-for-muggles-too/.
Business Understanding
• The most important step
• ‘The why’
• Why is this needed and what is the desired outcome?
Data Understanding
• An understanding of where the data is coming from is key to good modeling
• SQL relational database? NoSQL database? CSV, TXT, webpage, Tweets?
• What scale is the data on? For example, Celsius or Fahrenheit?
• Is the scale the same on all data streams or will transformations be
required?
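As a minimal sketch of the kind of transformation this slide has in mind, the snippet below brings two temperature feeds onto one scale before any modeling. The feed names and readings are made up for illustration.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0

# Hypothetical data streams: one reports Celsius, the other Fahrenheit.
plant_a_celsius = [21.0, 22.5, 19.0]
plant_b_fahrenheit = [68.0, 72.5, 212.0]

# Put both streams on the Celsius scale so the values are comparable.
plant_b_celsius = [fahrenheit_to_celsius(t) for t in plant_b_fahrenheit]
combined_celsius = plant_a_celsius + plant_b_celsius
print(combined_celsius)
```

Skipping this step would let a model treat 68 and 21 as very different temperatures even though they are nearly the same.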
Data Preparation
• Currently, close to 90% of what Data Scientists do
• ‘Munging’
• “I’m a data janitor. That’s the sexiest job of the 21st century. It’s very
flattering, but it’s also a little baffling.” – Josh Wills
Press, Gil. "Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says." Forbes. March 23, 2016. Accessed March 13, 2017. https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#21e789136f63.
Modeling
"Choosing the right estimator." Choosing the right estimator - scikit-learn 0.18.1 documentation. Accessed March 13, 2017. http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html.
Evaluation
• Accuracy
• Precision
• Recall
• Does the model solve the problem?
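The three metrics above can be worked out by hand from a model's predictions. The sketch below does so for a made-up fraud example; the labels and the `evaluate` helper are illustrative assumptions, not part of the original deck.

```python
def evaluate(actual, predicted, positive="fraud"):
    """Return (accuracy, precision, recall) for a two-class problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    # Accuracy: share of all predictions that were right.
    accuracy = (tp + tn) / float(len(pairs))
    # Precision: of the items flagged as fraud, how many really were fraud.
    precision = tp / float(tp + fp) if (tp + fp) else 0.0
    # Recall: of the real fraud items, how many the model caught.
    recall = tp / float(tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# Made-up labels for eight transactions.
actual = ["fraud", "fraud", "ok", "ok", "ok", "ok", "fraud", "ok"]
predicted = ["fraud", "ok", "ok", "ok", "fraud", "ok", "fraud", "ok"]

accuracy, precision, recall = evaluate(actual, predicted)
print(accuracy, precision, recall)
```

When fraud is rare, accuracy alone can look high for a model that never flags anything, which is why precision and recall are listed alongside it.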
Deployment
• Integrated into existing infrastructure or application?
• Separate web application?
• Scheduled job?
• Run ad hoc?
Unsupervised Machine Learning
• Given some cleaned data, the algorithm, a series of instructions, divides the
data into like groups.
• Popular models:
– K-means
– Note: KNN (K-nearest neighbors), despite the similar name, needs labeled data and is a supervised technique
Supervised Machine Learning
• Given a labeled dataset, e.g. ‘fraud’ / ‘not fraud’, the algorithm is ‘trained’ to recognize which items are fraud and which items are not fraud.
• Common techniques include:
– Logistic Regression
– Support Vector Machines
Example, Logistic Regression
from sklearn.linear_model import LogisticRegression
# Create the classifier
LogR = LogisticRegression()
# Each row of X is [height, weight, shoe_size]
X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37], [166, 65, 40], [190, 90, 47], [175, 64, 39],
     [177, 70, 40], [159, 55, 37], [171, 75, 42], [181, 85, 43]]
# Y holds the known label for each row of X
Y = ['male', 'male', 'female', 'female', 'male', 'male', 'female', 'female', 'female', 'male', 'male']
# Train the model on the labeled data
LogR.fit(X, Y)
# Predict the label for a new, unseen row
prediction = LogR.predict([[190, 70, 43]])
print(prediction)
>>['female']
https://github.com/aclarkData/simple-machine-learning-examples/blob/master/very_simple_examples/logistic_regression.py
Example, Kmeans
• Clustering journal entries
• Essentially, we obtain a month (or any time period) of journal entries, “one-hot encode” the non-numerical columns (that is, convert a value such as ‘Hello’ into a series of 0s and 1s), and group the entries into a pre-determined number of groups, for example, 3.
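The steps on this slide can be sketched in plain Python as follows. The journal entries, the account names, and the helper functions are all hypothetical. The k-means loop is a bare-bones version with a naive starting point, written to show the mechanics rather than to stand in for a library implementation such as scikit-learn's `KMeans`.

```python
def one_hot(values):
    """Turn each text value into a list of 0s and 1s, one slot per category."""
    categories = sorted(set(values))
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=20):
    """Assign each point to its nearest centroid, then re-average; repeat."""
    centroids = [list(p) for p in points[:k]]  # naive start: the first k rows
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical journal entries: (account, amount in thousands).
entries = [("cash", 1.0), ("revenue", 5.0), ("expense", 9.0), ("cash", 1.2),
           ("revenue", 5.1), ("expense", 9.3), ("cash", 0.9), ("revenue", 4.8)]

accounts = [account for account, _ in entries]
amounts = [amount for _, amount in entries]
# Each row becomes the one-hot account columns followed by the amount.
points = [encoded + [amount] for encoded, amount in zip(one_hot(accounts), amounts)]

labels, centroids = kmeans(points, k=3)
print(labels)
```

With real data the numeric columns would normally be rescaled first, since a large amount column can otherwise dominate the 0/1 columns in the distance calculation.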
Kmeans continued
http://blog.mpacula.com/2011/04/27/k-means-clustering-example-python/
As an auditor, what does this mean for you?
• New opportunities and risks
• Catch-22 of businesses accepting the risk of black boxes or becoming
irrelevant
• Use cases in audit analytics
• More complicated environment, new skills required to understand business
implications and audit algorithms
Use cases in Assurance and Compliance
• Anomaly detection
– Unsupervised journal entry anomaly detection
– Clustering on invoice and AP data for outliers
• ‘Auditor sense’ investigation
– Supervised model for expense report investigation
– Supervised model for journal entries
– AP transactions, customer transactions, etc.
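To give a feel for unsupervised anomaly detection, here is a deliberately simple screen. It is a z-score check, swapped in for illustration in place of the clustering approach named above, and the invoice amounts and threshold are assumptions.

```python
import statistics

def flag_outliers(amounts, threshold=2.0):
    """Return the positions of amounts far from the mean, in standard deviations."""
    mean = statistics.mean(amounts)
    spread = statistics.pstdev(amounts)
    if spread == 0:
        return []  # every amount is identical, nothing stands out
    return [i for i, amount in enumerate(amounts)
            if abs(amount - mean) / spread > threshold]

# Hypothetical AP invoice amounts; the last one is suspiciously large.
invoice_amounts = [120.0, 135.0, 110.0, 128.0, 140.0, 125.0, 131.0, 2500.0]

flagged = flag_outliers(invoice_amounts)
print(flagged)
```

No labels are needed here: the screen only points the auditor at items worth a closer look, and the decision about whether they are actually a problem stays with the auditor.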
The Machine Learning Algorithm Audit
• With algorithms increasingly dictating our lives, how do we know that they
are operating as intended?
– e.g., Weapons of Math Destruction by Cathy O'Neil
• Unfilled role for assurance professionals.
– Review assumptions and, when the model allows it (decision trees, logistic regression, etc.), look at the weighting of the features in the model.
– Can provide a lot of value using only SDLC audit methodologies
Machine Learning Audit Example – Logistic Regression
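>>import pandas as pd
>># Assumes clf is the fitted logistic regression and ShoeData the feature table from the linked notebook
>># Pair each fitted coefficient with its feature name
>>weights = pd.Series(clf.coef_[0], index=ShoeData.columns)
>>weights
Height -0.439204
Weight 0.622762
Shoe_size 0.829036
>># Plot the weights as a bar chart
>>weights.plot(kind='bar', title=' …')
https://github.com/aclarkData/simple-machine-learning-examples/blob/master/very_simple_examples/BasicMachineLearning.ipynb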
Machine Learning Audit Example – Decision Tree Classifier
>>from sklearn import tree
>>clf = tree.DecisionTreeClassifier()
>>clf.fit(X, Y)
>>prediction = clf.predict([[190, 70, 43]])
>>print(prediction)
['male']
>># Write the fitted tree to tree.dot for rendering; class_names takes the class labels, not the feature names
>>tree.export_graphviz(clf, out_file='tree.dot', feature_names=list(ShoeData.columns), class_names=list(clf.classes_))
https://github.com/aclarkData/simple-machine-learning-examples/blob/master/very_simple_examples/BasicMachineLearning.ipynb
http://www.webgraphviz.com/
EU’s General Data Protection Regulation (GDPR)
• In April 2016, the EU adopted the General Data Protection Regulation (GDPR), which gives citizens and regulators a right to an explanation of algorithmic decision making.
• Empowers citizens with the ability to understand why they were rejected for a bank loan, for instance, when the decision was based on an algorithm.
Where can I learn more about Machine Learning?
• Visual intro, highly recommended, short and sweet
– http://www.r2d3.us/visual-intro-to-machine-learning-part-1/
• Wikipedia
– https://en.wikipedia.org/wiki/Machine_learning
• Good beginning article with some fantastic books
– http://machinelearningmastery.com/4-steps-to-get-started-in-machine-learning/
• Weka
– http://www.cs.waikato.ac.nz/ml/weka/
• Scikit-Learn
– http://scikit-learn.org/
Conclusion and recap
• Definition of Machine Learning
• Buzzword breakdown
• Machine Learning process
• Broad algorithm overview
• Real world use cases
• The Machine Learning Audit
• Where to learn more about Machine Learning
Thank you!
• Email: andrewtaylorclark@gmail.com
• GitHub: aclarkData
• Blog: https://aclarkdata.github.io/
• LinkedIn: www.linkedin.com/in/andrew-clark-b326b767

一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
一比一原版(CBU毕业证)不列颠海角大学毕业证成绩单
 
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
一比一原版(Adelaide毕业证书)阿德莱德大学毕业证如何办理
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单一比一原版(QU毕业证)皇后大学毕业证成绩单
一比一原版(QU毕业证)皇后大学毕业证成绩单
 
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
Quantitative Data AnalysisReliability Analysis (Cronbach Alpha) Common Method...
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单一比一原版(BU毕业证)波士顿大学毕业证成绩单
一比一原版(BU毕业证)波士顿大学毕业证成绩单
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
原版制作(Deakin毕业证书)迪肯大学毕业证学位证一模一样
 
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
一比一原版(UniSA毕业证书)南澳大学毕业证如何办理
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理一比一原版(UofS毕业证书)萨省大学毕业证如何办理
一比一原版(UofS毕业证书)萨省大学毕业证如何办理
 
standardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghhstandardisation of garbhpala offhgfffghh
standardisation of garbhpala offhgfffghh
 
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
【社内勉強会資料_Octo: An Open-Source Generalist Robot Policy】
 
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
一比一原版(RUG毕业证)格罗宁根大学毕业证成绩单
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 

Machine Learning for Auditors: What you need to know - ISACA North America CACS 2017

  • 1. Copyright © 2017 Information Systems Audit and Control Association, Inc. All rights reserved. Andrew Clark, IT Auditor / Data Scientist Astec Industries, Inc., M.S. Data Science Candidate
  • 2. Overview • What is machine learning? • Why is it important? • What do all of the buzzwords mean? • Non-technical introduction • What are the two broad types of machine learning? • How does it pertain to auditors? • Case studies • What would a machine learning audit entail? • Where can I learn more about machine learning? Kong, Qingkai. "Machine Learning 1 - What is machine learning and real world example." Qingkai's Blog (web log), October 4, 2016. Accessed February 21, 2017. http://qingkaikong.blogspot.com/2016/10/machine-learning-1-what-is-machine.html?showComment=1484689212391#c4748865641151946089.
  • 3. What is Machine Learning? A computer recognizing patterns without having to be explicitly programmed.
  • 4. Why is Machine Learning Important? • Disrupting business: ML-powered businesses disrupted Blockbuster, taxis, etc. • Revolutionizing existing business models: predictive maintenance, retailing, credit card fraud detection. • One of the key technologies driving economic growth
  • 5. What Machine Learning is not: • Magic • Going to take your job (for the majority of professionals) • Always the best tool for the job
  • 6. What do all the buzzwords mean? • Machine Learning based artificially intelligent - Big Data spewing - Deep Learning - Neural Network touting - Cognitive Computing - Virtual Reality - Natural Language Processing - Chat Bot.
  • 7. A non-technical introduction • The process, when strung together, is called a pipeline • Business Understanding • Data Understanding • Data Preparation • Modeling • Evaluation • Deployment Kearn, Martin. "Machine Learning is for Muggles too!" Microsoft Developer (web log), March 1, 2016. Accessed February 21, 2017. https://blogs.msdn.microsoft.com/martinkearn/2016/03/01/machine-learning-is-for-muggles-too/.
  • 8. Business Understanding • The most important step • ‘The why’ • Why is this needed and what is the desired outcome?
  • 9. Data Understanding • Understanding where the data comes from is key to good modeling • SQL relational database? NoSQL database? CSV, TXT, webpage, Tweets? • What scale is the data on? For example, Celsius or Fahrenheit? • Is the scale the same on all data streams or will transformations be required?
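As a small illustration of the scale question raised on the Data Understanding slide, the sketch below puts two temperature streams on a common scale before combining them. The plant names and readings are made up for the example and do not come from the presentation.

```python
def fahrenheit_to_celsius(temp_f):
    """Convert a Fahrenheit reading to Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


# Hypothetical readings from two data streams that report in different units.
plant_a_celsius = [20.0, 25.0]
plant_b_fahrenheit = [68.0, 77.0]

# Put every stream on the Celsius scale before combining them.
plant_b_celsius = [fahrenheit_to_celsius(t) for t in plant_b_fahrenheit]
combined = plant_a_celsius + plant_b_celsius

print(combined)
```

Without the conversion, a model would treat 68 and 20 as very different temperatures even though they describe the same condition.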
  • 10. Data Preparation • Currently, close to 90% of what Data Scientists do • ‘Munging’ • “I’m a data janitor. That’s the sexiest job of the 21st century. It’s very flattering, but it’s also a little baffling.” – Josh Wills • Press, Gil. "Cleaning Big Data: Most Time-Consuming, Least Enjoyable Data Science Task, Survey Says." Forbes. March 23, 2016. Accessed March 13, 2017. https://www.forbes.com/sites/gilpress/2016/03/23/data-preparation-most-time-consuming-least-enjoyable-data-science-task-survey-says/#21e789136f63.
  • 11. Modeling "Choosing the right estimator." scikit-learn 0.18.1 documentation. Accessed March 13, 2017. http://scikit-learn.org/stable/tutorial/machine_learning_map/index.html.
  • 13. Evaluation • Accuracy • Precision • Recall • Does the model solve the problem?
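The three metrics on the Evaluation slide can be computed directly from a list of true labels and a list of predictions. The following is a minimal sketch; the ‘fraud’ / ‘not fraud’ labels echo the supervised learning slide, while the six example transactions and the `evaluate` helper are hypothetical.

```python
def evaluate(y_true, y_pred, positive="fraud"):
    """Return (accuracy, precision, recall) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # Precision: of the items flagged as fraud, how many really were fraud?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of the real fraud items, how many did the model flag?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical labels for six transactions.
y_true = ["fraud", "fraud", "not fraud", "not fraud", "not fraud", "fraud"]
y_pred = ["fraud", "not fraud", "not fraud", "fraud", "not fraud", "fraud"]

accuracy, precision, recall = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

For fraud work, recall often matters more than accuracy, because a model that never flags anything can still look accurate when fraud is rare.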
  • 14. Deployment • Integrated into existing infrastructure or application? • Separate web application? • Scheduled job? • Run ad hoc?
  • 15. Unsupervised Machine Learning • Given some cleaned data, the algorithm (a series of instructions) divides the data into like groups. • Popular models: – K-means – KNN (K-nearest neighbors; note that KNN is more commonly used as a supervised classifier)
  • 16. Supervised Machine Learning • Given a labeled dataset (for example, ‘fraud’ or ‘not fraud’), the algorithm is ‘trained’ to recognize which items are fraud and which are not. • Common techniques include: – Logistic Regression – Support Vector Machines
  • 17. Example, Logistic Regression

        from sklearn.linear_model import LogisticRegression

        LogR = LogisticRegression()

        # Each row is [height, weight, shoe_size]
        X = [[181, 80, 44], [177, 70, 43], [160, 60, 38], [154, 54, 37],
             [166, 65, 40], [190, 90, 47], [175, 64, 39], [177, 70, 40],
             [159, 55, 37], [171, 75, 42], [181, 85, 43]]
        # One label per row of X
        Y = ['male', 'male', 'female', 'female', 'male', 'male',
             'female', 'female', 'female', 'male', 'male']

        # Train on the labeled examples, then classify a new person
        LogR.fit(X, Y)
        prediction = LogR.predict([[190, 70, 43]])
        print(prediction)  # the slide reports ['female']

    https://github.com/aclarkData/simple-machine-learning-examples/blob/master/very_simple_examples/logistic_regression.py
  • 18. Example, Kmeans • Clustering journal entries • Essentially, we obtain a month (or any time period) of journal entries, “one-hot encode” the non-numerical columns (that is, convert a value such as ‘Hello’ into a series of 0s and 1s), and group the entries into a pre-determined number of groups, for example, 3.
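The journal entry clustering described on the Kmeans slide might look roughly like the sketch below. The entries, the column choice (account, posted by, amount), and the `one_hot` helper are hypothetical; dividing amounts by the largest amount is an assumption added here so that the amount column does not dominate the 0/1 columns, and is not something the slide specifies.

```python
from sklearn.cluster import KMeans

# Hypothetical journal entries: (account, posted_by, amount).
entries = [
    ("Cash", "alice", 100.0),
    ("Cash", "alice", 120.0),
    ("Revenue", "bob", 5000.0),
    ("Revenue", "bob", 5200.0),
    ("Accruals", "carol", 90000.0),
    ("Accruals", "carol", 91000.0),
]


def one_hot(value, categories):
    """Encode a text value as 0s and 1s, one position per known category."""
    return [1 if value == category else 0 for category in categories]


accounts = sorted({entry[0] for entry in entries})
users = sorted({entry[1] for entry in entries})
max_amount = max(entry[2] for entry in entries)

# One numeric row per journal entry: encoded text columns plus the scaled amount.
X = [
    one_hot(account, accounts) + one_hot(user, users) + [amount / max_amount]
    for account, user, amount in entries
]

# Group the entries into a pre-determined number of clusters (3, as on the slide).
model = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = model.fit_predict(X)
print(list(labels))
```

The cluster numbers themselves carry no meaning; what matters is which entries end up grouped together.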
  • 19. Kmeans continued http://blog.mpacula.com/2011/04/27/k-means-clustering-example-python/
  • 20. As an auditor, what does this mean for you? • New opportunities and risks • Catch-22: businesses must either accept the risk of black boxes or become irrelevant • Use cases in audit analytics • More complicated environment; new skills required to understand business implications and audit algorithms
  • 21. Use cases in Assurance and Compliance • Anomaly detection – Unsupervised journal entry anomaly detection – Clustering on invoice and AP data for outliers • ‘Auditor sense’ investigation – Supervised model for expense report investigation – Supervised model for journal entries – AP transactions, customer transactions, etc.
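To give a feel for unsupervised anomaly detection on invoice data, here is a minimal sketch that needs no labels. It uses a modified z-score based on the median, which is a simpler stand-in for the clustering approach named on the slide, not the presenter's method. The invoice amounts and the `flag_outliers` helper are hypothetical; the 0.6745 constant and 3.5 cutoff are the conventional values for this score.

```python
import statistics


def flag_outliers(amounts, threshold=3.5):
    """Return the indexes of amounts whose modified z-score exceeds the threshold."""
    median = statistics.median(amounts)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median(abs(a - median) for a in amounts)
    if mad == 0:
        return []  # no spread at all, so nothing can be scored
    flagged = []
    for index, amount in enumerate(amounts):
        score = 0.6745 * (amount - median) / mad
        if abs(score) > threshold:
            flagged.append(index)
    return flagged


# Hypothetical invoice amounts from one vendor.
invoice_amounts = [102.0, 98.5, 101.2, 99.9, 100.4, 97.8, 2500.0, 100.9]

suspicious = flag_outliers(invoice_amounts)
print([invoice_amounts[i] for i in suspicious])
```

A flagged item is only a lead for the auditor to follow up, not evidence of a problem on its own.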
  • 22. The Machine Learning Algorithm Audit • With algorithms increasingly dictating our lives, how do we know that they are operating as intended? – e.g. Weapons of Math Destruction by Cathy O'Neil • Unfilled role for assurance professionals. – Review assumptions and, when available (as with decision trees, logistic regression, etc.), look at the weighting of features in the model. – Can provide a lot of value using only SDLC audit methodologies
  • 23. Machine Learning Audit Example – Logistic Regression

        >> weights = pd.Series(clf.coef_[0], index=ShoeData.columns)
        >> weights
        Height      -0.439204
        Weight       0.622762
        Shoe_size    0.829036
        >> weights.plot(kind='bar', title='…')

    https://github.com/aclarkData/simple-machine-learning-examples/blob/master/very_simple_examples/BasicMachineLearning.ipynb
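One way an auditor might use the feature weights is to break a single prediction into per-feature contributions. The sketch below multiplies the three weights reported on the slide by the values of the person classified in the earlier logistic regression example. Two assumptions apply: the slide does not show the model's intercept, so it is left out, and it is not confirmed that the weights and the earlier example come from the same fitted model.

```python
# Weights as reported on the slide (the intercept is not shown there, so it is omitted).
weights = {"Height": -0.439204, "Weight": 0.622762, "Shoe_size": 0.829036}

# The person classified in the earlier logistic regression example.
person = {"Height": 190, "Weight": 70, "Shoe_size": 43}

# Each feature's contribution to the linear score is weight * value.
contributions = {name: weights[name] * person[name] for name in weights}
score = sum(contributions.values())

# List the contributions from most negative to most positive.
for name, value in sorted(contributions.items(), key=lambda item: item[1]):
    print(f"{name:<10} {value:+.2f}")
print(f"{'score':<10} {score:+.2f}")
```

In scikit-learn's binary logistic regression, a positive score favors the second entry of `clf.classes_` and a negative score the first. With the intercept missing, the sign here is only indicative, but the breakdown still shows which feature pulls hardest in each direction.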
  • 24. Machine Learning Audit Example – Decision Tree Classifier

        >> from sklearn import tree
        >> clf = tree.DecisionTreeClassifier()
        >> clf.fit(X, Y)
        >> prediction = clf.predict([[190, 70, 43]])
        >> print(prediction)
        ['male']
        >> # class_names takes the target labels, not the feature columns
        >> tree.export_graphviz(clf, feature_names=list(ShoeData.columns), class_names=list(clf.classes_), out_file='tree.dot')

    https://github.com/aclarkData/simple-machine-learning-examples/blob/master/very_simple_examples/BasicMachineLearning.ipynb
  • 25. http://www.webgraphviz.com/
  • 26. EU’s General Data Protection Regulation (GDPR) • In April 2016, the EU adopted the General Data Protection Regulation, which gives citizens and regulators a right to an explanation of algorithmic decision making. • Empowers citizens to understand why they were rejected for a bank loan, for instance, when the decision was based on an algorithm.
  • 27. Where can I learn more about Machine Learning? • Visual intro, highly recommended, short and sweet • http://www.r2d3.us/visual-intro-to-machine-learning-part-1/ • Wikipedia • https://en.wikipedia.org/wiki/Machine_learning • Good beginning article with some fantastic books • http://machinelearningmastery.com/4-steps-to-get-started-in-machine-learning/ • Weka • http://www.cs.waikato.ac.nz/ml/weka/ • Scikit-Learn • http://scikit-learn.org/
  • 28. Conclusion and recap • Definition of Machine Learning • Buzzword breakdown • Machine Learning process • Broad algorithm overview • Real world use cases • The Machine Learning Audit • Where to learn more about Machine Learning
  • 29. Thank you! • Email: andrewtaylorclark@gmail.com • GitHub: aclarkData • Blog: https://aclarkdata.github.io/ • LinkedIn: www.linkedin.com/in/andrew-clark-b326b767