Copyright © 2017 Information Systems Audit and Control Association, Inc. All rights reserved.
Andrew Clark, IT Auditor / Internal Audit Data Scientist
Astec Industries, Inc., M.S. Data Science Candidate
Overview
1. What is open source software?
2. Why is it important?
3. What are the benefits of using open source software for analytics over
computer-assisted audit techniques (CAATs)?
4. How do I begin using open source software for analytics?
5. Case study
6. The application of advanced analytic techniques
Meet Open Source
Open Source Software
“Open source software is software whose source code is available for
modification or enhancement by anyone.”
"What Is Open Source?" Opensource.com. Accessed June 12, 2016. https://opensource.com/resources/what-open-source.
Open Source examples
1. Linux (mainly)
2. Android (mainly)
3. Firefox
4. R programming language
5. Git
6. Docker
Why is it important?
• Vibrant community
• Frequent updates
• Potential for strong security
• Cutting edge technology
• Customizable
• Cost
How does Open Source relate to Audit Analytics?
• State of the art technology
• Computer science's best and brightest love to contribute
• Customizable
• Scalability
• Beautiful visualizations
• Analytics and Data Science leaders, e.g. Google, Facebook, Uber and Airbnb,
use open source frameworks almost exclusively for their analytics.
"Bubble Charts." Plotly. Accessed August 14, 2016. https://plot.ly/python/bubble-charts/.
Benefits over traditional CAATs
• ACL, IDEA and Arbutus are the existing market leaders
• Not very user friendly
• Require extensive training to use effectively
• Not very flexible
• Do not provide the output auditors are expecting
So what do we do about it?
Enter Python (and R)
What is Python?
"About Python." Python.org. Accessed August 14, 2016. https://www.python.org/about/.
• Open source, general-purpose programming language
• High level of support
• Used by some of the best and brightest in Data Science
• Extensive scientific, mathematical, data wrangling and visualization libraries
• Most popular first language in computer science departments across America
(http://tinyurl.com/knw5mdv)
What is R?
• "R is a language and environment for statistical computing and graphics."-
"What Is R?" The R Project for Statistical Computing. Accessed August 14, 2016. https://www.r-project.org/about.html.
• Used widely by statisticians for statistical analysis
• As a result of its widespread use, there are thousands of easy-to-implement
libraries that provide *all* widely used statistical techniques
• Is a domain-specific language for statistics rather than a general-purpose
programming language
How would we go about using Python (or R)?
• The hard way: by learning it
• The even harder way: hire an auditor with programming, analytics and
auditing experience
• The *easiest* and most effective way: create a cross-functional team by
borrowing a programmer from IT and a business analyst from the
business.
Example Python (and R) analytic test
• https://github.com/aclarkData/AuditAnalytics
• Journal entry tests: 999 amounts, weekend postings and keywords
• Steps:
• Import libraries
• Import data
• Wrangle as needed
• Export to folder
• Email
• Schedule: Task Scheduler on Windows; cron or an equivalent on Unix-based systems, e.g. macOS and Linux
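The import, wrangle and export steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the linked repository: the column names, the sample rows and the keyword list are hypothetical, "999 amount" is read here as an amount whose whole-currency part ends in 999, and the standard library is used so the sketch has no dependencies.

```python
import csv
import io
from datetime import date
from decimal import Decimal

# Hypothetical journal entry extract; the column names are assumptions.
SAMPLE_CSV = """entry_id,posted_date,amount,description
1,2016-08-12,1500.00,Monthly rent
2,2016-08-13,999.00,Manual adjustment per CFO
3,2016-08-15,4999.00,Office supplies
4,2016-08-14,250.75,Plug entry to balance
5,2016-08-16,120.00,Travel reimbursement
"""

# Illustrative keyword list, not taken from the presentation.
KEYWORDS = ("adjustment", "plug", "per cfo")


def load_entries(handle):
    """Import data: parse each CSV row into typed fields."""
    entries = []
    for row in csv.DictReader(handle):
        entries.append({
            "entry_id": int(row["entry_id"]),
            "posted_date": date.fromisoformat(row["posted_date"]),
            "amount": Decimal(row["amount"]),
            "description": row["description"],
        })
    return entries


def flag_entry(entry):
    """Return the names of the tests that flag this entry."""
    flags = []
    # 999 test: the whole-currency part of the amount ends in 999.
    if int(abs(entry["amount"])) % 1000 == 999:
        flags.append("999_amount")
    # Weekend test: weekday() is 0 for Monday, so 5 and 6 are the weekend.
    if entry["posted_date"].weekday() >= 5:
        flags.append("weekend")
    # Keyword test: case-insensitive substring match on the description.
    text = entry["description"].lower()
    if any(word in text for word in KEYWORDS):
        flags.append("keyword")
    return flags


def run_tests(entries):
    """Wrangle: keep only entries that trip at least one test."""
    results = []
    for entry in entries:
        flags = flag_entry(entry)
        if flags:
            results.append({**entry, "flags": ";".join(flags)})
    return results


def export(results, handle):
    """Export: write the flagged entries as CSV to any file-like object."""
    fields = ["entry_id", "posted_date", "amount", "description", "flags"]
    writer = csv.DictWriter(handle, fieldnames=fields)
    writer.writeheader()
    writer.writerows(results)


flagged = run_tests(load_entries(io.StringIO(SAMPLE_CSV)))
```

In practice the in-memory sample would be replaced by a file opened with `open(path, newline="")`, and `export` would write to a file in the output folder before the email and scheduling steps pick it up.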
Machine Learning
• In essence, a machine understanding patterns in data without having to be
explicitly programmed.
• A powerful technology that is transforming banking, search engines and
advertising, and will soon reach every industry.
• Examples: credit card fraud detection, targeted demographic advertising,
anomaly detection in sensor data, etc.
Machine Learning Cont.
• Numerous possibilities for utilizing machine learning and related
technology, e.g. Natural Language Processing, for Financial Auditing
• For example, an unsupervised clustering algorithm is in use at Astec Industries.
• The latest developments are only available in open source software or in
expensive statistical or computational programs such as SAS, which,
as of August 2016, cost a minimum of $9,200 upfront per single-user license plus
annual fees - "SAS® Analytics Pro." SAS®. Accessed August 26, 2016.
https://www.sas.com/store/software/analytics-pro/prodPERSANL.html.
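To make "unsupervised clustering" concrete, here is a minimal sketch of one such algorithm, k-means, applied to transaction amounts. The slides do not say which algorithm or which features are used at Astec Industries, so the choice of k-means, the one-dimensional data and the sample amounts are all assumptions made for illustration.

```python
def kmeans_1d(values, k=2, iterations=50):
    """Plain k-means on one-dimensional data with deterministic starting points."""
    if k < 2 or k > len(values):
        raise ValueError("k must be between 2 and the number of values")
    ordered = sorted(values)
    # Spread the initial centroids evenly across the sorted data.
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break  # Converged: assignments can no longer change.
        centroids = updated
    return centroids, clusters


# Hypothetical transaction amounts; no labels are supplied to the algorithm.
amounts = [102.0, 98.0, 110.0, 95.0, 105.0, 5000.0, 5200.0]
centroids, clusters = kmeans_1d(amounts, k=2)

# The smallest cluster is a natural candidate for auditor review.
review = min(clusters, key=len)
```

The point of the sketch is that no one tells the algorithm what an unusual amount looks like; the grouping emerges from the data, and the auditor then decides whether the small cluster deserves follow-up.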
Possibilities
• Time Series Machine Learning for predicting account balances
• Natural Language Processing techniques for contract review and
summarization; the current bottleneck is Optical Character Recognition (OCR)
technology.
• Sentiment Analysis for Journal Entry and Transaction descriptions.
• Jupyter notebooks for reproducible analytics and audit documentation
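As a rough illustration of the first possibility, the sketch below fits a least-squares linear trend to past month-end balances and projects the next one. A linear trend is about the simplest stand-in for a time series model and is an assumption of this example, as are the function names and the sample balances; a real engagement would likely need seasonality and a proper forecasting library.

```python
def fit_trend(balances):
    """Least-squares line through (0, b0), (1, b1), ...; returns (slope, intercept)."""
    n = len(balances)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(balances) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, balances))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(balances, steps_ahead=1):
    """Project the fitted trend the given number of periods past the last observation."""
    slope, intercept = fit_trend(balances)
    return intercept + slope * (len(balances) - 1 + steps_ahead)


def deviates(actual, expected, tolerance=0.10):
    """True when the actual balance differs from the expectation by more than tolerance."""
    return abs(actual - expected) > tolerance * abs(expected)


# Hypothetical month-end balances for one account.
history = [1000.0, 1100.0, 1200.0, 1300.0]
expected_next = forecast(history)
```

An auditor could compare the recorded balance against `expected_next` with `deviates` and investigate only the accounts that fall outside the tolerance.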
Conclusion
• Definition of Open Source Software
• Unlimited possibilities for a customizable analytics experience
• Scalable
• Real world example
• Machine Learning and the future of audit analytics
Thank you!
• Email: andrewtaylorclark@gmail.com
• GitHub: aclarkData
• Blog: https://aclarkdata.github.io
• LinkedIn: www.linkedin.com/in/andrew-clark-b326b767

Malana- Gimlet Market Analysis (Portfolio 2)Malana- Gimlet Market Analysis (Portfolio 2)
Malana- Gimlet Market Analysis (Portfolio 2)
 
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
一比一原版(Coventry毕业证书)考文垂大学毕业证如何办理
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Dat...
 
Criminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdfCriminal IP - Threat Hunting Webinar.pdf
Criminal IP - Threat Hunting Webinar.pdf
 
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
一比一原版(BCU毕业证书)伯明翰城市大学毕业证如何办理
 
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
一比一原版(Bradford毕业证书)布拉德福德大学毕业证如何办理
 

Where Open Source Meets Audit Analytics - ISACA North America CACS 2017

  • 1. Andrew Clark, IT Auditor / Internal Audit Data Scientist, Astec Industries, Inc.; M.S. Data Science Candidate
  • 2. Overview 1. What is open source software? 2. Why is it important? 3. What are the benefits of using open source software for analytics over CAATs? 4. How do I begin using open source software for analytics? 5. Case study 6. The application of advanced analytic techniques
  • 3. Meet Open Source
  • 4. Open Source Software “Open source software is software whose source code is available for modification or enhancement by anyone.” "What Is Open Source?" Opensource.com. Accessed June 12, 2016. https://opensource.com/resources/what-open-source.
  • 5. Open Source examples 1. Linux (mainly) 2. Android (mainly) 3. Firefox 4. R programming language 5. Git 6. Docker
  • 6. Why is it important? • Vibrant community • Frequent updates • Potential for strong security • Cutting edge technology • Customizable • Cost
  • 7. How does Open Source relate to Audit Analytics? • State of the art technology • Computer science's best and brightest love to contribute • Customizable • Scalability • Beautiful visualizations • Analytics and Data Science leaders rely almost exclusively on open source frameworks for their analytics, e.g. Google, Facebook, Uber, Airbnb, etc.
  • 8. "Bubble Charts." Plotly. Accessed August 14, 2016. https://plot.ly/python/bubble-charts/.
  • 9. Benefits over traditional CAATs • ACL, IDEA and Arbutus are the existing market leaders • Not very user friendly • Require extensive training to use effectively • Not very flexible • Do not provide the output auditors are expecting
  • 10. So what do we do about it?
  • 11. Enter Python (and R)
  • 12. What is Python? "About Python." Python.org. Accessed August 14, 2016. https://www.python.org/about/. • Open source, general-purpose programming language • High level of support • Used by some of the best and brightest in Data Science • Extensive scientific, mathematical, data wrangling and visualization libraries • Most popular first language in computer science departments across America (http://tinyurl.com/knw5mdv)
  • 13. What is R? • "R is a language and environment for statistical computing and graphics." ("What Is R?" The R Project for Statistical Computing. Accessed August 14, 2016. https://www.r-project.org/about.html.) • Used widely by statisticians for statistical analysis • As a result of its widespread use, there are thousands of easy-to-implement libraries that provide *all* widely used statistical techniques • Designed for statistics rather than as a general-purpose programming language
  • 14. How would we go about using Python (or R)? • The hard way: by learning it • The even harder way: hire an auditor with programming, analytics and auditing experience • The *easiest* and most effective way: create a cross-functional team by borrowing a programmer from IT and a business analyst from the business.
  • 15. Example Python (and R) analytic test • https://github.com/aclarkData/AuditAnalytics • 999 amount, weekend and keyword journal entry tests • Steps: • Import libraries • Import data • Wrangle as needed • Export to folder • Email • Schedule - Task Scheduler in Windows, or cron or an equivalent on Unix-based systems, e.g. Mac and Linux
  • 20. Machine Learning • In essence, a machine understanding patterns in data without having to be explicitly programmed. • Very, very powerful technology that is transforming banking, search engines, advertising, and soon, every industry. • Examples: credit card fraud detection, target demographic advertising, anomalous sensor data, etc.
  • 21. Machine Learning Cont. • Numerous possibilities for utilizing machine learning and related technology, e.g. Natural Language Processing, for Financial Auditing • For example, an unsupervised clustering algorithm is in use at Astec Industries. • Latest developments are only available in open source software or in expensive statistical or computational programs such as SAS, which currently runs at a minimum of $9,200 upfront per single user license plus annual fees - “SAS® Analytics Pro." SAS®. Accessed August 26, 2016. https://www.sas.com/store/software/analytics-pro/prodPERSANL.html.
  • 22. Possibilities • Time Series Machine Learning for predicting account balances • Natural Language Processing techniques for contract review and summarization - current bottleneck is Optical Character Recognition (OCR) technology. • Sentiment Analysis for Journal Entry and Transaction descriptions. • Jupyter notebooks for reproducible analytics and audit documentation
  • 24. Conclusion • Definition of Open Source Software • Unlimited possibilities for a customizable analytics experience • Scalable • Real world example • Machine Learning and the future of audit analytics
  • 25. Thank you! • Email: andrewtaylorclark@gmail.com • GitHub: aclarkData • Blog: https://aclarkdata.github.io • LinkedIn: www.linkedin.com/in/andrew-clark-b326b767
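
The journal entry tests listed on slide 15 (999 amount, weekends, keywords) might look roughly like the sketch below. It is an illustration only, not the code in the AuditAnalytics repository. The column names (`entry_id`, `posted_on`, `amount`, `description`), the keyword list, the sample rows, and the reading of "999 amount" as a whole-dollar amount ending in 999 are all assumptions made for this example. It uses only the Python standard library.

```python
import csv
import io
from datetime import date

# Hypothetical keyword list; a real test would use terms chosen by the audit team.
KEYWORDS = ("adjust", "plug", "reverse")


def flag_entry(entry):
    """Return the names of the tests that a single journal entry trips."""
    flags = []
    # Assumed reading of the "999 amount" test: whole-dollar part ends in 999.
    if int(abs(float(entry["amount"]))) % 1000 == 999:
        flags.append("999_amount")
    # date.weekday() returns 5 for Saturday and 6 for Sunday.
    if date.fromisoformat(entry["posted_on"]).weekday() >= 5:
        flags.append("weekend")
    # Case-insensitive substring match against the keyword list.
    description = entry["description"].lower()
    if any(word in description for word in KEYWORDS):
        flags.append("keyword")
    return flags


def run_tests(csv_text):
    """Read journal entries from CSV text and return only the flagged ones."""
    flagged = []
    for entry in csv.DictReader(io.StringIO(csv_text)):
        flags = flag_entry(entry)
        if flags:
            flagged.append({"entry_id": entry["entry_id"], "flags": flags})
    return flagged


# Made-up sample data standing in for an imported journal entry extract.
SAMPLE = """entry_id,posted_on,amount,description
JE-1,2017-03-01,1500.00,Monthly rent
JE-2,2017-03-04,250.00,Office supplies
JE-3,2017-03-06,4999.00,Equipment purchase
JE-4,2017-03-07,820.10,Manual adjustment to accrual
"""

if __name__ == "__main__":
    for row in run_tests(SAMPLE):
        print(row["entry_id"], ", ".join(row["flags"]))
```

In practice the flagged rows would be written to a folder and emailed, and the script would be run on a schedule with Task Scheduler or cron, as the slide describes. Those steps depend on the local environment and are left out here.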