Data Mining

Data Mining presentation for USF Graduate Computer Forensics class.

    Data Mining Presentation Transcript

    • Ed Tobias, CISA, CIA
      June 8, 2010
      Data Mining
    • Topics
      Introduction & Current Perceptions
      What is Data Mining?
      How is Data Mining used?
      Why is Data Mining important?
    • A little story involving …
      New G/L system
      Curious Audit Manager
      Questionable accounting entries
    • Introduction
      IT Audit Manager for Hillsborough County
      Certified as a CISA and CIA
      Spend 50% of time doing Data Mining
      Audit Risk Assessment
      Testing control effectiveness
      Fraud Detection
    • Introduction
      Who are you?
      Other industries
    • Current Perceptions about DM
      What do you think Data Mining is?
    • Heard of CAATs?
      Computer Assisted Audit Techniques
      Formerly a specialized skill for IT Auditors
      Common in every audit
      Term is practically obsolete
    • What is Data Mining?
      Automate the detection of relevant patterns
      Look at current & historical data
      Predict future trends
      Efficient method for analyzing large amounts of data
      Enhance key item sampling
      Means for continuous auditing
    • How is Data Mining used?
      Proactively review business processes
      Identify anomalies
      Risk Assessment
      Reactively assist law enforcement in investigations
    • How is Data Mining used?
      Outside of Audit, DM is used to generate revenue
      Automating the detection of relevant patterns
      Look at current & historical data
      Predict future trends – “Predictive Analysis”
      aka Business Intelligence / Data Warehouse
    • Charity Fundraising
      May 2010 issue of SmartMoney magazine
      2 data mining articles
    • Charity Fundraising
      Hospital in San Diego, CA
      Patients get their treatment and their assets scanned
      Donor research on your salary history, LinkedIn connections, satellite images of your pool
    • Charity Fundraising
      Brown University – 6 of 10 donors who gave $100K+ identified through DM
    • Charity Fundraising
      ASPCA – using DM and Predictive Analysis to determine donors
      4x donations over 5 years - $80M
    • Charity Fundraising
      Charity’s Junk Mail Strategy
      Double Dip – another request
      Virus effect
      “Dear Friend” – personal recognition
    • How is Data Mining used?
      Audit Process
      Risk Assessment
      Control Assessment
      Fraud Detection and Prevention
    • How is Data Mining used?
      Risk Assessment
      Data analysis for high risk areas
      High Dollar amounts
      Potential for fraud
      Potential for non-compliance
    • How is Data Mining used?
      Risk Assessment
      What can be detected?
      Potential fraud or control weaknesses
      Duplicate vendors
      Duplicate invoices
      Duplicate amounts
      Benford’s Law – identify suspicious transactions
      Focus audit on high risk areas
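The duplicate and Benford's Law checks listed above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's actual tooling: the record fields (`vendor`, `number`, `amount`) and all function names are assumptions made for the example. Benford's Law says the leading digit d of many naturally occurring amounts appears with probability log10(1 + 1/d), so digits that occur far more often than expected point to transactions worth a closer look.

```python
import math
from collections import Counter


def find_duplicate_invoices(invoices):
    """Group invoices that share vendor, invoice number, and amount.

    Each invoice is a dict with hypothetical keys 'vendor', 'number', 'amount'.
    Returns only the groups that contain more than one invoice.
    """
    groups = {}
    for inv in invoices:
        # Normalize the vendor name so 'Acme ' and 'ACME' compare equal.
        key = (inv["vendor"].strip().upper(), inv["number"].strip(),
               round(inv["amount"], 2))
        groups.setdefault(key, []).append(inv)
    return [group for group in groups.values() if len(group) > 1]


def benford_expected(digit):
    """Expected share of amounts whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def leading_digit(amount):
    """First significant digit of a non-zero amount."""
    # Scientific notation always starts with the first significant digit.
    return int(f"{abs(amount):.15e}"[0])


def benford_deviations(amounts):
    """Map each leading digit to (observed share - expected share)."""
    digits = [leading_digit(a) for a in amounts if a != 0]
    if not digits:
        return {}
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total - benford_expected(d)
            for d in range(1, 10)}
```

A large positive deviation for one digit is only a lead for the auditor, not proof of fraud; small populations and amounts constrained by policy (for example, approval limits) can deviate from Benford's Law for legitimate reasons.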
    • How is Data Mining used?
      Control Assessment
      Traditional audit used sampling approach
      Auditors placed disclaimers regarding the accuracy of their statistical sampling
      Not affordable or available anymore
      Total assurance & clear indication of errors
      DM uses 100% of transactions
      Increases credibility & value of audit
    • Why is Data Mining important?
      Examples of Data Mining
      Proactive - Purchasing and Procurement
      Reactive - Health Plan Auditing
    • Purchasing and Procurement
      Common area for fraud
      Abuse of financial authority
      Technical manipulation of specifications
      Internal collusion to circumvent controls
      External collusion with suppliers
      Manipulation of bid review
      Bogus invoices
    • Purchasing and Procurement
      2003 review - five local government procurement agencies in London
      Purchasing managers
      Specified the bid criteria
      Suggested pre-approved companies
      Suspected of collusion with suppliers
      DM used to identify possible trends
    • Purchasing and Procurement
      Analyzed “win/lose” statistics of the vendors
      Number of bids won
      Number of bids lost due to cost
      Number of bids where vendor failed approval
      Number of bids lost for “other” reasons
    • Purchasing and Procurement
      Determined two key elements:
      Vendors that consistently lost their bids – “shadow bidders”
      Vendors that won over 95% of bids
      Team of forensic experts was used to review contracts and work performed
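The win/lose analysis described above could look roughly like the following sketch. The outcome labels, the `min_bids` cutoff, and the rule that a "shadow bidder" is a vendor with no wins at all are assumptions for illustration; the only threshold taken from the slides is the 95% win rate.

```python
from collections import defaultdict

# Outcome categories mirroring the four statistics on the slide (names assumed).
OUTCOMES = ("won", "lost_cost", "failed_approval", "lost_other")


def bid_statistics(bids):
    """Count outcomes per vendor from an iterable of (vendor, outcome) pairs."""
    stats = defaultdict(lambda: dict.fromkeys(OUTCOMES, 0))
    for vendor, outcome in bids:
        if outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {outcome!r}")
        stats[vendor][outcome] += 1
    return dict(stats)


def flag_vendors(stats, min_bids=5, high_win_rate=0.95):
    """Flag vendors that win more than `high_win_rate` of bids or never win.

    Vendors with fewer than `min_bids` bids are skipped, since a handful of
    bids says little about a pattern (the cutoff is an assumption).
    """
    flags = {}
    for vendor, counts in stats.items():
        total = sum(counts.values())
        if total < min_bids:
            continue
        win_rate = counts["won"] / total
        if win_rate > high_win_rate:
            flags[vendor] = "frequent_winner"
        elif win_rate == 0:
            flags[vendor] = "possible_shadow_bidder"
    return flags
```

The flags only narrow the field; as in the case above, the flagged contracts and the work performed still need review by people.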
    • Purchasing and Procurement
      Collusion to circumvent controls in place
      Required to have five bids
      Used shadow bidders
      Bid review group
      Ensured that a selected (and corrupt) supplier was chosen every time
      Allowed substitution of inferior materials
    • Health Plan Auditing
      2010 case study – Conducted by St. Joseph’s Univ. & Healthcare Data Mgmt
      Two large companies’ health insurance claims data over a two-year period
      Company A – 108,000 claims, $25.3M paid
      Company B – 464,000 claims, $118.4M paid
    • Health Plan Auditing
      Compared the results of 100% auditing vs. random-sampling claims (300-400 samples)
      100% auditing produced very distinct results
      Company A - $3.12M exception claims
      Company B - $5.47M exception claims
    • Health Plan Auditing
      1 – Average of the two Post-Audit years
    • Health Plan Auditing
      Company A - $3.12M in exception claims (9,315 records)
    • Health Plan Auditing
      Company B - $5.47M in exception claims (20,395 records)
    • Health Plan Auditing
      Random-sampling claims - used best “analysis” to simulate the audits
      Using exceptions from the 100% auditing approach
      100 random samples of 300 exceptions
      100 random samples of 400 exceptions
      Statistically close to their “population of exceptions” parameters from 100% auditing
    • Health Plan Auditing – 300 samples
      • On average, random sampling missed $2.91M to $5.39M of exception claims paid
    • Health Plan Auditing – 400 samples
      • On average, random sampling missed $2.85M to $5.36M of exception claims paid
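The sampling comparison above can be reproduced in spirit with a short simulation. This is a sketch under stated assumptions, not the study's method or data: the exception amounts are synthetic (uniformly distributed between two made-up bounds), and the function names are hypothetical. It draws repeated random samples of exceptions and averages the dollars that a sample-based audit would leave unexamined.

```python
import random


def make_synthetic_exceptions(count, seed=42, low=50.0, high=600.0):
    """Generate made-up exception claim amounts for illustration only."""
    rng = random.Random(seed)
    return [round(rng.uniform(low, high), 2) for _ in range(count)]


def average_missed(exception_amounts, sample_size, trials=100, seed=0):
    """Average dollars of exceptions NOT covered by a random sample.

    A 100% audit sees every exception; a sample-based audit sees only the
    sampled ones. Returns (average missed dollars, total exception dollars).
    """
    rng = random.Random(seed)
    total = sum(exception_amounts)
    missed = []
    for _ in range(trials):
        # Sample without replacement, as an auditor pulling distinct claims.
        sample = rng.sample(exception_amounts, sample_size)
        missed.append(total - sum(sample))
    return sum(missed) / trials, total
```

For instance, calling `average_missed(make_synthetic_exceptions(9315), 300)` uses a population the size of Company A's exception records, though with invented dollar amounts. Because 300 or 400 records are a small fraction of several thousand, most of the exception dollars remain outside the sample either way.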
    • Health Plan Auditing
      Random-sampling audits – produced statistically valid estimates of exception claims
      Not the objective in this audit
      Determine root causes of errors
      Minimize / eliminate the errors
    • Health Plan Auditing
      Random-sampling missed a significant amount of exception claim amounts
      Increasing the sample size from 300 to 400
      Did not identify significantly more errors
      Over 90% of claim errors still missed
      Significant amounts of money are wasted
    • Health Plan Auditing
      Random-sampling does not identify root cause of errors
      Trend analysis only possible through data mining
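One way to look for root causes once every exception is available is to group the exceptions by a cause field and rank the groups by dollars. The sketch below assumes each exception record carries hypothetical `cause` and `amount` fields; the slides do not say which fields the study used.

```python
from collections import defaultdict


def exceptions_by_cause(exceptions):
    """Rank exception causes by total dollars, largest first.

    Each exception is a dict with hypothetical keys 'cause' and 'amount'.
    Returns a list of (cause, {'count': n, 'amount': dollars}) pairs.
    """
    totals = defaultdict(lambda: {"count": 0, "amount": 0.0})
    for claim in exceptions:
        bucket = totals[claim["cause"]]
        bucket["count"] += 1
        bucket["amount"] += claim["amount"]
    return sorted(totals.items(),
                  key=lambda item: item[1]["amount"],
                  reverse=True)
```

A ranking like this is only meaningful over the full population; a sample of a few hundred claims may contain too few records per cause to show a trend.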
    • Questions
    • Contact Information
      LinkedIn - http://www.linkedin.com/in/ed3200
    • References
      ACFE. 2008 Report to the Nation on Occupational Fraud & Abuse. 2008. Retrieved 6/1/10 from http://www.acfe.com/documents/2008-rttn.pdf
      Barrier, M. One right path: Cynthia Cooper. 2003. Retrieved 6/2/10 from http://findarticles.com/p/articles/mi_m4153/is_6_60/ai_111737943/
      Bourke, J. Computer Assisted Audit Techniques or CAATS. 2010. Retrieved 5/25/10 from https://www.cpa2biz.org/Content/media/PRODUCER_CONTENT/Newsletters/Articles_2010/CPA/Jan/CAATS.jsp
      Deeson, M. Audit says Office Depot overcharged county one million dollars. 2010. Retrieved 6/3/10 from http://www.wtsp.com/news/local/story.aspx?storyid=130518
      Denker, B. Data Mining and the Auditor’s Responsibility. 2003. Retrieved 6/1/10 from http://www.isaca.org/Content/ContentGroups/InfoBytes/20032/Data_Mining_and_the_Auditors_Responsibility.htm
      Kadet, A. Are Charity Fundraisers Spying on You? 2010. Retrieved 6/1/10 from http://www.smartmoney.com/personal-finance/estate-planning/are-charity-fundraisers-spying-on-you
    • References
      Kadet, A. Your Charity’s Junk-Mail Strategy. 2010. Retrieved 6/1/10 from http://www.smartmoney.com/personal-finance/estate-planning/are-charity-fundraisers-spying-on-you
      Kusnierz, R. A Case for Data Mining. 2003. Retrieved 6/1/10 from http://www2.northumberland.gov.uk/fraud/Documents/HM%20Treasury%20Reports/fraud_anti_fraud_adv_02-03.pdf
      Sayana, A. Using CAATs to Support IS Audit. 2003. Retrieved 6/1/10 from http://www.isaca.org/Journal/Past-Issues/2003/Volume-1/Pages/Using-CAATS-to-Support-IS-Audit.aspx
      Sillup, G. and Klimberg, R. Health Plan Auditing: 100-Percent-of-Claims vs. Random-Sample Audits. 2010. Retrieved 6/1/10 from http://www.sawgrassbc.com/captives/Health%20Plan%20Auditing_%20100%20Percent%20verses%20Random%20Sample%2001.2010%20.pdf
      Silltow, J. Data Mining 101: Tools and Techniques. 2006. Retrieved 5/25/10 from http://www.theiia.org/intAuditor/itaudit/archives/2006/august/data-mining-101-tools-and-techniques/
      Wolfe, J. Effective Data Mining for Financial Services Companies. 2008. Retrieved 6/1/10 from http://www.theiia.org/intAuditor/in-the-industry/2008/november/effective-data-mining-for-financial-services-companies/index.cfm?print&search=jonathan%20wolfe&Y=1899 (IIA members only URL)
      Zink, J. Office Depot billing disputed. 2010. Retrieved 6/3/10 from http://www.tampabay.com/news/localgovernment/article1090581.ece