Data Mining


Data Mining presentation for USF Graduate Computer Forensics class.

  1. Ed Tobias, CISA, CIA; June 8, 2010; Data Mining
  2. Topics: Introduction & Current Perceptions; What is Data Mining?; How is Data Mining used?; Why is Data Mining important?; Questions
  3. A little story involving … New G/L system; Curious Audit Manager; Questionable accounting entries
  4. Introduction: IT Audit Manager for Hillsborough County; Certified as a CISA and CIA; Spend 50% of time doing Data Mining; Audit Risk Assessment; Testing control effectiveness; Compliance; Fraud Detection
  5. Introduction: Who are you? Accountants; Auditors; Consultants; Other industries
  6. Current Perceptions about DM: What do you think Data Mining is?
  7. Heard of CAATs? Computer Assisted Audit Techniques; Formerly a specialized skill for IT Auditors; Now common in every audit; Term is practically obsolete
  8. What is Data Mining? Automate the detection of relevant patterns; Look at current & historical data; Predict future trends; Efficient method for analyzing large amounts of data; Enhance key item sampling; Means for continuous auditing
  9. How is Data Mining used? Proactively review business processes; Identify anomalies; Risk Assessment; Reactively assist law enforcement in investigations
  10. How is Data Mining used? Outside of Audit, DM is used to generate revenue; Automating the detection of relevant patterns; Look at current & historical data; Predict future trends – “Predictive Analysis”; aka Business Intelligence / Data Warehouse
  11. Charity Fundraising: May 2010 issue of SmartMoney magazine; 2 data mining articles
  12. Charity Fundraising: Hospital in San Diego, CA; Patients get their treatment and their assets scanned; Donor research on your salary history, LinkedIn connections, satellite images of your pool
  13. Charity Fundraising: Brown University – 6 of 10 donors who gave $100K+ identified through DM
  14. Charity Fundraising: ASPCA – using DM and Predictive Analysis to identify donors; 4x donations over 5 years - $80M
  15. Charity Fundraising: Charity’s Junk Mail Strategy; Mailings; Double Dip – another request; Virus effect; “Dear Friend” – personal recognition
  16. How is Data Mining used? Audit Process; Risk Assessment; Control Assessment; Observations; Fraud Detection and Prevention
  17. How is Data Mining used? Risk Assessment; Data analysis for high risk areas; High Dollar amounts; Potential for fraud; Potential for non-compliance
  18. How is Data Mining used? Risk Assessment; What can be detected? Potential fraud or control weaknesses; Duplicate vendors; Duplicate invoices; Duplicate amounts; Benford’s Law – identify suspicious transactions; Focus audit on high risk areas
  19. How is Data Mining used? Control Assessment; Traditional audit used sampling approach; Auditors placed disclaimers regarding the accuracy of their statistical sampling; Not affordable or available anymore; Total assurance & clear indication of errors; DM uses 100% of transactions; Increases credibility & value of audit
  20. Why is Data Mining important? Examples of Data Mining; Proactive - Purchasing and Procurement; Reactive - Health Plan Auditing
  21. Purchasing and Procurement: Common area for fraud; Abuse of financial authority; Technical manipulation of specifications; Internal collusion to circumvent controls; External collusion with suppliers; Manipulation of bid review; Overbilling; Bogus invoices
  22. Purchasing and Procurement: 2003 review - five local government procurement agencies in London; Purchasing managers specified the bid criteria, suggested pre-approved companies, and were suspected of collusion with suppliers; DM used to identify possible trends
  23. Purchasing and Procurement: Analyzed “win/lose” statistics of the vendors; Number of bids won; Number of bids lost due to cost; Number of bids where vendor failed approval; Number of bids lost for “other” reasons
  24. Purchasing and Procurement: Determined two key elements; Vendors that consistently lost their bids – “shadow bidders”; Vendors that won over 95% of bids; A team of forensic experts was used to review contracts and work performed
  25. Purchasing and Procurement: Collusion to circumvent controls in place; Required to have five bids; Used shadow bidders; Bid review group; Ensured that a selected (and corrupt) supplier was chosen every time; Allowed substitution of inferior materials
  26. Health Plan Auditing: 2010 case study – Conducted by St. Joseph’s Univ. & Healthcare Data Mgmt; Two large companies’ health insurance claims data over a two-year period; Company A – 108,000 claims, $25.3M paid; Company B – 464,000 claims, $118.4M paid
  27. Health Plan Auditing: Compared the results of 100% auditing vs. random-sampling claims (300-400 samples); 100% auditing produced very distinct results; Company A - $3.12M exception claims; Company B - $5.47M exception claims
  28. Health Plan Auditing: [table not preserved in transcript] Footnote 1 – Average of the two Post-Audit years
  29. Health Plan Auditing: [slide content not preserved in transcript]
  30. Health Plan Auditing: Company A - $3.12M in exception claims (9,315 records)
  31. Health Plan Auditing: Company B - $5.47M in exception claims (20,395 records)
  32. Health Plan Auditing: Random-sampling claims - used best “analysis” to simulate the audits; Using exceptions from the 100% auditing approach; 100 random samples of 300 exceptions; 100 random samples of 400 exceptions; Statistically close to their “population of exceptions” parameters from 100% auditing
  33. Health Plan Auditing – 300 samples: On average, random-sampling missed from $2.91M - $5.39M of exception claims paid. Health Plan Auditing – 400 samples: On average, random-sampling missed from $2.85M - $5.36M of exception claims paid. Health Plan Auditing: Random-sampling audits produced statistically valid estimates of exception claims; Not the objective in this audit; Determine root causes of errors; Minimize / eliminate the errors
  34. Health Plan Auditing: Random-sampling missed a significant amount of exception claim amounts; Increasing the sample size from 300 to 400 did not identify significantly more errors; Over 90% of claim errors still missed; Significant amounts of money are wasted
  35. Health Plan Auditing: Random-sampling does not identify root cause of errors; Trend analysis only possible through data mining
  36. Questions
  37. Contact Information: LinkedIn -
  38. References
      ACFE. 2008 Report to the Nation on Occupational Fraud & Abuse. 2008. Retrieved 6/1/10 from
      Barrier, M. One right path: Cynthia Cooper. 2003. Retrieved 6/2/10 from
      Bourke, J. Computer Assisted Audit Techniques or CAATS. 2010. Retrieved 5/25/10 from
      Deeson, M. Audit says Office Depot overcharged county one million dollars. 2010. Retrieved 6/3/10 from
      Denker, B. Data Mining and the Auditor’s Responsibility. 2003. Retrieved 6/1/10 from
      Kadet, A. Are Charity Fundraisers Spying on You? 2010. Retrieved 6/1/10 from
  39. References
      Kadet, A. Your Charity’s Junk-Mail Strategy. 2010. Retrieved 6/1/10 from
      Kusnierz, R. A Case for Data Mining. 2003. Retrieved 6/1/10 from
      Sayana, A. Using CAATs to Support IS Audit. 2003. Retrieved 6/1/10 from
      Sillup, G. and Klimberg, R. Health Plan Auditing: 100-Percent-of-Claims vs. Random-Sample Audits. 2010. Retrieved 6/1/10 from
      Silltow, J. Data Mining 101: Tools and Techniques. 2006. Retrieved 5/25/10 from
      Wolfe, J. Effective Data Mining for Financial Services Companies. 2008. Retrieved 6/1/10 from (IIA members only URL)
      Zink, J. Office Depot billing disputed. 2010. Retrieved 6/3/10 from
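
The risk-assessment slide lists duplicate invoices and Benford's Law among the things data mining can detect. As a rough illustration only, and not the presenter's own method or tooling, the Python sketch below shows one way these two checks might look. Benford's Law says a leading digit d is expected with probability log10(1 + 1/d). The function names and the record fields `vendor`, `invoice_no`, and `amount` are hypothetical, chosen for this example.

```python
import math
from collections import Counter, defaultdict


def benford_expected(digit):
    """Expected share of amounts whose leading digit is `digit` (1-9)."""
    return math.log10(1 + 1 / digit)


def first_digit(amount):
    """Leading non-zero digit of an amount, or None if it rounds to zero cents."""
    cents = abs(round(amount * 100))  # whole cents sidestep float formatting noise
    return int(str(cents)[0]) if cents else None


def first_digit_profile(amounts):
    """Map each digit 1-9 to (observed share, Benford expected share)."""
    digits = [d for d in map(first_digit, amounts) if d is not None]
    if not digits:
        raise ValueError("no non-zero amounts to analyze")
    counts = Counter(digits)
    return {d: (counts[d] / len(digits), benford_expected(d)) for d in range(1, 10)}


def find_duplicate_invoices(invoices):
    """Group records sharing vendor, invoice number, and amount (hypothetical fields)."""
    groups = defaultdict(list)
    for inv in invoices:
        key = (
            inv["vendor"].strip().upper(),  # normalize casing and stray spaces
            inv["invoice_no"].strip(),
            round(inv["amount"], 2),
        )
        groups[key].append(inv)
    # Only keys seen more than once are candidate duplicates
    return [group for group in groups.values() if len(group) > 1]
```

Digits whose observed share sits far from the expected share, and any group returned by `find_duplicate_invoices`, are leads for follow-up review rather than proof of fraud or error.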