Which One Is Different: Data Mining and Forensic Analytics
  • Benford analysis is a very useful tool for analyzing a large population of numbers, accounts, or results to identify outliers that will likely require additional work.
    The Benford analysis for the first digit and second digit will mainly identify major outliers, which would have to be reviewed for a business justification, if any. For example, if the first digit 8 and the second digit 5 were above the expected values, a reasonable explanation may be that the organization manufactures a product containing a component that is purchased frequently for $850. Or maybe there isn’t a good reason, and further testing, sampling, or review will need to be performed. PivotTables of the outlier results by vendor may be a likely next step.
    The Benford analysis for the first two digits can be used to identify avoidance of thresholds. This could indicate individuals avoiding additional approval for PO or invoice amounts of $10,000 or $100,000 and up. The same analysis can be run on purchase order data to see if POs are being split to avoid approval thresholds.
    The Benford analysis for the first three digits can be used to find “conspicuous round off operations”. If 100 or 500 are extreme outliers, you may want to check why there are so many $500 or $1,000 invoices. This could be the result of fraudulent invoices, or there may be a business justification (for example, the organization processes rebates through AP and the rebate is $500 or $1,000).
    In summary, Benford’s analysis probably won’t pinpoint fraud, but it is a good way to start looking at a large amount of data. I think it is a good starting point for offering additional fraud detection services.
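The first-two-digits test described in this note can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool used in the presentation: the function names, the simple one-proportion z-score, and the 1.96 cutoff are all choices made here. The expected share for leading digits d is the standard Benford value log10(1 + 1/d).

```python
import math
from collections import Counter

def leading_digits(amount, n=2):
    """Return the first n significant digits of an amount as an int, or None if too few."""
    digits = "".join(ch for ch in f"{abs(amount):.2f}" if ch.isdigit()).lstrip("0")
    return int(digits[:n]) if len(digits) >= n else None

def benford_expected(d):
    """Benford probability that a number starts with the digits d (e.g. 8 or 85)."""
    return math.log10(1 + 1 / d)

def benford_outliers(amounts, n=2, z_cutoff=1.96):
    """List digit combinations that occur noticeably MORE often than Benford predicts.

    Assumption: a plain one-proportion z-score is used as the outlier measure.
    Returns rows of (digits, actual_count, expected_count, z), largest z first.
    """
    keys = [k for k in (leading_digits(a, n) for a in amounts) if k is not None]
    total = len(keys)
    if total == 0:
        return []
    counts = Counter(keys)
    flagged = []
    for d in range(10 ** (n - 1), 10 ** n):
        p = benford_expected(d)
        actual = counts.get(d, 0) / total
        z = (actual - p) / math.sqrt(p * (1 - p) / total)
        if z > z_cutoff:
            flagged.append((d, counts.get(d, 0), round(p * total, 1), round(z, 1)))
    return sorted(flagged, key=lambda row: -row[3])
```

Calling `benford_outliers(amounts)` on a column of invoice amounts returns the two-digit prefixes worth reviewing. A spike at 85 would correspond to the $850 example above, and spikes at 98 or 99 would be the kind of pattern to check against a $10,000 or $100,000 approval threshold. Passing `n=3` gives the first-three-digits variant.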
  • Topics covered: Missing checks; Conditional formatting and data filters; Empty data fields; Duplicates; Comparing two Excel files; Payee not on vendor list; Vendor also an employee; Pivot Tables; High-dollar vendors; PowerPivot; Reporting; Grouping; Headers/Footers; Copying tabs (worksheets); Removing meta-data; Secure spreadsheets for e-mailing
  • Transcript

    • 1. Which One is Different: Data Mining and Forensic Analytics. Bill Douglas
    • 2. Cost Advisors’ Background: Founded in 1999. Mission: improve our client’s business and the lives of our employees. Focus on accounting investigation and forensics. Logo symbolizes partnership with our clients. © 2008 Cost Advisors, Inc. All rights reserved.
    • 3. Bill Douglas’ Background: President at Cost Advisors, Inc. 33 years’ experience. Management positions in accounting, sales, marketing. CFO, IPO, Big 4 public accounting, business processes, recovery auditing, internal controls, fraud, internal auditing, Sarbanes-Oxley (SOX). Financial project management at both large and small public companies. Volunteer, Washington County Sheriff’s Dept. Fraud Team. Frequent speaker and writer about internal controls and fraud.
    • 4. Bill Douglas’ Background: Credentials and memberships: OR, CA, WA Certified Public Accountant (CPA); Certified Internal Auditor (CIA); Certified Fraud Examiner (CFE); Certified in Financial Forensics (CFF); Certified IT Professional (CITP); OR Licensed Private Investigator (PI); Multnomah Bar Association (Affiliate Member); Northwest Fraud Investigators Association.
    • 5. Agenda: 1. Data mining examples. 2. Data mining you can do in Excel.
    • 6. 1. Data Mining for Fraud: What is CAATs? Example #1: Accounting Queries. Example #2: Scanning Bank Statements. Example #3: Benford’s Law.
    • 7. What is CAATs? Computer Assisted Audit Tools (CAATs). Examine 100% of transactions. Analysis available: duplicates; missing records; queries (meeting certain criteria); population summaries by field (pivot tables); population statistics.
    • 8. Data Sources: Import data from many sources: Excel; Acrobat (.pdf); text files (.txt, .doc); print files (.prn); hardcopy scans.
    • 9. .PRN File
    • 10. Example #1 Accounting Queries: Accounting System: Disbursements (Checks); Vendor Master List; Employee Master List.
    • 11. Example #1 Accounting Queries: Data Mining Tool: Disbursements (Checks); Vendor Master List; Employee Master List.
    • 12. Example #1 Six Accounting Queries: Disbursements (Checks): duplicate payments; payee not on vendor list; non-payroll, non-expense-report payments to employees. Vendor Master List and Employee Master List: vendors using SS# as EIN; vendors with same address as employee; employees with no address.
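As a rough illustration of the cross-file queries listed on slide 12, the sketch below runs five of them over three small in-memory lists (duplicate payments are left out here). Everything about the data layout is an assumption made for the example: the field names (`check_no`, `payee`, `type`, `tax_id`, and so on), the `type` values used to exclude payroll and expense-report checks, and the name/address normalization rule are hypothetical, and a real data mining tool would do fuzzier matching.

```python
import re

# SSNs are conventionally written NNN-NN-NNNN; EINs are written NN-NNNNNNN.
SSN_PATTERN = re.compile(r"^\d{3}-\d{2}-\d{4}$")

def norm(text):
    """Normalize a name or address: uppercase, drop punctuation, collapse spaces."""
    return " ".join(re.sub(r"[^A-Z0-9 ]", " ", (text or "").upper()).split())

def run_queries(checks, vendors, employees):
    """Return a dict of query name -> list of matching check numbers or names."""
    vendor_names = {norm(v["name"]) for v in vendors}
    employee_names = {norm(e["name"]) for e in employees}
    employee_addrs = {norm(e["address"]) for e in employees} - {""}
    # Assumption: each check carries a 'type' so payroll/expense checks can be skipped.
    ap_checks = [c for c in checks if c["type"] not in ("payroll", "expense_report")]
    return {
        "payee_not_on_vendor_list":
            [c["check_no"] for c in ap_checks if norm(c["payee"]) not in vendor_names],
        "payments_to_employees":
            [c["check_no"] for c in ap_checks if norm(c["payee"]) in employee_names],
        "vendors_with_employee_address":
            [v["name"] for v in vendors if norm(v["address"]) in employee_addrs],
        "vendors_using_ssn_as_ein":
            [v["name"] for v in vendors if SSN_PATTERN.match((v["tax_id"] or "").strip())],
        "employees_with_no_address":
            [e["name"] for e in employees if not norm(e["address"])],
    }

# Made-up sample data, for illustration only.
checks = [
    {"check_no": 1001, "payee": "Acme Supply", "type": "vendor"},
    {"check_no": 1002, "payee": "Pat Jones", "type": "vendor"},
    {"check_no": 1003, "payee": "Pat Jones", "type": "payroll"},
    {"check_no": 1004, "payee": "Zeta Consulting", "type": "vendor"},
]
vendors = [
    {"name": "Acme Supply", "address": "1 Main St.", "tax_id": "12-3456789"},
    {"name": "PJ Services", "address": "22 Oak Ave", "tax_id": "123-45-6789"},
]
employees = [
    {"name": "Pat Jones", "address": "22 Oak Ave."},
    {"name": "Lee Smith", "address": ""},
]
results = run_queries(checks, vendors, employees)
```

Each result is a starting point for review rather than a finding: a payee missing from the vendor list may simply be a one-time vendor.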
    • 13. Example #2: One Set of Books? Victim’s Accounting System = Victim’s Bank Statement
    • 14. Data Extraction - Review: Accounting System: Disbursements (Checks); Vendor Master List; Employee Master List.
    • 15. Disbursements in Excel: Disbursements (Checks)
    • 16. Example #2 - Scanning Bank Statements: Electronic comparison of Victim’s Bank Statement and Disbursements (per Accounting System) = Missing
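The electronic comparison on slide 16 amounts to matching two lists of checks. A minimal sketch follows, assuming each source has already been reduced to (check number, amount in cents) pairs and that check numbers are unique within each source; the function name and the three result buckets are choices made for this example.

```python
def compare_bank_to_books(bank_items, book_items):
    """Compare cleared checks on the bank statement with recorded disbursements.

    Each item is a (check_no, amount_in_cents) pair. Amounts are integers so
    that equality tests are exact. Assumes check numbers are unique per source.
    """
    bank = dict(bank_items)
    books = dict(book_items)
    # Paid by the bank but never recorded in the accounting system.
    missing_from_books = sorted(n for n in bank if n not in books)
    # Recorded and paid, but for different amounts: (check_no, book_amount, bank_amount).
    amount_mismatch = sorted(
        (n, books[n], bank[n]) for n in bank if n in books and books[n] != bank[n]
    )
    # Recorded but not on this statement (possibly just outstanding).
    not_on_statement = sorted(n for n in books if n not in bank)
    return missing_from_books, amount_mismatch, not_on_statement

# Made-up sample data, for illustration only.
bank = [(101, 50000), (102, 12550), (104, 990000)]
books = [(101, 50000), (102, 12500), (103, 7500)]
missing, mismatched, outstanding = compare_bank_to_books(bank, books)
```

With the sample data, check 104 appears only on the bank statement, check 102 differs by 50 cents, and check 103 has not cleared.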
    • 17. Example #3 - Benford’s Law: Frank Benford (1938), Simon Newcomb (1881). Some leading digits occur more/less frequently in most data. [Chart: probability (0% to 35%) by leading digit (1 to 9)]
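For reference, the curve plotted on slide 17 follows the standard statement of Benford's Law for the first digit. The formula is not printed on the slide and is supplied here as background:

```latex
P(d) = \log_{10}\!\left(1 + \frac{1}{d}\right), \qquad d = 1, 2, \dots, 9

% Rounded values:
% P(1) \approx 30.1\%,\; P(2) \approx 17.6\%,\; P(3) \approx 12.5\%,\;
% P(4) \approx 9.7\%,\;  P(5) \approx 7.9\%,\;  P(6) \approx 6.7\%,\;
% P(7) \approx 5.8\%,\;  P(8) \approx 5.1\%,\;  P(9) \approx 4.6\%
```

The nine probabilities sum to 1, and the same expression with d running from 10 to 99 gives the expected shares for the first two digits.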
    • 18. Example #3 - Benford’s Law: Compares expected amounts to actual amounts. There were 1,368 occurrences of amounts beginning with $250.
    • 19. Summary of CAATs: Data from any source. Every transaction can be tested (no sampling). Many tests possible. Comparison examples: within accounting files; accounting records to bank statements; actual records to expected values (Benford).
    • 20. Agenda: 1. Data mining examples. 2. Data mining you can do in Excel.
    • 21. Goals and Assumptions: Do basic investigation yourself. 1 hour spent here will save dozens (hundreds?) of hours at work. Assumptions: Data is in Excel 2007 or 2010. Basic knowledge of Excel (info for advanced Excel too).
    • 22. Data Mining in Excel: Data Filters (empty data fields); Conditional formatting (duplicates); Comparing two Excel files (payee not on vendor list); Pivot Tables (high-dollar vendors, missing checks); PowerPivot; Reporting.
    • 23. Data Filters - Setting
    • 24. Data Filters - Blanks
    • 25. Data Filters - Others
    • 26. Data Filters - Suggestions: Blank invoice numbers. Employees or vendors with no address. Vendors using a social security number instead of EIN. Odd characters at the end of the invoice number or check number (“.” “–” “a”). Invoice numbers 100, 101, 1000 or 1001.
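Outside Excel, the invoice-number filters suggested on slide 26 could be expressed as a small Python check. This is a sketch under assumptions: the function name and flag labels are invented here, and "odd trailing character" is interpreted as a final ".", "-", "–", or a single letter directly after a digit.

```python
import re

# Invoice numbers the slide singles out for a closer look.
LISTED_NUMBERS = {"100", "101", "1000", "1001"}

# Trailing ".", "-", "–", or a letter that directly follows a digit (e.g. "4471a").
ODD_TRAILING = re.compile(r"[.\-–]$|(?<=\d)[A-Za-z]$")

def filter_flags(invoice_no):
    """Return the reasons (possibly none) an invoice number deserves review."""
    value = (invoice_no or "").strip()
    if not value:
        return ["blank"]
    flags = []
    if ODD_TRAILING.search(value):
        flags.append("odd trailing character")
    if value in LISTED_NUMBERS:
        flags.append("listed number")
    return flags
```

For example, `filter_flags("4471a")` reports an odd trailing character, while an ordinary value such as `"INV-4471"` returns an empty list.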
    • 27. Data Filters - Clearing
    • 28. Data Mining in Excel: Data Filters (empty data fields); Conditional formatting (duplicates); Comparing two Excel files (payee not on vendor list); Pivot Tables (high-dollar vendors, missing checks); PowerPivot; Reporting.
    • 29. Conditional Formatting
    • 30. Conditional Formatting with Data Filter
    • 31. Conditional Format & Filter - Result
    • 32. Conditional Format - Suggestions: Look for duplicates of: invoice date; invoice number; invoice amount; vendor name.
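The duplicate checks on slide 32 are done with conditional formatting in Excel. The same idea can be sketched in Python by grouping rows on whichever fields should be unique. The field names, the default key, and the convention that data starts on spreadsheet row 2 are assumptions made for this example.

```python
from collections import defaultdict

def find_duplicates(invoices, key_fields=("vendor", "invoice_no", "amount")):
    """Group invoices by key_fields and return only the groups with 2+ rows.

    Returns {key_tuple: [row numbers]}. Row numbers start at 2 on the
    assumption that row 1 of the spreadsheet is the header.
    """
    groups = defaultdict(list)
    for row_no, inv in enumerate(invoices, start=2):
        # Compare case-insensitively and ignore stray spaces.
        key = tuple(str(inv[f]).strip().upper() for f in key_fields)
        groups[key].append(row_no)
    return {key: rows for key, rows in groups.items() if len(rows) > 1}

# Made-up sample data, for illustration only (amounts in cents).
invoices = [
    {"vendor": "Acme Supply", "invoice_no": "7731", "date": "2012-03-01", "amount": 85000},
    {"vendor": "acme supply ", "invoice_no": "7731", "date": "2012-03-09", "amount": 85000},
    {"vendor": "Zeta Consulting", "invoice_no": "15", "date": "2012-03-09", "amount": 50000},
]
same_invoice = find_duplicates(invoices)
same_date = find_duplicates(invoices, key_fields=("date",))
```

Changing `key_fields` reproduces each suggestion in turn: a single field such as the date gives a broad list, while combining vendor, number, and amount narrows it to likely duplicate payments.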
    • 33. Data Mining in Excel: Data Filters (empty data fields); Conditional formatting (duplicates); Comparing two Excel files (payee not on vendor list); Pivot Tables (high-dollar vendors, missing checks); PowerPivot; Reporting.
    • 34. Comparing Excel Files - First Sheet
    • 35. Comparing Excel Files - Second Sheet
    • 36. Comparing Excel Files - Result: These vendors are missing from the vendor master list.
    • 37. Data Mining in Excel: Data Filters (empty data fields); Conditional formatting (duplicates); Comparing two Excel files (payee not on vendor list); Pivot Tables (high-dollar vendors, missing checks); PowerPivot; Reporting.
    • 38. Pivot Table - Largest Vendors (step 1)
    • 39. Pivot Table - Largest Vendors (step 2)
    • 40. Pivot Table - Largest Vendors (step 3)
    • 41. Pivot Table - Largest Vendors - Suggestions: Look for unusual vendor names and names of employees (‘cash’, ‘petty cash’, <blanks>, ‘bank’, ‘credit card’, etc.). Discuss vendor disbursement levels with management.
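The largest-vendors pivot table boils down to summing disbursements by payee and sorting. A minimal Python equivalent is sketched below, with the unusual names from slide 41 used as a watch list. The input shape (payee, amount in cents) and the function name are assumptions; matching against employee names is not included.

```python
from collections import defaultdict

# Payee names the slide suggests looking for; "" stands for a blank payee.
SUSPICIOUS_NAMES = {"", "CASH", "PETTY CASH", "BANK", "CREDIT CARD"}

def vendor_totals(checks):
    """Sum check amounts by payee, largest first.

    checks: iterable of (payee, amount_in_cents).
    Returns [(payee_upper, total_cents, is_suspicious_name), ...].
    """
    totals = defaultdict(int)
    for payee, cents in checks:
        totals[(payee or "").strip().upper()] += cents
    # Sort by total descending, then name, so the order is deterministic.
    ranked = sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(name, cents, name in SUSPICIOUS_NAMES) for name, cents in ranked]

# Made-up sample data, for illustration only.
sample_checks = [
    ("Acme Supply", 120000), ("Cash", 50000), ("acme supply", 30000),
    ("", 70000), ("Zeta Consulting", 20000),
]
ranking = vendor_totals(sample_checks)
```

The ranked list is what would be discussed with management; the flag only marks names that merit a question.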
    • 42. PivotTables - Missing Check #s
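Finding missing check numbers is a gap search over a sequence. A short sketch, assuming check numbers are integers and that the range of interest runs from the lowest to the highest number present (the function name is invented here):

```python
def missing_check_numbers(check_numbers):
    """Return the check numbers absent between the lowest and highest present."""
    present = set(check_numbers)
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]

# Made-up sample data, for illustration only (1004 is listed twice on purpose).
gaps = missing_check_numbers([1001, 1002, 1004, 1007, 1004])
```

Gaps are not proof of a problem, since checks are voided or used out of order, but each one should be explainable.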
    • 43. Data Mining in Excel: Data Filters (empty data fields); Conditional formatting (duplicates); Comparing two Excel files (payee not on vendor list); Pivot Tables (high-dollar vendors, missing checks); PowerPivot; Reporting.
    • 44. What is PowerPivot: From Microsoft for Office (Excel) 2010. It’s free. Features: turns Excel into a relational database; compresses data; speeds recalculation (DAX reporting tool).
    • 45. How to Get PowerPivot: 64 bit Excel vs. 32 bit Excel
    • 46. Menu: PowerPivot Tabs; Normal Tabs
    • 47. Menu: PowerPivot Tabs; Normal Tabs
    • 48. Pivot Fields from Multiple Tabs (Tables)
    • 49. PowerPivot Compression, Speed (columns: Access, Excel (native), PowerPivot). Compression: 327MB, 82MB, 12MB. Recalculation: ~30 min; < 30 seconds. Worksheet size: ~2GB; 1,048,576 rows; millions of rows, ~2GB.
    • 50. Data Mining in Excel: Data Filters (empty data fields); Conditional formatting (duplicates); Comparing two Excel files (payee not on vendor list); Pivot Tables (high-dollar vendors, missing checks); PowerPivot; Reporting.
    • 51. Reporting - Set Print Area
    • 52. Reporting - Setup (Header)
    • 53. Reporting - Setup (Footer)
    • 54. Reporting - Setup (Result)
    • 55. Reporting - Duplicating Tabs
    • 56. Reporting - Removing Meta Data
    • 57. Reporting - Encrypting for sending: Be sure to save the workbook with a new name; append “(encrypted)” to the filename.
    • 58. For More Information: Cost Advisors, Inc. 503-704-3719. www.costadvisors.com. Download: ‘Embezzlement Response Guide’.