In The Wizard of Oz, Toto pulls back the green curtain to expose the Wizard as a fraud. In the same way, you can peep behind the 'green curtain' of data visualisation and learn how to 'poke holes' in the data that you are given, both in business and in everyday news headlines.
It is a little-known secret that there are hidden patterns in the data chaos that surrounds us every day, and these patterns help to explode the myths in that data. Deviations from these patterns highlight invention, bias, anomalies and even deliberate fraud.
You can combine data visualisation in R and Power BI with timeless data analysis patterns such as Benford's Law to reveal efforts to conceal or distort the numbers, and to question the veracity of the data.
You'll need courage, heart and wisdom to analyse data, since truthful data doesn't necessarily give easy answers!
4. What is Benford’s Law?
In many real-life sources of data, the leading digit is distributed in a specific, non-uniform way
The first digit is 1 about 30% of the time
Larger digits occur with lower and lower frequency
9 occurs as the first digit less than 5% of the time
5.
6. What is Benford’s Law?
Benford's law, also called the first-digit law, states that in lists of numbers from many (but not all) real-life sources of data, the leading digit is distributed in a specific, non-uniform way. According to this law, the first digit is 1 about 30% of the time, and larger digits occur as the leading digit with lower and lower frequency, to the point where 9 as a first digit occurs less than 5% of the time.
Wikipedia, retrieved 08/25/2011
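The percentages quoted above follow from the standard statement of Benford's Law, in which the probability of leading digit d is log10(1 + 1/d). The slides do not give this formula, so the sketch below is an illustration in Python rather than the R or Power BI code used in the talk; the function name is hypothetical.

```python
import math

def benford_first_digit_probs():
    """Expected share of each leading digit 1..9 under Benford's Law.

    Uses the standard formula P(d) = log10(1 + 1/d); this formula is
    not stated on the slides and is added here for illustration.
    """
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}

expected = benford_first_digit_probs()
for digit, p in expected.items():
    print(f"{digit}: {p:.1%}")
```

Run as written, this gives roughly 30.1% for a leading 1 and 4.6% for a leading 9, which matches the "about 30%" and "less than 5%" figures in the quotation.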
7. From Theory to Application
Dr. Mark Nigrini:
Published a thesis noting that Benford’s Law could be used to detect fraud
Human choices are not random
It is suggested that invented numbers are unlikely to follow Benford’s Law.
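One way to turn this idea into a check is to count the leading digits in a list of amounts and compare the counts with what Benford's Law expects. The sketch below uses a chi-square statistic for the comparison; that choice is an assumption made for illustration, not necessarily the test used by Dr. Nigrini or in the talk's demo, and the function names are hypothetical. It assumes finite numeric input.

```python
import math
from collections import Counter

def first_digit(value):
    """Return the leading non-zero digit of a finite number, or None for zero."""
    value = abs(value)
    if value == 0:
        return None
    # Scientific notation puts the leading digit first, e.g. '4.2...e-04'.
    return int(f"{value:.15e}"[0])

def chi_square_vs_benford(values):
    """Chi-square statistic of observed first digits against Benford's Law."""
    digits = [d for d in map(first_digit, values) if d is not None]
    n = len(digits)
    if n == 0:
        raise ValueError("no non-zero values to test")
    counts = Counter(digits)
    stat = 0.0
    for d in range(1, 10):
        expected = n * math.log10(1 + 1 / d)
        stat += (counts.get(d, 0) - expected) ** 2 / expected
    return stat

# Illustrative inputs only: powers of two are a well-known Benford-like
# sequence; the range 100..999 has every leading digit equally often.
natural_like = [2 ** k for k in range(1, 201)]
evenly_spread = list(range(100, 1000))

# 15.507 is the 5% critical value of chi-square with 8 degrees of freedom.
CRITICAL_5_PERCENT = 15.507
print(chi_square_vs_benford(natural_like) < CRITICAL_5_PERCENT)
print(chi_square_vs_benford(evenly_spread) > CRITICAL_5_PERCENT)
```

A large statistic is a prompt to look more closely at the data, not proof of fraud; the chi-square test also becomes very sensitive on large data sets, so the threshold should be read with that in mind.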
8. Conforming Data Types
Transactions
Accountancy data e.g. Journal entries
Purchase orders
Stock prices
T&E expenses, etc.
9. Non Conforming Data Types
Nominal data e.g. Telephone Numbers
Heights
Prices where there is a threshold or a ‘cap’
Aggregated Data
Data sets with fewer than 500 transactions
10. Check the data: Benford’s Law, outliers, data acquisition, missing data, system information, event timings
Look for strange behaviours in the process, e.g. bypassing permission limits
Due diligence not being followed, or simply not implemented, due to a lack of safeguards which are there to protect the innocent from making mistakes
Example: Data Audits
11. Data Visualisation to help you audit your data
Demo in Power BI and R
Benford’s Law
Outliers
Missing data
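The outlier and missing-data checks listed above can be sketched in a few lines. The slides do not say which outlier rule the demo uses, so the example below assumes the common 1.5 × IQR rule and treats None as a missing value; the function names and the sample amounts are hypothetical, and Python stands in for the R and Power BI used in the talk.

```python
import statistics

def missing_count(values):
    """Count entries recorded as None (treated here as 'missing')."""
    return sum(1 for v in values if v is None)

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring missing ones.

    The 1.5 * IQR rule is an assumption for this sketch, not the talk's method.
    """
    present = sorted(v for v in values if v is not None)
    q1, _, q3 = statistics.quantiles(present, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in present if v < low or v > high]

# Made-up expense amounts, for illustration only.
amounts = [120, 135, None, 128, 140, 131, 5000, None, 126]
print("missing:", missing_count(amounts))
print("outliers:", iqr_outliers(amounts))
```

As with Benford's Law, a flagged value is a question to ask of the data and its context rather than a verdict.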
15. Summary
Benford’s Law is not the final word, but it is a start for the curious.
Doubt yourself all the time… prove everything.
Think repeatable principles in science.
Occam’s Razor
Repeatable results, reproduced by someone else
Context is vital; the data is meaningless without it.
Editor's Notes
To detect manipulations or fraud in accounting data, auditors have successfully used Benford's law as part of their fraud detection processes. Benford's law proposes a distribution for first digits of numbers in naturally occurring data. Government accounting and statistics are similar in nature to financial accounting. In the European Union (EU), there is pressure to comply with the Stability and Growth Pact criteria. Therefore, like firms, governments might try to make their economic situation seem better. In this paper, we use a Benford test to investigate the quality of macroeconomic data relevant to the deficit criteria reported to Eurostat by the EU member states. We find that the data reported by Greece shows the greatest deviation from Benford's law among all euro states.
An examination of the quarterly GDP growth rate from December 1991 to September 2012 shows that zero occurred as the second digit 21 times, far more often than Benford's Law would predict, which suggests rounding up to achieve a bigger leading digit. The digits one through four also appeared more often than the law predicts, while seven through nine appeared less often.
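The second-digit expectation referred to in this note can be derived from the same logarithmic rule by summing over every possible first digit. The formula below is the standard generalisation of Benford's Law and is not given in the notes themselves; the function name is hypothetical.

```python
import math

def benford_second_digit_probs():
    """Expected share of each second digit 0..9 under Benford's Law.

    P(second digit = d2) is the sum over first digits d1 = 1..9 of
    log10(1 + 1 / (10 * d1 + d2)).
    """
    return {
        d2: sum(math.log10(1 + 1 / (10 * d1 + d2)) for d1 in range(1, 10))
        for d2 in range(10)
    }

second = benford_second_digit_probs()
for digit, p in second.items():
    print(f"{digit}: {p:.1%}")
```

The second-digit distribution is much flatter than the first-digit one: zero is expected about 12% of the time, so a clear excess of zeros in the second position stands out.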