The spread of IT systems has made it a prerequisite that auditors, as well as management, be able to examine high volumes of data and transactions in order to identify patterns and trends. In addition, the increasing need to continuously monitor and audit IT systems has created an imperative for the effective use of appropriate data mining tools.
While a variety of powerful tools are readily available today, the skills required to use them are not. Not only must the correct testing techniques be selected, but the outcomes presented by the software must be interpreted effectively if appropriate conclusions are to be drawn from the data analysis. This six-webinar series, based on Richard Cascarino’s book “Data Analytics for Internal Auditors”, covers these skills and techniques.
Webinar 1 Understanding Sampling
Judgmental vs Statistical Sampling
Probability theory in Data Analysis
Types of Evidence
Population Analysis
Correlations and Regressions
Data Analytics for Internal Auditors - Understanding Sampling
1. 3/11/2019
Data Analytics - 1
Understanding Sampling
based on Data Analytics for
Internal Auditors
by Richard Cascarino
About Jim Kaplan, CIA, CFE
President and Founder of AuditNet®,
the global resource for auditors (now
available on iOS, Android and
Windows devices)
Auditor, Web Site Guru,
Internet for Auditors Pioneer
Recipient of the IIA’s 2007 Bradford
Cadmus Memorial Award.
Author of “The Auditor’s Guide to
Internet Resources” 2nd Edition
About Richard Cascarino, MBA,
CIA, CISM, CFE, CRMA
• Principal of Richard Cascarino &
Associates based in Colorado USA
• Over 28 years experience in IT audit
training and consultancy
• Past President of the Institute of
Internal Auditors in South Africa
• Member of ISACA
• Member of Association of Certified
Fraud Examiners
• Author of Data Analytics for Internal
Auditors
About AuditNet® LLC
• AuditNet®, the global resource for auditors, is available on the
Web, iPad, iPhone, Windows and Android devices and features:
• Over 3,000 Reusable Templates, Audit Programs,
Questionnaires, and Control Matrices
• Training without Travel Webinars focusing on fraud, data
analytics, IT audit, and internal audit
• Audit guides, manuals, and books on audit basics and using
audit technology
• LinkedIn Networking Groups
• Monthly Newsletters with Expert Guest Columnists
• Surveys on timely topics for internal auditors
• NASBA Approved CPE Sponsor
Introductions
The views expressed by the presenters do not necessarily represent
the views, positions, or opinions of AuditNet® LLC. These materials,
and the oral presentation accompanying them, are for educational
purposes only and do not constitute accounting or legal advice or
create an accountant-client relationship.
While AuditNet® makes every effort to ensure information is
accurate and complete, AuditNet® makes no representations,
guarantees, or warranties as to the accuracy or completeness of the
information provided via this presentation. AuditNet® specifically
disclaims all liability for any claims or damages that may result from
the information contained in this presentation, including any
websites maintained by third parties and linked to the AuditNet®
website.
Any mention of commercial products is for information only; it does
not imply recommendation or endorsement by AuditNet® LLC
Today’s Agenda
Statistical and Non-Statistical Concepts
Judgmental vs Statistical Sampling
Probability theory in Data Analysis
Types of Evidence
Sampling Methods
Sample Sizing and Selection
Attribute vs Variables Sampling
PPS Sampling
Population Analysis
Correlations and Regressions
Statistical & Nonstatistical
Concepts
Why Sample?
Speed
Cost
to ensure data is "Substantially or Materially Correct"
Not mentioned in IIA Standards but "Information should be Sufficient,
Competent, Relevant, and Useful to provide a sound basis for audit findings
and recommendations"
"Sufficient information is Factual,
Adequate, and convincing so that a
prudent, informed person would reach the
same conclusion as the auditor"
Statistical vs Nonstatistical
Sampling
Similarities
Both require Auditor Judgment
Audit Procedures performed will not differ
Both permitted in Audit practice
Differences
Statistical Plans
Control and Measure Sampling Risk
Require Technical Training and Expertise
Normally Require Computer Facilities
Statistical Sampling,
Advantages
Provides the opportunity to select the minimum sample
size required to satisfy the objectives
Provides a quantitative measure of the sampling risk
Permits the auditor to explicitly specify a level of
Reliability (Confidence) and a desired degree of
Precision (Materiality)
Provides a measure of sufficiency of the evidence
gathered
Provides for more objective results for management
Provides a more defensible expression of test results
Is simple to apply with computer software
Statistical Sampling,
Disadvantages
Requires random sample selection which
may be more costly and time consuming
May lead to problems in establishing a
correlation between the sample and the
population if not appropriately organized
May require specific staff training
May require the acquisition of specialized
software
Nonstat Sampling -
Advantages & Disadvantages
Advantages
Allows the auditor to utilize his subjective judgment to
influence the sample towards items of greatest value and
highest risk
May be equally effective and efficient as statistical sampling
but may cost less
Disadvantages
Statistical inferences may not be objectively valid
Cannot quantitatively determine sampling risk
Risks over- or under-auditing, depending on the auditor's
experience and judgment
Terminologies - 1
Uncertainty
Audit Risk - a combination of Inherent Risk,
Control Risk and Detection Risk
Inherent / Control Risks - assessed by
Auditor's Judgment
Detection Risk
Sampling Risk - risk of the sample being non-representative
Non-sampling Risk - all other aspects of
Audit Risk (eg Audit Procedures were not
appropriate)
Terminologies - 2
Sampling Risk
Risk of incorrect acceptance (less chance of a
successful Audit) (β risk)
Risk of incorrect rejection (greater Audit
effort) (α risk)
Risk of assessing Control Risk too low
(less chance of a successful Audit)
Risk of assessing Control Risk too high
(greater Audit effort)
Terminologies - 3
Confidence Level (Reliability)
Percentage of times one would expect the sample
to adequately represent the full population
The higher the percentage, the more
representative the sample
The higher the confidence percentage required,
the larger the sample
Precision
How close the sample estimate is to the
true population value
Terminologies - 4
Population (total collection of items about which an
opinion will be expressed)
Sampling Unit (the individual items making up the
population)
Frame (sample frame is a listing of the Sampling
Units making up the Population)
Sample (collection of sampling units drawn from the
frame that will be subject to Audit Procedures)
Measures of Location / Central Tendency (Mean,
Median, Mode)
Mean (total value of the population divided by the number of
items)
Median (middle value when the population is sorted)
Mode (most frequently occurring value)
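The three measures of central tendency can be illustrated with a short sketch. The values below are hypothetical amounts chosen only for illustration; the functions come from Python's standard `statistics` module.

```python
import statistics

# Hypothetical invoice amounts (illustrative only, not from the webinar).
values = [11, 20, 20, 29, 45]

mean = statistics.mean(values)      # total value divided by the number of items
median = statistics.median(values)  # middle value of the sorted list
mode = statistics.mode(values)      # most frequently occurring value

print(mean, median, mode)
```

Here the mean (25) is pulled above the median and mode (both 20) by the single large item, which is one reason auditors look at more than one measure.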
Bias in Sampling
May arise if:
The sampling frame does not cover the
population adequately or accurately
The sample is non-random (eg
"convenient")
Subjective judgment enters into the
selection criteria
These can lead to systematic, non-
compensating errors in a sample
Definition of Probability
Experiment: toss a coin twice
Sample space: possible outcomes of an experiment
S = {HH, HT, TH, TT}
Event: a subset of possible outcomes
A={HH}, B={HT, TH}
Probability of an event: a number assigned to an event
Pr(A)
Joint Probability
For events A and B, joint probability Pr(AB)
stands for the probability that both events
happen.
Example: A={HH}, B={HT, TH}, what is the joint
probability Pr(AB)?
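The coin-toss example can be worked through by enumerating the sample space. This is a minimal sketch that assumes a fair coin, so every outcome is equally likely; the helper name `pr` is introduced here for illustration.

```python
from fractions import Fraction
from itertools import product

# Sample space for tossing a coin twice: HH, HT, TH, TT.
sample_space = {"".join(tosses) for tosses in product("HT", repeat=2)}

def pr(event):
    """Probability of an event, assuming equally likely outcomes."""
    return Fraction(len(event & sample_space), len(sample_space))

A = {"HH"}
B = {"HT", "TH"}

pr_a = pr(A)        # 1/4
pr_b = pr(B)        # 1/2
joint = pr(A & B)   # A and B share no outcome, so the joint probability is 0

print(pr_a, pr_b, joint)
```

Because A and B cannot both happen on the same pair of tosses, Pr(AB) is zero even though each event on its own has a non-zero probability.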
Primary Types of Sampling
Attribute Sampling
Two-way (dichotomous) scale
Primarily yes / no type answers
looks at likelihood of errors in populations
Variables Sampling
Samples a population based upon some
specific variable (eg Value)
Quantitative information
Used to obtain estimates of values etc
Types of Evidence
Obtained by:
Inspection
Observation
Documentation
Confirmation
Inquiry
Analytical Procedures
Recalculation
Re-performance
Primary Types of Sampling
Probability Proportionate to Size (PPS)
A new approach to Variables Sampling
Uses Attribute Sampling methods to
estimate Rand Amounts
Uses "Dollar" as sampling unit to select
items for audit
Also called Dollar Unit Sampling (DUS),
Cumulative Monetary Amount (CMA), and
Combined Attribute Variables (CAV)
sampling
Sampling Methods
Sampling Approaches (Sampling Plans)
Attributes Sampling
Discovery Sampling
Stop-or-Go Sampling (Sequential Sampling)
Variables Sampling
PPS Sampling
Judgmental
Non-Statistical Selection
Methods (1)
Haphazard Selection
Auditor's best guess of a representative
sample
Often used where no extrapolation will be
done
Block Selection
Used on blocks of transactions (eg all
transactions within a time scale)
Use with caution since inferences beyond
the block may be invalid
Non-Statistical Selection
Methods (2)
Judgment Method
Basic Issues
Value of items
Relative risk
Representativeness
Sampling Methods (Techniques
of Selection)
Random Number Sampling
Interval Sampling (Systematic
Selection)
Stratified Sampling
Block Sampling (Cluster Sampling)
Probability Proportionate to Size (PPS)
Mechanized
Random Number Sampling
Every unit in a population has an
equal chance of being selected
Use of Random Number Tables
Random Number Generators
Numbers outside the selection range
are ignored
Sampling with Replacement
Sampling without Replacement
Normal assumption in applied
statistics is Without Replacement
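Random number selection with and without replacement might be sketched as follows. The population size, sample size, seed and function names are assumptions made for illustration; in practice the audit software's own generator would normally be used.

```python
import random

def sample_without_replacement(population_size, sample_size, seed=None):
    """Pick distinct item numbers from 1..population_size, each equally likely."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

def sample_with_replacement(population_size, sample_size, seed=None):
    """Pick item numbers from 1..population_size; the same item may recur."""
    rng = random.Random(seed)
    return sorted(rng.choices(range(1, population_size + 1), k=sample_size))

# Hypothetical population of 500 numbered vouchers, sample of 10.
distinct_items = sample_without_replacement(500, 10, seed=1)
possibly_repeated = sample_with_replacement(500, 10, seed=1)
print(distinct_items)
print(possibly_repeated)
```

Recording the seed alongside the working papers lets a reviewer regenerate exactly the same selection.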
Main Features of Simple
Random Sampling
"Simple" as compared to Systematic
and Stratified Sampling
Randomization ensures the validity of
inference
The standard against which other
sampling methods are evaluated
Suitable where the population is
relatively small
Suitable where the sampling frame is
complete
Systematic Sampling
(Interval Sampling)
From an unnumbered population
Picking the first item at random
Then at a regular interval (eg every
10th item)
Assumes
Random distribution across the population
True randomness of selection is not
required
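Interval selection can be sketched in a few lines: choose a random starting point within the first interval, then take every k-th item. The function name and the 100-item population are hypothetical.

```python
import random

def systematic_sample(items, interval, seed=None):
    """Select a random start within the first interval, then every interval-th item."""
    rng = random.Random(seed)
    start = rng.randrange(interval)  # 0 .. interval-1
    return items[start::interval]

# Hypothetical population of 100 sequential documents, every 10th selected.
documents = list(range(1, 101))
selected = systematic_sample(documents, interval=10, seed=7)
print(selected)
```

The approach only gives a representative sample if the items are arranged without a pattern that coincides with the interval, which is the assumption stated on the slide.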
Stratified Sampling
Used to minimize the variability of
population units (control distortion)
Permits the drawing of a smaller
sample size
Population stratified into mutually
exclusive groups
Requires clearly definable strata
Need not be stratified on a value basis
Block Sampling
(Cluster Sampling)
Used where Simple Random Sampling
is too time-consuming or expensive
Clusters of items are selected at
random
Clusters may then be sampled or
100% checked
Clusters may be "Natural" (items
normally found together)
Clusters may be "Artificial" (selected
by the auditor)
Probability Proportional to
Size (Dollar Unit Sampling)
Variation of Attribute Sampling
Uses Attribute Approach to express an
opinion in value terms rather than a
deviation rate
An alternative to stratification
Samples cumulative values on a value
interval
Automatically biases hits towards
high value items
Sample Sizing and Selection
Two paths to selection
Directed
Used when serious error or manipulation is
suspected
Not scientific sampling
Used purely to detect a suspected condition
May not be relied on to draw conclusions about
the population
Random Sampling
Seeks to represent the population
Taking a snapshot in miniature
The larger the sample the closer it depicts the
population, the more it can be relied upon
Sample Sizing
Factors affecting sample size
Population size?
Population variability
Expected error rate
Desired precision
Confidence level
Tolerable Error
Population Size
Sample size increases as
population increases
Increase is not proportional
Populations of over 5000 require very
little increase in sample size
Population        Sample
50                33
500               78
1000              85
55000-100000      93
Population Variability
(Variables Sampling)
Substantial effect on Sample size
Variability is measured by the Standard Deviation of a
population
Standard Deviation is computed by
Taking the difference of each item from
the mean
Squaring the difference
Adding the squares and averaging
Taking the square root of the average
Effect of Standard Deviation
on Sample Size
As the Standard Deviation increases the
sample size increases
Rule of thumb
Changes in a population's variability affect the
Sample size by the square of the relative
change
Where there is a large deviation,
Stratification may be required
Generally the more widespread the
values, the larger the sample
Expected Error Rate
(Attribute Sampling)
Initial Auditor assessment of expected
population error rate (Deviation Rate or Rate of
Occurrence)
The higher the expected error rate, the larger
the sample
If expected error rate of 1% gave a sample size of 93
Then an expected error rate of 3% would give a
sample size of 361
All other factors being equal
If the sample shows a higher than expected
error rate?
Desired Precision
Also called Desired Allowance for
Sampling Risk
eg Inventory is estimated at R
1,000,000 plus/minus R 200,000
The tighter the desired precision, the
larger the sample size required
Sample size changes by the square of
the relative change in precision
eg +/- R 50,000 is a change in desired
precision by a factor of 4
Sample size would increase by a factor of sixteen
Confidence Level (Reliability)
Percentage of time that the sample adequately
represents the population (ie that the estimation of
value can be x% relied upon)
95% confidence level states that for a given sample size, if
the sample was taken 100 times, 95 times the sample
selected would adequately represent the population
The higher the confidence required, the larger the sample
In Variables Sampling, primary concern is Risk of incorrect
acceptance
In Attribute Sampling, primary concern is Risk of
assessing the control risk too low
Confidence Interval
(Precision)
Tolerable misstatement of a value
eg R 100,000 +/- R 10,000 gives
Confidence Interval of R 90,000 to R
110,000
Primary concern is the risk of
Incorrect Rejection
Relates to efficiency of the audit
Confidence Interval
Found by multiplying a Reliability Factor by
the Standard Error of the Sample (its Standard
Deviation divided by the Square Root of the Sample Size)
Then adding and subtracting from the
Sample Estimate
Assuming a Normal Distribution
eg a 95% confidence level results in a 1.96
reliability factor
Confidence Interval therefore equals
Estimated Value +/- (Reliability Factor x
(Standard Deviation / Square Root of Sample
Size))
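The confidence interval calculation described on this slide can be sketched as a small function. The estimate, standard deviation and sample size in the usage lines are hypothetical figures, and the function name is introduced here for illustration; a Normal Distribution is assumed, as on the slide.

```python
import math

def confidence_interval(estimate, std_dev, sample_size, reliability_factor=1.96):
    """Estimate +/- reliability factor x (standard deviation / sqrt(sample size))."""
    margin = reliability_factor * std_dev / math.sqrt(sample_size)
    return estimate - margin, estimate + margin

# Hypothetical figures: estimate R 100,000, standard deviation R 20,000, 64 items.
low, high = confidence_interval(100_000, 20_000, 64)
print(round(low), round(high))
```

With these assumed inputs the margin is 1.96 x 20,000 / 8 = 4,900, giving an interval of roughly R 95,100 to R 104,900 at 95% confidence.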
Tolerable Error
The maximum rate of deviations the
Auditor will accept
The closer the expected error rate is
to the Tolerable Error, the larger the
sample
The larger the Tolerable Error, the
smaller the sample
Calculating the Sample Size
(Attribute Sampling)
$n = \dfrac{C^2 \, p \, q}{P^2}$
Where
C is the Confidence Coefficient
p is the max error rate
q is 100% - p
P is the desired precision
n is the sample size
For Example
Where the population is 1000, desired
precision is +/- 2%, desired confidence
level is 95% and the estimated error
rate is not to exceed 5% then
C = 1.96 (Confidence Coefficient at
95%)
p = 0.05
q = 0.95
P = 0.02
$n = \dfrac{1.96^2 \times 0.05 \times 0.95}{0.02^2} = \dfrac{0.1825}{0.0004}$
n = 456.2 (rounded up to 457 items)
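The attribute sample size formula n = C²pq / P² can be checked with a short sketch. The function name is hypothetical, and no finite population correction is applied, since the slide's formula does not include one.

```python
import math

def attribute_sample_size(confidence_coefficient, error_rate, precision):
    """n = C^2 * p * q / P^2, with q = 1 - p (no finite population correction)."""
    p = error_rate
    q = 1 - p
    return confidence_coefficient ** 2 * p * q / precision ** 2

# Inputs from the example: 95% confidence, 5% maximum error rate, +/- 2% precision.
n = attribute_sample_size(1.96, 0.05, 0.02)
items_to_test = math.ceil(n)  # a sample cannot contain a fraction of an item
print(round(n, 1), items_to_test)
```

With C = 1.96, p = 0.05 and P = 0.02 the formula evaluates to roughly 456.2, so 457 items would be tested.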
Variables Sampling
Usually deals with monetary values
Can be used for other variable values
Time Periods
Quantities
Weights
Determines estimates of values etc. to
predetermined tolerances
Determine
Population Size
Desired Confidence Level
Desired Precision
but
Instead of error rate
Find the Standard Deviation
Calculating the Sample Size
(Variables Sampling)
$n_1 = \dfrac{C^2 S^2}{P^2}$
Where
n1 is the preliminary sample size
C is the confidence coefficient
S is the standard deviation of the
population
P is the desired precision
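A sketch of the preliminary sample size n1 = C²S² / P² follows. The standard deviation and precision values are hypothetical per-item amounts chosen for illustration, and the function name is an assumption. The second call shows the rule of thumb that tightening precision by a factor of 4 multiplies the sample size by 16.

```python
import math

def variables_sample_size(confidence_coefficient, std_dev, precision):
    """Preliminary sample size n1 = C^2 * S^2 / P^2 (before rounding up)."""
    return (confidence_coefficient * std_dev / precision) ** 2

# Hypothetical inputs: 95% confidence, standard deviation R 500, precision R 100.
n1 = variables_sample_size(1.96, 500, 100)

# Same inputs with precision four times tighter (R 25 instead of R 100).
n1_tight = variables_sample_size(1.96, 500, 25)

print(math.ceil(n1), math.ceil(n1_tight), round(n1_tight / n1))
```

Under these assumptions the preliminary sample is 97 items, rising to 1,537 items when the precision is tightened fourfold.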
Acceptance, Discovery,
Stop-or-Go Sampling
Special applications of Attribute Sampling
An error rate is less than a specified level
(Acceptance)
Sampling for critical errors (Discovery)
Minimum sample to determine an error rate at
a prescribed confidence level (Stop-or-Go)
Standard Deviation
$s = \sqrt{\dfrac{\sum x^2 - (\sum x)^2 / n}{n - 1}}$
Where
s = Standard deviation of the sample
Σ = Sum of
x = Value of each sample item
n = Sample size
For Example
In a population where Mean is 20, three
sample items were drawn, values 11, 20, 29
Σx = 60 and Σ(x²) = 1362
$s = \sqrt{(1362 - 3600/3) / 2} = \sqrt{81}$
s = 9
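The sample standard deviation formula can be verified against the worked example. This sketch implements the computational form from the slide and compares it with `statistics.stdev` from the Python standard library; the function name is hypothetical.

```python
import math
import statistics

def sample_std_dev(values):
    """s = sqrt((sum(x^2) - (sum(x))^2 / n) / (n - 1))."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return math.sqrt((sum_x_squared - sum_x ** 2 / n) / (n - 1))

sample = [11, 20, 29]               # the three sample items from the example
s = sample_std_dev(sample)          # computational form from the slide
s_check = statistics.stdev(sample)  # library result for comparison
print(s, s_check)
```

Both approaches give 9 for the values 11, 20 and 29.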
In a Normal Distribution
Mean of the Distribution +/- one
standard deviation includes 68% of
the area under the curve
Mean +/- two standard deviations
includes 95.5% of the area
Mean +/- three standard deviations
includes 99.7% of the area
Normal Distributions
Std Deviations               Area Under the Curve
(Confidence Coefficient)     (Confidence Level)
1.0                          68%
1.64                         90%
1.96                         95%
2.0                          95.5%
2.58                         99%
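The relationship between confidence coefficients and confidence levels can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch assuming a two-sided interval around the mean; the two function names are introduced here for illustration.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def confidence_coefficient(confidence_level):
    """Number of standard deviations either side of the mean covering the given area."""
    return STANDARD_NORMAL.inv_cdf(0.5 + confidence_level / 2)

def area_within(std_deviations):
    """Area under the curve within +/- the given number of standard deviations."""
    return STANDARD_NORMAL.cdf(std_deviations) - STANDARD_NORMAL.cdf(-std_deviations)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: {confidence_coefficient(level):.3f}")
for k in (1, 2, 3):
    print(f"+/- {k} sd: {area_within(k):.1%}")
```

The computed coefficients are approximately 1.645, 1.960 and 2.576 for 90%, 95% and 99% confidence.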
Probability Proportional to
Size Sampling (PPS)
Also known as Dollar Unit Sampling
(DUS),
Cumulative Monetary Amount (CMA) and
Combined Attribute Variables (CAV) sampling
Commonly used to assess whether
values are overstated
Uses a different formula to determine
Sample Size
Sample Size
$n = \dfrac{BV \times RF}{TE}$
Where
n = Sample size
BV = Book value of the account (eg Accounts
Receivable)
RF = Risk factor (multiplier see below)
TE = Tolerable error (auditor's judgment)
Reliability Factors
Reliability Required    Reliability Factor
99%                     4.605
95%                     2.996
90%                     2.300
For Example
Value of Inventory = R 500,000
Auditor Specified Material Error = R 10,000
Auditor Determined Little effective control (ie
RF=2.6)
$n = \dfrac{BV \times RF}{TE} = \dfrac{500{,}000 \times 2.6}{10{,}000} = \dfrac{1300}{10} = 130$
Sampling Interval is therefore 500,000 / 130 = 3846
Application
Stock    Unit Value    Amount    Cum Amount
30       16            480       480
90       100           9000      9480
92       111           10212     19692
70       40            2800      22492
20       15            300       22792
Sampling Interval = 3846
If no errors found Auditor concludes
Finished Goods Inventory has a maximum overstatement of
10,000 with 95% Reliability
If errors did occur
The average error amount must be projected to the
whole population (Tainting Percentage)
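The PPS sample size and interval selection can be sketched against the five inventory lines shown in the table. The function names and the random start of 1,000 are assumptions made for illustration; in practice the start would be chosen at random within the first interval.

```python
import math
from itertools import accumulate

def pps_sample_size(book_value, risk_factor, tolerable_error):
    """n = BV x RF / TE, rounded up to a whole item."""
    return math.ceil(book_value * risk_factor / tolerable_error)

def pps_select(amounts, interval, start):
    """Return indexes of items containing a selection point.

    Selection points are start, start + interval, start + 2 x interval, ...
    measured along the cumulative monetary amount.
    """
    selected = []
    point = start
    for index, cumulative in enumerate(accumulate(amounts)):
        hit = False
        while point <= cumulative:  # one item may contain several points
            hit = True
            point += interval
        if hit:
            selected.append(index)
    return selected

n = pps_sample_size(500_000, 2.6, 10_000)  # 130 items
interval = 500_000 // n                     # 3846

# Amounts of the five inventory lines listed in the table.
amounts = [480, 9000, 10212, 2800, 300]
chosen = pps_select(amounts, interval, start=1000)  # start is hypothetical
print(n, interval, chosen)
```

With this assumed start, the second, third and fourth lines are selected, while the two smallest lines (480 and 300) are skipped, which illustrates how the method favors high-value items.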
PPS Advantages
PPS tends to select high value items
PPS unaffected by population item variability
PPS may result in a smaller sample size
PPS easy to implement
PPS does not require the normal
approximations required by variables
sampling
PPS Permits a statistically valid sample
selection which includes more high value
items
PPS Disadvantages
PPS requires that the population be
cumulatively totaled
As errors increase the sample size may be
larger than with other sampling methods
PPS primarily designed to detect
overstatements
Zero or negative items are presumed not to
occur
PPS not intuitively as appealing to auditors
Other Sampling Types
Difference Estimation
Determine the difference between audit and book values
Calculate the mean difference
Multiply the mean difference by the number of items in the
population
Allow for sampling risk
Useable where small errors predominate and there is no
skew
Mean-per-unit (MPU) Sampling
Average the audit value of the sample
Multiply it by the population size (Not very accurate)
Ratio Estimation
Multiply the book value of the population by the ratio of
audit value to book value of the sample
Useable where small errors predominate and there
is no skew
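The three estimation methods can be compared side by side on one small sample. All figures below (sample values, population size and population book value) are hypothetical, as are the function names, and the allowance for sampling risk is left out of this sketch.

```python
def difference_estimate(book_values, audit_values, population_size):
    """Projected total difference: mean (audit - book) x number of items in population."""
    differences = [a - b for a, b in zip(audit_values, book_values)]
    return sum(differences) / len(differences) * population_size

def mean_per_unit_estimate(audit_values, population_size):
    """Projected population value: mean audit value x number of items in population."""
    return sum(audit_values) / len(audit_values) * population_size

def ratio_estimate(book_values, audit_values, population_book_value):
    """Projected population value: population book value x (sample audit / sample book)."""
    return population_book_value * sum(audit_values) / sum(book_values)

# Hypothetical sample of four items from a population of 1,000 items
# with a recorded book value of R 260,000.
book = [100, 200, 300, 400]
audit = [100, 190, 300, 390]

projected_difference = difference_estimate(book, audit, 1000)
mpu_value = mean_per_unit_estimate(audit, 1000)
ratio_value = ratio_estimate(book, audit, 260_000)
print(projected_difference, mpu_value, ratio_value)
```

With these assumed figures the projected difference is an overstatement of R 5,000, while the MPU and ratio methods project population values of R 245,000 and R 254,800 respectively.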
Non-Parametric Tests
No assumption of normal distribution
Wald-Wolfowitz Runs Test
Mann-Whitney or Wilcoxon Test
Kolmogorov-Smirnov Test
Fisher's Exact Test
All use two samples
Population Analysis
[Figure: scatter plot of X against Y showing regions N1 and N2, points o1 and o2, and region O3]
N1 and N2 are regions of normal behavior
Points o1 and o2 are anomalies
Points in region O3 are anomalies
Types of Data
Nominal
Person’s name, Country etc.
Ordinal
Information with a natural sequence
Eg finishing order in a race
Interval
Ordinal Data with equal intervals
Ratio
Interval measurable from a fixed base
Data Analysis Software
ACL Audit Analytics
Powerful program for data analysis
Most widely used by auditors worldwide
CaseWare’s IDEA
Recent versions include an increasing number of
fraud techniques
ACL’s primary competitor
Correlation Analysis
Measurement of the extent of association of one
variable with another
Two variables are said to be correlated when they
move together in a detectable pattern
Direct correlation is said to exist when both
variables increase or decrease at the same time
although not necessarily by the same amount
Correlation analysis is used by internal auditors
to identify those factors which appear to be
related
Examples
Regressions
Regression analysis infers a causal relationship
between the two sets of data, so that not only is the
data related, but a change in one will cause a
change in the other
Regression is also referred to as the least squares
method
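Correlation and least squares regression can be sketched in plain Python. The monthly figures below are hypothetical and deliberately lie on a straight line so the result is easy to follow; real audit data would scatter around the fitted line. The function names are introduced here for illustration.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def least_squares_line(xs, ys):
    """Slope and intercept of the line minimizing the sum of squared errors."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: units shipped (x) and freight cost (y) for five months.
units = [1, 2, 3, 4, 5]
freight = [3, 5, 7, 9, 11]

r = pearson_correlation(units, freight)
slope, intercept = least_squares_line(units, freight)
print(r, slope, intercept)
```

A coefficient near +1 or -1 signals a strong linear association; the fitted line can then be used to flag months whose actual cost falls far from the predicted value.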
AuditNet® and cRisk Academy
If you would like
forever access to this
webinar recording
If you are watching
the recording, and
would like to obtain
CPE credit for this
webinar
Previous AuditNet®
webinars are also
available on-demand
for CPE credit
http://criskacademy.com
http://ondemand.criskacademy.com
Use coupon code: 50OFF
for a discount on this
webinar for one week
Thank You!
Jim Kaplan
AuditNet® LLC
1-800-385-1625
Email:info@auditnet.org
www.auditnet.org
Richard Cascarino & Associates
Cell: +1 970 819 7963
Tel +1 303 747 6087 (Skype Worldwide)
Tel: +1 970 367 5429
eMail: rcasc@rcascarino.com
Web: http://www.rcascarino.com
Skype: Richard.Cascarino