L7 method validation and modeling

Method validation and modeling
of data
Research Methodology
Seppo Karrila
2017

Executive summary
• Key concepts for scientific work with
measurements are discussed, including
accuracy, precision, reproducibility,
repeatability, full replicates, pseudoreplicates,
etc.
• These eventually lead to useful experiments
and data, which are made even more useful by
modeling

Measurements with known accuracy
• Usually given as absolute or relative accuracy, as
in “within 1 gram of true weight” or “within 5%
of true intensity”
• Accuracy describes how far a reading may deviate from the true value
• When you compute with inaccurate values, you
can estimate the accuracy of the result
– In a term with multiplications and divisions, all the
relative inaccuracies get added together (worst case, to first order)
– For example x^3 = x*x*x, so the relative inaccuracy of x is
added 3 times
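
The rule for x^3 can be checked numerically. The following is a minimal sketch, assuming a hypothetical reading x = 2.0 known to within 1%; the function name is illustrative only. It compares the "add the relative inaccuracies" rule with the exact worst case.

```python
def relative_error_of_product(rel_errors):
    """Worst-case relative error of a product or quotient:
    the sum of the relative errors of its factors (first-order rule)."""
    return sum(rel_errors)

# Hypothetical reading: x = 2.0, known to within 1% (relative error 0.01).
x = 2.0
rel_x = 0.01

# x^3 = x*x*x, so the relative error of x enters three times.
rel_cube = relative_error_of_product([rel_x, rel_x, rel_x])

# Exact worst case, computed from the largest value x could have.
exact_worst = ((x * (1 + rel_x)) ** 3 - x ** 3) / x ** 3

print(f"rule of thumb:    {rel_cube:.4f}")
print(f"exact worst case: {exact_worst:.4f}")
```

The two numbers agree closely for small inaccuracies; the rule is a first-order approximation, so the gap grows as the relative inaccuracy grows.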

This works well with physical
measurements
• Mass, current, voltage, time, etc.
– The supplier guarantees some accuracy level
– But the worst case accuracy estimates for physical
formulas are often so pessimistic they are not
really useful
• However, this does not work well in, for
example, analytical chemistry, because of the
many potential disturbances to the measurement
– The analyte is in an unknown “matrix” with its
own peculiar properties

In biological testing…
• The results are often subject to many possible
disturbances.
– You can’t know that they are even near correct
unless you include “positive and negative
controls”
– When looking for a treatment effect, the negative control
can be a placebo treatment (everything else equal
but no active chemical), and the positive control can be a
previously known potent treatment

Research should be reproducible
• Because the results should be predictive. The
least demanding prediction is that the same
results can be obtained repeatedly if your
whole experiment is duplicated.
• So your reporting must also provide
enough detail that someone else could
duplicate your work
– They will bother only if the results appear
important.

Prediction at what accuracy?
• The goal of science: summarizing experiences
in a useful way, so they can be learned easily
(from understanding on summary level
instead of memorizing lots of details), and
predictions can be made (about future effects
of decisions or design or actions)
• But no prediction will be perfectly accurate
– How do we describe accuracy?

Significant digits and rounding
• One significant digit: 1; 10; 100,000
• Two significant digits: 1.0; 10±1; (10±1)*10,000
• If the diameter is 21 cm, what is the
circumference of a circle?
Pi*21 = 3.1415926...*21 = 65.9734457...
But wait. You have to round to two digits, otherwise you
mislead the reader who thinks you measured at
nanometer accuracy. (Yes, the readers pretend to be
stupid. Especially me and all reviewers.)
Answer: Circumference is approximately 66 cm.
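
The rounding step can be automated. This is a minimal sketch; `round_sig` is a hypothetical helper written for this example, not a standard library function.

```python
import math

def round_sig(value, digits):
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit, e.g. 1 for 65.97, -2 for 0.0123.
    exponent = math.floor(math.log10(abs(value)))
    return round(value, digits - 1 - exponent)

diameter_cm = 21  # measured to two significant digits
circumference_cm = math.pi * diameter_cm

print(circumference_cm)                # many digits, most of them fictional
print(round_sig(circumference_cm, 2))  # 66.0
```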

You should not mislead
• Your calculator or computer can easily show a
lot of decimals
– Get a computer algebra package if you want even
more, they allow “unlimited” precision
computations (wxMaxima is free, you can check
what it does on web, I don’t think you need it as a
distraction)
• You MUST NOT show fictional accuracy in
results. If you know only two digits correctly,
showing 8 is misleading.

How can I know the accuracy of a
measurement?
• Often you really can’t, but for physical
measurements there are calibration
standards.
– Correct result is known, calibration corrects
reading to what it should be (within some
accuracy)
• What you can find out is precision
– If I know the samples are similar, or the same
sample can be tested multiple times, how
precisely are the readings similar?

Precision versus accuracy
• Precision means small changes can be detected
reliably
• Accuracy means that in addition to being precise,
the reading is actually correct
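
One common way to put numbers on the two concepts is to measure a calibration standard several times. The sketch below uses made-up readings and an assumed true value of 100.0: the spread of the readings estimates precision, and the offset of their mean from the true value shows the lack of accuracy.

```python
import statistics

# Hypothetical repeated readings of a calibration standard.
true_value = 100.0  # assumed known value of the standard
readings = [102.1, 101.9, 102.0, 102.2, 101.8]

mean_reading = statistics.mean(readings)
bias = mean_reading - true_value     # systematic offset: limits accuracy
spread = statistics.stdev(readings)  # sample standard deviation: precision

print(f"mean reading: {mean_reading:.2f}")
print(f"bias:         {bias:+.2f}")
print(f"spread (sd):  {spread:.2f}")
```

In this invented case the instrument is precise (small spread) but not accurate (bias of about 2 units), which is what calibration is meant to correct.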

Important difference between repeatability
and reproducibility
• Repeatability:
– You in your own lab can repeat an experiment
with similar results
• Reproducibility:
– Someone else someplace else can do the same,
with their “equivalent” equipment

About replicates in experimental
design
• Full experimental replication
– You repeat the whole experiment, from beginning to end,
and get replicate measurement values
• Technical or pseudoreplication
– Your experiments have produced a sample, you repeat
only the measurement (perhaps with different
subsamples in destructive measurement)
• So when someone says “three replicates”, you need to
ask which type of replicates
– The latter just allows getting hopefully more precise
measurements by averaging over technical replicates
– The former indicates if the actual experiment could be
repeated in the same lab by the same team
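
The difference shows up in how the data are analyzed. The sketch below uses invented numbers for 3 full replicates, each measured 3 times; the run names and values are assumptions for illustration. Technical replicates are averaged within each run, and only the run averages are compared with each other.

```python
import statistics

# Hypothetical data: 3 full experimental replicates ("runs"),
# each with 3 technical replicates of the measurement.
runs = {
    "run 1": [10.1, 10.3, 10.2],
    "run 2": [11.0, 10.8, 10.9],
    "run 3": [9.6, 9.4, 9.5],
}

# Averaging technical replicates only sharpens the value for each run.
run_means = {name: statistics.mean(values) for name, values in runs.items()}

# Spread within a run reflects the precision of the measurement step.
within_run_sd = [statistics.stdev(values) for values in runs.values()]

# Spread between run means reflects repeatability of the whole experiment.
between_run_sd = statistics.stdev(list(run_means.values()))

print("within-run sd:", [round(sd, 2) for sd in within_run_sd])
print("between-run sd:", round(between_run_sd, 2))
```

Here the measurement is much more precise than the experiment is repeatable, which nine pooled values treated as "nine replicates" would hide.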

When you have a new experimental
set-up…
• Think about possible disturbances
– Can you eliminate them, reduce their effects, or can you
measure and record them for later analysis?
– For example, in a calorimetry measurement you can use thermal
insulation to reduce leakage of heat, then measure what
leakage rate actually remains
• Test if you can detect differences between extreme cases
– If not, then the observations so far are not useful
• Eventually when you have a working routine that at least
differentiates between extremes
– Then you can start demonstrating that the results are useful for
some purpose…
– If they are, the routine may be developed further into a validated
method

There are many concepts used in
analytical method validation
• The most important ones are here, for a user.
• For details about such validation, see the references

Estimates vs. actual quantity
• For example, for polymers we can define the
degree of crosslinking
– Direct observation of it is difficult
– Instead there are alternative practical methods that
give (alternative and different) estimates
• Indirect methods give you estimates only
– Do not use notation or naming that could confuse
these with the actual quantity
– An estimate of my weight is not the same as my
measured weight

How can you test reliability of a
measurement service?
• Suppose you have samples A, B and C to be measured for
analyte concentration
– You could send for analysis:

Sample label shown to lab    Content
Sample 1                     B
Sample 2                     C
Sample 3                     B
Sample 4                     A
Sample 5                     50:50 A+B mix
Sample 6                     C
Sample 7                     A spiked with analyte

– You know S1=S3=B, S2=S6=C, S5=(A+B)/2, etc. IF THE
MEASUREMENT RESULTS ARE CORRECT
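
Once the lab reports back, the known relations between the samples can be checked mechanically. This is a minimal sketch with invented results; the 5% tolerance and the spike size are assumptions, and the spike is taken as a known increase in concentration.

```python
# Hypothetical concentrations reported by the lab for Samples 1 to 7.
reported = {1: 5.1, 2: 8.0, 3: 4.9, 4: 2.0, 5: 3.6, 6: 8.3, 7: 3.1}

tolerance = 0.05  # assumed acceptable relative difference (5%)
spike = 1.0       # assumed known concentration increase in Sample 7

def agrees(x, y, tol=tolerance):
    """True if x and y differ by at most tol relative to their mean."""
    return abs(x - y) <= tol * (abs(x) + abs(y)) / 2

# Best estimate of B from its two blind duplicates.
b_estimate = (reported[1] + reported[3]) / 2

checks = {
    "S1 = S3 (both B)": agrees(reported[1], reported[3]),
    "S2 = S6 (both C)": agrees(reported[2], reported[6]),
    "S5 = (A+B)/2": agrees(reported[5], (reported[4] + b_estimate) / 2),
    "S7 = A + spike": agrees(reported[7], reported[4] + spike),
}

for name, ok in checks.items():
    print(f"{name}: {'consistent' if ok else 'INCONSISTENT'}")
```

Passing all checks does not prove the results are correct, but a failed check shows that something in the service is unreliable.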

Why make models?
• You can accumulate a large amount of
measurement data
– By itself, the data is only useful as a predictor if someone
duplicates the experiments
– If you can give a model, it allows
• Representing your data in small convenient form
• Interpolation between your data points, simulations
• Numerical optimization for “best decision”
• Or, fitting a published model may show your data conforms
to known theory, or in this sense matches earlier
experiments
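
As an illustration of "small convenient form" and interpolation, the sketch below fits a straight line by ordinary least squares to invented calibration data. The data, the choice of a linear model, and the `predict` helper are all assumptions for the example.

```python
# Hypothetical data: concentration (x) versus instrument response (y).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [0.1, 2.1, 3.9, 6.2, 7.9]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Ordinary least-squares estimates for the line y = intercept + slope * x.
s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
s_xx = sum((xi - mean_x) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

def predict(x_new):
    """Interpolate the response at x_new using the fitted line."""
    return intercept + slope * x_new

print(f"model: y = {intercept:.2f} + {slope:.2f} * x")
print(f"predicted response at x = 2.5: {predict(2.5):.2f}")
```

Five data pairs are replaced by two numbers, and the model gives a value at x = 2.5 where nothing was measured. Predictions outside the measured range (extrapolation) deserve much less trust.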

So the next few lectures
• Are about checking your data with
visualization, fitting models, and other
statistical tools
• Recall that your goal is to answer a research
question, and you have experiments AND analysis
planned to do this
• Unless you have complicated theory available,
the analysis tends to be fairly simple

Summary of key points
• Repeatability and reproducibility are
cornerstones of trustworthy experimental
science
• These relate to precision, accuracy, validation,
and eventually getting useful data and making
models with it

References (open access)
• Belouafa, S., Habti, F., Benhar, S., Belafkih, B.,
Tayane, S., Hamdouch, S., … Abourriche, A.
(2017). Statistical tools and approaches to
validate analytical methods: methodology and
practical examples. International Journal of
Metrology and Quality Engineering, 8, 9.
https://doi.org/10.1051/ijmqe/2016030
• Huber, L. Validation of analytical methods.
https://www.agilent.com/cs/library/primers/public/5990-5140EN.pdf
