Statistical Analysis of Left-Censored Geochemical Data
Tomlinson et al. (2016) - sediment & biota
1. Using multivariate statistics to identify
analyte sources in sediments & biota at a
shallow-water, military munitions
disposal site off leeward O‘ahu, Hawai‘i
Michael S. Tomlinson Eric H. De Carlo
Geoffrey L. Carton Dennis R. Helsel
5. The Study
• 141 sediment samples from 4 strata
– DMM (discarded military munitions)
– CON (control)
– NPS (nonpoint source)
– WWT (wastewater treatment)
• 286 biota samples from 4 organism types
– Limu (seaweed)
– He‘e (octopus)
– Weke (fish)
– Pāpaʻi kua loa (Kona crab)
6. The Study (continued)
• 5 sediment sampling events (different
seasons)
• 4 biota sampling events (different seasons)
• Sediments & biota analyzed for elements &
energetics (propellants & explosives)
• Multiple nondetects
• Multiple detection levels
7. Nondetects (NDs) are data!
(Partial table below is a good format for biogeochemical data)
The “U” data qualifier inserted by the data validator confirms the lab result,
i.e., “ND”. Note: “ND” provides NO information without the detection limit (DL).
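To make the point concrete, here is a minimal sketch of how a validated result, its qualifier, and its DL could be turned into an interval. The record layout, the `to_interval` helper, and all values are hypothetical and are not taken from the study's table; the sketch assumes a nondetect lies somewhere between 0 and its DL.

```python
def to_interval(result, qualifier, detection_limit):
    """Return (low, high) bounds for one measurement.

    Assumption: a "U"-qualified result is a nondetect known only to lie
    between 0 and its detection limit; any other result is a point value.
    """
    if qualifier == "U":
        return (0.0, detection_limit)
    return (result, result)


# Hypothetical records; the analytes are from the talk, the numbers are not.
records = [
    {"analyte": "Cu", "result": 12.0, "qualifier": "", "dl": 0.5},
    {"analyte": "Pb", "result": None, "qualifier": "U", "dl": 2.0},
    {"analyte": "Zn", "result": None, "qualifier": "U", "dl": 0.5},
]

intervals = [to_interval(r["result"], r["qualifier"], r["dl"]) for r in records]
for rec, (low, high) in zip(records, intervals):
    print(rec["analyte"], low, high)
```

Without the `dl` field, the two "U" rows could not be converted at all, which is why "ND" alone carries no information.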
8. So what do you do with nondetects (NDs)?
• Ignore
• 0
• ½DL
• DL
• RL
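A small sketch of why the choice matters: the same data set gives a different mean for each substitution rule. The values, DLs, and the `substituted_mean` helper are hypothetical and serve only to illustrate the arithmetic.

```python
def substituted_mean(values, censored, fraction):
    """Mean after replacing each nondetect with fraction * DL.

    values[i] is the measured concentration, or the DL when censored[i] is True.
    """
    filled = [v * fraction if c else v for v, c in zip(values, censored)]
    return sum(filled) / len(filled)


# Hypothetical data: two nondetects (<1.0 and <2.0) among three detects.
values = [0.7, 1.0, 2.0, 3.0, 5.0]
censored = [False, True, True, False, False]

means = {
    "0": substituted_mean(values, censored, 0.0),
    "half DL": substituted_mean(values, censored, 0.5),
    "DL": substituted_mean(values, censored, 1.0),
}

# "Ignore" drops the nondetects entirely and averages only the detects.
detects = [v for v, c in zip(values, censored) if not c]
means["ignore"] = sum(detects) / len(detects)

for rule, m in means.items():
    print(rule, round(m, 2))
```

The spread between the four answers comes entirely from an arbitrary choice, not from the data.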
9. We used some of the multivariate statistical techniques described in
10. These techniques included:
• Summary statistics using:
– Kaplan-Meier
– Regression on order statistics
• Kendall’s tau nonparametric correlation
• Interval-censored score test (analogous to
generalized Wilcoxon test)
• Nonmetric multidimensional scaling (NMDS) using
interval-censored data (discussed today)
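As an illustration of the first item, the following is a minimal sketch of a Kaplan-Meier mean for left-censored data. It assumes the usual "flipping" device: every value is subtracted from a constant larger than the maximum, so nondetects become right-censored and the standard estimator applies. The function name and data are hypothetical, and the slides do not say which software the study used.

```python
def km_mean_left_censored(values, censored):
    """Kaplan-Meier estimate of the mean for left-censored data.

    values[i] is the concentration, or the DL when censored[i] is True.
    The mean is restricted: if the smallest value is a nondetect, its
    probability mass is placed at that DL.
    """
    flip = max(values) + 1.0
    # Flipped data are right-censored; at ties, detects sort before nondetects.
    data = sorted((flip - v, c) for v, c in zip(values, censored))
    at_risk = len(data)
    survival = 1.0
    area = 0.0          # area under the survival curve = mean of flipped data
    previous_t = 0.0
    i = 0
    while i < len(data):
        t = data[i][0]
        area += survival * (t - previous_t)
        events = sum(1 for tt, c in data if tt == t and not c)
        total = sum(1 for tt, c in data if tt == t)
        if events:
            survival *= 1.0 - events / at_risk
        at_risk -= total
        previous_t = t
        i += total
    return flip - area


# Hypothetical data with two detection limits (<1.0 and <2.0).
values = [0.7, 1.0, 2.0, 3.0, 5.0]
censored = [False, True, True, False, False]
km_mean = km_mean_left_censored(values, censored)
print(round(km_mean, 2))
```

With no nondetects the function reduces to the ordinary arithmetic mean, and no value is ever substituted for a nondetect.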
12. We applied nonmetric multidimensional
scaling (NMDS) to the interval-censored
data to identify analyte sources
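Before any scaling, interval-censored values must be put on a common, comparable footing. One rank-based way to do that is a u-score: each observation is scored by how many others it definitely exceeds minus how many definitely exceed it, with overlapping intervals counted as ties. The sketch below assumes that approach; whether the study used exactly this scoring is not stated in the slides, and the names and data are hypothetical.

```python
def compare(a, b):
    """Return +1 if interval a is definitely above b, -1 if below, else 0.

    Intervals are (low, high). Assumption: a nondetect's upper bound is
    exclusive, so a detect equal to the DL ranks above that nondetect.
    """
    if a == b:
        return 0
    if a[0] >= b[1]:
        return 1
    if b[0] >= a[1]:
        return -1
    return 0  # intervals overlap: the order is unknown, so treat as a tie


def u_scores(intervals):
    """u-score of each observation: wins minus losses against all others."""
    return [
        sum(compare(a, b) for j, b in enumerate(intervals) if j != i)
        for i, a in enumerate(intervals)
    ]


# Hypothetical analyte measured in five samples: <1.0, 0.7, <2.0, 3.0, 5.0
intervals = [(0.0, 1.0), (0.7, 0.7), (0.0, 2.0), (3.0, 3.0), (5.0, 5.0)]
scores = u_scores(intervals)
print(scores)
```

Scores like these can be computed for every analyte and then fed to a distance measure, which in turn supplies the dissimilarities that NMDS needs.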
13. NMDS revealed distinct analyte clusters
Notice how DMM analytes (Mg, Pb, Cu, & Zn & energetics) cluster
And, notice how terrestrial elements cluster
(Typically, a Kruskal’s stress ≤ 0.20
indicates the pattern is not random)
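For readers unfamiliar with the stress statistic, here is a minimal sketch of Kruskal's stress-1 for a given ordination. It compares the plotted distances with the best monotone fit to the rank order of the input dissimilarities (found by pool-adjacent-violators). The function names and the three-point example are hypothetical, and ties in the dissimilarities are simply left in sorted order.

```python
import math


def pava(y):
    """Pool-adjacent-violators: least-squares nondecreasing fit to y."""
    blocks = []  # each block holds [sum, count]
    for v in y:
        blocks.append([v, 1])
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fit = []
    for s, c in blocks:
        fit.extend([s / c] * c)
    return fit


def kruskal_stress(dissim, coords):
    """Stress-1 of a configuration `coords` against the matrix `dissim`."""
    n = len(coords)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    pairs.sort(key=lambda p: dissim[p[0]][p[1]])
    dist = [math.dist(coords[i], coords[j]) for i, j in pairs]
    disparities = pava(dist)  # monotone in the dissimilarity order
    num = sum((d - h) ** 2 for d, h in zip(dist, disparities))
    den = sum(d * d for d in dist)
    return math.sqrt(num / den)


coords = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
good = [[0, 1, 3], [1, 0, 2], [3, 2, 0]]  # same rank order as the plot
bad = [[0, 3, 1], [3, 0, 2], [1, 2, 0]]   # rank order reversed

stress_good = kruskal_stress(good, coords)
stress_bad = kruskal_stress(bad, coords)
print(stress_good, round(stress_bad, 3))
```

A stress of 0 means the plot preserves the rank order of the dissimilarities perfectly; larger values mean the two-dimensional picture distorts that order more.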
14. And, if you overlay the samples…
Most DMM samples cluster with the DMM analytes
and most samples influenced by terrestrial processes
cluster with terrestrial analytes
15. And, now something for the biologists –
• NMDS of biota (Hawaiian food)
• No strong patterns by strata, but…
16. Not surprisingly, data clustered by organism
Cu-based hemocyanin & Cu- & Zn-based enzymes (White & Rainbow, 1985)
17. Only limu exhibited clustering by strata – Why?
• Biology differs from other organisms?
• Sessile rather than motile organism
18. Conclusions
• Several multivariate statistical routines can work with
left-censored data that have multiple DLs
• Substitution (e.g., ½DL) is neither necessary nor
recommended
• Nonmetric multidimensional scaling (NMDS) was able to
identify the sources of most analytes in sediment
• Overlaying the NMDS results for sediment samples generally
corroborated the analyte results
• The elements Cu, Zn, Pb & Mg and energetics clustered with
each other and with the DMM samples
• Terrestrial elements clustered with samples from the CON,
NPS, and WWT strata
19. Conclusions (continued)
• NMDS plots of biota (typical Hawaiian food), not surprisingly,
clustered by organism
• Octopus & crabs clustered with each other & with Cu & Zn
– Cu: because the blood of both organisms contains
hemocyanin
– Cu & Zn: because of the high concentrations of Cu & Zn
enzymes found in both organisms
• Only limu kohu or asparagus seaweed (Asparagopsis
taxiformis) showed any clustering by strata. Why?
– Different biology (plant rather than animal)?
– Sessile rather than motile
• Multivariate statistical analyses such as NMDS can aid in the
identification of analyte sources
20. Acknowledgments
My coauthors
– Eric H. De Carlo (UHM Oceanography)
– Geoffrey L. Carton (CALIBRE Systems)
– Dennis R. Helsel (Practical Stats)
And a special Mahalo to Mr. Keoki Stender
for allowing me to use his marine life photos
www.keokiscuba.com/
www.marinelifephotography.com
21. Mahalo nui loa! Questions?
Michael Tomlinson – mtomlins@hawaii.edu