2. Acknowledgement
Ed Roehl – CTO
• World class industrial researcher;
• Software design, development, and project
management;
• Advanced process engineering, computer-
based modeling and optimization methods,
industrial R&D, product/process design
automation, CAE, PDM;
• Data mining, multivariate analysis, predictive
modeling, simulation, advanced control, signal
processing, non-linear/chaotic systems,
computational geometry;
• AI, expert systems, OOP/computer languages,
machine learning/artificial neural networks.
Uwe Mundry, Partner
• World class software design, development;
• Multi-spectral and hyper-spectral imaging and
pattern recognition, 4D medical imaging, 4D
geographical imaging, homeland security
applications, real-time decision support
systems with industrial applications; Data
mining, multivariate analysis, predictive
modeling, simulation, advanced control, signal
processing, non-linear/chaotic systems,
computational geometry, machine
learning/artificial neural networks;
OOP/multiple computer languages; Medical
and environmental imaging.
3. Why optimize your plant?
• Reduced operating budgets (10% very
common)
• Increasingly stringent regulations
--Water treatment?
--Wastewater treatment?
• Increasing cost of capital improvements
--USD worth less
--QE2 will lower value of debt instruments such
as bonds
4. Process optimization by modeling
1. Modeling processes through various
means
a. Bench-scale models
b. Pilot-scale models
c. Mathematical models
1) Deterministic/mechanistic—based on first principles
2) Empirical—either statistical or based upon some
optimal function to describe behavior
3) Hybrid of 1) and 2)
5. Process optimization by modeling
What is a mathematical model?
“...consistent set of mathematical equations which
is thought to correspond to some other entity, its
prototype.” – Rutherford Aris
6. Definitions for pilot-scale modeling
• Geometric Similarity: All lengths of the model and the
prototype must be in the same ratio. All corresponding
angles must be equal. [This is the easy one to achieve.]
• Kinematic Similarity: Ratios of fluid velocity and other
relevant velocities must be the same for the model and
prototype. Ratios of flow time scale and boundary time
scale must be the same. [Problems with laminar/turbulent.]
• Dynamic Similarity: The force polygons for the model
and prototype must be proportional. Relevant forces
include inertia, pressure, viscous, and surface tension
forces.
7. Equations of importance
• $R = \rho V \ell / \mu$ (Reynolds number; very important!)
• $W = \rho V^2 \ell / \sigma$ (Weber number; surface tension effects)
• $F = V / (g \ell)^{1/2}$ (Froude number; free surface effects)
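The three numbers above can be evaluated directly. The sketch below is illustrative only: the fluid properties are approximate values for water near 20 °C, and the velocities and lengths are made-up numbers, not data from the presentation. It shows that a 1:10 pilot model run at the velocity that matches the Froude number cannot also match the Reynolds number.

```python
import math

def reynolds(rho, v, length, mu):
    """R = rho * V * l / mu (inertial vs. viscous forces)."""
    return rho * v * length / mu

def weber(rho, v, length, sigma):
    """W = rho * V^2 * l / sigma (inertial vs. surface tension forces)."""
    return rho * v ** 2 * length / sigma

def froude(v, length, g=9.81):
    """F = V / sqrt(g * l) (inertial vs. gravity forces)."""
    return v / math.sqrt(g * length)

# Assumed properties of water near 20 C (approximate, SI units).
RHO, MU, SIGMA = 998.0, 1.0e-3, 0.0728

# Hypothetical full-scale and 1:10 pilot-scale cases.
full_v, full_l = 0.5, 1.0
pilot_l = 0.1
pilot_v = full_v * math.sqrt(pilot_l / full_l)  # velocity that matches Froude

froude_ratio = froude(pilot_v, pilot_l) / froude(full_v, full_l)
reynolds_ratio = reynolds(RHO, pilot_v, pilot_l, MU) / reynolds(RHO, full_v, full_l, MU)
weber_ratio = weber(RHO, pilot_v, pilot_l, SIGMA) / weber(RHO, full_v, full_l, SIGMA)

print(f"Froude ratio   (pilot/full): {froude_ratio:.3f}")
print(f"Reynolds ratio (pilot/full): {reynolds_ratio:.3f}")
print(f"Weber ratio    (pilot/full): {weber_ratio:.3f}")
```

With the same fluid in both systems, matching one dimensionless number generally leaves the others mismatched, which is one way to see why dynamic similarity is hard to achieve at pilot scale.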
8. Scale-up problems with models
1. For bench-scale and pilot-scale:
a. Example of problems with scale-up for
simple drag coefficient, $C_D$:
$C_D = f(R, W, F, \alpha)$
[Where is this important for water treatment?]
c. Pilot-scale testing is good for comparing
one pilot train with another pilot train but not
for finding absolute numbers for full-scale
9. So what of models?
“Models are undeniably beautiful, and a man may
justly be proud to be seen in their company. But
they may have their hidden vices. The question is,
after all, not only whether they are good to look at,
but whether we can live happily with them.”
--Abraham Kaplan, The Conduct of Inquiry
10. Another problem: chaotic behavior
• “Deterministic evolution of a nonlinear system
which is between regular behavior and
stochastic behavior.” – Abarbanel
• “The property that characterizes a dynamical
system in which most orbits exhibit sensitive
dependence.” – Lorenz
• “Neither periodic nor stochastic behaviors that
have structure in state/feature space, making
them somewhat predictable.” – ADMi
11. Lorenz attractor shows problem
• Poster child of chaos
• Purely synthetic, derived from 3 equations
– dx/dt = -σx + σy
– dy/dt = -xz + rx – y
– dz/dt = xy – bz
[Figure: signal and 3D delay plot showing “orbitals” (mode 1 and mode 2) and “extreme sensitivity to changes in boundary conditions”]
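The sensitivity this slide describes can be reproduced numerically. The following is a minimal sketch, not the method used to produce the slide's plot: it integrates the three equations with a fourth-order Runge-Kutta step, assuming the commonly used parameter values σ = 10, r = 28, b = 8/3 (the slide does not give values), and compares two runs whose starting points differ by 1e-8.

```python
import math

def lorenz(state, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Right-hand side of the three Lorenz equations (assumed parameters)."""
    x, y, z = state
    return (-sigma * x + sigma * y, -x * z + r * x - y, x * y - b * z)

def rk4_step(f, state, dt):
    """Advance the state by one classical Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )

def integrate(state, steps, dt=0.01):
    """Return the trajectory as a list of (x, y, z) tuples."""
    trajectory = [state]
    for _ in range(steps):
        state = rk4_step(lorenz, state, dt)
        trajectory.append(state)
    return trajectory

# Two runs that start almost at the same point.
run_a = integrate((1.0, 1.0, 1.0), 3000)
run_b = integrate((1.0, 1.0, 1.0 + 1e-8), 3000)

# Distance between the two trajectories at every step.
gap = [math.dist(p, q) for p, q in zip(run_a, run_b)]
print(f"initial gap: {gap[0]:.1e}, largest gap: {max(gap):.2f}")
```

The gap starts at about 1e-8 and grows to the size of the attractor itself, even though both runs follow the same deterministic equations.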
14. Modeling chaotic behavior, 1
State Space Reconstruction (SSR)
• SSR is the means by which complex, constantly changing
processes can be represented in straightforward geometric
terms for visualization and modeling. SSR is like super
trending. It suggests that a process's state space can be
optimally, but not perfectly, characterized by state vectors
Y(t). The vectors are constructed using an optimal number
of measurements, equal to the “local dimension” $d_L$
(Abarbanel, 1996), that are spaced apart in time
by integer multiples of an optimal time delay $d$.
Mathematically:
• $Y(t) = [x(t),\ x(t - d),\ x(t - 2d),\ \ldots,\ x(t - (d_L - 1)d)]$ (eq. 1)
• Note that here Y(t) is univariate. Values of $d_L$ and $d$ are
estimated analytically or experimentally from the data.
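As an illustration of eq. 1, the sketch below builds the state vectors from a single series. The function name `delay_embed` and the toy series are hypothetical; in practice the dimension and delay would be estimated from the data, as the slide notes.

```python
def delay_embed(x, dim, delay):
    """Build state vectors Y(t) = [x(t), x(t-d), ..., x(t-(dim-1)d)].

    Vectors are only produced for times t that have enough history.
    """
    start = (dim - 1) * delay
    return [[x[t - j * delay] for j in range(dim)] for t in range(start, len(x))]

# Toy series where x(t) = t, so each vector is easy to check by eye.
series = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
vectors = delay_embed(series, dim=3, delay=2)

# The first vector is Y(4) = [x(4), x(2), x(0)].
print(vectors[0])
print(len(vectors))
```

Each row of `vectors` is one point in the reconstructed state space, so a 3D delay plot is simply a scatter of these rows.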
15. Modeling chaotic behavior, 2
• For a multivariate process of k independent variables:
• $Y(t) = \{[x_1(t),\ x_1(t - d_1),\ \ldots,\ x_1(t - (d_{L1} - 1)d_1)],\ \ldots,\ [x_k(t),\ x_k(t - d_k),\ \ldots,\ x_k(t - (d_{Lk} - 1)d_k)]\}$ (eq. 2)
• This provides each variable with its own $d_L$ and $d$. A further
generalization provides non-fixed time delay spacing
for each variable:
• $Y(t) = \{[x_1(t),\ x_1(t - d_{1,1}),\ \ldots,\ x_1(t - (d_{L1} - 1)d_{1,d_{L1}-1})],\ \ldots,\ [x_k(t),\ x_k(t - d_{k,1}),\ \ldots,\ x_k(t - (d_{Lk} - 1)d_{k,d_{Lk}-1})]\}$ (eq. 3)
• Determining the best variables $x_k$ to use, and properly
estimating dimensions $d_{Lk}$ and time delays $d_k$ by analytical
or experimental means, helps to ensure that a given
process can be successfully reconstructed.
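A minimal sketch of the fixed-delay multivariate case (eq. 2) follows. The function name and the toy data are hypothetical, and the per-variable vectors are concatenated into one flat row per time step, which is a layout choice made here for convenience rather than something the slides specify.

```python
def embed_multivariate(series, dims, delays):
    """Concatenate per-variable delay vectors into one row per time step.

    series: list of equally sampled sequences, one per variable
    dims:   embedding dimension for each variable
    delays: time delay (in samples) for each variable
    """
    # Earliest time at which every variable has enough history.
    start = max((dim - 1) * delay for dim, delay in zip(dims, delays))
    n = min(len(x) for x in series)
    rows = []
    for t in range(start, n):
        row = []
        for x, dim, delay in zip(series, dims, delays):
            row.extend(x[t - j * delay] for j in range(dim))
        rows.append(row)
    return rows

# Two toy variables: x1(t) = t and x2(t) = 10 t.
x1 = list(range(10))
x2 = [10 * v for v in range(10)]
Y = embed_multivariate([x1, x2], dims=[3, 2], delays=[2, 1])

# First row is [x1(4), x1(2), x1(0), x2(4), x2(3)].
print(Y[0])
```

Rows built this way can serve directly as input vectors for an empirical model.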
17. Consider modeling full-scale
system with full-scale system
1. Approach
a. Use data mining to extract information
contained in the full-scale data
b. Eliminates problems inherent in scale-up
issues
c. Chaotic behavior can be modeled
d. Systematic and objective approach to
optimizing information
19. A view of a general process
[Diagram: a PHYSICAL PROCESS block with inputs x1 to x8 and outputs y1 to y3; output behavior is labeled multiply periodic, chaotic, or stochastic]
Causes of Variability
• people
• configuration of controls
• raw water
• weather
• chemicals
• Outputs that are
predictable can then
be controlled
• Outputs that are
unpredictable cannot
be controlled
20. Relate variables with neural
networks
• Inspired by the Brain
– get complicated behaviors from lots of “simple”
interconnected devices - neurons and synapses
– non-linear, multivariate curve fitting
– models are synthesized from example data
• machine learning
[Diagram: neural network with inputs x1 to x5 and outputs y1 and y2]
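To make "non-linear, multivariate curve fitting" concrete, here is a minimal sketch of a one-hidden-layer network trained by gradient descent on example data. Everything in it is an assumption for illustration: the network size, the learning rate, and the synthetic target y = x1² + 0.5·x2 are not from the presentation, and the presenters' ANN software is not shown.

```python
import math
import random

random.seed(0)
N_IN, N_HID = 2, 4

# Small random starting weights; each hidden row is [w_x1, w_x2, bias].
w_hid = [[random.uniform(-0.5, 0.5) for _ in range(N_IN + 1)] for _ in range(N_HID)]
w_out = [random.uniform(-0.5, 0.5) for _ in range(N_HID + 1)]  # last entry is bias

def forward(x):
    """Return hidden activations and the network output for input x."""
    h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1]) for row in w_hid]
    y = sum(w * v for w, v in zip(w_out, h)) + w_out[-1]
    return h, y

def mse(data):
    """Mean squared error of the network over (input, target) pairs."""
    return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

def train(data, epochs=500, lr=0.05):
    """Per-sample gradient descent on squared error (backpropagation)."""
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(x)
            err = y - t
            for j in range(N_HID):
                # Gradient reaching hidden unit j, passed back through tanh.
                grad_h = err * w_out[j] * (1.0 - h[j] ** 2)
                w_out[j] -= lr * err * h[j]
                for i in range(N_IN):
                    w_hid[j][i] -= lr * grad_h * x[i]
                w_hid[j][-1] -= lr * grad_h
            w_out[-1] -= lr * err

# Synthetic example data on a 9 x 9 grid (not plant data).
grid = [i / 4 for i in range(-4, 5)]
data = [((a, b), a * a + 0.5 * b) for a in grid for b in grid]

mse_before = mse(data)
train(data)
mse_after = mse(data)
print(f"MSE before training: {mse_before:.4f}, after: {mse_after:.4f}")
```

The model is synthesized entirely from the example pairs; no equation for the target is given to the network.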
21. ANNs produce response surfaces
Example: Trihalomethanes Formation
[Figure: response surface for trihalomethanes formation. Annotations: no data; surface fitted by non-linear ANN model represents normal behavior; deviation from normal; better conditions?]
23. Modeling chloroform
• Input = TURBFIN (MWA=4, t=-1): $R^2_{ANN}$ = 0.47, RMSE = 7.3
• + Input = COLORFIN (MWA=4): $R^2_{ANN}$ = 0.60, RMSE = 6.2
• + Input = TPFIN: $R^2_{ANN}$ = 0.74, RMSE = 5.0
[Figure: plot over days when DBPs measured, $R^2_{ANN}$ = 0.74. Annotations: same; TPFIN=32C; TPFIN=11C; CF higher at high TP]
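The input notation on this slide is not defined in the deck. The sketch below assumes that "MWA=4" means a trailing moving-window average over 4 samples and that "t=-1" means the input is lagged by 1 sample; both readings are assumptions, and the function name and values are hypothetical.

```python
def lagged_mwa(values, window, lag=0):
    """Trailing moving-window average, shifted back by `lag` samples.

    Returns None at positions that do not yet have enough history.
    """
    out = []
    for t in range(len(values)):
        end = t - lag              # most recent sample the average may use
        start = end - window + 1   # oldest sample in the window
        if start < 0:
            out.append(None)
        else:
            out.append(sum(values[start:end + 1]) / window)
    return out

# Made-up series, used only to show the alignment.
samples = [1, 2, 3, 4, 5, 6]
feature = lagged_mwa(samples, window=4, lag=1)
print(feature)
```

Under this reading, the model input at each time step summarizes the four samples that precede it, which smooths noise and allows for a delay between cause and effect.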
24. Observations about chloroform
• Finished turbidity accounts for 47% of
variability in chloroform
• Finished turbidity + color accounts for 60%
• Finished turbidity + color + temperature
accounts for 74%
• Or, $R^2_{ANN}$ = 0.74
• Recommend:
1) optimize turbidity removal (most important).
Is this counterintuitive?
2) optimize TOC removal
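The "accounts for N% of variability" statements above are R² values. As a reminder of what the two reported statistics measure, here is a minimal sketch using the standard definitions of the coefficient of determination and root-mean-square error. The observed and predicted numbers are made up, and the slides do not state exactly how their R² was computed.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root-mean-square error, in the same units as the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

# Made-up measurements and model predictions.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

print(f"R2 = {r_squared(observed, predicted):.3f}")
print(f"RMSE = {rmse(observed, predicted):.3f}")
```

An R² of 0.74 therefore means the model's residual variance is 26% of the variance in the measured values.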
26. Modeling BDM, Part 1
• Inputs = TURBFIN (t=-2), COLORFIN (MWA=3): $R^2_{ANN}$ = 0.24, RMSE = 1.8
• + Input = TPFIN: $R^2_{ANN}$ = 0.66, RMSE = 1.2
• BDM far more sensitive to TPFIN than TURBFIN & COLORFIN
[Figure: plot over days when DBPs measured, $R^2_{ANN}$ = 0.66. Annotations: TPFIN=32C; TPFIN=11C]
27. Observations regarding BDM
• Finished turbidity + finished color accounts
for 24% [very low correlation!]
• Finished turbidity + color + temperature
accounts for 66%
• Or, $R^2$ = 0.66
• So, BDM is dominated by temperature
28. Modeling BDM, Part 2
• Remove TURBFIN, add inputs = PRE-Cl2: $R^2_{ANN}$ = 0.72, RMSE = 1.1
[Figure: four panels: TPFIN=11C, COLORFIN=3.0; TPFIN=11C, COLORFIN=1.0; TPFIN=32C, COLORFIN=3.0; TPFIN=32C, COLORFIN=1.0]
• BDM sensitivity to PRE-Cl2 & NH3 higher at low TPFIN.
• BDM higher at higher COLORFIN.
• TP is dominant effect.
29. Modeling TCA
• Input = TURBFIN (MWA=4, t=-3): $R^2_{ANN}$ = 0.47, RMSE = 5.5
• + Input = COLORFIN (MWA=4): $R^2_{ANN}$ = 0.47, RMSE = 5.5
• + Input = TPFIN: $R^2_{ANN}$ = 0.61, RMSE = 4.7
• TCA less seasonal than DCA
[Figure: plot over days when DBPs measured, $R^2_{ANN}$ = 0.61. Annotations: TPFIN=32C; TPFIN=11C]
30. Observations modeling TCA
• Finished turbidity accounts for 47% of
variability
• Finished turbidity + finished color accounts
for 47% [surprising, as color not capturing
precursors!]
• Finished turbidity + color + finished
temperature accounts for 61%
• Or, $R^2$ = 0.61
31. Summary - modeling THM and
HAA species
• Consider finished turbidity, color, and temperature
– indicators of organics speciation by time of year
– treatment process kinetics and performance
• Chloroform positively correlated to finished turbidity, color,
and temperature; $R^2_{ANN}$ = 0.74
• BDM highly seasonal; positively correlated to finished
turbidity, color, and temperature, and pre-Cl2 and NH3;
$R^2_{ANN}$ = 0.66 to 0.72
• DCA highly seasonal; positively correlated to finished
turbidity, color, and temperature; $R^2_{ANN}$ = 0.73
• TCA somewhat seasonal; positively correlated to
finished turbidity and temperature; $R^2_{ANN}$ = 0.61
39. Observations for % TOC removal
• Optimal coagulation pH = 6.5
• Coagulation aid = 0.05 mg/L (or less)
– However, coagulant aid does affect turbidity
• ClO2 = 0.8 mg/L
• Coagulant dose as function of [TOC]
41. Total % turbidity removal
• System is robust in removal of turbidity regardless of source turbidity
levels; when source turbidity increases, % removal asymptotically
approaches 100%
• Goal is to minimize operating costs to meet water quality targets
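The asymptotic behavior of % removal follows from its definition. The sketch below assumes, purely for illustration, that the finished turbidity stays roughly constant at a hypothetical 0.1 NTU while the source turbidity rises; none of these numbers come from the plant data.

```python
def percent_removal(source_ntu, finished_ntu):
    """Percent of source turbidity removed by the treatment train."""
    return 100.0 * (source_ntu - finished_ntu) / source_ntu

# Hypothetical constant finished turbidity (NTU).
FINISHED = 0.1

sources = (1.0, 10.0, 100.0)
removals = [percent_removal(s, FINISHED) for s in sources]
for s, r in zip(sources, removals):
    print(f"source {s:6.1f} NTU -> {r:.1f}% removal")
```

Because the finished value is nearly fixed, a higher source turbidity raises % removal toward 100% without any change in the finished water, so % removal alone says little about how hard the process is working.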
44. Observations % filtration turbidity
removal
1. Turbidity removal through filtration is highly
sensitive to:
a. coagulant dose
b. chlorine dioxide dose
2. Turbidity removal through filtration is NOT
sensitive to filter polymer aid
3. Turbidity removal = f (sed. turbidity + ClO2 +
coagulant + coagulant aid); $R^2$ = 0.75
4. Filter run times very low; recommend eliminating
filter polymer aid
5. Recommend side-by-side filter testing
47. Observations about tank water
quality
• Nitrification demonstrated by loss of total
chlorine residual, lower pH, higher $NO_2^-$
• Total chlorine loss is pH sensitive
• Total chlorine loss is very temperature
dependent
– Nitrification rate increases exponentially above
approximately 80 °F
• At pH > 9, loss of residual stabilizes
48. Questions
John B. Cook, PE
Advanced Data Mining Intl,
Greenville, SC
John.Cook@advdmi.com
843.513.2130
www.advdmi.com