This document discusses using machine learning models and remote sensing techniques to estimate water quality parameters. It tests a support vector machine (SVM) model to predict Secchi depth and pH using Landsat and Sentinel satellite imagery. The model performs reasonably well on the test data, with an R² of 0.664 and RMSE of 0.08 for Secchi depth, and an R² of 0.647 and RMSE of 0.09 for pH. While integrating in-situ and remote sensing data yields effective results, more samples are needed to avoid overfitting, and deep learning may provide more accurate predictions by learning higher-order relationships.
Estimating Water Quality with Remote Sensing and Machine Learning
1. Water Quality Parameters Estimation
Using Remote Sensing Techniques
Dinesh Neupane | Nov 26, 2022
2. Context
• Monitoring the quality of inland surface water is critical for managing
and improving it (even more so in coastal regions)
• Conventional point sampling is accurate but costly and
time-consuming
• Remote sensing provides an “out-of-the-box” solution to this kind of
problem
• Higher spatial and temporal coverage
• Integrated approach combining in-situ data and satellite data
3. Research Questions
• How well do machine learning models, especially the Support Vector
Machine, perform at predicting water quality parameters?
• What are the challenges in, and possible solutions for, retrieving water
quality parameters from satellite images?
• Based on the hypothesis that learning models like SVM learn
higher-order statistical relationships
• Surface reflectance values are used as a proxy
5. Methods
• Data
• Satellite Images: Landsat 8 OLI and Sentinel-2 MSI
• In-Situ Data: From the Mobile Baykeepers Swim Guide team (Air Temperature,
Water Temperature, pH, Oxygen Concentration, Dissolved Oxygen, Alkalinity,
Hardness, Turbidity and Secchi Depth)
• Cloud Computing
6. Band Selection & Ratios
S.N.  Parameter          Bands (30 m x 30 m)
1     Water              B3, B8
2     Secchi Depth (m)   B1, B3
3     pH                 B1, B3, B6, B8
4     Turbidity          B1, B3
5     DO                 B2, B4, B6, B11
6     Chl-a              B2, B3, B4, B5
Normalized Difference Water Index (NDWI) = (B3-B8)/(B3+B8)
Normalized Difference B1 and B3 = (B1-B3)/(B1+B3)
Normalized Difference B1 and B6 = (B1-B6)/(B1+B6)
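The three ratios above share one form, (A - B) / (A + B), so a single helper can compute all of them. The following is a minimal sketch under stated assumptions: the reflectance values are made-up placeholders rather than data from the study, the function name `normalized_difference` is hypothetical, and the NDWI > 0 water threshold is a common convention that the slides do not state.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b) element-wise, with NaN where a + b == 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    total = a + b
    out = np.full(total.shape, np.nan)
    # Only divide where the denominator is non-zero; other cells stay NaN.
    np.divide(a - b, total, out=out, where=total != 0)
    return out

# Hypothetical surface reflectance values for three pixels (not study data).
b1 = np.array([0.04, 0.05, 0.03])
b3 = np.array([0.08, 0.06, 0.10])
b6 = np.array([0.02, 0.03, 0.05])
b8 = np.array([0.02, 0.03, 0.30])

ndwi = normalized_difference(b3, b8)      # (B3 - B8) / (B3 + B8)
nd_b1_b3 = normalized_difference(b1, b3)  # (B1 - B3) / (B1 + B3)
nd_b1_b6 = normalized_difference(b1, b6)  # (B1 - B6) / (B1 + B6)

# Assumed threshold: treat pixels with NDWI > 0 as water.
water_mask = ndwi > 0
```

Applied to full band rasters, the same function works unchanged because the arithmetic is element-wise.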
8. Methods
• The scikit-learn (sklearn) Python library was used to train the SVM
• Data split: 70% of the data was used for training and 30% for
testing
• Water quality parameters (Secchi Depth and pH) were used as the
dependent continuous variables; surface reflectance values of the
respective bands for pH and Secchi Depth were used as the independent
variables
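The 70/30 split described above can be sketched with scikit-learn's `train_test_split`. This is an illustration only: the arrays are randomly generated stand-ins for the B1 and B3 reflectance values and the Secchi depth measurements, and the sample count and `random_state` are assumptions, not values from the study.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_samples = 50  # hypothetical sample count

# Independent variables: synthetic reflectance for the two Secchi depth bands (B1, B3).
X_secchi = rng.uniform(0.0, 0.2, size=(n_samples, 2))
# Dependent variable: synthetic Secchi depth in metres.
y_secchi = rng.uniform(0.2, 1.5, size=n_samples)

# Hold out 30% of the samples for testing and train on the remaining 70%.
X_train, X_test, y_train, y_test = train_test_split(
    X_secchi, y_secchi, test_size=0.30, random_state=42
)
```

For pH, the feature matrix would instead hold the four bands listed for that parameter (B1, B3, B6 and B8).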
9. Support Vector Machine
• Supervised learning approach
• The main goal of an SVM is to define a hyperplane that separates the
points into two different classes
• The points closest to the outer boundary lines are the support
vectors
• Pairwise data for pH and Secchi depth were passed to the support
vector function with the ‘rbf’ (radial basis function) kernel
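Because pH and Secchi depth are continuous targets, a regression variant of the SVM is needed. The slides do not name the exact estimator, so the sketch below assumes scikit-learn's `SVR` with `kernel="rbf"`. The data are synthetic, the `C` and `epsilon` values are arbitrary, and the feature scaling step is an added assumption (RBF kernels are sensitive to feature scale), not something the slides describe.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(1)

# Synthetic reflectance features (two bands) and a synthetic continuous target.
X = rng.uniform(0.0, 0.2, size=(80, 2))
y = 1.0 + 3.0 * X[:, 0] - 2.0 * X[:, 1] + rng.normal(0.0, 0.02, size=80)

# Assumed setup: standardize the features, then fit support vector regression
# with the radial basis function kernel.
model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.01))
model.fit(X, y)

predictions = model.predict(X[:5])
# Number of training points retained as support vectors.
n_support = model.named_steps["svr"].support_.shape[0]
```

In practice, `C`, `epsilon` and `gamma` would be tuned, for example by cross-validation on the training split.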
12. Conclusions
• Results show that water quality parameters (Secchi depth and pH)
can be predicted with reasonable accuracy.
• On the test data, the coefficient of determination was R² = 0.664 with
RMSE = 0.08 for Secchi depth, and R² = 0.647 with RMSE = 0.09 for pH,
showing a significant correlation with the surface reflectance values of
the respective bands.
• Confirmed the effectiveness of an integrated approach based on in-situ
data and satellite remote sensing datasets
• The remaining inaccuracy might be due to the high variability of the
field data and the influence of human activities between sampling points
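The two metrics reported above, R² and RMSE, can be computed from test-set predictions as in the following sketch. The observed and predicted values here are invented for illustration and are unrelated to the study's results.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical observed and predicted Secchi depths (metres) for four test samples.
y_true = np.array([0.50, 0.75, 1.00, 1.25])
y_pred = np.array([0.55, 0.70, 1.05, 1.20])

# Coefficient of determination: 1 - SS_res / SS_tot.
r2 = r2_score(y_true, y_pred)
# Root mean squared error, in the same units as the target.
rmse = float(np.sqrt(mean_squared_error(y_true, y_pred)))
```

RMSE is expressed in the units of the predicted parameter, so the values for Secchi depth and pH are not directly comparable with each other.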
13. Suggestions
• Areas of improvement:
• More samples are needed to avoid overfitting
• Way Forward
• Deep learning techniques might provide more accurate predictions,
since they learn higher-order statistical relationships