Big Data in Climate:
Opportunities and Challenges for
Machine Learning and Data Mining
Vipin Kumar
University of Minnesota
kumar001@umn.edu
www.cs.umn.edu/~kumar
Big Data in Climate
• Satellite Data
– Spectral Reflectance
– Elevation Models
– Nighttime Lights
– Aerosols
• Oceanographic Data
– Temperature
– Salinity
– Circulation
• Climate Models
• Reanalysis Data
• River Discharge
• Agricultural Statistics
• Population Data
• Air Quality
• …
Source: NCAR
Source: NASA
SAMSI Climate Workshop, 2017 (8/21/2017)
“Climate change research is now
‘big science,’ comparable in its
magnitude, complexity, and societal
importance to human genomics and
bioinformatics.”
(Nature Climate Change, Oct 2012)
Pattern Mining:
Monitoring Ocean Eddies
• Spatio-temporal pattern mining using novel
multiple object tracking algorithms
• Created an open source database of 20+ years of
eddies and eddy tracks
Extremes and Uncertainty:
Heat waves, heavy rainfall
• Extreme value theory in space-time and
dependence of extremes on covariates
• Spatiotemporal trends in extremes and
physics-guided uncertainty
quantification
Relationship mining:
Seasonal hurricane activity
• Statistical method for automatic inference of
modulating networks
• Discovery of key factors and mechanisms
modulating hurricane variability
Sparse Predictive Modeling:
Precipitation Downscaling
• Hierarchical sparse regression and multi-task
learning with spatial smoothing
• Regional climate predictions from global
observations
Network Analysis:
Climate Teleconnections
• Scalable method for discovering related graph
regions
• Discovery of novel climate teleconnections
• Also applicable in analyzing brain fMRI data
Change Detection:
Monitoring Ecosystem Disturbances
• Robust scoring techniques for identifying diverse
changes in spatio-temporal data
• Created a comprehensive catalogue of global changes in
surface water and vegetation, e.g. fires and
deforestation.
Five Year, $10M NSF Expeditions in Computing Project (1029711, PI: Vipin Kumar, U. Minnesota)
Understanding Climate Change: A Data-driven Approach
Research Highlights
http://climatechange.cs.umn.edu/
[Arindam Banerjee]
[Auroop Ganguly]
[Nagiza Samatova]
Challenges
• Multi-resolution, multi-scale data
• High temporal variability
• Spatio-temporal auto-correlation
• Spatial and temporal heterogeneity
• Large amount of noise and missing values
• Lack of representative ground truth
• Class imbalance (changes are rare events)
Big Data in Earth System Monitoring
A vegetation index measures the
surface “greenness” – proxy for total
biomass
This vegetation time series
captures temporal dynamics
around the site of the China
National Convention Center
Data | Type | Coverage | Spatial Resolution | Temporal Resolution | Spectral Resolution (bands) | Duration | Availability
MODIS | Multispectral | Global | 250 m | Daily | 7 | 2000 - present | Public
LANDSAT | Multispectral | Global | 30 m | 16 days | 7 | 1972 - present | Public
Hyperion | Hyperspectral | Regional | 30 m | 16 days | 220 | 2001 - present | Private
Sentinel-1 | Radar | Global | 5 m | 12 days | - | 2014 - present | Public
Quickbird | Multispectral | Global | 2.16 m | 2 to 12 days | 4 | 2001 - 2014 | Private
WorldView-1 | Panchromatic | Global | 50 cm | 6 days | 1 | 2007 - present | Private
MODIS covers ~ 5 billion locations globally
at 250m resolution daily since Feb 2000.
Monitoring Global Change: Case Studies
1. Global mapping of forest fires:
– RAPT: Rare Class Prediction in Absence of Ground Truth (TKDE 2017)
2. Mapping of plantation dynamics in tropical forests:
– Recurrent Neural Networks to model space and time (IEEE Big Data 2016, SDM 2016, KDD 2017)
3. Global mapping of inland surface water dynamics
– Heterogeneous Ensemble Learning (SDM 2015, ICDM 2015)
– Physics-guided Labeling (ICDM 2015, RSE 2017)
– Information Transfer across Space and Time
[Figure: Lake Oroville in 2011 and 2014]
Case Study 1: Global Forest Fires Mapping
Monitoring fires is important for climate change impact
State-of-the-art: NASA MCD64A1
• Most extensively used global fire monitoring product
• Uses MODIS surface reflectance and Active Fire data in a predictive model
• Performance varies considerably across different geographical regions
• Known to have very low recall in tropical forests that play a critical role in
regulating the Earth’s climate, maintaining biodiversity, and serving as carbon sinks
A record number of more than 150 countries
signed the landmark agreement to tackle
climate change at a ceremony at UN
headquarters on 22 April, 2016.
“the best chance to save
the one planet we have”
Predictive Modeling:
Traditional Paradigm
[Figure: table of explanatory variables with a binary target label (1/0) for each sample]
Learn a classification function
which generalizes well on
unseen data that comes from
the same distribution as
training data.
Burned area mapping: predict whether a given
pixel is burned or not.
8,000 sq. km. scene in SE Asia (2005)
Predictive Modeling for
Global Monitoring of Forest Fires
Challenges:
(1) Complete absence of target labels for supervision
(however, imperfect labels of poor quality are available for every sample)
Variations in the relationship between the explanatory and target variable
• Geographical heterogeneity
• Seasonal heterogeneity
• Land class heterogeneity
• Temporal heterogeneity
Global availability of labeled samples
for burned area classification
(2) Highly imbalanced classes
[Figure: precision and recall plotted against class skew (0 to 1) for a classifier with True Positive Rate = 0.9 and False Positive Rate = 0.01]
For example, California State in 2008 (the year with maximum fire activity in the last decade):
2,296 sq. km. of forests burned out of a total 73,702 sq. km. forested area
Predictive Modeling for
Global Monitoring of Forest Fires
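The effect of skew on precision can be worked out directly from the two rates quoted on the slide. The sketch below is illustrative only: `precision_at_skew` is a hypothetical helper, and treating the California 2008 burned fraction (2,296 of 73,702 sq. km.) as the positive-class prior is an assumption made for this example.

```python
def precision_at_skew(tpr, fpr, positive_fraction):
    """Precision of a classifier with fixed TPR and FPR at a given class prior."""
    true_pos = tpr * positive_fraction             # mass of correctly flagged positives
    false_pos = fpr * (1.0 - positive_fraction)    # mass of wrongly flagged negatives
    return true_pos / (true_pos + false_pos)


# Rates quoted on the slide.
TPR, FPR = 0.9, 0.01

# Assumed priors: balanced classes, the California 2008 burned fraction,
# and a hypothetical ultra-rare class (1 in 1000).
priors = {
    "balanced": 0.5,
    "California 2008": 2296 / 73702,
    "ultra-rare (hypothetical)": 0.001,
}

for name, prior in priors.items():
    print(f"{name}: prior={prior:.4f}, precision={precision_at_skew(TPR, FPR, prior):.3f}")
```

Recall stays at 0.9 in every row because it does not depend on the prior; precision drops from about 0.99 for balanced classes to about 0.74 at the California prior and below 0.09 at a prior of 0.001.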
(3) How to evaluate performance of a model
using imperfect labels?
Predictive Modeling for a Rare Target Class
using Imperfect Labels
What are imperfect labels?
• Noisy/perturbed true labels
• Inexpensive to obtain
• Raw feature
• Heuristics (given by expert)
• Available for all test instances
Examples of imperfect labels
Application | Target label | Imperfect label
Burned area | Fire/No fire | Thermal anomaly
Urban settlement | Urban/No urban | Night time light
Recommending items to a new user | Interested/Not interested | Friends' interest
Learning with imperfect labels
Supervised Learning
• Expert-annotated Labels
– Sufficient training samples: SVM, Decision tree, Logistic regression
– Inadequate training samples: Semi-supervised, Active Learning, Multi-view, Multi-task
• Imperfect Labels
– Multiple annotators: Learning with crowds (Raykar et al.)
– Single annotator, Partial Supervision: Positive Unlabeled learning (Bing Liu et al., Elkan et al.)
– Single annotator, Imperfect Supervision: Natarajan et al., Menon et al.; balanced vs. rare class
Learning with imperfect labels
Assumptions
(1) α + β < 1, where α and β are the noise rates of the imperfect label
(2) Imperfect label is conditionally independent
of feature space given the true label (class-conditional noise, CCN)
[Figure: True Labels]
Imperfect Labels that are conditionally
independent (Errors shown in Red)
Imperfect Labels that are not conditionally
independent (Errors shown in Red)
Blum et al., COLT 1998
Given enough samples, it is possible to learn a
classifier in the CCN label noise scenario that is as good
as one learned using perfectly labeled samples.
Liu et al., ICML 2003
For the Positive and Unlabeled (PU) learning setting
(CCN with β = 0), presented an algorithm to optimize
precision*recall without using any perfect labels.
Elkan and Noto, KDD 2008
Natarajan et al., NIPS 2013
For some performance measures like accuracy, a
classifier can be learned using CCN labels, treating
them as perfect labels.
Menon et al., ICML 2015
– For the general CCN scenario, showed the linear
relationship between P(y=1|x) and P(a=1|x).
– Presented a method to optimize balanced error rate and AUC.
Imperfect supervision under rare class scenario
Models built using imperfect labels a to optimize combinations of
precision and recall can perform very poorly compared to models
built using true labels y
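The linear relationship between P(y=1|x) and P(a=1|x) under CCN can be written out in a few lines. This is a sketch under stated assumptions: the slides do not define the noise rates, so here α is taken to be P(a=0|y=1) and β to be P(a=1|y=0), which is consistent with the PU setting being "CCN with β = 0". The function names are hypothetical.

```python
def noisy_posterior(eta, alpha, beta):
    """P(a=1|x) under class-conditional noise, where eta = P(y=1|x).

    Assumed definitions: alpha = P(a=0|y=1), beta = P(a=1|y=0).
    Conditional independence of a and x given y yields
    P(a=1|x) = (1 - alpha) * eta + beta * (1 - eta)
             = (1 - alpha - beta) * eta + beta.
    """
    return (1.0 - alpha) * eta + beta * (1.0 - eta)


def clean_posterior(noisy, alpha, beta):
    """Invert the linear map; only possible when alpha + beta < 1."""
    if alpha + beta >= 1.0:
        raise ValueError("alpha + beta must be below 1 for the map to be invertible")
    return (noisy - beta) / (1.0 - alpha - beta)


alpha, beta = 0.3, 0.05                      # hypothetical noise rates
etas = [0.0, 0.1, 0.5, 0.9, 1.0]             # hypothetical clean posteriors
noisy = [noisy_posterior(e, alpha, beta) for e in etas]
recovered = [clean_posterior(n, alpha, beta) for n in noisy]

print(noisy)       # increasing in eta because the slope 1 - alpha - beta is positive
print(recovered)   # matches etas up to floating-point error
```

Because the slope 1 - α - β is positive exactly when assumption (1) holds, ranking instances by P(a=1|x) gives the same order as ranking by P(y=1|x); what changes is the threshold, which is where the rare class difficulty noted on the slide enters.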
RAPT: RAre class Prediction in absence of ground Truth
• Step 1: Learn a classification model using imperfect labels that is as
good as the one that could be built using perfect labels
– We provide a new method to optimize precision*recall using imperfect labels and
prove that the selected threshold maximizes the true precision*recall
• Step 2: Combine predictions from classification model and the
imperfect label to improve precision at some loss in recall
– For ultra-rare class scenarios, the gain in precision after aggregation is significantly
higher compared to the loss in recall
– The improvement in G-measure after step 2 is by a factor of
• Step 3: Use spatial context to improve recall
– Exploits the relationship structure such as spatial neighborhood, biological network
or social network, to improve the coverage (recall) of the rare class instances.
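The slides do not spell out how RAPT selects its threshold in Step 1, so the sketch below illustrates the simpler PU special case (β = 0) rather than RAPT itself. Since precision = recall · P(y=1) / P(f(x)=1), the product precision*recall equals recall² · P(y=1) / P(f(x)=1), and P(y=1) is a constant, so maximizing recall² / P(f(x)=1) maximizes precision*recall. In the PU case recall can be estimated from the imperfectly labeled positives alone. All names and data below are hypothetical.

```python
def pu_criterion(scores, imperfect, threshold):
    """recall_hat**2 / P(f(x)=1) for the rule 'predict positive if score >= threshold'.

    recall_hat is measured on instances whose imperfect label is 1, which is
    a valid recall estimate only under the PU assumption (no false positives
    among the imperfect labels).
    """
    predicted = [s >= threshold for s in scores]
    n_predicted = sum(predicted)
    n_labeled = sum(imperfect)
    if n_predicted == 0 or n_labeled == 0:
        return 0.0
    hits = sum(1 for p, a in zip(predicted, imperfect) if p and a)
    recall_hat = hits / n_labeled
    return recall_hat ** 2 / (n_predicted / len(scores))


def best_threshold(scores, imperfect):
    """Pick the candidate threshold (one per distinct score) with the largest criterion."""
    candidates = sorted(set(scores))
    return max(candidates, key=lambda t: pu_criterion(scores, imperfect, t))


# Hypothetical classifier scores and imperfect (PU-style) labels.
scores = [0.95, 0.90, 0.85, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
imperfect = [1, 0, 1, 1, 0, 0, 0, 0, 0, 0]

chosen = best_threshold(scores, imperfect)
print(chosen, pu_criterion(scores, imperfect, chosen))
```

On this toy data the criterion peaks at the threshold that first captures all three labeled positives; lowering the threshold further only increases the fraction predicted positive without adding recall.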
Results for Burned Area Mapping
[Figures for California State, Georgia State and Montana State, each comparing: GT-based classifier, Weak label, RAPT Step 1, RAPT Step 2, RAPT Step 3]
Global Monitoring of Fires in Tropical Forests
Fires in tropical forests during 2001-2014
571 K sq. km. burned area found in tropical forests,
three times the area reported by the state-of-the-art NASA product MCD64A1
[Figure: Venn diagram of burned area. RAPT (571 K) and MCD64A1 (186 K): 445 K detected by RAPT only, 126 K common, 60 K detected by MCD64A1 only]
Deforestation via Burning in Amazon
Year 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 2014
Land cover F F F F F F F N N N N N N
Burn Detection B B B
[Figure: Google Earth images for 2002 and 2015; RAPT detection 2002-2014 (RAPT only, Common)]
Case Study 2: Mapping of Plantation Dynamics
Interplay between food, energy and
water:
Production of edible oils and biofuels.
High carbon emissions.
Degradation of water quality.
State-of-the-art and Challenges
• Tree Plantation (TP): This data set was
created by Transparent World, with the
support of Global Forest Watch. Plantations
were manually annotated for 2014. TP has high
recall and low precision.
• Roundtable on Sustainable Palm
Oil (RSPO): This dataset is available across
Indonesia for 2000, 2005, and 2009. In addition,
the study digitized all the locations into 19
land cover types in these years. RSPO has high
precision and low recall.
[Figure: TP, 2014; RSPO, 2001; RSPO, 2005; RSPO, 2009]
Challenges
• Imperfect annotators
– Tree Plantation (TP): high recall and low precision.
– Roundtable on Sustainable Palm Oil (RSPO): high precision and low recall.
• Data heterogeneity
– Land cover heterogeneity: plantations must be differentiated from a variety of land covers, and some of them, e.g. forest, are easily confused with plantations.
– Spatial heterogeneity
– Temporal heterogeneity: seasonal variation (e.g., a crop land after harvest looks very similar to barren land) and yearly variation (the spectral features of a land cover change across years).
• Noisy and high-dimensional feature space
Our Contribution
• Learning from multiple imperfect annotators
(Jia et al. BigData 2016)
– Each annotator has a different expertise level on different plantation types.
– We recursively update the expertise of each annotator and estimate true labels.
• Handling temporal heterogeneity in prediction
(Jia et al. SDM 2017)
– We model temporal and spatial dependencies across years in an LSTM model.
– We propose an incremental learning strategy to update the LSTM model.
• Aggregating classes, collecting samples and validating results
(Jia et al. Technical Report, 2017)
– We aggregate similar classes according to domain expertise.
– For each aggregated class, we sample an equal number of samples from each sub-class across multiple years.
– We validate the generated plantation maps by comparing randomly sampled locations to high-resolution images.
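The idea of alternately updating annotator expertise and estimated true labels can be illustrated with a generic iterative weighted vote. This is a minimal sketch, not the algorithm of Jia et al.: it uses a single accuracy per annotator (no per-plantation-type expertise), three hypothetical annotators, and made-up votes.

```python
import math


def estimate_labels(votes, n_iter=10):
    """Alternate between scoring annotators and re-estimating labels.

    votes[i][j] is annotator j's 0/1 label for sample i.
    Returns (estimated labels, estimated accuracy per annotator).
    """
    n_samples = len(votes)
    n_annotators = len(votes[0])
    # Start from an unweighted majority vote.
    labels = [1 if 2 * sum(v) > n_annotators else 0 for v in votes]
    accuracy = [0.5] * n_annotators
    for _ in range(n_iter):
        # Expertise step: agreement with the current label estimates,
        # with add-one smoothing so accuracies stay strictly inside (0, 1).
        accuracy = [
            (sum(1 for v, y in zip(votes, labels) if v[j] == y) + 1) / (n_samples + 2)
            for j in range(n_annotators)
        ]
        # Log-odds weights: reliable annotators count more, and an annotator
        # worse than chance gets a negative weight.
        weights = [math.log(a / (1.0 - a)) for a in accuracy]
        # Label step: weighted vote, +w for a vote of 1 and -w for a vote of 0.
        new_labels = [
            1 if sum(w if v[j] == 1 else -w for j, w in enumerate(weights)) > 0 else 0
            for v in votes
        ]
        if new_labels == labels:
            break
        labels = new_labels
    return labels, accuracy


# Hypothetical votes from three annotators on six samples.
votes = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 1],
]
labels, accuracy = estimate_labels(votes)
print(labels, accuracy)
```

In this toy run the third annotator disagrees with the consensus on most samples, so its estimated accuracy falls below 0.5 and its votes are effectively inverted in the weighted vote.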
1. Jia, X., Khandelwal, A., Gerber, J., Carlson, K., West, P., and Kumar, V. Learning Large-scale Plantation Mapping from Imperfect Annotators. In IEEE Big Data (Big Data), 2016.
2. Jia, X., Khandelwal, A., Nayak, G., Gerber, J., Carlson, K., West, P., and Kumar, V. Predict Land Covers with Transition Modeling and Incremental Learning. In SDM, 2017.
3. Jia, X., Khandelwal, A., Gerber, J., Carlson, K., Samberg, L., West, P., and Kumar, V. Automated Plantation Mapping in Southeast Asia Using Remote Sensing Data. In Department of Computer Science and Engineering-Technical Reports.
Annual Plantation Maps
h29v08
h29v09
h28v09
Annual growth rate ≈ 9.57%
Interaction between Fires and Palm Oil Plantation
This and all following figures
show only confident forest pixels.
[Figure: all plantations]
[Figure: plantations with no burn scars; plantations with burn scar during 2001-2014; burned pixels around plantations (same burn date as nearby red pixels)]
Importance of Monitoring Global Surface Water Dynamics
• Impact of Climate Change: melting of glacial lakes in Tibet
[Figure: Cedo Caka Lake in Tibet, 1984 and 2011]
• Impact of Human Actions: shrinking of Aral Sea since 1960s
[Figure: Aral Sea in 1989, 2000 and 2014]
• Early Warning Systems
[Figure: Great Flood of Mississippi River, 1993]
Opportunity in using Remote Sensing Data
• Multi-spectral data
– MODIS (at 500m, from 2000)
– Landsat (at 30m, from 1970s)
• Can be used to classify every location at a given
time as water or land (binary classes)
• Ground truth on specific dates available from
various sources: SRTM, GLWD
Challenges for Traditional Big Data Methods in Monitoring Water
• Challenge 1: Heterogeneity in
space and time
– Water and land bodies look different
in different regions of the world
– Same water body can look different
at different time-instances
Great Bitter Lake, Egypt Lake Tana, Ethiopia Lake Abbe, Africa
Mar Chiquita Lake, Argentina in 2000 (left) and 2012 (right)
• Challenge 2: Data Quality
– Noise: clouds, shadows,
atmospheric disturbances
– Missing data
Poyang Lake, China
(Pink color shows missing data)
Method Innovations for Monitoring Water
• Ensemble Learning Methods for Handling Heterogeneity in Data [1,2]
– Learn an ensemble of classifiers to distinguish between different pairs of positive and negative modes
[Figure: positive modes (water) P1, P2, P3 and negative modes (land) N1, N2, N3]
• Using Physics Guided Labeling to Handle Poor Data Quality [3,4]
– Use elevation information to constrain labels to be physically consistent
[Figure: locations with elevation A > B > C > D]
1 Karpatne et al. SDM 2015
2 Karpatne et al. ICDM 2015
3 Khandelwal et al. ICDM 2015
4 Mithal et al. (PhD Dissertation)
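One way to read the elevation constraint is that, within a single water body, a location can only be water if every lower location is also water. The sketch below applies that reading by choosing the water level that disagrees with the fewest noisy labels. It is a simplified illustration under that assumption, not the method of the cited papers; the function name, elevations and labels are hypothetical.

```python
def elevation_consistent_labels(elevations, noisy_labels):
    """Relabel locations so that water (1) occupies exactly the k lowest ones.

    k is chosen to minimise the number of noisy labels that must be flipped.
    """
    n = len(elevations)
    order = sorted(range(n), key=lambda i: elevations[i])   # lowest first
    total_water = sum(noisy_labels)
    best_cut, best_cost = 0, None
    water_seen = 0   # noisy water labels among the k lowest locations
    for k in range(n + 1):
        # Flips needed: land labels below the cut plus water labels above it.
        cost = (k - water_seen) + (total_water - water_seen)
        if best_cost is None or cost < best_cost:
            best_cut, best_cost = k, cost
        if k < n:
            water_seen += noisy_labels[order[k]]
    labels = [0] * n
    for i in order[:best_cut]:
        labels[i] = 1
    return labels


# Hypothetical elevations (highest first) and noisy water/land labels.
elevations = [70, 60, 50, 40, 30, 20, 10]
noisy = [0, 1, 0, 0, 1, 1, 1]   # the water label at elevation 60 is physically implausible

print(elevation_consistent_labels(elevations, noisy))
```

The isolated water label at the second-highest location is flipped to land because keeping it would require relabeling two lower land locations as water. Ties between equally cheap water levels are broken toward the lower level in this sketch.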
A Global Surface Water Monitoring System
http://z.umn.edu/monitoringwater
• Maps the dynamics of all
major surface water bodies
(surface area > 2.5 km²)
shown as blue dots
Key Highlights:
• Detects melting of glacial lakes
• Maps changes in river morphology
• Identifies reservoir constructions
• Finds relationships b/w surface water
and precipitation/groundwater
Showing Surface Water Dynamics
Don Martin Dam, Mexico
Surface area of water around Don Martin Dam across time
Annual Landsat Time-lapse of this region
(Courtesy: Google Earth Engine)
Regions of Change
in South America
Red Dots (Water Gain):
Regions of size > 2.5 km² that have changed
from land to water in the last 15 years
Green Dots (Water Loss):
Regions of size > 2.5 km² that have changed
from water to land in the last 15 years
[Figure: example time series of a Water Gain region]
[Figure: example time series of a Water Loss region]
Examples of Change: Shrinking Water Bodies
Aggregate dynamics of all green dots shown on left
(Green dots show regions changing from water to land in last 15 years)
Annual Time-lapse of an example green dot
Examples of Change: Melting Glacial Lakes in Tibet
Water Gain regions (red dots)
show melting of lakes
Red polygons show regions
changing from land to water
Aggregate dynamics of
all red regions in Tibet
[Figure: images dated September 2013 and November 2015]
Examples of Change: River Meandering
(Adjacent occurrence of Water Gain (red) and Water Loss (green) regions all along
the river indicates the displacement of water from the green dots to the red dots)
[Figure: zoomed-in view; example time series of a Water Gain region and of a Water Loss region; time-lapses of locations 1 and 2]
Examples of Change: Shrinking Island
Examples of Change: Dam Construction
Construction of Chubetsu Dam, Japan
Construction of a dam is characterized
by a sudden and persistent increase
in surface area
Global Reservoir and Dam (GRanD) Database
• A data curation initiative by Global
Water System Project (GWSP)
• Finds 61 dams constructed after 2001
UMN Approach:
• Finds 701 dams constructed after 2001
Comparison of Dam Detections with GRanD
Dams reported by GRanD since 2001: 35
Dams only reported by GRanD: 5
Dams reported by both UMN and GRanD: 30
Dams only reported by UMN: 671
Applications of Surface Water Monitoring
Quantifying water stocks and flow
Global projections of water risks (red)
Integrating with hydrological models
• Changes in the volume of water in lakes and reservoirs
o in conjunction with altimetry data
• River flow measurements at a global scale
• High quality data for calibrating hydrological models
Key Enabler: High-resolution estimates of water bodies in
space (30m) and time (daily)
Goal: Daily surface water extent maps at
30m spatial resolution
• Challenge:
– LANDSAT (30m spatial resolution, every 16 days)
– MODIS (500m spatial resolution, daily)
– Overlapping information is available only at a few time steps
• Approach: Transfer information across Space and Time
[Figure: Kajakai Reservoir, Afghanistan, Jul 05, 2015: extent at coarse resolution (500m) and extent at high resolution (30m) created using our approach]
Concluding Remarks
• Big data in climate offers great opportunity to advance
machine learning as well as increase our understanding of
the Earth’s climate and environment.
• Novel approaches are needed that can guide the process of
knowledge discovery in scientific applications
– “Theory-guided Data Science”
• Methods have applicability across diverse domains:
– Ecosystem management
– Epidemiology
– Geospatial Intelligence
– Neuroscience
Thank You! Questions?
UMN Team Members
Arindam Banerjee, Snigdhansu Chatterjee, Stefan Liess, Shashi
Shekhar, Michael Steinbach
UMN Graduate Students
Guruprasad Nayak, Ankush Khandelwal, Varun Mithal, Anuj Karpatne, Xi Chen,
Xiaowei Jia, Saurabh Agrawal
NSF Expeditions Collaborators
NCSU: Nagiza Samatova, Fredrick Semazzi; Northeastern: Auroop Ganguly;
North Carolina A&T: Abdollah Homaifar, Fred Semazzi
External Collaborators
NASA Ames: Rama Nemani, Nikunj Oza; IonE, UMN: Kate Braumann;
UCLA: Dennis Lettenmaier, Miriam Marlier
Work supported by NASA and NSF Expeditions in Computing project on Understanding Climate Change using Data-driven Approaches

More Related Content

What's hot

IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...India UK Water Centre (IUKWC)
 
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...India UK Water Centre (IUKWC)
 
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...India UK Water Centre (IUKWC)
 
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...India UK Water Centre (IUKWC)
 
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...
IUKWC Workshop Nov16: Developing Hydro-climatic Services for Water Security –...India UK Water Centre (IUKWC)
 
Time integration of evapotranspiration using a two source surface energy bala...
Time integration of evapotranspiration using a two source surface energy bala...Time integration of evapotranspiration using a two source surface energy bala...
Time integration of evapotranspiration using a two source surface energy bala...Ramesh Dhungel
 
Climate downscaling
Climate downscalingClimate downscaling
Climate downscalingIC3Climate
 
TIME INTEGRATION OF EVAPOTRANSPIRATION USING A TWO SOURCE SURFACE ENERGY BALA...
TIME INTEGRATION OF EVAPOTRANSPIRATION USING A TWO SOURCE SURFACE ENERGY BALA...TIME INTEGRATION OF EVAPOTRANSPIRATION USING A TWO SOURCE SURFACE ENERGY BALA...
TIME INTEGRATION OF EVAPOTRANSPIRATION USING A TWO SOURCE SURFACE ENERGY BALA...Ramesh Dhungel
 
2013 ASPRS Track, Developing an ArcGIS Toolbox for Estimating EvapoTranspirat...
2013 ASPRS Track, Developing an ArcGIS Toolbox for Estimating EvapoTranspirat...2013 ASPRS Track, Developing an ArcGIS Toolbox for Estimating EvapoTranspirat...
2013 ASPRS Track, Developing an ArcGIS Toolbox for Estimating EvapoTranspirat...GIS in the Rockies
 
DSD-INT 2016 Data assimilation to improve volcanic ash forecasts using LOTOS-...
DSD-INT 2016 Data assimilation to improve volcanic ash forecasts using LOTOS-...DSD-INT 2016 Data assimilation to improve volcanic ash forecasts using LOTOS-...
DSD-INT 2016 Data assimilation to improve volcanic ash forecasts using LOTOS-...Deltares
 
Datta-Barua, URSI AT-RASC, 2015, Canary Islands, Ionospheric-Thermospheric St...
Datta-Barua, URSI AT-RASC, 2015, Canary Islands, Ionospheric-Thermospheric St...Datta-Barua, URSI AT-RASC, 2015, Canary Islands, Ionospheric-Thermospheric St...
Datta-Barua, URSI AT-RASC, 2015, Canary Islands, Ionospheric-Thermospheric St...Daniel Miladinovich
 

Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 

Big Data Advances in Climate Change Research

  • 1. Big Data in Climate: Opportunities and Challenges for Machine Learning and Data Mining Vipin Kumar University of Minnesota kumar001@umn.edu www.cs.umn.edu/~kumar
  • 2. Big Data in Climate • Satellite Data: Spectral Reflectance, Elevation Models, Nighttime Lights, Aerosols • Oceanographic Data: Temperature, Salinity, Circulation • Climate Models • Reanalysis Data • River Discharge • Agricultural Statistics • Population Data • Air Quality • … (Image sources: NCAR, NASA) SAMSI Climate Workshop, 2017, 8/21/2017
  • 3. Big Data in Climate (continued) “Climate change research is now ‘big science,’ comparable in its magnitude, complexity, and societal importance to human genomics and bioinformatics.” (Nature Climate Change, Oct 2012)
  • 4. Understanding Climate Change: A Data-driven Approach. Research Highlights. Five-year, $10M NSF Expeditions in Computing project (1029711, PI: Vipin Kumar, U. Minnesota), http://climatechange.cs.umn.edu/ [Arindam Banerjee] [Auroop Ganguly] [Nagiza Samatova]
      – Pattern Mining: Monitoring Ocean Eddies. Spatio-temporal pattern mining using novel multiple object tracking algorithms. Created an open-source database of 20+ years of eddies and eddy tracks.
      – Extremes and Uncertainty: Heat waves, heavy rainfall. Extreme value theory in space-time and dependence of extremes on covariates. Spatiotemporal trends in extremes and physics-guided uncertainty quantification.
      – Relationship Mining: Seasonal hurricane activity. Statistical method for automatic inference of modulating networks. Discovery of key factors and mechanisms modulating hurricane variability.
      – Sparse Predictive Modeling: Precipitation Downscaling. Hierarchical sparse regression and multi-task learning with spatial smoothing. Regional climate predictions from global observations.
      – Network Analysis: Climate Teleconnections. Scalable method for discovering related graph regions. Discovery of novel climate teleconnections. Also applicable to analyzing brain fMRI data.
      – Change Detection: Monitoring Ecosystem Disturbances. Robust scoring techniques for identifying diverse changes in spatio-temporal data. Created a comprehensive catalogue of global changes in surface water and vegetation, e.g. fires and deforestation.
  • 5. Understanding Climate Change: A Data-driven Approach. Research Highlights (continued). Challenges • Multi-resolution, multi-scale data • High temporal variability • Spatio-temporal auto-correlation • Spatial and temporal heterogeneity • Large amount of noise and missing values • Lack of representative ground truth • Class imbalance (changes are rare events)
  • 6. Big Data in Earth System Monitoring. A vegetation index measures the surface “greenness”, a proxy for total biomass. This vegetation time series captures temporal dynamics around the site of the China National Convention Center. MODIS covers ~5 billion locations globally at 250m resolution daily since Feb 2000.
      Data | Type | Coverage | Spatial Resolution | Temporal Resolution | Spectral Resolution | Duration | Availability
      MODIS | Multispectral | Global | 250 m | Daily | 7 | 2000 - present | Public
      LANDSAT | Multispectral | Global | 30 m | 16 days | 7 | 1972 - present | Public
      Hyperion | Hyperspectral | Regional | 30 m | 16 days | 220 | 2001 - present | Private
      Sentinel-1 | Radar | Global | 5 m | 12 days | - | 2014 - present | Public
      Quickbird | Multispectral | Global | 2.16 m | 2 to 12 days | 4 | 2001 - 2014 | Private
      WorldView-1 | Panchromatic | Global | 50 cm | 6 days | 1 | 2007 - present | Private
  • 7. Monitoring Global Change: Case Studies 1. Global mapping of forest fires: RAPT: Rare Class Prediction in Absence of Ground Truth (TKDE 2017) 2. Mapping of plantation dynamics in tropical forests: Recurrent Neural Networks to model space and time (IEEE Big Data 2016, SDM 2016, KDD 2017) 3. Global mapping of inland surface water dynamics: Heterogeneous Ensemble Learning (SDM 2015, ICDM 2015); Physics-guided Labeling (ICDM 2015, RSE 2017); Information Transfer across Space and Time (Image: Lake Oroville in 2011 and 2014)
  • 8. Case Study 1: Global Forest Fires Mapping. Monitoring fires is important for understanding climate change impact. A record number of more than 150 countries signed the landmark agreement to tackle climate change at a ceremony at UN headquarters on 22 April 2016: “the best chance to save the one planet we have.” State-of-the-art: NASA MCD64A1 • Most extensively used global fire monitoring product • Uses MODIS surface reflectance and Active Fire data in a predictive model • Performance varies considerably across different geographical regions • Known to have very low recall in tropical forests, which play a critical role in regulating the Earth’s climate, maintaining biodiversity, and serving as carbon sinks
  • 9. Predictive Modeling: Traditional Paradigm. Learn a classification function, from explanatory variables to a target label (1 0 0 1 . . 1), that generalizes well on unseen data drawn from the same distribution as the training data. Example: burned area mapping, which predicts whether a given pixel is burned or not (8,000 sq. km scene in SE Asia, 2005).
  • 10. Predictive Modeling for Global Monitoring of Forest Fires. Challenges: (1) Complete absence of target labels for supervision (however, imperfect, poor-quality labels are available for every sample). Variations in the relationship between the explanatory and target variables: • Geographical heterogeneity • Seasonal heterogeneity • Land class heterogeneity • Temporal heterogeneity (Figure: global availability of labeled samples for burned area classification)
  • 11. Predictive Modeling for Global Monitoring of Forest Fires. Challenges: (1) Complete absence of target labels for supervision (however, imperfect, poor-quality labels are available for every sample) (2) Highly imbalanced classes (Figure: recall and precision versus class skew from 0 to 1, for True Positive Rate = 0.9 and False Positive Rate = 0.01.) E.g., California in 2008 (the year of maximum fire activity in the last decade): 2,296 sq. km of forest burned out of a total forested area of 73,702 sq. km.
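The effect of class skew on precision can be checked with a short calculation. The following is a minimal sketch, assuming the slide's true positive rate (0.9) and false positive rate (0.01) describe a single classifier and that the positive-class fraction for California in 2008 is 2,296 / 73,702; the function name `precision_from_rates` is hypothetical.

```python
def precision_from_rates(tpr: float, fpr: float, positive_fraction: float) -> float:
    """Precision implied by TPR, FPR and the fraction of positive instances."""
    true_pos = tpr * positive_fraction            # mass of correctly flagged positives
    false_pos = fpr * (1.0 - positive_fraction)   # mass of wrongly flagged negatives
    return true_pos / (true_pos + false_pos)


# Burned fraction of forested area in California, 2008 (figures from the slide).
california_skew = 2296 / 73702

# Recall stays at the TPR (0.9) for every skew; only precision changes.
for skew in (0.5, california_skew, 0.001):
    p = precision_from_rates(tpr=0.9, fpr=0.01, positive_fraction=skew)
    print(f"positive fraction {skew:.4f}: recall 0.90, precision {p:.3f}")
```

With balanced classes the precision is close to 0.99, while at the California skew it falls to roughly 0.74, and it keeps falling as the positive class gets rarer, even though the classifier itself has not changed.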
  • 12. Predictive Modeling for Global Monitoring of Forest Fires. Challenges: (1) Complete absence of target labels for supervision (however, imperfect, poor-quality labels are available for every sample) (2) Highly imbalanced classes (3) How to evaluate the performance of a model using imperfect labels? (Figure: global availability of labeled samples for burned area classification)
  • 13. Predictive Modeling for a Rare Target Class using Imperfect Labels. What are imperfect labels? • Noisy/perturbed true labels • Inexpensive to obtain • Raw feature • Heuristics (given by expert) • Available for all test instances. Examples of imperfect labels:
      Application | Target label | Imperfect label
      Burned area | Fire / No fire | Thermal anomaly
      Urban settlement | Urban / Not urban | Night-time light
      Recommending items to a new user | Interested / Not interested | Friends' interests
  • 14. Learning with imperfect labels (taxonomy of supervised learning)
      – Expert-annotated labels, sufficient training samples: SVM, Decision tree, Logistic regression
      – Expert-annotated labels, inadequate training samples: Semi-supervised, Active Learning, Multi-view, Multi-task
      – Imperfect labels, multiple annotators: Learning with crowds (Raykar et al.)
      – Imperfect labels, single annotator: Partial Supervision (Positive Unlabeled learning: Bing Liu et al., Elkan et al.) and Imperfect Supervision (Natarajan et al., Menon et al.), split into Balanced and Rare class
  • 15. Learning with imperfect labels. Assumptions: (1) α + β < 1 (2) Imperfect label is conditionally independent of feature space given the true label (CCN). Figure: True Labels
  • 16. Learning with imperfect labels. Assumptions: (1) α + β < 1 (2) Imperfect label is conditionally independent of feature space given the true label (CCN). Figure: Imperfect Labels that are conditionally independent (errors shown in red)
  • 17. Learning with imperfect labels. Assumptions: (1) α + β < 1 (2) Imperfect label is conditionally independent of feature space given the true label (CCN). Figures: Imperfect Labels that are conditionally independent (errors shown in red); Imperfect Labels that are not conditionally independent (errors shown in red)
  • 18. Learning with imperfect labels. Assumptions: (1) α + β < 1 (2) Imperfect label is conditionally independent of feature space given the true label (CCN). Blum et al., COLT 1998: Given enough samples, it is possible to learn a classifier in the CCN label noise scenario that is as good as one learned using perfectly labeled samples.
  • 19. Learning with imperfect labels (continued). Liu et al., ICML 2003: For the Positive and Unlabeled (PU) learning setting (CCN with β = 0), presented an algorithm to optimize precision*recall without using any perfect labels.
  • 20. Learning with imperfect labels (continued). Natarajan et al., NIPS 2013: For some performance measures, such as accuracy, a classifier can be learned using CCN labels by treating them as perfect labels.
  • 21. Learning with imperfect labels (continued). Elkan and Noto, KDD 2008. Menon et al., ICML 2015: For the general CCN scenario, showed the linear relationship between P(y=1|x) and P(a=1|x); presented a method to optimize balanced error rate and AUC.
  • 22. Learning with imperfect labels (continued). Imperfect supervision under the rare class scenario: models built using imperfect labels a to optimize combinations of precision and recall can perform very poorly compared to models built using true labels y.
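The linear relationship between P(y=1|x) and P(a=1|x) credited to Menon et al. follows directly from the conditional-independence assumption. The derivation below is a sketch under assumed notation: α = P(a=0 | y=1) and β = P(a=1 | y=0) are taken as the two noise rates, a reading chosen to be consistent with the slides' remark that PU learning is CCN with β = 0; the slides themselves do not define the symbols.

```latex
\begin{aligned}
P(a=1 \mid x) &= P(a=1 \mid y=1)\,P(y=1 \mid x) + P(a=1 \mid y=0)\,P(y=0 \mid x) \\
              &= (1-\alpha)\,P(y=1 \mid x) + \beta\,\bigl(1 - P(y=1 \mid x)\bigr) \\
              &= (1-\alpha-\beta)\,P(y=1 \mid x) + \beta
\end{aligned}
```

The first line uses P(a=1 | y, x) = P(a=1 | y), which is the CCN assumption. When α + β < 1 the slope is positive, so ranking instances by P(a=1|x) gives the same order as ranking them by P(y=1|x).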
  • 23. RAPT: RAre class Prediction in absence of ground Truth • Step 1: Learn a classification model using imperfect labels that is as good as the one that could be built using perfect labels – We provide a new method to optimize precision*recall using imperfect labels and prove that the selected threshold maximizes the true precision*recall • Step 2: Combine predictions from the classification model and the imperfect label to improve precision at some loss in recall – For ultra-rare class scenarios, the gain in precision after aggregation is significantly higher than the loss in recall – The improvement in G-measure after step 2 is by a factor of • Step 3: Use spatial context to improve recall – Exploits relationship structure, such as a spatial neighborhood, biological network or social network, to improve the coverage (recall) of the rare class instances.
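Why Step 2 trades a little recall for a large gain in precision can be seen in closed form. This is an illustrative sketch, not the RAPT implementation: it assumes the combination is a logical AND of the classifier output and the imperfect label, that the two err independently given the true class (which follows from the CCN assumption when the classifier depends only on the features), and that the G-measure is the geometric mean of precision and recall. All rates and function names are hypothetical.

```python
import math


def precision_recall(tpr, fpr, positive_fraction):
    """Precision and recall implied by TPR, FPR and the positive-class fraction."""
    tp = tpr * positive_fraction
    fp = fpr * (1.0 - positive_fraction)
    return tp / (tp + fp), tpr


def combine_with_and(clf_tpr, clf_fpr, label_tpr, label_fpr):
    """Rates of (classifier AND imperfect label) under conditional independence."""
    return clf_tpr * label_tpr, clf_fpr * label_fpr


def g_measure(precision, recall):
    """Geometric mean of precision and recall (assumed definition)."""
    return math.sqrt(precision * recall)


rare = 0.001                           # hypothetical ultra-rare positive fraction
clf = (0.9, 0.01)                      # hypothetical classifier TPR, FPR
weak = (0.8, 0.02)                     # hypothetical imperfect-label TPR, FPR

p_clf, r_clf = precision_recall(*clf, rare)
p_and, r_and = precision_recall(*combine_with_and(*clf, *weak), rare)

print(f"classifier alone: precision {p_clf:.3f}, recall {r_clf:.2f}, G {g_measure(p_clf, r_clf):.3f}")
print(f"AND with label:   precision {p_and:.3f}, recall {r_and:.2f}, G {g_measure(p_and, r_and):.3f}")
```

Because the false positive rates multiply, false alarms shrink far faster than true detections, which is why the gain is largest when the positive class is very rare.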
  • 24. Results for Burned Area Mapping, California State. Panels: RAPT Step 1; GT-based classifier
  • 25. Results for Burned Area Mapping, California State. Panels: RAPT Step 1; Weak label; GT-based classifier
  • 27. Results for Burned Area Mapping, California State. Panels: RAPT Step 1; RAPT Step 2; Weak label; GT-based classifier
  • 28. Results for Burned Area Mapping, California State. Panels: RAPT Step 1; RAPT Step 2; RAPT Step 3; Weak label; GT-based classifier
  • 29. Results for Burned Area Mapping, Georgia State. Panels: RAPT Step 1; RAPT Step 2; RAPT Step 3; Weak label; GT-based classifier
  • 30. Results for Burned Area Mapping, Montana State. Panels: RAPT Step 1; RAPT Step 2; RAPT Step 3; Weak label; GT-based classifier
  • 31. Global Monitoring of Fires in Tropical Forests. Fires in tropical forests during 2001-2014: RAPT found 571 K sq. km of burned area, three times the area reported by the state-of-the-art NASA product MCD64A1 (186 K sq. km). Overlap: 445 K sq. km found by RAPT only, 126 K common to both, 60 K found by MCD64A1 only.
  • 32. Deforestation via Burning in Amazon. Figure: yearly burn detection (B) and land cover for 2002-2014, with land cover F in 2002-2008 and N in 2009-2014; Google Earth images from 2002 and 2015; RAPT detection 2002-2014 (legend: RAPT only, Common).
  • 33. Monitoring Global Change: Case Studies 1. Global mapping of forest fires: RAPT: Rare Class Prediction in Absence of Ground Truth 2. Mapping of plantation dynamics in tropical forests: Recurrent Neural Networks to model space and time 3. Global mapping of inland surface water dynamics: Heterogeneous Ensemble Learning; Physics-guided Labeling; Information Transfer across Space and Time (Image: Lake Oroville in 2011 and 2014)
  • 34. Case Study 2: Mapping of Plantation Dynamics. Interplay between food, energy and water: production of edible oils and biofuels; high carbon emissions; degradation of water quality.
  • 35. State-of-the-art and Challenges
      – Tree Plantation (TP): This data set was created by Transparent World with the support of Global Forest Watch. Plantations are manually annotated for 2014. TP has high recall and low precision.
      – Roundtable on Sustainable Palm Oil (RSPO): This data set is available across Indonesia for 2000, 2005, and 2009. In addition, the study digitized all locations into 19 land cover types in these years. RSPO has high precision and low recall.
      (Figure panels: TP, 2014; RSPO, 2001; RSPO, 2005; RSPO, 2009)
      Challenges:
      – Imperfect annotators: TP has high recall and low precision; RSPO has high precision and low recall.
      – Data heterogeneity: land cover heterogeneity (plantations must be differentiated from a variety of land covers, some of which, e.g. forest, are highly confused with plantations); spatial heterogeneity; temporal heterogeneity, including seasonal variation (e.g., cropland after harvest looks very similar to barren land) and yearly variation (the spectral features of a land cover change across years).
      – Noisy and high-dimensional feature space
  • 36. Our Contribution
      – Learning from multiple imperfect annotators (Jia et al. BigData 2016): Each annotator has a different expertise level on different plantation types. We recursively update the expertise of each annotator and estimate the true labels.
      – Handling temporal heterogeneity in prediction (Jia et al. SDM 2017): We model temporal and spatial dependencies across years in an LSTM model. We propose an incremental learning strategy to update the LSTM model.
      – Aggregating classes, collecting samples and validating results (Jia et al. Technical Report, 2017): We aggregate similar classes according to domain expertise. For each aggregated class, we draw an equal number of samples from each sub-class across multiple years. We validate the generated plantation maps by comparing randomly sampled locations to high-resolution images.
      References: 1. Jia, X., Khandelwal, A., Gerber, J., Carlson, K., West, P., and Kumar, V. Learning Large-scale Plantation Mapping from Imperfect Annotators. In IEEE Big Data (Big Data), 2016. 2. Jia, X., Khandelwal, A., Nayak, G., Gerber, J., Carlson, K., West, P., and Kumar, V. Predict Land Covers with Transition Modeling and Incremental Learning. In SDM, 2017. 3. Jia, X., Khandelwal, A., Gerber, J., Carlson, K., Samberg, L., West, P., and Kumar, V. Automated Plantation Mapping in Southeast Asia Using Remote Sensing Data. In Department of Computer Science and Engineering-Technical Reports.
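The idea of recursively updating annotator expertise and estimating true labels can be illustrated with a generic iterative scheme. This is a simplified sketch and not the method of Jia et al.: it assumes binary labels and a single accuracy per annotator (the slide notes that expertise actually varies by plantation type), and the function name, the simulated annotators and their accuracies are all hypothetical.

```python
import math
import random


def estimate_labels(votes, n_iter=10):
    """votes: list of rows, each a list of 0/1 labels (one per annotator).

    Returns (estimated labels, estimated per-annotator accuracy).
    """
    n_annotators = len(votes[0])
    # Start from a plain majority vote.
    labels = [1 if 2 * sum(row) >= n_annotators else 0 for row in votes]
    accuracy = [0.5] * n_annotators
    for _ in range(n_iter):
        # Expertise update: agreement of each annotator with the current estimate,
        # clipped so the log-odds weights below stay finite.
        accuracy = []
        for j in range(n_annotators):
            agree = sum(1 for row, y in zip(votes, labels) if row[j] == y)
            accuracy.append(min(max(agree / len(votes), 0.01), 0.99))
        # Label update: log-odds weighted vote, so reliable annotators count more.
        weights = [math.log(a / (1.0 - a)) for a in accuracy]
        labels = [
            1 if sum(w if v == 1 else -w for v, w in zip(row, weights)) > 0 else 0
            for row in votes
        ]
    return labels, accuracy


# Simulated data: three annotators with different (hypothetical) accuracies.
rng = random.Random(0)
true_accuracy = [0.9, 0.7, 0.6]
truth = [rng.randint(0, 1) for _ in range(5000)]
votes = [[y if rng.random() < acc else 1 - y for acc in true_accuracy] for y in truth]

labels, accuracy = estimate_labels(votes)
agreement = sum(1 for y, t in zip(labels, truth) if y == t) / len(truth)
print("estimated annotator accuracy:", [round(a, 2) for a in accuracy])
print("fraction of labels recovered:", round(agreement, 3))
```

The estimated accuracies are measured against the current label estimate rather than the truth, so they are only a relative ranking of the annotators; a full treatment would model per-class error rates as well.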
  • 38. Interaction between Fires and Palm Oil Plantation. Figure: all plantations. This and all following figures show only confident forest pixels.
  • 39. Interaction between Fires and Palm Oil Plantation. Figure: plantations with no burn scars; plantations with burn scars during 2001-2014. This and all following figures show only confident forest pixels.
  • 40. Interaction between Fires and Palm Oil Plantation. Figure: plantations with no burn scars; plantations with burn scars during 2001-2014; burned pixels around plantations (same burn date as nearby red pixels). This and all following figures show only confident forest pixels.
  • 41. Monitoring Global Change: Case Studies 1. Global mapping of forest fires: RAPT: Rare Class Prediction in Absence of Ground Truth 2. Mapping of plantation dynamics in tropical forests: Recurrent Neural Networks to model space and time 3. Global mapping of inland surface water dynamics: Heterogeneous Ensemble Learning; Physics-guided Labeling; Information Transfer across Space and Time (Image: Lake Oroville in 2011 and 2014)
  • 42. Importance of Monitoring Global Surface Water Dynamics. Impact of Climate Change (images: Cedo Caka Lake in Tibet, 1984 and 2011). Impact of Human Actions (images: Aral Sea in 1989 and 2014). Early Warning Systems (image: Great Flood of Mississippi River, 1993).
  • 43. Importance of Monitoring Global Surface Water Dynamics. Melting of glacial lakes in Tibet (images: Cedo Caka Lake in Tibet, 1984 and 2011). Shrinking of Aral Sea since 1960s (images: Aral Sea in 2000 and 2014). Opportunity in using Remote Sensing Data • Multi-spectral data: MODIS (at 500m, from 2000); Landsat (at 30m, from 1970s) • Can be used to classify every location at a given time as water or land (binary classes) • Ground truth on specific dates available from various sources: SRTM, GLWD
  • 44. Challenges for Traditional Big Data Methods in Monitoring Water • Challenge 1: Heterogeneity in space and time – Water and land bodies look different in different regions of the world (images: Great Bitter Lake, Egypt; Lake Tana, Ethiopia; Lake Abbe, Africa) – The same water body can look different at different time instances (images: Mar Chiquita Lake, Argentina, in 2000 and 2012) • Challenge 2: Data Quality – Noise: clouds, shadows, atmospheric disturbances – Missing data (image: Poyang Lake, China; pink color shows missing data)
  • 45. Method Innovations for Monitoring Water • Ensemble Learning Methods for Handling Heterogeneity in Data [1, 2]: learn an ensemble of classifiers to distinguish between different pairs of positive modes (water: P1, P2, P3) and negative modes (land: N1, N2, N3) • Using Physics-Guided Labeling to Handle Poor Data Quality [3, 4]: use elevation information (elevation A > B > C > D) to constrain physically-consistent labels. References: [1] Karpatne et al. SDM 2015 [2] Karpatne et al. ICDM 2015 [3] Khandelwal et al. ICDM 2015 [4] Mithal et al. (PhD Dissertation)
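The elevation constraint can be made concrete with a small relabeling routine. This is a hedged sketch of the general idea only, not the algorithm of Khandelwal et al. or Mithal et al.: it assumes a single water body in which water fills from the lowest elevation upward, so a physically consistent labeling is "every pixel below some water level is water", and it picks the level that disagrees least with the noisy labels. The function name and the example values are hypothetical.

```python
def elevation_consistent_labels(elevations, noisy_labels):
    """Relabel pixels so that water (1) fills from the lowest elevation upward.

    Chooses the number k of lowest pixels to call water that minimizes the
    number of disagreements with the noisy input labels.
    """
    order = sorted(range(len(elevations)), key=lambda i: elevations[i])
    # With k = 0 (all land), every pixel labelled water is a disagreement.
    errors = sum(noisy_labels)
    best_k, best_errors = 0, errors
    for k, idx in enumerate(order, start=1):
        # Moving pixel idx to the water side removes a disagreement if it was
        # labelled water and adds one if it was labelled land.
        errors += -1 if noisy_labels[idx] == 1 else 1
        if errors < best_errors:
            best_k, best_errors = k, errors
    cleaned = [0] * len(elevations)
    for idx in order[:best_k]:
        cleaned[idx] = 1
    return cleaned


# Pixels listed in arbitrary order; elevation units are arbitrary.
elevations = [5, 1, 7, 3, 8, 2, 6, 4]
# Noisy labels: the pixel at elevation 3 is wrongly land (e.g. cloud cover),
# the pixel at elevation 7 is wrongly water.
noisy = [1, 1, 1, 0, 0, 1, 0, 1]

print(elevation_consistent_labels(elevations, noisy))
```

In this example the routine restores water at elevation 3 and removes it at elevation 7, because a water surface that covers elevation 5 must also cover everything below it.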
  • 46. A Global Surface Water Monitoring System, http://z.umn.edu/monitoringwater • Maps the dynamics of all major surface water bodies (surface area > 2.5 km²), shown as blue dots. Key Highlights: • Detects melting of glacial lakes • Maps changes in river morphology • Identifies reservoir constructions • Finds relationships between surface water and precipitation/groundwater
  • 47. Showing Surface Water Dynamics: Don Martin Dam, Mexico. Figures: surface area of water around Don Martin Dam across time; annual Landsat time-lapse of this region (courtesy: Google Earth Engine)
  • 48. Regions of Change in South America. Red Dots (Water Gain): regions of size > 2.5 km² that have changed from land to water in the last 15 years. Green Dots (Water Loss): regions of size > 2.5 km² that have changed from water to land in the last 15 years. Figures: example time series of a Water Gain region; example time series of a Water Loss region
  • 49. Examples of Change: Shrinking Water Bodies. Figures: aggregate dynamics of all green dots (green dots show regions changing from water to land in the last 15 years); annual time-lapse of an example green dot
  • 50. Examples of Change: Melting Glacial Lakes in Tibet. Water Gain regions (red dots) show melting of lakes. Red polygons show regions changing from land to water. Figures: images from September 2013 and November 2015; aggregate dynamics of all red regions in Tibet
  • 51. Examples of Change: River Meandering. Adjacent occurrence of Water Gain (red) and Water Loss (green) regions all along the river indicates the displacement of water from the green dots to the red dots. Figures: zoomed-in view; example time series of a Water Gain region (1) and a Water Loss region (2); time-lapses of regions 1 and 2
  • 52. Examples of Change: Shrinking Island
  • 53. Examples of Change: Dam Construction. Construction of Chubetsu Dam, Japan. Construction of a dam is characterized by a sudden and persistent increase in surface area.
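A "sudden and persistent increase in surface area" can be turned into a simple test on a surface-area time series. The sketch below is illustrative only and is not the detection method used in the talk: it assumes an annual or otherwise regularly sampled series, treats an increase as persistent when every later value exceeds every earlier one, and uses a hypothetical function name, threshold and data.

```python
from statistics import median


def find_persistent_increase(series, min_jump, min_segment=3):
    """Return the index where a sudden, persistent increase starts, or None.

    A split at index t qualifies when the median after t exceeds the median
    before t by more than min_jump and all later values stay above all
    earlier ones. Among qualifying splits, the largest jump wins.
    """
    best_index, best_jump = None, min_jump
    for t in range(min_segment, len(series) - min_segment + 1):
        before, after = series[:t], series[t:]
        jump = median(after) - median(before)
        if jump > best_jump and min(after) > max(before):
            best_index, best_jump = t, jump
    return best_index


# Hypothetical surface area (km²) before and after a reservoir fills.
reservoir = [2.1, 2.0, 2.3, 2.2, 2.1, 6.8, 7.0, 6.7, 7.1, 6.9]
# Hypothetical seasonal lake: large swings, but no persistent change.
seasonal = [2.0, 5.0, 2.1, 5.2, 2.0, 5.1, 2.2, 5.0]

print(find_persistent_increase(reservoir, min_jump=1.0))
print(find_persistent_increase(seasonal, min_jump=1.0))
```

The strict "all later values above all earlier values" rule is deliberately conservative and would need relaxing for noisy or gappy series; it is used here only to separate a step change from seasonal fluctuation.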
  • 54. Global Reservoir and Dam (GRanD) Database • A data curation initiative by Global Water System Project (GWSP) • Finds 61 dams constructed after 2001. UMN Approach: • Finds 701 dams constructed after 2001. Dams reported by GRanD since 2001: 35
  • 55. Comparison of Dam Detections with GRanD. Global Reservoir and Dam (GRanD) Database: • A data curation initiative by Global Water System Project (GWSP) • Finds 61 dams constructed after 2001. UMN Approach: • Finds 701 dams constructed after 2001. Dams only reported by GRanD: 5. Dams reported by both UMN and GRanD: 30. Dams only reported by UMN: 671
  • 56. Applications of Surface Water Monitoring. Quantifying water stocks and flow; global projections of water risks (red); integrating with hydrological models. • Changes in the volume of water in lakes and reservoirs, in conjunction with altimetry data • River flow measurements at a global scale • High quality data for calibrating hydrological models. Key Enabler: High-resolution estimates of water bodies in space (30m) and time (daily)
  • 57. Goal: Daily surface water extent maps at 30m spatial resolution • Challenge: – LANDSAT (30m spatial resolution, every 16 days) – MODIS (500m spatial resolution, daily) – Overlapping information is available only at a few time steps • Approach: Transfer information across Space and Time
  • 58. Goal: Daily surface water extent maps at 30m spatial resolution (continued). Kajakai Reservoir, Afghanistan: extent at coarse resolution (500m) and extent at high resolution (30m) created using our approach
  • 59. Goal: Daily surface water extent maps at 30m spatial resolution (continued). Kajakai Reservoir, Afghanistan, Jul 05, 2015: extent at coarse resolution (500m) and extent at high resolution (30m) created using our approach
  • 60. Concluding Remarks • Big data in climate offers a great opportunity to advance machine learning as well as to increase our understanding of the Earth’s climate and environment. • Novel approaches are needed that can guide the process of knowledge discovery in scientific applications: “Theory-guided Data Science” • The methods are applicable across diverse domains: – Ecosystem management – Epidemiology – Geospatial Intelligence – Neuroscience
  • 61. Thank You! Questions? UMN Team Members: Arindam Banerjee, Snigdhansu Chatterjee, Stefan Liess, Shashi Shekhar, Michael Steinbach. UMN Graduate Students: Guruprasad Nayak, Ankush Khandelwal, Varun Mithal, Anuj Karpatne, Xi Chen, Xiaowei Jia, Saurabh Agrawal. NSF Expeditions Collaborators: NCSU: Nagiza Samatova, Fredrick Semazzi; Northeastern: Auroop Ganguly; North Carolina A&T: Abdollah Homaifar, Fred Semazzi. External Collaborators: NASA Ames: Rama Nemani, Nikunj Oza; IonE, UMN: Kate Braumann; UCLA: Dennis Lettenmaier, Miriam Marlier. Work supported by NASA and the NSF Expeditions in Computing project on Understanding Climate Change using Data-driven Approaches.