2. MPI
Climate Change
Numerical modeling & tools MPI 794 | YASSER B. A. FARAG | 12 January 2021
DATA
• In the past, the data on which science and engineering are based was scarce and was frequently obtained from experiments designed to verify a given hypothesis. Each experiment could yield only very limited data.
• Today, data is abundant and is collected in large quantities in every single experiment at very low cost.
(Francisco J. Montáns, Francisco Chinesta, Rafael Gómez-Bombarelli, J. Nathan Kutz, Data-driven modeling and learning in science and engineering)
3. MPI
BIG-DATA – Def.
• The term has been in use since the 1990s, with some crediting John Mashey with popularizing it.
• Big data usually includes data sets whose size is beyond the ability of commonly used software tools to capture, curate, manage, and process within a tolerable elapsed time.
• The big data philosophy encompasses unstructured, semi-structured, and structured data; however, the main focus is on unstructured data.
• Big data "size" is a constantly moving target, as of 2012 ranging from a few dozen terabytes to many zettabytes of data.
• A 2018 definition states: "Big data is where parallel computing tools are needed to handle data".
4. MPI
Rev. 4
5. MPI
Data deployment
6. MPI
STEPS…
Define
• Ask the right questions
• Define your targets
• Set boundaries
Measure
• Collect valid data
• Improve your data quality
Analyze
• Visualize your data
• Derive the inter-relationships between the data variables
UNDERSTAND YOUR DATA
Design & Develop
• Choose the suitable methodology
• Design your model
• Validate your results
• Improve your model efficiency
Control
• Create a framework to optimize your system and hence maximize its efficiency
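One simple way to approach the Analyze item on inter-relationships between data variables is a correlation coefficient. The sketch below computes the Pearson correlation in plain Python. The function name `pearson_correlation` and the two toy series (hours of sunshine and temperature) are hypothetical and serve only as an illustration; they do not come from the slides.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical toy measurements (not from the slides)
hours_of_sun = [2.0, 4.0, 6.0, 8.0, 10.0]
temperature = [11.0, 14.0, 18.0, 21.0, 25.0]

r = pearson_correlation(hours_of_sun, temperature)
print(f"Pearson r = {r:.3f}")
```

A value of r close to +1 or -1 suggests a strong linear relationship between the two variables; a value near 0 suggests no linear relationship, although a non-linear one may still exist.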
7. MPI
Example
• Weather sensors and satellites are deployed all around the globe. They collect a huge amount of data, which is used to monitor weather and environmental conditions.
• All of the data collected from these sensors and satellites contributes to big data and can be used in different ways, such as:
1. Weather forecasting
2. Studying global warming
3. Understanding the patterns of natural disasters
4. Making the necessary preparations in case of crises
5. Predicting the availability of usable water around the world
8. MPI
Data-driven models
• A data-driven model is based on the analysis of data about a specific system. The main idea of a data-driven model is to find relationships between the system state variables (input and output) without explicit knowledge of the physical behaviour of the system (Solomatine et al. 2008).
• Data-driven modelling is therefore focused on Machine Learning (ML) methods that can be used to build models that complement or replace physically based models. A machine-learning algorithm determines the relationship between a system's inputs and outputs using a training data set that is representative of all the behaviour found in the system.
Data-Driven Modelling: Concepts, Approaches and Experiences,
https://www.researchgate.net/publication/226880269_Data-Driven_Modelling_Concepts_Approaches_and_Experiences
• Machine Learning allows systems to make decisions autonomously, without external support. These decisions become possible once the machine has learned from the data and captured the underlying patterns it contains. Then, through pattern matching and further analysis, the system returns an outcome, which can be a classification or a prediction.
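As a rough illustration of "pattern matching" that returns a classification, the sketch below uses a k-nearest-neighbours vote. This is one possible technique chosen for brevity, not a method named in the slides. The function `classify`, the feature choice (relative humidity and cloud cover), and every data point are hypothetical.

```python
import math
from collections import Counter


def classify(sample, training_set, k=3):
    """Return the majority label among the k training points closest to sample."""
    by_distance = sorted(training_set, key=lambda item: math.dist(item[0], sample))
    nearest_labels = [label for _, label in by_distance[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy observations: (relative humidity %, cloud cover %) and a label
training_set = [
    ((85.0, 90.0), "rain"),
    ((80.0, 75.0), "rain"),
    ((90.0, 95.0), "rain"),
    ((30.0, 10.0), "dry"),
    ((40.0, 20.0), "dry"),
    ((35.0, 30.0), "dry"),
]

humid_day = classify((82.0, 85.0), training_set)
clear_day = classify((38.0, 15.0), training_set)
print(humid_day, clear_day)
```

No physical description of the weather is encoded here: the labels come only from the similarity of a new sample to previously seen data, which is the defining trait of a data-driven model.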
9. MPI
Machine Learning (ML)
10. MPI
Machine Learning (ML)
Machine Learning
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
11. MPI
Typical Data-driven model
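[Diagram: typical data-driven model]
• Input data set ($x_n$), fed to both the real model and the data-driven model (ML)
• Real model: produces the actual (observed) output ($y_n$)
• Data-driven model (ML): produces the estimated output ($\hat{y}_n$)
• Error $e$ between the actual and the estimated output
• Learning to minimize the error $e$
• Validation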
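The learning loop in this diagram can be sketched as gradient descent on the mean squared error between the observed and the estimated output. This is a minimal sketch under stated assumptions: `real_model` is a hypothetical stand-in for the real system (here simply y = 2x + 1), the data-driven model is assumed to be a straight line with parameters `w` and `b`, and the learning rate and epoch count are arbitrary illustrative choices.

```python
def real_model(x):
    """Hypothetical stand-in for the real system: y = 2x + 1."""
    return 2.0 * x + 1.0


def train(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y_hat = w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Error e_n = y_hat_n - y_n for every training sample
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def mean_squared_error(w, b, xs, ys):
    """Average squared difference between estimated and observed output."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Training data: inputs x_n and observed outputs y_n
train_x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
train_y = [real_model(x) for x in train_x]

# Validation data: inputs the model has not seen during learning
valid_x = [6.0, 7.0]
valid_y = [real_model(x) for x in valid_x]

w, b = train(train_x, train_y)
validation_error = mean_squared_error(w, b, valid_x, valid_y)
print(f"w = {w:.4f}, b = {b:.4f}, validation MSE = {validation_error:.2e}")
```

The validation step uses inputs that were held out of training, so a small validation error indicates that the learned relationship generalizes rather than merely reproducing the training samples.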
12. MPI
Regression models
x y
1 9.0
2 12.0
3 15.0
4 18.0
5 21.0
6 24.0
7 27.0
8 30.0
9 33.0
10 36.0
11 39.0
12 42.0
13 45.0
14 48.0
15 51.0
16 54.0
17 57.0
18 60.0
19 63.0
20 66.0
[Figure: scatter plot of the tabulated data; x axis from 0 to 25, y axis from 0 to 80]
Estimate the value of y at x=25
13. MPI
Linear Regression
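[Figure: scatter plot of the x-y data from the "Regression models" slide with a fitted trendline; x axis from 0 to 25, y axis from 0 to 80]
Estimate the value of y at x = 25
Trendline: $y = -1 \times 10^{-7}\,x^2 + 3x + 6$ (the quadratic coefficient is negligible, so the fit is effectively the straight line $y = 3x + 6$)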
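The same estimate can be reproduced with an ordinary least-squares straight-line fit. The sketch below uses the closed-form formulas for slope and intercept; the function name `fit_line` is illustrative, and the data are the twenty x-y pairs tabulated on the slide.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# The x-y pairs from the slide's table
xs = list(range(1, 21))
ys = [9.0, 12.0, 15.0, 18.0, 21.0, 24.0, 27.0, 30.0, 33.0, 36.0,
      39.0, 42.0, 45.0, 48.0, 51.0, 54.0, 57.0, 60.0, 63.0, 66.0]

slope, intercept = fit_line(xs, ys)
estimate_at_25 = slope * 25 + intercept
print(f"y = {slope:.1f}x + {intercept:.1f}; y(25) = {estimate_at_25:.1f}")
# → y = 3.0x + 6.0; y(25) = 81.0
```

Because the tabulated points lie exactly on a straight line, the fit recovers slope 3 and intercept 6, and extrapolating to x = 25 gives 81.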
14. MPI
Non-Linear Regression
Can you predict the U.S. population in 2020?
Year (x)   U.S. population (millions)
1790 3.9
1800 5.3
1810 7.2
1820 9.6
1830 12.9
1840 17.1
1850 23.1
1860 31.4
1870 38.6
1880 50.2
1890 62.9
1900 76
1910 92
1920 105.7
1930 122.8
1940 131.7
1950 150.7
1960 179
1970 205
1980 226.5
1990 248.7
[Figure: U.S. population (millions) against year; x axis from 1790 to 2040, y axis from 0 to 300]
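One possible way to attempt this prediction is a least-squares quadratic fit. The slide does not say which non-linear model is intended, so the choice of a second-degree polynomial is an assumption, and the helper names (`solve_linear_system`, `fit_polynomial`, `evaluate`) are illustrative. Years are rescaled to decades since 1790 to keep the normal equations well conditioned.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit via the normal equations.

    Returns coefficients c such that y ~ c[0] + c[1]*x + c[2]*x**2 + ...
    """
    size = degree + 1
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    return solve_linear_system(a, b)


def evaluate(coeffs, x):
    """Evaluate the fitted polynomial at x."""
    return sum(c * x ** k for k, c in enumerate(coeffs))


# Census values from the slide's table (millions), 1790 to 1990
years = list(range(1790, 2000, 10))
population = [3.9, 5.3, 7.2, 9.6, 12.9, 17.1, 23.1, 31.4, 38.6, 50.2, 62.9,
              76.0, 92.0, 105.7, 122.8, 131.7, 150.7, 179.0, 205.0, 226.5, 248.7]

# Rescale to decades since 1790 for numerical stability
decades = [(year - 1790) / 10 for year in years]
coeffs = fit_polynomial(decades, population, degree=2)
prediction_2020 = evaluate(coeffs, (2020 - 1790) / 10)
print(f"Quadratic-fit estimate for 2020: {prediction_2020:.1f} million")
```

Extrapolating 30 years beyond the last data point is sensitive to the chosen model: an exponential or logistic curve fitted to the same table would give a different 2020 value, so the result should be read as one candidate estimate rather than a definitive answer.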
15. MPI
Regression