Advance mathematics mid term presentation rev01
1. Finding Regression Coefficient To Predict Radiation
Advance Mathematics
Contents:
• Finding of regression coefficient using
  • Statistical method
  • Genetic algorithm (by minimizing error)
References:
1. Course work in environmental geology
2. Course work in advance mathematics for planning
Nirmal Raj Joshi | 13ME135 | Structural Material Laboratory
2. Advance Mathematics for Planning
OBJECTIVE
*The objective of this analysis is to predict the radiation level at a sixth location (D-1) using the known data of five other locations. The data set contains radiation measured at six different spots.
*Use two techniques to find the regression coefficients:
a) Linear regression
b) Genetic algorithm
3. Data
Date        D-1    D-14   D-15   D-20   D-24   D-26
9/2/2013    1.318  2.132  1.48   3.144  2.128  2.858
9/3/2013    1.3    2.132  1.46   3.468  2.13   2.89
9/4/2013    1.324  2.158  1.378  3.32   1.924  2.874
…..
11/8/2013   1.294  2.116  1.438  3.356  2.036  2.724
11/11/2013  1.32   2.084  1.384  3.156  1.936  2.904
Target data: D-1. Input data: D-14, D-15, D-20, D-24, D-26.
[Figure: Radiation level at various locations. Radiation level vs. time (days) for D-1, D-14, D-15, D-20, D-24 and D-26.]
4. Regression analysis using normal statistical method
The regression equation is
D-1 = α1·D-14 + α2·D-15 + α3·D-20 + α4·D-24 + α5·D-26
*There is no constant term in the equation.
# "+ 0" in the formula removes the constant (intercept) term
model_radiation <- lm(D.1 ~ D.14 + D.15 + D.20 + D.24 + D.26 + 0, data = rdata)
summary(model_radiation)
R-Output
Coefficients:
      Estimate  Std. Error  t value  Pr(>|t|)
D.14  0.226600  0.088572    2.558    0.0143 *
D.15  0.407026  0.155673    2.615    0.0124 *
D.20  0.025524  0.024257    1.052    0.2989
D.24  0.004203  0.081023    0.052    0.9589
D.26  0.050943  0.052996    0.961    0.3421
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.03299 on 41 degrees of freedom
Multiple R-squared: 0.9994, Adjusted R-squared: 0.9993
F-statistic: 1.353e+04 on 5 and 41 DF, p-value: < 2.2e-16
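As a rough illustration of what a regression without a constant term computes, the sketch below solves the normal equations (X'X)a = X'y in plain Python. It is not the R implementation. The function names and the toy data are hypothetical, and the column order of the 9/2/2013 row (D-14, D-15, D-20, D-24, D-26) is assumed from the data slide. The last lines apply the coefficients from the R output to that row.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations act on both sides at once.
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_no_intercept(rows, y):
    """Least squares without a constant term: solve (X'X) a = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coeffs, row):
    """Weighted sum of the inputs; no constant term is added."""
    return sum(c * v for c, v in zip(coeffs, row))


# Hypothetical toy data (not the radiation data): y = 2*x1 + 3*x2 exactly.
toy_x = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0]]
toy_y = [2 * a + 3 * b for a, b in toy_x]
toy_coeffs = fit_no_intercept(toy_x, toy_y)

# Coefficients from the R output, applied to the 9/2/2013 row.
r_coeffs = [0.226600, 0.407026, 0.025524, 0.004203, 0.050943]
first_row = [2.132, 1.48, 3.144, 2.128, 2.858]  # assumed column order
d1_pred = predict(r_coeffs, first_row)
print(toy_coeffs, round(d1_pred, 3))
```

For the 9/2/2013 row the predicted D-1 comes out close to the measured 1.318.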
[Figure: Radiation at location D-1: Actual Vs Predicted. Radiation level vs. time (days) for D-1 and D-1 pred.]
ANOVA table
Source of Variation  SS        df  MS        F         P-value   F crit
Between Groups       6.85E-07  1   6.85E-07  0.000453  0.983061  3.946876
Within Groups        0.135982  90  0.001511
Total                0.135982  91
*The R² value of this multiple regression is 0.9994, which indicates that the linear model fits well. Because the model has no constant term, R computes this R² about zero rather than about the mean, so it is not directly comparable with the value below.
*The R² between the predicted and observed D-1 values is 0.4668.
*F < F crit (0.000453 < 3.946876), so the means of the actual and predicted D-1 values do not differ significantly and the model is acceptable at the 95% confidence level.
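The arithmetic behind the ANOVA table can be checked with a few lines. This is a minimal sketch: the helper name `anova_f` is hypothetical, the sums of squares and degrees of freedom are copied from the table, and F crit is taken from the table rather than recomputed.

```python
def anova_f(ss_between, df_between, ss_within, df_within):
    """Return (MS between, MS within, F) for a single-factor ANOVA."""
    ms_between = ss_between / df_between  # mean square = SS / df
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within


# Values copied from the ANOVA table on this slide.
ms_b, ms_w, f_value = anova_f(6.85e-07, 1, 0.135982, 90)
f_crit = 3.946876  # from the table, not recomputed here
model_accepted = f_value < f_crit
print(f"MS between={ms_b:.3g}, MS within={ms_w:.6f}, F={f_value:.6f}")
print("F < F crit:", model_accepted)
```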
5. Regression analysis using Genetic algorithm
About genetic algorithm
1. Start with a randomly generated population of n l-bit chromosomes (candidate solutions to a problem).
2. Calculate the fitness ƒ(x) of each chromosome x in the population.
3. Repeat the following steps until n offspring have been created:
   a. Select a pair of parent chromosomes from the current population, the probability of selection being an increasing function of fitness.
   b. With probability pc (the "crossover probability" or "crossover rate"), cross over the pair at a randomly chosen point (chosen with uniform probability) to form two offspring. If no crossover takes place, form two offspring that are exact copies of their respective parents.
   c. Mutate the two offspring at each locus with probability pm (the mutation probability or mutation rate), and place the resulting chromosomes in the new population.
4. Replace the current population with the new population.
5. Go to step 2 until the fitness of successive populations converges.
Flowchart: Generate initial population (array of [P] values) and find fitness → Generate offspring using elite population → Set of new population → Stop when convergence is met
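The steps above can be sketched in a few dozen lines of Python. This is only an illustration under stated assumptions, not the MATLAB program used in this study. Chromosomes are lists of real numbers in [0, 1] rather than l-bit strings, the fitness is the sum of squared errors (lower is better), and the names `run_ga` and `sum_sq_error`, the parameter values and the toy data are all hypothetical.

```python
import random


def sum_sq_error(coeffs, rows, targets):
    """Sum of squared differences between actual and predicted values."""
    return sum(
        (t - sum(c * v for c, v in zip(coeffs, row))) ** 2
        for row, t in zip(rows, targets)
    )


def run_ga(rows, targets, n_vars, pop_size=60, generations=80,
           pc=0.8, pm=0.1, n_elite=2, seed=0):
    """Minimise sum_sq_error; return (best coefficients, best error history)."""
    rng = random.Random(seed)
    # Step 1: random initial population.
    pop = [[rng.random() for _ in range(n_vars)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Step 2: evaluate fitness and sort, best (lowest error) first.
        pop.sort(key=lambda ch: sum_sq_error(ch, rows, targets))
        history.append(sum_sq_error(pop[0], rows, targets))
        # Rank-based weights: the best chromosome gets the largest weight.
        weights = [pop_size - i for i in range(pop_size)]
        new_pop = [ch[:] for ch in pop[:n_elite]]  # elite copied unchanged
        while len(new_pop) < pop_size:
            # Step 3a: roulette-wheel selection on the rank weights.
            p1, p2 = rng.choices(pop, weights=weights, k=2)
            c1, c2 = p1[:], p2[:]
            # Step 3b: single-point crossover with probability pc.
            if n_vars > 1 and rng.random() < pc:
                cut = rng.randrange(1, n_vars)
                c1, c2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            # Step 3c: uniform mutation of each gene with probability pm.
            for child in (c1, c2):
                for i in range(n_vars):
                    if rng.random() < pm:
                        child[i] = rng.random()
                new_pop.append(child)
        pop = new_pop[:pop_size]  # Step 4: replace the population.
    pop.sort(key=lambda ch: sum_sq_error(ch, rows, targets))
    history.append(sum_sq_error(pop[0], rows, targets))
    return pop[0], history


# Hypothetical toy problem: recover y = 0.2*x1 + 0.4*x2.
toy_rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 2.0], [5.0, 3.0]]
toy_targets = [0.2 * a + 0.4 * b for a, b in toy_rows]
best, history = run_ga(toy_rows, toy_targets, n_vars=2)
print(best, history[0], history[-1])
```

Because the elite members are carried over unchanged, the best error never gets worse from one generation to the next. A convergence check (step 5) could stop the loop once the change in `history` falls below a tolerance; the sketch simply runs a fixed number of generations.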
6. Regression analysis using Genetic algorithm
Solution using genetic algorithm
Two fitness functions were used separately, viz. (a) R² and (b) E² = (Yactual − Ypred)². The program was run in MATLAB and results were obtained. For both runs, the input parameters are:
SN  Parameter                                  Value
1   Population type                            Double precision numbers
2   Number of variables to optimize            5
3   Number of population                       5000
4   Number of generation                       100 or attainment of allowable error
5   Allowable error in consecutive population  1e-6
6   Population generation scheme               Uniform
7   Fitness scaling                            Rank
8   Selection function                         Roulette wheel
9   Reproduction scheme                        Elite selection with crossover of 0.8 at single point
10  Mutation function                          Uniform
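The two fitness functions can be written out as below. This is a sketch under two assumptions: R² is taken as the squared Pearson correlation between actual and predicted values, and E² is summed over all observations. The D-1 values are the five visible rows of the data slide; the predictions are hypothetical.

```python
def r_squared(actual, predicted):
    """Squared Pearson correlation between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_a * var_p)


def e_squared(actual, predicted):
    """Sum of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted))


# D-1 values from the data slide; the predictions are hypothetical.
actual = [1.318, 1.3, 1.324, 1.294, 1.32]
pred = [1.32, 1.31, 1.315, 1.30, 1.31]
pred_scaled = [2.4 * p for p in pred]  # same shape, wrong magnitude

r2_plain, r2_scaled = r_squared(actual, pred), r_squared(actual, pred_scaled)
e2_plain, e2_scaled = e_squared(actual, pred), e_squared(actual, pred_scaled)
print(r2_plain, r2_scaled)
print(e2_plain, e2_scaled)
```

Under this definition, scaling every prediction by a constant leaves R² unchanged while E² grows sharply. A search that maximizes R² alone therefore has nothing that fixes the magnitude of the coefficients.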
7. Regression analysis using Genetic algorithm
Solution using genetic algorithm
CASE-1: Maximizing R² value
=> α = [0.4702 0.9999 0.0861 0.0232 0.0957] and R² = 0.4688
CASE-2: Minimizing E² = (Yactual − Ypred)²
=> α = [0.1952 0.3847 0.0362 0.0502 0.0397] and R² = 0.4600
Although R² is lower in CASE-2, the fitting is more realistic.
[Figure: Radiation at location D-1: Actual Vs Predicted. Radiation level (0.000 to 3.500) vs. time (days) for D-1 actual, D-1 pred with E2 and D-1 pred with R2.]
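To see the difference between the two cases in numbers, the sketch below applies both coefficient sets to the five rows that are visible on the data slide. The column order (D-14, D-15, D-20, D-24, D-26) is assumed from that slide, the helper names are hypothetical, and the result covers only these five rows, not the full data set.

```python
# Visible rows of the data slide: D-14, D-15, D-20, D-24, D-26 (assumed order).
inputs = [
    [2.132, 1.48, 3.144, 2.128, 2.858],   # 9/2/2013
    [2.132, 1.46, 3.468, 2.13, 2.89],     # 9/3/2013
    [2.158, 1.378, 3.32, 1.924, 2.874],   # 9/4/2013
    [2.116, 1.438, 3.356, 2.036, 2.724],  # 11/8/2013
    [2.084, 1.384, 3.156, 1.936, 2.904],  # 11/11/2013
]
d1_actual = [1.318, 1.3, 1.324, 1.294, 1.32]

alpha_r2 = [0.4702, 0.9999, 0.0861, 0.0232, 0.0957]  # CASE-1
alpha_e2 = [0.1952, 0.3847, 0.0362, 0.0502, 0.0397]  # CASE-2


def predict_all(alpha):
    """Predicted D-1 for every visible row (no constant term)."""
    return [sum(a * v for a, v in zip(alpha, row)) for row in inputs]


def sum_sq_error(predicted):
    """Sum of squared differences from the measured D-1 values."""
    return sum((y - p) ** 2 for y, p in zip(d1_actual, predicted))


pred_r2 = predict_all(alpha_r2)
pred_e2 = predict_all(alpha_e2)
sse_r2 = sum_sq_error(pred_r2)
sse_e2 = sum_sq_error(pred_e2)
print("CASE-1:", [round(p, 3) for p in pred_r2], round(sse_r2, 4))
print("CASE-2:", [round(p, 3) for p in pred_e2], round(sse_e2, 4))
```

On these rows the CASE-1 predictions come out near 3 while the measured values are near 1.3, which agrees with the plot. The CASE-2 predictions stay close to the measurements.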
8. Regression analysis using Genetic algorithm
Final adopted values
[Figure: Radiation at location D-1: Actual Vs Predicted. Radiation level (1.000 to 1.400) vs. time (days), 9/2/2013 to 11/11/2013, for D-1 actual and D-1 pred with E2.]
Source of Variation  SS        df  MS        F         P-value  F crit
Between Groups       2.21E-05  1   2.21E-05  0.014681  0.90383  3.946876
Within Groups        0.135241  90  0.001503
Total                0.135263  91
In the table, we see that F < F crit (0.014681 < 3.946876). Thus the values estimated with the regression coefficients α do not differ significantly from the measured values at the 95% confidence level.
9. Summary
• Regression coefficients were calculated using two methods.
• The data analysis tools should be selected wisely to get correct results. For example, in the case of GA, using the R² value as the fitness function may not yield proper results.
Welcome and good afternoon. My name is Nirmal Raj Joshi. I am from the civil engineering department. I used a data analysis technique to find regression coefficients that predict the radiation at a certain location from the radiation at a few other known locations. The problem is related to environmental engineering. Two methods were used to find the coefficients: a) the statistical method and b) the genetic algorithm.
read
The data contains tabulated values of radiation measured on different dates. Our aim is to condense the data into a linear equation.
The assumed equation is as shown. Note that there is no constant term. The data was analysed in R, and the output values were used to build the required equation. Furthermore, an ANOVA table was built to check the new equation against the existing data. The F value indicated that the model is acceptable, as F < F critical.
In the next step, the regression coefficients were calculated using a genetic algorithm. Briefly speaking, the genetic algorithm starts from a large, randomly generated population and checks the fitness of each member to arrive at a suitable output. As can be seen in this figure and the algorithm beside it,…. (Explain)
Next, the model was built in MATLAB using an initial population of 5000 members. The fitness functions were a) R square and b) E square, and the program was run for both cases.
This is the output for both cases. As can be seen in the graph, the calculation using the E square value generated coefficients that match the actual measurements more closely. Thus these values were adopted.
Here we can see the final equation. As before, an ANOVA table was built to check the F statistic. It was found that F < F critical, hence the model could be used.
Thus in this study, we used two methods to find regression coefficients and found two similar equations. We also noticed that the appropriate statistical measure must be chosen for the problem type to obtain correct results.