Statistics is a critical tool for robustness analysis, measurement system error analysis, test data analysis, probabilistic risk assessment, and many other tasks in the engineering world. The basic applications described here relate directly to everyday engineering problems: they help us solve those problems, reach better solutions, and work more efficiently on real-life engineering tasks.
Statistics in real life engineering
Application:
• Determining the exact measurements needed for a structure to be secure
• To aid quality control and decide on improvements
• To help in the design and build of a structure or product
• To work out the time a job will take and how many people are needed
• To find the probability of risk
• To define the process of reducing the loss
• To find the error
Outline of Applications
• Histogram and statistical data sets
• The Mode of Statistical Dataset
• The Mean
• The Median
• Deviation and Standard Deviation (σ)
• The probability
• Bayes’ Theorem
• Application Of Regression Analysis in the Field of EEE
Histogram and statistical data sets
• A histogram is the most frequently used form for representing a dataset.
It is often referred to as the “frequency distribution” of the dataset.
• Procedure for establishing the histogram of a dataset:
• Determine the largest and smallest numbers in the raw data.
• Determine an appropriate range (the difference between the largest and smallest numbers) to present the data.
• Divide the range into a convenient number of intervals of the same size. If this is not feasible, use intervals of different sizes or open intervals.
• Determine the number of observations falling into each interval, i.e. find the data frequency.
• Example: Electricity - consumption (billion kWh) in Bangladesh.
Country      2000    2001    2002    2003    2004    2005    2006    2012
Bangladesh   11.04   11.22   12.55   14.26   14.25   15.3    16.2    23.94
• We plot the data as a frequency polygon to show the consumption graphically.
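The procedure above can be sketched in Python as follows. This is only an illustration: it uses the consumption values listed in the example, and the choice of three equal-width intervals is an assumption made here, not something the slides specify.

```python
# Electricity consumption values (billion kWh) listed in the example above.
consumption = [11.04, 11.22, 12.55, 14.26, 14.25, 15.3, 16.2, 23.94]


def frequency_table(values, num_intervals):
    """Count how many values fall into each of num_intervals equal-width intervals."""
    smallest, largest = min(values), max(values)
    if largest == smallest:
        raise ValueError("all values are equal, so the range is zero")
    width = (largest - smallest) / num_intervals
    counts = [0] * num_intervals
    for v in values:
        # The largest value belongs to the last interval, so clamp the index.
        index = min(int((v - smallest) / width), num_intervals - 1)
        counts[index] += 1
    return smallest, width, counts


# num_intervals = 3 is an illustrative assumption, not a value from the text.
smallest, width, counts = frequency_table(consumption, 3)
for i, count in enumerate(counts):
    low = smallest + i * width
    print(f"{low:.2f} to {low + width:.2f} billion kWh: {count} value(s)")
```

Each printed line corresponds to one bar of the histogram; joining the bar midpoints would give the frequency polygon.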
The Mode of Statistical Dataset
• A statistical dataset can usually be represented by its mode.
• The mode is the number that appears most frequently in the dataset.
Example: In an experiment we use the output value that occurs most often.
[Figure: frequency polygon of electricity consumption against year, 2000 to 2012; vertical axis from 0 to 30]
For instance:
• The set 2, 2, 5, 7, 9, 9, 9, 10, 10, 11, 12, 18 has mode 9.
• The set 1.75, 1.83, 1.85, 1.95, 1.97, 2.03, 2.03, 2.06, 2.13, 2.15, 2.15, 2.25, 2.35, 2.70, 2.70 has three modes, 2.03, 2.15 and 2.70, as each of these numbers appears twice in the set.
• The set 3, 5, 8, 10, 12, 15 has no mode.
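The three cases above can be reproduced with a short Python sketch. The function name `modes` and the convention of returning an empty list when no value repeats are choices made for this illustration; the input is assumed to be non-empty.

```python
from collections import Counter


def modes(data):
    """Return the most frequent value(s), or [] if no value repeats."""
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        return []  # every value appears once: treated here as "no mode"
    return sorted(value for value, count in counts.items() if count == highest)


single = modes([2, 2, 5, 7, 9, 9, 9, 10, 10, 11, 12, 18])
triple = modes([1.75, 1.83, 1.85, 1.95, 1.97, 2.03, 2.03, 2.06,
                2.13, 2.15, 2.15, 2.25, 2.35, 2.70, 2.70])
none = modes([3, 5, 8, 10, 12, 15])
print(single, triple, none)
```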
The Mean
• The “Mean” of a dataset is the arithmetic average of the data in the set
• It is a good way to represent the “Central tendency” of the set
• Mean:
$$\text{Mean} = \frac{\text{sum of all data}}{\text{total number of data}} = \frac{\sum_{i=1}^{n} X_i}{n}$$
Example: In an experiment we can use the mean of the measured outputs as a representative value.
Say the outputs are 6.5 V, 7.3 V and 5.6 V.
So the mean is:
$$\frac{6.5 + 7.3 + 5.6}{3} = \frac{19.4}{3} \approx 6.47\ \text{V}$$
We will use 6.47 V as the output.
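As a quick check of the arithmetic, the same calculation can be written in Python. The helper name `mean` and the variable `outputs_v` are illustrative only.

```python
def mean(values):
    """Arithmetic average: sum of all data divided by the number of data points."""
    return sum(values) / len(values)


outputs_v = [6.5, 7.3, 5.6]  # measured outputs in volts, from the example
print(f"mean output = {mean(outputs_v):.2f} V")
```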
The Median
• When the data in a set contain a significant amount of “out-of-range” values (outliers), the median, meaning the “central” value of the ordered data, is used to show the “central tendency” of the set.
• Example: During an experiment we need a central value that represents the output, so we use the median.
• Say the outputs are: 2 V, 3 V, 5 V, 7 V, 9 V, 11 V, 73 V
• We take the value in the centre of the ordered set, i.e. 7 V, as the median representing the central tendency of the dataset.
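The sketch below computes the median for the outputs above and compares it with the mean, which shows why the median is preferred when one value (73 V) lies far from the rest. Averaging the two middle values for an even number of points is the usual convention; it is added here for completeness and is not stated in the slides.

```python
def median(values):
    """Middle value of the sorted data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2


outputs_v = [2, 3, 5, 7, 9, 11, 73]  # volts, from the example
median_v = median(outputs_v)
mean_v = sum(outputs_v) / len(outputs_v)
print(f"median = {median_v} V, mean = {mean_v:.2f} V")
```

The mean is pulled to roughly 15.7 V by the single 73 V reading, while the median stays at 7 V.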
Deviation and Standard Deviation (σ)
• While the “mean” of a dataset indicates its “central tendency,” it is often also necessary to measure how the data in the set “deviate” from the mean value.
Example: When different loads are used in a circuit, the frequency response curves vary. From the output values we can calculate the variance and the standard deviation to understand the variability of the frequency output across the loads.
X (MHz)   f    fX    (X - avg(X))^2   f(X - avg(X))^2
21        3    63    18.4             55.29
24        6    144   1.67             9.96
27        12   324   2.92             35.04
Total     21   531                    100.2
• Mean: avg(X) = 531/21 ≈ 25.29 MHz
• From the data, Var = 100.2/21 = 4.77 MHz^2
• Standard deviation = √Var = 2.18 MHz
• As Var approaches 0, the data become more homogeneous.
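The frequency-table calculation can be reproduced with the sketch below. It divides by the total count (population variance), as the example does. The code works with unrounded intermediate values, so its results may differ in the last digit from figures built up from rounded table entries.

```python
import math

# Frequency table from the example: value in MHz and how often it occurred.
values_mhz = [21, 24, 27]
counts = [3, 6, 12]

n = sum(counts)
mean_mhz = sum(c * x for x, c in zip(values_mhz, counts)) / n

# Population variance: frequency-weighted squared deviations divided by n.
variance = sum(c * (x - mean_mhz) ** 2 for x, c in zip(values_mhz, counts)) / n
std_dev = math.sqrt(variance)

print(f"n = {n}, mean = {mean_mhz:.2f} MHz")
print(f"variance = {variance:.2f} MHz^2, standard deviation = {std_dev:.2f} MHz")
```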
The probability
• Probability is a measure of the chance or likelihood that a particular outcome will occur. It is expressed as a number on a scale from 0 to 1. A probability near 0 indicates that an outcome is unlikely to occur, and a probability close to 1 indicates that an outcome is likely to occur.
• Probability is used heavily in communication theory, networking, detection and estimation.
• An understanding of random variables and their applications is central to these industries.
• Statistics is used for semiconductor process control, yield management, reliability analysis, etc.
• It lets us understand and calculate the probability of risk in experimental work.
Applications of Probability and Statistics
Computer Science:
Machine Learning
Data Mining
Simulation
Image Processing
Computer Vision
Computer Graphics
Visualization Software
Testing Algorithms
For example:
• In a factory, machine A produces 60% of the output and machine B produces the rest. On average, 20 items in 1000 produced by machine A are defective and 12 items in 500 produced by machine B are defective. In a day’s run, the two machines produce 10,000 items. An item is drawn at random from a day’s output and is found to be defective. What is the probability that it was produced by machine A? By machine B?
• Solution:
• Let us define the following events:
• B1: item produced by machine A, P(B1) = 0.60
• B2: item produced by machine B, P(B2) = 0.40
• A: selected item is defective, with P(A|B1) = 20/1000 = 0.02 and P(A|B2) = 12/500 = 0.024
• P(A) = 0.60 × 0.02 + 0.40 × 0.024 = 0.0216
• Probability that the item came from machine A: P(B1|A) = 0.60 × 0.02 / 0.0216 ≈ 0.556
• Probability that the item came from machine B: P(B2|A) = 0.40 × 0.024 / 0.0216 ≈ 0.444
• Expected number of defective items in a day = 10,000 × 0.0216 = 216
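The solution can be checked with a few lines of Python. The dictionary names are illustrative; the numbers come from the problem statement.

```python
# Share of the day's output produced by each machine.
p_machine = {"A": 0.60, "B": 0.40}
# Defect rate of each machine: 20 in 1000 for A, 12 in 500 for B.
p_defect_given = {"A": 20 / 1000, "B": 12 / 500}

# Law of total probability: overall chance that a random item is defective.
p_defect = sum(p_machine[m] * p_defect_given[m] for m in p_machine)

# Bayes' theorem: chance that a defective item came from each machine.
posterior = {m: p_machine[m] * p_defect_given[m] / p_defect for m in p_machine}

expected_defective = 10_000 * p_defect

print(f"P(defective) = {p_defect:.4f}")
for m, p in posterior.items():
    print(f"P(machine {m} | defective) = {p:.3f}")
print(f"expected defective items per day = {expected_defective:.0f}")
```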
Application Of Regression Analysis in the Field of EEE
Regression analysis is a technique for studying the dependence of one variable on one or more other variables, with a view to estimating or predicting the average value of the dependent variable from known or fixed values of the independent variables.
Regression models are applied in many essential sectors of the EEE field. One such application is estimating the future output of machinery and calculating system losses.
EXAMPLE
Power produced by a nuclear reactor:
How much power will the reactor provide in the 20th week?
Here the equation is
y = a + bx + e
By calculation we find
b = -0.83, a = 62.48
So the regression model is
y = 62.48 + (-0.83)x + e
where e = ±9.26.
With this regression model we can estimate the future performance of the reactor.
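A minimal sketch of how such a model is obtained and used is given below. The reactor's weekly data are not reproduced in the slides, so `fit_line` is demonstrated on made-up numbers that are purely hypothetical. The week-20 estimate assumes the coefficients a = 62.48 and b = -0.83 stated in the example and treats e = ±9.26 as an error band around the estimate; both are assumptions of this sketch.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, x):
    """Point estimate from the fitted line, ignoring the error term e."""
    return a + b * x


# Hypothetical data, only to show that fit_line recovers a known line.
demo_weeks = [1, 2, 3, 4]
demo_power = [61.0, 60.0, 59.0, 58.0]
a_demo, b_demo = fit_line(demo_weeks, demo_power)

# Coefficients and error band as stated in the example (assumed here).
a, b, error = 62.48, -0.83, 9.26
week_20 = predict(a, b, 20)
print(f"week 20 estimate: {week_20:.2f} (from {week_20 - error:.2f} to {week_20 + error:.2f})")
```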
Bayes’ Theorem
Named for Thomas Bayes, an English clergyman and mathematician, Bayesian logic is a
branch of logic applied to decision making and inferential statistics that deals
with probability inference: using the knowledge of prior events to predict future events.
Bayes first proposed his theorem in his 1763 work (published two years after his death in
1761), An Essay Towards Solving a Problem in the Doctrine of Chances. Bayes' theorem
provided, for the first time, a mathematical method that could be used to calculate, given
occurrences in prior trials, the likelihood of a target occurrence in future trials. According
to Bayesian logic, the only way to quantify a situation with an uncertain outcome is
through determining its probability.
Bayes’ theorem is used in many sectors of Computer Science and Engineering, and it
plays one of the most important roles in Artificial Intelligence, where it enables a
machine to judge its decisions. In Gmail, for example, most mail that contains the word
“discount” is considered spam.
Example
What is the probability that a spam mail contains the word “discount”?
Given,
P(a mail contains the word “discount”) = 0.05
P(a mail is spam) = 0.1
P(a mail is spam given it contains “discount”) = 0.7
With a = “the mail is spam” and b = “the mail contains the word discount”, Bayes’ theorem gives
$$P(b \mid a) = \frac{P(a \mid b)\,P(b)}{P(a)}$$
P(a spam mail contains the word “discount”)
$$= \frac{(0.7)(0.05)}{0.1} = 0.35$$
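The arithmetic of the example can be checked with a small helper that applies the formula P(b|a) = P(a|b)P(b)/P(a) to the three given values. The function name `bayes` and its consistency check are illustrative additions: a result above 1 means the three inputs cannot all hold at once, which is a useful guard when the conditional is assigned to the wrong event.

```python
def bayes(p_a_given_b, p_b, p_a):
    """Return P(b|a) = P(a|b) * P(b) / P(a)."""
    if p_a <= 0:
        raise ValueError("P(a) must be positive")
    result = p_a_given_b * p_b / p_a
    if result > 1:
        raise ValueError("inconsistent inputs: the result exceeds 1")
    return result


# Values from the example: P(a|b) = 0.7, P(b) = 0.05, P(a) = 0.1.
result = bayes(0.7, 0.05, 0.1)
print(f"P(b|a) = {result:.2f}")
```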