This document provides an overview of common summary statistics used to describe distributions, including measures of central tendency (mean, median, mode), spread (variance, standard deviation, range, interquartile range), and shape (skewness and kurtosis). Formulas for calculating the mean, variance, and standard deviation are presented. Guidelines for rounding and interpreting these statistics are also discussed. Examples are provided to demonstrate calculating and interpreting these summary statistics.
sumstats.ppt
1. 12/24/2022 Summary Statistics 1
Summary Statistics
Last week we used stemplots and histograms to
describe the shape, location, and spread of a
distribution. This week we use numerical summaries of
location and spread.
2. 12/24/2022 Summary Statistics 2
Main Summary Statistics by Type
Central location
Mean
Median
Mode
Spread
Variance and standard deviation
Quartiles and Inter Quartile Range (IQR)
Shape
Statistical measures of shape (e.g., skewness and
kurtosis) are available but are seldom used in
practice (not covered)
3. 12/24/2022 Summary Statistics 3
Notation
$n$ = sample size
$X$ = variable
$x_i$ = value of individual $i$
$\sum$ = sum all values (capital sigma)
Illustrative example (sample.sav), data:
21 42 5 11 30 50 28 27 24 52
n = 10
X = age
$x_1 = 21$, $x_2 = 42$, …, $x_{10} = 52$
$\sum x_i = 21 + 42 + \dots + 52 = 290$
4. 12/24/2022 Summary Statistics 4
Sample Mean
$\bar{x} = \frac{\sum x_i}{n} = \frac{1}{n}\sum x_i$
Illustrative example: n = 10 (data & intermediate calculations on prior slide)
$\bar{x} = \frac{1}{n}\sum x_i = \frac{1}{10}(290) = 29.0$
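The sample mean calculation can be sketched in a few lines of Python. This is only an illustration using the ten ages from the notation slide; the names `ages` and `sample_mean` are made up for this example and are not part of the original slides.

```python
# Ages from the illustrative example (sample.sav), n = 10.
ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]


def sample_mean(values):
    """Return the sample mean: the sum of the values divided by n."""
    n = len(values)
    return sum(values) / n


total = sum(ages)          # sum of all x_i
xbar = sample_mean(ages)   # (1/n) * sum of all x_i
print(f"n = {len(ages)}, sum = {total}, mean = {xbar}")
```

The sum and the mean match the hand calculation (290 and 29.0).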
5. 12/24/2022 Summary Statistics 5
Population Mean
Same operation as sample mean, but
based on entire population (N =
population size)
Not available in practice, but important
conceptually
$\mu = \frac{\sum x_i}{N} = \frac{1}{N}\sum x_i$
6. 12/24/2022 Summary Statistics 6
Interpretation of xbar
Sample mean used to predict
an observation drawn at random from a sample
an observation drawn at random from the
population
the population mean
Gravitational center (balance point)
[Figure: axis from 0 to 60 with the balance point marked at mean = 29]
7. 12/24/2022 Summary Statistics 7
Median – a different kind of average
“Middle value”
Covered last week
Order data
Depth of median is (n+1) / 2
When n is odd → middle value
When n is even → average the two middle values
Illustrative example, n = 10 → median has
depth (10+1) / 2 = 5.5
05 11 21 24 27 28 30 42 50 52
median = average of 27 and 28 = 27.5
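The depth rule can be written out as a short Python sketch. The function name `median_by_depth` is hypothetical; the code simply follows the steps listed above (order the data, find depth (n+1)/2, then take the middle value or average the two middle values).

```python
def median_by_depth(values):
    """Median via the depth rule: depth = (n + 1) / 2 in the ordered data."""
    ordered = sorted(values)
    n = len(ordered)
    depth = (n + 1) / 2            # 1-based position of the median
    if n % 2 == 1:
        # Odd n: depth is a whole number, so take that value directly.
        return ordered[int(depth) - 1]
    # Even n: depth ends in .5, so average the two values on either side.
    below = ordered[int(depth) - 1]
    above = ordered[int(depth)]
    return (below + above) / 2


ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]
print(median_by_depth(ages))
```

For the ten ages, the depth is 5.5, so the result is the average of the 5th and 6th ordered values (27 and 28).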
8. 12/24/2022 Summary Statistics 8
Median is “robust”
Robust = resistant to skews and outliers
This data set has a mean (xbar) of 1600:
1362 1439 1460 1614 1666 1792 1867
This data set has an outlier and a mean of 2743:
1362 1439 1460 1614 1666 1792 9867 (9867 is the outlier)
The median is 1614 in both instances.
The median was not influenced by the outlier.
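A quick way to check this comparison is with Python's standard `statistics` module. The variable names below are illustrative only; the two data sets are the ones shown on this slide.

```python
import statistics

clean = [1362, 1439, 1460, 1614, 1666, 1792, 1867]
with_outlier = [1362, 1439, 1460, 1614, 1666, 1792, 9867]

for label, data in (("no outlier", clean), ("with outlier", with_outlier)):
    mean = statistics.mean(data)
    median = statistics.median(data)
    # The mean shifts when 1867 is replaced by 9867; the median does not.
    print(f"{label}: mean = {mean:.0f}, median = {median}")
```

Replacing the largest value changes the mean by more than 1100 units while leaving the median where it was, which is what "robust" means here.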
9. 12/24/2022 Summary Statistics 9
Mode
Mode = value with greatest frequency
e.g., {4, 7, 7, 7, 8, 8, 9} has mode = 7
Used only in very large data sets
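As a small sketch, the mode can be found by counting how often each value occurs. This example uses `collections.Counter` from the Python standard library; the variable names are illustrative. Note that when two values tie for the greatest frequency, `most_common(1)` reports only one of them.

```python
from collections import Counter

data = [4, 7, 7, 7, 8, 8, 9]

counts = Counter(data)                            # value -> frequency
mode_value, mode_count = counts.most_common(1)[0]  # most frequent value
print(f"mode = {mode_value} (appears {mode_count} times)")
```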
10. 12/24/2022 Summary Statistics 10
Mean, Median, Mode
(A) Symmetrical data: mean = median
(B) positive skew: mean > median [mean gets “pulled” by tail]
(C) negative skew: mean < median
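[Figure: three distribution curves. (A) Symmetrical: mean, median, and mode coincide. (B) Positive skew: mode, median, mean from left to right. (C) Negative skew: mean, median, mode from left to right.]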
11. 12/24/2022 Summary Statistics 11
Spread = Variability
Variability = amount values spread
above and below the average
Measures of spread
Range and inter-quartile range
Standard deviation and variance (this week)
12. 12/24/2022 Summary Statistics 12
Range = max – min
The range is rarely used in practice because it
tends to underestimate the population range
and is not robust
13. 12/24/2022 Summary Statistics 13
Standard deviation
Deviation = $x_i - \bar{x}$
Sum of squared deviations = $SS = \sum (x_i - \bar{x})^2$
Sample variance = $s^2 = \frac{SS}{n - 1}$
Sample standard deviation = $s = \sqrt{s^2}$
Most common descriptive measure of spread
14. 12/24/2022 Summary Statistics 14
Standard deviation (formula)
$s = \sqrt{\frac{1}{n - 1}\sum (x_i - \bar{x})^2}$
Sample standard deviation s is the usual estimator of
population standard deviation σ (s² is an unbiased estimator of σ²; s itself is slightly biased).
Population standard deviation is rarely known in practice.
15. 12/24/2022 Summary Statistics 15
New data set (“Metabolic Rates”)
This example is not in your lecture notes
Metabolic rates (cal/day), n = 7
1792 1666 1362 1614 1460 1867 1439
$\bar{x} = \frac{1792 + 1666 + 1362 + 1614 + 1460 + 1867 + 1439}{7} = \frac{11{,}200}{7} = 1600$
17. 12/24/2022 Summary Statistics 17
Standard Deviation Calculation
metabolic.sav – introduced slide 15
Observations ($x_i$)    Deviations ($x_i - \bar{x}$)    Squared deviations ($(x_i - \bar{x})^2$)
1792                    1792 - 1600 = 192               (192)^2 = 36,864
1666                    1666 - 1600 = 66                (66)^2 = 4,356
1362                    1362 - 1600 = -238              (-238)^2 = 56,644
1614                    1614 - 1600 = 14                (14)^2 = 196
1460                    1460 - 1600 = -140              (-140)^2 = 19,600
1867                    1867 - 1600 = 267               (267)^2 = 71,289
1439                    1439 - 1600 = -161              (-161)^2 = 25,921
SUMS                    0*                              SS = 214,870
* Sum of deviations will always equal zero
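The footnote about the deviations summing to zero follows directly from the definition of the sample mean. A short derivation, using the same symbols as the notation slide:

```latex
\sum_{i=1}^{n} (x_i - \bar{x})
  = \sum_{i=1}^{n} x_i - n\bar{x}
  = n\bar{x} - n\bar{x}
  = 0,
\qquad \text{since } \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i .
```

This is why the deviations must be squared before they are summed: without squaring, the positive and negative deviations cancel.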
18. 12/24/2022 Summary Statistics 18
Standard Deviation Metabolic data
(cont.)
Variance ($s^2$):
$s^2 = \frac{SS}{n - 1} = \frac{214{,}870}{7 - 1} = 35{,}811.67 \text{ calories}^2$
Standard deviation ($s$):
$s = \sqrt{s^2} = \sqrt{35{,}811.67} = 189.24 \text{ calories}$
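The whole hand calculation (mean, deviations, SS, variance, standard deviation) can be reproduced with a short Python sketch. Variable names are illustrative; the data are the seven metabolic rates from slide 15, and the n - 1 divisor matches the sample variance formula above.

```python
import math

rates = [1792, 1666, 1362, 1614, 1460, 1867, 1439]  # metabolic rates, n = 7

n = len(rates)
xbar = sum(rates) / n                       # sample mean
deviations = [x - xbar for x in rates]      # x_i - xbar
ss = sum(d ** 2 for d in deviations)        # sum of squared deviations
variance = ss / (n - 1)                     # sample variance s^2
sd = math.sqrt(variance)                    # sample standard deviation s

print(f"mean = {xbar}")
print(f"sum of deviations = {sum(deviations)}")
print(f"SS = {ss}")
print(f"s^2 = {variance:.2f}")
print(f"s = {sd:.2f}")
```

The standard library function `statistics.stdev` uses the same n - 1 divisor, so it is a convenient cross-check on a hand calculation.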
19. 12/24/2022 Summary Statistics 19
General rule for rounding means
and standard deviations
Report the mean to one more decimal place than
the data
To preserve accuracy, intermediate calculations should
carry at least one further decimal place
Illustrative example
Suppose data is recorded with one decimal accuracy (i.e.,
xx.x)
Report mean with two decimal accuracy (i.e., xx.xx)
Carry all intermediate calculations with at least three decimal
accuracy (i.e., xx.xxx)
Even more important: Always use common sense and judgment.
20. 12/24/2022 Summary Statistics 20
TI-30XIIS – about $12
In practice, we often use software
or a calculator to check our
standard deviation
21. 12/24/2022 Summary Statistics 21
Interpretation of Standard Deviation
Larger standard deviation → greater variability
s1 = 15 and s2 = 10 → group 1 has more variability
68-95-99.7 rule – Normal data only
About 68% of data within 1 SD of mean, 95% within 2 SD of
mean, and 99.7% within 3 SD of mean
e.g., if mean = 30 and SD = 10, then about 95% of individuals are
in the range 30 ± (2)(10) = 30 ± 20 = (10 to 50)
Chebychev’s rule – All data
At least 75% of data within 2 SD of mean
e.g., if mean = 30 and SD = 10, then at least 75% of
individuals are in the range 30 ± (2)(10) = (10 to 50)
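Both rules use the same interval, mean ± k·SD; they differ only in how much of the data they promise inside it. The sketch below computes that interval and the Chebychev lower bound. The function names are hypothetical, and the general form of the bound, 1 - 1/k², goes slightly beyond the slide, which states only the k = 2 case (75%).

```python
def k_sd_interval(mean, sd, k):
    """Return (lower, upper) for the interval mean ± k * sd."""
    return (mean - k * sd, mean + k * sd)


def chebyshev_lower_bound(k):
    """Minimum fraction of any data set within k SDs of the mean (k > 1)."""
    return 1 - 1 / k ** 2


# Approximate fractions for Normal data only (68-95-99.7 rule).
NORMAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

low, high = k_sd_interval(mean=30, sd=10, k=2)
print(f"interval: {low} to {high}")
print(f"Normal data: about {NORMAL_RULE[2]:.0%} inside")
print(f"Any data: at least {chebyshev_lower_bound(2):.0%} inside")
```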
22. 12/24/2022 Summary Statistics 22
Quartiles and IQR
Quartiles divide the ordered data into
four equally-sized groups
Q0 = minimum
Q1 = 25th %ile
Q2 = 50th %ile (Median)
Q3 = 75th %ile
Q4 = maximum
23. 12/24/2022 Summary Statistics 23
Rule for quartiles
Find the median → Q2
Middle of lower half of the data → Q1
Middle of upper half of the data → Q3
Bottom half | Top half
05 11 21 24 27 | 28 30 42 50 52
Q1 = 21, Q2 = 27.5, Q3 = 42
IQR = Q3 – Q1 = 42 – 21 = 21
gives spread of middle 50% of the data
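The quartile rule can be sketched in Python as follows. The function name is hypothetical. One assumption is labeled in the code: when n is odd, this sketch leaves the median out of both halves, a choice the slide does not specify (its example has even n, where the question does not arise). Software packages use several different quartile conventions, so results may differ slightly from this rule.

```python
import statistics


def quartiles_by_halves(values):
    """Q1, Q2, Q3 using the 'median of each half' rule."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: for odd n, the median itself is excluded from both halves.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    q1 = statistics.median(lower)
    q2 = statistics.median(ordered)
    q3 = statistics.median(upper)
    return q1, q2, q3


ages = [21, 42, 5, 11, 30, 50, 28, 27, 24, 52]
q1, q2, q3 = quartiles_by_halves(ages)
iqr = q3 - q1
print(f"Q1 = {q1}, Q2 = {q2}, Q3 = {q3}, IQR = {iqr}")
```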
30. 12/24/2022 Summary Statistics 30
Interpretation of boxplots
Location
Position of median
Position of box
Spread
Hinge-spread (box length) = IQR
Whisker-to-whisker spread (range or range minus
the outside values)
Shape
Symmetry of box
Size of whiskers
Outside values (potential outliers)