Dr. Qing Shao's lecture on ordinary differential equations and boundary value problems gives general information on ODE-BVPs and on statistics for data analysis.
2. Statistics for Data Analysis
• Merriam-Webster Definition
A branch of mathematics dealing with the collection, analysis,
interpretation, and presentation of masses of numerical data
• Questions that can be addressed
• What is the most likely value of a measured quantity?
• How certain are we of a measured value?
• How many measurements would be needed to improve certainty?
• Are two measurements actually different from each other?
• Statistics is derived from probability, which strictly applies only to
many repeated measurements
• Often the goal is
measured value ± uncertainty
3. Randomness and Statistical Measures
• In experiments, we assume errors are independent and identically
distributed so analysis can be done
• Some quantities can be calculated directly from observed set of n
values {xi}
• Mean (numpy.mean)
$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$
• Variance
$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
• Standard deviation $\sigma = \sqrt{\sigma^2}$ (numpy.std; pass ddof=1 for the sample form with the $n-1$ denominator)
• Standard error
$SE = \frac{\sigma}{\sqrt{n}}$
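The quantities above can be computed directly with NumPy. This is a minimal sketch using a hypothetical set of repeated measurements (the values are illustrative, not from the lecture):

```python
import numpy as np

# Hypothetical repeated measurements of the same quantity (illustrative values)
x = np.array([9.8, 10.1, 9.9, 10.2, 10.0])
n = len(x)

mean = np.mean(x)            # sample mean, x-bar
var = np.var(x, ddof=1)      # sample variance with the (n - 1) denominator
std = np.std(x, ddof=1)      # sample standard deviation
se = std / np.sqrt(n)        # standard error of the mean
```

Note that `ddof=1` is needed because NumPy's default denominator is $n$, whereas the slide's definitions use $n-1$.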
4. Normal Distribution
• There are many functions that describe the distribution of expected
observations
• For a continuous variable, the probability of observing a value x between $x_a$ and $x_b$ is given by
$p = \int_{x_a}^{x_b} f(x)\,dx$
• Many exist (log-normal, Weibull, etc.)
• The most relevant to experiments is the normal distribution: random
variation about a mean
• The Gaussian describes the distribution:
$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \bar{x})^2}{2\sigma^2}\right)$
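The integral of the Gaussian between two bounds can be evaluated with SciPy's cumulative distribution function, avoiding numerical integration. A minimal sketch with a hypothetical mean and standard deviation (illustrative values, not from the lecture):

```python
from scipy.stats import norm

# Hypothetical normal distribution: mean 10.0, standard deviation 0.16
mu, sigma = 10.0, 0.16

# Probability of observing x between xa and xb:
# p = integral of f(x) dx = CDF(xb) - CDF(xa)
xa, xb = 9.8, 10.2
p = norm.cdf(xb, loc=mu, scale=sigma) - norm.cdf(xa, loc=mu, scale=sigma)
```

Here `norm.cdf` evaluates $\int_{-\infty}^{x} f(t)\,dt$, so the difference of two CDF values gives the probability between the bounds.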
5. Confidence Intervals (a range of estimates for an unknown variable)
• The actual confidence interval depends on the number of trials (n)
and how sure we want to be (CI)
• We select confidence interval (CI) (default = 95%), meaning that we
want there to be a 95% chance of actual value being in that interval
• Alpha (α) = 1 − CI (0.05 here)
• Confidence interval is found using the t-statistic, which is tabulated as
a function of α and degrees of freedom (n−1 for us)
• Use α/2 because the t-statistic applies to one side of the distribution
• In Scipy, scipy.stats.t.ppf(p,nu)
• https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html
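The steps above can be sketched in Python using `scipy.stats.t.ppf`. The measurement values are hypothetical, chosen only to illustrate the calculation:

```python
import numpy as np
from scipy.stats import t

# Hypothetical repeated measurements (illustrative values)
x = np.array([9.8, 10.1, 9.9, 10.2, 10.0])
n = len(x)

mean = np.mean(x)
se = np.std(x, ddof=1) / np.sqrt(n)   # standard error of the mean

ci = 0.95                  # desired confidence level
alpha = 1 - ci             # 0.05
# Two-sided interval: put alpha/2 in each tail, with n - 1 degrees of freedom
t_crit = t.ppf(1 - alpha / 2, n - 1)

half_width = t_crit * se   # report: mean ± half_width
```

`t.ppf` is the inverse CDF (percent-point function), so `t.ppf(1 - alpha/2, n - 1)` returns the t-value that leaves α/2 in the upper tail.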
7. Multiple Variables and Covariance
• Many experiments depend on multiple variables
• Covariance is a measure of how uncertainties are related
$\sigma^2(x, y) = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$
• If $\sigma^2(x, y) = 0$, the uncertainties are not directly related (although the variables
themselves still might be)
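The covariance defined above (with the $n-1$ denominator) is what `numpy.cov` computes by default. A minimal sketch with hypothetical paired measurements (illustrative values):

```python
import numpy as np

# Hypothetical paired measurements (illustrative values)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.9, 6.2, 7.8])

# np.cov returns the 2x2 covariance matrix; the [0, 1] entry is the
# covariance of x and y, using the (n - 1) denominator by default
cov_xy = np.cov(x, y)[0, 1]
```

The diagonal entries of the same matrix are the variances of x and y, matching the earlier variance formula.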
• If there is no covariance, uncertainty in a measured quantity can be
propagated to derived quantities
• It can be shown mathematically that the variance of z, which is a
function of N variables xi without covariance, is given by: