6. Section 1: Introduction to Statistics
✔Definition of Statistics
✔History & Timeline of Statistics
✔Types of Statistics
✔What is Data?
✔Types and sources of Data
✔What is Big Data?
✔Note about R and RStudio
8. Section 3: Estimation & Hypothesis testing
✔Basic Definitions of concepts and examples
✔The Science of Sampling
✔Estimation Theory
✔Normal Distribution of Sampling Distributions
✔Basic Definitions of concepts of Hypothesis testing
✔Essential Steps in Hypothesis Testing
✔Common Test Statistics and their Sampling Distributions
12. What is Statistics?
"Statistics is a branch of applied mathematics that deals with the
collection, presentation, summarization, analysis, and
interpretation of numerical data"
14. Two important aspects
1. The definition does not mention "probability" explicitly
2. Data can come in a variety of forms
15. Timeline of Statistics
1. How did ancient Babylonians feed their people?
2. How did Florence Nightingale transform nursing?
3. How does one know that smoking may kill?
Source: Inaugural lecture presented by Prof. Ray Okafor titled "Statistics: Data, Uncertainty, Probability, Information, Decision-making" on Wednesday, 8th August 2018
16. History of Statistics
⮚The word "statistics" was coined from the
German word "Statistik", which means "the
analysis of data about the state". The
word was coined by Gottfried Achenwall, a
German statistician-cum-philosopher, in 1749
⮚The first use of the word "statistics" in English
was in 1791 by Sir John Sinclair in his Statistical
Account of Scotland.
1. https://www.pinterest.com/pin/533535887079130693/
2. https://www.hetwebsite.net/het/profiles/sinclair.htm
17. History of Statistics
⮚Probability got its first mention as a subfield of
Statistics in 1303 when the Chinese introduced a
diagram showing the binomial coefficients up to
power eight
⮚In 1761, Thomas Bayes, an English clergyman once
described as an amateur mathematician, proved a
theorem (Bayes' theorem) that is the cornerstone
of conditional probability
https://en.wikipedia.org/wiki/Thomas_Bayes
18. History of Statistics
⮚In 1805, the French mathematician Legendre introduced
the "method of Least Squares". This is a widely used method
for fitting a curve to a set of data.
⮚In 1808, another statistical milestone was achieved with the derivation
of the normal distribution by two men, Gauss (German) and Laplace
(French)
⮚In 1835, the Belgian Quetelet's "Treatise on Man" introduced
the field of social science statistics
19. History of Statistics
⮚ In 1840, William Farr (a
British epidemiologist) presented a work
that allowed epidemics to be tracked
and diseases compared.
⮚ The beginning of the modern study of epidemics
itself is associated with a "cholera map"
produced in 1854 by John Snow, a
British physician
https://paradigmchange.me/wp/crisis/
https://en.wikipedia.org/wiki/Thomas_Bayes
https://www.atlasobscura.com/places/broad-street-cholera-pump
20. History of Statistics
⮚In 1859, Florence Nightingale developed a
circular chart (a polar area diagram, a close
relative of the pie chart) showing monthly
casualties from the Crimean War.
⮚In 1877, the British polymath (mathematician,
statistician, and eugenicist) Sir Francis Galton
introduced the concept of "regression to the
mean" in his study of heredity and inheritance.
https://en.wikipedia.org/wiki/Florence_Nightingale
21. History of Statistics
⮚In 1894, the British statistician and biostatistician Karl Pearson
derived the coefficient of correlation named after him
⮚In 1837, the French mathematician, physicist, and engineer
Siméon Denis Poisson introduced the discrete probability distribution
(Poisson distribution) named after him
⮚The famous t-test (Student's t-test) was introduced in 1908 by William
Gosset, the chief brewer for Guinness in Dublin
22. History of Statistics
⮚Sir Ronald Fisher (1890 – 1962) was a British
statistician, evolutionary biologist and geneticist.
He is generally regarded as the father of modern
statistics.
⮚His contributions to statistics include the design of
experiments, ANOVA, the F-test procedure, and the
method of maximum likelihood.
https://en.wikipedia.org/wiki/Ronald_Fisher
23. History of Statistics
• In 1972, Sir David Cox, a British statistician, gave
the statistics world his "proportional hazards model" for the analysis
of complex (censored) survival data
• In 1977, John Tukey, the talented American statistician and
genius, introduced innovative tools for the exploration of data,
notably the boxplot.
• In 1997, the term "Big Data" first appeared in print.
24. Types of Statistics
Descriptive: a way to organize,
represent, and describe a
collection of data using tables,
graphs, and summary measures
Inferential: a method that
allows us to use information
collected from a sample to
make decisions, predictions, or
inferences about a population
https://www.youtube.com/watch?v=qHjF5UDSyBU
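A minimal sketch of the two types side by side, written in Python rather than R. The height values are made up for illustration, and the 1.96 multiplier is a normal approximation (for a sample this small, a t critical value would be the more careful choice).

```python
import math
import statistics

# Hypothetical sample: heights (cm) of 10 students drawn from a larger class.
sample = [160, 165, 158, 172, 168, 175, 162, 170, 166, 164]

# Descriptive statistics: summarize the data we actually have.
n = len(sample)
mean = statistics.mean(sample)
median = statistics.median(sample)
sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: use the sample to say something about the population.
# Approximate 95% confidence interval for the population mean,
# assuming the normal critical value 1.96 is acceptable here.
se = sd / math.sqrt(n)  # standard error of the mean
ci_low = mean - 1.96 * se
ci_high = mean + 1.96 * se

print(f"Descriptive: n={n}, mean={mean}, median={median}, sd={sd:.2f}")
print(f"Inferential: approx. 95% CI for population mean = ({ci_low:.2f}, {ci_high:.2f})")
```

The first print line describes only the 10 observed values; the second makes a statement about the wider population the sample came from.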
27. What is Data?
An old-fashioned definition says:
"Data (variable) is a quantity that changes values"
⮚ Data is the raw material for information synthesis
⮚ Statisticians deploy statistical methodologies to
synthesize information from data
⮚ Information is everything
⮚ Information yields knowledge, and knowledge is power and security
28. Types of Data
Data is numeric
(quantitative) if it takes
numbers
Data is nonnumeric
(qualitative) if it takes
labels or categories
rather than numbers
https://byjus.com/maths/types-of-data-in-statistics/
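The distinction can be sketched in a few lines of Python. The record below and the helper name data_type are hypothetical, chosen only to illustrate the idea. Note that the check looks at how a value is stored, so a category coded as a number (for example 1 = male, 2 = female) would still need human judgment.

```python
from numbers import Number

# Hypothetical survey record for one respondent.
record = {
    "age": 21,
    "height_cm": 168.5,
    "gender": "female",
    "department": "Statistics",
}

def data_type(value):
    """Return 'quantitative' for numeric values, 'qualitative' otherwise."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, Number) and not isinstance(value, bool):
        return "quantitative"
    return "qualitative"

types = {name: data_type(value) for name, value in record.items()}

for name, kind in types.items():
    print(f"{name}: {kind}")
```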
30. Sources of Data
Primary data sources include
information collected and processed
directly by the researcher, such as
observations, surveys, interviews,
and focus groups.
Secondary data sources include
information retrieved from
preexisting sources: research
articles, Internet or library searches,
etc.
https://analysisproject.blogspot.com/2013/12/sources-of-data-and-information.html?m=1
32. The Buzzword: Big Data
https://letsthinkeasy.com/tag/empowered-edge/
33. 3 characteristics of Big Data
⮚ Volume
⮚ Velocity
⮚ Variety
https://gijn.org/2014/09/09/what-is-big-data/
34. About R and RStudio
⮚ Show the RStudio interface
⮚ Explain the idea behind the
group work
https://datacarpentry.org/genomics-r-intro/01-introduction/index.html
35. Group Members
Group 1
❖ Ayodele John
❖ Abimbola Eniola
❖ Pat-Ibitayo Marvel
Group 2
❖ Agbeleye Tomiloba
❖ Achief Georgina
❖ Daodu Precious
Group 3
❖ Umogun Daniel
❖ Toyon Boluwatife
❖ Unigwe Sophia
Group 4
❖ Ade-omonijo Collins
❖ Idehen Martha
❖ Ogidan Precious
Group 5
❖ Ogungbade Iyanuoluwa
❖ Oluseyi Daniella
❖ Usiade Gloria
36. Check this out!!!
✔ What is the difference between Statistics and a statistic? Support
your claims with relevant examples.
✔ Examples of the different types of data
✔ What are the steps in carrying out a sample survey?
✔ Differentiate between frequentist and Bayesian statistics
✔ Explore the different Nigerian Federal Government sources
of secondary data
37. Group Assignment
Create a 2-page report
✔ What is R? Why use R?
✔ Brief history of R
✔ Features of R
✔ Four (4) basic principles of free software
38. Ending note
"The sexiest profession of the next decade will be the statistician"
~ Hal Varian (2008)