Introduction to Elementary Statistics


A slide presentation made for instruction. It covers concepts and facts that introduce statistics.


  1. 1. Elementary Statistics Krizza Joy M. dela Cruz NEUST-CoEd
  2. 2. Introduction to Statistics <ul><li>The word “statistics” is derived from the Latin word “status” or the Italian word “statista”, both of which refer to a political state or government. </li></ul>
  3. 3. Two Basic Meanings of Statistics <ul><li>Plural Sense (General definition) </li></ul><ul><li>Singular Sense of Statistics </li></ul><ul><li> (Scientific Definition) </li></ul>
  4. 4. Plural Sense of Statistics <ul><li>statistics as </li></ul><ul><ul><ul><li>actual numbers derived from data </li></ul></ul></ul><ul><ul><ul><li>numerical observations of any kind </li></ul></ul></ul><ul><ul><ul><li>quantitative information </li></ul></ul></ul><ul><ul><ul><li>statistical data </li></ul></ul></ul>
  5. 5. Characteristics <ul><ul><li>Statistical data are aggregates of facts </li></ul></ul><ul><ul><li>Statistical data are numerically expressed </li></ul></ul><ul><ul><li>Data must be collected in a systematic manner </li></ul></ul><ul><ul><li>Figures must be accurate to a reasonable degree or standard </li></ul></ul><ul><ul><li>Statistics are collected for a predetermined purpose </li></ul></ul>
  6. 6. Examples: <ul><li>Figures of birth and death </li></ul><ul><li>Statistics in enrollment and drop-outs </li></ul><ul><li>Tax returns </li></ul><ul><li>Import-export data </li></ul><ul><li>Statistics on unemployment </li></ul><ul><li>Frequency of failures in school </li></ul><ul><li>Population counts </li></ul>
  7. 7. Singular Sense of Statistics <ul><li>statistics as </li></ul><ul><ul><ul><li>a method of analysis </li></ul></ul></ul><ul><ul><ul><li>methods used in arriving at quantitative information or statistical data </li></ul></ul></ul><ul><ul><ul><li>a theory and method of collecting, organizing, presenting, analyzing, and interpreting data </li></ul></ul></ul><ul><ul><ul><li>statistical methods </li></ul></ul></ul>
  8. 8. Stages of Statistical Inquiry <ul><li>Collection of data </li></ul><ul><li>Presentation of data </li></ul><ul><li>Analysis of data </li></ul><ul><li>Interpretation of data </li></ul>
  9. 9. Application of Statistics <ul><li>In education, statistics </li></ul><ul><li>gives information about changes in a school’s population (enrollment and dropout rates) </li></ul><ul><li>helps in processing evaluations and surveys given to improve the school system </li></ul><ul><li>determines the relationship of educational performance to other factors such as socioeconomic background </li></ul><ul><li>analyzes achievements and grades and aids in the preparation of tests (proficiency level) </li></ul>
  10. 10. <ul><li>In business and economics/government, statistics </li></ul><ul><li>plays an important role in market feasibility studies for new products </li></ul><ul><li>forecasts business trends (potentials of investments) </li></ul><ul><li>helps in the control and maintenance of product quality </li></ul><ul><li>improves the employer-employee relationship (labor relations) </li></ul><ul><li>helps financial analysts make investment decisions (human resource allocation) </li></ul><ul><li>analyzes the return on assets before taxes </li></ul><ul><li>provides organized records on the cost of living, taxes, wages and material resources, which are necessary for intelligent decision making </li></ul>
  11. 11. <ul><li>In sociology and population dynamics, statistics </li></ul><ul><li>helps in solving problems involving man and society through statistical studies on population movement, mortality, morbidity, urban planning, and labor movement </li></ul><ul><li>is used for surveys designed to collect early returns on election day to forecast the outcome of an election </li></ul>
  12. 12. <ul><li>In research, statistics </li></ul><ul><li>is used to test differences, effectiveness, impact, relationship or independence among variables </li></ul><ul><li>provides researchers with valuable statistical designs for surveys and experiments, which may lead to new discoveries </li></ul>
  13. 13. History of Statistics
  14. 14. Ancient times (3000 BC – 27 BC) <ul><li>Simple forms of statistics have been used since the beginning of civilization, when pictorial representations and other symbols were used to record numbers of people, animals, and inanimate objects on skins, slabs, sticks of wood, and walls of caves. </li></ul>
  15. 15. <ul><li>The population was recorded in Babylonia and in China. </li></ul><ul><li>The Sumerians counted their citizens for taxation purposes. </li></ul><ul><li>The EGYPTIANS analyzed the population and material wealth of their country before beginning to build the pyramids. Early statistical records were written on PAPYRUS. </li></ul><ul><li>In Biblical times, censuses were undertaken by Moses in 1491 BC and by David in 1017 BC. </li></ul><ul><li>In China, under the ZHOU DYNASTY, from 1027 to 256 BC, population censuses and city registrations had become normal instruments of public administration to evaluate the number of persons available for the army and taxation. </li></ul>
  16. 16. <ul><li>The ancient Greeks held censuses to be used as bases for taxation as early as 594 BC. </li></ul><ul><li>In Northern Hindustan, KING ASOKA ordered a census between 270 and 232 BC. </li></ul><ul><li>In 27 BC in ancient Rome, the Roman censuses, designed for both taxation and military conscription, were the responsibility of local censors. The ROMAN EMPIRE was the first government to gather extensive data about the population, area, and wealth of the territories that it controlled. SERVIUS TULLIUS, the 6th King of Rome, is credited with initiating the first census and instituting the gathering of population data. </li></ul>
  17. 17. Middle ages (300 AD – 1400 AD) <ul><li>Following the Norman conquest of England, William I, King of England, ordered the compilation of information on land and resources, which was completed in 1086. This compilation, “THE DOMESDAY BOOK”, also known as the BOOK OF WINCHESTER, is the first landmark in British statistics. </li></ul><ul><li>Registrations of land ownership and of manpower for wars were made. </li></ul><ul><li>In the 13th century, parish tax lists registered those who were subject to tax. Later on, births, deaths, baptisms and marriages had to be registered. </li></ul>
  18. 18. 17th century <ul><li>Gerolamo Cardano wrote “Liber de Ludo Aleae” (The Book on Games of Chance), first published in 1663 and the first known study of the principles of probability. Thus started the tale of probability and statistics. </li></ul><ul><li>The Chevalier de Méré posed the famous “Problem of Points” to Blaise Pascal; the work that followed marked the beginning of the mathematics of probability. </li></ul>
  19. 19. 18th century <ul><li>Gottfried Achenwall (1719-1772) is noted as the first to introduce the word “statistics”. </li></ul><ul><li>Zimmerman and Sinclair introduced and popularized the name “statistics” in their books. </li></ul><ul><li>The method of least squares was first described by Carl Friedrich Gauss around 1794. </li></ul>
  20. 20. <ul><li>Jakob Bernoulli wrote a combinatorial mathematical work entitled “Ars Conjectandi”, which was published in 1713. </li></ul><ul><li>Abraham de Moivre contributed the idea of approximation and the law of error, which is similar to the standard deviation. </li></ul><ul><li>During this century, statistics was used in the study entitled “Political Arrangement of the Modern States of the Known World”. </li></ul><ul><li>Also in this century, the term “statistics” designated the systematic collection of demographic and economic data of the state. </li></ul>
  21. 21. 19th century <ul><li>Laplace’s “Théorie analytique des probabilités” of 1812 further supported and stabilized the theory of probability. </li></ul><ul><li>The meaning of “statistics” broadened to include the discipline concerned with the collection, summary, and analysis of data. Social scientists used statistical reasoning and probability models to advance the new sciences of experimental psychology and sociology; physical scientists used statistical reasoning and probability models to advance the new sciences of thermodynamics and statistical mechanics. </li></ul>
  22. 22. <ul><li>Lambert Adolphe Quetelet applied the theory of probability to anthropological measurements and extended the same principle to the physiological, psychological, physical and chemical fields. Because of his continued emphasis on the importance of using statistical methods, he is referred to as the “Father of Modern Statistics”. He also established the Central Commission for Statistics. </li></ul><ul><li>Francis Galton (1822-1911) developed the use of percentiles and the correlation method. </li></ul>
  23. 23. <ul><li>Karl Pearson (1857-1936) originated basic statistical concepts and procedures such as the standard deviation, the random walk and the chi-square test. </li></ul><ul><li>Ronald Aylmer Fisher (1890-1962) contributed to the field of statistics the F-test (Fisher test), analysis of variance (ANOVA) and analysis of covariance in inferential statistics. </li></ul>
  24. 24. 20th century <ul><li>At present, statistics is a reliable means of accurately describing the values of economic, political, social, biological and physical data and serves as a tool to correlate and analyze such data. </li></ul><ul><li>Much data can be approximated accurately by certain distributions, and the results of probability distributions can be used in analyzing statistical data. </li></ul><ul><li>Statistics is widely employed in government, business, and the natural and social sciences. Electronic computers have expedited statistical computation and have allowed statisticians to develop “computer-intensive” methods. </li></ul>
  25. 27. Division of Statistics <ul><li>Descriptive Statistics </li></ul><ul><li>Inferential Statistics </li></ul>
  26. 28. Descriptive Statistics <ul><li>statistical procedure concerned with describing the characteristics and properties of a group of persons, places or things, based on easily verifiable facts. It covers the organization, presentation, description, and interpretation of the data gathered. It includes the study of relationships among variables. </li></ul>
  27. 29. Inferential Statistics <ul><li>statistical procedure used to draw inferences about the population on the basis of information obtained from the sample using the techniques of descriptive statistics. </li></ul>
  28. 30. Descriptive statistics <ul><li>Describing what is or what the data shows </li></ul><ul><li>Provides summaries about the sample and the measures </li></ul><ul><li>Describe the data in hand </li></ul><ul><li>Simply describe what’s going on with the data </li></ul> Inferential Statistics <ul><li>Drawing conclusions that extend beyond the immediate data alone </li></ul><ul><li>Inferring from the sample data what the population might think </li></ul><ul><li>Infer the nature of a larger (typically infinite) set of data </li></ul><ul><li>Making inferences from the data to more general conditions </li></ul>
  29. 31. Descriptive statistics <ul><li>Sampling techniques </li></ul><ul><li>Summary measures of data </li></ul><ul><li>Normal distributions </li></ul><ul><li>Example: Descriptive statistics answers questions like “How many students are interested in taking statistics online?” A basketball player wants to find his average shots for the past 10 games. </li></ul> Inferential Statistics <ul><li>Sampling distributions and hypothesis testing </li></ul><ul><li>Simple time series analysis, correlation and regression </li></ul><ul><li>Test on proportion and chi-square test </li></ul><ul><li>Example: Inferential statistics answers questions like “Is there a significant difference in the academic performance of male and female students in statistics?” A politician wants to estimate his chance of winning in the upcoming senatorial election. </li></ul>
  30. 32. Basic Statistical Terms and Symbols <ul><li>Population vs. Sample </li></ul><ul><li>Population – set of all individuals or entities under consideration or study. It may be a finite or infinite collection of objects, events or individuals, with specified class or characteristics. </li></ul><ul><li>Sample – Small portion or part of the population; a representative of the population in a research study </li></ul>
  31. 33. <ul><li>Parameter – numerical value which describes a population </li></ul><ul><li> µ (mu) – population mean </li></ul><ul><li> σ (sigma) – population standard deviation </li></ul><ul><li> σ² – population variance </li></ul>
  32. 34. <ul><li>Statistic – numerical value which describes a sample </li></ul><ul><li> x̄ (x-bar) – sample mean </li></ul><ul><li> s² – sample variance </li></ul><ul><li> s – sample standard deviation </li></ul>
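The difference between a parameter and a statistic can be sketched in Python with the standard `statistics` module. The population below is made-up data generated only for illustration, and the variable names are assumptions, not part of the slides.

```python
import random
import statistics

# Hypothetical population: test scores of all 200 students in a school (made-up data).
random.seed(0)
population = [random.randint(40, 100) for _ in range(200)]

# Parameters describe the whole population.
mu = statistics.mean(population)             # population mean (mu)
sigma = statistics.pstdev(population)        # population standard deviation (sigma)
sigma_sq = statistics.pvariance(population)  # population variance (sigma squared)

# Statistics describe a sample drawn from that population.
sample = random.sample(population, 20)
x_bar = statistics.mean(sample)     # sample mean (x-bar)
s = statistics.stdev(sample)        # sample standard deviation (s), divides by n - 1
s_sq = statistics.variance(sample)  # sample variance (s squared)

print(f"parameters: mu={mu:.2f}, sigma={sigma:.2f}, sigma^2={sigma_sq:.2f}")
print(f"statistics: x_bar={x_bar:.2f}, s={s:.2f}, s^2={s_sq:.2f}")
```

Drawing a different sample would usually give different statistics, while the parameters stay fixed for a given population.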
  33. 35. constant vs. variable <ul><li>Constant – characteristic or property of a population or sample which makes the members similar to each other </li></ul><ul><li>Variable – characteristic of interest, measurable on each and every individual in the universe, which assumes different values or labels; denoted by a capital letter of the English alphabet </li></ul><ul><ul><ul><li>Measurement – process of assigning a value or label to a particular experimental unit </li></ul></ul></ul><ul><ul><ul><li>Experimental Unit – person or object on which the variable is measured </li></ul></ul></ul>
  34. 36. Classification of variables <ul><li>Qualitative vs. Quantitative </li></ul><ul><li>Qualitative variable – yields categorical or qualitative responses. It refers to the attributes or characteristics of the sample, e.g. civil status, religious affiliation, gender </li></ul><ul><li>Quantitative variable – yields numerical responses representing an amount or quantity, e.g. height, weight, number of children </li></ul><ul><ul><ul><li>Discrete – values obtained by counting, e.g. births, students in class </li></ul></ul></ul><ul><ul><ul><li>Continuous – values obtained by measurement, e.g. age, height </li></ul></ul></ul>
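As a rough illustration of this classification, the sketch below guesses a variable's type from how its values are stored in Python. The function `classify` and the sample values are hypothetical, and the rule is only a heuristic: a continuous variable such as age is still continuous even when it is recorded in whole years.

```python
def classify(values):
    """Rough heuristic: guess a variable's type from the Python type of its values."""
    if all(isinstance(v, str) for v in values):
        return "qualitative"
    if all(isinstance(v, int) and not isinstance(v, bool) for v in values):
        return "quantitative (discrete)"
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values):
        return "quantitative (continuous)"
    return "mixed or unknown"

# Made-up values for three of the variables named on the slide.
civil_status = ["single", "married", "widowed"]  # labels only
number_of_children = [0, 2, 3, 1]                # obtained by counting
height_cm = [152.4, 160.0, 171.3]                # obtained by measurement

for name, values in [("civil status", civil_status),
                     ("number of children", number_of_children),
                     ("height", height_cm)]:
    print(name, "->", classify(values))
```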
  35. 37. <ul><li>Dependent vs. Independent </li></ul><ul><li>Dependent – a variable which is affected by another variable e.g. test scores </li></ul><ul><li>Independent – a variable which affects the other variable </li></ul><ul><li>e.g. number of hours spent for studying </li></ul>
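The relationship between the two kinds of variable can be sketched with a least-squares line, using hours spent studying as the independent variable and test score as the dependent variable. The numbers below are invented for illustration and are not from the slides; the fitted slope describes an association in this made-up data, not proof of cause and effect.

```python
# Hypothetical data (not from the slides): hours spent studying and test scores.
hours_studied = [1, 2, 3, 4, 5]     # independent variable (X)
test_scores = [52, 58, 65, 70, 78]  # dependent variable (Y)

n = len(hours_studied)
mean_x = sum(hours_studied) / n
mean_y = sum(test_scores) / n

# Least-squares slope and intercept of the line Y = intercept + slope * X.
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours_studied, test_scores))
s_xx = sum((x - mean_x) ** 2 for x in hours_studied)
slope = s_xy / s_xx
intercept = mean_y - slope * mean_x

print(f"each extra hour of study goes with about {slope:.1f} more points")
```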
  36. 38. Levels of measurement <ul><li>Nominal </li></ul><ul><li>Ordinal </li></ul><ul><li>Interval </li></ul><ul><li>Ratio </li></ul>
  37. 39. Nominal <ul><li>Characterized by data that consist of names, labels, or categories only. The data cannot be arranged in an ordering scheme. </li></ul><ul><li>Examples: names, religion, civil status </li></ul>
  38. 40. Ordinal <ul><li>Involves data that may be arranged in some order, but differences between data values either cannot be determined or are meaningless. </li></ul><ul><li>Examples: Military Rank, Job position </li></ul>
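  39. 41. Interval <ul><li>It is like the ordinal level, with the additional property that meaningful amounts of difference between data values can be determined. However, there is no inherent (natural) zero starting point, so ratios are not meaningful. </li></ul><ul><li>Examples: IQ score, temperature </li></ul>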
  40. 42. Ratio <ul><li>Interval level modified to include the inherent zero starting point. For values at this level, differences and ratios are meaningful. </li></ul><ul><li>Examples: Height, weight, width, weekly allowance </li></ul>
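The four levels can be summarized by which comparisons are meaningful at each one. The lookup table below is a minimal sketch; the names `LEVELS` and `is_meaningful` are assumptions made for this example.

```python
# Which comparisons are meaningful at each level of measurement.
LEVELS = {
    "nominal":  {"equality"},
    "ordinal":  {"equality", "order"},
    "interval": {"equality", "order", "difference"},
    "ratio":    {"equality", "order", "difference", "ratio"},
}

def is_meaningful(level, operation):
    """Return True if the operation makes sense for data at the given level."""
    return operation in LEVELS[level]

# Temperature in degrees Celsius is interval-level: 20 degrees is 10 degrees warmer
# than 10 degrees, but it is not "twice as hot", because 0 degrees is not a true zero.
print(is_meaningful("interval", "difference"))  # True
print(is_meaningful("interval", "ratio"))       # False

# Height is ratio-level: 180 cm is twice 90 cm.
print(is_meaningful("ratio", "ratio"))          # True
```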
  41. 43. survey vs. experiment <ul><li>survey – method used when the factors which may affect the investigation are not controlled or taken into consideration </li></ul><ul><li>experiment – method wherein effort is exerted to control the factors which may affect the variable in question </li></ul>
  42. 44. <ul><li>Cristina scored 45 marks out of 100 in the first quiz in statistics. </li></ul><ul><li>(Not statistical data) </li></ul><ul><li>Cristina, Karlo, Shea, Manuel and Ara scored 45, 91, 73, 53, and 87 respectively. </li></ul><ul><li>(Statistical Data) </li></ul>
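Because the five scores form an aggregate, they can be summarized, which a single score cannot. A short sketch using Python's standard `statistics` module (the variable names are chosen for this example):

```python
import statistics

# The five quiz scores given on the slide.
scores = {"Cristina": 45, "Karlo": 91, "Shea": 73, "Manuel": 53, "Ara": 87}
values = list(scores.values())

mean_score = statistics.mean(values)      # 349 / 5 = 69.8
median_score = statistics.median(values)  # middle value after sorting: 73
score_range = max(values) - min(values)   # 91 - 45 = 46

print(mean_score, median_score, score_range)
```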
  43. 45. <ul><li>“Per capita income of the Philippines is low” </li></ul><ul><li>(Not statistical data) </li></ul><ul><li>“Income of the Philippines from tourism increased from 106.4 M in 2003 to 143.1 M in 2011” </li></ul><ul><li>(Statistical data) </li></ul>
  44. 46. Exercises: Descriptive Vs. Inferential <ul><li>A physicist studying turbulence in the laboratory needs the averages of quantities that vary over small intervals of time. </li></ul><ul><li>(Descriptive) </li></ul><ul><li>We are interested in examining how many math classes have been taken on average by current graduating seniors at American colleges and universities during their four years in school. </li></ul><ul><li>(Inferential) </li></ul>
  45. 47. <ul><li>A research scientist is interested in studying the experiences of twins raised together versus those raised apart. </li></ul><ul><li>(inferential) </li></ul><ul><li>A person wants to determine the range of ages at which women in his town marry. </li></ul><ul><li>(descriptive) </li></ul><ul><li>An employer wants to know the average salary received by the workers in his company. </li></ul><ul><li>(descriptive) </li></ul>
  46. 48. <ul><li>An experiment compares the effectiveness of a new anti-depressant drug with that of a placebo (a prescription without physical effect). </li></ul><ul><li>(inferential) </li></ul><ul><li>You have been hired by the National Election Commission to examine how the American people feel about the fairness of the voting procedures in the U.S. </li></ul><ul><li>(inferential) </li></ul><ul><li>Determining the shooting percentage of players on a basketball team. </li></ul><ul><li>(Descriptive) </li></ul>
  47. 49. Exercises: Population vs. Sample <ul><li>A substitute teacher wants to know how students in the class did on their last test. He asks only the 10 students sitting in the front row to report how they did on their last test and he concludes from them that the class did extremely well. What is the sample? What is the population? Can you identify any problems with choosing the sample in the way that the teacher did? </li></ul>
  48. 50. <ul><li>A coach is interested in how many cartwheels the average college freshmen at his university can do. Eight volunteers from the freshman class step forward. After observing their performance, the coach concludes that college freshmen can do an average of 16 cartwheels in a row without stopping. </li></ul>
  49. 51. Quantitative Vs. Qualitative <ul><li>Type of blood </li></ul><ul><li>Height of babies </li></ul><ul><li>Breed of cattle </li></ul><ul><li>Consumer’s expenditure </li></ul><ul><li>Standard of living </li></ul>
  50. 52. Discrete vs. Continuous <ul><li>Volume of a pail of water </li></ul><ul><li>Passing rate of LET results </li></ul><ul><li>Weight of grapes </li></ul><ul><li>Number of pigs sold </li></ul><ul><li>Frequency of training programs attended </li></ul>
  51. 53. Exercises: Dependent vs. Independent <ul><li>Example #1: Can blueberries slow down aging? A study indicates that antioxidants found in blueberries may slow down the process of aging. In this study, 19-month-old rats (equivalent to 60-year-old humans) were fed either their standard diet or a diet supplemented by either blueberry, strawberry, or spinach powder. After eight weeks, the rats were given memory and motor tests. Although all supplemented rats showed improvement, those supplemented with blueberry powder showed the most notable improvement. </li></ul>
  52. 54. <ul><li>Example #2: Does  beta carotene  protect against cancer? Beta-carotene supplements have been thought to protect against cancer. However, a study published in the Journal of the National Cancer Institute suggests this is false. The study was conducted with 39,000 women aged 45 and up. These women were randomly assigned to receive a beta-carotene supplement or a  placebo , and their health was studied over their lifetime. Cancer rates for women taking the beta-carotene supplement did not differ systematically from the cancer rates of those women taking the placebo.  </li></ul>
  53. 55. <ul><li>Example #3: How bright is right? An automobile manufacturer wants to know how bright brake lights should be in order to minimize the time required for the driver of a following car to realize that the car in front is stopping and to hit the brakes. </li></ul>
  54. 56. Population and sample <ul><li>Population: Graduating senior students of American colleges and universities </li></ul><ul><li>Sample: 20% of the graduating students in each of the colleges and universities in America </li></ul>
  55. 57. <ul><li>Population: Set of all twins who are raised together and those who are raised apart </li></ul><ul><li>Sample: 20% of the twins raised together and 20% of the twins raised apart </li></ul>
  56. 58. <ul><li>Population: females in the town who are married </li></ul><ul><li>Sample: 20% of the married females in the town </li></ul>
  57. 59. <ul><li>Population: all workers in the company </li></ul><ul><li>Sample: some employees from each type of work in the company </li></ul>
  58. 60. <ul><li>Population: Americans who are registered voters </li></ul><ul><li>Sample: 10% of all registered voters in America </li></ul>
  59. 61. <ul><li>Population: All players on the basketball team. </li></ul><ul><li>Sample: 10 players chosen at random from among all the players on the team </li></ul>
  60. 62. <ul><li>SOLUTION </li></ul><ul><li>The population consists of all students in the class. </li></ul><ul><li>The sample includes the 10 students sitting in the front row. </li></ul><ul><li>The sample is made up of just the 10 students sitting in the front row. The sample is not likely to be representative of the population. Those who sit in the front row tend to be more interested in the class and tend to perform higher on tests. Hence, the sample may perform at a higher level than the population. </li></ul>
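The effect described in this solution can be sketched with made-up numbers. The class below is hypothetical: it has 40 students, and the 10 in the front row are assumed to score higher than the rest, as the solution suggests. The front-row mean is then compared with the mean of the whole class and with the mean of a simple random sample of the same size.

```python
import random
import statistics

# Hypothetical class of 40 students (made-up scores). The front row is assumed
# to score higher on average than the rest of the class.
front_row = [88, 92, 85, 90, 95, 87, 91, 89, 93, 86]
rest_of_class = [70, 65, 80, 72, 68, 75, 60, 78, 82, 66,
                 74, 71, 69, 77, 63, 79, 73, 67, 81, 76,
                 64, 62, 84, 83, 61, 58, 59, 57, 56, 55]
population = front_row + rest_of_class

population_mean = statistics.mean(population)  # what the teacher wants to know
front_row_mean = statistics.mean(front_row)    # what the teacher actually measured

# A simple random sample of 10 gives every student the same chance of selection.
random.seed(1)
random_sample = random.sample(population, 10)
random_sample_mean = statistics.mean(random_sample)

print(f"whole class:   {population_mean:.1f}")
print(f"front row:     {front_row_mean:.1f}")
print(f"random sample: {random_sample_mean:.1f}")
```

In this made-up class the front-row mean overstates the class mean by construction; a random sample can still miss the true mean, but it is not biased toward one group of students.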
  61. 63. <ul><li>SOLUTION </li></ul><ul><li>The population is the freshmen at the coach's university. </li></ul><ul><li>The sample is poorly chosen because volunteers are more likely to be able to do cartwheels than the average freshman; people who can't do cartwheels probably did not volunteer! </li></ul><ul><li>In the example, we are also not told of the gender of the volunteers. Were they all women, for example? That might affect the outcome, contributing to the non-representative nature of the sample (if the school is co-ed). </li></ul>
  62. 64. <ul><li>INDEPENDENT: diet </li></ul><ul><li>DEPENDENT: memory and motor skills </li></ul>
  63. 65. <ul><li>INDEPENDENT: supplements </li></ul><ul><li>DEPENDENT: occurrence of cancer </li></ul>
  64. 66. <ul><li>INDEPENDENT: brightness of brake lights </li></ul><ul><li>DEPENDENT: time to hit brake </li></ul>