Statistical Analysis Of Data Final

Published in: Technology

  1. 1. BIOSTATISTICS – A TOOL FOR RESEARCH AND DATA ANALYSIS PRESENTED BY SABA BUTT
  2. 2. SIGNIFICANCE OF STATISTICS FOR ANALYSIS AND RESEARCH
  3. 3. STATISTICS IS NECESSARY FOR ALL FIELDS OF LIFE REQUIRING RESEARCH AND DATA ANALYSIS <ul><li>In every field we must analyze facts and interpret them to draw conclusions. That analysis relies on statistics to compare qualities and quantities, which supports decision making in business, government, and industry, and the development of theories in science. </li></ul>
  4. 4. BIOSTATISTICS THE STATISTICS IN LIFE SCIENCES
  5. 5. BIOSTATISTICS IS A DISCIPLINE THAT IS CONCERNED WITH: <ul><li>designing experiments and other data collection, </li></ul><ul><li>summarizing information to aid understanding, </li></ul><ul><li>drawing conclusions from data, and </li></ul><ul><li>estimating the present or predicting the future. </li></ul><ul><li>In making predictions, Statistics uses the companion subject of Probability, which models chance mathematically and enables calculations of chance in complicated cases. </li></ul>
  6. 6. SOME IMPORTANT DEFINITIONS
  7. 7. POPULATION AND SAMPLE <ul><li>POPULATION: A population consists of an entire set of objects, observations, or scores that have something in common. For example, a population might be defined as all males between the ages of 15 and 18. </li></ul><ul><li>SAMPLE: A sample is a subset of a population. Since it is usually impractical to test every member of a population, studying a sample drawn from the population is typically the best approach available. </li></ul>
  8. 8. PARAMETER AND STATISTIC <ul><li>PARAMETER: A parameter is a numerical quantity measuring some aspect of a population of scores. For example, the population mean is a parameter that measures central tendency. </li></ul><ul><li>STATISTIC: A statistic is a numerical quantity calculated from a sample, such as the sample mean. </li></ul>
  9. 9. MEASURES OF CENTRAL TENDENCY <ul><ul><li>Mean (Arithmetic Mean) </li></ul></ul><ul><ul><ul><li>Average value of a sample or population </li></ul></ul></ul><ul><ul><li>Median </li></ul></ul><ul><ul><ul><li>Middle value of sample or population </li></ul></ul></ul><ul><ul><li>Mode </li></ul></ul><ul><ul><ul><li>The value repeated most </li></ul></ul></ul>
  10. 10. The Arithmetic Mean, or Mean, is what is commonly called the average. When the word "mean" is used without a modifier, it can be assumed to refer to the arithmetic mean. The mean is the sum of all the scores divided by the number of scores. The formula for the population mean is $\mu = \Sigma X / N$, where $\mu$ is the population mean, $\Sigma X$ is the sum of the scores, and $N$ is the number of scores. If the scores are from a sample, the symbol $\bar{X}$ refers to the mean and $n$ to the sample size, and the formula is written $\bar{X} = \Sigma X / n$.
  11. 11. Median: The median is the middle of a distribution: half the scores are above the median and half are below it. The median is less sensitive to extreme scores than the mean, which makes it a better measure than the mean for highly skewed distributions. Example values: 5, 3, 4, 2.5, 6 (median 4). Mode: The mode is the most frequently occurring score in a distribution and is used as a measure of central tendency. Its advantage as a measure of central tendency is that its meaning is obvious. Example values: 5, 3, 4, 5, 6 (mode 5).
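The three measures of central tendency can be checked against the example values on this slide. The following is a minimal sketch using Python's standard `statistics` module; the variable names are illustrative and do not come from the slides.

```python
import statistics

# Example values shown on the slide for the median and for the mode.
median_scores = [5, 3, 4, 2.5, 6]
mode_scores = [5, 3, 4, 5, 6]

# Mean: sum of all the scores divided by the number of scores.
mean_value = statistics.mean(median_scores)

# Median: the middle value once the scores are sorted.
median_value = statistics.median(median_scores)

# Mode: the most frequently occurring score.
mode_value = statistics.mode(mode_scores)

print(mean_value, median_value, mode_value)
```

Sorting the first list gives 2.5, 3, 4, 5, 6, so the middle score is 4, while 5 is the only score that occurs twice in the second list.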
  12. 12. MEASURES OF DISPERSION <ul><li>After measuring the central value, i.e., the mean, the next step is to determine how well this central value represents all the values, that is, to measure the scattering or dispersion of the data. Several measures quantify dispersion. The most important and widely used of these in research are: </li></ul><ul><ul><li>Variance </li></ul></ul><ul><ul><li>Standard Deviation </li></ul></ul><ul><ul><li>Standard Error of Mean </li></ul></ul>
  13. 13. HYPOTHESIS TESTING <ul><li>Student’s t test </li></ul><ul><li>F test </li></ul><ul><li>ANOVA </li></ul><ul><li>Correlation </li></ul><ul><li>Regression </li></ul>
  14. 14. EXAMPLE OF DATA ANALYSIS <ul><li>Comparison of the weight-to-height ratio, expressed as the Body Mass Index (BMI), of a population. BMI is calculated as weight in kg divided by the square of height in meters: $\mathrm{BMI} = \text{weight} / \text{height}^2$. </li></ul><ul><li>General surveys in the USA and Europe have shown that the young population is overweight, which increases the risk of disease. We surveyed the young female population of Punjab University for BMI, measuring the BMI of 400 randomly selected students. </li></ul>
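The BMI formula on this slide can be sketched as a small Python function. The function name `body_mass_index` and the example subject (60 kg, 1.60 m) are hypothetical and are not taken from the survey data.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """Return BMI: weight in kilograms divided by height in meters squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical subject, not a value from the survey.
example_bmi = body_mass_index(weight_kg=60.0, height_m=1.60)
print(round(example_bmi, 2))
```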
  15. 15. [Data table with columns: Subject No., BMI]
  16. 16. ARITHMETIC MEAN <ul><li>We have two tables of data: one giving the BMI of girls, the other the BMI of boys. These are long data tables. </li></ul><ul><li>We now have to analyze the data to conclude something from it. What do we need? </li></ul><ul><li>We need a measure of central tendency to indicate the average BMI, so that we can compare boys with girls, compare our sample with other populations, and compare against the normal range. </li></ul><ul><li>The most common and useful measure for this purpose is the Arithmetic Mean, calculated by taking the sum of all values and dividing it by the number of observations. </li></ul>
  17. 17. SAMPLING ERROR <ul><li>We now have an average value, but does this average really represent all the values? Some values may be very large and some very small. If so, the mean alone does not represent the whole data set. This contributes to sampling error: for example, a sample that happens to include students with a strong genetic tendency to be overweight will contain values that differ from those typical of the population, so the sample mean will not match the population mean and our result will be erroneous. </li></ul>
  18. 18. EXAMPLE <ul><li>We have four values: 2, 3, 4, 10 </li></ul><ul><li>Mean = Sum of values / No. of observations </li></ul><ul><li>= (2 + 3 + 4 + 10) / 4 </li></ul><ul><li>= 4.75 </li></ul><ul><li>This is larger than three of the four values in the data, because of the one large value, 10. </li></ul>
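The effect of the single large value can be made concrete by computing the mean with and without it. This is a minimal sketch on the four values from the slide; comparing against the median is an added illustration, and the variable names are assumptions.

```python
import statistics

values = [2, 3, 4, 10]

# Drop the one large value to see how much it shifts the mean.
without_large_value = [v for v in values if v != 10]

mean_all = statistics.mean(values)                    # (2 + 3 + 4 + 10) / 4
mean_without = statistics.mean(without_large_value)   # (2 + 3 + 4) / 3
median_all = statistics.median(values)                # average of the middle two, 3 and 4

print(mean_all, mean_without, median_all)
```

The mean moves from 3 to 4.75 when the value 10 is included, while the median of all four values stays at 3.5.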
  19. 19. STANDARD DEVIATION <ul><li>We now need a statistical measure that tells us how much the values scatter, so that we can judge how far sampling error may affect the mean. </li></ul><ul><li>This is the standard deviation: a measure of how much the individual values vary from the average value, i.e., the mean. </li></ul>
  20. 20. Standard Deviation of that Data <ul><li>SD = $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$ </li></ul><ul><li>Descriptive Statistics from MINITAB </li></ul><ul><li>Variable | N | Mean | Median | StDev | SE Mean </li></ul><ul><li>C1 | 4 | 4.75 | 3.50 | 3.59 | 1.80 </li></ul>
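The MINITAB figures for the four values 2, 3, 4, 10 can be reproduced with Python's standard library. This is a sketch, not the author's procedure; it assumes the sample (n - 1) form of the standard deviation and takes the standard error of the mean as the standard deviation divided by the square root of n.

```python
import math
import statistics

values = [2, 3, 4, 10]
n = len(values)

mean = statistics.mean(values)
median = statistics.median(values)

# Sample variance and standard deviation (divide by n - 1).
variance = statistics.variance(values)
sd = statistics.stdev(values)

# Standard error of the mean: s / sqrt(n).
se_mean = sd / math.sqrt(n)

print(n, mean, median, round(sd, 2), round(se_mean, 2))
```

The squared deviations from the mean sum to 38.75, so the variance is 38.75 / 3 and its square root rounds to the reported StDev of 3.59.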
  21. 21. Student’s T Test <ul><li>Two Sample T-Test and Confidence Interval </li></ul><ul><li>Two sample T for BMI-F vs BMI-M </li></ul><ul><li>Variable | N | Mean | StDev | SE Mean </li></ul><ul><li>BMI-F | 30 | 31.35 | 6.26 | 1.1 </li></ul><ul><li>BMI-M | 21 | 26.96 | 4.11 | 0.90 </li></ul><ul><li>95% CI for mu BMI-F - mu BMI-M: (1.5, 7.31) </li></ul><ul><li>T-Test mu BMI-F = mu BMI-M (vs not =): T = 3.02, P = 0.0040, DF = 48 </li></ul>
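The T statistic and degrees of freedom in this output can be recomputed from the summary statistics alone. The sketch below assumes the unequal-variance (Welch) form of the two-sample t test, which is consistent with the reported DF of 48; the slides do not state which form was used. It also assumes the reported DF is the Welch value truncated to an integer. All variable names are illustrative.

```python
import math

# Summary statistics copied from the MINITAB output above.
n_f, mean_f, sd_f = 30, 31.35, 6.26   # BMI-F
n_m, mean_m, sd_m = 21, 26.96, 4.11   # BMI-M

# Each group's contribution to the variance of the difference in means.
var_term_f = sd_f ** 2 / n_f
var_term_m = sd_m ** 2 / n_m

# Standard error of the difference and the t statistic.
se_diff = math.sqrt(var_term_f + var_term_m)
t_stat = (mean_f - mean_m) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom.
df = (var_term_f + var_term_m) ** 2 / (
    var_term_f ** 2 / (n_f - 1) + var_term_m ** 2 / (n_m - 1)
)

print(round(t_stat, 2), int(df))
```

The P value and the confidence interval additionally require the t distribution, which is not in the standard library, so they are left out of this sketch.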
