# Emilie Rousselin Statistics

Published in: Education
### Emilie Rousselin Statistics

1. Statistics (Emilie Rousselin)
2. Statistics
   - There are three important areas in Statistics:
     1. Descriptive Statistics
     2. Inferential Statistics
     3. Regression
3. 1. Descriptive Statistics
   - Descriptive statistics deals with data that have already been collected. It summarizes a set of information.
   - When dealing with descriptive statistics we can talk about:
     - a. Measures of central tendency
     - b. Measures of dispersion
   - Variables and distributions are two important concepts within descriptive statistics.
4. a. Measures of central tendency
   - The arithmetic mean: the sum of all the values in a set, divided by the number of values. μ denotes the mean of a population (of size N) and x̄ the mean of a sample (of size n).
     - Ex: μ = ΣX/N or x̄ = ΣX/n
   - The mode: the most repeated value in a set. If two values are tied for most repeated, the set is bimodal.
     - Ex: {25, 26, 30, 30, 50}. The most repeated value is 30.
   - The median: the number in the middle of a set of numbers arranged in numerical order. If two numbers share the middle of the set, the median is their sum divided by 2.
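As a rough illustration of the three measures above, here is a minimal Python sketch using the standard-library `statistics` module on the example set from the slide. The variable names and the two extra lists are illustrative assumptions, not part of the original slides.

```python
import statistics

values = [25, 26, 30, 30, 50]  # the example set from the slide

mean = sum(values) / len(values)      # sum of all values divided by their count
median = statistics.median(values)    # middle value of the sorted data
modes = statistics.multimode(values)  # every most-repeated value, as a list

# Hypothetical extra lists: an even-sized set and a bimodal set.
even_median = statistics.median([1, 2, 3, 4])       # average of the two middle values
two_modes = statistics.multimode([1, 1, 2, 2, 3])   # two values tie, so two modes

print(mean, median, modes, even_median, two_modes)
```

`multimode` returns a list, so a bimodal set shows up as a list with two entries rather than as an error.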
5. b. Measures of dispersion
   - Interquartile range: the difference between the third and first quartiles.
     - IQR = Q3 - Q1
   - Range: the difference between the largest and the smallest values of a set.
   - Variance: "a measure of how items are dispersed about their mean."
   - Standard deviation: the square root of the variance. We use s for a sample and σ for a population.
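The range, variance and standard deviation can be sketched in a few lines of Python. The data are reused from the mode example purely for illustration. The slides do not spell out the divisors, so the code follows the usual convention: N for a population and n - 1 for a sample.

```python
import math
import statistics

values = [25, 26, 30, 30, 50]  # illustrative data, reused from the mode example

value_range = max(values) - min(values)  # largest value minus smallest value

population_variance = statistics.pvariance(values)  # squared deviations divided by N
sample_variance = statistics.variance(values)       # squared deviations divided by n - 1

sigma = math.sqrt(population_variance)  # population standard deviation (σ)
s = math.sqrt(sample_variance)          # sample standard deviation (s)

print(value_range, population_variance, sample_variance, sigma, s)
```

The sample figures are slightly larger than the population ones because of the smaller divisor.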
6. The empirical rule: 68%, 95%, 99.7%
   - For data with a bell-shaped graph, about 68% of the values lie within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations.
   - Here are all the formulae you need and more: http://www.statistics.com/resources/statsymbols.pdf
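The three percentages can be checked against an exact normal distribution. This sketch assumes "bell-shaped" means normally distributed and uses `statistics.NormalDist` from the Python standard library. The helper name `share_within` is made up for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)  # assumption: bell-shaped means normal

def share_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The printed shares are approximately 0.6827, 0.9545 and 0.9973, which the rule rounds to 68%, 95% and 99.7%.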
7. Box-plot
   - Q1 is the median of the first half of a sorted list of numbers (lower part).
   - Q2 is the median of the whole list.
   - Q3 is the median of the second half of the list (upper part).
   - Q4 is the largest value in the list.
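A minimal sketch of the median-of-halves description above, in Python. The slide does not say what to do with the middle value when the list has an odd length. This version leaves it out of both halves, which is one common convention and an assumption here. The function name `quartile_summary` is hypothetical.

```python
import statistics

def quartile_summary(data):
    """Return (Q1, Q2, Q3, largest value) using the median-of-halves method."""
    ordered = sorted(data)
    if len(ordered) < 2:
        raise ValueError("need at least two values")
    half = len(ordered) // 2
    lower = ordered[:half]                 # first half of the sorted list
    upper = ordered[len(ordered) - half:]  # second half; skips the middle value if the count is odd
    q1 = statistics.median(lower)
    q2 = statistics.median(ordered)
    q3 = statistics.median(upper)
    return q1, q2, q3, ordered[-1]

q1, q2, q3, largest = quartile_summary([1, 2, 3, 4, 5, 6, 7, 8])
iqr = q3 - q1  # interquartile range, IQR = Q3 - Q1
print(q1, q2, q3, largest, iqr)
```

Other quartile conventions exist (for example, interpolation-based ones), so results from statistical software may differ slightly on small lists.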
8. 2. Inferential Statistics
   - Inferential statistics is used to make comparisons or predictions about a population from the information collected in a sample.
   - "Or, we use inferential statistics to make judgments of the probability that an observed difference between groups is a dependable one or one that might have happened by chance in this study."
   - It includes:
     - a. The T-test
     - b. General linear model
9. a. The T-test
   - The T-test evaluates whether the means of two groups are statistically different from each other. This is very useful for comparing two different groups or categories.
   - In a paired T-test the same group is measured twice, so each observation in one set has a partner in the other.
   - In an unpaired T-test two different, independent groups are compared.
10. The T-test
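The paired and unpaired tests described above differ in how the t statistic is built. The Python sketch below computes only the statistic, not a p-value. For the unpaired case it uses Welch's form, which does not assume equal variances. The slides do not say which unpaired variant is meant, so that choice, the function names and the data are all assumptions.

```python
import math
import statistics

def paired_t(before, after):
    """t statistic for a paired test: mean difference over its standard error."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(after, before)]
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / standard_error

def welch_t(group_a, group_b):
    """t statistic for two independent groups (Welch's unequal-variance form)."""
    standard_error = math.sqrt(
        statistics.variance(group_a) / len(group_a)
        + statistics.variance(group_b) / len(group_b)
    )
    return (statistics.mean(group_b) - statistics.mean(group_a)) / standard_error

# Hypothetical scores, used once as paired data and once as independent groups.
before = [10, 12, 9, 11, 13]
after = [12, 14, 10, 13, 15]

t_paired = paired_t(before, after)
t_unpaired = welch_t(before, after)
print(t_paired, t_unpaired)
```

On the same numbers the paired statistic is much larger, because pairing removes the variation between subjects and leaves only the variation in the differences.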
11. b. The General Linear Model (GLM)
   - "The General Linear Model (GLM) underlies most of the statistical analyses that are used in applied and social research"
   - The objective is to summarize or describe accurately what is happening in the data. If the fitted line rises, the relationship between the variables is positive. If it falls, the relationship is negative.
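The simplest case of the general linear model is a straight line fitted by least squares, where the sign of the slope tells whether the line rises or falls. The sketch below is one way to compute it in plain Python. The function name `fit_line` and the data points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y, and variation of x, about their means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data points.
slope, intercept = fit_line([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
direction = "positive" if slope > 0 else "negative" if slope < 0 else "flat"
print(slope, intercept, direction)
```

The function divides by the variation in x, so it needs at least two distinct x values.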
12. 3. Regression
   - a. Analysis of variance (ANOVA)
   - b. Nonlinear regression
   - c. Rank correlation
13. a. Analysis of variance (ANOVA)
   - "It is a collection of statistical models, and their associated procedures, in which the observed variance is partitioned into components due to different explanatory variables."
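To make the idea of splitting the observed variance into components concrete, here is a sketch of a one-way ANOVA in Python with a single explanatory variable (group membership). The three groups are made-up numbers and the function name is hypothetical. The code stops at the F statistic and does not compute a p-value.

```python
def one_way_anova(groups):
    """Return (between-group SS, within-group SS, F statistic) for a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)

    # Variation explained by group membership.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation left over inside each group.
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat

# Hypothetical measurements for three groups.
ss_between, ss_within, f_stat = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(ss_between, ss_within, f_stat)
```

The between-group and within-group sums of squares add up to the total sum of squares about the grand mean, which is the partition the definition refers to.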
14. b. The nonlinear regression
   - "Observational data are modeled by a function which is a nonlinear combination of the model parameters and depends on one or more independent variables. The data are fitted by a method of successive approximations."
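One example of a "method of successive approximations" is the Gauss-Newton iteration. The quotation does not name a specific method, so this choice is an assumption. The Python sketch below fits the one-parameter model y = exp(b·x), which is nonlinear in b. The model, starting guess and data are all hypothetical.

```python
import math

def fit_exponential_rate(xs, ys, b=0.1, max_iter=50, tol=1e-12):
    """Fit y = exp(b * x) by repeated Gauss-Newton updates of b."""
    for _ in range(max_iter):
        predictions = [math.exp(b * x) for x in xs]
        residuals = [y - p for y, p in zip(ys, predictions)]
        # Derivative of each prediction with respect to b.
        jacobian = [x * p for x, p in zip(xs, predictions)]
        step = sum(j * r for j, r in zip(jacobian, residuals)) / sum(j * j for j in jacobian)
        b += step                # successive approximation of the parameter
        if abs(step) < tol:      # stop once the update is negligible
            break
    return b

# Hypothetical noise-free data generated with b = 0.5.
xs = [0, 1, 2, 3]
ys = [math.exp(0.5 * x) for x in xs]
b_hat = fit_exponential_rate(xs, ys)
print(b_hat)
```

Each pass linearizes the model around the current guess and solves the resulting least-squares problem. A poor starting guess can make iterations like this diverge, so library routines usually add safeguards.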
15. c. Rank Correlation
   - Rank correlation studies the relationship between different rankings of the same set of items.
   - There are two main types:
     - Spearman's rank correlation coefficient
     - The Kendall tau rank correlation coefficient
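Both coefficients can be sketched in plain Python for the simple case where the inputs are already rankings without ties. That restriction is an assumption of this example, as are the two judges' rankings and the function names.

```python
from itertools import combinations

def spearman_rho(ranks_a, ranks_b):
    """Spearman's coefficient from rank differences (valid when there are no ties)."""
    n = len(ranks_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

def kendall_tau(ranks_a, ranks_b):
    """Kendall's tau: (concordant pairs - discordant pairs) / total pairs, no ties."""
    concordant = discordant = 0
    for (a1, b1), (a2, b2) in combinations(zip(ranks_a, ranks_b), 2):
        product = (a1 - a2) * (b1 - b2)
        if product > 0:
            concordant += 1   # both rankings order this pair the same way
        elif product < 0:
            discordant += 1   # the rankings disagree on this pair
    n = len(ranks_a)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Hypothetical rankings of five items by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_a, judge_b)
tau = kendall_tau(judge_a, judge_b)
print(rho, tau)
```

Both values fall between -1 (opposite rankings) and 1 (identical rankings). With tied ranks, corrected formulas are needed instead.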
16. Works Cited
   - http://www.originlab.com/index.aspx?s=8&lm=115&pid=73
   - http://www.statistics.com/resources/statsymbols.pdf
   - http://www.socialresearchmethods.net/kb/statdesc.php
   - Hon, Keone. "An Introduction to Statistics".
   - http://writing.colostate.edu/guides/research/stats/pop3a.cfm
   - http://www.businessbookmall.com/SFR%201.pdf