Inferential Statistics
Medical Statistics, Part II

http://www.socialresearchmethods.net/kb/statinf.php
Medical Statistics Part-II:Inferential  statistics
I am not a statistician but shared my mistakes to learn

Published in: Health & Medicine
  1. Inferential Statistics. Medical Statistics, Part II. http://www.socialresearchmethods.net/kb/statinf.php
  2. World of Statistics
     Descriptive statistics provide simple summaries about the sample and the measures in:
     - Tables
     - Graphs: charts, circle charts, dot plots, box-and-whisker plots, scatter plots, survival plots, Bland-Altman plots
     Inferential statistics:
     - The t-test
     - Dummy variables ("proxy" variables in inferential statistics)
     - General Linear Model (the most common use of inferential statistics)
     - Posttest-only analysis (simple t-test or one-way ANOVA)
     - Factorial design analysis (ANOVA)
     - Randomized block analysis
     - Analysis of covariance (ANOVA and ANCOVA)
     - Non-equivalent groups analysis
     - Regression-discontinuity analysis
     - Regression point displacement analysis
  3. Talk to beginners in the next 4 pages
     Descriptive statistics:
     - Describe the properties of a population
     - Everything is visible in tables and graphs
     - Everybody can understand them with minimum effort
     - Use central tendency (bell curve) and measures of spread
     - Are measured in parameters (mean, standard deviation and variance)
     - However, you often do not have access to the whole population you are interested in investigating
     Inferential statistics:
     - Use the behaviour of the sample to draw inferences about the population under study, beyond what is directly visible
     - Estimate parameter(s) and look beyond them at both the sample and the population level
     - Test statistical hypotheses (how far a result is not due to chance)
     - Critically analyse variability using statistical models and advanced software, under the guidance of an expert statistician
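The descriptive measures named in slide 3 (mean, standard deviation, variance) can be computed with Python's standard `statistics` module. This is a minimal sketch; the `marks` list is made-up illustrative data, not taken from the slides.

```python
import statistics

# Hypothetical marks for a small group (illustrative data only).
marks = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(marks)

# Population formulas divide by N: use them when the data ARE the whole population.
population_variance = statistics.pvariance(marks)
population_sd = statistics.pstdev(marks)

# Sample formulas divide by N - 1: use them when the data are a sample
# drawn from a larger population.
sample_variance = statistics.variance(marks)

print("mean:", mean)
print("population variance:", population_variance)
print("population standard deviation:", population_sd)
print("sample variance:", round(sample_variance, 3))
```

The variance is the square of the standard deviation, which is why the two population figures printed above are related by a square root.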
  4. Variables
     Independent variable:
     - Also called the experimental or predictor variable
     - Its manipulation influences the result (the dependent variable)
     - Example: the number of revisions or the level of intelligence, which influences the mark secured out of a full 100 marks
     Dependent variable:
     - The mark secured in the examination is the dependent variable; it is affected by manipulation of the independent variable
  5. Categorical and Continuous Variables
     Categorical:
     - Also called discrete or qualitative variables
     - Three types:
       - Nominal: categories that can only be named, in two or more groups
       - Ordinal: categories that can be arranged in order, in two or more groups
       - Dichotomous: categories arranged in only two groups
     - https://statistics.laerd.com/statistical-guides/types-of-variable.php
     Continuous:
     - Also called quantitative variables
     - Two types:
       - Interval: measured along a continuum and having a numerical value (for example, temperature measured in degrees Celsius or Fahrenheit)
       - Ratio: the name "ratio" reflects the fact that you can use the ratio of measurements. For example, a distance of ten metres is twice a distance of five metres. Ratio variables are interval variables with the added condition that a measurement of 0 (zero) indicates that there is none of that variable
  6. Sample
     Sampling is the process of selecting units (for example, people) from a population of interest so that, by studying the sample, we may fairly generalize our results back to the population from which they were chosen. The listing of the accessible population from which you will draw your sample is called the sampling frame.
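Drawing a sample from a sampling frame can be sketched as follows. This assumes simple random sampling, which is only one of several sampling schemes; the frame of patient IDs and the helper name `draw_simple_random_sample` are hypothetical.

```python
import random

# Hypothetical sampling frame: one ID per unit in the accessible population.
sampling_frame = [f"patient_{i:03d}" for i in range(1, 201)]


def draw_simple_random_sample(frame, n, seed=None):
    """Return n distinct units chosen at random from the sampling frame."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(frame, n)  # sampling without replacement


sample = draw_simple_random_sample(sampling_frame, 20, seed=42)
print(len(sample), "units drawn, for example:", sample[:3])
```

Because every unit in the frame has the same chance of selection, results from the sample can be generalized to the frame; generalizing further, to the whole population of interest, depends on how well the frame covers it.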
  7. What do these words mean?
  8. The bell curve: the heart of statistics
     A "bell-shaped" curve describes the group's distribution of a single variable. Think of the bell curve as a smoothed histogram or bar graph describing the frequency of each possible measurement response.
  9. Difference between two means (µ) that is not due to chance
     In the figure, we show distributions for both the treatment and control group. The mean values for each group are indicated with dashed lines. The difference between the means is simply the horizontal difference between where the control and treatment group means hit the horizontal axis.
  10. Hypothesis Testing
     - A statistical hypothesis is an assumption about a population parameter
     - Hypothesis testing refers to the formal procedures used by statisticians to accept or reject statistical hypotheses
     - Typically, a random sample from the population is examined
     - If the sample data are not consistent with the statistical hypothesis, the hypothesis is rejected
  11. Two types of statistical hypotheses
     - Null hypothesis (H0): sample observations result purely from chance
     - Alternative hypothesis (Ha): sample observations are influenced by some nonrandom cause
     Decision errors:
     - Type I error: occurs when the researcher rejects a null hypothesis that is true. The probability of committing a Type I error is called the significance level, also called alpha (α).
     - Type II error: occurs when the researcher fails to reject a null hypothesis that is false. The probability of committing a Type II error is called beta (β). The probability of not committing a Type II error is called the power of the test.
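The definitions in slide 11 can be restated compactly as conditional probabilities. The notation below is the conventional one and only restates the definitions given in the slide:

```latex
\begin{align*}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true})
  && \text{(Type I error, significance level)} \\
\beta &= P(\text{fail to reject } H_0 \mid H_0 \text{ is false})
  && \text{(Type II error)} \\
\text{Power} &= 1 - \beta
  = P(\text{reject } H_0 \mid H_0 \text{ is false})
\end{align*}
```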
  12. Example
     Suppose we wanted to determine whether a coin was fair and balanced. A null hypothesis might be that half the flips would result in heads and half in tails. The alternative hypothesis might be that the number of heads and tails would be very different. Symbolically, these hypotheses would be expressed as
     H0: P = 0.5
     Ha: P ≠ 0.5
     Suppose we flipped the coin 50 times, resulting in 40 heads and 10 tails. Given this result, we would be inclined to reject the null hypothesis. We would conclude, based on the evidence, that the coin was probably not fair and balanced.
     http://stattrek.com/hypothesis-test/hypothesis-testing.aspx?Tutorial=AP
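The coin example can be checked numerically. The sketch below assumes an exact two-sided binomial test (one common choice; the slide does not name a specific test) and an alpha level of 0.05. The function names are illustrative.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k heads in n flips when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_binomial_p_value(k, n, p=0.5):
    """Exact two-sided p-value: total probability, under H0, of every
    outcome that is no more likely than the observed one."""
    observed = binomial_pmf(k, n, p)
    tolerance = 1 + 1e-9  # guards against floating-point ties
    return sum(
        binomial_pmf(i, n, p)
        for i in range(n + 1)
        if binomial_pmf(i, n, p) <= observed * tolerance
    )


alpha = 0.05  # assumed risk level
p_value = two_sided_binomial_p_value(40, 50)  # 40 heads in 50 flips
reject_null = p_value < alpha

print("p-value:", p_value)
print("reject H0 (coin is fair)?", reject_null)
```

The p-value for 40 heads in 50 flips is far below 0.05, which matches the slide's conclusion that the coin is probably not fair.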
  13. The region of acceptance
     http://www.six-sigma-material.com/Hypothesis-Testing.html
     - If the p-value is less than the alpha risk, reject H0 and accept Ha
     - If the p-value is greater than the alpha risk, fail to reject the null hypothesis
  14. The prototype inferential statistic: the t-test
     - Used to compare the average performance of two groups
     - Uses a single measure to see if there is a difference
     - Example: whether eighth-grade boys and girls differ in math test scores, or whether a program group differs on the outcome measure from a control group
  15. The General Linear Model includes
     - t-test
     - Analysis of Variance (ANOVA)
     - Analysis of Covariance (ANCOVA)
     - Regression analysis
     - Multivariate methods: 1. factor analysis, 2. multidimensional scaling, 3. cluster analysis, 4. discriminant function analysis
     In its simplest form the General Linear Model is a straight-line model, and it opens the door to more complex inferential statistics.
  16. Experimental Analysis: some inferential statistics
     - The simple two-group posttest-only randomized experiment: t-test or one-way ANOVA
     - The factorial experimental designs: Analysis of Variance (ANOVA) model
     - Randomized block designs: ANOVA blocking model
     - The Analysis of Covariance experimental design: the Analysis of Covariance (ANCOVA) statistical model
  17. The t-test: comparing two means (µ) in posttest analysis
     Judge the difference between the two group means relative to the spread or variability of their scores.
  18. Difference between means equal, but variability different
     In which of the three cases would it be easiest to conclude that the means of the two groups are different? If you answered the low variability case, you are correct! Why is it easiest to conclude that the groups differ in that case? Because that is the situation with the least amount of overlap between the bell-shaped curves for the two groups. If you look at the high variability case, you should see that there are quite a few control group cases that score in the range of the treatment group and vice versa. Why is this so important? Because, if you want to see if two groups are "different", it is not good enough just to subtract one mean from the other: you have to take into account the variability around the means! A small difference between means will be hard to detect if there is lots of variability or noise. A large difference between means will be easily detectable if variability is low.
  19. Compute the t-value
  20. Standard error of the difference
     Therefore, finally:
     - varT = variance of the treatment group
     - varC = variance of the control group
     - nT = sample size of the treatment group
     - nC = sample size of the control group
     - var = variance = square of the standard deviation
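The formulas on slides 19 and 20 were images and did not survive in the transcript. Using the symbols defined in slide 20, and writing the group means as x̄T and x̄C (a notation assumed here), the standard textbook form of the two-group t-value is:

```latex
\[
t = \frac{\bar{x}_T - \bar{x}_C}{SE(\bar{x}_T - \bar{x}_C)},
\qquad
SE(\bar{x}_T - \bar{x}_C) = \sqrt{\frac{\mathrm{var}_T}{n_T} + \frac{\mathrm{var}_C}{n_C}}
\]
```

The numerator is the difference between the means (the "signal") and the denominator is the standard error of that difference (the "noise"), which is why low variability makes a given difference easier to detect.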
  21. Interpretation of t
     - A positive t-value shows that the first mean is larger than the second
     - A negative t-value shows that the first mean is smaller
     - Then look the t-value up in a table of significance to test whether it is large enough to say that the difference between the groups is not likely to have been a chance finding
     - To test the significance, you need to set a risk level, called the alpha level (the threshold against which the p-value is compared)
     - In most social research, the "rule of thumb" is to set the alpha level at .05
     - This means that five times out of a hundred you would find a statistically significant difference between the means even if there was none (i.e., by "chance")
     - Determine the degrees of freedom (df) for the test
     - In the t-test, the degrees of freedom is the sum of the persons in both groups minus 2
     - Given the alpha level, the df, and the t-value, you can look the t-value up in a standard table of significance to see whether the observation is mere chance or a real association
     The t-test, one-way Analysis of Variance (ANOVA) and a form of regression analysis are mathematically equivalent.
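The steps in slide 21 can be sketched in code. This is an illustration under stated assumptions: the two score lists are made-up data, the t-value uses the standard error from slide 20 (variance of each group divided by its sample size), and the critical value 2.101 is the two-tailed entry for alpha = .05 and df = 18 from a standard t table.

```python
import math
import statistics


def t_statistic(treatment, control):
    """Return (t, df) for two independent groups."""
    mean_t = statistics.mean(treatment)
    mean_c = statistics.mean(control)
    var_t = statistics.variance(treatment)  # sample variance, divides by n - 1
    var_c = statistics.variance(control)
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(var_t / len(treatment) + var_c / len(control))
    t = (mean_t - mean_c) / se_diff
    df = len(treatment) + len(control) - 2  # persons in both groups minus 2
    return t, df


# Hypothetical posttest scores (illustrative data only).
treatment = [54, 57, 52, 58, 55, 56, 53, 59, 55, 51]
control = [50, 49, 52, 48, 51, 47, 53, 50, 49, 51]

t, df = t_statistic(treatment, control)

# Two-tailed critical value for alpha = .05 and df = 18, from a t table.
CRITICAL_T_DF18_ALPHA05 = 2.101
significant = abs(t) > CRITICAL_T_DF18_ALPHA05

print("t =", round(t, 3), "df =", df, "significant at .05:", significant)
```

For these numbers t is about 5.0 with 18 degrees of freedom. The positive sign says the treatment mean is the larger one, and because 5.0 exceeds 2.101 the difference would be judged unlikely to be a chance finding at the .05 level.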
  22. Dummy Variables
     - A dummy variable is a numerical variable
     - Used in regression analysis to distinguish different treatment groups
     - A dummy variable of 0 indicates the placebo group
     - A dummy variable of 1 indicates the treatment group
     - Dummy variables enable us to use a single regression equation to represent multiple groups
     - They act like "switches" that turn various parameters on and off in an equation
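The "switch" behaviour of a dummy variable can be seen by regressing the outcome on Z with ordinary least squares. The sketch below uses made-up scores and a hypothetical helper, `fit_line`; with 0/1 coding the intercept comes out as the placebo mean and the slope as the difference between the group means.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical outcome scores (illustrative data only).
placebo = [50, 49, 52, 48, 51]
treated = [54, 57, 52, 58, 54]

# Dummy variable: 0 = placebo group, 1 = treatment group.
z = [0] * len(placebo) + [1] * len(treated)
y = placebo + treated

b0, b1 = fit_line(z, y)

print("b0 (predicted score when Z = 0):", b0)
print("b1 (change when Z switches from 0 to 1):", b1)
```

When Z = 0 the b1 term is switched off and the equation predicts the placebo mean; when Z = 1 it is switched on and the prediction moves up by the treatment effect.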
  23. General Linear Model (GLM)
     The most important statistical tool that allows us to summarize a wide variety of research outcomes. It is the foundation for:
     1. t-test
     2. Analysis of Variance (ANOVA)
     3. Analysis of Covariance (ANCOVA)
     4. Regression analysis
     5. Multivariate methods, including factor analysis, cluster analysis, multidimensional scaling, discriminant function analysis and canonical correlation
  24. y = b0 + bx + e: the straight-line model
     - y = a set of outcome variables
     - x = a set of pre-test variables or covariates
     - b0 = the set of intercepts (the value of each y when each x = 0)
     - b = a set of coefficients, one for each x
     - e = the vertical distance from the straight line to each point
     - Z: regression analysis uses a dummy variable for treatment
     Keywords:
     - General [G]: in general
     - Linear [L]: an equation represented as a line on a bivariate or multivariate plot
     - Model [M]: an equation
     - Regression: the extent to which pretest and posttest results (variables) agree along the line of the equation
  25. Regression line: the line through the cloud of scattered points [figure: control and treatment groups]
  26. b1 = β1 = the slope
  27. Linearity
  28. e: the vertical distance from the straight line to each point. This term is called "error" because it is the degree to which the line is in error in describing each point.
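The pieces of the straight-line model in slides 24 to 28 (intercept b0, slope b1 and error e) can be computed directly. This is a minimal least-squares sketch on made-up pretest and posttest scores; the names `pretest`, `posttest` and `fit_line` are illustrative.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical scores (illustrative data only).
pretest = [1, 2, 3, 4, 5]
posttest = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_line(pretest, posttest)

# e: vertical distance from the fitted line to each observed point.
errors = [yi - (b0 + b1 * xi) for xi, yi in zip(pretest, posttest)]

print("intercept b0:", round(b0, 3))
print("slope b1:", round(b1, 3))
print("errors e:", [round(e, 3) for e in errors])
```

The least-squares line is the one that makes the sum of squared errors as small as possible, and as a consequence the errors above and below the line cancel out to a sum of zero.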
  29. Posttest-Only Analysis (two-group posttest-only randomized experimental design)
     - Two groups
     - A post-only measure
     - Two distributions (measures), each with an average and variation
     - Assess the treatment effect = the statistical (i.e. non-chance) difference between the groups
  30. Three ways to estimate Posttest-Only Analysis
     - t-test
     - ANOVA (one-way Analysis of Variance)
     - ANCOVA (regression analysis): the most general
     All three give the same result.
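The claim that these approaches give the same result can be checked numerically for the t-test and one-way ANOVA: with two groups, the ANOVA F-ratio equals the square of the t-value. The sketch below uses made-up scores and assumes the pooled-variance form of the t-test, which is the form that matches ANOVA exactly (with equal group sizes it gives the same value as the slide-20 standard error).

```python
import statistics


def pooled_t(a, b):
    """Two-group t-value using the pooled variance estimate."""
    na, nb = len(a), len(b)
    pooled_var = (
        (na - 1) * statistics.variance(a) + (nb - 1) * statistics.variance(b)
    ) / (na + nb - 2)
    se_diff = (pooled_var * (1 / na + 1 / nb)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se_diff


def one_way_anova_f(groups):
    """F-ratio: between-group mean square over within-group mean square."""
    all_values = [v for g in groups for v in g]
    grand_mean = statistics.mean(all_values)
    ss_between = sum(
        len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups
    )
    ss_within = sum((v - statistics.mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical posttest scores (illustrative data only).
treatment = [54, 57, 52, 58, 55, 56, 53, 59, 55, 51]
control = [50, 49, 52, 48, 51, 47, 53, 50, 49, 51]

t = pooled_t(treatment, control)
f = one_way_anova_f([treatment, control])

print("t =", round(t, 3), "t squared =", round(t * t, 3), "F =", round(f, 3))
```

Because F equals t squared, both procedures lead to the same significance decision for a two-group design.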
  31. Posttest-Only Analysis result by t-test
  32. Posttest-Only Analysis result by regression
     In the statistical model, yi is the same as y in the straight-line formula (y = mx + b), β0 is the same as b, β1 is the same as m, and Zi is the same as x. In other words, in the statistical formula, β0 is the intercept and β1 is the slope.
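The statistical model that slide 32 describes was shown as an image. Written out from the symbols named in the slide, and assuming the 0/1 dummy coding of slide 22, it takes the form:

```latex
\[
y_i = \beta_0 + \beta_1 Z_i + e_i
\]
```

Here y_i is the posttest score of unit i, Z_i is the dummy variable (0 = control, 1 = treatment) and e_i is the error term. With this coding, β0 is the control group mean and β1 is the difference between the treatment and control means, so testing whether β1 differs from zero is the same question the t-test asks.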
  33. Factorial Design Analysis: 2x2
     - It is a regression analysis
     - The ingredients come from the 2x2 factorial table
     - A dummy variable (represented by a Z) for each factor
     - Two main effects and one interaction
     - The main effects are the statistics associated with the beta values that are adjacent to the Z-variables
     - The interaction effect is the statistic associated with β3 (i.e., the t-value for this coefficient)
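The regression equation behind slide 33 can be written out as follows. This is the usual form for a 2x2 design with one dummy variable per factor; the subscripts on Z are a notation assumed here.

```latex
\[
y_i = \beta_0 + \beta_1 Z_{1i} + \beta_2 Z_{2i} + \beta_3 Z_{1i} Z_{2i} + e_i
\]
```

β1 and β2 are the coefficients adjacent to the Z-variables and carry the two main effects, while β3 multiplies the product of the two dummies and carries the interaction. The product term is switched on only when both factors are at level 1.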
  34. Randomized Block Analysis
     - A regression analysis
  35. Analysis of Covariance = mx [linear regression]
     - ANOVA (one-way Analysis of Variance)
     - ANCOVA (one-way Analysis of Covariance)
  36. Non-equivalent Groups Analysis
     http://www.socialresearchmethods.net/kb/statnegd.php
  37. Regression-Discontinuity Analysis
     http://www.socialresearchmethods.net/kb/statrd.php
  38. Regression Point Displacement Analysis
     Requires:
     - A posttest score
     - A pretest score
     - A variable to represent the treatment group (where 0 = comparison and 1 = treatment)
     These requirements are identical to those for the ANCOVA, except that the RPD design has only a single treated group score. The model we will use is the Analysis of Covariance (ANCOVA) model.
  39. Regression Point Displacement Analysis
     The goal is to estimate the size of the vertical displacement of the treated unit from the regression line of all of the control units, indicated on the graph by the dashed arrow. The figure shows a bivariate (pre-post) distribution for a hypothetical RPD design of a community-based AIDS education program. The new AIDS education program is piloted in one particular county in a state, with the remaining counties acting as controls. The state routinely publishes annual HIV-positive rates by county for the entire state. The x-values show the HIV-positive rates per 1000 people for the year preceding the program, while the y-values show the rates for the year following it.
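The vertical displacement described in slide 39 can be sketched as: fit a regression line through the control units, then measure how far the treated unit falls from that line. All rates below are made-up numbers, not the figures from the slide, and the helper `fit_line` is hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Hypothetical rates per 1000 people for the control counties:
# x = year before the program, y = year after (illustrative data only).
control_pre = [2, 4, 6, 8, 10]
control_post = [3.2, 4.8, 7.1, 8.9, 11.0]

# Hypothetical rates for the single treated county.
treated_pre, treated_post = 6, 4.5

# Regression line of the control units only.
b0, b1 = fit_line(control_pre, control_post)

# Where the line predicts the treated county would be without the program.
predicted_post = b0 + b1 * treated_pre

# Vertical displacement of the treated unit from the control regression line.
displacement = treated_post - predicted_post

print("predicted posttest rate:", round(predicted_post, 3))
print("displacement:", round(displacement, 3))
```

A negative displacement means the treated county ended up below the rate the control counties would predict for it. Whether that displacement is larger than chance still has to be tested, which is what the ANCOVA model does.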
  40. Analysis of Covariance (ANCOVA) model
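The model on slide 40 was an image. The standard ANCOVA equation, written with the pretest, posttest and treatment variables listed in slide 38 (the symbol X for the pretest is assumed here), is:

```latex
\[
y_i = \beta_0 + \beta_1 X_i + \beta_2 Z_i + e_i
\]
```

Here y_i is the posttest score, X_i the pretest score, Z_i the dummy variable (0 = comparison, 1 = treatment) and e_i the error term. β1 is the slope of the pre-post regression line and β2 is the treatment effect; in the regression point displacement design, β2 is the vertical displacement of the treated unit from the line of the control units.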
  41. The cost of this ppt. You can improve by:
     - No plastic use
     - Going through the links given in this ppt, and through teachers, friends and workshops interested in statistics