Stats Workshop2010


Paul Garthwaite's Presentation on Statistics


  1. MCT Mathematics & Statistics Paul Garthwaite [email_address] http://statistics.open.ac.uk/advisory.html Introduction to Statistical Analysis
  2. The Scientific Method <ul><li>Deductive reasoning: </li></ul><ul><ul><li>from the general to the specific (&quot;top-down&quot; approach) </li></ul></ul>
  3. Theory: In a pig’s digestive system, all phosphate ions are the same, regardless of what they were bound with. Theory: If you are a diabetic, losing weight will help you live longer.
  4. Study Design (deductive reasoning)
  5. Hypothesis testing is like a court of law: you aim to disprove the null hypothesis. The hypothesis of a court: the person in the dock is innocent. The aim is to gather evidence that is inconsistent with this hypothesis. We reject the hypothesis (and decide the person is guilty) if the evidence makes the hypothesis unlikely (beyond all reasonable doubt).
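The court analogy can be made concrete with a small numerical sketch. The example below is illustrative and not from the slides: it assumes a null hypothesis that a coin is fair and asks how surprising 9 heads in 10 tosses would be. The function name `binomial_two_sided_p` is hypothetical.

```python
from math import comb

def binomial_two_sided_p(successes: int, trials: int) -> float:
    """Exact two-sided p-value for H0: success probability = 0.5."""
    # Under H0 the distribution is symmetric about trials / 2, so an outcome
    # is "at least as extreme" if it lies at least as far from trials / 2.
    observed_distance = abs(successes - trials / 2)
    extreme = sum(
        comb(trials, k)
        for k in range(trials + 1)
        if abs(k - trials / 2) >= observed_distance
    )
    return extreme / 2 ** trials

# Evidence: 9 heads in 10 tosses (made-up data).
p_value = binomial_two_sided_p(9, 10)
print(f"p-value = {p_value:.4f}")
```

A small p-value plays the role of evidence that is hard to reconcile with innocence; how small it must be before rejecting the null hypothesis is a threshold chosen in advance.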
  6. Inductive Reasoning <ul><li>From a set of specific observations to broader generalizations and theories (&quot;bottom-up&quot; approach) </li></ul>
  7. Observational Study (inductive reasoning)
  8. Observational studies could feed into inductive reasoning. Pilot studies have a place in forming hypotheses. Some disciplines (e.g. psychology) seem to disapprove of observational studies. Presumably such studies are written up as if the hypotheses were decided before gathering the data. (A dangerous practice!)
  9. Statistical Design <ul><li>Study can be: </li></ul><ul><ul><li>Observational: analyse existing data (inductive) </li></ul></ul><ul><ul><li>Experimental: produce new data (deductive) </li></ul></ul><ul><li>Relies on random sampling </li></ul><ul><ul><li>Obtain information about the whole from analysing the part (inferential statistics) </li></ul></ul><ul><li>Experimental design: </li></ul><ul><ul><li>randomly allocates conditions/treatments to subjects to observe their response </li></ul></ul>
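Random allocation of treatments to subjects can be sketched in a few lines. This is a minimal illustration, assuming equal-sized groups are wanted; the subject labels, treatment names and the helper `allocate` are all hypothetical.

```python
import random

def allocate(subjects, treatments, seed=None):
    """Randomly assign subjects to treatments in near-equal group sizes."""
    rng = random.Random(seed)       # a seed makes the allocation reproducible
    shuffled = list(subjects)
    rng.shuffle(shuffled)           # random order removes any selection pattern
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        # Deal the shuffled subjects out to the treatments in turn.
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

pigs = [f"pig_{i:02d}" for i in range(1, 13)]   # made-up subject labels
groups = allocate(pigs, ["compound_A", "compound_B", "compound_C"], seed=2010)
for treatment, members in groups.items():
    print(treatment, members)
```

Shuffling first and then dealing out keeps group sizes balanced while leaving the assignment of any individual subject to chance.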
  10. Warning <ul><li>Poor designs can lead to: </li></ul><ul><li>Inefficient use of collected data </li></ul><ul><li>Difficult statistical analysis </li></ul><ul><li>Inability to draw meaningful conclusions </li></ul>
  11. Use Common Sense <ul><li>Think about questions your research might answer. </li></ul><ul><li>Can you gather data related to those questions? </li></ul><ul><li>Using common sense, would the data answer those questions? </li></ul><ul><li>Pigs and phosphates: feed pigs different phosphate compounds and see if their bone strengths differ? </li></ul><ul><li>Diabetes and diet: use patient notes to get age at death, age at diagnosis, and weight loss in first year after diagnosis. </li></ul>
  12. <ul><li>In many ways, statistics just makes common sense rigorous. </li></ul><ul><li>Think about what covariates may be relevant and try to measure them (gender and age in many social contexts; smoking in medical studies; etc.) </li></ul><ul><li>Try to reduce random variation. </li></ul>
  13. Gather lots of data <ul><li>A decent experiment will generally form about a quarter of a PhD (perhaps more) – four papers are enough for a PhD in most disciplines. </li></ul><ul><li>Designing an experiment, collecting data, analysing it, writing a paper, revising the paper, and so on, will take several months. </li></ul><ul><li>People typically do not spend enough time gathering data. The data drive the conclusions you can reach. </li></ul><ul><li>More data = Firmer conclusions </li></ul>
  14. How much data? (My rules of thumb.) <ul><li>In a controlled experiment where the quantity of interest is a measurement, forty or so independent observations will typically enable modest-sized differences to be identified. </li></ul><ul><li>With observational data and questionnaire data, gathering 150 observations or more should typically be the aim: you want 25 observations in each category of interest. </li></ul><ul><li>More data are needed with counts than with measurements. </li></ul><ul><li>More data are needed with binary quantities (yes/no; cured/not cured; success/failure) than with Likert scores. </li></ul>
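For comparison with rules of thumb like these, a textbook normal-approximation formula for the sample size needed to compare two group means can be sketched as follows. It is not the author's method: it assumes equal group sizes, a common known standard deviation, and a two-sided test, and the function name `n_per_group` and its defaults are illustrative.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate observations per group to detect a standardised difference.

    effect_size is (difference in means) / (common standard deviation).
    Uses n = 2 * (z_{1 - alpha/2} + z_{power})^2 / effect_size^2.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for a two-sided test
    z_beta = z.inv_cdf(power)             # quantile matching the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)

for d in (0.25, 0.5, 1.0):
    print(f"effect size {d}: about {n_per_group(d)} per group")
```

Halving the effect size roughly quadruples the required sample, which is one way to see why more data gives firmer conclusions.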
  15. Questionnaires Likert scales are good: strongly agree / weakly agree / indifferent / disagree / strongly disagree. Having five points on a Likert scale is often about right. Code the values as 1, 2, 3, 4, 5 and it is usually OK to treat them as measurements. Open-ended questions are hard to analyse.
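Coding Likert responses as numbers might look like the sketch below. The label wording, the direction of the coding (5 for the most positive answer) and the sample responses are all assumptions made for illustration.

```python
from statistics import mean, stdev

# Hypothetical five-point coding; the direction (1 or 5 for agreement) is a choice.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "indifferent": 3,
    "agree": 4,
    "strongly agree": 5,
}

responses = ["agree", "strongly agree", "indifferent", "agree", "disagree"]  # made-up data
scores = [LIKERT_CODES[r] for r in responses]

# Once coded, the scores can be summarised like any other measurement.
print("scores:", scores)
print("mean:", mean(scores))
print("standard deviation:", round(stdev(scores), 3))
```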
  16. Statistical Data Analysis <ul><li>Turning data into information: First produce summary statistics (means, percentages, standard deviations), graphs, bar-charts, cross-tabulations. </li></ul><ul><li>Try to get a feel for your data – what does it tell you? (If you feel you are non-numerate, work at becoming numerate.) </li></ul><ul><li>Try to form quantitative hypotheses that you think the data will refute. (e.g. “The proportions in the ‘strongly agree’ category are the same in these two sub-populations” or “As this quantity changes, the average value of this other quantity does not change”.) </li></ul>
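A first pass at summary statistics and a cross-tabulation needs nothing beyond the Python standard library. The records below are invented for illustration (group, yes/no answer, age); the field layout is an assumption, not something from the slides.

```python
from collections import Counter
from statistics import mean, stdev

# Made-up records: (group, answer, age)
records = [
    ("F", "yes", 34), ("F", "no", 29), ("M", "yes", 41),
    ("M", "yes", 38), ("F", "yes", 45), ("M", "no", 52),
]

ages = [age for _, _, age in records]
pct_yes = 100 * sum(answer == "yes" for _, answer, _ in records) / len(records)

# Cross-tabulation: count each (group, answer) combination.
crosstab = Counter((group, answer) for group, answer, _ in records)

print(f"mean age = {mean(ages):.1f}, sd = {stdev(ages):.1f}")
print(f"percentage answering yes = {pct_yes:.1f}%")
for (group, answer), count in sorted(crosstab.items()):
    print(group, answer, count)
```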
  17. Common fundamental statistical methods <ul><li>t-tests </li></ul><ul><li>Comparison of proportions </li></ul><ul><li>Contingency tables </li></ul><ul><li>Regression </li></ul><ul><li>Analysis of variance </li></ul><ul><li>It is worth knowing when these are useful. </li></ul>
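As one example from this list, a comparison of two proportions can be sketched with the usual large-sample z-test. This assumes both samples are independent and large enough for the normal approximation; the counts and the function name `two_proportion_z_test` are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Two-sided z-test of H0: the two population proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                 # common proportion under H0
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))   # both tails
    return z, p_value

# Made-up counts: 45 of 100 versus 30 of 100 in the category of interest.
z, p_value = two_proportion_z_test(45, 100, 30, 100)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```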
  18. Regression <ul><li>In many ways regression is the most useful statistical method. </li></ul><ul><li>It lets you test whether one variable affects another (while controlling for other covariates if necessary). </li></ul><ul><li>It also describes the relationship. </li></ul><ul><li>Stepwise methods help you find/test which variables are important. </li></ul><ul><li>Generalised linear models add flexibility. </li></ul>
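The simplest case, a straight line fitted by least squares to one explanatory variable, can be sketched as follows. The data and the helper `fit_line` are hypothetical; a real analysis with covariates, tests and diagnostics would normally use a statistical package.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = (sum of cross-products of deviations) / (sum of squared x deviations).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # the fitted line passes through the means
    return slope, intercept

# Made-up data: a response that rises roughly linearly with x.
xs = [1, 2, 3, 4, 5, 6]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
slope, intercept = fit_line(xs, ys)
print(f"fitted line: y = {intercept:.2f} + {slope:.2f} * x")
```

The slope describes the relationship (the average change in the response per unit change in x), and testing whether it differs from zero is the test of whether one variable affects the other.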
  19. <ul><li>There is an advisory service that can help with: </li></ul><ul><ul><li>Designing an experiment </li></ul></ul><ul><ul><li>How to approach the analysis of data </li></ul></ul><ul><ul><li>Choosing appropriate techniques </li></ul></ul><ul><ul><li>Interpreting results </li></ul></ul><ul><ul><li>Understanding outputs from statistical packages </li></ul></ul><ul><li>Too few people ask for advice before gathering data. </li></ul>
  20. Statistical Software <ul><li>Packages are only tools (‘number crunchers’) </li></ul><ul><ul><li>Most important is to choose an adequate method for your problem </li></ul></ul><ul><ul><li>Remember: </li></ul></ul><ul><ul><li>Garbage in → Garbage out </li></ul></ul>
  21. Some Statistical Packages <ul><li>General software (e.g. spreadsheets) </li></ul><ul><li>Specialised: </li></ul><ul><ul><li>GenStat, Minitab, SAS, Statistica, </li></ul></ul><ul><ul><li>SPSS </li></ul></ul><ul><ul><ul><li>wide range of statistical procedures </li></ul></ul></ul><ul><ul><ul><li>good graphical capability </li></ul></ul></ul><ul><ul><ul><li>fairly easy to use (menu-driven option) </li></ul></ul></ul><ul><ul><ul><li>good help facility with case studies </li></ul></ul></ul>
  22. Statistics Courses <ul><li>M248 : Analysing Data </li></ul><ul><ul><li>Exploratory data analysis. Models for data. Estimation. Confidence intervals. Hypothesis testing. Regression and two-variable problems. (Minitab) </li></ul></ul><ul><li>M249 : Practical Modern Statistics </li></ul><ul><ul><li>Medical statistics. Time series analysis. Multivariate statistics. Bayesian methods. </li></ul></ul><ul><ul><li>Focus on applications: SPSS and WinBUGS. </li></ul></ul>
  23. Statistics Courses <ul><li>M343 : Applications of Probability </li></ul><ul><ul><li>Models to describe patterns in time and space. Epidemiological models. Genetics and stock market price applications. </li></ul></ul><ul><li>M346 : Linear Statistical Modelling </li></ul><ul><ul><li>ANOVA. Design of experiments. Linear regression. Generalized linear models. Diagnostic checking. Log-linear models. (GenStat) </li></ul></ul>
  24. The Stats-Advisory Service <ul><li>Drop-in sessions </li></ul><ul><ul><li>Mondays : 2:00 – 4:00 (M216) </li></ul></ul><ul><ul><li>Thursdays : 10:30 – 12:20 (M214) (Both in Maths and Computing Building) </li></ul></ul><ul><li>Web: </li></ul><ul><ul><li>http://statistics.open.ac.uk/advisory.html </li></ul></ul><ul><li>E-mail: </li></ul><ul><ul><li>[email_address] </li></ul></ul>
