Chapter Sixteen: EXPLORING, DISPLAYING, AND EXAMINING DATA
Types of Data Analysis
• Exploratory data analysis
  – The data guide the choice of analysis, or a revision of the planned analysis
• Confirmatory data analysis
  – Closer to classical statistical inference in its use of significance and confidence
  – May use information from a closely related data set, or validate findings by gathering and analyzing new data
Techniques to Display and Examine Distributions
• Frequency table
• Visual displays
  – Histograms
  – Stem-and-leaf display
  – Box-plot
• Crosstabulation of variables
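As a rough illustration of the first technique on this slide, a frequency table can be built by counting each distinct value and converting the counts to percentages and cumulative percentages. The function name `frequency_table` and the ratings below are hypothetical and used only for this sketch.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count, percent, cumulative percent) rows, sorted by value."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0.0
    for value in sorted(counts):
        percent = 100.0 * counts[value] / total
        cumulative += percent
        rows.append((value, counts[value], percent, cumulative))
    return rows


# Made-up ratings on a 1-5 scale (illustrative data only).
ratings = [3, 4, 4, 5, 2, 4, 3, 5, 4, 1]

print(f"{'value':>5} {'count':>5} {'pct':>7} {'cum pct':>8}")
for value, count, percent, cumulative in frequency_table(ratings):
    print(f"{value:>5} {count:>5} {percent:>7.1f} {cumulative:>8.1f}")
```

The cumulative column should end at 100 percent, which is a quick check that no observation was dropped.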
Techniques to Display and Examine Distributions
• Histograms
  – Display all intervals in a distribution, even without observed values
  – Examine the shape of the distribution for skewness, kurtosis, and the modal pattern
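The two histogram points can be sketched in a few lines: fixed-width intervals are counted so that empty intervals still appear, and a moment-based skewness coefficient summarizes the shape. The helper names and the ages are assumptions made for this example, and the skewness formula used is the simple population version (third central moment divided by the 1.5 power of the second), which is only one of several conventions.

```python
def histogram_counts(values, low, width, n_bins):
    """Count values in n_bins fixed-width intervals starting at `low`."""
    counts = [0] * n_bins  # every interval is kept, even if it stays at zero
    for v in values:
        index = int((v - low) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


def skewness(values):
    """Population skewness: m3 / m2**1.5 (positive means a longer right tail)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up ages (illustrative data only).
ages = [21, 23, 24, 25, 27, 28, 29, 31, 33, 58]
low, width, n_bins = 20, 10, 4
counts = histogram_counts(ages, low, width, n_bins)

for i, count in enumerate(counts):
    start = low + i * width
    print(f"{start}-{start + width - 1}: {'*' * count}")
print("skewness:", round(skewness(ages), 2))
```

With this data the 40-49 interval is printed with no stars, which keeps the gap before the single high value visible.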
Techniques to Display and Examine Distributions (cont.)
• Box-plot (box-and-whisker plot)
  – Rectangular plot encompasses 50% of the data values
    • Edges of the box (hinges)
  – Center line through the width of the box marks the median
  – Whiskers extend from the right and left hinges to the largest and smallest values
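The five values a box-plot draws can be computed directly. The sketch below assumes Tukey-style hinges (the median of each half of the sorted data, with the overall median included in both halves when the count is odd); other quartile conventions give slightly different box edges. The function name and the scores are hypothetical.

```python
from statistics import median


def five_number_summary(values):
    """Return (smallest, lower hinge, median, upper hinge, largest)."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2  # size of each half; the halves share the median when n is odd
    lower_hinge = median(data[:half])
    upper_hinge = median(data[n - half:])
    return data[0], lower_hinge, median(data), upper_hinge, data[-1]


# Made-up test scores (illustrative data only).
scores = [2, 4, 5, 7, 8, 10, 12, 13, 15]
smallest, lower_hinge, mid, upper_hinge, largest = five_number_summary(scores)

print("box edges (hinges):", lower_hinge, upper_hinge)
print("center line (median):", mid)
print("whisker ends:", smallest, largest)
```

The box spans the two hinges, so it covers roughly the middle half of the observations, and the whiskers run from the hinges out to the smallest and largest values as described on the slide.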
Techniques to Display and Examine Distributions (cont.)
• Transformation
  – To improve interpretation and compatibility with other data sets
  – To enhance symmetry and stabilize spread
  – To improve linear relationships between and among variables
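To make the symmetry point concrete, the sketch below applies a natural-log transformation, chosen here only as one common example, to a right-skewed set of made-up incomes and compares skewness before and after. The `skewness` helper and the data are assumptions for illustration.

```python
import math


def skewness(values):
    """Population skewness: m3 / m2**1.5 (positive means a longer right tail)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up incomes in thousands, with a long right tail (illustrative data only).
incomes = [20, 22, 25, 27, 30, 35, 40, 60, 120, 400]
logged = [math.log(x) for x in incomes]

print("skewness before:", round(skewness(incomes), 2))
print("skewness after log:", round(skewness(logged), 2))
```

The log pulls in the largest values far more than the smallest, so the transformed distribution is noticeably less lopsided, though for this data it is still not symmetric.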
Improvement & Control Analysis
• Statistical process control
  – Uses statistical tools to analyze, monitor, and improve process performance
  – Total Quality Management
  – Control chart
    • Displays sequential measurements of a process together with a center line and control limits
      – Upper control limit
      – Lower control limit
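One way to sketch the center line and control limits is an individuals chart, where the limits are set at the mean plus or minus 2.66 times the average moving range (2.66 is the usual constant 3/1.128 for ranges of two consecutive points). This chart type is an assumption chosen for brevity, not something the slide specifies, and the fill weights are made up. In practice the limits would normally be set from a baseline period known to be stable.

```python
def individuals_chart_limits(measurements):
    """Return (lower control limit, center line, upper control limit)."""
    center = sum(measurements) / len(measurements)
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(measurements, measurements[1:])]
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    spread = 2.66 * mr_bar  # 3 / 1.128, the standard factor for moving ranges of size 2
    return center - spread, center, center + spread


# Made-up sequential fill weights (illustrative data only).
fill_weights = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9, 10.0, 11.5]
lcl, center, ucl = individuals_chart_limits(fill_weights)

out_of_control = [i for i, w in enumerate(fill_weights) if w < lcl or w > ucl]
print(f"LCL={lcl:.3f}  center={center:.3f}  UCL={ucl:.3f}")
print("points outside the limits (0-based positions):", out_of_control)
```

Only the last measurement falls outside the limits here, which is the kind of signal a control chart is meant to surface.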
Types of Control Charts
• Variables data (ratio or interval measurements)
  – X-bar charts
  – R-charts
  – s-charts
• Pareto diagrams
  – Bar chart whose percentages sum to 100 percent
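The Pareto diagram's defining property, bars ordered from largest to smallest with percentages that sum to 100, can be checked with a short sketch. The category names and counts are invented for the example, and `pareto_table` is a hypothetical helper.

```python
def pareto_table(counts):
    """Return (category, percent, cumulative percent) rows, largest category first."""
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    for category, count in ordered:
        percent = 100.0 * count / total
        cumulative += percent
        rows.append((category, percent, cumulative))
    return rows


# Made-up complaint counts by cause (illustrative data only).
complaints = {
    "wrong item": 15,
    "late delivery": 45,
    "billing error": 10,
    "damaged item": 30,
}

for category, percent, cumulative in pareto_table(complaints):
    bar = "#" * round(percent / 5)
    print(f"{category:<15} {percent:5.1f}%  cum {cumulative:5.1f}%  {bar}")
```

Reading down the cumulative column shows how few causes account for most of the complaints, which is the usual reason for drawing the diagram.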
Geographic Information Systems
• Systems of hardware, software, and procedures that capture, store, manipulate, integrate, and display spatially-referenced data
Geographic Information Systems
• Minimum four components
  – Integrating information from various sources
  – Capturing data
  – Projection and restructuring
  – Modeling
Crosstabulation
• A technique for comparing two classification variables
  – Cells
  – Marginals
  – Contingency tables
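The three terms on this slide map onto a small data structure: the cells are counts for each combination of the two variables, the marginals are the row and column totals, and together they form the contingency table. The sketch below uses made-up survey responses and a hypothetical `crosstab` helper.

```python
from collections import Counter


def crosstab(pairs):
    """Return (cell counts, row marginals, column marginals) for (row, column) pairs."""
    cells = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    col_totals = Counter(col for _, col in pairs)
    return cells, row_totals, col_totals


# Made-up responses: (region, would recommend), illustrative data only.
responses = (
    [("urban", "yes")] * 22
    + [("urban", "no")] * 8
    + [("rural", "yes")] * 9
    + [("rural", "no")] * 11
)
cells, row_totals, col_totals = crosstab(responses)

columns = sorted(col_totals)
print(f"{'':<8}" + "".join(f"{c:>6}" for c in columns) + f"{'total':>7}")
for row in sorted(row_totals):
    counts = "".join(f"{cells[(row, c)]:>6}" for c in columns)
    print(f"{row:<8}{counts}{row_totals[row]:>7}")
print(f"{'total':<8}" + "".join(f"{col_totals[c]:>6}" for c in columns) + f"{len(responses):>7}")
```

Row percentages follow from dividing each cell by its row marginal, for example 22 of 30 urban respondents answering yes.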
Percentaging Errors
• Averaging percentages without weighting
• Using too-large percentages (>100%)
• Using percentages with a very small sample
• Citing a percentage decrease exceeding 100 percent
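Two of these errors are easy to show with arithmetic. The sketch below uses invented group sizes to contrast a simple average of percentages with the correctly weighted figure, and then shows why a decrease cannot exceed 100 percent when the quantity cannot go below zero. All names and numbers are assumptions for illustration.

```python
# Each tuple is (sample size, number answering yes); made-up data.
groups = [(10, 5), (90, 9)]

# Error: averaging the two group percentages as if the groups were the same size.
unweighted = sum(100.0 * yes / n for n, yes in groups) / len(groups)

# Correct: pool the counts, which weights each group by its sample size.
weighted = 100.0 * sum(yes for _, yes in groups) / sum(n for n, _ in groups)

print("unweighted average of percentages:", unweighted)
print("weighted (pooled) percentage:", weighted)


def percent_change(old, new):
    """Percentage change from old to new, relative to the old value."""
    return 100.0 * (new - old) / old


# A rise from 50 to 200 is a 300 percent increase, but the reverse move
# is only a 75 percent decrease, because the base is the starting value.
print("50 -> 200:", percent_change(50, 200))
print("200 -> 50:", percent_change(200, 50))
```

The small group of 10 drags the unweighted figure up to 30 percent, more than double the pooled 14 percent.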
Other Table-based Analysis
• Automatic Interaction Detection (AID)
  – Sequential partitioning procedure that uses a dependent variable and a set of predictors
  – Searches among up to 300 variables for the best single division of data into subsets according to each predictor variable
  – Chooses one division approach
  – Splits the sample using chi-square tests to create multi-way splits
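A single partitioning step of this kind can be approximated as follows: crosstabulate each predictor against the dependent variable, compute the Pearson chi-square statistic, and split on the predictor with the strongest association. This is a simplified sketch, not the actual AID procedure; it compares raw chi-square values rather than p-values, so it is only fair when predictors have the same number of categories, and it does not merge categories or recurse. The records and function names are hypothetical.

```python
from collections import Counter


def chi_square(pairs):
    """Pearson chi-square statistic for a list of (predictor, outcome) pairs."""
    n = len(pairs)
    observed = Counter(pairs)
    row_totals = Counter(p for p, _ in pairs)
    col_totals = Counter(o for _, o in pairs)
    stat = 0.0
    for r in row_totals:
        for c in col_totals:
            expected = row_totals[r] * col_totals[c] / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def best_split(records, predictors, target):
    """Return the predictor with the largest chi-square, plus all scores."""
    scores = {
        p: chi_square([(rec[p], rec[target]) for rec in records])
        for p in predictors
    }
    return max(scores, key=scores.get), scores


# Made-up records (illustrative data only).
records = [
    {"income": "high", "region": "east", "buys": "yes"},
    {"income": "high", "region": "west", "buys": "yes"},
    {"income": "high", "region": "east", "buys": "yes"},
    {"income": "high", "region": "west", "buys": "no"},
    {"income": "low", "region": "east", "buys": "no"},
    {"income": "low", "region": "west", "buys": "no"},
    {"income": "low", "region": "east", "buys": "no"},
    {"income": "low", "region": "west", "buys": "yes"},
]

chosen, scores = best_split(records, ["income", "region"], "buys")
print("split on:", chosen)
print("chi-square by predictor:", scores)
```

A fuller version would repeat the same search within each resulting subset, stopping when no predictor shows a statistically significant association.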