Introduction to Statistics
Lecture Notes
Chapters 3-5
Please sign in (signatures) as you come in to class. That saves my voice, since I won’t have to call attendance (this is only to settle the class roster).
What’s up with the PowerPoint?
• I don’t usually use slides, but I am going to try them to save my voice somewhat.
• Notes: I am still working on getting the class roster settled. There has been some movement on the waitlist; I will keep in touch as things develop. Be sure you’ve signed in!
• The first homework is posted (on our course website), but isn’t due until next Friday (the 4th). The additional problem is NOT optional; “additional” just means it is not a book problem.
Handouts for Today
• There is one handout on graphs/descriptive statistics going around. Save this to use tomorrow in class.
• There is a second handout: the anonymous survey largely designed by the class on Monday. Please take a few minutes to fill it out (no names!) and get it back to me. We’ll take a look at this data next week in lab.
• If you missed class Monday, I have extra course syllabuses at the front as well.
The “W”s of a Data Set
• Who – the observations. The population is the set of all objects for which you want the value of some parameter. Since we usually can’t observe all of them, we take a sample: a subset of the population that we actually observe.
  • Note: There is NO such thing as a “population sample” or a “sample population.”
• What – the variables
• Why – why the data was collected
• How – how the data was collected (related to design/sampling in chapters 12-13)
• When/Where – more information that could be relevant
Chapters 3-5 Overview
• Covers basic graphs and descriptive statistics for both categorical and quantitative variables.
• This is what you would do as a “preliminary analysis” for a variable.
• Recall: a data set can have multiple variables in it.
• These chapters focus mostly on univariate (single-variable) analyses. There is one comparative graph: a side-by-side boxplot in Chapter 5.
3 Rules of Data Analysis
• Rule 1 – Make a picture
• Rule 2 – Make a picture (really, before you do anything else)
• Rule 3 – Make a picture (really, we mean a well-chosen picture for your variables)
Categorical Variable Preliminary Analysis
• Frequency tables (one variable) – summarize counts by category
• Contingency tables (2 or more variables) – summarize counts by category for multiple variables
• Bar charts
• Pie charts
Frequency
• What is frequency?
  • Frequency is the number of objects/cases per category.
• You can also look at relative frequency.
  • Relative frequency is the number of objects/cases per category divided by the total number of objects/cases.
  • Hence it gives the proportion of the total that falls in each category.
  • It is often converted to %.
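As a rough illustration of the two definitions above, here is a minimal Python sketch. The course software is R/Rcmdr, so Python is only an assumption made for these notes, and the eye-color data and the `frequency_table` helper are made up for illustration.

```python
from collections import Counter

# Hypothetical categorical data (made up): eye color for 10 cases.
eye_colors = ["brown", "blue", "brown", "green", "brown",
              "blue", "brown", "hazel", "blue", "brown"]

def frequency_table(values):
    """Return {category: (frequency, relative frequency)}."""
    counts = Counter(values)   # frequency: number of cases per category
    total = len(values)        # total number of cases
    return {cat: (n, n / total) for cat, n in counts.items()}

table = frequency_table(eye_colors)
for category, (freq, rel_freq) in sorted(table.items()):
    # Columns: category, frequency, relative frequency, percent
    print(f"{category:<6} {freq:>2}  {rel_freq:.2f}  {rel_freq:.0%}")
```

The relative frequencies always add up to 1 (100%), which is a quick check that no category was dropped.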
Bar Charts
• One bar per category – height is determined by frequency or relative frequency.
• Order of categories is arbitrary.
• Because of that, a bar chart does NOT let you talk about the shape of a distribution.
• “Area” principle – the area used for each category should be proportional to the value it represents. This is often violated when people try to make graphs “cool” and go 3-D, etc. (see the example passed around).
Pie Charts
• Take 100% of cases and divide up the 360 degrees of the circle based on relative frequencies.
• We will favor bar charts over pie charts.
• Note that for bar charts you do not need to create bars for 100% of the cases. You could look at the top three risk factors for a disease, etc. However, we usually do have 100% of cases shown.
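The "divide up 360 degrees" step for a pie chart is just relative frequency times 360. A small Python sketch, with a hypothetical helper `pie_angles` and made-up counts (Python itself is an assumption; the course uses R/Rcmdr):

```python
def pie_angles(counts):
    """Map each category's count to its slice angle in degrees."""
    total = sum(counts.values())
    # slice angle = relative frequency * 360
    return {cat: 360 * n / total for cat, n in counts.items()}

# Hypothetical counts, made up for illustration.
angles = pie_angles({"brown": 5, "blue": 3, "green": 1, "hazel": 1})
for category, degrees in angles.items():
    print(f"{category:<6} {degrees:6.1f} degrees")
```

Since a pie chart has to account for the full circle, the angles sum to 360; that is why a pie chart only makes sense when 100% of the cases are shown.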
Contingency Tables - Example
• See first page of the handout.
• Totals for rows/columns give the marginal distribution of each variable.
• You can also look at conditional distributions: fix a row or column and work solely within that row or column.
• Concept of independence (will formalize later):
  • If the distribution of one variable is the same for all categories of another variable, then the two variables are independent.
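To make marginal and conditional distributions concrete, here is a minimal Python sketch on a made-up 2x2 table. It is not the handout's table; the counts, the variable names, and every helper function are hypothetical, and Python is an assumption (the course uses R/Rcmdr). The `looks_independent` check only applies the informal rule stated above and is not a formal test.

```python
# Hypothetical counts, made up for illustration (not the handout's table).
# Rows: class year. Columns: answer to "Do you own a car?"
counts = {
    "first-year": {"yes": 20, "no": 80},
    "senior": {"yes": 60, "no": 40},
}

def row_marginals(table):
    """Total count in each row (marginal distribution of the row variable)."""
    return {row: sum(cells.values()) for row, cells in table.items()}

def column_marginals(table):
    """Total count in each column (marginal distribution of the column variable)."""
    totals = {}
    for cells in table.values():
        for col, n in cells.items():
            totals[col] = totals.get(col, 0) + n
    return totals

def conditional_on_row(table):
    """Distribution of the column variable within each fixed row."""
    return {
        row: {col: n / sum(cells.values()) for col, n in cells.items()}
        for row, cells in table.items()
    }

def looks_independent(table, tol=1e-9):
    """True if every row has the same conditional distribution."""
    dists = list(conditional_on_row(table).values())
    first = dists[0]
    return all(abs(d[col] - first[col]) <= tol for d in dists for col in first)

print(row_marginals(counts))
print(column_marginals(counts))
print(conditional_on_row(counts))
print(looks_independent(counts))
```

In this made-up table, 20% of first-years but 60% of seniors answer "yes", so the conditional distributions differ and the variables do not look independent. With real sample data the proportions are rarely exactly equal, which is part of why the idea gets formalized later.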
On Your Own
• The text has some discussion of segmented bar charts and side-by-side bar charts (feel free to read or skip).
Simpson’s Paradox
• Something that can happen when you aggregate categorical data
• Looking at overall averages or percentages can be misleading.
• You can get different results when you look at the breakdown by group.
• Berkeley Discrimination Data Example (see bottom of page one of the handout)
  • Claims of sexual discrimination in 1973 graduate school admissions
  • Overall, 44.28% of males who applied were admitted, while only 34.58% of females were admitted.
  • Look what happens when you break it down by the 6 largest departments, though! (Try this on your own or with a partner.) Is there evidence of discrimination against females at the department level? What is going on?
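Here is a small Python sketch of how the reversal can arise. The numbers are made up and are NOT the Berkeley data (use the handout for those); the department names and the helpers `rate_within` and `overall_rate` are hypothetical, and Python is an assumption (the course uses R/Rcmdr).

```python
# Hypothetical (applied, admitted) counts, made up for illustration.
# These are NOT the Berkeley numbers; see the handout for the real data.
applications = {
    "Dept X": {"male": (100, 80), "female": (20, 18)},
    "Dept Y": {"male": (20, 5), "female": (100, 30)},
}

def rate_within(data, dept, group):
    """Admission rate for one group inside one department."""
    applied, admitted = data[dept][group]
    return admitted / applied

def overall_rate(data, group):
    """Admission rate for one group after aggregating all departments."""
    applied = sum(d[group][0] for d in data.values())
    admitted = sum(d[group][1] for d in data.values())
    return admitted / applied

for dept in applications:
    print(dept,
          f"male {rate_within(applications, dept, 'male'):.0%},",
          f"female {rate_within(applications, dept, 'female'):.0%}")

print("Overall",
      f"male {overall_rate(applications, 'male'):.1%},",
      f"female {overall_rate(applications, 'female'):.1%}")
```

In these made-up numbers, females have the higher admission rate inside each department, yet the lower rate overall, because most females applied to the department that admits few applicants. Keep that mechanism in mind when you work through the real data.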
Quantitative Variables Preliminary Analysis
• Graphs
  • Dot plot – won’t use much – read about on your own
  • Stem and leaf – won’t use much – read about on your own
  • Histogram
  • Boxplot (chapter 5)
  • Q-Q plot (Friday or next week)
  • Time plot (Friday or next week)
• Descriptive statistics
  • Measures of center: mean, median
  • Measures of spread: standard deviation, IQR, range
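As a quick sketch of the descriptive statistics listed above, the following uses Python's standard `statistics` module on a made-up data set (Python is an assumption; the course uses R/Rcmdr). Note that quartile rules differ between textbooks and software, so the IQR here may not match a hand calculation done with the book's rule.

```python
import statistics

# Hypothetical quantitative data, made up for illustration.
data = [2, 4, 4, 5, 7, 9, 11]

# Measures of center
mean = statistics.mean(data)
median = statistics.median(data)

# Measures of spread
sd = statistics.stdev(data)                   # sample standard deviation (divides by n - 1)
q1, _, q3 = statistics.quantiles(data, n=4)   # quartiles, using the module's default rule
iqr = q3 - q1                                 # interquartile range
data_range = max(data) - min(data)            # range

print(f"mean={mean}, median={median}")
print(f"sd={sd:.3f}, IQR={iqr}, range={data_range}")
```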
Describing the distribution of a quantitative variable
• You should focus on three things when describing the distribution of a quantitative variable:
  • Shape – unimodal (one peak), bimodal (two peaks), multimodal (many peaks), bell-shaped, skewed left (tail to the left), skewed right (tail to the right), symmetric, uniform (no peaks, basically flat)
  • Center – estimate the center (or use a descriptive statistic)
    • If there are multiple peaks, report the peak locations
  • Spread – estimate the spread (or use a descriptive statistic)
Dot Plot – On Your Own
• Most basic quantitative graph
• Use for a low number of observations (<50)
• Draw a number line and place a dot above it for each value you have observed (stack the dots when a value repeats).
• Example from Wikipedia: [figure not included]
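Separately from the figure, a dot plot is simple enough to sketch in plain text. The following is a minimal Python sketch for small integer data; the data and the `dot_plot` helper are made up for illustration, and Python is an assumption (the course uses R/Rcmdr).

```python
from collections import Counter

def dot_plot(values):
    """Return a text dot plot for integer data: one '*' per observation."""
    counts = Counter(values)
    lo, hi = min(values), max(values)
    tallest = max(counts.values())
    rows = []
    # Build from the top row down; a value gets a mark on a row
    # if it was observed at least that many times.
    for level in range(tallest, 0, -1):
        rows.append("".join(" * " if counts[x] >= level else "   "
                            for x in range(lo, hi + 1)))
    # The number line underneath, one 3-character cell per value.
    rows.append("".join(f"{x:^3}" for x in range(lo, hi + 1)))
    return "\n".join(rows)

# Hypothetical observations, made up for illustration.
plot = dot_plot([1, 2, 2, 3, 3, 3, 5])
print(plot)
```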
Stem and Leaf – On Your Own
• Your book discusses lots of options for these, including split leaves (which is something R/Rcmdr will do).
• Basics: take your values and choose a stem, maybe the tens digit. Then the leaves are the ones digit. For each stem, list its leaves in numeric order.
• Usually works decently for fewer than 100 observations
• Try it. Suppose you have scores on a pre-test for an at-risk youth group as follows:
  • 5, 11, 13, 21, 34, 36, 45, 47, 48, 48, 49
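Once you have tried it by hand, you can check your display against this minimal Python sketch, which uses tens as stems and ones as leaves on the scores above. The `stem_and_leaf` helper is hypothetical and only handles non-negative whole numbers; Python is an assumption (the course uses R/Rcmdr, whose stem display may split stems differently).

```python
from collections import defaultdict

# Pre-test scores from the slide.
scores = [5, 11, 13, 21, 34, 36, 45, 47, 48, 48, 49]

def stem_and_leaf(values):
    """Group non-negative integers by tens digit (stem) and ones digit (leaf)."""
    stems = defaultdict(list)
    for v in sorted(values):          # sorting puts leaves in numeric order
        stems[v // 10].append(v % 10)
    lo, hi = min(stems), max(stems)
    # Include every stem in the range, even ones with no leaves,
    # so that gaps in the data stay visible.
    return {s: stems.get(s, []) for s in range(lo, hi + 1)}

display = stem_and_leaf(scores)
for stem, leaves in display.items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```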
Histogram
• Take the quantitative variable and break its range up into “piles” or “bins” (usually of the same width).
• Count the number of observations in each bin.
• Plot the frequency for each bin.
• Usually there are no spaces between bins. If there is a space, it is a real gap in the data, NOT a separator between bars as in a bar chart.
• You DO need to know the boundaries. (5,10], (10,15] as bins IS different from [5,10), [10,15): a value of exactly 10 lands in the first bin under the first scheme and in the second bin under the second. (If anyone needs me to explain open/closed brackets, please ask.)
• Technology lets us vary the width of the bins (and so, effectively, the number of bins).
• You can also use unequal bin widths, but then you need to plot something called density, not frequency.
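To see why the boundaries matter, here is a minimal Python sketch that counts the same made-up values under both bracket conventions, plus a density calculation for unequal bin widths. The data and the helpers `bin_counts` and `densities` are hypothetical, and Python is an assumption (the course uses R/Rcmdr). Density here is taken to mean relative frequency divided by bin width, so that the bar areas add up to 1.

```python
def bin_counts(values, edges, closed="right"):
    """Count values per bin; closed='right' means (lo, hi], 'left' means [lo, hi)."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(edges) - 1):
            lo, hi = edges[i], edges[i + 1]
            inside = (lo < v <= hi) if closed == "right" else (lo <= v < hi)
            if inside:
                counts[i] += 1
                break
    return counts

def densities(counts, edges):
    """Density per bin = relative frequency / bin width."""
    total = sum(counts)
    return [c / total / (edges[i + 1] - edges[i]) for i, c in enumerate(counts)]

# Hypothetical data, made up for illustration.
values = [5, 7, 10, 10, 12, 15]
right_closed = bin_counts(values, [5, 10, 15], closed="right")  # (5,10], (10,15]
left_closed = bin_counts(values, [5, 10, 15], closed="left")    # [5,10), [10,15)
print(right_closed, left_closed)

# Unequal widths: bins (0,5] and (5,15] on another made-up data set.
uneven_edges = [0, 5, 15]
uneven_counts = bin_counts([1, 2, 4, 6, 14], uneven_edges)
uneven_density = densities(uneven_counts, uneven_edges)
print(uneven_counts, uneven_density)
```

Notice that the 5 falls outside every bin under (5,10], (10,15], and the 15 falls outside every bin under [5,10), [10,15), so the choice of brackets can even change how many observations get plotted.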
Examples
• See page 2 of the handout
  • Try to describe the shape of each histogram
• Then see page 3 of the handout
  • We’re going to create a histogram by hand if there is time
  • If no time, you can do this on your own.
Cookie Lab
• Time permitting (otherwise, Friday)
• The last page (to turn in) is not due until the end of class tomorrow, so don’t worry if we don’t get to it today. You can look at it tonight or tomorrow in class (I’ll give you the last five minutes of class to work on it).
