Boot Camp R Fall 2013
PT651 Lab Component
Slides by Sherri Verdugo
•  Brief definition of statistics: The subject of
statistics deals with techniques for collecting,
analyzing, and drawing conclusions from data. In
science, or any subject, even psychology, you
must know how to read the literature and how to
interpret data. Statistics is your key to knowing
when you are given false data or, for example,
whether that new medication really works. The definition
is from Statistical Methods, Eighth Edition, by George
W. Snedecor and William G. Cochran.
Statistics
Brief description.
R Boot Camp and Descriptive Information
Agenda
Introduction to the Lab Component
Wrap Up
R: The basics and Rstudio; The basics
Descriptive Statistics & R
Revisiting the Blog and Installation
Parameters, Statistics oh my….the magic behind the curtain
Measures of Variability & R
Installing on your desktop means that the environment is local on your machine!
Installing on the Cloud means you are portable and accessible!
• The Blog
Installation tips
- http://g2research.blogspot.com/2013_08_01_archive.html
ü PC, Mac, Unix, etc.
• Desktop or Cloud
R & R Studio
- R on the Desktop
✓ R on the Cloud
The Blog and Installation of R
Revisiting the Past
Desktop versus the cloud
R
R is available locally on your computer, and its performance
depends on the resources (memory, processor) of your computer!
R is available on the internet! Fast!
RStudio and R are located on the cloud.
Cloud
R is located on your local machine. Drawback:
it is available in only one environment.
Desktop
A sample represents a population. Dependent Variables are cheaper and easier to change :)
The magic behind the curtain.
Parameters, Statistics, oh my!
Population characteristics.
Parameters
Measures of sample data.
Statistics
Rank order of data values for a
variable.
f = Frequency Distribution
Caveat:
Class: grouping of scores w/ unique range
Independent Variables: costly or nearly
impossible to change. I.E.
Dependent Variables: outcome variables.
Mutually Exclusive: no overlap in classes
Mutually Exhaustive: each score fits into a
class
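A frequency distribution with mutually exclusive and exhaustive classes can be sketched in R as follows. This is a minimal illustration, not part of the original lab script: the `scores` vector and the class width of 10 are made-up assumptions.

```r
# Hypothetical exam scores (made up for illustration)
scores <- c(55, 62, 67, 71, 71, 74, 78, 83, 88, 95)

# Class boundaries: each class covers a unique range of 10 points.
# right = FALSE makes each class closed on the left, e.g. [50,60),
# so no score can fall into two classes (mutually exclusive).
breaks <- seq(50, 100, by = 10)
classes <- cut(scores, breaks = breaks, right = FALSE)

# f = frequency of scores in each class
freq <- table(classes)
freq

# Every score fits into exactly one class (exhaustive):
# the frequencies add up to the number of scores.
sum(freq) == length(scores)
```

If a score fell outside the breaks, `cut()` would return `NA` for it, which is a quick way to spot classes that are not exhaustive.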
Descriptive Statistics: ways to represent data using characterization of
central tendency, shape and variability in the data.
Three Measures of Central Tendency
Measures of Central Tendency:
central value or a typical value for a probability distribution
M3
Which score changes the most when you change the
scores in the data?
Frequencies tell a story as well!
The most frequent score of the sequence.
Mode
The average of all scores. X̄ is the sample mean and µ is the population mean.
Mean
The middle number in a sequence. Place the numbers in order. If N is odd, the median is the exact
midpoint. If N is even, add the two middle numbers together and divide by two.
Median
$\bar{X} = \dfrac{x_1 + x_2 + \dots + x_n}{n}$
$\mu_x = \sum x \, P(x)$
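The three measures of central tendency can be tried on a small vector. The sketch below uses made-up scores, and `stat_mode` is a hypothetical helper name (base R's `mode()` reports the storage type, not the most frequent value). It also shows which measure moves the most when one score changes.

```r
x <- c(2, 4, 4, 5, 7, 9, 40)     # made-up scores; 40 is an extreme value

x_bar <- sum(x) / length(x)      # sample mean, same result as mean(x)
med   <- median(x)               # middle score after sorting

# Hypothetical helper: the most frequent score(s) in a vector
stat_mode <- function(v) {
  counts <- table(v)
  as.numeric(names(counts)[counts == max(counts)])
}
mo <- stat_mode(x)

# Change the extreme score and recompute
x2 <- replace(x, x == 40, 10)
c(mean = mean(x),  median = median(x),  mode = stat_mode(x))
c(mean = mean(x2), median = median(x2), mode = stat_mode(x2))

# Population mean of a discrete variable: mu = sum of x * P(x)
# (a fair six-sided die, chosen only as an example)
faces <- 1:6
p     <- rep(1 / 6, 6)
mu_die <- sum(faces * p)
```

Only the mean changes between the two runs; the median and mode stay put, which is why the mean is described as the measure most sensitive to changes in the scores.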
Variability = dispersion of scores.
Adding the deviations results in zero? That’s not what I need. NEW CONCEPT: Sum of Squares (SS).
Percentiles = values that divide the ordered scores into 100 equal parts. What percentile is a score in the data set?
Quartiles: Q1=25%, Q2=50%, Q3=75%, Q4=100%
Standard Deviation = how much the scores vary from the mean (the value we expect).
Range = difference between highest and lowest scores.
Variance = how far a set of numbers is “dispersed” or spread out (the average squared deviation from the mean).
Coefficient of Variation = the standard deviation relative to the mean (unitized risk), expressed as a ratio or percentage.
Do you see any patterns?
Name five measures of variability
Variability
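$s_x = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
$s_x^2 = \dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
$SS = \sum (x - \bar{x})^2$
$\text{Range} = X_{\text{highscore}} - X_{\text{lowestscore}}$
$cv = \dfrac{\sigma}{\mu}$
$cv = \dfrac{\sigma}{\mu} \times 100$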
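The formulas above can be checked step by step against R's built-in functions. This is a sketch with made-up scores; note that it uses the sample versions (s and x̄) in place of σ and µ for the coefficient of variation, which is an assumption about how the slide's formula is applied to sample data.

```r
x <- c(2, 4, 4, 5, 7, 9, 11)     # made-up scores
n <- length(x)
x_bar <- mean(x)

deviations <- x - x_bar
sum(deviations)                  # deviations from the mean add up to zero

SS  <- sum(deviations^2)         # sum of squares
s2  <- SS / (n - 1)              # sample variance
s   <- sqrt(s2)                  # sample standard deviation
rng <- max(x) - min(x)           # range = highest score - lowest score
cv_pct <- s / x_bar * 100        # coefficient of variation, in percent

# Compare the hand calculations with the built-in functions
all.equal(s2, var(x))
all.equal(s, sd(x))

# Quartiles (25th, 50th, 75th percentiles)
quantile(x, c(0.25, 0.50, 0.75))
```

Squaring the deviations before adding them is what keeps the positive and negative deviations from cancelling out.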
Tip: Remember help() and Google/Google Scholar. The > symbol is the command prompt.
Examples in R using the Cloud or Desktop
•  How can we find the descriptive statistics
in R?
•  Look at R scripts for: 1) mean, 2) median,
3) mode, 4) range, 5) percentiles, 6)
variance, 7) standard deviation, and 8)
coefficient of variation
•  Class Exercise: 1_descriptives_general.R
Descriptives & R Continued
# Descriptive Statistics: General Entry Level Exercise
2 + 2; f <- 2 + 2; f                 # R is a calculator, and you can store an answer
a <- 4; a; c <- a * 2; c             # Solve an equation and store variables
fun1 <- (a + c + 2) / 2; fun1        # Result is 7: (4 + 8 + 2) = 14 and 14 / 2 = 7
# Step 1: generate a few random numbers to look at
rnorm(10, mean = 1.2, sd = 3.4)      # Prints one random sample without storing it
s <- rnorm(10, mean = 1.2, sd = 3.4); s   # Stores a new sample in s
me <- mean(s); me                    # Mean
med <- median(s); med                # Median
table(s); names(sort(-table(s)))[1]  # Mode. Not interesting because we don't have any duplicates
ra <- range(s); ra                   # Range: low and high scores
mran <- max(s); mran; miran <- min(s); miran
ran <- mran - miran; ran             # Range computed from the extremes of the data (high minus low)
qua <- quantile(s); qua              # Quartiles
per <- quantile(s, c(.32, .57, .98)); per   # Percentiles
varia <- var(s); varia               # Variance
stds <- sd(s); stds                  # Standard deviation
stdsc <- sqrt(varia); stdsc          # Double check the std. deviation :)
coefs <- (stds / me) * 100; coefs    # Coefficient of variation in percent (sanity check)
library(raster); cv(s, na.rm = TRUE) # Double check with a package
Descriptive Statistics & R
An Example Script.
Can we normalize scores? Of course we can.
Descriptive Statistics and R
Normal…we want normal for now :) Work through the tutorial :)
Yes!
1) Samples from a normal
distribution: the distribution of
sample means is normal.
2) The mean of the distribution
of sample means = the mean
of the "parent population," the
population from which the
samples are drawn.
3) The larger the sample size
that is drawn, the "narrower"
the spread of the
distribution of sample means will be.
Can we do this in R?
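One way to try this in R is a small simulation. The sketch below is an illustration, not the course script: the seed, the number of repetitions, and the choice of an exponential parent population (deliberately not normal, with mean 1) are all assumptions, and `sample_means` is a hypothetical helper name.

```r
set.seed(651)   # arbitrary seed so the run is repeatable

# Draw `reps` samples of size n from an exponential population (mean = 1)
# and keep the mean of each sample.
sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

means_n5  <- sample_means(5)
means_n50 <- sample_means(50)

# Point 2: the mean of the sample means is close to the parent mean of 1
mean(means_n5)
mean(means_n50)

# Point 3: a larger sample size gives a narrower spread of sample means
sd(means_n5)
sd(means_n50)

# The histogram for n = 50 looks roughly bell-shaped
hist(means_n50, main = "Sample means, n = 50", xlab = "Sample mean")
```

Swapping `rexp()` for `rnorm()` reproduces point 1, where the parent population is itself normal.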
Central Limit Theorem
This is the point where you
should see the pattern
between the previous slides
and the new concept of the
CLT. If you feel like it
doesn’t make sense, that is
normal (Yes, I like that word
a lot!)…. However, you can
use this to work through
statistical problems….and
solve complex problems
with relative ease.
Parts of the puzzle have
been given to you.
Sanity Check!
The distribution of the average of many, many
trials is approximately normal….even if the
distribution of each trial is
not normal!
http://www.learner.org/
courses/mathilluminated/
units/7/textbook/06.php
Why is this important?
You can work with the law
of large numbers…
Approximations.
The population itself might not be normal. However, by the
Central Limit Theorem, the distribution of sample means will be
approximately normal, and that means that about 68.26% will fall
within ± 1 s.d. of the mean.
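The share of a normal distribution within one standard deviation of the mean can be checked directly in R. This is a small sketch; the seed and the simulated sample size are arbitrary choices.

```r
# Exact area under the standard normal curve between -1 and +1 s.d.
within_1sd <- pnorm(1) - pnorm(-1)
within_1sd

# The same idea by simulation: proportion of simulated normal scores
# that land within 1 s.d. of the mean
set.seed(1)                       # arbitrary seed
z <- rnorm(100000)
prop_sim <- mean(abs(z) <= 1)
prop_sim
```

Both numbers come out at roughly 68%, matching the figure quoted above.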
Standard scores (i.e. z-scores) represent a score's
distance from the mean in standard deviation units. When the
mean and standard deviation are known….any score can have a
z-score attached to it. Think IQ tests, SAT, GRE,
MCAT…etc.
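A z-score is computed as (score - mean) / standard deviation. The sketch below uses made-up numbers: the mean of 100 and standard deviation of 15 are assumed values for an IQ-style scale, and the vector `x` is hypothetical sample data.

```r
# Known mean and standard deviation (assumed values for illustration)
mu    <- 100
sigma <- 15
score <- 130
z <- (score - mu) / sigma        # how many s.d. the score is above the mean
z

# Standardizing a whole sample using its own mean and s.d.
x   <- c(85, 100, 115, 130, 70)  # made-up scores
z_x <- (x - mean(x)) / sd(x)
z_x

# scale() in base R does the same centering and scaling
z_scaled <- as.vector(scale(x))
all.equal(z_x, z_scaled)
```

After standardizing, the scores have a mean of 0 and a standard deviation of 1, which is what makes scores from different tests comparable.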
To solve any mystery…you are looking for
patterns. Statistics is a tool that gives you the
ability to look for a pattern in a sample that
represents a portion of the population.
Dispersions of the scores
Measures of Variability & R
Patterns.
Can we
Standardize?
Is the Population
Normal???
Looking at Real Data that is “Clean”
Measures of Variability & R
Nothing is impossible with R. If you can’t figure it out…think logically. What is it that you want to do? Then revisit
the steps that you have taken so far.
• Clean Data
Has no missing values
- Homework Examples
✓ Extremely rare in the real world….
• Real Data
Has missing values
- Can be handled by R
✓ Big steps involved in cleaning the data set….
Statistics…your guide to the universe
Every Number tells a story
Clean Data
Collect Data
Design a Study.
Analyze Data
Present Data
Conclusions :)
Wrap Up
These have to be loaded before you use them. If you have to use them….make sure you comment on
them in the code so that you can use them again later.
Library or Packages
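Loading a package, with a comment explaining why it is needed, might look like the sketch below. It assumes the `raster` package from the example script may or may not be installed, so it checks first and falls back to base R; the data vector is made up.

```r
x <- c(2, 4, 4, 5, 7, 9, 11)        # made-up scores

# Base R version of the coefficient of variation (in percent)
cv_base <- sd(x) / mean(x) * 100

# raster: used here only for its cv() function, as a double check.
# requireNamespace() tests whether the package is installed without loading it.
has_raster <- requireNamespace("raster", quietly = TRUE)

if (has_raster) {
  library(raster)                   # must be loaded before cv() can be called
  print(all.equal(cv(x), cv_base))
} else {
  message("raster is not installed; run install.packages('raster') to add it")
}
```

The comment next to `library(raster)` is the kind of note that makes it clear later why the package was loaded.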
R Packages have templates ready for you to learn with…some packages are better than others.
Templates
Present the descriptive information for the data set. This is key when you are presenting inferential
statistics in a journal, report, homework, or even understanding the data you are reading about.
Always…Always…Always
Have productivity anywhere you have internet.
Taking it to the Cloud
When you are in a clinical setting…time is of the
essence. You are dealing with patients and the
information you present must be accurate.
Population
Sample
Information
Patient
Information
THANK YOU!
That wasn’t so hard :) Remember you are now
programming! Who needs Excel or SPSS now?
Free is sometimes better in more ways than one!

Ch17 lab r_verdu103: Entry level statistics exercise (descriptives)

  • 1. Boot Camp R Fall 2013 PT651 Lab Component Slides by Sherri Verdugo
  • 2. •  Brief definition of statistics: The subject of statistics deals with techniques for collecting, analyzing, and drawing conclusions from data. In science, or any subject, even psychology, you must know how to read the literature or how to interpret data. Statistics is your key, to knowing when you are given false data--or, for example, if that new medication really works. Definition is from Statistical Methods, Eighth Edition by George W. Snedecor and William G. Cochran. Statistics Brief description.
  • 3. R Boot Camp and Descriptive Information Agenda Introduction to the Lab Component Wrap Up R: The basics and Rstudio; The basics Descriptive Statistics & R Revisiting the Blog and Installation Parameters, Statistics oh my….the magic behind the curtain Measures of Variability & R ✓ 1 2 3 4 5 6 7
  • 4. Installing  on  your  desktop  means  that  the     environment  is  local  on  your  machine!   Installing  on  the  Cloud  means  you  are   portable  and  accessible!   • The Blog Installation tips - http://g2research.blogspot.com/2013_08_01_archive.html ü PC, Mac, Unix, etc. • Desktop or Cloud R & R Studio -R on the Desktop ü R on the Cloud The Blog and Installation of R Revisiting the Past
  • 5. Desktop versus the cloud R R is available locally on your computer and depends on the size of your computer! R is available on the internet! Fast! Rstudio and R are located on the cloud. Cloud R is located on local machine. Drawbacks: only in one environment. Desktop 21
  • 6. A sample represents a population. Dependent Variables are cheaper and easier to change J The magic behind the curtain. Parameters, Statistics, oh my! 1 Population characteristics. Parameters Measures of sample data. Statistics 3 Rank order of data values for a variable. f = Frequency Distribution Caveat: Class: grouping of scores w/ unique range Independent Variables: costly or nearly impossible to change. I.E. Dependent Variables: outcome variables. Mutually Exclusive: no overlap in classes Mutually Exhaustive: each score fits into a class 2 Descriptive Statistics: ways to represent data using characterization of central tendency, shape and variability in the data.
  • 7. Three Measures of Central Tendency Measures of Central Tendency: central value or a typical value for a probability distribution M3 Which score changes the most when you change the scores in the data? Frequencies tell a story as well! 3 The most frequent score of the sequence. Mode The average of all scores. Is the sample mean and µ is the population mean. Mean 1 The middle number in a sequence. Place the numbers in order. For N=odd number then this is the exact mid point. For N=even number add the two middle numbers together and divide by two. Median 2 X X = x1 + x2 +...+ xn n µx = ∑ xP(x)
  • 8. Variability = dispersion of scores. Adding the deviations results in zero? That’s not what I need. NEW CONCEPT: Sum of Squares (SS). Percentiles = the grouping of the scores totaling 100%. What percentile is a score in the data set? Quartiles: Q1=25%, Q2=50%, Q3=75%, Q4=100% Standard Deviation = how much variation from the mean or what we expect the value to be. Range = difference between highest and lowest scores. Variance = how far a set of numbers are “dispersed” or spread out. Coefficient of Variation = unitized risk expressed as a ratio. Do you see any patterns? Name five measures of variability Variability 1 2 3 4 5 6 7 sx = (xi − x)2 i=1 n ∑ n −1 sx 2 = (xi − x)2 i=1 n ∑ n −1 cv = σ µ SS = (x − x)2 ∑ Range = Xhighscore − Xlowestscore cv = σ µ *100
  • 9. Descriptives & R Continued: Examples in R using the Cloud or Desktop. Tip: Remember help() and Google/Google Scholar. In the console, the > symbol is the command prompt. •  How can we find the descriptive statistics in R? •  Look at R scripts for: 1) mean, 2) median, 3) mode, 4) range, 5) percentiles, 6) variance, 7) standard deviation, and 8) coefficient of variation. •  Class Exercise: 1_descriptives_general.R
  • 10. Descriptive Statistics & R: An Example Script.
      # Descriptive Statistics: General Entry Level Exercise
      2 + 2; f <- 2 + 2; f                    # R is a calculator and you can store an answer
      a <- 4; a; c <- a * 2; c                # solve an equation and store variables
      fun1 <- ((a + c + 2) / 2); fun1         # result is 7: (4 + 8 + 2) = 14 and 14 / 2 = 7
      # Step 1: generate a few random numbers to look at
      rnorm(10, mean = 1.2, sd = 3.4)         # prints one random draw
      s <- rnorm(10, mean = 1.2, sd = 3.4); s # stores a second, different draw
      me <- mean(s); me                       # mean
      med <- median(s); med                   # median
      table(s); names(sort(-table(s)))[1]     # mode; not interesting because we don't have any duplicates
      ra <- range(s); ra                      # range: low and high scores
      mran <- max(s); mran; miran <- min(s); miran
      ran <- mran - miran; ran                # range computed from the extremes of the data (hi and lo)
      qua <- quantile(s); qua                 # quantiles
      per <- quantile(s, c(.32, .57, .98)); per  # percentiles
      varia <- var(s); varia                  # variance
      stds <- sd(s); stds                     # standard deviation
      stdsc <- sqrt(varia); stdsc             # double check the std. deviation :)
      coefs <- ((stds / me) * 100); coefs     # coefficient of variation, as a sanity check
      library(raster); cv(s, na.rm = TRUE)    # double check with a package
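The script's mode step finds nothing interesting because ten continuous random draws almost never repeat. One way to see a mode, sketched here as an assumption rather than as part of the class exercise, is to fix the random seed so reruns give the same numbers and then round the draws to whole numbers. The seed value and the name `s_round` are arbitrary.

```r
set.seed(651)                          # arbitrary seed so every rerun gives the same draws
s <- rnorm(10, mean = 1.2, sd = 3.4)   # same call as in the script

s_round <- round(s)                    # whole numbers are more likely to repeat
counts  <- table(s_round)              # frequency of each rounded score
counts

# Most frequent rounded score(s); ties are all returned, and if nothing
# repeats then every value is returned
modes <- as.numeric(names(counts)[counts == max(counts)])
modes
```

Without `set.seed()`, each run of the script produces a different sample, so the mean, median, and the other results will not match a classmate's output.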
  • 11. Descriptive Statistics and R: Normal…we want normal for now. Can we normalize scores? Of course we can. Can we do this in R? Yes! Work through the tutorial. Central Limit Theorem: 1) If samples are drawn from a normal distribution, the distribution of sample means is normal. 2) The mean of the distribution of sample means equals the mean of the "parent population," the population from which the samples are drawn. 3) The larger the sample size that is drawn, the "narrower" the spread of the distribution of sample means will be. Sanity check: the distribution of the means of many, many trials is approximately normal, even if the distribution of each trial is not normal! See http://www.learner.org/courses/mathilluminated/units/7/textbook/06.php. Why is this important? You can work with the law of large numbers and with approximations. This is the point where you should see the pattern between the previous slides and the new concept of the CLT. If you feel like it doesn’t make sense, that is normal (Yes, I like that word a lot!). However, you can use this to work through statistical problems and solve complex problems with relative ease. Parts of the puzzle have been given to you.
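The three CLT statements can be checked with a small simulation. This is a sketch under stated assumptions: the parent population is taken to be exponential with rate 1 (mean 1, standard deviation 1, clearly not normal), and the seed, the helper name `sample_means`, and the number of repetitions are arbitrary choices.

```r
set.seed(2013)  # arbitrary seed so reruns match

# Hypothetical helper: draw `reps` samples of size n from a skewed parent
# population and return the mean of each sample
sample_means <- function(n, reps = 5000) {
  replicate(reps, mean(rexp(n, rate = 1)))
}

means_5  <- sample_means(5)
means_50 <- sample_means(50)

# 2) The mean of the sample means sits close to the parent mean of 1
mean(means_5)
mean(means_50)

# 3) Larger samples give a narrower spread, roughly 1 / sqrt(n)
sd(means_5)    # near 1 / sqrt(5)
sd(means_50)   # near 1 / sqrt(50)

# In RStudio, uncomment to see the bell shape appear:
# hist(means_50, main = "Means of samples of size 50", xlab = "Sample mean")
```

Comparing a histogram of `means_5` with one of `means_50` shows the skew of the parent population fading as the sample size grows.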
  • 12. Measures of Variability & R: Dispersions of the scores. Patterns: to solve any mystery, you are looking for patterns. Statistics is a tool that gives you the ability to look for a pattern in a sample that represents a portion of the population. Is the population normal? It might not be normal in the population. However, it might be approximately normal using the Central Limit Theorem, and that means that about 68.26% of values will fall within ±1 s.d. of the mean. Can we standardize? Standard scores (i.e., z-scores) represent a score's distance from the mean in standard deviation units. When the mean and standard deviation are known, any score can have a z-score attached to it. Think IQ tests, SAT, GRE, MCAT, etc.
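Standardizing scores in R takes one line. The sketch below uses invented scores, compares the hand calculation with the built-in `scale()`, and recovers the share of a normal distribution within one standard deviation using `pnorm()`. The IQ-style mean of 100 and standard deviation of 15 are the conventional scaling, assumed here for illustration.

```r
# Hypothetical scores, invented for illustration
x <- c(4, 8, 6, 5, 7)

# z-score: distance from the mean in standard deviation units
z <- (x - mean(x)) / sd(x)
z

# scale() does the same centering and scaling
z_scale <- as.vector(scale(x))
all.equal(z, z_scale)    # TRUE

mean(z)                  # approximately 0
sd(z)                    # 1

# Share of a normal distribution within 1 s.d. of the mean
pnorm(1) - pnorm(-1)     # about 0.6827

# Assumed IQ-style scale: mean 100, s.d. 15
(130 - 100) / 15         # a score of 130 is 2 s.d. above the mean
```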
  • 13. Measures of Variability & R: Looking at Real Data that is “Clean”. Clean data: has no missing values; homework examples; extremely rare in the real world. Real data: has missing values; can be handled by R; big steps are involved in cleaning the data set. Nothing is impossible with R. If you can’t figure it out, think logically: what is it that you want to do? Then revisit the steps that you have taken so far.
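How R handles missing values can be sketched with a few base functions. The vector and data frame below are invented; real cleaning usually involves more steps than dropping incomplete rows.

```r
# Hypothetical "real" data with missing values (NA)
x <- c(12, NA, 15, 9, NA, 11)

mean(x)                  # NA: one missing value makes the result missing
sum(is.na(x))            # 2 missing values
mean(x, na.rm = TRUE)    # (12 + 15 + 9 + 11) / 4 = 11.75

# Keep only the observed values
x_clean <- x[!is.na(x)]
x_clean

# Hypothetical data frame: find and drop incomplete rows
df <- data.frame(id    = 1:4,
                 score = c(10, NA, 7, 8),
                 age   = c(30, 41, NA, 25))
complete.cases(df)       # TRUE FALSE FALSE TRUE
df_clean <- na.omit(df)  # keeps rows 1 and 4
df_clean
```

Dropping rows is the simplest option, but it shrinks the sample, so report how many cases were removed alongside the descriptive statistics.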
  • 14. Statistics…your guide to the universe. Every number tells a story. Design a Study, Collect Data, Clean Data, Analyze Data, Present Data, Conclusions.
  • 15. Wrap Up. (1) Templates: R packages have templates ready for you to learn with; some packages are better than others. (2) Always…Always…Always present the descriptive information for the data set. This is key when you are presenting inferential statistics in a journal, report, or homework, or even when you are trying to understand the data you are reading about. (3) Libraries or packages: these have to be loaded before you use them. If you have to use them, make sure you comment on them in the code so that you can use them again later.
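Both wrap-up habits, presenting descriptives first and commenting every package you load, can be sketched briefly. The example uses `mtcars`, a data set that ships with R, purely as a stand-in for your own data; the `raster` check assumes the package may or may not be installed on your machine.

```r
# Always present the descriptive information first.
# mtcars is a built-in data set, used here only as a stand-in.
summary(mtcars$mpg)      # min, quartiles, median, mean, max
sd(mtcars$mpg)           # standard deviation
nrow(mtcars)             # number of observations: 32

# Packages must be loaded before use. Comment on why each one is needed.
if (requireNamespace("raster", quietly = TRUE)) {
  library(raster)        # needed only for cv(), the coefficient of variation
  print(cv(mtcars$mpg))
} else {
  # Fallback without the package: same statistic in base R
  print(sd(mtcars$mpg) / mean(mtcars$mpg) * 100)
}
```

If `raster` is missing, `install.packages("raster")` adds it once; `library(raster)` is still needed in every new session.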
  • 16. Taking it to the Cloud: be productive anywhere you have internet access. When you are in a clinical setting, time is of the essence. You are dealing with patients, and the information you present must be accurate. Population Sample Information Patient Information
  • 17. THANK YOU! That wasn’t so hard. Remember, you are now programming! Who needs Excel or SPSS now? Free is sometimes better in more ways than one!