Master the Art of Analytics
A Simplistic Explainer Series For Citizen Data Scientists
Journey Towards Augmented Analytics
Independent Sample T-test
Basic Terminologies
• Sample data is a subset of the population data that is used to represent the entire group as a whole
• For instance, to find the average value of all cars in the United States, it is impractical to assess the value of every car, add these numbers and divide by the total number of cars
• Instead, we can randomly select some of the cars, say 200, get the value of each of them and find the average of these 200 numbers
• The 200 values of these randomly selected cars are called sample data, drawn from the values of all cars in the United States (population data)
• There are various sampling techniques, such as simple random sampling, stratified sampling and systematic sampling, which are explained in the annexure section
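The car example above can be sketched in a few lines of Python. This is only an illustration: the population of car values is made up (drawn uniformly between 2,000 and 60,000), and the helper name `estimate_mean` is hypothetical, not something from the slides.

```python
import random
import statistics


def estimate_mean(population, sample_size, rng):
    """Draw a simple random sample and return it with its mean."""
    sample = rng.sample(population, sample_size)  # sampling without replacement
    return sample, statistics.fmean(sample)


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100,000 made-up car values (not real data)
population = [rng.uniform(2_000, 60_000) for _ in range(100_000)]

sample, sample_mean = estimate_mean(population, 200, rng)
population_mean = statistics.fmean(population)

print(f"population mean:     {population_mean:,.0f}")
print(f"sample mean (n=200): {sample_mean:,.0f}")
```

The sample mean will not equal the population mean exactly, but with a random sample of 200 it is usually close, which is the point of working with a sample.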
• The null hypothesis in an independent sample t-test is the statement that there is no statistically significant difference between the means of the two groups being compared
• The alternative hypothesis in an independent sample t-test states that there is a statistically significant difference between the means of the two groups
• For instance, an online store marketing manager decides to test the hypothesis that females have a significantly higher tendency to shop online than males
• In this case the null and alternative hypotheses would be as follows:
• Null hypothesis : There is no statistically significant difference between males and females in terms of tendency to shop online
• Alternative hypothesis : There is a statistically significant difference between males and females in terms of tendency to shop online
• P-value : In an independent sample t-test, the p-value is the probability of observing a difference between the two sample means at least as large as the one found, if the null hypothesis were true. A small p-value therefore indicates a statistically significant difference between the two samples
• Depending on the confidence level desired, the p-value can be checked against different thresholds and the inference made accordingly
• For instance, for a confidence level of 95% (5% chance of error), the p-value is checked against the threshold of 0.05
• If p-value < 0.05, the difference is significant; otherwise it is insignificant
• Similarly, for a confidence level of 98% (2% chance of error), the p-value is checked against the threshold of 0.02
• If p-value < 0.02, the difference is significant; otherwise it is insignificant, and so on
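The threshold rule above can be written as a small helper. This is a minimal sketch; the function name `is_significant` and its parameters are illustrative assumptions rather than part of any tool described in the slides.

```python
def is_significant(p_value: float, confidence_level: float = 0.95) -> bool:
    """Return True when the p-value is below the threshold implied by the confidence level."""
    # 95% confidence -> threshold 0.05, 98% confidence -> threshold 0.02
    threshold = round(1.0 - confidence_level, 10)  # round away floating-point noise
    return p_value < threshold


for level in (0.95, 0.98):
    verdict = "significant" if is_significant(0.041, level) else "not significant"
    print(f"p = 0.041 at {level:.0%} confidence: {verdict}")
```

The same p-value can lead to different conclusions at different confidence levels, so the level should be chosen before looking at the result.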
Introduction
• The independent sample t-test is a statistical test that determines whether there is a statistically significant difference between the means of two independent samples
• For instance, checking whether the average value of sedan cars is significantly different from that of SUVs
• Here the hypotheses would be set as follows :
• Null hypothesis : There is no significant difference between the average values of SUVs and sedans
• Alternative hypothesis : The average values of SUVs and sedans differ significantly
Example : Input
Let’s conduct the independent sample t-test on the following two variables: a dimension containing two values (the two independent groups) and a measure (the dependent variable):
Group Value
A 90
A 95
A 80
B 78
B 75
B 70
B 65
Example : Output
Group “A” Mean Value 88.33
Group “B” Mean Value 72.0
Mean Difference 16.33
P-value 0.041
• At 95% confidence level (5% chance of error) :
• As the p-value of 0.041 is less than 0.05, there is a statistically significant difference between the means of groups A and B
• The mean of Group A is significantly higher than that of Group B
• At 98% confidence level (2% chance of error) :
• As the p-value of 0.041 is greater than 0.02, there is no statistically significant difference between the means of groups A and B
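For readers who want to see the arithmetic, here is a minimal pure-Python sketch of the test on the seven input rows. It assumes the unequal-variance (Welch) form of the t-test, which is an assumption because the slides do not say which variant is used, and it obtains the p-value by numerically integrating the t density. In practice a statistics library (for example SciPy's `scipy.stats.ttest_ind`) would normally be used instead; the function names below are illustrative.

```python
import math
import statistics


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: 1 - 2 * (area under the density from 0 to |t|), via Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * total * h / 3.0


def welch_t_test(a, b):
    """Return (mean_a, mean_b, t, df, p) for two independent samples (Welch's form)."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    va = statistics.variance(a) / len(a)  # sample variance divided by group size
    vb = statistics.variance(b) / len(b)
    t = (mean_a - mean_b) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return mean_a, mean_b, t, df, two_sided_p(t, df)


group_a = [90.0, 95.0, 80.0]
group_b = [78.0, 75.0, 70.0, 65.0]

mean_a, mean_b, t, df, p = welch_t_test(group_a, group_b)
print(f"mean A = {mean_a:.2f}, mean B = {mean_b:.2f}, difference = {mean_a - mean_b:.2f}")
print(f"t = {t:.3f}, df = {df:.2f}, p-value = {p:.3f}")
```

The resulting p-value is then compared with 0.05 or 0.02, exactly as in the interpretation above.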
Standard input parameters & sample UI
Sample output 1 : Interpretation
Sample output 2 : Model Summary
Sample output 3 : Outliers
Outliers : Data values that differ greatly from the majority of the values in a data set.
Limitations
• Can be applied to only two samples (one dimension with two values and one measure at a time)
• Observations within each group must be independent
• The values in each group should be approximately normally distributed
• As a rule of thumb, the number of data points should be at least 30
General applications
• Medicine
• Has the quality of life improved for patients who took drug A as opposed to patients
who took drug B?
• Sociology
• Are men more satisfied with their jobs than women? Do they earn more?
• Biology
• Are foxes in one specific habitat larger than in another?
• Economics
• Is the economic growth of developing nations larger than the economic growth of
the first world?
• Marketing
• Does customer segment A spend more on groceries than customer segment B?
Use case 1
Business problem :
• An HR manager wants to find out whether male employees earn more than female employees.
• Here the dependent variable would be ‘Total Annual Income’.
Business benefit:
• Once the test is completed, a p-value is generated which indicates whether there is a statistically significant difference between the incomes of the two groups.
• Based on this value, a manager can easily conclude whether the average income earned by female employees is statistically different from that of male employees and, if the difference is statistically significant, which gender earns more.
Use case 1 : Input Dataset
Gender Income
Male 21000
Male 15000
Male 25600
Male 23000
Female 19750
Female 25000
Female 21250
Female 14400
Female 10000
Use case 1 : Output
“Male” Mean Income Value 21150.0
“Female” Mean Income Value 18080.0
Mean Difference 3070.0
P-value 0.406
P-value : 0.406 (> 0.05) indicates that there is no statistically significant difference between the mean incomes of males and females.
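The group means and the mean difference come from splitting the measure (Income) by the two values of the dimension (Gender). A minimal sketch using the nine input rows, with a hypothetical helper name `group_means`:

```python
from collections import defaultdict
from statistics import fmean

# (dimension, measure) rows from the Use case 1 input dataset
rows = [
    ("Male", 21000), ("Male", 15000), ("Male", 25600), ("Male", 23000),
    ("Female", 19750), ("Female", 25000), ("Female", 21250),
    ("Female", 14400), ("Female", 10000),
]


def group_means(rows):
    """Collect the measure values per group label and average each group separately."""
    groups = defaultdict(list)
    for label, value in rows:
        groups[label].append(value)
    return {label: fmean(values) for label, values in groups.items()}


means = group_means(rows)
difference = means["Male"] - means["Female"]

print(means)       # per-group mean income
print(difference)  # Male mean minus Female mean
```

Each mean is computed over its own group only; averaging all nine rows together would give the overall mean, which is not what the test compares.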
Use case 2
Business problem :
• A grocery store sales manager wants to know whether customer segment A spends more on groceries than customer segment B.
• Here the dependent variable would be ‘Purchase Amount’.
Business benefit:
• Once the test is completed, a p-value is generated which indicates whether there is a statistically significant difference between the purchase amounts of the two segments.
• Based on this value, the grocery store manager can decide on marketing strategies for better sales and increased revenue.
Use case 3
Business problem :
• Suppose a medical researcher decides to investigate whether exercise or diet control is more effective in lowering cholesterol levels. There are two groups : a calorie-controlled diet group and an exercise-training group.
• Here the dependent variable would be ‘Cholesterol concentration’.
Business benefit:
• Once the test is completed, a p-value is generated which indicates whether there is a statistically significant difference between the cholesterol concentrations of the two groups.
• Based on this value, the researcher can conclude whether exercise was more effective than diet control in lowering cholesterol levels and suggest better treatment to patients.
Sampling Methods
• There are three main types of sampling :
• Simple random sampling:
• Here, the selection is based purely on chance and every item has an equal chance of being selected
• A lottery system is an example of simple random sampling
• Stratified sampling:
• Here, the population data is divided into subgroups known as strata
• The members of each subgroup have similar attributes and characteristics in terms of demographics, income, location etc.
• A random sample is taken from each of these subgroups in proportion to the subgroup size relative to the population size
• These subsets of the subgroups are then combined to form the final stratified random sample
• Higher statistical precision is achieved through this method because of the low variability within each subgroup; a smaller sample size is also typically required than with simple random sampling
• Government policymakers generally use stratified random sampling to come up with better targeted solutions
• Systematic sampling:
• Here, the researcher first decides the sample size and then the sampling interval, i.e. the fixed distance between sampled elements
• Divide the total population size by the sample size to obtain this interval
• For instance, say you want to create a systematic sample of 1,000 people from a population of 10,000; the interval is 10,000 / 1,000 = 10
• Using a list of the total population, number each person from 1 to 10,000
• Then randomly choose a starting number between 1 and 10, say 4. The person numbered 4 is your first selection, and every tenth person from then on is included in your sample
• Your sample is then composed of persons numbered 4, 14, 24, 34, 44, 54, and so on down the line until you reach the person numbered 9,994
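The stratified and systematic procedures described above can be sketched as follows. This is a hedged illustration only: the function names, the strata ("north", "south", "west") and their sizes are made up for the example.

```python
import random


def stratified_sample(strata, sample_size, rng):
    """Sample each stratum at random, in proportion to its share of the population."""
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        # Rounding can make the final size differ slightly from sample_size
        k = round(sample_size * len(members) / total)
        sample.extend(rng.sample(members, k))
    return sample


def systematic_sample(population, sample_size, start):
    """Take every interval-th element, beginning at the 1-based position `start`."""
    interval = len(population) // sample_size
    return population[start - 1::interval][:sample_size]


rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical strata of 600, 300 and 100 members
strata = {
    "north": [f"north-{i}" for i in range(600)],
    "south": [f"south-{i}" for i in range(300)],
    "west": [f"west-{i}" for i in range(100)],
}
strat = stratified_sample(strata, 50, rng)  # 30 + 15 + 5 members

# The example from the text: 1,000 people out of 10,000, starting at person 4
people = list(range(1, 10_001))
syst = systematic_sample(people, 1_000, start=4)

print(len(strat), syst[:6], syst[-1])
```

In real use the starting point of a systematic sample would itself be chosen at random between 1 and the interval, for example with `rng.randint(1, 10)`.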
Want to Learn More?
Get in touch with us @ support@Smarten.com
And do check out the Learning section on Smarten.com
June 2018

More Related Content

What's hot

Data Analysis with SPSS : One-way ANOVA
Data Analysis with SPSS : One-way ANOVAData Analysis with SPSS : One-way ANOVA
Data Analysis with SPSS : One-way ANOVADr Ali Yusob Md Zain
 
Lecture2 hypothesis testing
Lecture2 hypothesis testingLecture2 hypothesis testing
Lecture2 hypothesis testingo_devinyak
 
Analysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to knowAnalysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to knowStat Analytica
 
Lecture - ANCOVA 4 Slides.pdf
Lecture - ANCOVA 4 Slides.pdfLecture - ANCOVA 4 Slides.pdf
Lecture - ANCOVA 4 Slides.pdfmuhammad shahid
 
Introduction to Generalized Linear Models
Introduction to Generalized Linear ModelsIntroduction to Generalized Linear Models
Introduction to Generalized Linear Modelsrichardchandler
 
Spss lecture notes
Spss lecture notesSpss lecture notes
Spss lecture notesDavid mbwiga
 
Regression analysis.
Regression analysis.Regression analysis.
Regression analysis.sonia gupta
 
Data Analysis with SPSS PPT.pdf
Data Analysis with SPSS PPT.pdfData Analysis with SPSS PPT.pdf
Data Analysis with SPSS PPT.pdfThanavathi C
 
Logistic regression
Logistic regressionLogistic regression
Logistic regressionDrZahid Khan
 
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysisFarzad Javidanrad
 
Parametric Statistical tests
Parametric Statistical testsParametric Statistical tests
Parametric Statistical testsSundar B N
 
Introduction to Principle Component Analysis
Introduction to Principle Component AnalysisIntroduction to Principle Component Analysis
Introduction to Principle Component AnalysisSunjeet Jena
 
Research Methods: Basic Concepts and Methods
Research Methods: Basic Concepts and MethodsResearch Methods: Basic Concepts and Methods
Research Methods: Basic Concepts and MethodsAhmed-Refat Refat
 
Exploratory factor analysis
Exploratory factor analysisExploratory factor analysis
Exploratory factor analysisJames Neill
 

What's hot (20)

Z-test
Z-testZ-test
Z-test
 
Data Analysis with SPSS : One-way ANOVA
Data Analysis with SPSS : One-way ANOVAData Analysis with SPSS : One-way ANOVA
Data Analysis with SPSS : One-way ANOVA
 
Lecture2 hypothesis testing
Lecture2 hypothesis testingLecture2 hypothesis testing
Lecture2 hypothesis testing
 
Basic Statistics
Basic  StatisticsBasic  Statistics
Basic Statistics
 
Bayesian inference
Bayesian inferenceBayesian inference
Bayesian inference
 
Analysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to knowAnalysis of variance (ANOVA) everything you need to know
Analysis of variance (ANOVA) everything you need to know
 
Lecture - ANCOVA 4 Slides.pdf
Lecture - ANCOVA 4 Slides.pdfLecture - ANCOVA 4 Slides.pdf
Lecture - ANCOVA 4 Slides.pdf
 
Introduction to Generalized Linear Models
Introduction to Generalized Linear ModelsIntroduction to Generalized Linear Models
Introduction to Generalized Linear Models
 
Spss lecture notes
Spss lecture notesSpss lecture notes
Spss lecture notes
 
Analysis of variance anova
Analysis of variance anovaAnalysis of variance anova
Analysis of variance anova
 
Regression analysis.
Regression analysis.Regression analysis.
Regression analysis.
 
Data Analysis with SPSS PPT.pdf
Data Analysis with SPSS PPT.pdfData Analysis with SPSS PPT.pdf
Data Analysis with SPSS PPT.pdf
 
Logistic regression
Logistic regressionLogistic regression
Logistic regression
 
Introduction to correlation and regression analysis
Introduction to correlation and regression analysisIntroduction to correlation and regression analysis
Introduction to correlation and regression analysis
 
Parametric Statistical tests
Parametric Statistical testsParametric Statistical tests
Parametric Statistical tests
 
Introduction to Principle Component Analysis
Introduction to Principle Component AnalysisIntroduction to Principle Component Analysis
Introduction to Principle Component Analysis
 
Design of Experiments
Design of ExperimentsDesign of Experiments
Design of Experiments
 
Research Methods: Basic Concepts and Methods
Research Methods: Basic Concepts and MethodsResearch Methods: Basic Concepts and Methods
Research Methods: Basic Concepts and Methods
 
Exploratory factor analysis
Exploratory factor analysisExploratory factor analysis
Exploratory factor analysis
 
Point Estimation
Point EstimationPoint Estimation
Point Estimation
 

Similar to What is the Independent Samples T Test Method of Analysis and How Can it Benefit an Organization?

What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?Smarten Augmented Analytics
 
Bmgt 311 chapter_13
Bmgt 311 chapter_13Bmgt 311 chapter_13
Bmgt 311 chapter_13Chris Lovett
 
Sampling design
Sampling designSampling design
Sampling designBalaji P
 
Sample Size Calculations for Impact Evaluations
Sample Size Calculations for Impact EvaluationsSample Size Calculations for Impact Evaluations
Sample Size Calculations for Impact EvaluationsMarcos Vera
 
What is the Chi Square Test of Association and How Can it be Used for Analysis?
What is the Chi Square Test of Association and How Can it be Used for Analysis?What is the Chi Square Test of Association and How Can it be Used for Analysis?
What is the Chi Square Test of Association and How Can it be Used for Analysis?Smarten Augmented Analytics
 
QUANTITATIVE TECHNIQUES IN MANAGEMENT
QUANTITATIVE TECHNIQUES IN MANAGEMENTQUANTITATIVE TECHNIQUES IN MANAGEMENT
QUANTITATIVE TECHNIQUES IN MANAGEMENTamitymbaassignment
 
SAMPLE SIZE CALCULATION IN DIFFERENT STUDY DESIGNS AT.pptx
SAMPLE SIZE CALCULATION IN DIFFERENT STUDY DESIGNS AT.pptxSAMPLE SIZE CALCULATION IN DIFFERENT STUDY DESIGNS AT.pptx
SAMPLE SIZE CALCULATION IN DIFFERENT STUDY DESIGNS AT.pptxssuserd509321
 
Section 9 Chi Square and ANOVA Tests Rhonda Knehans Dr.docx
Section 9 Chi Square and ANOVA Tests Rhonda Knehans Dr.docxSection 9 Chi Square and ANOVA Tests Rhonda Knehans Dr.docx
Section 9 Chi Square and ANOVA Tests Rhonda Knehans Dr.docxkenjordan97598
 
Between Black and White Population1. Comparing annual percent .docx
Between Black and White Population1. Comparing annual percent .docxBetween Black and White Population1. Comparing annual percent .docx
Between Black and White Population1. Comparing annual percent .docxjasoninnes20
 
Need a nonplagiarised paper and a form completed by 1006015 before.docx
Need a nonplagiarised paper and a form completed by 1006015 before.docxNeed a nonplagiarised paper and a form completed by 1006015 before.docx
Need a nonplagiarised paper and a form completed by 1006015 before.docxlea6nklmattu
 
Inferential Statistics.pptx
Inferential Statistics.pptxInferential Statistics.pptx
Inferential Statistics.pptxjonatanjohn1
 
Qt business statistics-lesson1-2013
Qt business statistics-lesson1-2013Qt business statistics-lesson1-2013
Qt business statistics-lesson1-2013sonu kumar
 
Bmgt 311 chapter_13
Bmgt 311 chapter_13Bmgt 311 chapter_13
Bmgt 311 chapter_13Chris Lovett
 
Statistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptxStatistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptxrajalakshmi5921
 
Telesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10ststTelesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10ststNor Ihsan
 

Similar to What is the Independent Samples T Test Method of Analysis and How Can it Benefit an Organization? (20)

What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
What is the Paired Sample T Test and How is it Beneficial to Business Analysis?
 
Bmgt 311 chapter_13
Bmgt 311 chapter_13Bmgt 311 chapter_13
Bmgt 311 chapter_13
 
Sampling design
Sampling designSampling design
Sampling design
 
Sample Size Calculations for Impact Evaluations
Sample Size Calculations for Impact EvaluationsSample Size Calculations for Impact Evaluations
Sample Size Calculations for Impact Evaluations
 
What is the Chi Square Test of Association and How Can it be Used for Analysis?
What is the Chi Square Test of Association and How Can it be Used for Analysis?What is the Chi Square Test of Association and How Can it be Used for Analysis?
What is the Chi Square Test of Association and How Can it be Used for Analysis?
 
QUANTITATIVE TECHNIQUES IN MANAGEMENT
QUANTITATIVE TECHNIQUES IN MANAGEMENTQUANTITATIVE TECHNIQUES IN MANAGEMENT
QUANTITATIVE TECHNIQUES IN MANAGEMENT
 
SAMPLE SIZE CALCULATION IN DIFFERENT STUDY DESIGNS AT.pptx
SAMPLE SIZE CALCULATION IN DIFFERENT STUDY DESIGNS AT.pptxSAMPLE SIZE CALCULATION IN DIFFERENT STUDY DESIGNS AT.pptx
SAMPLE SIZE CALCULATION IN DIFFERENT STUDY DESIGNS AT.pptx
 
Section 9 Chi Square and ANOVA Tests Rhonda Knehans Dr.docx
Section 9 Chi Square and ANOVA Tests Rhonda Knehans Dr.docxSection 9 Chi Square and ANOVA Tests Rhonda Knehans Dr.docx
Section 9 Chi Square and ANOVA Tests Rhonda Knehans Dr.docx
 
Between Black and White Population1. Comparing annual percent .docx
Between Black and White Population1. Comparing annual percent .docxBetween Black and White Population1. Comparing annual percent .docx
Between Black and White Population1. Comparing annual percent .docx
 
Need a nonplagiarised paper and a form completed by 1006015 before.docx
Need a nonplagiarised paper and a form completed by 1006015 before.docxNeed a nonplagiarised paper and a form completed by 1006015 before.docx
Need a nonplagiarised paper and a form completed by 1006015 before.docx
 
Inferential Statistics.pptx
Inferential Statistics.pptxInferential Statistics.pptx
Inferential Statistics.pptx
 
Qt business statistics-lesson1-2013
Qt business statistics-lesson1-2013Qt business statistics-lesson1-2013
Qt business statistics-lesson1-2013
 
T test
T test T test
T test
 
Hypothsis testing
Hypothsis testingHypothsis testing
Hypothsis testing
 
Bmgt 311 chapter_13
Bmgt 311 chapter_13Bmgt 311 chapter_13
Bmgt 311 chapter_13
 
Variable inferential statistics
Variable inferential statisticsVariable inferential statistics
Variable inferential statistics
 
Statistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptxStatistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptx
 
Basic stat tools
Basic stat toolsBasic stat tools
Basic stat tools
 
ABTest-20231020.pptx
ABTest-20231020.pptxABTest-20231020.pptx
ABTest-20231020.pptx
 
Telesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10ststTelesidang 4 bab_8_9_10stst
Telesidang 4 bab_8_9_10stst
 

More from Smarten Augmented Analytics

Crime Type Prediction - Augmented Analytics Use Case – Smarten
Crime Type Prediction - Augmented Analytics Use Case – SmartenCrime Type Prediction - Augmented Analytics Use Case – Smarten
Crime Type Prediction - Augmented Analytics Use Case – SmartenSmarten Augmented Analytics
 
What Is Multilayer Perceptron Classifier And How Is It Used For Enterprise An...
What Is Multilayer Perceptron Classifier And How Is It Used For Enterprise An...What Is Multilayer Perceptron Classifier And How Is It Used For Enterprise An...
What Is Multilayer Perceptron Classifier And How Is It Used For Enterprise An...Smarten Augmented Analytics
 
What Is Generalized Linear Regression with Gaussian Distribution And How Can ...
What Is Generalized Linear Regression with Gaussian Distribution And How Can ...What Is Generalized Linear Regression with Gaussian Distribution And How Can ...
What Is Generalized Linear Regression with Gaussian Distribution And How Can ...Smarten Augmented Analytics
 
What Is Random Forest Classification And How Can It Help Your Business?
What Is Random Forest Classification And How Can It Help Your Business?What Is Random Forest Classification And How Can It Help Your Business?
What Is Random Forest Classification And How Can It Help Your Business?Smarten Augmented Analytics
 
What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?
What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?
What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?Smarten Augmented Analytics
 
Students' Academic Performance Predictive Analytics Use Case – Smarten
Students' Academic Performance Predictive Analytics Use Case – SmartenStudents' Academic Performance Predictive Analytics Use Case – Smarten
Students' Academic Performance Predictive Analytics Use Case – SmartenSmarten Augmented Analytics
 
Random Forest Regression Analysis Reveals Impact of Variables on Target Values
Random Forest Regression Analysis Reveals Impact of Variables on Target Values  Random Forest Regression Analysis Reveals Impact of Variables on Target Values
Random Forest Regression Analysis Reveals Impact of Variables on Target Values Smarten Augmented Analytics
 
Gradient Boosting Regression Analysis Reveals Dependent Variables and Interre...
Gradient Boosting Regression Analysis Reveals Dependent Variables and Interre...Gradient Boosting Regression Analysis Reveals Dependent Variables and Interre...
Gradient Boosting Regression Analysis Reveals Dependent Variables and Interre...Smarten Augmented Analytics
 
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...Smarten Augmented Analytics
 
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...Smarten Augmented Analytics
 
Fraud Mitigation Predictive Analytics Use Case – Smarten
Fraud Mitigation Predictive Analytics Use Case – SmartenFraud Mitigation Predictive Analytics Use Case – Smarten
Fraud Mitigation Predictive Analytics Use Case – SmartenSmarten Augmented Analytics
 
Quality Control Predictive Analytics Use Case - Smarten
Quality Control Predictive Analytics Use Case - SmartenQuality Control Predictive Analytics Use Case - Smarten
Quality Control Predictive Analytics Use Case - SmartenSmarten Augmented Analytics
 
Machine Maintenance Management Predictive Analytics Use Case - Smarten
Machine Maintenance Management Predictive Analytics Use Case - SmartenMachine Maintenance Management Predictive Analytics Use Case - Smarten
Machine Maintenance Management Predictive Analytics Use Case - SmartenSmarten Augmented Analytics
 
Predictive Analytics Using External Data Augmented Analytics Use Case - Smarten
Predictive Analytics Using External Data Augmented Analytics Use Case - SmartenPredictive Analytics Using External Data Augmented Analytics Use Case - Smarten
Predictive Analytics Using External Data Augmented Analytics Use Case - SmartenSmarten Augmented Analytics
 
Marketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - SmartenMarketing Optimization Augmented Analytics Use Cases - Smarten
Marketing Optimization Augmented Analytics Use Cases - SmartenSmarten Augmented Analytics
 
Human Resource Attrition Augmented Analytics Use Case - Smarten
Human Resource Attrition Augmented Analytics Use Case - SmartenHuman Resource Attrition Augmented Analytics Use Case - Smarten
Human Resource Attrition Augmented Analytics Use Case - SmartenSmarten Augmented Analytics
 
Customer Targeting Augmented Analytics Use Case - Smarten
Customer Targeting Augmented Analytics Use Case - SmartenCustomer Targeting Augmented Analytics Use Case - Smarten
Customer Targeting Augmented Analytics Use Case - SmartenSmarten Augmented Analytics
 
What is Naïve Bayes Classification and How is it Used for Enterprise Analysis?
What is Naïve Bayes Classification and How is it Used for Enterprise Analysis?What is Naïve Bayes Classification and How is it Used for Enterprise Analysis?
What is Naïve Bayes Classification and How is it Used for Enterprise Analysis?Smarten Augmented Analytics
 
What is KNN Classification and How Can This Analysis Help an Enterprise?
What is KNN Classification and How Can This Analysis Help an Enterprise?What is KNN Classification and How Can This Analysis Help an Enterprise?
What is KNN Classification and How Can This Analysis Help an Enterprise?Smarten Augmented Analytics
 
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...Smarten Augmented Analytics
 

More from Smarten Augmented Analytics (20)

Crime Type Prediction - Augmented Analytics Use Case – Smarten
Crime Type Prediction - Augmented Analytics Use Case – SmartenCrime Type Prediction - Augmented Analytics Use Case – Smarten
Crime Type Prediction - Augmented Analytics Use Case – Smarten
 
What Is Multilayer Perceptron Classifier And How Is It Used For Enterprise An...
What Is Multilayer Perceptron Classifier And How Is It Used For Enterprise An...What Is Multilayer Perceptron Classifier And How Is It Used For Enterprise An...
What Is Multilayer Perceptron Classifier And How Is It Used For Enterprise An...
 
What Is Generalized Linear Regression with Gaussian Distribution And How Can ...
What Is Generalized Linear Regression with Gaussian Distribution And How Can ...What Is Generalized Linear Regression with Gaussian Distribution And How Can ...
What Is Generalized Linear Regression with Gaussian Distribution And How Can ...
 
What Is Random Forest Classification And How Can It Help Your Business?
What Is Random Forest Classification And How Can It Help Your Business?What Is Random Forest Classification And How Can It Help Your Business?
What Is Random Forest Classification And How Can It Help Your Business?
 
What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?
What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?
What is Isotonic Regression and How Can a Business Utilize it to Analyze Data?
 
Students' Academic Performance Predictive Analytics Use Case – Smarten
Students' Academic Performance Predictive Analytics Use Case – SmartenStudents' Academic Performance Predictive Analytics Use Case – Smarten
Students' Academic Performance Predictive Analytics Use Case – Smarten
 
Random Forest Regression Analysis Reveals Impact of Variables on Target Values
Random Forest Regression Analysis Reveals Impact of Variables on Target Values  Random Forest Regression Analysis Reveals Impact of Variables on Target Values
Random Forest Regression Analysis Reveals Impact of Variables on Target Values
 
Gradient Boosting Regression Analysis Reveals Dependent Variables and Interre...
Gradient Boosting Regression Analysis Reveals Dependent Variables and Interre...Gradient Boosting Regression Analysis Reveals Dependent Variables and Interre...
Gradient Boosting Regression Analysis Reveals Dependent Variables and Interre...
 
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...What is Simple Linear Regression and How Can an Enterprise Use this Technique...
What is Simple Linear Regression and How Can an Enterprise Use this Technique...
 
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
What is Multiple Linear Regression and How Can it be Helpful for Business Ana...
 
Fraud Mitigation Predictive Analytics Use Case – Smarten
Fraud Mitigation Predictive Analytics Use Case – SmartenFraud Mitigation Predictive Analytics Use Case – Smarten
Fraud Mitigation Predictive Analytics Use Case – Smarten
 
Quality Control Predictive Analytics Use Case - Smarten
Quality Control Predictive Analytics Use Case - SmartenQuality Control Predictive Analytics Use Case - Smarten
Quality Control Predictive Analytics Use Case - Smarten
 
Machine Maintenance Management Predictive Analytics Use Case - Smarten

What is the Independent Samples T Test Method of Analysis and How Can it Benefit an Organization?

  • 1. Master the Art of Analytics: A Simplistic Explainer Series For Citizen Data Scientists. Journey Towards Augmented Analytics
  • 3. Basic Terminologies
      • Sample data is a subset of the population data that is used to represent the entire group as a whole
      • For instance, to find the average value of all cars in the United States, it is impractical to assess the value of every car, add these numbers and divide by the total number of cars
      • Instead, we can randomly select some of the cars, say 200, get the value of each of them and find the average of these 200 numbers
      • These 200 values of randomly selected cars are called a sample of the values of all cars in the United States (the population data)
      • There are various sampling techniques, such as simple random sampling, stratified sampling and systematic sampling, which are explained in the annexure section
  • 4. Basic Terminologies
      • Null hypothesis: in an independent samples t-test, the general statement that there is no statistically significant difference between the two samples
      • Alternative hypothesis: in an independent samples t-test, the statement that there is a statistically significant difference between the two samples
      • For instance, an online store marketing manager decides to test the hypothesis that females have a significantly higher tendency to shop online than males
      • In this case, the null and alternative hypotheses would be:
          • Null hypothesis: there is no significant difference between males and females in terms of tendency to shop online
          • Alternative hypothesis: there is a statistically significant difference between males and females in terms of tendency to shop online
  • 5. Basic Terminologies
      • P-value: in an independent samples t-test, the p-value is the probability of seeing a difference at least as large as the observed one if the null hypothesis were true; a small p-value indicates a statistically significant difference between the two samples
      • Depending on the confidence level desired, the p-value can be checked against different thresholds and the inference made accordingly
      • For instance, for a confidence level of 95% (5% chance of error), check the p-value against the threshold of 0.05
          • If p-value < 0.05, the difference is significant; otherwise it is insignificant
      • Similarly, for a confidence level of 98% (2% chance of error), check the p-value against the threshold of 0.02
          • If p-value < 0.02, the difference is significant; otherwise it is insignificant, and so on
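The threshold rule above can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function name `is_significant` and its parameters are hypothetical, and the p-value of 0.041 is an illustrative input.

```python
def is_significant(p_value: float, confidence_level: float = 0.95) -> bool:
    """Return True when p_value falls below the threshold implied by confidence_level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < confidence_level < 1.0:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    # Allowed chance of error, e.g. about 0.05 for a 95% confidence level.
    alpha = 1.0 - confidence_level
    return p_value < alpha


# Illustrative p-value checked at two confidence levels.
print(is_significant(0.041, 0.95))  # compared against the 0.05 threshold
print(is_significant(0.041, 0.98))  # compared against the 0.02 threshold
```

The same p-value can be significant at one confidence level and insignificant at a stricter one, so the level should be chosen before the result is inspected.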
  • 6. Introduction
      • The independent samples t-test is a statistical test that determines whether there is a statistically significant difference between the means of two independent samples
      • For instance, checking whether the average value of sedan cars is significantly different from that of SUV cars
      • Here the hypotheses would be set as follows:
          • Null hypothesis: SUV and sedan car types have an insignificant difference in terms of value
          • Alternative hypothesis: the values of SUVs and sedans differ significantly
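As a rough sketch of what the test computes, the snippet below derives the t statistic for two independent samples using only the Python standard library. It assumes the equal-variance (pooled) form of the test; the function name `pooled_t_statistic` and the sedan and SUV values are hypothetical and do not come from the slides.

```python
import math
import statistics


def pooled_t_statistic(sample_a, sample_b):
    """Return (mean difference, t statistic, degrees of freedom) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    mean_diff = statistics.mean(sample_a) - statistics.mean(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff, mean_diff / std_error, df


# Hypothetical car values (in thousands), for illustration only.
sedan_values = [20, 22, 19, 24, 25]
suv_values = [28, 30, 27, 26, 29]

mean_diff, t_stat, df = pooled_t_statistic(sedan_values, suv_values)
print(f"mean difference = {mean_diff}, t = {t_stat:.3f}, df = {df}")
```

The p-value is then read from the t distribution with `df` degrees of freedom. The standard library has no t distribution, so in practice a statistics package is used for that step, for example `scipy.stats.ttest_ind` in SciPy.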
  • 7. Example : Input
      Let’s conduct the independent t-test on the following two variables: one is a dimension containing two values (the two independent groups) and the other is a measure (the dependent variable):
          Group   Value
          A       90
          A       95
          A       80
          B       78
          B       75
          B       70
          B       65
  • 8. Example : Output
          Group “A” mean value    79.0
          Group “B” mean value    72.0
          Mean difference         7.0
          P-value                 0.041
      • At 95% confidence level (5% chance of error):
          • As the p-value of 0.041 is less than 0.05, there is a statistically significant difference between the means of the two groups A and B
          • The mean of Group A is significantly higher than that of Group B
      • At 98% confidence level (2% chance of error):
          • As the p-value of 0.041 is greater than 0.02, there is no statistically significant difference between the means of the two groups A and B
  • 10. Sample output 1 : Interpretation
  • 11. Sample output 2 : Model Summary
  • 12. Sample output 3 : Outliers
      • Outliers are data values that differ greatly from the majority of a set of data.
  • 13. Limitations
      • Can be applied to only two samples (one dimension with two values and one measure at a time)
      • Observations within each group must be independent
      • The values in each group must be normally distributed
      • Number of data points should be at least 30
  • 14. General applications
      • Medicine
          • Has the quality of life improved for patients who took drug A as opposed to patients who took drug B?
      • Sociology
          • Are men more satisfied with their jobs than women? Do they earn more?
      • Biology
          • Are foxes in one specific habitat larger than in another?
      • Economics
          • Is the economic growth of developing nations larger than the economic growth of the first world?
      • Marketing
          • Does customer segment A spend more on groceries than customer segment B?
  • 15. Use case 1
      Business problem:
      • An HR manager wants to find out whether male employees earn more than female employees.
      • Here the dependent variable would be ‘Total Annual Income’.
      Business benefit:
      • Once the test is completed, a p-value is generated which indicates whether there is a statistically significant difference between the incomes of the two groups.
      • Based on this value, a manager can easily conclude whether the average income earned by female employees is statistically different from that of male employees and, if the difference is statistically significant, which gender earns more.
  • 16. Use case 1 : Input Dataset
          Gender   Income
          Male     21000
          Male     15000
          Male     25600
          Male     23000
          Female   19750
          Female   25000
          Female   21250
          Female   14400
          Female   10000
  • 17. Use case 1 : Output
          “Male” mean income value      19444.44
          “Female” mean income value    18080.0
          Mean difference               1364.44
          P-value                       0.406
      • P-value : 0.406 (> 0.05) indicates that there is no significant difference between the incomes of males and females.
  • 18. Use case 2
      Business problem:
      • A grocery store sales manager wants to know whether customer segment A spends more on groceries than customer segment B.
      • Here the dependent variable would be ‘Purchase Amount’.
      Business benefit:
      • Once the test is completed, a p-value is generated which indicates whether there is a statistically significant difference between the purchase amounts of the two segments.
      • Based on this value, the grocery store manager can decide on marketing strategies for better sales and increased revenue.
  • 19. Use case 3
      Business problem:
      • Suppose a medical researcher decides to investigate whether exercise or diet control is more effective in lowering cholesterol levels. There are two groups: a calorie-controlled diet group and an exercise-training group.
      • Here the dependent variable would be ‘Cholesterol concentration’.
      Business benefit:
      • Once the test is completed, a p-value is generated which indicates whether there is a statistically significant difference between the cholesterol concentrations of the two groups.
      • Based on this value, the researcher can conclude whether exercise was more effective than diet control in lowering cholesterol levels and suggest a better treatment to patients.
  • 20. Sampling Methods
      • There are three main types of sampling:
      • Simple random sampling:
          • Here, the selection is based purely on chance and every item has an equal chance of being selected
          • A lottery system is an example of simple random sampling
      • Stratified sampling:
          • Here, the population data is divided into subgroups known as strata
          • The members of each subgroup have similar attributes and characteristics in terms of demographics, income, location etc.
          • A random sample is taken from each of these subgroups in proportion to the subgroup size relative to the population size
          • These subsets of the subgroups are then combined to form the final stratified random sample
          • Higher statistical precision is achieved through this method because of the low variability within each subgroup; it also requires a smaller sample size than simple random sampling
  • 21. Sampling Methods
      • Stratified sampling (continued):
          • Government policymakers generally use stratified random sampling to come up with better targeted solutions
      • Systematic sampling:
          • Here, the researcher first decides the sample size and then the sampling interval, which is the standard distance between sampled elements
          • Divide the total population size by the sample size to get this interval
          • For instance, say you want to create a systematic random sample of 1,000 people from a population of 10,000; the interval is 10,000 / 1,000 = 10
          • Using a list of the total population, number each person from 1 to 10,000
          • Then randomly choose a number, like 4, as the number to start with. This means that the person numbered "4" would be your first selection, and then every tenth person from then on would be included in your sample
          • Your sample, then, would be composed of persons numbered 4, 14, 24, 34, 44, 54, and so on down the line until you reach the person numbered 9,994
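The three sampling methods described above can be sketched with Python's standard `random` module. This is a minimal illustration under stated assumptions: the function names, the `seed` parameter and the example strata ("urban", "suburban", "rural") are hypothetical and are not taken from the slides.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Every member has an equal chance of being selected."""
    return random.Random(seed).sample(population, sample_size)


def stratified_sample(strata, sample_size, seed=None):
    """Draw from each stratum in proportion to its share of the population."""
    rng = random.Random(seed)
    total = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        # Rounded quotas may not add up to sample_size exactly for uneven strata.
        quota = round(len(members) * sample_size / total)
        sample.extend(rng.sample(members, quota))
    return sample


def systematic_sample(population, sample_size, start_index):
    """Take every k-th member, where k = population size // sample size."""
    interval = len(population) // sample_size
    if not 0 <= start_index < interval:
        raise ValueError("start_index must fall within the first interval")
    return population[start_index::interval][:sample_size]


people = list(range(1, 10001))  # persons numbered 1 to 10,000

# Start at person "4" (list index 3) and take every tenth person.
picked = systematic_sample(people, 1000, start_index=3)
print(picked[:5], picked[-1])

# Hypothetical strata of 600, 300 and 100 members.
strata = {
    "urban": list(range(600)),
    "suburban": list(range(600, 900)),
    "rural": list(range(900, 1000)),
}
print(len(stratified_sample(strata, 50, seed=1)))
print(len(simple_random_sample(people, 200, seed=1)))
```

Passing a fixed `seed` makes the random draws repeatable, which helps when a sample needs to be audited later. In a real systematic sample the starting point would itself be drawn at random from the first interval.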
  • 22. Want to Learn More? Get in touch with us at support@Smarten.com, and do check out the Learning section on Smarten.com. June 2018