Dr. Urooj A Siddiqui
• Data: raw facts, especially numerical facts, collected together for reference or information.
• Data is collected on one or more particular variables.
• Data analysis is the processing of data to derive useful information.
• Information: knowledge communicated concerning some particular fact.
• The knowledge created helps in APPLICATION / DECISION MAKING.
• Categorical: Qualitative
• Continuous: Quantitative
Data
  Categorical: Nominal, Ordinal
  Continuous: Interval, Ratio
• Variable: any phenomenon that takes at least two different values/observations
• Data: the set of values/observations collected on a variable
• Nominal
• Ordinal
• Interval
• Ratio
1. Data Preparation / Initial Operations
   • Editing / Cleaning
   • Coding
   • Classification
   • Tabulation
2. Summarizing Data / Data Analysis Operations
   • Graphical Representation
     - Tables / Crosstab
     - Graph / Figure
   • Statistical Analysis
     1. Descriptive Methods
        - Frequency, percentage, ratio
        - Mean, median, standard deviation (variance)
     2. Inferential Methods
        - Comparison (t-test / z-test / ANOVA)
        - Association (chi-square test)
        - Correlation (r)
        - Prediction / Regression (y = ax + b)
• Editing / Data Cleaning
  - examining the collected raw data to detect errors and to omit or correct them where possible
• Coding
  - assigning numerals to answers so that responses can be put into a limited number of categories
• Classification
  - grouping data on some basis, so that a large volume of raw data is reduced to homogeneous groups
  I. Attribute: on a demographic basis, e.g. gender, rural/urban, day scholar/hosteller
  II. Class interval: on the basis of some numeric range, e.g. 0-10, 10-20, etc.
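Coding and class-interval classification can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's procedure: the code book `GENDER_CODES`, the helper names, and the choice of inclusive lower limits are all assumptions made for the example.

```python
# Hypothetical code book: the numerals are illustrative assumptions.
GENDER_CODES = {"M": 1, "F": 2}


def code_gender(answer):
    """Coding: map a raw answer to its numeral."""
    return GENDER_CODES[answer]


def class_interval(value, width=10):
    """Classification: return the (lower, upper) class interval holding value.

    Assumption: the lower limit is inclusive and the upper limit exclusive,
    so a value of 10 falls in 10-20, not 0-10.
    """
    lower = (value // width) * width
    return (lower, lower + width)


coded = [code_gender(g) for g in ["M", "M", "F", "F"]]
intervals = [class_interval(v) for v in [4, 10, 17]]
print(coded)
print(intervals)
```

Stating which limit is inclusive matters because intervals such as 0-10 and 10-20 share a boundary value.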
I. Tabulation
• is the process of displaying raw data in tabular form and summarising it for further analysis
• arranges data in an orderly way in columns and rows
Tabulation is essential because
• it conserves space and reduces explanatory statements
• it facilitates summation of items, comparison, and detection of errors and omissions
• it provides a basis for various statistical computations
Name     Gender  Caste   Age  Mob. No.    Edu    Yrs in school  IQ   Pain level  Temp. of locality (deg C)
Ram      M       Hindu   60   9450366367  NIL    0              16   Mild-0      -4
Akbar    M       Muslim  65   8004896712  HS     16             14   Mod-1       20
Sita     F       Hindu   305  9934876545  Int.   19             0    Mild-0      15
Shalini  F       Hindu   90   2542543598  HS     8              1 6  Mild-0      0
Mehnaj   F       Sikh    38   9458098734  UG     21             13   Severe-2    0
Ravi     M       Hindu   48   9412890112  PG     23             20   Mod-1       -1
Hari     M       Hindu   45   8796654398  Prim   12             10   Mod-1       30
Name  Gender  Caste  Age  Mob. No.    Edu level  Yrs in sch.  IQ   Pain level  Temp. of locality (deg C)
7     1       1      60   9450366367  -1         0            16   0           -4
2     1       2      65   8004896712  1          16           14   2           20
5     2       1      35   9934876545  2          19           0    0           15
4     2       1      90   2542543598  1          8            1 6  0           0
3     2       3      38   9458098734  3          21           13   3           0
6     1       1      48   9412890112  4          23           20   2           -1
1     1       1      45   8796654398  0          12           10   2           30

Nominal and ordinal scales are called qualitative; interval and ratio scales are called quantitative.
Single Variable Freq. Table

Age Group (years)  Freq.
Below 20           2
20-22              28
22-24              16
24-26              10
Above 26           4
Total              60

Roll No.  Age (yr)
1         22
2         24
3         23
4         26
5         19
6         25
...       ...
60        22
• Single / Multi Variable Table: one or more variables (no interaction)
  ** Multiple Variable Table: as presented in the slide above
• Crosstabs: interaction of two or more variables
Two Variable Interaction – Crosstab

Age Group  Male  Female  Total
Below 20   1     1       2
20-22      18    10      28
22-24      9     7       16
24-26      7     3       10
Above 26   3     1       4
Total      38    22      60
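A crosstab is a count of records for every combination of two variables, plus the row and column totals. The sketch below shows the idea with Python's standard `collections.Counter`; the five `records` are hypothetical and are not the class data behind the table above.

```python
from collections import Counter

# Hypothetical (age group, gender) records, for illustration only.
records = [
    ("20-22", "Male"),
    ("20-22", "Female"),
    ("22-24", "Male"),
    ("20-22", "Male"),
    ("22-24", "Female"),
]

# Cell counts: how many records fall in each (age group, gender) pair.
cell_counts = Counter(records)
# Marginal totals for rows (age group) and columns (gender).
row_totals = Counter(age for age, _ in records)
col_totals = Counter(gender for _, gender in records)

for age in sorted(row_totals):
    print(age, cell_counts[(age, "Male")], cell_counts[(age, "Female")], row_totals[age])
print("Total", col_totals["Male"], col_totals["Female"], len(records))
```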
Graphical Representation of Data
• Pie Chart
• Bar Graph
• Histogram
• Line Graph
• Scatter Plot
• Scatter Plot & Correlation
Pie Charts
• Used to represent percentages, i.e. the distribution of one variable across its levels
[Figure: pie chart, Sales (in mn): 1st Qtr 8.2 (58%), 2nd Qtr 3.2 (23%), 3rd Qtr 1.4 (10%), 4th Qtr 1.2 (8%)]
Bar Chart
• Used to represent one variable at various levels
• Levels can be years, groups, etc.
[Figure: bar chart of Sales by year: 2018 = 4.3, 2019 = 2.5, 2020 = 3.5, 2021 = 4.5]
Bar Chart
[Figure: clustered bar chart for 2018, 2019 and 2020 with four series (1st, 2nd, 3rd, 4th); data labels 4.3, 2.5, 3.5; 2.4, 4.4, 1.8; 2, 2, 3; 2.5, 3, 4]
Histogram
• Used to show the distribution of a quantitative variable
[Figure: histogram; x-axis: Class Interval / Variable Unit (10 to 50); y-axis: Frequency; bar labels 4, 6, 10, 8, 2, 0]
Line Diagram
• Used to show change in a variable over a particular time period or across some reference range
[Figure: line graph of Stock Price (y-axis, ₹5.60 to ₹7.40) over the Last 10 Days (x-axis, 1 to 10)]
Line Diagram
• May also be used to compare two or more variables along the range
[Figure: line graph comparing Adani, Tata and Reliance over periods 1 to 8 (y-axis, 0 to 14)]
Scatter Plot
• Used to express relationships between two variables
[Figure: scatter plot of Sales in Crore (y-axis, 0 to 6) against Adv Budget in 10 Lacs (x-axis, 0 to 4)]
Scatter Plot
• Used to express relationships between two variables
Scatter Plot
• Trend lines: correlation
Income / day  No. of families
0-500         20
500-1000      30
1000-1500     50
1500-2000     70
2000-2500     40
2500-3000     30
3000-3500     10
...           ...

[Figure: No. of families (y-axis, 0 to 80) plotted against Income (x-axis, 0 to 4000)]
     Age (xi)  x - xi  (x - xi) sq.
A    21        2       4
B    22        1       1
C    23        0       0
D    24        -1      1
E    25        -2      4

Mean x = 23; sum of (x - xi) = 0; sum of (x - xi) sq. = 10
Variance (average squared deviation) = 10 / 5 = 2, with n = 5
SD (square root of variance), s = 1.41
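The arithmetic in the table above can be reproduced in a short Python sketch. It follows the slide in dividing by n (not n - 1); the variable names are chosen for the example.

```python
import math

ages = [21, 22, 23, 24, 25]  # students A to E from the table above
n = len(ages)

mean = sum(ages) / n
deviations = [mean - x for x in ages]        # x - xi, as in the table
squared = [d ** 2 for d in deviations]       # (x - xi) squared

variance = sum(squared) / n                  # average squared deviation
sd = math.sqrt(variance)                     # standard deviation

print(mean, sum(deviations), sum(squared), variance, round(sd, 2))
```

Dividing by n - 1 instead would give the usual sample estimate of the variance; the slide's worked figure uses n.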
Age Group (years)  Freq.  Probability
Below 20           2      2/60
20-22              28     28/60
22-24              16     16/60
24-26              10     10/60
Above 26           4      4/60
Total              60

Mean (x: sample, µ: population) = 23 years
SD (s: sample, sigma: population) = 2 years

Roll No.  Age (yr)
1         22
2         24
3         23
4         26
5         19
6         22
...       ...
60        22
• A distribution in which the frequencies of observations are known is called a probability distribution
• z (Normal) Distribution / Test: Mean (µ), SD (σ)
  - To compare means (1 or 2 means)
• t Distribution / Test: Mean (x), SD (s)
  - To compare means (1 or 2 means)
• Chi-Square Distribution / Test
  - To compare a sample SD with the population SD
• F Test
  - To compare two sample variances
• A frequency distribution with a bell-shaped curve and some known properties
• Parameters: Mean (µ), SD (sigma)
• Known properties
  - about 68% of values are within µ ± 1 SD
  - about 95% of values are within µ ± 2 SD
  - about 99.7% of values are within µ ± 3 SD
• 95% CI = µ ± 2 × SD (range)
  - Lower limit: µ - 2 × SD
  - Upper limit: µ + 2 × SD
[Figure: normal curve centred at 23 with axis marks at 17, 19, 21, 23, 25, 27, 29]
Example of our case
• 95% CI = µ ± 2 × SD
• Lower limit = µ - 2 × SD, Upper limit = µ + 2 × SD
• LL = 23 - 2 × 2 = 19, UL = 23 + 2 × 2 = 27
• 95% CI range = 19-27 years
• About 95% of the students in the class are in the range 19-27 yrs
• We are 95% confident that a randomly selected student from the class will have an age within this range (19-27 yrs)
• The reverse is hypothesis testing
  - If the mean and SD of a population are known and some value is given, can we determine whether it belongs to this population or distribution?
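The 19-27 range can be checked against a normal curve with Python's standard `statistics.NormalDist`. This is a sketch that assumes ages are exactly normally distributed with mean 23 and SD 2, which the slides take as given.

```python
from statistics import NormalDist

mu, sd = 23, 2
lower, upper = mu - 2 * sd, mu + 2 * sd   # 19 and 27

ages = NormalDist(mu, sd)
# Share of the curve lying between the lower and upper limits.
share = ages.cdf(upper) - ages.cdf(lower)

print(lower, upper, round(share, 4))
```

The share comes out slightly above 95% because the exact 95% multiplier is 1.96 rather than 2.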
[Figure: standard normal curve with z-axis marks at -1.5, -1, -0.5, 0, +0.5, +1, +1.5]
Test statistic: z when the population SD is KNOWN; t when the population SD is UNKNOWN
Finding Probability
• Calculate the z score (test statistic) of the observed or hypothesized value using the formula
• Determine the p value associated with that z score and compare it with the selected significance level (5%)
• The p value can be looked up in the tables of the particular test
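A worked sketch of the probability step, using the textbook z score z = (x - µ) / σ. The exact formula from the slide is not available here, so this form is an assumption, as are the observed age of 28 and the helper names.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Distance of x from the mean, in SD units (textbook form, assumed)."""
    return (x - mu) / sigma


def two_sided_p(z):
    """Probability of a value at least this far from the mean, either side."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical question: does an age of 28 belong to a population
# with mean 23 and SD 2?
z = z_score(28, mu=23, sigma=2)
p = two_sided_p(z)
reject_null = p <= 0.05

print(z, round(p, 4), reject_null)
```

Software computes the p value directly in this way; a printed z table gives the same figure by lookup.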
• Two types of hypothesis: Null (H0) and Alternate (Ha)
P Value Method
• Determine the p value
• Compare it with the selected alpha level (0.05)
• p ≤ 0.05: reject null
• p > 0.05: fail to reject null / accept null
• This method is generally employed by data analysis software (Excel, SPSS)
Table Value Method
• Calculate the test statistic value (calculated TS value, TScal)
• Determine the critical value of the test statistic at the selected significance level (table TS value, TStab)
• If TScal ≥ TStab: reject null
• If TScal < TStab: fail to reject null / accept null
• This method is generally employed when testing is done manually
RN  Gender (G)  Caste (C)  Age (A)  Mob. No.    No. of Classes (N)  Marks Obtained (M)  Specialization Opted (S)
1   1           1          22       9450366367  87                  72                  HR-3
2   1           2          24       8004896712  65                  68                  HR-3
3   2           1          26       9934876545  48                  56                  Fin.-2
4   2           1          21       2542543598  95                  83                  Mktg.-1
5   2           3          22       9458098734  65                  58                  Fin.-2
6   1           1          23       9412890112  74                  65                  Mktg.-1

• Mean & Variance (SD), e.g. A, N, M: sample statistics x, s
• Correlation, e.g. N-M, A-N, A-M: r
• Association between Gender and Specialization Opted (G and S): chi-square
Note: a sample characteristic is called a statistic; a population characteristic is called a parameter
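As an illustration of the correlation statistic r, the sketch below computes Pearson's r between No. of Classes (N) and Marks Obtained (M) for the six students in the table above. The function name `pearson_r` is chosen for the example; six rows are far too few for a real conclusion.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


classes = [87, 65, 48, 95, 65, 74]   # N column
marks = [72, 68, 56, 83, 58, 65]     # M column

r = pearson_r(classes, marks)
print(round(r, 2))
```

A value of r close to +1 indicates that students who attended more classes tended to score higher marks in this small table.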
• Assume a population: N, µ, σ
• Now assume we take many samples of size n and calculate the mean of each sample
• x1, x2, x3, x4, x5, x6, ..., x100
• Can we make a frequency distribution of these values and draw a curve?
• When we draw a distribution of these values, it will have an average (x) and an SD (s)
• This average is called the mean of means and is considered the mean of the population
• The SD of this distribution of sample means is σ/√n, which is called the Standard Error
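The idea of a distribution of sample means can be tried out with a small simulation. This is a sketch under assumed values: a normal population with mean 23 and SD 2, samples of size 25, and 2000 repeated samples; none of these numbers comes from the slides except the mean and SD used in the earlier age example.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

MU, SIGMA = 23, 2        # assumed population mean and SD
SAMPLE_SIZE = 25         # n, size of each sample (assumption)
NUM_SAMPLES = 2000       # how many samples are drawn (assumption)

sample_means = []
for _ in range(NUM_SAMPLES):
    sample = [random.gauss(MU, SIGMA) for _ in range(SAMPLE_SIZE)]
    sample_means.append(statistics.fmean(sample))

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
standard_error = SIGMA / SAMPLE_SIZE ** 0.5   # sigma / sqrt(n)

print(round(mean_of_means, 2), round(sd_of_means, 2), standard_error)
```

The mean of the sample means lands close to the population mean, and their SD lands close to σ/√n, the standard error.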
• Sample means and their difference: z / t
• Sample correlation statistic: z / t (derived from r)
• Variance (SD²): F
• Association: chi-square
• Central Limit Theorem
  - If we collect many samples and draw the distribution of their means, the mean of this distribution is the population mean and its SD is the standard error, σ/√n
• We use the CLT in hypothesis testing
  - z: when σ is known and sample size is ≥ 30
  - t: when σ is unknown and sample size is < 30
• In sample estimation the t test is employed
• Example: H0 & H1
  - H0: There is no difference between the means of the two groups
  - H1: There is a significant difference between the means of the two groups
  - H0: There is no difference between the mean marks of males and females
  - H1: There is a significant difference between the mean marks of males and females
• Hypothesis testing steps
  - Set null value (µ1 = µ2, i.e. µ1 - µ2 = 0); make null distribution; calculate the z / t sample test statistic; compare with the table value / set p value; reject / accept null
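The steps above can be sketched with a pooled two-sample t test by the table value method. The marks come from the six-student table earlier in the deck; reading gender code 1 as male and 2 as female is an assumption, and the function name `pooled_t` is chosen for the example.

```python
import math
import statistics


def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    std_error = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.mean(a) - statistics.mean(b)) / std_error


males = [72, 68, 65]     # marks where gender code = 1 (assumed male)
females = [56, 83, 58]   # marks where gender code = 2 (assumed female)

t_cal = pooled_t(males, females)
T_TAB = 2.776            # table value: two-tailed, alpha 0.05, df = 3 + 3 - 2 = 4
reject_null = abs(t_cal) >= T_TAB

print(round(t_cal, 3), reject_null)
```

Here the calculated value is well below the table value, so the null hypothesis of equal mean marks is not rejected for this tiny sample.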
• F test: used to compare the variances of two samples
• Employed in ANOVA (analysis of variance)
  - when there are more than two groups and their means are to be compared
• Example
  - Comparison of marks among three streams of students: arts, commerce and science
  - H0: There is no difference among the mean marks of the three groups
  - H1: There is a significant difference among the mean marks of the three groups
  - Set null value (µ1 = µ2 = µ3); make null distribution; calculate the F test statistic; compare with the table value / p value; reject / accept null
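A minimal one-way ANOVA sketch for the three-stream example. The marks are hypothetical, and the function name `one_way_f` is chosen for the example; the F statistic is the between-group mean square divided by the within-group mean square.

```python
def one_way_f(groups):
    """F statistic for one-way ANOVA over a list of groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical marks for the three streams.
arts = [60, 65, 70]
commerce = [70, 75, 80]
science = [80, 85, 90]

f_cal = one_way_f([arts, commerce, science])
F_TAB = 5.14   # table value of F at alpha 0.05 with df (2, 6)
reject_null = f_cal >= F_TAB

print(f_cal, reject_null)
```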
• Test of independence
• Used to determine the association between two categorical variables (nominal and ordinal)
• Example
  - Gender (M/F) and Opted Specialization (Mktg./Fin./HR)
  - Questions like 'Is any specialisation preferred by females?' are answered
  - H0: There is no association between gender and opted specialisation
  - H1: There is a significant association between gender and opted specialisation
• Here the mean is not calculated; instead, the frequency of each category is taken into consideration
• Actual frequency and expected frequency
• Crosstabs are used to calculate actual and expected frequencies
• Hypothesis testing steps
  - Set null value (actual freq. = expected freq.); make null distribution; calculate the chi-square sample test statistic; compare with the table value / set p value; reject / accept null
Two Variable Interaction – Crosstab

Opted Specialization  Total (60)  Male (40)  Female (20)
Mktg.                 30          20         8
Fin.                  15          10         2
HR                    15          10         10
Total                 60          40         20
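A sketch of the chi-square calculation by the table value method. The row totals (30, 15, 15) and the female counts (8, 2, 10) are taken from the crosstab above; the male counts are assumed here to be row total minus female count (22, 13, 5), since the flattened table does not make the observed male cells clear. Expected frequency for each cell is row total × column total / grand total.

```python
# Observed counts per specialization: (male, female).
# Male counts are an assumption: row total minus the female count.
observed = {
    "Mktg.": (22, 8),
    "Fin.": (13, 2),
    "HR": (5, 10),
}

grand_total = sum(m + f for m, f in observed.values())
col_totals = (
    sum(m for m, _ in observed.values()),
    sum(f for _, f in observed.values()),
)

chi_sq = 0.0
expected = {}
for name, cells in observed.items():
    row_total = sum(cells)
    # Expected frequency if gender and specialization were independent.
    exp_cells = tuple(row_total * c / grand_total for c in col_totals)
    expected[name] = exp_cells
    chi_sq += sum((o - e) ** 2 / e for o, e in zip(cells, exp_cells))

CHI_TAB = 5.991   # table value at alpha 0.05, df = (3 - 1) * (2 - 1) = 2
reject_null = chi_sq >= CHI_TAB

print(expected)
print(round(chi_sq, 2), reject_null)
```

With these assumed counts the calculated value exceeds the table value, so the null hypothesis of no association would be rejected.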
• Set null and alternate hypotheses: H0, H1
• Select the null value
  - Null: status quo, no difference, no effect
  - Status quo: no change
  - No difference: 0 difference
  - No relationship: 0 effect / 0 correlation
  - No association: 0 relationship (between nominal variables)
• It is assumed that H0 is true in the population
• Draw the null distribution: find the range of values expected if the null is true (µ ± 2 × SE)
• Take the observed value from the sample and compare it with the expected null values
• If the observed value lies within the expected null range: accept null
• If the observed value lies outside the null range: reject null
1. Univariate / Bi-variate
   • Mean / Variance Estimation
   • Z test
   • T test
   • Chi Square
   • F Test
   • Correlation
2. Multi-variate
   • Correlation
   • Regression
   • Discriminant
   • Cluster Analysis etc.
• Regression analysis
  - 1 dependent variable / DV (continuous)
  - many independent variables / IV (continuous)
  - Y = a.x1 + b.x2 + c.x3 + … (up to xn)
• Discriminant analysis
  - 1 dependent variable (categorical)
  - many independent variables (continuous)
  - Z (yes/no) = a.x1 + b.x2 + c.x3 + … (up to xn)
• Cluster analysis
  - No DV/IV
  - Used to group respondents/customers into various clusters
  - Employed in market segmentation
• Factor analysis
  - No DV/IV
  - Used to group variables into clusters of more condensed variables
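For the simplest case of regression, one independent variable and the line y = ax + b, the coefficients can be found by least squares. The sketch below uses hypothetical data and a helper name, `fit_line`, chosen for the example; regression with many independent variables is normally done in software.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    b = mean_y - a * mean_x
    return a, b


# Hypothetical data lying exactly on y = 2x + 1.
x_values = [1, 2, 3, 4]
y_values = [3, 5, 7, 9]

a, b = fit_line(x_values, y_values)
predicted = a * 5 + b   # prediction for a new x of 5

print(a, b, predicted)
```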

BRM Unit 3 Data Analysis-1.pptx

  • 1. Dr. Urooj A Siddiqui
  • 2.  Data – Raw Facts, especially numerical facts, collected together for reference or information.  Data is collected on some particular variable/s  Data analysis is processing of data to derive useful information  Knowledge communicated concerning some particular fact  The created knowledge helps in APPLICATION / DECISION MAKING
  • 3.  Categorical: Qualitative  Continuous: Quantitative Data Categorical Nominal Ordinal Continuous Interval Ratio
  • 4.  Any phenomenon which takes at least two different values/ observations  Data: Set of values/ observations collected on variable is called data  Nominal  Ordinal  Interval  Ratio
  • 5. 1. Data Preparation / Initial Operations 2. Summarizing Data / Data Analysis Operations  Editing / Cleaning  Coding  Classification  Tabulation  Graphical Representation  Tables / Crosstab  Graph / Figure  Statistical Analysis 1. Descriptive Methods  Frequency, %age, Ratio,  Mean, Median, Standard Deviation (Variance) 2. Inferential Methods  Comparison (t/z-test/Anova)  Association (chi square test)  Correlation (r)  Prediction/ Regression (y = ax + b)
  • 6.  Editing / Data Cleaning  examining the collected raw data to detect any errors and omit/correct it if possible  Coding  assigning numerals to answers so that responses can be put into a limited number of categories  Classification  Grouping of data on some basis (large volume of raw data is reduced into homogenous groups I. Attribute - on the basis of demographic bases eg. gender, rural/urban, day scholar/hosteller II. Class Interval – on the basis on some numeric range eg. 0-10, 10-20 etc.
  • 7. I. Tabulation  is the process of displaying raw data in tabular form and summarising it for further analysis  orderly arranging data in columns and rows Tabulation is essential because  It conserves space and reduces statements  It facilitates the process of summation of items, comparison, detection of errors and omissions  Basis for various statistical computations
  • 8. Name | Gender | Caste | Age | Mob. No. | Edu | Yrs in school | IQ | Pain level | Temp of locality (deg C)
      Ram | M | Hindu | 60 | 9450366367 | NIL | 0 | 16 | Mild-0 | -4
      Akbar | M | Muslim | 65 | 8004896712 | HS | 16 | 14 | Mod-1 | 20
      Sita | F | Hindu | 305 | 9934876545 | Int. | 19 | 0 | Mild-0 | 15
      Shalini | F | Hindu | 90 | 2542543598 | HS | 8 | 16 | Mild-0 | 0
      Mehnaj | F | Sikh | 38 | 9458098734 | UG | 21 | 13 | Severe-2 | 0
      Ravi | M | Hindu | 48 | 9412890112 | PG | 23 | 20 | Mod-1 | -1
      Hari | M | Hindu | 45 | 8796654398 | Prim | 12 | 10 | Mod-1 | 30
  • 9. Name | Gender | Caste | Age | Mob. No. | Edu level | Yrs in sch. | IQ | Pain level | Temp of locality (deg C)
      7 | 1 | 1 | 60 | 9450366367 | -1 | 0 | 16 | 0 | -4
      2 | 1 | 2 | 65 | 8004896712 | 1 | 16 | 14 | 2 | 20
      5 | 2 | 1 | 35 | 9934876545 | 2 | 19 | 0 | 0 | 15
      4 | 2 | 1 | 90 | 2542543598 | 1 | 8 | 16 | 0 | 0
      3 | 2 | 3 | 38 | 9458098734 | 3 | 21 | 13 | 3 | 0
      6 | 1 | 1 | 48 | 9412890112 | 4 | 23 | 20 | 2 | -1
      1 | 1 | 1 | 45 | 8796654398 | 0 | 12 | 10 | 2 | 30
      Nominal & Ordinal variables are called qualitative. Interval and Ratio variables are called quantitative.
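As a rough illustration of the coding step, the category-to-number mapping implied by the raw table (slide 8) and the coded table (slide 9) can be written as a small codebook. The dictionary names and the `encode` helper are hypothetical; only the codes themselves are read off the two tables (for example Ram: M, Hindu, NIL becomes 1, 1, -1).

```python
# Codebook inferred from the raw table (slide 8) and coded table (slide 9).
# The variable and function names here are illustrative assumptions.
GENDER = {"M": 1, "F": 2}
CASTE = {"Hindu": 1, "Muslim": 2, "Sikh": 3}
EDU = {"NIL": -1, "Prim": 0, "HS": 1, "Int.": 2, "UG": 3, "PG": 4}

def encode(record):
    """Replace category labels with their numeric codes; keep numbers as-is."""
    coded = dict(record)
    coded["gender"] = GENDER[record["gender"]]
    coded["caste"] = CASTE[record["caste"]]
    coded["edu"] = EDU[record["edu"]]
    return coded

ram = {"gender": "M", "caste": "Hindu", "age": 60, "edu": "NIL"}
coded_ram = encode(ram)
print(coded_ram)
```

Keeping the codebook in one place makes it easy to decode the numbers back into labels when results are reported.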
  • 10. Raw data
      Roll No. | Age (yr)
      1 | 22
      2 | 24
      3 | 23
      4 | 26
      5 | 19
      6 | 25
      ... | ...
      60 | 22
      Single Variable Freq. Table
      Age Group (years) | Freq.
      Below 20 | 2
      20-22 | 28
      22-24 | 16
      24-26 | 10
      Above 26 | 4
      Total | 60
       Single / Multi Variable Table: one or more variables (no interaction)
      **Multiple Variable Table: as presented in the above slide
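A minimal sketch of how raw ages might be tallied into the class intervals above. The slide's intervals share their boundaries (22 appears in both 20-22 and 22-24), so the sketch assumes each interval includes its lower limit and excludes its upper limit; that convention, the function name, and the treatment of 26 are assumptions, not something the slide states.

```python
from collections import Counter

def age_group(age):
    """Assign an age to a class interval.

    Assumption: each interval includes its lower limit and excludes its
    upper limit (so 22 falls in 22-24), and 26 itself counts as 'Above 26'.
    """
    if age < 20:
        return "Below 20"
    if age < 22:
        return "20-22"
    if age < 24:
        return "22-24"
    if age < 26:
        return "24-26"
    return "Above 26"

# Only the six ages visible in the raw-data table; the other 54 are not shown.
ages = [22, 24, 23, 26, 19, 25]
freq = Counter(age_group(a) for a in ages)
for group in ["Below 20", "20-22", "22-24", "24-26", "Above 26"]:
    print(group, freq[group])
```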
  • 11.  Crosstabs: interaction of two or more variables
      Two Variable Interaction (Crosstab)
      Age Group | Male | Female | Total
      Below 20 | 1 | 1 | 2
      20-22 | 18 | 10 | 28
      22-24 | 9 | 7 | 16
      24-26 | 7 | 3 | 10
      Above 26 | 3 | 1 | 4
      Total | 38 | 22 | 60
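The row and column totals of a crosstab are simple sums over its cells. The sketch below copies the cell counts from this slide and recomputes the margins; the dictionary layout and variable names are illustrative choices.

```python
# Cell counts copied from the Age Group x Gender crosstab on this slide.
crosstab = {
    "Below 20": {"Male": 1, "Female": 1},
    "20-22": {"Male": 18, "Female": 10},
    "22-24": {"Male": 9, "Female": 7},
    "24-26": {"Male": 7, "Female": 3},
    "Above 26": {"Male": 3, "Female": 1},
}

# Row totals: sum across genders within each age group.
row_totals = {group: sum(cells.values()) for group, cells in crosstab.items()}
# Column totals: sum across age groups for each gender.
col_totals = {
    gender: sum(cells[gender] for cells in crosstab.values())
    for gender in ("Male", "Female")
}
grand_total = sum(row_totals.values())

print(row_totals)
print(col_totals, grand_total)
```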
  • 12. Graphical Representation of Data  Pie Chart  Bar Graph  Histogram  Line Graph  Scatter Plot  Scatter Plot & Correlation
  • 13. Pie Charts  Used to represent %ages, i.e. the distribution of 1 variable at various levels [Figure: pie chart of Sales (in mn) by quarter: 1st Qtr 8.2 (58%), 2nd Qtr 3.2 (23%), 3rd Qtr 1.4 (10%), 4th Qtr 1.2 (8%)]
  • 14. Bar Chart  Used to represent 1 variable at various levels  Levels can be years, groups etc. [Figure: bar chart of Sales by year: 2018 4.3, 2019 2.5, 2020 3.5, 2021 4.5]
  • 16. Histogram  To show the distribution of a quantitative variable [Figure: histogram; x-axis: Class Interval / Variable Unit (10 to 50), y-axis: Frequency]
  • 17. Line Diagram  To show change in a variable over a particular time period / on some reference range [Figure: line graph of Stock Price over the Last 10 Days; y-axis from ₹ 5.60 to ₹ 7.40, x-axis days 1 to 10]
  • 18. Line Diagram  May also be used to compare 2 or more variables along the range [Figure: line graph comparing three series (Adani, Tata, Reliance) over points 1 to 8]
  • 19. Scatter Plot  Used to express the relationship between two variables [Figure: scatter plot; axes: Sales in Crore and Adv Budget in 10’Lacs]
  • 20. Scatter Plot  to express relationships between two variables
  • 21. Scatter Plot  Trend Lines - Correlation
  • 22. Income / day | No. of families
      0-500 | 20
      500-1000 | 30
      1000-1500 | 50
      1500-2000 | 70
      2000-2500 | 40
      2500-3000 | 30
      3000-3500 | 10
      [Figure: plot of No. of families against Income]
  • 23. Student | age (xi) | x − xi | (x − xi)²
      A | 21 | 2 | 4
      B | 22 | 1 | 1
      C | 23 | 0 | 0
      D | 24 | -1 | 1
      E | 25 | -2 | 4
      Sum | | 0 | 10
      Mean x = 23
      Average squared deviation (variance) = 10 / 5 = 2, n = 5
      SD (root of variance) s = 1.41
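The table's arithmetic can be reproduced step by step. This sketch follows the slide in dividing the sum of squared deviations by n; many textbooks divide by n − 1 when the five ages are treated as a sample, which would give a variance of 2.5 instead.

```python
import math

ages = [21, 22, 23, 24, 25]               # students A to E
n = len(ages)
mean = sum(ages) / n                      # 23.0
deviations = [mean - x for x in ages]     # 2, 1, 0, -1, -2 (they sum to 0)
sum_sq = sum(d ** 2 for d in deviations)  # 10.0
variance = sum_sq / n                     # 2.0, dividing by n as the slide does
sd = math.sqrt(variance)                  # about 1.41

print(mean, sum_sq, variance, round(sd, 2))
```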
  • 24. Roll No. | Age (yr)
      1 | 22
      2 | 24
      3 | 23
      4 | 26
      5 | 19
      6 | 22
      ... | ...
      60 | 22
      Age Group (years) | Freq. | Probability
      Below 20 | 2 | 2/60
      20-22 | 28 | 28/60
      22-24 | 16 | 16/60
      24-26 | 10 | 10/60
      Above 26 | 4 | 4/60
      Total | 60 |
      Mean (x for sample, µ for population) = 23 years
      SD (s for sample, σ for population) = 2 years
  • 25.  A distribution in which the frequencies (probabilities) of observations are known is called a probability distribution  z (Normal) Distribution / Test: Mean (µ), SD (σ)  to compare means (1 or 2 means)  t Distribution / Test: Mean (x), SD (s)  to compare means (1 or 2 means)  Chi Square Distribution / Test  to compare sample SD with population SD  F Test  to compare two sample variances
  • 26.  Normal distribution: a freq. distribution with a bell-shaped curve and some known properties  Parameters: Mean (µ), SD (σ)  Known properties  About 68% of values are within µ ± 1 SD  About 95% of values are within µ ± 2 SD  About 99.7% of values are within µ ± 3 SD  95% CI = µ ± 2 × SD (range)  Lower limit = µ − 2 × SD  Upper limit = µ + 2 × SD
  • 28. Example of our case  95% CI = µ ± 2 × SD  Lower limit = µ − 2 × SD, Upper limit = µ + 2 × SD  LL = 23 − 2 × 2 = 19, UL = 23 + 2 × 2 = 27  95% CI Range = 19-27 years  About 95% of the students in the class are in the range of 19-27 yrs  We are 95% confident that if we randomly select a student from the class, his/her age will be within this range (19-27 yrs)  The reverse is Hypothesis Testing  If the mean and SD of a population are known and some value is given, can we determine whether it belongs to this population or distribution?
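A short sketch of the range calculation for the class example (mean 23, SD 2). It also checks, under the assumption that ages are normally distributed, how much of the population actually falls inside mean ± 2 SD. The `within_range` helper is a hypothetical name.

```python
from statistics import NormalDist

mean, sd = 23, 2                              # class mean and SD from the slides
lower, upper = mean - 2 * sd, mean + 2 * sd   # 19 and 27

# Share of a normal population lying inside mean ± 2 SD (a little over 95%).
dist = NormalDist(mu=mean, sigma=sd)
coverage = dist.cdf(upper) - dist.cdf(lower)

def within_range(age):
    """True if the age lies inside the 19-27 range."""
    return lower <= age <= upper

print(lower, upper, round(coverage, 4))
print(within_range(22), within_range(30))
```

A value such as 30 falls outside the range, which is the intuition behind asking whether it belongs to this population at all.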
  • 30. Finding Probability  When population SD is KNOWN: calculate the z score (test statistic) of the observed or hypothesized value, z = (x − µ) / σ  When population SD is UNKNOWN: the t statistic is used in place of z  Determine the p value associated with the particular z (or t) score at the selected significance level (5%)  The p value can be looked up in the tables of the particular test
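The formula images on this slide did not survive extraction, so the sketch below assumes the standard textbook z score, (x − µ) / σ, and a two-tailed p value from the standard normal distribution. The observed value 28 and the function names are hypothetical; µ = 23 and σ = 2 come from the class example.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Standard z score: distance of x from the mean in SD units."""
    return (x - mu) / sigma

def two_tailed_p(z):
    """Probability of a value at least this far from the mean, on either side."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# mu and sigma are the class values from the slides; x = 28 is hypothetical.
z = z_score(28, mu=23, sigma=2)
p = two_tailed_p(z)
print(z, round(p, 4))
```

With z = 2.5 the p value is below 0.05, so an age of 28 would be judged unlikely to come from this population at the 5% level.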
  • 32.  Two types of Hypothesis, Null - H0, Alternate - Ha
  • 34. P Value Method  Determine the p value  Compare it with the selected alpha level (0.05)  p ≤ 0.05: reject null  p > 0.05: fail to reject null / accept null  This method is generally employed by data analysis software (Excel, SPSS) Table Value Method  Calculate the test statistic value (calculated TS value, TSCal)  Determine the critical value of the test statistic at the selected significance level (table TS value, TSTab)  If TSCal ≥ TSTab: reject null  If TSCal < TSTab: fail to reject null / accept null  This method is generally employed when testing is done manually
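The two decision methods above should always agree. The sketch below illustrates this for a two-tailed z test; the function names and the example test statistics are hypothetical, and the critical value is computed from the standard normal distribution rather than read from a printed table.

```python
from statistics import NormalDist

ALPHA = 0.05

def p_value_method(z, alpha=ALPHA):
    """Reject H0 when the two-tailed p value is at most alpha."""
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return p <= alpha

def table_value_method(z, alpha=ALPHA):
    """Reject H0 when |z| reaches the critical (table) value."""
    z_critical = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
    return abs(z) >= z_critical

for z in (0.8, 1.5, 2.5, -3.0):      # hypothetical test statistics
    print(z, p_value_method(z), table_value_method(z))
```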
  • 38. RN | Gender (G) | Caste (C) | Age (A) | Mob. No. | No. of Classes (N) | Marks Obtained (M) | Specialization Opted (S)
      1 | 1 | 1 | 22 | 9450366367 | 87 | 72 | HR-3
      2 | 1 | 2 | 24 | 8004896712 | 65 | 68 | HR-3
      3 | 2 | 1 | 26 | 9934876545 | 48 | 56 | Fin.-2
      4 | 2 | 1 | 21 | 2542543598 | 95 | 83 | Mktg.-1
      5 | 2 | 3 | 22 | 9458098734 | 65 | 58 | Fin.-2
      6 | 1 | 1 | 23 | 9412890112 | 74 | 65 | Mktg.-1
      • Mean & Variance (SD), e.g. A, N, M: sample statistics x, s
      • Correlation, e.g. N-M, A-N, A-M: r
      • Association between Gender and Specialization Opted (G and S): chi square
      Note: a sample characteristic is called a statistic; a population characteristic is called a parameter
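As one worked case of the correlation item, the sketch below computes Pearson's r between No. of Classes (N) and Marks Obtained (M) for the six rows shown. The `pearson_r` function is a plain implementation of the usual formula, written here for illustration; with only six rows the result (roughly 0.9) is indicative at best.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

classes = [87, 65, 48, 95, 65, 74]   # N, No. of Classes (six rows shown)
marks = [72, 68, 56, 83, 58, 65]     # M, Marks Obtained
r = pearson_r(classes, marks)
print(round(r, 2))
```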
  • 39.  Assume a population: N, µ, σ  Now assume we take many samples of size n and calculate the mean of each sample  x1, x2, x3, x4, x5, x6, . . . , x100  Can we make a freq. distribution of these values and draw a curve?  When we draw a distribution of these values we will have an average (x) and SD (s)  This average is called the mean of means and is taken as the mean of the population  The SD of this distribution of sample means is σ / √n, which is called the Standard Error (SE)
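A small simulation can make the idea of a distribution of sample means concrete. The sketch assumes a normally distributed population with the class values (mean 23, SD 2); the sample size, number of samples, and seed are arbitrary choices. It compares the SD of the simulated sample means with the textbook standard error, σ / √n.

```python
import math
import random
import statistics

random.seed(0)                        # fixed seed so the run is repeatable
MU, SIGMA = 23, 2                     # population mean and SD (class example)
n, num_samples = 25, 2000             # hypothetical sample size and count

# Draw many samples of size n and record each sample mean.
sample_means = [
    statistics.fmean([random.gauss(MU, SIGMA) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)
standard_error = SIGMA / math.sqrt(n)  # textbook formula: sigma / sqrt(n)

print(round(mean_of_means, 2), round(sd_of_means, 2), standard_error)
```

The mean of the sample means lands close to the population mean, and their SD lands close to σ / √n rather than to σ itself.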
  • 42.  Sample means & their difference: z / t  Sample correlation statistic: z / t (derived from r)  Variance (SD²): F  Association: Chi Sqr.  Central Limit Theorem  If we collect many samples and draw the distribution of their means, the mean of this distribution is the population mean and its SD is the Standard Error, σ / √n  We use CLT in Hypothesis Testing
  • 43.  z: when σ is known and sample size is ≥ 30  t: when σ is unknown and sample size < 30  In sample estimation the t test is employed  Example: H0 & H1  H0 – There is no difference b/w the means of two groups  H1 – There is a significant difference b/w the means of two groups  H0 – There is no difference b/w mean marks of males & females  H1 – There is a significant difference b/w mean marks of males & females  Hypothesis Testing steps  Set null value (µ1 = µ2, µ1 − µ2 = 0) – make null distribution – calculate z / t sample test statistic – compare with table value / set p value – reject / accept null
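The male/female example can be sketched with the six marks listed in the table on slide 38 (gender code 1 taken as male, 2 as female, following the earlier coding). The slides do not say which form of the t test is intended, so the equal-variance (pooled) two-sample form is assumed here, and the critical value 2.776 is the standard two-tailed table value for alpha 0.05 with 4 degrees of freedom.

```python
import math
import statistics

# Marks Obtained from the six-row table on slide 38, split by gender code
# (1 = male, 2 = female, following the coding used earlier in the deck).
male_marks = [72, 68, 65]
female_marks = [56, 83, 58]

def pooled_t(a, b):
    """Two-sample t statistic assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    var_a, var_b = statistics.variance(a), statistics.variance(b)  # n - 1 divisor
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1 / na + 1 / nb))
    return (statistics.fmean(a) - statistics.fmean(b)) / se

t = pooled_t(male_marks, female_marks)
T_CRITICAL = 2.776                   # two-tailed table value, alpha 0.05, df 4
reject_null = abs(t) >= T_CRITICAL
print(round(t, 3), reject_null)
```

With such a small t the null hypothesis of equal mean marks is not rejected, although three observations per group is far too few for a real study.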
  • 44. F Test  Used to compare the variances of two samples  Employed in ANOVA (analysis of variance)  When there are more than two groups and their means are to be compared  Example  Comparison of marks among three streams of students: arts, commerce and science  H0 – There is no difference among the mean marks of the three groups  H1 – There is a significant difference among the mean marks of the three groups  Set null value (µ1 = µ2 = µ3) – make null distribution – calculate F test statistic – compare with table value / p value – reject / accept null
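A minimal one-way ANOVA sketch for the three-stream example. The marks are invented purely for illustration (the slides give no data for this case), and 5.14 is the standard table value of F for alpha 0.05 with (2, 6) degrees of freedom.

```python
import statistics

# Hypothetical marks for three streams; not taken from the slides.
groups = {
    "arts": [50, 55, 60],
    "commerce": [60, 65, 70],
    "science": [70, 75, 80],
}

all_marks = [m for marks in groups.values() for m in marks]
grand_mean = statistics.fmean(all_marks)

# Between-group sum of squares: spread of group means around the grand mean.
ss_between = sum(
    len(marks) * (statistics.fmean(marks) - grand_mean) ** 2
    for marks in groups.values()
)
# Within-group sum of squares: spread of marks around their own group mean.
ss_within = sum(
    (m - statistics.fmean(marks)) ** 2
    for marks in groups.values()
    for m in marks
)

df_between = len(groups) - 1
df_within = len(all_marks) - len(groups)
f_stat = (ss_between / df_between) / (ss_within / df_within)

F_CRITICAL = 5.14                    # table value for alpha 0.05, df (2, 6)
print(f_stat, f_stat >= F_CRITICAL)
```

The F statistic is the ratio of between-group variance to within-group variance, which is why the F test for comparing variances sits underneath ANOVA.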
  • 45. Chi Square Test  Test of Independence  It is used to determine association between two categorical variables (nominal & ordinal)  Example  Gender (M/F) and Opted Specialization (Mktg./Fin./HR)  Questions like ‘is any specialisation preferred by females?’ are answered  H0 – There is no association b/w gender and opted specialisation  H1 – There is a significant association b/w gender & opted specialisation  Here the mean is not calculated; instead the frequency of each category is taken into consideration  Actual Frequency and Expected Frequency
  • 46.  Cross tabs are used to calculate actual & expected freq.  Hypothesis Testing steps  Set null value (actual freq. = expected freq.) – make null distribution – calculate chi sqr. sample test statistic – compare with table value / set p value – reject / accept null
      Two Variable Interaction (Crosstab)
      Opted Specialization | Total (60) | Male (40) | Female (20)
      Mktg. | 30 | 20 | 8
      Fin. | 15 | 10 | 2
      HR | 15 | 10 | 10
      Total | 60 | 40 | 20
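A hedged sketch of the chi square calculation for this crosstab. Expected counts follow the usual rule, row total × column total / grand total. The printed table is ambiguous: its Male column (20, 10, 10) matches the expected male counts, while the Female column (8, 2, 10) looks like observed counts. The sketch therefore assumes the female figures are observed and infers the male observed counts as row total minus female; this reading is an assumption, not something the slide confirms.

```python
import math

# Row totals and column totals as printed in the crosstab.
row_totals = {"Mktg.": 30, "Fin.": 15, "HR": 15}
col_totals = {"Male": 40, "Female": 20}
grand_total = 60

# Expected count under H0 = row total x column total / grand total.
expected = {
    (spec, g): row_totals[spec] * col_totals[g] / grand_total
    for spec in row_totals
    for g in col_totals
}

# ASSUMPTION: female observed counts are 8, 2, 10 (Female column) and male
# observed counts are row total minus female. The slide layout is ambiguous.
female_obs = {"Mktg.": 8, "Fin.": 2, "HR": 10}
observed = {}
for spec, total in row_totals.items():
    observed[(spec, "Female")] = female_obs[spec]
    observed[(spec, "Male")] = total - female_obs[spec]

chi_sq = sum((observed[k] - expected[k]) ** 2 / expected[k] for k in expected)
df = (len(row_totals) - 1) * (len(col_totals) - 1)   # 2
p_value = math.exp(-chi_sq / 2)      # exact chi-square tail area when df = 2

print(round(chi_sq, 2), df, round(p_value, 4))
```

Under this reading the p value is well below 0.05, so the null hypothesis of no association between gender and specialisation would be rejected.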
  • 47.  Set Null and Alternate Hypothesis: H0, H1  Select the null value  Null: status quo, no difference, no effect  Status quo: no change  No difference: 0 difference  No relationship: 0 effect / 0 correlation  No association: 0 relationship (b/w nominal variables)  It is assumed that H0 is true in the population  Draw Null Distribution: find the range of expected values if the null is true (µ ± 2 × SE)  Take the observed value from the sample and compare it with the expected null values  If the observed value is within the expected null range: accept null  If the observed value is outside the null range: reject null
  • 48. 1. Univariate / Bi-variate  Mean / Variance Estimation  Z test  T test  Chi Square  F Test  Correlation 2. Multi-variate  Correlation  Regression  Discriminant  Cluster Analysis etc.
  • 49.  Regression analysis  1 dependent variable / DV (continuous)  many independent variables / IV (continuous)  Y = a.x1 + b.x2 + c.x3 + … (one term per IV, up to xn)  Discriminant analysis  1 dependent variable (categorical)  many independent variables (continuous)  Z (yes/no) = a.x1 + b.x2 + c.x3 + … (one term per IV, up to xn)
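The simplest case of regression, one independent variable and the line y = ax + b mentioned earlier in the deck, can be fitted by ordinary least squares in a few lines. The data below are hypothetical and lie exactly on y = 2x + 1; the multi-variable form on this slide needs matrix methods that are not shown here.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx                 # slope
    b = mean_y - a * mean_x       # intercept
    return a, b

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
a, b = fit_line(xs, ys)
print(a, b)
```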
  • 50.  Cluster analysis  No DV/IV  Used to group respondents/customers into various clusters  Employed in market segmentation  Factor analysis  No DV/IV  Used to group variables into a smaller number of more condensed variables (factors)