Project Report
On
Poolability Analysis of NSS 67th
Round
Central and Delhi State Sample
For
Unincorporated Non-Agricultural
Enterprises
(Excluding Construction)
Sponsored by
Ministry of Statistics and Programme
Implementation (MOSPI)
Under the Scheme for
Internship of post-graduate/research students
during 2015-16
Made By:
Srimoyee Bose
M.Sc. Statistics
ACKNOWLEDGEMENT
Every project, big or small, is successful largely due to the effort of a number of
brilliant people who have always given their valuable advice or lent a helping hand.
I sincerely appreciate the inspiration, support and guidance of all those people who
have been instrumental in making this project a success.
I, Srimoyee Bose, a student of Ramjas College (University of Delhi), am extremely
grateful to the Directorate of Economics & Statistics for the confidence bestowed in me
and for entrusting me with my project entitled “Operational and Financial Characteristics of
Unincorporated Non-Agricultural Enterprises (Excluding Construction) in Delhi”
with special reference to the Ministry of Statistics and Programme Implementation
(MOSPI).
At this juncture I feel deeply honoured in expressing my sincere thanks to Mr. C.K.
Dutta for making the resources available at the right time and providing valuable
insights leading to the successful completion of my project.
I express my gratitude to the Department of Statistics for arranging the summer
training on a good schedule. I also extend my gratitude to Mr. Praveen Srivastava and
Mr. Hemant, who assisted me in compiling the project.
I would also like to thank Mr. Praveen Chaurasia for his critical advice and guidance
without which this project would not have been possible.
Any omission in this brief acknowledgement does not mean lack of gratitude.
CONTENTS
Ch.No  TOPIC
01. Introduction & Background
    a. Organizational Structure
    b. Schedule of Surveys
    c. Background of DES
    d. States participation in NSS
    e. Need for Pooling
    f. About NSS 67th Round
    g. Errors in Survey: Sampling & Non-sampling errors
02. Overview of 67th Round
    a. Sampling Design & Estimation Procedure
    b. Concepts & Definitions
    c. Classification of Enterprises (NIC-2008)
03. Methodology & Analysis
    1. Methodology
    2. Pooling of data
    3. Non-Parametric Tests
       a. Multinomial Tests
       b. Wald-Wolfowitz Runs Test
       c. Poolability Test Result
    4. Methods of Pooling
       a. Inverse Variance
       b. Weighted Average
       c. Poolability Analysis
    5. Relative Standard Error
    6. Divergence
04. Conclusion & Suggestions
OBJECTIVE
The main objective of our project is the Poolability Analysis of NSS 67th Round,
Sch. 2.34 data of Unincorporated Non-Agricultural Enterprises (Excluding
Construction) in Delhi, for the State and Central samples.
INTRODUCTION AND BACKGROUND
The National Sample Survey (NSS) was set up in 1950 under the Indian
Statistical Institute (ISI) to bridge large gaps in the statistical data
needed for planning, policy formulation and computation of national
income aggregates, especially in respect of the unorganized and
household sectors of the economy. The NSS was re-organized into the
National Sample Survey Office (NSSO) in 1972 under the Ministry of
Statistics and Programme Implementation (MOSPI).
Organizational structure:
• The NSSO is headed by the DG & CEO
• NSSO has 4 Divisions:
  • Survey Design & Research Division (SDRD)
    Located at Kolkata
  • Field Operations Division (FOD)
    Headquarters located at New Delhi & Faridabad
    Six zonal offices located at Bangalore, Kolkata, Guwahati,
    Jaipur, Lucknow and Nagpur
    49 field offices at regional level and 118 field offices at
    sub-regional level spread over India
  • Data Processing Division (DPD)
    Headquarters located at Kolkata
    8 data processing centres located at Kolkata, Nagpur, Delhi,
    Giridih, Ahmedabad and Bangalore
  • Coordination & Publication Division (CPD)
    Headquarters located at New Delhi
Schedule of Surveys:
• Ten Year Cycle
  • Consumer Expenditure and Employment & Unemployment - Twice
  • Social Consumption (health, education etc.) - Twice
  • Un-organised Manufacturing - Twice
  • Un-organised Services - Twice
  • Land & Livestock Holdings - Once
  • Open Round - Once
    (Special surveys are also undertaken)
• Annual Consumer Expenditure and Employment &
  Unemployment Surveys (thin sample)
NSSO has been conducting nationwide multi-subject, integrated,
large-scale sample surveys in the form of successive rounds covering
various aspects of social, economic, demographic, industrial and
agricultural statistics. These surveys are undertaken striking a balance
between the urgent and contemporary need for reliable statistical data
on different topics and the constraints of limited resources, both
physical and financial.
Certain topics, such as labour force, household consumer expenditure,
social consumption, housing conditions, unorganized
non-agricultural enterprises, household land and livestock
holdings, and debt and investment, are repeated at decadal intervals.
The remaining years are for open rounds in which subjects of
current/special interest are undertaken on the demand of other
Central Ministries, and national and international organizations, etc.
NSSO has become synonymous with reliable estimates on various
aspects of economic and social life in India based on large scale sample
surveys.
Background about Directorate of Economics &
Statistics, Delhi
The Directorate of Economics & Statistics is the nodal department of the National
Capital Territory (NCT) of Delhi for the collection, compilation, analysis and
presentation of statistical data and information. The Directorate of
Economics & Statistics also works as the Office of the Chief Registrar,
Births & Deaths. The Director is the Chief Registrar, Births & Deaths for the
NCT of Delhi.
States participation in NSS:
The States started participating in the programme of collecting
socio-economic data on the same subjects from the 8th round (July 1954 to
June 1955) of the NSS, using the same concepts, definitions and procedures
and adopting the same sample design as the NSSO, based on independently
drawn samples.
These two field operations are generally referred to as the central and state
samples of the National Sample Survey.
Sample sizes of the central and state samples are equal for most of the
States/UTs (equal matching sample). But there are some states where
the number of samples surveyed by state statistical agencies is double
the size of the central sample.
One of the objectives of the states' participation in the NSS programme is
to provide a mechanism by which the sample size is increased, so that
pooling the two sets of data enables better estimates at
lower sub-state levels, particularly at district level. At the State level,
this results in increased precision of the estimates, and at
disaggregated levels the estimates become more stable. The major
benefit, however, is derived when estimates are generated at
sub-state level, such as NSS regions or districts.
Need for Pooling
It has been observed for quite some time that the results and reports
presented by the Government on various parameters (e.g. employment
rate, GDP, etc.) of our country are far from the actual situation, which is
mainly due to sampling and non-sampling errors.
To improve the precision of the estimated values, one needs as
much data as possible. This can be achieved through pooling.
Pooling increases the sample size by combining the central data
and the state data. It is a new technique which aims at providing reliable
results for our parameters.
The harmonization of data processing is one of the key
requirements for pooling the different sets of data. The state sample data
should be processed using the same set of validation rules as in the
case of central sample data. Accordingly, it is essential that the State
sample data are processed using the same data entry layout as
in the case of the Central sample. If a State evolves its own data
layout for its convenience, then the state data should be converted to
the layout of the Central data, so that the
same software developed for central samples can be used.
ABOUT NSS 67th ROUND:
During the 67th round (1st July 2010 to 30th June 2011), NSSO carried out an
all-India enterprise survey on unincorporated non-agricultural
enterprises in manufacturing, trade and other service sectors, excluding
construction and electricity, gas and water supply. The main aim was
to get estimates of various economic and operational characteristics of
these enterprises at National as well as State level.
The survey was designed to estimate value of key characteristics per
enterprise like average no. of workers, fixed assets, outstanding loans,
total receipts, total operating expenses and gross value added
separately for ‘Own Account Enterprises (OAEs)’ and ‘Establishments’.
Information on various operational characteristics like ownership,
nature of operation, location, status of registration etc. was also
collected to give an insight into the economic scenario of the
unincorporated non-agricultural enterprises in the country. These
economic and operational indicators are required for planning, policy
and decision making at various levels, both within the government
and outside.
The following parameters have been considered for poolability testing
and analysis of NSS 67th Round, Sch. 2.34 data of Unincorporated
Non-Agricultural Enterprises (Excluding Construction) in Delhi:
• TYPE OF ENTERPRISE
• TYPE OF WORKER
• BROAD ACTIVITY TYPE
• GROSS VALUE ADDED (GVA)
• GROSS VALUE ADDED PER ENTERPRISE
The eligibility criterion for an enterprise to be covered in the survey is at
least 30 days of operation (15 days for seasonal enterprises and self-help
groups) in the reference year.
SAMPLING & NON-SAMPLING ERRORS
The errors involved in the collection, processing and analysis of data
may be broadly classified under the following two heads:
• Sampling errors
• Non-sampling errors
Sampling Errors
Sampling error arises in a data collection process as a result of taking a
sample from a population rather than using the whole population. It arises
primarily due to:
• Improper selection of the sample in a survey
• Wrong usage of a sub-population in the selection of the sample
Non – Sampling Errors
Non-sampling error arises in a data collection process as a result of
factors other than taking a sample. Such errors exist in both sample
surveys and censuses, and they have the potential to cause bias in
polls, surveys or samples. They may primarily arise due to:
• Coverage or frame error: If units that should have been part of the
frame are not represented in it, the omitted units have zero
probability of selection. On the other hand, if some units are
duplicated, it results in over-coverage, with such units having
larger probabilities of selection.
• Response errors: Response errors result when data are
incorrectly requested, provided, received or recorded. These
errors may occur because of inefficiencies with the
questionnaire, the interviewer, the respondent or the survey
process.
1. Poor questionnaire design - It is essential that sample
survey or census questions are worded carefully in order to
avoid introducing bias. If questions are misleading or
confusing, then the responses may end up being distorted.
2. Interviewer bias - An interviewer can influence how a
respondent answers the survey questions. This may occur
when the interviewer is too friendly or prompts the
respondent.
3. Respondent errors - Respondents can also provide incorrect
answers. Faulty recollections, tendencies to exaggerate, and
inclinations to give 'socially desirable' answers are a few reasons
why a respondent may give an incorrect answer.
• Non-response errors - Non-response errors are the result of
not having obtained sufficient answers to survey questions.
There are two types of non-response errors: complete and
partial.
• Processing errors - Processing errors are those which
sometimes emerge during the preparation of the final data files.
• Estimation errors - If an inappropriate estimation method is
used, then bias can still be introduced, regardless of how
error-free the survey had been before the estimation process.
• Analysis errors - Analysis errors are those that occur when
the wrong analytical tools are used or when preliminary results
are provided instead of the final ones. Errors that occur during
the publication of data results are also considered analysis
errors.
OVERVIEW OF 67TH ROUND
• Sampling Design and Estimation Procedure:
  • The field work of the 67th round was carried out from 1st July
    2010 to 30th June 2011. The entire survey period was divided
    into four sub-rounds of three months' duration each, and an
    equal number of sample villages and blocks was
    allocated to each sub-round.
  • During this round the following two schedules of enquiry were
    canvassed:
    a. Schedule 0.0: list of households and non-agricultural
       enterprises
    b. Schedule 2.34: unincorporated non-agricultural
       enterprises (excluding construction)
    Here we are concerned only with Schedule 2.34.
  • A total of 424 FSUs (16 villages and 408 urban blocks) were
    allotted for Delhi as the state sample. All 424 FSUs were
    surveyed for canvassing Schedule 2.34.
  • A stratified multi-stage design was adopted for the 67th round.
  • The first stage units (FSUs) are the census villages in the
    rural sector and Urban Frame Survey (UFS) blocks in the
    urban sector. The ultimate stage units (USUs) are
    enterprises in both sectors. In the case of large FSUs, one
    intermediate stage of sampling is the selection of three
    hamlet-groups (hgs)/sub-blocks (sbs) from each large
    rural/urban FSU.
  • Two basic strata were formed at the state/UT level, viz. the
    rural stratum and the urban stratum.
  • For the rural sector, if ‘r’ was the sample size allocated for a rural
    stratum, the number of sub-strata formed was ‘r/4’, and from
    each sub-stratum sample villages were selected with
    Probability Proportional to Size With Replacement
    (PPSWR), size being the population of the village as per
    Census 2001.
  • For the urban sector, if ‘u’ was the sample size for an urban
    stratum, the number of sub-strata formed was ‘u/4’, and from
    each sub-stratum FSUs were selected by Simple Random
    Sampling Without Replacement (SRSWOR). Enterprises
    listed in the selected FSU/sub-FSU were stratified into 19
    second stage strata (SSS), from which sample enterprises
    were selected by SRSWOR.
  • A sample of 16000 FSUs for the central sample and 18248
    FSUs for the state sample was allocated at the all-India level.
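The two first-stage selection schemes named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the village populations, the number of blocks and the function names `select_ppswr` and `select_srswor` are hypothetical, and drawing 4 units per sub-stratum is inferred from the 'r/4' and 'u/4' rule rather than stated in the report.

```python
import random


def select_ppswr(sizes, n_draws, rng):
    """Draw unit indices with probability proportional to size, with replacement."""
    if not sizes or any(s <= 0 for s in sizes):
        raise ValueError("sizes must be non-empty and strictly positive")
    # choices() samples with replacement; the weights play the role of
    # the village population used as the size measure.
    return rng.choices(range(len(sizes)), weights=sizes, k=n_draws)


def select_srswor(n_units, n_draws, rng):
    """Draw distinct unit indices by simple random sampling without replacement."""
    return rng.sample(range(n_units), n_draws)


rng = random.Random(67)  # fixed seed so the sketch is reproducible

# Hypothetical rural sub-stratum: five villages with made-up populations.
village_population = [1200, 450, 3100, 800, 2250]
rural_sample = select_ppswr(village_population, 4, rng)

# Hypothetical urban sub-stratum: 30 UFS blocks, 4 selected.
urban_sample = select_srswor(30, 4, rng)

print("rural FSUs (may repeat):", rural_sample)
print("urban FSUs (all distinct):", urban_sample)
```

Under PPSWR the same village can be selected more than once, whereas SRSWOR always returns distinct blocks.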
• Concepts and Definitions:
1. ENTERPRISE:
An enterprise is an undertaking which is engaged in the
production and/or distribution of some goods and/or
services meant mainly for the purpose of sale, whether
fully or partly. An enterprise may be owned and operated
by a single household or by several households jointly, or
by an institutional body.
2. MANUFACTURING ENTERPRISE:
A manufacturing enterprise is a unit engaged in the
physical or chemical transformation of materials,
substances or components into new products. Units
primarily engaged in the maintenance and repair of industrial,
commercial and similar machinery and equipment are also
included under manufacturing enterprises. The production
of goods for the sole purpose of domestic consumption was
not considered as manufacturing.
3. TRADING ENTERPRISE:
A trading enterprise is an undertaking engaged in trade.
Trade is defined to be an act of purchase of goods and their
disposal by way of sale without any intermediate physical
transformation of goods.
4. SERVICING ENTERPRISE:
A servicing enterprise is engaged in activities carried out
for the benefit of a consuming unit; these typically consist of
changes in the condition of consuming units realized by
the activities of the servicing unit at the demand of the
consuming unit. It is possible for a unit to produce a
service for its own consumption, provided that the type of
activity is such that it could have been carried out by
another unit.
5. OWN ACCOUNT ENTERPRISE (OAE):
An enterprise which is run without any hired worker
employed on a fairly regular basis is termed an own account
enterprise.
6. ESTABLISHMENT:
An enterprise which employs at least one hired
worker on a fairly regular basis is termed an establishment. Paid or
unpaid apprentices and paid household members/
servants/resident workers in an enterprise are considered
hired workers.
7. WORKER:
A worker is defined as any person working within the
premises of the enterprise who is on the payroll of the
enterprise, as well as working owners and unpaid family
workers. Salespersons appointed by an enterprise for
selling its services, and apprentices, paid or unpaid, were
also treated as workers.
8. REFERENCE PERIOD:
The last 30 days preceding the date of survey, or the last month, has
been used as the reference period to collect most of the
data.
Classification of Enterprises (NIC-2008)
a. MANUFACTURING ENTERPRISE:
All the activities covered in divisions 10 to 33 of NIC-2008
are considered as manufacturing for the purpose
of the survey. In addition to this, NIC-2008 code
01632 is covered in the present survey.
b. TRADING ENTERPRISE:
All the activities covered in divisions 45 to 47 of NIC-2008
are considered as trade for the purpose of the
survey.
c. SERVICING ENTERPRISE:
All the activities covered in divisions 50 to 96 of NIC-2008
are considered as services for the
purpose of the survey. In addition to this, NIC-2008
codes 37 to 39, 643, 649, 661 to 663, 771 to 773, 941 and
949 are considered as servicing enterprises and
covered in the present survey.
NIC-2008 codes 36, 491, 49212, 49213, 493, 51, 641, 642, 6611, 774,
942, 9491 (organisations only) and 9492 are outside the coverage of the
survey.
METHODOLOGY
SPSS and Microsoft Excel have been used to test and pool the data, and
the following methodology has been used:
1. TESTING: The complete analysis and poolability testing is based on
non-parametric tests, mainly the Chi-Square goodness-of-fit test, to
test whether the data of the centre and the state follow the same
distribution, and hence whether they are actually poolable.
2. POOLED ESTIMATES: Estimates were calculated using the ‘Inverse
Variance Method’ and the ‘Weighted Average Method’.
3. RSE: In this report we also calculate the standard error (SE) and the
relative standard error (RSE) to check the percentage of error and its
deviation from the central point. After obtaining the RSE for the urban
and rural sectors of the state and central data, we pool the RSE to check
the percentage of error which is likely to occur at the time of
pooling.
4. The parameters which passed the tests were pooled, and the
results are presented in the sections that follow.
POOLING OF DATA
• Condition:
The harmonization of data processing is one of the key
requirements for pooling the different sets of data. The state sample
data should be processed using the same set of validation rules as
in the case of central sample data.
Need for testing:
Although the central and state samples are drawn
independently, following an identical sampling design with the same
concepts, definitions and instructions, the data files are in some
cases not properly validated, owing to inadequate training of the
field and processing staff of the States/UTs. Agency bias is also
expected in the two sets of data, since they are generated by
different agencies. As such, they cannot simply be
merged to generate pooled estimates. Therefore one needs to
test that the samples come from an identical distribution
function. Since the parametric distribution of the sample mean is
unknown, one may adopt non-parametric tests such as the
Wald-Wolfowitz runs test and the multinomial distribution test
for this purpose.
Methodology of pooling:
Two alternative methods are used in pooling the central and state
sample data.
o Weighting by Matching Ratio: The aggregate
estimate of the pooled sample is built by combining the state and
central aggregate estimates in proportion to the matching ratio
m : n, where m and n are the allotted sample sizes of the state and
central samples, separately for the rural and urban sectors. The
ratio estimate of the pooled sample is then obtained as the ratio of
the pooled aggregate estimates.
o Weighting by Inverse of Variance: Ratio estimates are
built by weighting the ratio estimates of the central and state
samples in proportion to the inverse of the variance of the ratio for
the central and state samples.
NON-PARAMETRIC TESTS:
• Multinomial Distribution Test (χ² test for goodness of fit)
For discrete data such as status of activity or educational level, and
for categorical variables such as land possessed, the standard
chi-square tests of equality of sample proportions of two sets of data,
based on multinomial distributions, may be used
after grouping the attributes/categorical variables into a suitable
number of classes so that each class contains an adequate number of
sample observations. Construct a 2 × k contingency table for k classes
at the domain where the two sets of data are to be pooled, as below, and
use the chi-square test to check whether the State sample and the Central
sample have identical distributions.
Sample-type      No. of sample observations                          Total
                 Class-1   Class-2   ...   Class-(k-1)   Class-k
State Sample     N11       N12       ...   N1(k-1)       N1k        N1.
Central Sample   N21       N22       ...   N2(k-1)       N2k        N2.
Total            N.1       N.2       ...   N.(k-1)       N.k        N..
H0: Two samples come from populations having same
distributions.
H1: Two samples come from populations having different
distributions.
Test Statistic:
$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{k} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where,
$\chi^2$ = Pearson's cumulative test statistic, which asymptotically
approaches a $\chi^2$ distribution
$O_{ij}$ = observed frequency = $N_{ij}$
$N_{..}$ = total number of observations
$E_{ij}$ = expected (theoretical) frequency
= $(N_{i.} \times N_{.j}) / N_{..}$, where i = 1 to 2, j = 1 to k.
Degrees of Freedom = (2-1)*(k-1), where k = number of columns
The Fisher-Yates table gives the tabulated value of chi-square at
(2-1)*(k-1) d.f. at the 5% level of significance.
Decision Rule: If $\chi^2$ ≥ tabulated value, then reject H0.
NOTE: If k = 2, the contingency table becomes of order 2 × 2:
Sample-type      No. of sample observations   Total
                 Class-1   Class-2
State Sample     N11       N12                N1.
Central Sample   N21       N22                N2.
Total            N.1       N.2                N..
$$\chi^2 = \frac{N_{..}\,(N_{11} N_{22} - N_{12} N_{21})^2}{N_{1.}\, N_{.1}\, N_{2.}\, N_{.2}}$$
Degrees of Freedom = (2-1)*(2-1) = 1
The Fisher-Yates table gives the tabulated value of chi-square at 1 d.f. at
the 5% level of significance.
Decision Rule: If $\chi^2$ ≥ tabulated value, then reject H0.
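The chi-square computation described above can be sketched in plain Python. This is a minimal illustration, not the SPSS/Excel procedure used in the report: the function names and the class counts are hypothetical, and the small table of 5% critical values is hard-coded from standard chi-square tables for 1 to 4 degrees of freedom only.

```python
def chi_square_2xk(state, central):
    """Return (Pearson chi-square statistic, degrees of freedom) for a 2 x k table."""
    if len(state) != len(central) or len(state) < 2:
        raise ValueError("both rows need the same number (at least 2) of classes")
    rows = [state, central]
    row_totals = [sum(r) for r in rows]                     # N1., N2.
    col_totals = [s + c for s, c in zip(state, central)]    # N.1 ... N.k
    grand_total = sum(row_totals)                           # N..
    chi2 = 0.0
    for i in range(2):
        for j in range(len(state)):
            # Expected frequency E_ij = N_i. * N_.j / N..
            # (assumes every class has at least one observation)
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (rows[i][j] - expected) ** 2 / expected
    return chi2, len(state) - 1


def chi_square_2x2_shortcut(n11, n12, n21, n22):
    """Closed-form statistic for the 2 x 2 case."""
    n = n11 + n12 + n21 + n22
    numerator = n * (n11 * n22 - n12 * n21) ** 2
    denominator = (n11 + n12) * (n11 + n21) * (n21 + n22) * (n12 + n22)
    return numerator / denominator


# Upper 5% points of the chi-square distribution (standard table values).
CRITICAL_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}

# Hypothetical counts: state sample [10, 20], central sample [20, 10].
chi2, df = chi_square_2xk([10, 20], [20, 10])
reject_h0 = chi2 >= CRITICAL_5PCT[df]
print(round(chi2, 4), df, reject_h0)
```

For a 2 × 2 table the general double sum and the closed-form shortcut give the same value, which is a convenient check on either implementation.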
• Wald-Wolfowitz run test
The run test is used to examine whether two random samples come
from populations having the same distribution. This test can detect
differences in averages, in spread, or in any other important aspect
between the two populations. This test is efficient when each
sample size is moderately large (greater than or equal to 10).
H0: Two samples come from populations having the same
distribution.
H1: Two samples come from populations having different
distributions.
Test Statistic: Let ‘r’ denote the number of runs. To obtain r, list the
n1 + n2 observations from the two samples in order of magnitude. Denote
observations from one sample by x’s and from the other by y’s. A run is a
maximal sequence of consecutive observations from the same sample. Count
the number of runs.
Critical Value: A difference in location results in few runs, and a
difference in spread also results in few runs.
Consequently the critical region for this test is always one-sided.
The critical value to decide whether or not the number of runs is too
few is obtained from the table. The table gives the critical value rc for
n1 (size of sample 1) and n2 (size of sample 2) at the 5% level of
significance.
Decision Rule: If r ≤ rc, reject H0.
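Large Sample Sizes: For sample sizes larger than 20, the critical
value rc is given as:
$$r_c = \mu - 1.96\,\sigma$$
where
$$\mu = 1 + \frac{2 n_1 n_2}{n_1 + n_2} \quad \text{and} \quad \sigma = \sqrt{\frac{2 n_1 n_2\,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2\,(n_1 + n_2 - 1)}}$$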
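The counting of runs and the large-sample critical value can be sketched as follows. The function names and the data are hypothetical, and the sketch assumes there are no tied values across the two samples (ties would need a rule of their own, which the report does not describe).

```python
import math


def count_runs(x, y):
    """Number of runs when both samples are merged and sorted by magnitude."""
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    # Tag every observation with its sample of origin, then sort by value.
    labelled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    labels = [label for _, label in labelled]
    # A new run starts wherever the label changes.
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def large_sample_critical_value(n1, n2):
    """r_c = mu - 1.96 * sigma, using the large-sample formulas for mu and sigma."""
    mu = 1 + 2 * n1 * n2 / (n1 + n2)
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return mu - 1.96 * math.sqrt(variance)


# Hypothetical data: 25 observations per sample.
x = list(range(0, 50, 2))           # 0, 2, ..., 48
y_similar = list(range(1, 50, 2))   # interleaves with x -> many runs
y_shifted = [v + 100 for v in x]    # entirely above x -> only 2 runs

r_c = large_sample_critical_value(len(x), len(y_similar))
for name, y in [("similar", y_similar), ("shifted", y_shifted)]:
    r = count_runs(x, y)
    print(name, "runs =", r, "reject H0:", r <= r_c)
```

The shifted sample produces very few runs and is rejected, while the interleaved sample produces many runs and is not, which matches the one-sided decision rule.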
ANALYSIS OF POOLABILITY TEST
• Parametric and non-parametric tests are applied for poolability
testing and analysis of the NSS 67th Round data, as per the nature of
the parameters.
• Non-parametric tests can analyse both types of data, i.e.
continuous and discrete, with the help of the run test and the
chi-square test.
• The run test is applied to parameters that are continuous in
nature, whereas the chi-square test is applied to parameters that are
discrete in nature.
• Parameters such as Type of Enterprise, Type of Worker and Broad Activity
Type are tested by the chi-square test owing to their discrete nature,
whereas the run test is applied to Gross Value Added (GVA) and Gross Value
Added (GVA) per enterprise owing to their continuous nature.
• The chi-square goodness-of-fit test at the 1% and 5% levels of significance
has been applied for the rural and urban areas of Delhi for the poolability
test of parameters such as Type of Enterprises, Type of Workers and Broad
Activities.
• The null hypothesis has been accepted at the 1% level of significance for
Type of Enterprises and Broad Activities. However,
the null hypothesis is rejected for Type of Enterprises at the 5% level of
significance for the rural sector, indicating that non-sampling errors are
large in magnitude.
• The null hypothesis has been rejected at both the 1% and 5% levels of
significance in both the rural and urban sectors.
• The Wald-Wolfowitz run test has been applied for the rural and urban areas
of Delhi for the poolability test of Gross Value Added of all enterprises at
the 1% and 5% levels of significance.
• The null hypothesis has been accepted at the 1% and 5% levels of
significance for both sectors, i.e. rural and urban.
METHODS OF POOLING
We first divide the central sample and the state sample each into two
independent sub-samples, namely A and B, to use the following
methods.
• Inverse Variance:
The estimates for state and central can be computed respectively
as:
$$t_s = \frac{t_{s1} + t_{s2}}{2} \quad \text{and} \quad t_c = \frac{t_{c1} + t_{c2}}{2}$$
where,
$t_{s1}$ and $t_{s2}$ are respectively the aggregates of sub-samples A and B of the
state sample,
$t_{c1}$ and $t_{c2}$ are respectively the aggregates of sub-samples A and B of the
central sample.
The pooled estimate leading to the optimum combination of these two
estimates is obtained by weighting each with the inverse of the variance of
the estimate. Thus the pooled estimate and its variance are given by:
$$T_p = \frac{V(t_c)\, t_s + V(t_s)\, t_c}{V(t_c) + V(t_s)} \quad \text{and} \quad V(T_p) = \frac{V(t_c)\, V(t_s)}{V(t_c) + V(t_s)}$$
In general $V(t_c)$ and $V(t_s)$ are unknown and can be estimated as
$$\hat{V}(t_s) = \frac{(t_{s1} - t_{s2})^2}{4} \quad \text{and} \quad \hat{V}(t_c) = \frac{(t_{c1} - t_{c2})^2}{4}$$
Thus the pooled estimate and the estimate of the pooled variance are given
by
$$t_p = \frac{\hat{V}(t_c)\, t_s + \hat{V}(t_s)\, t_c}{\hat{V}(t_c) + \hat{V}(t_s)} \quad \text{and} \quad \hat{V}(t_p) = \frac{\hat{V}(t_c)\, \hat{V}(t_s)}{\hat{V}(t_c) + \hat{V}(t_s)}$$
By virtue of weighting the two estimates at the domain level at
which the two estimates are pooled, the pooled estimate will always
lie between the central and state sample estimates.
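The inverse variance method can be illustrated with a short Python sketch. The sub-sample aggregates below are made-up numbers and the function names are hypothetical; the arithmetic follows the sub-sample estimate, the variance estimate from two sub-samples, and the inverse-variance weighting described above.

```python
def subsample_estimate(t1, t2):
    """Combined estimate and variance estimate from sub-samples A and B."""
    estimate = (t1 + t2) / 2
    variance = (t1 - t2) ** 2 / 4
    return estimate, variance


def pool_inverse_variance(t_s, v_s, t_c, v_c):
    """Pool state and central estimates, each weighted by the inverse of its variance."""
    if v_s + v_c == 0:
        raise ValueError("both variance estimates are zero; weights are undefined")
    # Each estimate is multiplied by the OTHER sample's variance, which is
    # algebraically the same as weighting by its own inverse variance.
    t_p = (v_c * t_s + v_s * t_c) / (v_c + v_s)
    v_p = v_c * v_s / (v_c + v_s)
    return t_p, v_p


# Hypothetical sub-sample aggregates (illustrative only).
t_s, v_s = subsample_estimate(100.0, 120.0)   # state:   estimate 110, variance 100
t_c, v_c = subsample_estimate(90.0, 94.0)     # central: estimate 92,  variance 4

t_p, v_p = pool_inverse_variance(t_s, v_s, t_c, v_c)
print(round(t_p, 4), round(v_p, 4))
```

In this made-up example the central estimate has the smaller variance, so the pooled value sits much closer to it, and the pooled variance is smaller than either input variance.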
• Weighted Average:
When the State's participation does not match the central sample
equally, the weighted average of the two estimates, with weights
given by the matching ratio of the central and state samples, may be a
better way of combining the estimates, treating the central and
state samples as independent samples.
Let the matching ratio of the state and central samples be m : n.
Based on this, the respective estimates for state and central can
be computed as:
$$t_s = \frac{t_{s1} + t_{s2}}{2} \quad \text{and} \quad t_c = \frac{t_{c1} + t_{c2}}{2}$$
where,
$t_{s1}$ and $t_{s2}$ are respectively the aggregates of sub-samples A and B of the
state sample,
$t_{c1}$ and $t_{c2}$ are respectively the aggregates of sub-samples A and B of the
central sample.
The pooled estimate of these two estimates is obtained by weighting with
the matching participation ratio m : n. Thus the pooled estimate and its
variance are given by:
$$t_p = \frac{m\, t_s + n\, t_c}{m + n} \quad \text{and} \quad V(T_p) = \frac{m^2\, V(t_s) + n^2\, V(t_c)}{(m + n)^2}$$
In general $V(t_c)$ and $V(t_s)$ are unknown and can be estimated as
$$\hat{V}(t_s) = \frac{(t_{s1} - t_{s2})^2}{4} \quad \text{and} \quad \hat{V}(t_c) = \frac{(t_{c1} - t_{c2})^2}{4}$$
Thus the estimate of the pooled variance is given by
$$\hat{V}(t_p) = \frac{m^2\, \hat{V}(t_s) + n^2\, \hat{V}(t_c)}{(m + n)^2}$$
The pooled estimate will always lie between the estimates based
on the central and state samples separately.
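The weighted average method can be sketched in the same way. The estimates, variances and the 2 : 1 matching ratio below are illustrative assumptions, and `pool_matching_ratio` is a hypothetical name; m is taken as the state weight and n as the central weight, as in the formula for the pooled estimate.

```python
def pool_matching_ratio(t_s, v_s, t_c, v_c, m, n):
    """Pool state and central estimates with weights m (state) and n (central)."""
    if m <= 0 or n <= 0:
        raise ValueError("matching ratio terms must be positive")
    t_p = (m * t_s + n * t_c) / (m + n)
    v_p = (m ** 2 * v_s + n ** 2 * v_c) / (m + n) ** 2
    return t_p, v_p


# Hypothetical inputs: state sample twice the size of the central sample.
t_s, v_s = 110.0, 100.0   # state estimate and its estimated variance
t_c, v_c = 92.0, 4.0      # central estimate and its estimated variance

t_p, v_p = pool_matching_ratio(t_s, v_s, t_c, v_c, m=2, n=1)
print(t_p, round(v_p, 4))
```

Unlike inverse-variance weighting, the weights here depend only on the sample sizes, so a noisy estimate from the larger sample still dominates the pooled value.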
POOLABILITY ANALYSIS
Type of Enterprises:
INVERSE VARIANCE METHOD
• The pooled number of Unincorporated Non-Agricultural Enterprises
of State & Centre was estimated to be 1153521. Out of them, 27626
(2.27%) were in rural areas and 1128503 (97.83%) were in urban areas of
Delhi.
• Out of the total enterprises, 651390 (56.46%) were Own-Account
Enterprises and 502326 (43.54%) were Establishments.
• The Rural + Urban estimates obtained are closer to those obtained
under State than to those obtained under Centre.
WEIGHTED AVERAGE METHOD
• The pooled number of Unincorporated Non-Agricultural Enterprises
of State & Centre was estimated to be 1139089. Out of them, 26178
(2.30%) were in rural areas and 1112911 (97.7%) were in urban areas of
Delhi.
• Out of the total enterprises, 635572 (55.8%) were Own-Account
Enterprises and 503517 (44.2%) were Establishments.
INVERSE VARIANCE METHOD vs. WEIGHTED AVERAGE METHOD
For Type of Enterprises, we observe that the inverse variance method
shows less variation than the weighted average method, making it
the better method.
Broad Activity Type:
INVERSE VARIANCE METHOD
• The pooled number of Unincorporated Non-Agricultural Enterprises
of State & Centre across broad activities is estimated to be
1140863. Out of them, 26028 were in rural areas and 1117152 were in
urban areas of Delhi.
• Out of the total Unincorporated Non-Agricultural Enterprises, Trade
accounted for 41.22%, the share of Other Services was 37.58% and
Manufacturing constituted 21.20%.
• The Rural + Urban estimates obtained are closer to those obtained
under State than to those obtained under Centre.
WEIGHTED AVERAGE METHOD
• The pooled number of Unincorporated Non-Agricultural Enterprises
of State & Centre across broad activities is estimated to be
1120809. Out of them, 26195 were in rural areas and 1094614 were in
urban areas of Delhi.
• Out of the total enterprises, 635572 (55.8%) were Own-Account
Enterprises and 503517 (44.2%) were Establishments.
INVERSE VARIANCE METHOD vs. WEIGHTED AVERAGE
METHOD
For Broad Activity Type, we observe that the inverse variance method
shows less variation than the weighted average method, making it the
better method.
Gross Value Added:
INVERSE VARIANCE METHOD
• Gross Value Added (as per Product Approach) for the pooled number of
enterprises lies between the State and Centre estimates and was calculated
to be Rs. 37510418687.
• Out of the total Gross Value Added, the rural sector accounted for 1.38%
and the share of the urban sector was 98.62%.
• The Rural + Urban estimates obtained are closer to those obtained under
State than to those obtained under Centre.
WEIGHTED AVERAGE METHOD
• Gross Value Added (as per Product Approach) for the pooled number of
enterprises lies between the State and Centre estimates and was calculated
to be Rs. 34295707983.
• Out of the total Gross Value Added, the rural sector accounted for 2.28%
and the share of the urban sector was 97.72%.
INVERSE VARIANCE METHOD vs. WEIGHTED AVERAGE
METHOD
For Gross Value Added, we observe that the inverse variance method
shows less variation than the weighted average method, making it the
better method.
RELATIVE STANDARD ERROR (RSE)
Gauging the size of an entire population and deriving results from it
is an arduous and cumbersome task. Probability theory
and statistics, the branch of science which deals with this problem,
use the concepts of surveys, samples and standard errors. Statisticians
use standard errors to construct confidence intervals from their
surveyed data. Confidence intervals are important for determining the
validity of empirical tests and research.
Standard error is, however, not to be confused with standard deviation:
the latter refers to the variability within the given sample, while the
former shows the variability of the sampling distribution itself.
Estimates for any parameter are formulated on the basis of a sample
collected from a population, rather than the population itself. The
error induced by the non-inclusion of the entire population is referred to
as standard error. The standard error is an absolute measure of the
difference between the sample survey and the total population. It affects
the accuracy of the estimates.
The Relative Standard Error (RSE) is the standard error expressed as a
fraction of the estimate and is usually displayed as a percentage.
Estimates with an RSE of 25% or greater are subject to high
sampling error and should be used with utmost caution.
The relative standard error shows whether the standard error is large
relative to the results. Thus, large relative standard errors suggest the
results are not reliable and further investigation is mandatory.
The reliability of estimates can also be assessed in terms of a
confidence interval. Confidence intervals represent the range in
which the population value is likely to lie. They are constructed
using the estimate of the population value and its associated standard
error.
For example, there is approximately a 95% chance (i.e. 19 chances in
20) that the population value lies within two standard errors of the
estimate, so the 95% confidence interval is equal to the estimate plus
or minus two standard errors.
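The two-standard-error rule above can be sketched in a few lines. This is only an illustration: the function name and the input values are hypothetical, and the multiplier 2 is the approximation stated in the text rather than the exact normal quantile.

```python
def confidence_interval_95(estimate, standard_error):
    """Approximate 95% confidence interval: estimate plus or minus two standard errors."""
    half_width = 2.0 * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical estimate and standard error, not survey figures.
low, high = confidence_interval_95(1000.0, 50.0)
print(f"Approximate 95% confidence interval: {low} to {high}")
```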
Formula:
\[ \mathrm{RSE}(x) = \frac{\mathrm{S.E.}(x)}{x} \times 100 \]
where S.E.(x) = standard error of the estimate of the parameter concerned
x = the value of the estimator of the parameter concerned
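The RSE formula can be sketched in code as follows. The function name and the example values are hypothetical and serve only to show the arithmetic; a zero estimate is rejected because the ratio is undefined.

```python
def relative_standard_error(estimate, standard_error):
    """RSE in percent: standard error divided by the estimate, times 100."""
    if estimate == 0:
        raise ValueError("RSE is undefined for a zero estimate")
    return 100.0 * standard_error / estimate


# Hypothetical estimate and standard error, not survey figures.
rse = relative_standard_error(1000.0, 50.0)
print(f"RSE = {rse:.1f}%")
```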
Decision Criteria:
The general rule for tolerable error:
 An estimate having RSE less than or equal to 5% is firmly considered
an excellent estimate.
 An estimate having RSE between 5% and 10% is considered a
good estimate.
 An estimate having RSE between 10% and 15% is considered an
average estimate.
 An estimate having RSE beyond 15% strongly indicates that the estimate
needs to be further investigated.
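The decision criteria above can be written as a small classification rule. This is a sketch only: the function name is hypothetical, and assigning the boundary values 10% and 15% to the better band is an assumption, since the text does not say which band they belong to.

```python
def classify_rse(rse_percent):
    """Map an RSE (in percent) to the quality bands listed in the decision criteria."""
    if rse_percent <= 5:
        return "excellent"
    if rse_percent <= 10:  # boundary handling is an assumption
        return "good"
    if rse_percent <= 15:  # boundary handling is an assumption
        return "average"
    return "needs further investigation"


# Hypothetical RSE values, not survey figures.
for value in (4.0, 6.0, 12.0, 16.0):
    print(f"RSE {value}%: {classify_rse(value)}")
```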
ANALYSIS OF RELATIVE STANDARD
ERROR (RSE)
Type of Enterprise:
INVERSE VARIANCE METHOD (I.V.):
 In case of Rural Sector, the RSEs for OAE and total are well within
6%. For Establishment, the RSE is acceptable, at the margin of
10%.
 In case of Urban Sector, all the RSEs are well within the acceptable
range.
 In case of Rural + Urban, all RSEs are within 4%.
 The RSE estimates are closer to those of the State.
WEIGHTED AVERAGE METHOD (W.A.):
 In case of Rural Sector, the RSEs for Establishment and total are
within 15%. For OAE, the RSE is beyond 16% and
requires further examination.
 In case of Urban Sector, the estimate for Establishment is average,
but all other RSEs are good, within 10%.
 In case of Rural + Urban, the estimate for Establishment is average,
but all other RSEs are good, within 10%.
INVERSE VARIANCE METHOD VS. WEIGHTED AVERAGE
METHOD:
The RSEs for the inverse variance method are considerably lower in all
cases than those for the weighted average method. This implies high
variation in the weighted average method, making the inverse variance
method better. The estimate of RSE obtained through I.V. is closer to that
obtained through W.A.
Broad Activity:
INVERSE VARIANCE METHOD (I.V.):
 In case of Rural Sector, the RSEs for Trading and Other Services
are acceptable, lying within 15%. For Manufacturing, the
RSE is excellent, well within the margin of 5%.
 In case of Urban Sector, all the RSEs are well within the excellent
range.
 In case of Rural + Urban, all RSEs are well within 5%.
 The RSE estimates are closer to those of the State.
WEIGHTED AVERAGE METHOD (W.A.):
 In case of Rural Sector, the RSEs for Manufacturing and Trading
are acceptable, lying within 15%. For Other Services, the
RSE is beyond 24% and requires further examination.
 In case of Urban Sector, the estimates for Trading and Other
Services are average, whereas that for Manufacturing is excellent,
lying well within 5%.
 In case of Rural + Urban, the estimates for Trading and Other
Services are average, whereas that for Manufacturing lies well
within 5%.
INVERSE VARIANCE METHOD VS. WEIGHTED AVERAGE
METHOD:
The RSEs for the inverse variance method are considerably lower in all
cases than those for the weighted average method. This implies high
variation in the weighted average method, making the inverse variance
method better. The difference between the RSE estimates obtained through
I.V. and W.A. is smaller for Manufacturing than for Trading
& Other Services.
Gross Value Added:
INVERSE VARIANCE METHOD (I.V.):
 In case of Rural Sector, the RSE is beyond 16%, indicating the
need for further examination.
 In case of Urban and Rural + Urban sectors, the RSEs are good, being
close to 6%.
 The RSE estimates are closer to those of the State.
WEIGHTED AVERAGE METHOD (W.A.):
 In case of Rural Sector, the RSE is beyond 16% indicating the
need for further examination.
 In case of Urban and Rural+ Urban sector, RSEs are average,
lying within 15%.
INVERSE VARIANCE METHOD VS. WEIGHTED AVERAGE
METHOD:
The RSEs for the inverse variance method are considerably lower in all
cases than those for the weighted average method. This implies high
variation in the weighted average method, making the inverse variance
method better. The estimate of RSE obtained through I.V. is closer to that of
the State.
DIVERGENCE (d)
For a substantial gain in the reliability of the pooled estimate, the
quality of data collected by the two agencies must be of the same order
with respect to non-sampling errors. The estimates generated from the
central and state samples are not directly comparable for
some States, even at the state level. Estimates show wide divergence,
raising doubts about the unknown magnitude of non-sampling error
as well as its agency bias. In such cases pooling may not result in a
better estimate as the estimates are not poolable.
The situations that may arise for the estimates (aggregates) of a
parameter (θ), say t1 and t2 with relative standard errors r1 and r2,
respectively obtained from the central sample and state sample data
are illustrated below.
1. The divergence d = |t1 - t2| ≈ 0 (i.e., small) and r1 and r2
are within the acceptable margin (r0), i.e. r1 ≤ r0 & r2 ≤ r0
2. The divergence d = |t1 - t2| ≈ 0 and r1 >> r0 & r2 >> r0
3. The divergence d = |t1 - t2| ≈ 0 and r1 ≤ r0 but r2 >> r0
4. The divergence d = |t1 - t2| >> 0 and r1 ≤ r0 & r2 ≤ r0
5. The divergence d = |t1 - t2| >> 0 and r1 >> r0 & r2 >> r0
6. The divergence d = |t1 - t2| >> 0 and r1 >> r0 & r2 < r0
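The six situations listed above can be told apart with a simple rule. The sketch below makes several assumptions that are not part of the report: "d ≈ 0" is read as d not exceeding a hypothetical tolerance d_tol, ">> r0" is read simply as exceeding r0, and the cases where exactly one RSE is within the margin are treated symmetrically in t1 and t2. The function name and all input values are hypothetical.

```python
def divergence_situation(t1, r1, t2, r2, r0, d_tol):
    """Return the situation number (1 to 6) for two estimates and their RSEs.

    d_tol is a hypothetical tolerance below which the divergence counts as small.
    """
    d = abs(t1 - t2)
    close = d <= d_tol
    # Number of RSEs that are within the acceptable margin r0 (0, 1 or 2).
    within = int(r1 <= r0) + int(r2 <= r0)
    if close:
        return {2: 1, 0: 2, 1: 3}[within]
    return {2: 4, 0: 5, 1: 6}[within]


# Hypothetical Centre and State estimates with RSEs in percent.
case_a = divergence_situation(100.0, 4.0, 102.0, 5.0, r0=10.0, d_tol=5.0)
case_b = divergence_situation(100.0, 20.0, 150.0, 4.0, r0=10.0, d_tol=5.0)
print(case_a, case_b)
```

Situations 1 to 3 correspond to estimates that are close to each other, while situations 4 to 6 flag a wide divergence between them.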
In the case of situations 1 to 3 above, one may argue that the
estimates are acceptable in the sense that they are close to each other
and that pooling the two estimates t1 and t2 will improve the
reliability. Pooling the two estimates, even when they lie on the same
side of the true value, may result in a small loss of information in
respect of error (i.e., closeness to the true value) but may yield a
significant gain in precision.
In the case of situations 4 to 6, one may need to look into the
estimates carefully in respect of their closeness to the true value of
the parameter, either through external evidence or through prior
knowledge regarding the trend of the estimates. It may happen that
one estimate is very close to the true value and the other is far away
from it. In that case, although the pooled estimate may have a smaller
RSE, it may describe the true situation less well than the estimate
which is closer to the true value if the two estimates lie on the same
side of the true value. The examination of the RSEs of the estimates
is a secondary issue in such situations.
OBSERVATIONS REGARDING
DIVERGENCE
 Checking the divergence of the two sets of data is an alternative
approach to checking the non-sampling errors involved in unit level
data.
 As per the normality concept of Statistics, a certain percentage of the
State and Centre estimates has been taken as the deciding
criterion for the aforementioned parameters.
 Estimates which are acceptable are close to each other, and
pooling the estimates of State & Centre will
improve the reliability of the data.
 Wide divergence between the two sets of estimates,
i.e. Central and State, indicates that pooling is not advisable
because it raises doubts about the unknown magnitude of
non-sampling error as well as its agency bias. Generally in such cases
pooling may not result in a better estimate as the estimates are not
poolable.
 For estimates which need further investigation, one may
need to look into the estimates carefully in respect of their
closeness to the true value of the parameter, either through
external evidence or through prior knowledge regarding the
trend of the estimates.
Under the parameter
1. Type of Enterprises:
 Estimates of urban enterprises are acceptable.
 Estimates of rural enterprises need further investigation.
2. Broad Activities:
 Estimates of both urban and rural enterprises under Manufacturing
need further examination.
 Estimates of rural enterprises need further investigation under
Trading, whereas those of urban enterprises are acceptable.
 Estimates of rural enterprises are acceptable under Other Services,
whereas those of urban enterprises need further investigation.
3. Gross Value Added:
 Estimates of both urban and rural enterprises need further
examination.
CONCLUSION
 Sampling
District-wise unit level data is unavailable for the state of
Delhi. Hence it is very difficult to apply the poolability test
at district level. Therefore poolability testing & analysis
has been done on the basis of sector-wise (Urban and Rural)
unit level data.
 Testing
 A parametric test is generally more powerful than the
corresponding non-parametric test when its assumptions hold. But
parametric tests need to satisfy some assumptions before
the tests can actually be used. None of the concerned
assumptions were satisfied for the given 67th Round data,
which suggests high chances of sampling & non-sampling
errors.
 For the 67th Round, the Multinomial Test has been applied for
parameters like Type of Enterprises, Broad Activities etc. The
Wald-Wolfowitz Runs Test was applied for Gross Value
Added.
 The Multinomial Test was rejected for the parameter Type of
Worker, indicating that the data cannot be pooled and that error
is suspected in the data.
 If a hypothesis is accepted at the 5% level of significance, it
will also be accepted at 1%. If it is rejected at 1%, it will
also be rejected at 5%.
 Pooling
 All the pooled estimates derived through the method of
Inverse Variance were better than those obtained through
Weighted Average. The Relative Standard Error for every
parameter is lower in the case of the former, thus justifying the
above conclusion.
 For all the parameters, we observe that the inverse
variance method shows less variation than the weighted average
method, making it the better method.
UNINCORPORATED NON-AGRICULTURAL ENTERPRISES IN DELHI
EXECUTIVE SUMMARY
Following are the main highlights of the poolability analysis of NSS
67th round data (July 2010 – June 2011) through method of Inverse
Variance.
 The pooled number of Unincorporated Non-Agricultural
Enterprises of State & Centre was estimated to be 1153521. Out of
them 27626 (2.27%) were in rural areas and 1128503 (97.83%) were
in urban areas of Delhi.
 Out of the total enterprises 651390 (56.46%) were Own-Account
Enterprises and 502326 (43.54%) were Establishments.
 Out of total Unincorporated Non-Agricultural Enterprises, Trade
accounted for 41.22%, the share of Other Services was 37.58% and
Manufacturing constituted 21.20%.
 Gross Value Added (as per Product Approach) for the pooled number
of enterprises of State & Centre was calculated to be
Rs.37510418687.
 Gross Value Added (as per Product Approach) per Unincorporated
Non-Agricultural Enterprise was estimated at Rs 32518.
UNINCORPORATED NON-AGRICULTURAL ENTERPRISES IN DELHI
EXECUTIVE SUMMARY
Following are the main highlights of the poolability analysis of NSS
67th round data (July 2010 – June 2011) through method of Weighted
Average.
 The pooled number of Unincorporated Non-Agricultural
Enterprises of State & Centre was estimated to be 1139089. Out of
them 26178 (2.30%) were in rural areas and 1112911 (97.7%) were in
urban areas of Delhi.
 Out of the total enterprises 635572 (55.8%) were Own-Account
Enterprises and 503517 (44.2%) were Establishments.
 Out of total Unincorporated Non-Agricultural Enterprises, Trade
accounted for 43.62%, the share of Other Services was 36.58% and
Manufacturing constituted 19.80%.
 Gross Value Added (as per Product Approach) for the pooled number
of enterprises of State & Centre was calculated to be
Rs.34295707983.
 Gross Value Added (as per Product Approach) per Unincorporated
Non-Agricultural Enterprise was estimated at Rs 30108.
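The per-enterprise figures in the two executive summaries can be cross-checked by dividing the pooled Gross Value Added by the pooled number of enterprises. The sketch below uses only the totals quoted in the summaries; the variable names are illustrative.

```python
# Pooled GVA (Rs) and pooled number of enterprises, as quoted in the summaries.
summaries = {
    "inverse variance": (37510418687, 1153521),
    "weighted average": (34295707983, 1139089),
}

# GVA per enterprise, rounded to the nearest rupee.
gva_per_enterprise = {
    method: round(gva / enterprises)
    for method, (gva, enterprises) in summaries.items()
}

for method, value in gva_per_enterprise.items():
    print(f"{method}: Rs {value} per enterprise")
```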
SUGGESTIONS
I. More accurate results concerning the aforementioned parameters can
be obtained if data is collected district-wise.
II. We need to keep the objective of the survey precisely in mind
while preparing a questionnaire. Highly technical and
complicated questions must be avoided as they lead to partial or
non-response from respondents.
III. It is necessary for the surveyor to validate the data and remove
non-sampling errors during the survey in an NSS round. Non-sampling
errors lead to an increase in Type-1 and Type-2 errors. The former
cause incorrect rejection of some parameters which should
actually be accepted, whereas the latter lead to incorrect
acceptance of some parameters which should actually be
rejected. Both these errors result in misleading conclusions
about the sample.
IV. Updated maps of the locality need to be used and provided to
the surveyor as well, so that relevant data is collected with
correct demographics and well in time.

More Related Content

What's hot

Indicus Ma Foi Randstad Employment Trends Survey - Wave 1 - 2011
Indicus Ma Foi Randstad Employment Trends Survey - Wave 1 - 2011Indicus Ma Foi Randstad Employment Trends Survey - Wave 1 - 2011
Indicus Ma Foi Randstad Employment Trends Survey - Wave 1 - 2011
Indicus Analytics Private Limited
 
Meos Q3 2011 Us Report Final
Meos Q3 2011 Us Report FinalMeos Q3 2011 Us Report Final
Meos Q3 2011 Us Report Finalrtchapman
 
The TeamLease Employment Outlook Report: July-September 2011
The TeamLease Employment Outlook Report: July-September 2011The TeamLease Employment Outlook Report: July-September 2011
The TeamLease Employment Outlook Report: July-September 2011
valuvox
 
Meosq2 2011 Us Report Final
Meosq2 2011 Us Report FinalMeosq2 2011 Us Report Final
Meosq2 2011 Us Report Finalktarca
 
Q1 2011 Manpower Employment Outlook Survey
Q1 2011 Manpower Employment Outlook SurveyQ1 2011 Manpower Employment Outlook Survey
Q1 2011 Manpower Employment Outlook Surveynwhitebdm
 
Unemployment Rate (Jun 2012)
Unemployment Rate (Jun 2012)Unemployment Rate (Jun 2012)
Unemployment Rate (Jun 2012)
BDPA Education and Technology Foundation
 
A Review of the Uttarakhand’s Industrial Policies and Their Performance
A Review of the Uttarakhand’s Industrial Policies and Their PerformanceA Review of the Uttarakhand’s Industrial Policies and Their Performance
A Review of the Uttarakhand’s Industrial Policies and Their Performance
YogeshIJTSRD
 
National level survey relevant to health seminar (2)
National level survey relevant to health seminar (2)National level survey relevant to health seminar (2)
National level survey relevant to health seminar (2)vishal soyam
 
West Bengal State Report March 2018
West Bengal State Report March 2018West Bengal State Report March 2018
West Bengal State Report March 2018
India Brand Equity Foundation
 

What's hot (10)

Indicus Ma Foi Randstad Employment Trends Survey - Wave 1 - 2011
Indicus Ma Foi Randstad Employment Trends Survey - Wave 1 - 2011Indicus Ma Foi Randstad Employment Trends Survey - Wave 1 - 2011
Indicus Ma Foi Randstad Employment Trends Survey - Wave 1 - 2011
 
Meos Q3 2011 Us Report Final
Meos Q3 2011 Us Report FinalMeos Q3 2011 Us Report Final
Meos Q3 2011 Us Report Final
 
The TeamLease Employment Outlook Report: July-September 2011
The TeamLease Employment Outlook Report: July-September 2011The TeamLease Employment Outlook Report: July-September 2011
The TeamLease Employment Outlook Report: July-September 2011
 
Meosq2 2011 Us Report Final
Meosq2 2011 Us Report FinalMeosq2 2011 Us Report Final
Meosq2 2011 Us Report Final
 
Q1 2011 Manpower Employment Outlook Survey
Q1 2011 Manpower Employment Outlook SurveyQ1 2011 Manpower Employment Outlook Survey
Q1 2011 Manpower Employment Outlook Survey
 
Unemployment Rate (Jun 2012)
Unemployment Rate (Jun 2012)Unemployment Rate (Jun 2012)
Unemployment Rate (Jun 2012)
 
A Review of the Uttarakhand’s Industrial Policies and Their Performance
A Review of the Uttarakhand’s Industrial Policies and Their PerformanceA Review of the Uttarakhand’s Industrial Policies and Their Performance
A Review of the Uttarakhand’s Industrial Policies and Their Performance
 
National level survey relevant to health seminar (2)
National level survey relevant to health seminar (2)National level survey relevant to health seminar (2)
National level survey relevant to health seminar (2)
 
TSSSDP new
TSSSDP newTSSSDP new
TSSSDP new
 
West Bengal State Report March 2018
West Bengal State Report March 2018West Bengal State Report March 2018
West Bengal State Report March 2018
 

Similar to Final_project_report_67_Sri

Research paper on NIM
Research paper on NIMResearch paper on NIM
Research paper on NIM
KETUL KHANDAGALE
 
The Effect of Organizational Culture on Organizational Performance: Mediating...
The Effect of Organizational Culture on Organizational Performance: Mediating...The Effect of Organizational Culture on Organizational Performance: Mediating...
The Effect of Organizational Culture on Organizational Performance: Mediating...
theijes
 
UNVEILING FACTORS IN NON-PERFORMING LOANS: EXPLORING THE RURAL ECONOMIC FINAN...
UNVEILING FACTORS IN NON-PERFORMING LOANS: EXPLORING THE RURAL ECONOMIC FINAN...UNVEILING FACTORS IN NON-PERFORMING LOANS: EXPLORING THE RURAL ECONOMIC FINAN...
UNVEILING FACTORS IN NON-PERFORMING LOANS: EXPLORING THE RURAL ECONOMIC FINAN...
indexPub
 
Economic Census and the Economic Indicators - Sherine Al-Shawarby
Economic Census and the Economic Indicators - Sherine Al-ShawarbyEconomic Census and the Economic Indicators - Sherine Al-Shawarby
Economic Census and the Economic Indicators - Sherine Al-Shawarby
Economic Research Forum
 
ANALYTICALAPPROCH ON ANNUAL SURVEY OF INDUSTRIAL DATA OF NCT OF DELHI DURING ...
ANALYTICALAPPROCH ON ANNUAL SURVEY OF INDUSTRIAL DATA OF NCT OF DELHI DURING ...ANALYTICALAPPROCH ON ANNUAL SURVEY OF INDUSTRIAL DATA OF NCT OF DELHI DURING ...
ANALYTICALAPPROCH ON ANNUAL SURVEY OF INDUSTRIAL DATA OF NCT OF DELHI DURING ...
Sarvesh Kumar
 
100-hwang Impact assessment of R&D subsidies on input additionality and firms...
100-hwang Impact assessment of R&D subsidies on input additionality and firms...100-hwang Impact assessment of R&D subsidies on input additionality and firms...
100-hwang Impact assessment of R&D subsidies on input additionality and firms...
innovationoecd
 
Pdprm2011 2016-120209202743-phpapp01
Pdprm2011 2016-120209202743-phpapp01Pdprm2011 2016-120209202743-phpapp01
Pdprm2011 2016-120209202743-phpapp01
johara adelah ampuan
 
PHILIPPINE DEVELOPMENT PLAN
PHILIPPINE DEVELOPMENT PLANPHILIPPINE DEVELOPMENT PLAN
PHILIPPINE DEVELOPMENT PLAN
johara adelah ampuan
 
Measuring of informal sector and informal employment in st lucia
Measuring of informal sector and informal employment in st luciaMeasuring of informal sector and informal employment in st lucia
Measuring of informal sector and informal employment in st lucia
Dr Lendy Spires
 
Mis
MisMis
Assessment of Local Governance and Development Performance in Indonesia
   Assessment of Local Governance and Development  Performance in Indonesia   Assessment of Local Governance and Development  Performance in Indonesia
Assessment of Local Governance and Development Performance in IndonesiaDr. Astia Dendi
 
NSSO_Kritika.pptx
NSSO_Kritika.pptxNSSO_Kritika.pptx
NSSO_Kritika.pptx
Kritika Sarkar
 
GAMIFICATION AND RESOURCE POOLING FOR IMPROVING OPERATIONAL EFFICIENCY AND EF...
GAMIFICATION AND RESOURCE POOLING FOR IMPROVING OPERATIONAL EFFICIENCY AND EF...GAMIFICATION AND RESOURCE POOLING FOR IMPROVING OPERATIONAL EFFICIENCY AND EF...
GAMIFICATION AND RESOURCE POOLING FOR IMPROVING OPERATIONAL EFFICIENCY AND EF...
IAEME Publication
 
Statistical system
Statistical systemStatistical system
Statistical system
barghouthi2016
 
Using data analytics to improve public expenditures - Struan LITTLE, New-Zealand
Using data analytics to improve public expenditures - Struan LITTLE, New-ZealandUsing data analytics to improve public expenditures - Struan LITTLE, New-Zealand
Using data analytics to improve public expenditures - Struan LITTLE, New-Zealand
OECD Governance
 
Accounting Information and Decision Making in the Banking Sector Bank of Agri...
Accounting Information and Decision Making in the Banking Sector Bank of Agri...Accounting Information and Decision Making in the Banking Sector Bank of Agri...
Accounting Information and Decision Making in the Banking Sector Bank of Agri...
ijtsrd
 

Similar to Final_project_report_67_Sri (20)

NSB_MastersThesis
NSB_MastersThesisNSB_MastersThesis
NSB_MastersThesis
 
Research paper on NIM
Research paper on NIMResearch paper on NIM
Research paper on NIM
 
Grigol modebadze. cv
Grigol modebadze. cvGrigol modebadze. cv
Grigol modebadze. cv
 
The Effect of Organizational Culture on Organizational Performance: Mediating...
The Effect of Organizational Culture on Organizational Performance: Mediating...The Effect of Organizational Culture on Organizational Performance: Mediating...
The Effect of Organizational Culture on Organizational Performance: Mediating...
 
UNVEILING FACTORS IN NON-PERFORMING LOANS: EXPLORING THE RURAL ECONOMIC FINAN...
UNVEILING FACTORS IN NON-PERFORMING LOANS: EXPLORING THE RURAL ECONOMIC FINAN...UNVEILING FACTORS IN NON-PERFORMING LOANS: EXPLORING THE RURAL ECONOMIC FINAN...
UNVEILING FACTORS IN NON-PERFORMING LOANS: EXPLORING THE RURAL ECONOMIC FINAN...
 
Economic Census and the Economic Indicators - Sherine Al-Shawarby
Economic Census and the Economic Indicators - Sherine Al-ShawarbyEconomic Census and the Economic Indicators - Sherine Al-Shawarby
Economic Census and the Economic Indicators - Sherine Al-Shawarby
 
ANALYTICALAPPROCH ON ANNUAL SURVEY OF INDUSTRIAL DATA OF NCT OF DELHI DURING ...
ANALYTICALAPPROCH ON ANNUAL SURVEY OF INDUSTRIAL DATA OF NCT OF DELHI DURING ...ANALYTICALAPPROCH ON ANNUAL SURVEY OF INDUSTRIAL DATA OF NCT OF DELHI DURING ...
ANALYTICALAPPROCH ON ANNUAL SURVEY OF INDUSTRIAL DATA OF NCT OF DELHI DURING ...
 
100-hwang Impact assessment of R&D subsidies on input additionality and firms...
100-hwang Impact assessment of R&D subsidies on input additionality and firms...100-hwang Impact assessment of R&D subsidies on input additionality and firms...
100-hwang Impact assessment of R&D subsidies on input additionality and firms...
 
Pdprm2011 2016-120209202743-phpapp01
Pdprm2011 2016-120209202743-phpapp01Pdprm2011 2016-120209202743-phpapp01
Pdprm2011 2016-120209202743-phpapp01
 
PHILIPPINE DEVELOPMENT PLAN
PHILIPPINE DEVELOPMENT PLANPHILIPPINE DEVELOPMENT PLAN
PHILIPPINE DEVELOPMENT PLAN
 
D2,7.india country practice
D2,7.india country practiceD2,7.india country practice
D2,7.india country practice
 
Measuring of informal sector and informal employment in st lucia
Measuring of informal sector and informal employment in st luciaMeasuring of informal sector and informal employment in st lucia
Measuring of informal sector and informal employment in st lucia
 
Mis
MisMis
Mis
 
Assessment of Local Governance and Development Performance in Indonesia
   Assessment of Local Governance and Development  Performance in Indonesia   Assessment of Local Governance and Development  Performance in Indonesia
Assessment of Local Governance and Development Performance in Indonesia
 
Statistical education
Statistical educationStatistical education
Statistical education
 
NSSO_Kritika.pptx
NSSO_Kritika.pptxNSSO_Kritika.pptx
NSSO_Kritika.pptx
 
GAMIFICATION AND RESOURCE POOLING FOR IMPROVING OPERATIONAL EFFICIENCY AND EF...
GAMIFICATION AND RESOURCE POOLING FOR IMPROVING OPERATIONAL EFFICIENCY AND EF...GAMIFICATION AND RESOURCE POOLING FOR IMPROVING OPERATIONAL EFFICIENCY AND EF...
GAMIFICATION AND RESOURCE POOLING FOR IMPROVING OPERATIONAL EFFICIENCY AND EF...
 
Statistical system
Statistical systemStatistical system
Statistical system
 
Using data analytics to improve public expenditures - Struan LITTLE, New-Zealand
Using data analytics to improve public expenditures - Struan LITTLE, New-ZealandUsing data analytics to improve public expenditures - Struan LITTLE, New-Zealand
Using data analytics to improve public expenditures - Struan LITTLE, New-Zealand
 
Accounting Information and Decision Making in the Banking Sector Bank of Agri...
Accounting Information and Decision Making in the Banking Sector Bank of Agri...Accounting Information and Decision Making in the Banking Sector Bank of Agri...
Accounting Information and Decision Making in the Banking Sector Bank of Agri...
 

Final_project_report_67_Sri

  • 1. Project Report On Poolability Analysis of NSS 67th Round Central and Delhi State Sample For Unincorporated Non-Agricultural Enterprises (Excluding Construction) Sponsored by Ministry of Statistics and Programme Implementation(MOSPI) Under the Scheme for Internship of post-graduate/research student during 2015-16 Made By: Srimoyee Bose M.Sc. Statistics
  • 2. PAGE 2 ACKNOWLEDGEMENT Every project big or small is successful largely dueto theeffort of a number of brilliant people who havealways given their valuableadviceor lent a helping hand. I sincerely appreciatethe inspiration; support and guidanceof all those peoplewho havebeen instrumental in making this project a success. I, SrimoyeeBose,thestudent of RamjasCollege (UniversityOf Delhi), am extremely gratefulto DirectorateofEconomics &Statistics for theconfidencebestowed in me and entrusting my project entitled “OperationalandFinancial Characteristics of UnincorporatedNon-AgriculturalEnterprises(ExcludingConstruction) in Delhi” withspecial reference to MinistryofStatistics and Programme Implementation (MOSPI). At this juncture I feel deeply honoured in expressing my sincere thanks to Mr. C.K. Duttafor making theresources availableat right time and providing valuable insights leading to the successful completion of my project. I express my gratitudeto Department of Statistics for arranging thesummer training in goodschedule. I alsoextend my gratitudeto Mr. PraveenSrivastava and Mr. Hemant, who assisted mein compiling theproject.
  • 3. PAGE 3 I would alsoliketothank Mr. PraveenChaurasiafor his critical adviceand guidance without which this project would not havebeen possible. Any omission in this brief acknowledgement does not mean lack of gratitude. CONTENTS Ch.No TOPIC Page No. 01. Introduction & Background a. OrganizationalStructure b. Scheduleof Surveys c. Background of DES d. States participation in NSS e. Need for Pooling d. States participation in NSS e. about NSS 67th Round f. Need for Pooling g. Errors in Survey : Sampling & Non-sampling errors 5-10 02. Overview of 67th round a. Sampling Design & Estimation Procedure b. Concepts & Definitions c. Classificationof Enterprises (NIC-2008) 11 - 14 03. Methodology & Analysis 1. Methodology 2. Pooling of data 3. Non- Parametric Tests a. MultinomialTests b. Wald-Wolfowitz Runs Test 15-34
  • 4. PAGE 4 c. Poolability Test Result 2. Methods of Pooling a. InverseVariance b. Weighted Average c. Poolability Analysis 3. Relative StandardError 4. Divergence 04. Conclusion & Suggestions 35 – 39 OBJECTIVE The main objective of our project is the Poolability Analysis of NSS 67th Round, Sch.2.34 data of Unincorporated Non- Agricultural Enterprises (Excluding Construction) Delhi, for State and Central sample.
  • 5. PAGE 5 INTRODUCTION AND BACKGROUND The National Sample Survey (NSS) was set up in 1950 under Indian Statistical Institute (ISI), to bridge large gaps in statistical data needed for planning, policy formulation and computation of national income aggregates, especially in respect of the unorganized and household sector of the economy. The NSS was re-organized into National Sample Survey Office (NSSO) in 1972 under the Ministry of Statistics and Programme Implementation (MOSPI). Organizational structure:  The NSSO is headed by the DG & CEO  NSSO has 4 Divisions  Survey Design & Research Division (SDRD) Located at Kolkata  Field Operations Division (FOD) Headquarter located at New Delhi & Faridabad Six Zonal offices located at Bangalore, Kolkata, Guwahati, Jaipur, Lucknow and Nagpur, 49 field offices at Regional level and 118 field offices at sub- regional level spread over India  Data Processing Division (DPD) Headquarter is located at Kolkata 8 Data processing centres located at Kolkata, Nagpur, Delhi, Giridih, Ahmedabad and Bangalore  Coordination & Publication Division (CPD) Headquarter located at New Delhi
  • 6. PAGE 6 Schedule of Surveys:  Ten Year Cycle  Consumer Expenditure and Employment & Unemployment – Twice  Social Consumption (health, education etc.) - Twice  Un-organised Manufacturing - Twice  Un-organised services – Twice  Land & Livestock holdings - Once  Open Round – Once (Special surveys are also undertaken)  Annual Consumer Expenditure and Employment & Unemployment Surveys (thin sample) NSSO has been conducting nationwide multi-subject, integrated, large-scale sample surveys in the form of successive rounds covering various aspects of social, economic, demographic, industrial and agricultural statistics. These surveys are undertaken striking a balance between the urgent and contemporary need for reliable statistical data on different topics and the constraints of limited resources, both physical and financial. Certain topics like labour force, household consumer expenditure, social consumption, housing condition of people, and unorganized non-agricultural enterprise surveys, household land and live stock holding and debt and investment are repeated at decadal intervals. The remaining years are for open rounds in which subjects of current/special interest are undertaken on the demand of other Central Ministries, and national and international organizations, etc. NSSO has become synonymous with reliable estimates on various aspects of economic and social life in India based on large scale sample surveys. Background about Directorate of Economics & Statistics, Delhi Directorate of Economics & Statistics is nodal department of National Capital Territory of Delhi for collection, compilation, analysis and presentation of statistical data and information. The Directorate of Economics & Statistics also works as the Office of Chief Registrar
  • 7. PAGE 7 Births & Deaths. Director is the Chief Registrar, Births & Deaths for NCT of Delhi. States participation in NSS: The States started participating in the programme of collecting socio- economic data on the same subjects from the 8th round (July 1954- June1955) of NSS using the same concepts, definitions and procedures and by adopting the same sample design based on independently drawn sample as that ofNSSO. These two field operations are generally referred as central and State samples of the National Sample Survey. Sample sizes of central and state samples are equal for most of the States/UTs (equal matching sample). But there are some states where the number of samples surveyed by state statistical agencies is double that of the size of central samples One of the objectives of states participation in the NSS programme is to provide a mechanism by which sample size will be increased and the pooling of the two sets of data would enable better estimates at lower sub state level, particularly at district level. At the State level, this will result in increased precision of the estimates and at disaggregated level, estimates will be more stable. But the major benefit will be derived in the case of estimates are generated at sub- state level like NSS regions/districts. Need for Pooling It has been observed for quite some time that the results and reports presented by Government about various parameters (e.g. employment rate ,GDP etc) of our country is far away from actual situation which is mainly due to mainly sampling and non-sampling errors. In order to get precision in the estimated values, one needs to get as much data as possible. This can be done with the help of pooling. Pooling helps us increasing the sample by combining the central data and state data. It is a new technique which aims at providing reliable results for our parameters. The harmonization of data processing process is one of the key essences for pooling the different sets of data. 
The state sample data
  • 8. PAGE 8 should be processed using the same set of validation rules as in the case of central sample data. Accordingly, it is essential that the state sample data is processed using the same data entry layout as the central sample. If the States evolve their own data layout as per their convenience, the state data should then be converted to a layout harmonized with that of the central data, so that the software developed for the central sample can be used.
ABOUT NSS 67th ROUND: During the 67th round (1st July 2010 to 30th June 2011), NSSO carried out an all-India enterprise survey on unincorporated non-agricultural enterprises in manufacturing, trade and other service sectors, excluding construction and electricity, gas and water supply. The main aim was to get estimates of various economic and operational characteristics of these enterprises at national as well as state level. The survey was designed to estimate the value of key characteristics per enterprise, such as average number of workers, fixed assets, outstanding loans, total receipts, total operating expenses and gross value added, separately for 'Own Account Enterprises (OAEs)' and 'Establishments'. Information on various operational characteristics such as ownership, nature of operation, location and status of registration was also collected, to give an insight into the economic scenario of the unincorporated non-agricultural enterprises in the country. These economic and operational indicators are required for planning, policy and decision making at various levels, both within the government and outside. The following parameters have been considered for poolability testing and analysis of NSS 67th Round, Sch. 2.34 data of Unincorporated Non-Agricultural Enterprises (Excluding Construction) in Delhi:
- TYPE OF ENTERPRISE
  • 9. PAGE 9
- TYPE OF WORKER
- BROAD ACTIVITY TYPE
- GROSS VALUE ADDED (GVA)
- GROSS VALUE ADDED PER ENTERPRISE
The eligibility criterion for an enterprise to be covered in the survey is at least 30 days of operation (15 for seasonal enterprises and self-help groups) in the reference year.
SAMPLING & NON-SAMPLING ERRORS
The errors involved in the collection, processing and analysis of data may be broadly classified under the following two heads:
- Sampling errors
- Non-sampling errors
Sampling Errors
Sampling error arises in a data collection process as a result of taking a sample from a population rather than using the whole population, primarily due to:
- Improper selection of the sample in a survey
- Wrong usage of a sub-population in the selection of the sample
Non-Sampling Errors
Non-sampling error arises in a data collection process as a result of factors other than taking a sample. These errors exist in both sample surveys and censuses, and they have the potential to cause bias in polls, surveys or samples. They may primarily arise due to:
- Coverage or frame error: If units that should have been part of the frame are not represented in it, the omitted units have zero probability of selection.
  • 10. PAGE 10 On the other hand, if some units are duplicated, it results in over-coverage, with such units having larger probabilities of selection.
- Response errors: Response errors result when data is incorrectly requested, provided, received or recorded. These errors may occur because of inefficiencies with the questionnaire, the interviewer, the respondent or the survey process.
  1. Poor questionnaire design: It is essential that sample survey or census questions are worded carefully in order to avoid introducing bias. If questions are misleading or confusing, the responses may end up being distorted.
  2. Interview bias: An interviewer can influence how a respondent answers the survey questions. This may occur when the interviewer is too friendly or prompts the respondent.
  3. Respondent errors: Respondents can also provide incorrect answers. Faulty recollections, tendencies to exaggerate, and inclinations to give 'socially desirable' answers are a few reasons why a respondent may give an incorrect answer.
- Non-response errors: Non-response errors are the result of not having obtained sufficient answers to survey questions. There are two types of non-response errors: complete and partial.
- Processing errors: Processing errors are those which sometimes emerge during the preparation of the final data files.
- Estimation errors: If an inappropriate estimation method is used, then bias can still be introduced, regardless of how
  • 11. PAGE 11 errorless the survey had been before the estimation process.  Analysis errors - Analysis errors are those that occur when using the wrong analytical tools or when the preliminary results are provided instead of the final ones. Errors that occur during the publication of data results are also considered analysis errors.
  • 12. PAGE 12 OVERVIEW OF 67TH ROUND
Sampling Design and Estimation Procedure:
- The field work of the 67th round was carried out from 1st July 2010 to 30th June 2011. The entire survey period was divided into four sub-rounds of three months each, and an equal number of sample villages and blocks was allocated to each sub-round.
- During this round the following two schedules of enquiry were canvassed:
  a. Schedule 0.0: list of households and non-agricultural enterprises
  b. Schedule 2.34: unincorporated non-agricultural enterprises (excluding construction)
  Here we are concerned with Schedule 2.34 only.
- A total of 424 FSUs (16 villages and 408 urban blocks) were allotted for Delhi as the state sample. All 424 FSUs were surveyed for canvassing Schedule 2.34.
- A stratified multi-stage design was adopted for the 67th round.
- The first stage units (FSUs) are the census villages in the rural sector and Urban Frame Survey (UFS) blocks in the urban sector. The ultimate stage units (USUs) are enterprises in both sectors. In the case of large FSUs, one intermediate stage of sampling is the selection of three hamlet-groups (hgs)/sub-blocks (sbs) from each large rural/urban FSU.
- Two basic strata were formed at the state/UT level, viz., the rural stratum and the urban stratum.
  • 13. PAGE 13
- For the rural sector, if 'r' was the sample size allocated for a rural stratum, the number of sub-strata formed was 'r/4', and from each sub-stratum sample villages were selected with Probability Proportional to Size With Replacement (PPSWR), size being the population of the village as per Census 2001.
- For the urban sector, if 'u' was the sample size for an urban stratum, the number of sub-strata formed was 'u/4', and from each sub-stratum FSUs were selected by Simple Random Sampling Without Replacement (SRSWOR). Enterprises listed in the selected FSU/sub-FSU were stratified into 19 second stage strata (SSS), from which sample enterprises were selected by SRSWOR.
- A sample of 16000 FSUs for the central sample and 18248 FSUs for the state sample was allocated at all-India level.
Concepts and Definitions:
1. ENTERPRISE: An enterprise is an undertaking which is engaged in the production and/or distribution of some goods and/or services meant mainly for the purpose of sale, whether fully or partly. An enterprise may be owned and operated by a single household, by several households jointly, or by an institutional body.
2. MANUFACTURING ENTERPRISE: A manufacturing enterprise is a unit engaged in the physical or chemical transformation of materials, substances or components into new products. Units primarily engaged in maintenance and repair of industrial, commercial and similar machinery and equipment can also be included under manufacturing enterprises. The production
  • 14. PAGE 14 of goods for the sole purpose of domestic consumption was not considered as manufacturing.
3. TRADING ENTERPRISE: A trading enterprise is an undertaking engaged in trade. Trade is defined as an act of purchase of goods and their disposal by way of sale without any intermediate physical transformation of the goods.
4. SERVICING ENTERPRISE: A servicing enterprise is engaged in activities carried out for the benefit of a consuming unit; these typically consist of changes in the condition of the consuming unit realized by the activities of the servicing unit at the demand of the consuming unit. It is possible for a unit to produce a service for its own consumption, provided that the type of activity is such that it could have been carried out by another unit.
5. OWN ACCOUNT ENTERPRISE (OAE): An enterprise which is run without any hired worker employed on a fairly regular basis is termed an own account enterprise.
6. ESTABLISHMENT: An enterprise which employs at least one hired worker on a fairly regular basis is termed an establishment. Paid or unpaid apprentices and paid household members/servants/resident workers in an enterprise are considered hired workers.
7. WORKER: Workers are defined as all persons working within the premises of the enterprise who are on the payroll of the enterprise, as well as the working owners and unpaid family workers. Salespersons appointed by an enterprise for selling its services, and apprentices, paid or unpaid, were also treated as workers.
  • 15. PAGE 15
8. REFERENCE PERIOD: The last 30 days preceding the date of survey, or the last month, has been used as the reference period to collect most of the data.
Classification of Enterprise (NIC-2008)
a. MANUFACTURING ENTERPRISE: All the activities covered in divisions 10 to 33 of NIC-2008 are considered as manufacturing for the purpose of the survey. In addition, NIC-2008 code 01632 is covered in the present survey.
b. TRADING ENTERPRISE: All the activities covered in divisions 45 to 47 of NIC-2008 are considered as trade for the purpose of the survey.
c. SERVICING ENTERPRISE: All the activities covered in divisions 50 to 96 of NIC-2008 are considered as services for the purpose of the survey. In addition, NIC-2008 codes 37 to 39, 643, 649, 661 to 663, 771 to 773, 941 and 949 are considered as servicing enterprises and covered in the present survey. NIC-2008 codes 36, 491, 49212, 49213, 493, 51, 641, 642, 6611, 774, 942, 9491 (organisations only) and 9492 are outside the coverage of the survey.
  • 16. PAGE 16 METHODOLOGY
SPSS and Microsoft Excel have been used to test and pool the data, following the methodology below:
1. TESTING: The complete analysis and poolability testing is based on non-parametric tests, mainly the Chi-Square Goodness of Fit test, to test whether the central and state data follow the same distribution and hence whether they are actually poolable.
2. POOLED ESTIMATES: Estimates were calculated using the 'Inverse Variance Method' and the 'Weighted Average Method'.
3. RSE: In this report we also calculate the SE and RSE to check the size of the error relative to the estimate. After obtaining the RSE for the urban and rural sectors of the state and central data, the RSE of the pooled estimate is computed to check the percentage error likely to occur after pooling.
4. The parameters which passed the tests were pooled, and the results are presented in the sections that follow.
  • 17. PAGE 17 POOLING OF DATA
Condition: Harmonization of data processing is one of the key requirements for pooling the different sets of data. The state sample data should be processed using the same set of validation rules as the central sample data.
Need for testing: The central sample and the state sample are drawn independently, following an identical sampling design with the same concepts, definitions and instructions. However, due to a lack of adequate training of field and processing staff of States/UTs, the state data files are in some cases not properly validated. Agency bias is also to be expected in two sets of data generated by different agencies. As such, the two sets cannot simply be merged to generate a pooled estimate. One therefore needs to test whether the samples come from an identical distribution function. Since the parametric distribution of the sample mean is unknown, one may adopt non-parametric tests, such as the Wald-Wolfowitz runs test and the multinomial distribution test, for this purpose.
Methodology of pooling: Two alternative methods are used in pooling the central and state sample data.
o Weighting by Matching Ratio: The aggregate estimate of the pooled sample is built in proportion to the matching ratio m : n of the central and state sample aggregate estimates, where m and n are the allotted sample sizes for the central and state samples, separately for the rural and urban sectors. The ratio estimate of the pooled sample is then built as a ratio of aggregate estimates.
o Weighting by Inverse of Variance: Ratio estimates are built by weighting the ratio estimates of the central and state samples in proportion to the inverse of the variance of the ratio for the central and state samples.
  • 18. PAGE 18 NON-PARAMETRIC TESTS:
Multinomial Distribution Test (χ² test for goodness of fit)
For discrete data such as status of activity or educational level, and for categorical variables such as land possessed, standard tests of equality of sample proportions of two sets of data based on multinomial distributions (the relevant chi-square tests) may be used after grouping the attributes/categorical variables into a suitable number of classes, so that each class contains an adequate number of sample observations. Construct a 2 × k contingency table for k classes at the domain where the two sets of data are to be pooled, as below, and use the chi-square test to check whether the state sample and the central sample have identical distributions.

Number of sample observations by class:
Sample type    | Class-1 | Class-2 | ... | Class-(k-1) | Class-k | Total
State Sample   | N11     | N12     | ... | N1(k-1)     | N1k     | N1.
Central Sample | N21     | N22     | ... | N2(k-1)     | N2k     | N2.
Total          | N.1     | N.2     | ... | N.(k-1)     | N.k     | N..

H0: The two samples come from populations having the same distribution.
H1: The two samples come from populations having different distributions.
Test Statistic:
  • 19. PAGE 19
$$\chi^2 = \sum_{i=1}^{2} \sum_{j=1}^{k} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
where
χ² = Pearson's cumulative test statistic, which asymptotically approaches a χ² distribution
O_ij = observed frequency = N_ij
N.. = total number of observations
E_ij = expected (theoretical) frequency = (N_i. × N_.j) / N.., where i = 1 to 2 and j = 1 to k
Degrees of freedom = (2-1) × (k-1), where k = number of columns.
The Fisher-Yates table gives the tabulated value of chi-square at (2-1) × (k-1) d.f. at the 5% level of significance.
Decision Rule: If χ² ≥ tabulated value, reject H0.
NOTE: If k = 2, the contingency table is of order 2 × 2:

Number of sample observations by class:
Sample type    | Class-1 | Class-2 | Total
State Sample   | N11     | N12     | N1.
Central Sample | N21     | N22     | N2.
Total          | N.1     | N.2     | N..

$$\chi^2 = \frac{N_{..}\,(N_{11} N_{22} - N_{12} N_{21})^2}{N_{1.}\, N_{.1}\, N_{2.}\, N_{.2}}$$
Degrees of freedom = (2-1) × (2-1) = 1.
The Fisher-Yates table gives the tabulated value of chi-square at 1 d.f. at the 5% level of significance.
Decision Rule: If χ² ≥ tabulated value, reject H0.
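The test statistic above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `chi_square_2xk` and the class counts are hypothetical and are not taken from the survey data, and the critical value is passed in by hand, as read from a chi-square table.

```python
def chi_square_2xk(state_counts, central_counts):
    """Pearson chi-square statistic for a 2 x k table (state row, central row).

    Returns (statistic, degrees_of_freedom). Assumes every class has a
    non-zero column total, so that no expected frequency is zero.
    """
    if len(state_counts) != len(central_counts):
        raise ValueError("both samples must use the same k classes")
    rows = (state_counts, central_counts)
    row_totals = [sum(r) for r in rows]                              # N1., N2.
    col_totals = [s + c for s, c in zip(state_counts, central_counts)]  # N.j
    grand_total = sum(row_totals)                                    # N..
    statistic = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total   # E_ij
            statistic += (observed - expected) ** 2 / expected
    return statistic, len(state_counts) - 1                          # (2-1)*(k-1)


# Hypothetical counts for k = 2 classes (illustration only).
stat, df = chi_square_2xk([30, 20], [20, 30])
CRITICAL_5PCT_1DF = 3.841  # tabulated chi-square value, 1 d.f., 5% level
print(stat, df, "reject H0" if stat >= CRITICAL_5PCT_1DF else "do not reject H0")
```

For k = 2 the result agrees with the 2 × 2 shortcut formula: 100 × (30×30 − 20×20)² / (50×50×50×50) = 4.0.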
  • 20. PAGE 20 Wald-Wolfowitz run test
The run test is used to examine whether two random samples come from populations having the same distribution. This test can detect differences in averages, in spread, or in any other important aspect between the two populations. The test is efficient when each sample size is moderately large (greater than or equal to 10).
H0: The two samples come from populations having the same distribution.
H1: The two samples come from populations having different distributions.
Test Statistic: Let 'r' denote the number of runs. To obtain r, list the n1 + n2 observations from the two samples in order of magnitude. Denote observations from one sample by x's and those from the other by y's. Count the number of runs.
Critical Value: A difference in location results in few runs, and a difference in spread also results in few runs. Consequently the critical region for this test is always one-sided. The critical value used to decide whether the number of runs is too small is obtained from the table, which gives the critical value rc for n1 (size of sample 1) and n2 (size of sample 2) at the 5% level of significance.
Decision Rule: If r ≤ rc, reject H0.
Large Sample Sizes: For sample sizes larger than 20, the critical value rc is given by
$$r_c = \mu - 1.96\,\sigma, \qquad \mu = 1 + \frac{2 n_1 n_2}{n_1 + n_2}, \qquad \sigma = \sqrt{\frac{2 n_1 n_2\,(2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2\,(n_1 + n_2 - 1)}}$$
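The counting of runs and the large-sample critical value can be sketched as follows. The function names and the sample values are hypothetical, chosen only for illustration, and the sketch assumes there are no tied values between the two samples (ties would need a separate rule).

```python
import math


def count_runs(x, y):
    """Number of runs after listing both samples in order of magnitude."""
    labelled = sorted([(v, "x") for v in x] + [(v, "y") for v in y])
    labels = [label for _, label in labelled]
    # A new run starts wherever the sample label changes.
    return 1 + sum(1 for a, b in zip(labels, labels[1:]) if a != b)


def large_sample_critical_value(n1, n2):
    """r_c = mu - 1.96 * sigma, using the formulas given in the text."""
    mu = 1 + 2 * n1 * n2 / (n1 + n2)
    sigma = math.sqrt(
        2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
        / ((n1 + n2) ** 2 * (n1 + n2 - 1))
    )
    return mu - 1.96 * sigma


# Hypothetical samples: fully separated values give only 2 runs,
# perfectly interleaved values give the maximum number of runs.
print(count_runs([1, 2, 3], [4, 5, 6]))   # 2
print(count_runs([1, 3, 5], [2, 4, 6]))   # 6
print(round(large_sample_critical_value(25, 25), 3))
```

H0 would be rejected when the observed number of runs is less than or equal to the critical value.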
  • 21. PAGE 21 ANALYSIS OF POOLABILITY TEST
- Parametric and non-parametric tests are applied for poolability testing and analysis of NSS 67th Round data, as per the nature of the parameters.
- Non-parametric tests can analyse two types of data, continuous and discrete, with the help of the run test and the chi-square test respectively.
- The run test is applied to parameters that are continuous in nature, whereas the chi-square test is applied to parameters that are discrete in nature.
- The parameters Type of Enterprise, Type of Worker and Broad Activity Type are tested by the chi-square test because they are discrete, whereas the run test is applied to Gross Value Added (GVA) and Gross Value Added (GVA) per enterprise because they are continuous.
- The chi-square goodness of fit test at the 1% and 5% levels of significance has been applied for rural and urban areas of Delhi for the poolability test of Type of Enterprises, Type of Workers and Broad Activities.
- The null hypothesis has been accepted at the 1% level of significance for Type of Enterprises and Broad Activities. However, the null hypothesis is rejected for Type of Enterprises at the 5% level of significance for the rural sector, indicating that non-sampling errors are of large magnitude.
- For Type of Workers, the null hypothesis has been rejected at both the 1% and 5% levels of significance in both rural and urban sectors.
- The Wald-Wolfowitz run test has been applied for rural and urban areas of Delhi for the poolability test of Gross Value Added of all enterprises at the 1% and 5% levels of significance.
- The null hypothesis has been accepted at the 1% and 5% levels of significance for both sectors, i.e. rural and urban.
  • 22. PAGE 22 METHODS OF POOLING
We first divide the central sample and the state sample each into two independent sub-samples, A and B, in order to use the following methods.
Inverse Variance: The estimates for state and central can be computed respectively as:
$$t_s = \frac{t_{s1} + t_{s2}}{2}, \qquad t_c = \frac{t_{c1} + t_{c2}}{2}$$
where
t_{s1} and t_{s2} are respectively the aggregates of sub-samples A and B of the state sample,
t_{c1} and t_{c2} are respectively the aggregates of sub-samples A and B of the central sample.
The pooled estimate leading to the optimum combination of these two estimates is obtained by weighting with the inverse of the variance of each estimate. Thus the pooled estimate is given by:
$$T_p = \frac{V(t_c)\, t_s + V(t_s)\, t_c}{V(t_c) + V(t_s)}, \qquad V(T_p) = \frac{V(t_c)\, V(t_s)}{V(t_c) + V(t_s)}$$
In general V(t_c) and V(t_s) are unknown and can be estimated as
$$\hat{V}(t_s) = \frac{(t_{s1} - t_{s2})^2}{4}, \qquad \hat{V}(t_c) = \frac{(t_{c1} - t_{c2})^2}{4}$$
Thus the pooled estimate and the estimate of the pooled variance are given by
$$t_p = \frac{\hat{V}(t_c)\, t_s + \hat{V}(t_s)\, t_c}{\hat{V}(t_c) + \hat{V}(t_s)}, \qquad \hat{V}(t_p) = \frac{\hat{V}(t_c)\, \hat{V}(t_s)}{\hat{V}(t_c) + \hat{V}(t_s)}$$
By virtue of weighting the two estimates at the domain level at which they are pooled, the pooled estimate will always lie between the central and state sample estimates.
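As a worked illustration of the inverse variance formulas, the following minimal sketch pools two estimates from their sub-sample aggregates. The function names and the four sub-sample aggregates are hypothetical and are not taken from the survey data.

```python
def subsample_estimate(t1, t2):
    """Estimate and estimated variance from sub-sample aggregates A and B."""
    estimate = (t1 + t2) / 2
    variance = (t1 - t2) ** 2 / 4
    return estimate, variance


def pool_inverse_variance(t_s, v_s, t_c, v_c):
    """Pooled estimate t_p and its estimated variance V(t_p).

    Each estimate is weighted by the variance of the *other* estimate,
    which is equivalent to weighting by the inverse of its own variance.
    """
    total = v_c + v_s
    if total == 0:
        raise ValueError("both variance estimates are zero")
    t_p = (v_c * t_s + v_s * t_c) / total
    v_p = v_c * v_s / total
    return t_p, v_p


# Hypothetical sub-sample aggregates (illustration only).
t_s, v_s = subsample_estimate(100, 120)   # state:   t_s = 110, V = 100
t_c, v_c = subsample_estimate(90, 100)    # central: t_c = 95,  V = 25
t_p, v_p = pool_inverse_variance(t_s, v_s, t_c, v_c)
print(t_p, v_p)   # 98.0 20.0
```

In this illustration the pooled estimate (98) lies between the central (95) and state (110) estimates and sits closer to the central one, which has the smaller estimated variance; the pooled variance (20) is below both component variances.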
  • 23. PAGE 23 Weighted Average: When the State's participation does not match the central sample equally, the weighted average of the two estimates, with weights given by the matching ratio of the central and state samples, may be a better way of combining the estimates, treating the central and state samples as independent samples. Let the matching ratio of the state and central samples be m : n. Based on this, the respective estimates for state and central can be computed as:
$$t_s = \frac{t_{s1} + t_{s2}}{2}, \qquad t_c = \frac{t_{c1} + t_{c2}}{2}$$
where
t_{s1} and t_{s2} are respectively the aggregates of sub-samples A and B of the state sample,
t_{c1} and t_{c2} are respectively the aggregates of sub-samples A and B of the central sample.
The pooled estimate is obtained by weighting these two estimates with the matching participation ratio m : n. Thus the pooled estimate is given by:
$$t_p = \frac{m\, t_s + n\, t_c}{m + n}, \qquad V(t_p) = \frac{m^2\, V(t_s) + n^2\, V(t_c)}{(m + n)^2}$$
In general V(t_c) and V(t_s) are unknown and can be estimated as
$$\hat{V}(t_s) = \frac{(t_{s1} - t_{s2})^2}{4}, \qquad \hat{V}(t_c) = \frac{(t_{c1} - t_{c2})^2}{4}$$
Thus the estimate of the pooled variance is given by
$$\hat{V}(t_p) = \frac{m^2\, \hat{V}(t_s) + n^2\, \hat{V}(t_c)}{(m + n)^2}$$
The pooled estimate will always lie between the estimates based on the central and state samples separately.
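The weighted average formulas can be illustrated with a minimal sketch. The function name, the estimates and the matching ratio below are hypothetical values chosen for illustration, not figures from the survey.

```python
def pool_weighted_average(t_s, v_s, t_c, v_c, m, n):
    """Pooled estimate and estimated variance for matching ratio m : n.

    m weights the state estimate t_s and n weights the central estimate t_c,
    following the formulas in the text.
    """
    t_p = (m * t_s + n * t_c) / (m + n)
    v_p = (m ** 2 * v_s + n ** 2 * v_c) / (m + n) ** 2
    return t_p, v_p


# Hypothetical inputs: state estimate 110 (variance 100), central estimate 95
# (variance 25), and a state sample twice the size of the central sample.
t_p, v_p = pool_weighted_average(110, 100, 95, 25, m=2, n=1)
print(t_p, round(v_p, 3))   # 105.0 47.222
```

With these hypothetical inputs the pooled estimate (105) is pulled towards the state estimate because of its larger weight, even though the state estimate has the larger variance. The inverse variance method weights the more precise estimate more heavily instead.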
  • 24. PAGE 24 POOLABILITY ANALYSIS Type of Enterprises: INVERSE VARIANCE METHOD  The pooled number of Unincorporated Non-Agricultural Enterprises of State & Centre was estimated to be 1153521. Out of them 27626 (2.27%) were in rural areas and 1128503(97.83%) were in urban areas of Delhi.  Out of the total enterprises 651390 (56.46%) were Own–Account Enterprises and 502326 (43.54%) were Establishments.  Estimates of Rural + Urban obtained, are closer to those obtained under State as compared to Centre WEIGHTED AVERAGE METHOD  The pooled number of Unincorporated Non-Agricultural Enterprises of State & Centre was estimated to be 1139089. Out of them 26178 (2.30%) were in rural areas and 1112911(97.7%) were in urban areas of Delhi.  Out of the total enterprises 635572 (55.8%) were Own–Account Enterprises and 503517 (44.2%) were Establishments. INVERSE VARIANCE METHOD vs. WEIGHTED AVERAGE METHOD For Type of Enterprises, we observe that the inverse variance method has lesser variation than weighted average method, hence making it the better method.
  • 25. PAGE 25 Broad Activity Type: INVERSE VARIANCE METHOD  The pooled numbers of broad activities of Unincorporated Non- Agricultural Enterprises of State & Centre are estimated to be 1140863. Out of them 26028 were in rural areas and 1117152 were in urban areas of Delhi.  Out of total Unincorporated Non-Agricultural Enterprises, Trade accounted for 41.22%, the share of Other Services was 37.58% and Manufacturing constituted 21.20%.  Estimates of Rural + Urban obtained, are closer to those obtained under State as compared to Centre WEIGHTED AVERAGE METHOD  The pooled numbers of broad activities of Unincorporated Non- Agricultural Enterprises of State & Centre are estimated to be 1120809. Out of them 26195 were in rural areas and 1094614 were in urban areas of Delhi.  Out of the total enterprises 635572 (55.8%) were Own–Account Enterprises and 503517 (44.2%) were Establishments. INVERSE VARIANCE METHOD vs. WEIGHTED AVERAGE METHOD For Broad Activity Type we observe that the inverse variance method has lesser variation than weighted average method, hence making it a better method.
  • 26. PAGE 26 Gross Value Added:
INVERSE VARIANCE METHOD
- Gross Value Added (as per Product Approach) for the pooled number of enterprises, which lies between the State and Centre estimates, was calculated to be Rs. 37510418687.
- Out of the total Gross Value Added, the rural sector accounted for 1.38% and the share of the urban sector was 98.62%.
- The Rural + Urban estimates obtained are closer to those obtained under State than to those under Centre.
WEIGHTED AVERAGE METHOD
- Gross Value Added (as per Product Approach) for the pooled number of enterprises, which lies between the State and Centre estimates, was calculated to be Rs. 34295707983.
- Out of the total Gross Value Added, the rural sector accounted for 2.28% and the share of the urban sector was 97.72%.
INVERSE VARIANCE METHOD vs. WEIGHTED AVERAGE METHOD
For Gross Value Added, we observe that the inverse variance method has less variation than the weighted average method, making it the better method.
  • 27. PAGE 27 RELATIVE STANDARD ERROR (RSE)
Gauging the size of an entire population and deriving results from it is an arduous and cumbersome task. Probability theory and statistics, the branch of science that deals with this problem, uses the concepts of surveys, samples and standard error. Statisticians use standard errors to construct confidence intervals from their surveyed data. Confidence intervals are important for determining the validity of empirical tests and research. Standard error is, however, not to be confused with standard deviation: the latter refers to the variability within the given sample, while the former shows the variability of the sampling distribution itself.
Estimates for any parameter are formulated on the basis of a sample collected from a population, rather than the population itself. The error induced by not including the entire population is referred to as the standard error. The standard error is an absolute measure of the difference between the sample survey and the total population, and it affects the accuracy of the estimates.
The Relative Standard Error (RSE) is the standard error expressed as a fraction of the estimate and is usually displayed as a percentage. Estimates with an RSE of 25% or greater are subject to high sampling error and should be used with utmost caution. The relative standard error shows whether the standard error is large relative to the results. Thus, large relative standard errors suggest that the results are not reliable and that further investigation is needed.
  • 28. PAGE 28 The reliability of estimates can also be assessed in terms of a confidence interval. Confidence intervals represent the range in which the population value is likely to lie. They are constructed using the estimate of the population value and its associated standard error. For example, there is approximately a 95% chance (i.e. 19 chances in 20) that the population value lies within two standard errors of the estimate, so the 95% confidence interval is equal to the estimate plus or minus two standard errors.
Formula:
$$\text{RSE} = \frac{S.E.(x)}{x} \times 100$$
where
S.E.(x) = standard error of the estimate of the concerned parameter
x = the value of the estimator of the concerned parameter
Decision Criteria: The general rule for tolerating error:
- An estimate having RSE less than or equal to 5% is firmly considered an excellent estimate.
- An estimate having RSE between 5% and 10% is considered a good estimate.
- An estimate having RSE between 10% and 15% is considered an average estimate.
- An estimate having RSE beyond 15% strongly indicates that the estimate needs to be further investigated.
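The RSE formula and the decision criteria can be put together in a short sketch. The function names and input values are hypothetical; the band boundaries follow the decision criteria listed above, with each boundary value assigned to the better band as an assumption.

```python
import math


def relative_standard_error(estimate, standard_error):
    """RSE in percent: S.E.(x) / x * 100."""
    return standard_error / estimate * 100


def grade_rse(rse_percent):
    """Map an RSE (in percent) to the bands of the decision criteria."""
    if rse_percent <= 5:
        return "excellent"
    if rse_percent <= 10:
        return "good"
    if rse_percent <= 15:
        return "average"
    return "needs further investigation"


# Hypothetical pooled estimate of 98 with estimated variance 20.
rse = relative_standard_error(98, math.sqrt(20))
print(round(rse, 2), grade_rse(rse))

# Approximate 95% confidence interval: estimate plus or minus two standard errors.
low, high = 98 - 2 * math.sqrt(20), 98 + 2 * math.sqrt(20)
print(round(low, 2), round(high, 2))
```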
  • 29. PAGE 29 ANALYSIS OF RELATIVE STANDARD ERROR (RSE)
Type of Enterprise:
INVERSE VARIANCE METHOD (I.V.):
- In the Rural sector, the RSE for OAE and for the total is well within 6%. For Establishments, the RSE is acceptable at the margin of 10%.
- In the Urban sector, all the RSEs are well within the acceptable range.
- For Rural + Urban, all RSEs are within 4%.
- The estimates of RSE are closer to those of the State.
WEIGHTED AVERAGE METHOD (W.A.):
- In the Rural sector, the RSEs for Establishments and for the total are within 15%. For OAE, the RSE is beyond 16% and requires further examination.
- In the Urban sector, the estimate for Establishments is average, but all other RSEs are good, within 10%.
- For Rural + Urban, the estimate for Establishments is average, but all other RSEs are good, within 10%.
INVERSE VARIANCE METHOD VS. WEIGHTED AVERAGE METHOD:
The RSEs for the inverse variance method are in all cases considerably lower than those for the weighted average method. This implies high variation in the weighted average method, making the inverse variance method better. The estimate of RSE obtained through I.V. is closer to that obtained through W.A.
  • 30. PAGE 30 Broad Activity:
INVERSE VARIANCE METHOD (I.V.):
- In the Rural sector, the RSEs for Trading and Other Services are acceptable, lying within 15%. For Manufacturing, the RSE is excellent, well within the margin of 5%.
- In the Urban sector, all the RSEs are well within the excellent range.
- For Rural + Urban, all RSEs are well within 5%.
- The estimates of RSE are closer to those of the State.
WEIGHTED AVERAGE METHOD (W.A.):
- In the Rural sector, the RSEs for Manufacturing and Trading are acceptable, lying within 15%. For Other Services, the RSE is beyond 24% and requires further examination.
- In the Urban sector, the estimates for Trading and Other Services are average, whereas that for Manufacturing is excellent, lying well within 5%.
- For Rural + Urban, the estimates for Trading and Other Services are average, whereas that for Manufacturing lies well within 5%.
INVERSE VARIANCE METHOD VS. WEIGHTED AVERAGE METHOD:
The RSEs for the inverse variance method are in all cases considerably lower than those for the weighted average method. This implies high variation in the weighted average method, making the inverse variance method better. The difference between the estimates of RSE obtained through I.V. and W.A. is smaller for Manufacturing than for Trading and Other Services.
  • 31. PAGE 31 Gross Value Added:
INVERSE VARIANCE METHOD (I.V.):
- In the Rural sector, the RSE is beyond 16%, indicating the need for further examination.
- In the Urban and Rural + Urban sectors, the RSEs are good, being close to 6%.
- The estimates of RSE are closer to those of the State.
WEIGHTED AVERAGE METHOD (W.A.):
- In the Rural sector, the RSE is beyond 16%, indicating the need for further examination.
- In the Urban and Rural + Urban sectors, the RSEs are average, lying within 15%.
INVERSE VARIANCE METHOD VS. WEIGHTED AVERAGE METHOD:
The RSEs for the inverse variance method are in all cases considerably lower than those for the weighted average method. This implies high variation in the weighted average method, making the inverse variance method better. The estimate of RSE obtained through I.V. is closer to that of the State.
  • 32. PAGE 32 DIVERGENCE (d)
For a substantial gain in the reliability of the pooled estimate, the quality of the data collected by the two agencies must be of the same order with respect to non-sampling errors. The estimates generated from central and state samples are, as such, not directly comparable for some States, even at the state level. The estimates show wide divergence, raising doubts about the unknown magnitude of non-sampling error as well as agency bias. In such cases pooling may not result in a better estimate, as the estimates are not poolable. The situations that may arise for the estimates (aggregates) of a parameter (θ), say t1 and t2, with relative standard errors r1 and r2, obtained respectively from the central sample and the state sample data, are listed below.
1. The divergence d = |t1 - t2| ≈ 0 (i.e. small), and r1 and r2 are within the acceptable margin (r0).
2. The divergence d = |t1 - t2| ≈ 0, and r1 >> r0 and r2 >> r0.
3. The divergence d = |t1 - t2| ≈ 0, and r1 ≤ r0 but r2 >> r0.
4. The divergence d = |t1 - t2| >> 0, and r1 ≤ r0 and r2 ≤ r0.
5. The divergence d = |t1 - t2| >> 0, and r1 >> r0 and r2 >> r0.
6. The divergence d = |t1 - t2| >> 0, and r1 >> r0 and r2 < r0.
  • 33. PAGE 33 In situations 1 to 3 above, one may argue that the estimates are acceptable in the sense that they are close to each other, and that pooling the two estimates t1 and t2 will improve the reliability. Pooling the two estimates, even when both lie on the same side of the true value, may result in a small loss of information in respect of error (i.e. closeness to the true value) but may result in a significant gain in precision.
In situations 4 to 6, one may need to look into the estimates carefully in respect of their closeness to the true value of the parameter, either through external evidence or through prior knowledge regarding the trend of the estimates. It may happen that one estimate is very close to the true value and the other is far away from it. In that case, although the pooled estimate may have a smaller RSE, it may describe the true situation less well than the estimate that is closer to the true value, if the two estimates lie on the same side of the true value. The examination of the RSEs of the estimates is a secondary issue in such situations.
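The six situations can be sketched as a simple classification rule. The report does not state numerical cut-offs for "d ≈ 0" or for the acceptable margin r0, so the thresholds below are assumptions made purely for illustration: a relative tolerance `rel_tol` of 10% of the mean of the two estimates, and r0 = 15% borrowed from the RSE decision criteria. The function name and the sample inputs are also hypothetical, and situations 3 and 6 are applied symmetrically to whichever sample exceeds r0.

```python
def divergence_situation(t1, r1, t2, r2, r0=15.0, rel_tol=0.10):
    """Return (situation number 1-6, advice) for two estimates and their RSEs.

    t1, r1: central estimate and its RSE (percent)
    t2, r2: state estimate and its RSE (percent)
    r0 and rel_tol are assumed thresholds, not values taken from the report.
    """
    d = abs(t1 - t2)
    small = d <= rel_tol * (abs(t1) + abs(t2)) / 2
    n_within = int(r1 <= r0) + int(r2 <= r0)   # how many RSEs are acceptable
    if small:
        situation = {2: 1, 0: 2, 1: 3}[n_within]
        advice = "pool"
    else:
        situation = {2: 4, 0: 5, 1: 6}[n_within]
        advice = "investigate before pooling"
    return situation, advice


# Hypothetical inputs (illustration only).
print(divergence_situation(100, 5, 102, 6))    # (1, 'pool')
print(divergence_situation(100, 5, 150, 20))   # (6, 'investigate before pooling')
```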
  • 34. PAGE 34 OBSERVATIONS REGARDING DIVERGENCE
- Checking the divergence of the two sets of data is an alternative approach to checking the non-sampling errors involved in unit level data.
- As per the normality concept of statistics, a certain percentage of the State and Centre estimates has been taken as the deciding criterion for the aforementioned parameters.
- Estimates which are acceptable are close to each other, and pooling the State and Centre estimates will improve the reliability of the data.
- Wide divergence between the two sets of estimates, i.e. Central and State, indicates that pooling is not advisable, because it raises doubts about the unknown magnitude of non-sampling error as well as agency bias. Generally in such cases pooling may not result in a better estimate, as the estimates are not poolable.
- For estimates which need further investigation, one may need to look into the estimates carefully in respect of their closeness to the true value of the parameter, either through external evidence or through prior knowledge regarding the trend of the estimates.
  • 35. PAGE 35 Under the parameter
    1. Type of Enterprises:
       - Estimates of urban enterprises are acceptable.
       - Estimates of rural enterprises need further investigation.
    2. Broad Activities:
       - Under Manufacturing, estimates of both urban and rural enterprises need further examination.
       - Under Trading, estimates of rural enterprises need further investigation, whereas those of urban enterprises are acceptable.
       - Under Other Services, estimates of rural enterprises are acceptable, whereas those of urban enterprises need further investigation.
    3. Gross Value Added:
       - Estimates of both urban and rural enterprises need further examination.
  • 36. PAGE 36 CONCLUSION
    - District-wise unit level sample data is unavailable for the state of Delhi, which makes it very difficult to apply the poolability test at that level for a finer analysis. Poolability testing and analysis has therefore been done on the basis of sector-wise (Urban and Rural) unit level data.
    Testing
    - When its assumptions hold, a parametric test is generally more powerful than the corresponding non-parametric test. Those assumptions must, however, be satisfied before the test can be used. None of them was satisfied for the 67th Round data, which suggests a high chance of sampling and non-sampling errors.
  • 37. PAGE 37
    - For the 67th Round, the Multinomial Test has been applied to parameters such as Type of Enterprises and Broad Activities. The Wald-Wolfowitz Runs Test was applied to Gross Value Added.
    - The null hypothesis of the Multinomial Test was rejected for the parameter Type of Worker, indicating that the data cannot be pooled and that error is suspected in the data.
    - If a hypothesis is accepted at the 5% level, it is also accepted at the 1% level. Conversely, if it is rejected at the 1% level, it is also rejected at the 5% level.
    Pooling
    - All the pooled estimates derived through the Inverse Variance method were better than those obtained through the Weighted Average method: the Relative Standard Error of every parameter is smaller under the former.
    - For all the parameters, the Inverse Variance method shows less variation than the Weighted Average method, making it the better method.
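    The comparison of the two pooling methods can be illustrated with a minimal sketch. It rests on assumptions that this slide does not spell out: the State and Central estimates are treated as independent, the Weighted Average method is taken to weight each estimate by its share of the total number of sample units (n1 and n2 below), and all figures are made up for illustration. The function names are hypothetical.

```python
from math import sqrt


def pool_inverse_variance(t1, var1, t2, var2):
    """Pool two independent estimates with weights proportional to 1/variance."""
    w1, w2 = 1.0 / var1, 1.0 / var2
    pooled = (w1 * t1 + w2 * t2) / (w1 + w2)
    pooled_var = 1.0 / (w1 + w2)
    return pooled, pooled_var


def pool_weighted_average(t1, var1, n1, t2, var2, n2):
    """Pool two independent estimates with weights proportional to sample size.

    Assumption: this is what 'Weighted Average' means in the report.
    """
    a = n1 / (n1 + n2)
    pooled = a * t1 + (1.0 - a) * t2
    pooled_var = a ** 2 * var1 + (1.0 - a) ** 2 * var2
    return pooled, pooled_var


def rse_percent(estimate, variance):
    """Relative standard error as a percentage of the estimate."""
    return 100.0 * sqrt(variance) / estimate


# Made-up inputs: Central estimate 100 (variance 16), State estimate 110 (variance 64).
iv_est, iv_var = pool_inverse_variance(100.0, 16.0, 110.0, 64.0)
wa_est, wa_var = pool_weighted_average(100.0, 16.0, 500, 110.0, 64.0, 500)
print(iv_est, round(rse_percent(iv_est, iv_var), 2))
print(wa_est, round(rse_percent(wa_est, wa_var), 2))
```

    Among all weighted combinations of two independent unbiased estimates, inverse-variance weights give the smallest variance, which is consistent with the smaller RSEs reported for that method.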
  • 38. PAGE 38 UNINCORPORATED NON-AGRICULTURAL ENTERPRISES IN DELHI
    EXECUTIVE SUMMARY
    Following are the main highlights of the poolability analysis of NSS 67th round data (July 2010 – June 2011) through the Inverse Variance method.
    - The pooled number of Unincorporated Non-Agricultural Enterprises of State & Centre was estimated to be 1153521. Of these, 27626 (2.27%) were in rural areas and 1128503 (97.83%) were in urban areas of Delhi.
    - Of the total enterprises, 651390 (56.46%) were Own-Account Enterprises and 502326 (43.54%) were Establishments.
    - Of the total Unincorporated Non-Agricultural Enterprises, Trade accounted for 41.22%, Other Services for 37.58% and Manufacturing for 21.20%.
    - Gross Value Added (as per Product Approach) for the pooled number of enterprises of State & Centre was calculated to be Rs. 37510418687.
    - Gross Value Added (as per Product Approach) per Unincorporated Non-Agricultural Enterprise was estimated at Rs. 32518.
  • 39. PAGE 39 UNINCORPORATED NON-AGRICULTURAL ENTERPRISES IN DELHI
    EXECUTIVE SUMMARY
    Following are the main highlights of the poolability analysis of NSS 67th round data (July 2010 – June 2011) through the Weighted Average method.
    - The pooled number of Unincorporated Non-Agricultural Enterprises of State & Centre was estimated to be 1139089. Of these, 26178 (2.30%) were in rural areas and 1112911 (97.7%) were in urban areas of Delhi.
    - Of the total enterprises, 635572 (55.8%) were Own-Account Enterprises and 503517 (44.2%) were Establishments.
    - Of the total Unincorporated Non-Agricultural Enterprises, Trade accounted for 43.62%, Other Services for 36.58% and Manufacturing for 19.80%.
    - Gross Value Added (as per Product Approach) for the pooled number of enterprises of State & Centre was calculated to be Rs. 34295707983.
    - Gross Value Added (as per Product Approach) per Unincorporated Non-Agricultural Enterprise was estimated at Rs. 30108.
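    The per-enterprise figures in the two summaries follow from dividing the pooled Gross Value Added by the pooled number of enterprises. The short sketch below reproduces that arithmetic from the numbers quoted in the summaries. The dictionary layout and function names are hypothetical, introduced only for this illustration.

```python
# Figures copied from the two executive summaries (NSS 67th round, Delhi).
SUMMARIES = {
    "inverse_variance": {"enterprises": 1153521, "gva_rs": 37510418687},
    "weighted_average": {"enterprises": 1139089, "gva_rs": 34295707983},
}


def gva_per_enterprise(gva_rs, enterprises):
    """Gross Value Added per enterprise, in rupees."""
    return gva_rs / enterprises


def share_percent(part, total):
    """Share of `part` in `total`, as a percentage."""
    return 100.0 * part / total


per_enterprise = {
    method: round(gva_per_enterprise(f["gva_rs"], f["enterprises"]))
    for method, f in SUMMARIES.items()
}
print(per_enterprise)  # {'inverse_variance': 32518, 'weighted_average': 30108}

# Rural share under the Weighted Average method (26178 of 1139089 enterprises).
print(round(share_percent(26178, 1139089), 2))  # 2.3
```

    The same two helpers can be applied to any other count in the summaries, for example the split between Own-Account Enterprises and Establishments.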
  • 40. PAGE 40 SUGGESTIONS
    I. More accurate results for the aforementioned parameters can be obtained if data is collected district-wise.
    II. The objective of the survey must be kept precisely in mind while preparing a questionnaire. Highly technical and complicated questions must be avoided, as they lead to partial response or non-response from respondents.
    III. Non-sampling errors need to be detected and removed by the surveyor during the survey in each NSS round. Non-sampling errors increase the chance of Type I and Type II errors. The former is the incorrect rejection of a hypothesis that should have been accepted, whereas the latter is the incorrect acceptance of a hypothesis that should have been rejected. Both result in misleading conclusions about the sample.
    IV. Updated maps of the locality need to be used and provided to the surveyor, so that relevant data is collected with correct demographics and well in time.
  • 41. PAGE 41 BIBLIOGRAPHY
    - Report of NSS on Operational and Financial Characteristics of Unincorporated Non-Agricultural Enterprises (Excluding Construction) in Delhi 2010-11, Directorate of Economics and Statistics, Delhi.
    - Training Manual on Data Processing, NSS 67th Round, NSSO, MOSPI.
    - Note on Sample Design and Estimation Procedure, NSS 67th Round (July 2010 – June 2011), NSSO, MOSPI.
    - Report of the Committee on Pooling of Central and State Sample Data of NSS, NSC, Government of India, November 2011.
    - www.google.com
    - www.wikipedia.org