2. STATISTICS-DEFINITION
Statistics is a branch of science which deals with the collection, classification,
summarizing, analysis and interpretation of numerical data.
When statistics is applied in biology, including human biology,
medicine and public health, it is called “biostatistics”.
Sir Francis Galton is known as the “Father of Biostatistics”.
Facts and statistics collected together for reference or analysis are
called DATA.
4. GEOGRAPHICAL DATA
Data are classified on the basis of geographical regions such as countries,
states, districts, mandals etc.
Yield of agricultural production per acre in different
countries in a particular year is given below:
USA 600
CHINA 300
PAKISTAN 250
INDIA 150
5. CHRONOLOGICAL DATA
Data are classified on the basis of time of occurrence such as year,
quarter, month, week etc.
Profit of BHEL company by year is given below:
YEAR BHEL PROFITS
2006 950
2007 1120
2008 1370
2009 1280
6. QUALITATIVE DATA
These data are not numerical; the values taken are usually names.
There is no notion of magnitude or size of the characteristic, as it cannot
be measured.
Data are classified by counting the individuals having the same characteristic and
not by measurement.
There is only one variable.
Also called nominal, categorical or attribute variables.
7. QUALITATIVE DATA
Persons with the same characteristic are counted to form a specific group.
Example:
Number of attacks, births, cures etc.
Pain, when recorded as present or absent, is qualitative, but when
measured on a 6-10 scale it becomes quantitative.
8. QUANTITATIVE DATA
Data have magnitude or size. The characteristic can be measured on an interval or a ratio
scale.
• Ex: height, weight, Hb, BP, enzyme levels etc.
• The data are divided into class intervals. For example, persons whose age is within 21-25 come
under one group and those within 26-30 fall in another group.
• In this way the whole data set is divided into a number of groups or class intervals.
• Each class interval (CI) has an UPPER and a LOWER LIMIT. The difference between these limits is called
the CLASS MAGNITUDE/INTERVAL.
• The number of items which fall in a given class is called the FREQUENCY of that class.
• All the class groups with their frequencies, taken together and put in the form of a table, are
described as a “GROUPED FREQUENCY DISTRIBUTION”.
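The grouping described above can be sketched in a few lines of Python. This is only an illustration: the ages are made up, and the helper name `find_class` is hypothetical. The two class intervals 21-25 and 26-30 are the ones mentioned in the text.

```python
from collections import Counter

# Hypothetical ages, made up for illustration only.
ages = [21, 22, 25, 26, 27, 30, 23, 29, 24, 28]

# Class intervals from the text, stored as (lower limit, upper limit) pairs.
class_intervals = [(21, 25), (26, 30)]

def find_class(value, intervals):
    """Return the interval whose limits contain the value, or None."""
    for lower, upper in intervals:
        if lower <= value <= upper:
            return (lower, upper)
    return None

# Frequency of a class = number of items falling in that class.
frequency = Counter(find_class(age, class_intervals) for age in ages)

# Print the grouped frequency distribution as a small table.
for lower, upper in class_intervals:
    print(f"{lower}-{upper}: {frequency[(lower, upper)]}")
```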
9. Organization of data
Ungrouped data
• Ascending order
• Descending order
Grouped data
• Discrete
• Continuous
  • Exclusive/overlapping class interval
  • Inclusive class interval
  • Open-end class interval
10. CLASSIFICATION ACCORDING TO SOURCE OF INFORMATION
1. PRIMARY DATA:
• Collected by the investigator himself for the first time.
• In India there are various agencies which collect primary data.
• Ex: Census, National Sample Survey etc.
2. SECONDARY DATA:
• Data which have already been collected by a source other than the
present investigator.
• Ex: hospital records, journals, magazines, TB.
11. DIFFERENCES BETWEEN PRIMARY AND SECONDARY DATA
Primary data | Secondary data
The investigator himself collects the data. | The investigator makes use of data collected by another investigator and stored somewhere.
Large expenditure, time and manpower are required. | Less expenditure, time and manpower are required.
The data can be used without extra precaution. | The data cannot be used without special care.
12. SOURCES OF PRIMARY DATA
• Direct personal investigation
• Indirect oral investigation
• Information through correspondents
• The questionnaire method
14. SOURCES OF PRIMARY DATA
1. Direct personal investigation:
In this method the investigator has to go personally and collect the information. Precautions
must be taken with regard to the nature, behavior and personality of the investigator. The information
obtained is reliable. The success depends on the qualities of the investigator.
2. Indirect oral investigation:
In this method data are collected from persons who may possess some knowledge
about the subject of investigation. They are called witnesses. This method is followed when the
source of information cannot be approached directly. The degree of accuracy can be affected.
15. SOURCES OF PRIMARY DATA
3. Information through correspondents:
The investigator does not collect the information directly from the respondents. The work is
entrusted to correspondents. This method is generally employed by press reporters,
broadcasting agencies and governmental organizations. The information obtained can be biased.
4. Questionnaire method:
Information is obtained through a questionnaire, which can be sent by post or through
enumerators.
The following points should be observed while preparing a questionnaire:
16. SOURCES OF PRIMARY DATA
The size of the questionnaire should be small.
The questionnaire should be simple.
Possible answers should be suggested along with the questions.
Offensive or hurtful questions should not be asked.
Questions requiring mathematical calculations should be avoided.
Necessary instructions and definitions should be given.
The questionnaire should be attractive.
17. SOURCES OF SECONDARY DATA
1. Published sources:
• International publications
• Official publications of central and state governments
• Private publications
• Semi-official publications
• Commercial and financial publications
• Publications of research institutions
• Magazines
• Publications of research scholars
2. Unpublished sources:
Records maintained by government and private offices, and studies made by research
scholars and other research institutions.
18. METHODS OF DATA COLLECTION
Observation method
Interview method
Questionnaire
Schedules
19. OBSERVATION METHOD
Information is collected by the investigator's own direct observation, without asking the
respondent. For example, in a study relating to consumer behavior, the investigator, instead of
asking the brand of wrist watch, may himself look at the watch.
The advantage is that subjective bias is eliminated.
It is suitable in studies where the respondents are not capable of giving verbal reports of their
feelings.
The disadvantages are that it is expensive, the information obtained is very limited, and some unforeseen
factors may interfere with the method.
20. INTERVIEW METHOD
This method can be used through personal interviews and through telephone interviews.
1. Personal interview:
The interviewer asks the questions in face-to-face contact with the other person, known as the interviewee.
It can take the form of direct personal investigation or indirect oral investigation.
In direct personal investigation the interviewer has to collect the information personally; he
must be present on the spot and meet the people concerned. This is suitable for intensive investigations.
21. INTERVIEW METHOD
At times when the above technique cannot be used, an indirect oral
investigation can be conducted by cross-examining other persons
who have knowledge about the problem under investigation.
Most of the committees appointed by the government use this method.
2. Telephone interviews:
Information is collected by telephone.
It is not a very widely used method.
It plays an important part in industrial surveys, particularly in developed regions.
22. Questionnaire
This method is quite popular for big enquiries.
It is adopted by research workers and by private and public organizations.
In this method the questionnaire is prepared and sent to the persons concerned with a request to answer
the questions, which are printed in a definite order on a form or set of forms.
The respondents have to answer the questions on their own.
The advantages are:
• Low cost
• Free from bias of the interviewer
• Respondents have adequate time to think
• Respondents who are not easily approachable can also be reached
• Large samples can be used, and thus results can be made more dependable and reliable
23. Disadvantages:
Low rate of return of duly filled-in questionnaires.
It can be used only when respondents are educated and cooperative.
There is a possibility of omission of replies to certain questions.
This method is the slowest of all.
Before using this method it is advisable to conduct a pilot study to test the questionnaire.
Main aspects of a questionnaire:
General form: structured or unstructured
Question sequence: clear and smoothly moving
Question formulation and wording: clear, impartial, easily understandable
24. Collection through schedules
• Schedules are filled in by enumerators who are specially
appointed for this purpose.
• These enumerators go to the respondents with the schedules, put to them the
questions from the proforma in the order in which they are listed, and record the
replies in the spaces provided in the proforma.
25. FREQUENCY DISTRIBUTION
• DEFINITION:
• The number of items which fall in a given class is known as the frequency of that class. All the class
groups with their frequencies, taken together and put in the form of a table, are called a
frequency distribution (Shivani Sharma).
• A mathematical function showing the number of instances in which a variable takes each of its possible
values.
• In statistics, a frequency distribution is a list, table or graph that displays the frequency of various
outcomes in a sample.
• Each entry in the table contains the frequency or count of the occurrences of values within a
particular group or interval.
26. Frequency distribution in various types of data:
1. RAW DATA/INDIVIDUAL SERIES:
• The statistical information collected from the investigation is called raw data.
• These data need to be further organized according to the requirement.
The organization can be done in different ways:
• Alphabetical order by names
• Serial order
• Ascending order – lowest to highest
• Descending order – highest to lowest
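As a small illustration of arranging an individual series, the sketch below sorts a list of raw observations both ways. The values are hypothetical and chosen only for the example.

```python
# Hypothetical raw observations, made up for illustration.
raw_data = [136, 120, 176, 142, 120]

# Ascending order: lowest to highest.
ascending = sorted(raw_data)

# Descending order: highest to lowest.
descending = sorted(raw_data, reverse=True)

print(ascending)
print(descending)
```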
28. 2. DISCRETE SERIES:
When the sample is large or the observations are many, it is difficult to
organize the data as an individual series. In such situations a discrete series
can be used. In a discrete series the frequencies of a variable are given, but the
variable is without class intervals.
E.g.: if we find that there are 6 students whose weight is 136 lb, we need not write
136 six times. We can simply say that the frequency of the value 136 is 6.
In this series the units and their related frequencies are given.
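A discrete series of this kind can be sketched with Python's `collections.Counter`. Only the value 136 with frequency 6 comes from the example above; the other weights are assumptions added so the table has more than one row.

```python
from collections import Counter

# Weights in lb. 136 occurs six times as in the example;
# the remaining values are hypothetical.
weights = [136] * 6 + [140, 140, 128]

# Each distinct value is paired with its frequency, with no class intervals.
discrete_series = Counter(weights)

for value in sorted(discrete_series):
    print(value, discrete_series[value])
```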
29. CONTINUOUS SERIES/CLASS INTERVAL SERIES
• Units are expressed in various classes and their respective frequencies are given.
• CLASS: any stated interval is called a class (e.g. 110-120 is a class).
• CLASS LIMIT: the boundaries of a class are called class limits (in 110-120, 110 is the lower limit
and 120 is the upper limit).
• CLASS INTERVAL: the difference between the lower limit of one class and the lower
limit of the next class.
• Ex: in classes 110-120 and 120-130, the class interval is the difference between 120
and 110. Here the class interval is 10.
• CLASS FREQUENCIES: the number of values of the series which fall in a class is known
as the class frequency.
30. STEPS IN CONSTRUCTION OF A SERIES
1. DETERMINING THE RANGE:
Range is obtained by subtracting the smallest value from the largest value.
Ex: In the above problem the maximum value is 176 and the minimum value is 120; the range is 176-120=56.
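The range calculation can be checked with a short sketch. Only the extremes 120 and 176 come from the example; the intermediate observations and the function name `data_range` are assumptions for illustration.

```python
def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

# 120 and 176 are the extremes quoted in the example;
# the values in between are hypothetical.
observations = [120, 135, 150, 162, 176]

print(data_range(observations))
```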
2. DETERMINING THE NO. OF CLASSES:
No specific criterion is laid down for the number of classes. The grouping should be meaningful; there can be 5-15 classes.
The magnitude of each class interval should preferably be the same, but unequal magnitudes can also be used.
Multiples of 2, 5, 10 are generally preferred while determining the class magnitude.
In case of very high or low values one can use open-ended intervals such as less than 100 or above 200.
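One possible way to turn these guidelines into classes is sketched below. It assumes equal class magnitudes, exclusive-type classes starting at the minimum value, and a class magnitude of 10 chosen for illustration. The function name `build_classes` is hypothetical, and the minimum and maximum are the 120 and 176 used in the range example.

```python
def build_classes(minimum, maximum, width):
    """Build equal-width, exclusive-type classes that cover minimum..maximum.

    Assumes the first class starts at the minimum. One class is added beyond
    the floor division so that the maximum always falls inside a class, even
    when it coincides with a class boundary.
    """
    count = (maximum - minimum) // width + 1
    return [(minimum + i * width, minimum + (i + 1) * width) for i in range(count)]

classes = build_classes(120, 176, 10)

for lower, upper in classes:
    print(f"{lower}-{upper}")

# Check against the suggested guideline of 5-15 classes.
print(5 <= len(classes) <= 15)
```

With a magnitude of 10 this gives six classes, from 120-130 to 170-180, which is within the suggested 5-15.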
31. STEPS IN CONSTRUCTION OF A SERIES
3. DETERMINATION OF THE CLASS LIMITS:
Class limits are generally located at multiples of 2, 5, 10, 20, 100 etc.
The two main types of CI are:
EXCLUSIVE TYPE CLASS INTERVALS:
Ex: 5-10, 10-15, 15-20…
In the exclusive type, items whose values are equal to the upper limit of a class
are put in the next higher class. For example, an item having a value of exactly
15 would be put in the 15-20 and not in the 10-15 class interval.
INCLUSIVE TYPE CLASS INTERVALS:
Ex: 6-10, 11-15, 16-20, 21-25 (or 5-9, 10-14, 15-19…)
In this type the upper limit of a class interval is also included in the same class interval.
For example, an item having a value of 15 will be put in the 11-15 class interval.
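The difference between the two types comes down to whether the upper limit belongs to the class. A minimal sketch, with hypothetical function names, using the class intervals quoted above:

```python
def exclusive_class(value, intervals):
    """Exclusive type: the upper limit is NOT part of the class (lower <= value < upper)."""
    for lower, upper in intervals:
        if lower <= value < upper:
            return (lower, upper)
    return None

def inclusive_class(value, intervals):
    """Inclusive type: the upper limit IS part of the class (lower <= value <= upper)."""
    for lower, upper in intervals:
        if lower <= value <= upper:
            return (lower, upper)
    return None

exclusive_intervals = [(5, 10), (10, 15), (15, 20)]
inclusive_intervals = [(6, 10), (11, 15), (16, 20), (21, 25)]

# The same item, 15, lands in different classes under the two schemes.
print(exclusive_class(15, exclusive_intervals))
print(inclusive_class(15, inclusive_intervals))
```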
32. STEPS IN CONSTRUCTION OF A SERIES
4. USE OF TALLY BARS:
TALLY SHEETS:
Frequency is determined with tally sheets. The classes are
written on a sheet of paper, and for each item a vertical line is put against the
class group in which it falls.
Generally, after every four vertical lines, the fifth item falling in
the same group is marked by an oblique line drawn through the previous four
lines, so that each bundle represents five items. This helps in counting.
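The tally convention can be imitated in text. In the sketch below the string `||||/` stands in for four vertical lines crossed by an oblique line; this rendering and the function name `tally` are assumptions made for illustration.

```python
def tally(count):
    """Render a count as tally marks: bundles of five, then leftover strokes."""
    bundles, remainder = divmod(count, 5)
    parts = ["||||/"] * bundles          # each bundle represents five items
    if remainder:
        parts.append("|" * remainder)    # leftover single strokes
    return " ".join(parts)

print(tally(7))
print(tally(12))
```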
33. STEPS IN CONSTRUCTION OF A SERIES
5. MID VALUE:
The middle value between the class limits of a class interval is called the mid value of the class
interval.
It is found by dividing the sum of the lower limit and the upper limit of the class interval by two.
Ex: Mid value of class interval 100-150 is (100+150)/2=125.
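The mid value calculation, written as a small function (the name `mid_value` is hypothetical). The limits must be added before dividing by two, which is why the sum is in parentheses.

```python
def mid_value(lower_limit, upper_limit):
    """Mid value = (lower limit + upper limit) / 2."""
    return (lower_limit + upper_limit) / 2

print(mid_value(100, 150))
```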
34. CUMULATIVE FREQUENCY SERIES
The cumulative frequency of a class is the sum of the frequencies of that class and
all the classes before it; it is denoted by the symbol.
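A cumulative frequency series can be sketched as a running total of class frequencies. The classes and frequencies below are hypothetical, and `itertools.accumulate` is used here as one convenient way to compute the running sum.

```python
from itertools import accumulate

# Hypothetical classes and frequencies, for illustration only.
classes = ["110-120", "120-130", "130-140"]
frequencies = [4, 7, 5]

# Running total of the frequencies, class by class.
cumulative = list(accumulate(frequencies))

for label, freq, cum in zip(classes, frequencies, cumulative):
    print(label, freq, cum)
```

The last cumulative frequency always equals the total number of observations.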