Prof . T RAMA KRISHNA RAO (8839271225 )
BBA 2nd
SEM
STATISTICS
PT R.S.S.U
BBA II
Statistics
Unit-I
Meaning and definition of Statistics; Scope and Limitations of Statistics; Processing and
Presentation of Data.
Unit-II
Measures of Central Tendency: Mean, Geometric Mean, Median, Mode.
Unit-III
Measures of Variation: Standard Deviation and Skewness.
Unit-IV
Correlation Analysis: Karl Pearson's Coefficient of Correlation.
Unit-V
Index Number, Time Series Analysis
Unit 1
Statistics
PT R.S.S.UNIVERSITY PREVIOUS YEAR QUESTION PAPERS
2016
Q.1 Define statistics. Explain, with suitable examples, the ways in which statistical data can be presented.
Q.2 Differentiate between classification and tabulation. Mention the requisites of a good statistical table.
2015
Q.1 Explain the meaning and scope of statistics, bringing out its importance in the field of business.
Q.2 What do you mean by data? What are its objectives? Explain the different kinds of classification of data.
Q.3 Draw a histogram to represent the following frequency distribution.
Marks             0-10  10-20  20-40  40-50  50-60  60-70  70-90  90-100
No. of students      4      6     14     16     14     10     16       5
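The class widths in this distribution are unequal (10 marks for most classes, 20 for two of them), so the bars of the histogram are normally drawn with heights equal to the frequency density (frequency divided by class width), not the raw frequency. The following is a minimal sketch of that calculation in Python, using only the figures given in the question; the variable names are illustrative.

```python
# Class intervals (lower limit, upper limit) and frequencies from the question.
classes = [(0, 10), (10, 20), (20, 40), (40, 50),
           (50, 60), (60, 70), (70, 90), (90, 100)]
frequencies = [4, 6, 14, 16, 14, 10, 16, 5]

# Bar height = frequency / class width, so that the AREA of each bar
# is proportional to the frequency of its class.
densities = [f / (upper - lower)
             for (lower, upper), f in zip(classes, frequencies)]

for (lower, upper), f, d in zip(classes, frequencies, densities):
    width = upper - lower
    print(f"{lower:>3}-{upper:<3}  frequency={f:>2}  width={width:>2}  density={d:.1f}")
```

With these heights, the 20-40 and 70-90 bars are twice as wide as the others but proportionally lower, so that equal areas still stand for equal numbers of students.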
2014
Q.1 Define statistics. What are its main functions? Discuss briefly the limitations of statistics.
Q.2 What is tabulation? What are its uses? Mention the items that a good statistical table contains.
Q.3 Draw a frequency polygon for the following distribution.
Class interval  15-25  25-35  35-45  45-55  55-65  65-75
Frequency          10     16     18     15     13      4
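A frequency polygon is drawn by plotting each frequency against the mid-point of its class and joining the points with straight lines. The sketch below computes the points to plot for the distribution in this question. Closing the polygon on the horizontal axis with one empty class at each end is a common textbook convention, assumed here and not stated in the question.

```python
# Class intervals and frequencies from the question.
classes = [(15, 25), (25, 35), (35, 45), (45, 55), (55, 65), (65, 75)]
frequencies = [10, 16, 18, 15, 13, 4]
width = 10  # all classes have the same width

# Mid-point of a class = (lower limit + upper limit) / 2
mid_points = [(lower + upper) / 2 for lower, upper in classes]

# Assumed convention: add a zero-frequency point one class width before
# the first mid-point and one after the last, so the polygon is closed.
points = ([(mid_points[0] - width, 0)]
          + list(zip(mid_points, frequencies))
          + [(mid_points[-1] + width, 0)])

for x, y in points:
    print(f"mid-point {x:>5.1f}  frequency {y}")
```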
2013
Q.1 Explain the meaning and scope of statistics, bringing out its importance in the field of business.
Q.2 What is meant by classification? What precautions are to be taken in selecting class intervals?
Q.3 Represent the following data by a pie chart.
Food 87
Clothing 24
Recreation 11
Education 13
Rent 25
Miscellaneous 20
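To draw the pie chart, each item is converted into an angle at the centre of the circle: angle = (value / total) x 360 degrees. The sketch below works this out in Python for the figures in the question; the dictionary name is illustrative.

```python
# Expenditure figures from the question (units are not stated in the question).
expenditure = {
    "Food": 87,
    "Clothing": 24,
    "Recreation": 11,
    "Education": 13,
    "Rent": 25,
    "Miscellaneous": 20,
}

total = sum(expenditure.values())

# Angle of each sector in degrees; multiplying before dividing keeps the
# arithmetic exact for these figures.
angles = {item: value * 360 / total for item, value in expenditure.items()}

for item, angle in angles.items():
    print(f"{item:<14} {expenditure[item]:>3}  {angle:>6.1f} degrees")
print("Total", total, "->", sum(angles.values()), "degrees")
```

The total here is 180, so every angle is simply twice the given value and the sectors add up to 360 degrees.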
Meaning and definition of Statistics; Scope and Limitations of Statistics; Processing
and Presentation of Data
STATISTICS
Meaning:
The word "statistics" is derived from the Latin word "status", meaning a political state. In common use it refers to a group of numbers or figures that represent some information of human interest.
In the beginning, statistics was used for collecting information about states and other information which was needed about their people, their number, the revenue of the state, etc.
Definition:
The term "Statistics" has been defined in two senses, i.e. in the singular and in the plural sense.
In the plural sense, it means a systematic collection of numerical facts; in the singular sense, it is the science of collecting, classifying and using statistics.
A. In the Plural Sense:
"Statistics are numerical statements of facts in any department of enquiry placed in relation to each other." - A.L. Bowley
"The classified facts respecting the condition of the people in a state, especially those facts which can be stated in numbers or in tables of numbers or in any tabular or classified arrangement." - Webster
The definitions given above give a narrow meaning to statistics, as they do not indicate the various aspects witnessed in its practical applications. From this point of view, the definition given by Prof. Horace Secrist appears to be the most comprehensive and meaningful:
"By statistics we mean aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standard of accuracy, collected in a systematic manner for a predetermined purpose, and placed in relation to each other." - Horace Secrist
B. In the Singular Sense:
"Statistics refers to the body of technique or methodology, which has been developed for the collection, presentation and analysis of quantitative data and for the use of such data in decision making." - Neter and Wasserman
"Statistics may rightly be called the science of averages." - Bowley
"Statistics may be defined as the collection, presentation, analysis, and interpretation of numerical data." - Croxton and Cowden
Some Modern Definitions:
"Statistics is a body of methods for making wise decisions in the face of uncertainty." - Wallis and Roberts
"Statistics is a body of methods for obtaining and analyzing numerical data in order to make better decisions in an uncertain world." - Edward N. Dubois
Stages of Investigations:
1. Collection of Data: This is the first stage of investigation. The method of collection suited to the problem is decided, and the data are then collected.
2. Organisation of Data: This is the second stage. The data are simplified, made comparable and classified according to time and place.
3. Presentation of Data: In this third stage, the organised data are made simple and attractive. They are presented in the form of tables, diagrams and graphs.
4. Analysis of Data: The fourth stage of investigation is analysis, which is necessary to get correct results. It is often undertaken using measures of central tendency, measures of dispersion, correlation, regression, interpolation, etc.
5. Interpretation of Data: In this last stage, conclusions are drawn. Comparisons are made, and on this basis forecasts are made.
Nature of Statistics
1. Statistics is a Science: Science, by definition, is a systematic body of knowledge which studies cause-and-effect relationships and endeavours to find generalizations. Taking the various statistical methods into consideration, we can define statistics as a science in which we study:
• Numerous methods of collecting, editing, classifying, tabulating and presenting facts using graphs and diagrams
• Several ways of condensing data regarding various social, political and economic problems
This is done to establish relationships between various facts. It also helps in analysing and interpreting problems and in forecasting them.
2. Statistics is an Art: If science is knowledge, art is action, the actual application of science. While science teaches us to know, art teaches us to do. Statistics is an art in that it applies scientific methods. As an art, statistics offers a better understanding of, and solutions to, real-life problems because it provides quantitative information. While there are several statistical methods, their successful application depends on the statistician's degree of skill and experience.
According to Tippett, "Statistics is both a science and an art. It is a science in that its methods are basically systematic and have general application and art in that their successful application depends, to a considerable degree, on the skill and special experience of the statistician, and on his knowledge of the field of application."
Characteristics
1. Statistics are Aggregates of Facts: Only those facts which are capable of being studied in relation to time, place or frequency can be called statistics. Individual, single or unconnected figures are not statistics because they cannot be studied in relation to each other. For this reason, only aggregates of facts, e.g. data relating to the I.Q. of a group of students or the academic achievement of students, are called statistics and are studied in relation to each other.
2. Statistics are Affected to a Marked Extent by Multiplicity of Causes: Statistical data relate mostly to the social sciences, where changes are due to the combined effect of many factors. We cannot study the effect of one particular cause on a phenomenon. It is only in the physical sciences that individual causes can be traced and their impact clearly known. In a statistical study in the social sciences, we come to know the combined effect of multiple causes.
3. Statistics are Numerically Expressed: Qualitative phenomena which cannot be numerically expressed, e.g. honesty, goodness, ability, cannot be described as statistics. But if we assign a numerical expression to them, they may be described as "statistics".
4. Statistics are Enumerated or Estimated according to Reasonable Standards of Accuracy: The standard of estimation and of accuracy differs from enquiry to enquiry and from purpose to purpose. There cannot be one uniform standard for all types of enquiries and for all purposes. A single student cannot be ignored while calculating the I.Q. of 100 students in a group, whereas 10 soldiers can easily be ignored while finding the I.Q. of the soldiers of a whole country.
5. Statistics are Collected in a Systematic Manner: In order to have a reasonable standard of accuracy, statistics must be collected in a very systematic manner. A rough and haphazard method of collection is not desirable, as it may lead to improper and wrong conclusions. The accuracy will also be uncertain, and such data cannot be relied upon.
6. Statistics are Collected for a Pre-determined Purpose: The investigator must have a purpose beforehand and only then start the work of collection. Data collected without any purpose are of no use. Suppose we want to know the intelligence of a section of people; we must not collect data relating to income, attitude and interest. Without a clear idea of the purpose, we will not be in a position to distinguish between necessary and unnecessary data, or between relevant and irrelevant data.
7. Statistics are Capable of being Placed in Relation to Each Other: Statistics are collected for the purpose of comparison. They must be capable of being compared, otherwise they lose much of their value and significance. Comparison can be made only if the data are homogeneous.
Importance and Scope of Statistics:
(i) Statistics in Planning: Statistics is indispensable in planning, whether in business, in economics or at government level. The modern age is termed the "age of planning", and almost all organisations in government, business or management are resorting to planning for efficient working and for formulating policy decisions. To achieve this end, statistical data relating to production, consumption, births, deaths, investment and income are of paramount importance. Today efficient planning is a must for almost all countries, particularly the developing economies, for their economic development.
(ii) Statistics in Mathematics: Statistics is intimately related to, and essentially dependent upon, mathematics. The modern theory of statistics has its foundations in the theory of probability, which in turn is a particular branch of the more advanced mathematical theory of Measure and Integration. The ever-increasing role of mathematics in statistics has led to the development of a new branch of statistics called Mathematical Statistics. Thus statistics may be considered an important member of the mathematics family. In the words of Connor, "Statistics is a branch of applied mathematics which specialises in data."
(iii) Statistics in Economics: Statistics and economics are so intermixed with each other that it seems foolish to separate them. The development of modern statistical methods has led to an extensive use of statistics in economics. All the important branches of economics (consumption, production, exchange, distribution, public finance) use statistics for the purposes of comparison, presentation, interpretation, etc. Problems such as the spending of income by different sections of the people, the production of national wealth, the adjustment of demand and supply, and the effect of economic policies on the economy indicate the importance of statistics in economics and its different branches. Statistics of public finance help us decide what taxes to impose, what subsidies to provide, how much to spend under various heads, and how much money to borrow or lend. So we cannot think of statistics without economics or of economics without statistics.
(iv) Statistics in Social Sciences: Every social phenomenon is affected to a marked extent by a multiplicity of factors, which bring about variation in observations from time to time, place to place and object to object. The statistical tools of regression and correlation analysis can be used to study and isolate the effect of each of these factors on a given observation. Sampling techniques and estimation theory are very powerful and indispensable tools for conducting any social survey, pertaining to any stratum of society, and then analysing the results and drawing valid inferences. The most important application of statistics in sociology is in the field of demography, for studying mortality (death rates), fertility (birth rates), marriages, population growth and so on. In this context Croxton and Cowden have rightly remarked: "Without an adequate understanding of the statistical methods, the investigators in the social sciences may be like the blind man groping in a dark room for a black cat that is not there. The methods of statistics are useful in an ever-widening range of human activities in any field of thought in which numerical data may be had."
(v) Statistics in Trade: As already mentioned, statistics is a body of methods for making wise decisions in the face of uncertainty. Business is full of uncertainties and risks, and we have to forecast at every step. Speculation is simply gaining or losing by way of forecasting. Can we forecast without taking the past into view? Perhaps not. The future trend of the market can be anticipated only if we make use of statistics, and failure in anticipation means failure in business. Changes in demand, supply, habits, fashion, etc. can be anticipated with the help of statistics. Statistics is of the utmost significance in determining the prices of various products, identifying the phases of boom and depression, etc. The use of statistics helps in the smooth running of a business and in reducing uncertainty, and thus contributes to the success of the business.
(vi) Statistics in Research Work: The job of a research worker is to present the results of his research before the community. The effect of a variable on a particular problem, under differing conditions, can be known by the research worker only if he makes use of statistical methods. Statistics are basic to research activities everywhere. To keep his research interests and activities alive, the researcher has to lean upon his knowledge of, and skill in, statistical methods.
Limitations of Statistics
1. Qualitative Aspect Ignored: Statistical methods do not study phenomena which cannot be expressed in quantitative terms. Such phenomena, for example health, riches and intelligence, cannot be part of the study of statistics unless the qualitative data are first converted into quantitative data.
2. It does not deal with individual items: It is clear from the definition given by Prof. Horace Secrist, "By statistics we mean aggregates of facts … and placed in relation to each other", that statistics deals only with aggregates of facts or items and does not recognize any individual item. Thus individual items, such as the death of 6 persons in an accident or the 85% result of a class of a school in a particular year, do not amount to statistics, as they are not placed in a group of similar items. Statistics does not deal with individual items, however important they may be.
3. It does not depict the entire story of a phenomenon: Any phenomenon is due to many causes, but not all of these causes can be expressed in terms of data, so we cannot always reach correct conclusions. The development of a group depends upon many social factors, such as the parents' economic condition, education, culture, region and administration by government, but not all of these factors can be expressed as data. We analyse only what is available quantitatively and not what is qualitative, so the results or conclusions are not 100% correct, because many aspects are ignored.
4. It is liable to be misused: As W.I. King points out, "One of the short-comings of statistics is that they do not bear on their face the label of their quality." We can check the data and the procedure by which conclusions are reached, but the data may have been collected by inexperienced, dishonest or biased persons. Statistics is a delicate science and can easily be misused by an unscrupulous person, so data must be used with caution; otherwise the results may prove to be disastrous.
5. Laws are not exact: The two fundamental laws of statistics, (i) the Law of Inertia of Large Numbers and (ii) the Law of Statistical Regularity, are not as exact as the laws of the natural sciences. They are based on probability, so their results will not always be as reliable as those of scientific laws. On the basis of probability or interpolation, we can only estimate the production of paddy in 2008; we cannot claim that the estimate will be 100% accurate. Only approximations are made.
6. Results are true only on average: Results are obtained by interpolation, time series, regression or probability, and they are not absolutely true. If the average marks in statistics of two sections of students are the same, it does not mean that all 50 students of section A have got the same marks as those in section B. There may be much variation between the two. So we get results that hold only on average.
"Statistics largely deals with averages and these averages may be made up of individual items radically different from each other." - W.I. King
7. Too many methods to study problems: In this subject we use many methods to find a single result. Variation can be found by quartile deviation, mean deviation or standard deviation, and the results vary in each case.
"It must not be assumed that the statistics is the only method to use in research, neither should this method be considered the best attack for the problem." - Croxton and Cowden
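Limitations 6 and 7 can be illustrated with a short calculation. The sketch below uses two hypothetical sets of marks (made up for illustration, not taken from any source) that have the same mean but very different spread, and then measures the spread in three ways. The quartiles are computed with the default method of Python's `statistics.quantiles`, which is only one of several conventions in use, so a textbook calculation may give a slightly different quartile deviation.

```python
import statistics

# Hypothetical marks of two sections; both have the same average.
section_a = [48, 49, 50, 51, 52]
section_b = [10, 30, 50, 70, 90]

def dispersion(marks):
    """Return the mean and three measures of variation for a list of marks."""
    mean = statistics.mean(marks)
    q1, _, q3 = statistics.quantiles(marks, n=4)  # quartiles, default method
    return {
        "mean": mean,
        "quartile deviation": (q3 - q1) / 2,
        "mean deviation": sum(abs(x - mean) for x in marks) / len(marks),
        "standard deviation": statistics.pstdev(marks),  # population S.D.
    }

result_a = dispersion(section_a)
result_b = dispersion(section_b)

for name, result in (("Section A", result_a), ("Section B", result_b)):
    print(name)
    for measure, value in result.items():
        print(f"  {measure:<20} {value:.2f}")
```

Both sections report the same mean, yet every measure of variation is far larger for section B, and within each section the three measures give three different figures.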
Data
The facts and figures which can be numerically measured are studied in statistics. A numerical measure of a characteristic is known as an observation, and a collection of observations is termed data. Data are collected by individual research workers or by organizations through sample surveys or experiments, keeping in view the objectives of the study. The data collected may be:
• Primary Data
• Secondary Data
Primary Data
Primary data means raw data (data that has not been processed or tailored) which has just been collected from the source and has not undergone any kind of statistical treatment such as sorting and tabulation. The term primary data is sometimes also used to refer to first-hand information.
Sources of Primary Data: The sources of primary data are primary units such as basic experimental units, individuals and households. (Published data and data collected in the past are called secondary data.) The following methods are usually used to collect data from primary units; the choice of method depends on the nature of the primary unit.
• Personal Investigation: The researcher conducts the experiment or survey himself/herself and collects the data from it. The data collected are generally accurate and reliable. This method is feasible only for small-scale laboratory or field experiments and pilot surveys; it is not practicable for large-scale experiments and surveys because it takes too much time.
• Investigators: Trained (experienced) investigators are employed to collect the required data. In surveys, they contact the individuals and fill in the questionnaires after asking for the required information. A questionnaire is an inquiry form with a number of questions designed to obtain information from the respondents. This method is used by most organizations and gives reasonably accurate information, but it is very costly and may also be time-consuming.
• Questionnaire: The required information (data) is obtained by sending a questionnaire (printed or in soft form) by mail to the selected individuals (respondents), who fill it in and return it to the investigator. This method is relatively cheap compared with the "through investigators" method, but the non-response rate is very high, as most respondents do not bother to fill in the questionnaire and send it back.
• Local Sources: Local representatives or agents are asked to send the requisite information, which they provide on the basis of their own experience. This method is quick but gives only rough estimates.
• Telephone: The information may be obtained by contacting individuals by telephone. It is quick and provides the required information accurately.
• Internet: With the introduction of information technology, people may be contacted through the internet and asked to provide the pertinent information. Google survey is widely used nowadays as an online method of data collection. There are many paid online survey services too.
Secondary Data
Secondary data is data which has already been collected by someone else and which may have been sorted, tabulated and given statistical treatment. It is processed or tailored data, as opposed to raw data.
Sources of Secondary Data
• Government Organizations: Federal and Provincial Bureau of Statistics, Crop Reporting Service-Agriculture Department, Census and Registration Organization, etc.
• Semi-Government Organizations: Municipal committees, District Councils, commercial and financial institutions such as banks, etc.
• Teaching and Research Organizations: Research journals and newspapers
• Internet
Primary and Secondary Data in Statistics: The difference between primary and secondary data is that primary data is collected first-hand by a researcher (an organization, person, authority, agency, party, etc.) through experiments, surveys, questionnaires, focus groups, interviews and measurements, while secondary data has already been collected by someone else and is readily available to the public through publications, journals and newspapers.
Classification of Data:
Data classification is broadly defined as the process of organizing data by relevant categories so that it may be used and protected more
efficiently. On a basic level, the classification process makes data easier to locate and retrieve. Data classification is of particular
importance when it comes to risk management, compliance, and data security.
Data classification involves tagging data to make it easily searchable and trackable. It also eliminates multiple duplications of data, which
can reduce storage and backup costs while speeding up the search process. Though the classification process may sound highly technical,
it is a topic that should be understood by your organization's leadership.
Data is of two types: qualitative data and quantitative data. Qualitative data are data that represent a quality, whereas quantitative data are data that represent a numeric quantity.
Definition of Classification of Data:
According to Secrist, "Classification is the process of arranging data into sequences and groups according to their common characteristics".
In other words, classification of data is the process of organizing data into groups according to various parameters. The most crucial parameter is the similarities that exist among the data.
For example, the students who have registered for a sports event can be classified on the following bases:
• Gender
• Age
• Weight
• Height
• Institutions/Colleges
• Sports played by them, etc.
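The idea of classifying the same records on different bases can be sketched in a few lines of Python. The registration records below are hypothetical (the names and values are made up for illustration), and `classify` is an illustrative helper, not a standard function.

```python
from collections import Counter

# Hypothetical registration records for a sports event.
students = [
    {"name": "Asha", "gender": "Female", "age": 19, "college": "College A", "sport": "Badminton"},
    {"name": "Ravi", "gender": "Male", "age": 20, "college": "College B", "sport": "Cricket"},
    {"name": "Meena", "gender": "Female", "age": 19, "college": "College A", "sport": "Cricket"},
    {"name": "Imran", "gender": "Male", "age": 21, "college": "College A", "sport": "Cricket"},
]

def classify(records, basis):
    """Count how many records fall into each group of the chosen basis."""
    return Counter(record[basis] for record in records)

# The same data, classified on four different bases.
for basis in ("gender", "age", "college", "sport"):
    print(f"{basis:<8}", dict(classify(students, basis)))
```

Each call groups the same four students by a different common characteristic, which is what classification means in Secrist's definition.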
Functions of Classification of Data:
1. Studying relations - Classifying the collected data helps in analysing and studying the relationships within the data. Moreover, the organization of statistical data enables effective decision making.
2. Condensing the data - Sometimes the data collected for statistical work are extensive and raw. In order to make decisions based on the data, it is crucial to make the data more comprehensible. This can be done with the help of tabulation. Hence, classifying the data provides a condensed form of it that can be easily understood.
3. Treatment of data - Data collected from various sources are meaningless by themselves and must be processed in order to be useful for decision making. It is difficult to treat raw and unclassified data, so it is important to classify the data first. Classification facilitates the statistical treatment of the data.
4. Comparisons - Extensive, raw and unclassified data are impossible to deal with, and no conclusion can be drawn from them without treatment and statistical analysis. Classified, organized and tabulated data enable analysts to make meaningful comparisons on various criteria.
Rules For Classifying Data:
Classification of the collected data is a very important technique in statistical treatment, and it is all the more important to remember the rules of classifying data. These rules form the backbone of, and act as guiding principles for, well-classified data. They are mentioned below:
1. Unambiguous - The classes should be rigid and unambiguous (clear). An unclear classification can have severe consequences and can affect all further statistical treatment.
2. Exhaustive - The classification must be exhaustive, in the sense that every item of data should belong to one of the classes or categories.
3. Stability - In order to facilitate effective comparison, the classification should be stable, in the sense that the same classification pattern must be adopted throughout the analysis. Adopting different classification techniques for the same analysis would lead to ambiguity.
4. Suitable for the purpose - It is crucial to remember the objective of the report or analysis while classifying data. Avoid classifying the data in a manner that does not suit the purpose of the inquiry.
5. Flexibility - It is important to classify data in a manner that allows future modification. Due to changing conditions, the statistical methods and the classification may need to change. In such a situation, a flexible classification of data would solve many issues.
Problems With Classifying Data:
Classification of data has many functions and various benefits, but there are also some key issues in organizing data. The most important problems associated with it are mentioned below:
1. Organizing data can be a very tedious and complex task for many companies or individuals.
2. Classifying data is largely a matter of judgement and can lead to misjudgements. These misjudgements can often cause a lot of inconvenience and errors.
3. Redoing the entire process of classification can be very time consuming and nerve-racking.
4. Classifying data usually requires the help of a statistical analyst.
5. Data cannot be classified properly without at least moderate knowledge of the subject.
Organization of Data:
1. Chronological Classification - The chronological classification of data emphasizes the time of occurrence. Under this type of classification, data are classified on the basis of differences in time. Time series data (used frequently in economic and business statistics) are an example of data classified in a chronological manner.
2. Geographical Classification - The geographical classification of data emphasizes the geographical representation of the data. Under this type of classification, data are classified on the basis of geographical boundaries and differences in location. Classifications based on states, cities and districts, or on countries and continents, are examples of geographical classification.
3. Qualitative Classification - The qualitative classification of data emphasizes certain qualitative characteristics of the data. Under this type of classification, data are classified on the basis of qualitative attributes. Classifications based on qualities such as honesty, intelligence and aptitude are examples of qualitative classification.
4. Quantitative Classification - The quantitative classification of data emphasizes certain quantitative characteristics of the data. Under this type of classification, data are classified on the basis of quantitative measurements. Classifications based on quantities such as sales, profits, age, height and weight are examples of quantitative classification.
Introducing Tabulation:
Tabulation refers to the process of arranging all the collected data in a tabular format. Tabulation is also the systematic presentation of
data in rows and columns. Rows are horizontal arrangements whereas columns are vertical arrangements. Tabulation is an important
device for presenting data in a condensed manner that is easily understandable and furnishes maximum information. It also facilitates easy
comparison between 2 or more parameters.
There are 7 key parts of a table
1. Table number
2. Table title
3. Headnotes (also known as prefatory notes)
4. Captions
5. The body of the table
6. Foot-note
7. Source note
Tabulation is mandatory to create charts and graphical representations. Data, tabulation and these diagrammatic representations are very
important in the process of policy making, decision making and formulation of strategies.
STEPS FOR EFFECTIVE DATA CLASSIFICATION
1. Understand the Current Setup: Taking a detailed look at the location of current data and all regulations that pertain to your
organization is perhaps the best starting point for effectively classifying data. You must know what data you have before you
can classify it.
2. Creating a Data Classification Policy: Staying compliant with data protection principles in an organization is nearly impossible
without proper policy. Creating a policy should be your top priority.
3. Prioritize and Organize Data: Now that you have a policy and a picture of your current data, it's time to properly classify the data. Decide on the best way to tag your data based on its sensitivity and privacy.
Difference between classification and tabulation
BASIS FOR COMPARISON | CLASSIFICATION | TABULATION
Meaning | Classification is the process of grouping data into different categories, on the basis of nature, behavior, or common characteristics. | Tabulation is a process of summarizing data and presenting it in a compact form, by putting data into a statistical table.
Order | After data collection | After classification
Arrangement | Attributes and variables | Columns and rows
Purpose | To analyse data | To present data
Bifurcates data into | Categories and sub-categories | Headings and sub-headings
Requisites of good statistical table
1. Suit the purpose
2. Scientifically prepared
3. Clarity
4. Manageable size
5. Columns and rows should be numbered
6. Suitably approximated
7. Attractive getup
8. Units should be mentioned
9. Averages & totals should be given
10. Logically arranged
11. Proper lettering
Frequency
The frequency of any value is the number of times that value appears in a data set. For example, if in a set of children's favourite colours two children like the colour blue, the frequency of blue is two. To make sense of raw data, we must organise it, and finding the frequency of each data value is how this organisation is done.
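Counting frequencies is a one-line operation in Python. The sketch below uses a hypothetical list of children's favourite colours (made up for illustration) in which blue appears twice.

```python
from collections import Counter

# Hypothetical raw data: the favourite colour of each child.
favourite_colours = ["red", "blue", "green", "blue", "red", "yellow", "red"]

# Counter maps each distinct value to the number of times it appears.
frequency = Counter(favourite_colours)

for colour, count in frequency.items():
    print(f"{colour:<7} {count}")
```

Here the frequency of "blue" is 2 and the frequency of "red" is 3; the frequencies always add up to the number of observations.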
Frequency Distribution
Many times it is not easy or feasible to find the frequency of data from a very large dataset. So to make sense of the data we make a
frequency table and graphs.
Types of Frequency Distribution: Frequency distributions are further classified into five types. These are:
1. Exclusive Series
2. Inclusive Series
3. Open End Series
4. Cumulative Frequency Series
5. Mid-Values Frequency Series
Exclusive Series
In such a series, for a particular class interval, all the data items having values from its lower limit up to just below its upper limit are counted in the class interval. In other words, we do not include items whose values are less than the lower limit, or equal to or greater than the upper limit. Note that here the upper limit of a class repeats itself as the lower limit of the next interval. This is the most used type of frequency distribution.
Weight   Frequency
40-50    2
50-60    10
60-70    5
70-80    3
Inclusive Series
In contrast to an exclusive series, an inclusive series counts items equal to both the lower and the upper limit of a class. Only items
with values less than the lower limit or greater than the upper limit are left out of the class.
Marks Frequency
10-19 5
20-29 13
30-39 6
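The difference between the two kinds of class can be sketched in a few lines of Python. This is a minimal illustration rather than part of the original notes: the function names `in_exclusive_class`, `in_inclusive_class` and `tally`, and the sample weights and marks, are all hypothetical.

```python
def in_exclusive_class(value, lower, upper):
    """Exclusive series: lower limit included, upper limit excluded."""
    return lower <= value < upper


def in_inclusive_class(value, lower, upper):
    """Inclusive series: both limits included."""
    return lower <= value <= upper


def tally(values, classes, belongs):
    """Count how many values fall in each (lower, upper) class."""
    return {
        f"{lower}-{upper}": sum(1 for v in values if belongs(v, lower, upper))
        for lower, upper in classes
    }


# Hypothetical raw data, chosen so that some values sit on the class limits.
weights = [40, 45, 50, 50, 59.5, 60, 70, 79]
marks = [10, 19, 20, 29, 30, 39]

exclusive = tally(weights, [(40, 50), (50, 60), (60, 70), (70, 80)], in_exclusive_class)
inclusive = tally(marks, [(10, 19), (20, 29), (30, 39)], in_inclusive_class)
print(exclusive)
print(inclusive)
```

In this sketch a weight of exactly 50 is counted in the class 50-60 and not in 40-50, while a mark of exactly 19 is counted in the class 10-19.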
Open End Series
In an open-end series, the lower limit of the first class and the upper limit of the last class are missing. Instead,
the first class is written as 'below' its upper limit and the last class is written as its lower limit 'and above'.
Age Frequency
Below 5 4
5-10 6
10-20 10
20 and above 8
Cumulative Frequency Series
In a cumulative frequency series, the frequency shown against a class is a running total. For a 'less than' series we add the frequencies
of all the preceding class intervals; for a 'more than' series we subtract them from the total. Accordingly, the classes are restated as
either 'less than the upper limit' or 'more than the lower limit'.
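As a rough sketch, both cumulative series can be built from the exclusive weight table given earlier (40-50: 2, 50-60: 10, 60-70: 5, 70-80: 3). The variable names below are illustrative only.

```python
from itertools import accumulate

# Exclusive series taken from the weight table earlier in these notes.
classes = [(40, 50), (50, 60), (60, 70), (70, 80)]
frequencies = [2, 10, 5, 3]
total = sum(frequencies)

# 'Less than' series: running total up to each upper limit.
less_than = list(accumulate(frequencies))

# 'More than' series: total minus everything below each lower limit.
more_than = [total - below for below in [0] + less_than[:-1]]

for (lower, upper), lt, mt in zip(classes, less_than, more_than):
    print(f"less than {upper}: {lt:2d}    more than {lower}: {mt:2d}")
```

The last 'less than' figure and the first 'more than' figure both equal the total number of items, which is a quick check on the arithmetic.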
Mid-Values Frequency Series
A mid-value frequency series is one in which we have the mid values of the class intervals and the corresponding frequencies. In other
words, each mid value stands for the whole range of its class interval.
GRAPH OF DATA FREQUENCY
1. Histogram
2. Bar Graphs
3. Polygons
4. Pie Chart
5. Line Graphs
6. Ogive Graph / Cumulative Frequency
Histogram
A histogram is a plot that lets you discover, and show, the underlying frequency distribution (shape) of a set of continuous data. This
allows the inspection of the data for its underlying distribution (e.g., normal distribution), outliers, skewness, etc. An example of a
histogram, and the raw data it was constructed from, is shown below:
36 25 38 46 55 68 72 55 36 38
67 45 22 48 91 46 52 61 58 55
Constructing a histogram from a continuous variable
To construct a histogram from a continuous variable you first need to split the data into intervals, called bins. In the example above, the
scores have been split into bins, with each bin covering a range of 10 starting at 20. Each bin contains the number of occurrences of
scores in the data set that fall within that bin. For the above data set, the frequencies in each bin have been tabulated along with
the scores that contributed to the frequency in each bin.
Bin | Frequency | Scores Included in Bin
20-30 | 2 | 25, 22
30-40 | 4 | 36, 38, 36, 38
40-50 | 4 | 46, 45, 48, 46
50-60 | 5 | 55, 55, 52, 58, 55
60-70 | 3 | 68, 67, 61
70-80 | 1 | 72
80-90 | 0 | -
90-100 | 1 | 91
Notice that, unlike a bar chart, there are no "gaps" between the bars (although some bars might be "absent" reflecting no frequencies).
This is because a histogram represents a continuous data set, and as such, there are no gaps in the data (although you will have to decide
whether you round up or round down scores on the boundaries of bins).
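The binning described above can be reproduced with a short script. This is a sketch under one stated assumption: a score that falls exactly on a boundary is placed in the higher bin (lower limit included, upper limit excluded). The helper name `bin_scores` is hypothetical.

```python
# The 20 raw scores listed above.
scores = [36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
          67, 45, 22, 48, 91, 46, 52, 61, 58, 55]


def bin_scores(values, start, stop, width):
    """Group values into bins [lower, lower + width) from start up to stop."""
    bins = {}
    for lower in range(start, stop, width):
        upper = lower + width
        bins[(lower, upper)] = [v for v in values if lower <= v < upper]
    return bins


bins = bin_scores(scores, start=20, stop=100, width=10)

# A text histogram: one '#' per score, with no gaps between adjacent bins.
for (lower, upper), members in bins.items():
    print(f"{lower:3d}-{upper:<3d} {len(members)} {'#' * len(members)}")
```

The empty 80-90 bin still gets its own row, which corresponds to the "absent" bar mentioned above.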
Bar graph
A bar graph is a chart that uses bars to show comparisons between categories of data. The bars can be either horizontal or vertical. Bar
graphs with vertical bars are sometimes called vertical bar graphs. A bar graph will have two axes. One axis will describe the types of
categories being compared, and the other will have numerical values that represent the values of the data. It does not matter which axis is
which, but it will determine what bar graph is shown. If the descriptions are on the horizontal axis, the bars will be oriented vertically, and
if the values are along the horizontal axis, the bars will be oriented horizontally.
Types of Bar Graphs
There are many different types of bar graphs. They are not always interchangeable. Each type will work best with a different type of
comparison. The comparison you want to make will help determine which type of bar graph to use. First we'll discuss some simple bar
graphs.
Vertical bar graph:- A simple vertical bar graph is best when you have to compare two or more independent variables. Each variable
relates to a fixed value. The values are positive, so every bar can start from the horizontal axis.
Horizontal bar graph:- If your data has negative and positive values but is still a comparison between two or more fixed independent
variables, it is best suited to a horizontal bar graph. The vertical axis can be placed in the middle of the horizontal axis, allowing
both negative and positive values to be represented.
Range bar graph:- A range bar graph represents a range of data for each independent variable. Temperature ranges or price ranges are
common sets of data for range graphs. Unlike the above graphs, the bars do not start from a common zero point but begin at the low
value of that particular variable's range. A range bar graph can be either horizontal or vertical.
Difference Between A Bar Chart And A Histogram
The major difference is that a histogram is only used to plot the frequency of score occurrences in a continuous data set that has been
divided into classes, called bins. Bar charts, on the other hand, can be used for a great deal of other types of variables including ordinal
and nominal data sets.
Polygons
A frequency polygon is closely related to a histogram and is used to compare sets of data or to display a cumulative frequency
distribution. It uses a line graph to represent quantitative data.
Statistics deals with the collection of data and information for a particular purpose. For example, tabulating the runs scored off each
ball in a cricket match gives the statistics of the game. Tables, graphs, pie charts, bar graphs, histograms, polygons etc. are used to
represent statistical data pictorially.
Let us now see how to draw a frequency polygon, which is a visually effective method of representing quantitative data and its
frequencies.
To draw a frequency polygon, we begin by drawing a histogram and then follow these steps:
Step 1- Choose the class intervals and mark the values on the horizontal axis.
Step 2- Mark the mid value of each interval on the horizontal axis.
Step 3- Mark the frequency of each class on the vertical axis.
Step 4- Corresponding to the frequency of each class interval, mark a point at that height above the middle of the class interval.
Step 5- Connect these points using line segments.
Step 6- The obtained representation is a frequency polygon.
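The steps above can be sketched in code by computing the points that the line segments join. The data are the class intervals and frequencies from the 2014 frequency polygon question in Unit 1. The function name `polygon_points` and the optional `close` flag are hypothetical. Closing the polygon with zero-frequency points half a class width beyond each end is a common convention, not a step listed above.

```python
def polygon_points(classes, frequencies, close=False):
    """Return (mid value, frequency) points for a frequency polygon."""
    points = [((lower + upper) / 2, f)
              for (lower, upper), f in zip(classes, frequencies)]
    if close:
        # Assumed convention: add zero-frequency points at the mid values
        # of imaginary classes just before the first and after the last class.
        first_lower, first_upper = classes[0]
        last_lower, last_upper = classes[-1]
        start = first_lower - (first_upper - first_lower) / 2
        end = last_upper + (last_upper - last_lower) / 2
        points = [(start, 0)] + points + [(end, 0)]
    return points


classes = [(15, 25), (25, 35), (35, 45), (45, 55), (55, 65), (65, 75)]
frequencies = [10, 16, 18, 15, 13, 4]

points = polygon_points(classes, frequencies)
closed = polygon_points(classes, frequencies, close=True)
for mid, f in points:
    print(f"mid value {mid:5.1f} -> frequency {f}")
```

Joining consecutive points with straight lines (Step 5) gives the polygon. A plotting library could draw it, but the points alone are enough to plot by hand.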
Solution: The following steps are followed to construct a histogram from the given data:
• The heights are represented on the horizontal axis on a suitable scale as shown.
• The number of students is represented on the vertical axis on a suitable scale as shown.
• Rectangular bars are drawn with widths equal to the class size and lengths corresponding to the frequency of each class
interval.
• ABCDEF represents the given data graphically in the form of a frequency polygon.
PIE CHART
A pie chart (or a circle chart) is a circular statistical graphic, which is divided into slices to illustrate numerical proportion. In a pie chart,
the arc length of each slice (and consequently its central angle and area), is proportional to the quantity it represents. While it is named for
its resemblance to a pie which has been sliced, there are variations on the way it can be presented. The earliest known pie chart is
generally credited to William Playfair's Statistical Breviary of 1801.
Represent the following data by a Pie chart?
Food 87
Clothing 24
Recreation 11
Education 13
Rent 25
Miscellaneous 20
Item | Expenditure | Percentage | Degrees
Food | 8700 | 48.33 | 174
Clothing | 2400 | 13.33 | 48
Recreation | 1100 | 6.11 | 22
Education | 1300 | 7.22 | 26
Rent | 2500 | 13.89 | 50
Miscellaneous | 2000 | 11.11 | 40
Total salary | 18000 | 100 | 360
Convert percentage to degrees: Degrees = (360 × Percentage) / 100
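The percentage and degree columns can be checked with a small calculation. This is a minimal sketch using the six amounts from the question. The function name `pie_slices` is hypothetical.

```python
# Amounts from the pie chart question above.
expenses = {
    "Food": 87,
    "Clothing": 24,
    "Recreation": 11,
    "Education": 13,
    "Rent": 25,
    "Miscellaneous": 20,
}


def pie_slices(data):
    """Return {label: (percentage, degrees)} for each slice of the pie."""
    total = sum(data.values())
    slices = {}
    for label, amount in data.items():
        percentage = 100 * amount / total
        degrees = 360 * percentage / 100  # same conversion as in the notes
        slices[label] = (percentage, degrees)
    return slices


slices = pie_slices(expenses)
for label, (percentage, degrees) in slices.items():
    print(f"{label:<14} {percentage:6.2f} %  {degrees:6.1f} degrees")
```

The degrees of all slices should add up to 360, which is a useful check before drawing the chart.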
Line Graphs
Line Graphs are used to display quantitative values over a continuous interval or time period. A Line Graph is most frequently used to
show trends and analyse how the data has changed over time. Line Graphs are drawn by first plotting data points on a Cartesian
coordinate grid, then connecting all of these points with a line. Typically, the y-axis has a quantitative value, while the x-axis is a
timescale or a sequence of intervals. Negative values can be displayed below the x-axis.
The direction of the lines on the graph works as a nice metaphor for the data: an upward slope indicates where values have increased and a
downward slope indicates where values have decreased. The line's journey across the graph can create patterns that reveal trends in a
dataset.
When grouped with other lines (other data series), individual lines can be compared to one another. However, avoid using more than 3-4
lines per graph, as this makes the chart more cluttered and harder to read. A solution to this is to divide the chart into smaller multiples
(have a small Line Graph for each data series).
[Figure: line graph of the expenditure data (Food 8700, Clothing 2400, Recreation 1100, Education 1300, Rent 2500, Miscellaneous 2000; total salary 18000)]
Ogive Graph / Cumulative Frequency
An ogive (oh-jive), sometimes called a cumulative frequency polygon, is a type of frequency polygon that shows cumulative frequencies.
In other words, the cumulative percents are added on the graph from left to right.
An ogive graph plots cumulative frequency on the y-axis and class boundaries along the x-axis. It's very similar to a histogram, only
instead of rectangles, an ogive has a single point marking where the top right of the rectangle would be. It is usually easier to create this
kind of graph from a frequency table.
Draw an Ogive Graph: Example question: Draw an ogive graph for the following set of data:
02, 07, 16, 21, 31, 03, 08, 17, 21, 55, 03, 13, 18, 22, 55, 04, 14, 19, 25, 57, 06, 15, 20, 29, 58.
Step 1: Make a relative frequency table from the data. The first column has the class limits, the second column has the frequency (the
count) and the third column has the relative frequency (class frequency / total number of items):
Step 2: Add a fourth column and cumulate (add up) the frequencies in column 2, going down from top to bottom. For example, the second
entry is the sum of the first row and the second row in the frequency column (5 + 5 = 10), and the third entry is the sum of the first,
second, and third rows in the frequency column (5 + 5 + 6 = 16):
Step 3: Add a fifth column and cumulate the relative frequencies from column 3. If you do this step correctly, your values should add up
to 100% (or 1 as a decimal):
Step 4: Draw an x-y graph with percent cumulative relative frequency on the y-axis (from 0 to 100%, or as a decimal, 0 to 1). Mark the x-
axis with the class boundaries.
Step 5: Plot your points. Note: Each point should be plotted on the upper limit of the class boundary. For example, if your first class
boundary is 0 to 10, the point should be plotted at 10.
Step 6: Connect the dots with straight lines. The ogive is one continuous line, made up of several smaller lines that connect pairs of dots,
moving from left to right.
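The six steps can be sketched as a short script that produces the points to plot. One assumption is made loudly here: the frequency table used in the worked steps is not reproduced in these notes, so the sketch uses classes of width 10 (0-10, 10-20, and so on, lower limit included and upper limit excluded). Its counts therefore differ from the 5, 5, 6 quoted in Step 2. The function name `ogive_points` is hypothetical.

```python
# The 25 observations from the example question.
data = [2, 7, 16, 21, 31, 3, 8, 17, 21, 55, 3, 13, 18, 22, 55,
        4, 14, 19, 25, 57, 6, 15, 20, 29, 58]


def ogive_points(values, start, stop, width):
    """Return (upper boundary, cumulative frequency, cumulative relative frequency)."""
    n = len(values)
    running = 0
    points = []
    for lower in range(start, stop, width):
        upper = lower + width
        # Steps 1-2: class frequency, accumulated from top to bottom.
        running += sum(1 for v in values if lower <= v < upper)
        # Steps 3-5: the point is plotted at the upper boundary of the class.
        points.append((upper, running, running / n))
    return points


points = ogive_points(data, start=0, stop=60, width=10)
for upper, cumulative, relative in points:
    print(f"x = {upper:2d}   cumulative = {cumulative:2d}   relative = {relative:.2f}")
```

Connecting these points from left to right (Step 6) gives the ogive. The last cumulative relative frequency must equal 1, that is 100%.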
Draw Histogram, Bar Graph, Polygon, Pie Chart, Line Graph, Ogive Graph / Cumulative Frequency
Q.1
X: 0-9 10-19 20-29 30-39 40-49 50-59
F: 5 8 7 11 9 10
Q.2
Section | Average marks in Mathematics | No. of Students
A | 75 | 50
B | 60 | 60
C | 55 | 50
Q.3
Wages (Rs): 0-10 10-20 20-30 30-40 40-50 50-60
No. of workers: 25 30 45 15 25 30
[Ans: 35, 40]
Q.4
Marks: 0-10 10-20 20-30 30-40 40-50 50-60 60-70
Frequency: 50 10 20 40 20 30 30
[Ans: 30; 28]
Q.5
Marks | No. of Students | Marks | No. of Students
Less than 10 | 5 | 40-50 | 10
Less than 20 | 20 | 50 and above | 25
10-30 | 35 | 60 and above | 9
30 and above | 60 | |
UNIT 2
STATISTICS
UNIVERSITY PREVIOUS YEAR QUESTION PAPERS
2016
Q.1 Calculate the average daily sales from the following data by the assumed mean method.
Daily sales 40 50 60 70 80
No. of salesmen 5 6 10 12 3
Ans: 60.55
Q.2 Find the median from the following table:
Daily Wages | No. of employees | Daily Wages | No. of employees
50-59 | 15 | 90-99 | 45
60-69 | 40 | 100-109 | 40
70-79 | 50 | 110-119 | 15
80-89 | 60 | |
Ans: 84.08
2015
Q.1 What do you mean by arithmetic mean? Discuss its merits and demerits. Also state its important properties.
Q.2 An incomplete distribution is given below:
Class 0-10 10-20 20-30 30-40 40-50 50-60 60-70
Frequency 10 20 ? 40 ? 25 12
Total frequencies if median value is 35 (Ans: 35, 25)
Q.3 Calculate the mode from the following:
Marks 0-10 10-20 20-40 40-50 50-70
No. of students 2 7 18 15 8
Ans: 35.71
2014
Q.1 What do you mean by central tendency? What are the common measures of central tendency?
Q.2 Given below is the distribution of weights of a group of 60 students in a class:
Weight 30-34 35-39 40-44 45-49 50-54 55-59 60-64
No. of students 3 5 12 18 14 6 2
Ans: 47.5
Q.3 Find the geometric mean from the following data:
Diameter 130 135 140 145 143 148 149 150
No. of screws 3 4 6 6 3 5 2 2
Ans: 142.3
2013
Q.1 "The purpose of an average is to represent a group of individual values in a simple and concise manner so that a quick understanding
of the general size of the individuals in the group can be made easily." Explain.
Q.2 Find the missing frequency from the following data:
Class interval 0-10 10-20 20-30 30-40 40-50
Frequency 3 5 ? 3 2
The mean of the distribution is 23. Ans: 7
Q.3 Calculate the median from the following data:
Value | Frequency | Value | Frequency
Less than 10 | 4 | Less than 50 | 96
Less than 20 | 16 | Less than 60 | 112
Less than 30 | 40 | Less than 70 | 120
Less than 40 | 76 | Less than 80 | 125
Ans: 36.25
Measures of Central Tendencies; Mean, Median, Mode, Geometric Mean.
MEASURES OF CENTRAL TENDENCY
Meaning
The word 'measures' means 'methods' and the term 'central tendency' means the 'average value' of any statistical series. Thus, the
combined term 'measures of central tendency' means the methods of finding out the central value or average value of a statistical
series or any series of quantitative information.
Definitions
According to Croxton and Cowden, "An average value is a single value within the range of the data that is used to represent
all the values in the series. Since an average is somewhere within the range of the data, it is sometimes called a measure of
central value."
In the words of Clark, "Average is an attempt to find one single figure to describe whole of figures."
Characteristics
1. It is a single figure expressed in some quantitative form.
2. It lies between the extreme values of a series.
3. It is a typical value that represents all the values of a series.
4. It is capable of giving a central idea about the series it represents.
5. It is determined by some method or procedure.
Essentials of a Good Average
1. It should have a clear definition: The definition of an average should be clear and unambiguous. It should be defined in
the form of an algebraic formula, so that each person calculating the average from a set of data arrives at the same
figure.
2. It should be simple to understand and easy to calculate: An average should be simple enough that everybody is able to
understand it without any doubt about its meaning. The method of calculation should be such that everybody can
calculate it easily.
3. It should be based on all the observations: An average is not representative unless the entire data are taken for its
calculation. So in order to make an average ideal it should be based on all the items of the series.
4. It should be suitable for further mathematical treatment: An ideal average should possess some important
mathematical properties, so that it is easy for the person using it to carry out further mathematical or
statistical analysis. The use of the average should not be restricted to a single purpose; rather, the average
should be usable for the calculation of other statistical measures like dispersion, correlation, regression and others.
5. It should not be affected by extreme items: In a sample, there may be wide variation of figures. The extreme items,
i.e., the highest and lowest values, may be much higher or lower than the other values. In such a case, an average
that is greatly influenced by these extreme values cannot be treated as the true representative of
the whole distribution.
Various Measures of Central Tendency
A. Mathematical Averages:
• Arithmetic Average or Mean
• Geometric Mean
• Harmonic Mean
B. Positional Averages:
• Median
• Mode
• Quartiles
• Deciles
• Percentiles
C. Miscellaneous Averages:
• Moving Average
• Progressive Average
Mean
"Mean of a series is the sum of the values of a variable divided by the number of observations."
$\bar{X} = \frac{\sum X}{N}$
Method | Individual Series | Discrete Series | Continuous Series
Direct method | $\bar{X} = \frac{\sum X}{N}$ | $\bar{X} = \frac{\sum FX}{N}$ | $\bar{X} = \frac{\sum Fm}{N}$
Short cut method | $\bar{X} = A + \frac{\sum d}{N}$ | $\bar{X} = A + \frac{\sum Fd}{N}$ | $\bar{X} = A + \frac{\sum Fd}{N}$
Step deviation method | $\bar{X} = A + \frac{\sum d'}{N} \times i$ | $\bar{X} = A + \frac{\sum Fd'}{N} \times i$ | $\bar{X} = A + \frac{\sum Fd'}{N} \times i$
Shortest method | $\bar{X} = m_L - i\left(\frac{\sum CF}{N} - 1\right)$, where $m_L$ = mid value of last class
Combined Mean
$\bar{X}_{1.2.3} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2 + N_3\bar{X}_3}{N_1 + N_2 + N_3}$
Properties of Arithmetic Mean
1. The sum of the deviations of the items from the actual mean is always zero.
2. The sum of the squares of deviations of items from the arithmetic mean is minimum i.e., less than the sum of the squares
of deviations of items from any other value.
3. The sum of the given values of a series is equal to the product of their arithmetic average and number of items of the
series.
4. The number of items of a series is equal to the quotient of the sum of the values of the items and their
arithmetic mean.
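The direct, short cut (assumed mean) and step deviation methods, the combined mean and the first property can all be checked numerically. The sketch below uses the daily sales data from the 2016 question (answer 60.55) and the three sections from the graph exercise Q.2. Here A is the assumed mean, d = X - A, d' = d / i and i is the common step. The function names are hypothetical, as are the choices A = 60 and i = 10.

```python
def mean_direct(values, freqs):
    """Direct method: sum(F * X) / N."""
    return sum(f * x for x, f in zip(values, freqs)) / sum(freqs)


def mean_short_cut(values, freqs, assumed):
    """Short cut method: A + sum(F * d) / N, with d = X - A."""
    total_fd = sum(f * (x - assumed) for x, f in zip(values, freqs))
    return assumed + total_fd / sum(freqs)


def mean_step_deviation(values, freqs, assumed, step):
    """Step deviation method: A + sum(F * d') / N * i, with d' = (X - A) / i."""
    total_fd = sum(f * (x - assumed) / step for x, f in zip(values, freqs))
    return assumed + total_fd / sum(freqs) * step


def combined_mean(sizes, means):
    """Combined mean of several groups, weighted by group size."""
    return sum(n * m for n, m in zip(sizes, means)) / sum(sizes)


sales = [40, 50, 60, 70, 80]
salesmen = [5, 6, 10, 12, 3]

direct = mean_direct(sales, salesmen)
short_cut = mean_short_cut(sales, salesmen, assumed=60)
step_dev = mean_step_deviation(sales, salesmen, assumed=60, step=10)

# Property 1: deviations from the actual mean sum to zero (up to rounding).
deviation_sum = sum(f * (x - direct) for x, f in zip(sales, salesmen))

# Sections A, B, C: 50, 60, 50 students with average marks 75, 60, 55.
section_mean = combined_mean([50, 60, 50], [75, 60, 55])

print(round(direct, 2), round(short_cut, 2), round(step_dev, 2))
print(deviation_sum, section_mean)
```

All three methods give the same mean, since the short cut and step deviation methods only simplify the arithmetic and do not change the result.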
Advantages of Mean
1. It is easy to understand and simple to compute.
2. It is rigidly defined and there is no scope for ambiguity or misunderstanding about its meaning and nature.
3. Its value is based on each and every item of the data. With every change in any item, the value of the average will change.
4. Arrangement of the data in ascending or descending order is not required while computing the arithmetic mean.
5. It is not much affected by fluctuations in sampling and thus its result is relatively dependable.
6. It can be reused for further statistical computations.
Disadvantages of Mean
1. In cases where extreme items are either too big or too small, the average is greatly affected by the values of these
extreme items. Thus it fails to be the true representative of the series.
2. Its value cannot be determined graphically.
3. In certain cases, the arithmetic mean may give an absurd result.
[Arithmetic Mean]
1. What do you understand by measures of averages? Explain the features and functions of averages.
2. Define the term 'averages'. Discuss the functions and types of statistical averages.
3. Explain different methods of measuring averages with examples.
4. State the various functions of measures of averages.
5. Why are averages also known as measures of central tendency? Examine the features of central tendency.
6. What is a statistical average? Explain the features of a good average.
7. What are the functions and limitations of averages?
8. What is arithmetic mean? Explain its properties, merits and limitations.
Practical Problems:
1. Find the mean income of 10 employees in an organization.
Income (Rs '000): 10.2 15.5 18.9 20.2 25.4 26.2 29.3 31.4 32.5 32.9
[Ans: 24.25]
2. The following are the daily savings of a group of workers in a factory. Calculate the average saving.
Savings (Rs): 10 11 12 13 15 16 18 20 22 23 25
No. of workers: 2 3 5 8 9 10 15 8 6 5 4
[Ans: 17.19]
3. From the following data relating to the daily wages of certain workers in a factory, compute the average wage under the direct and
short-cut methods.
Wages (Rs): 0-10 10-20 20-30 30-40 40-50 50-60 60-70
No. of workers: 7 8 9 12 16 6 2
[Ans: 33]
4. From the data given below find the mean under the step deviation method.
X: 0-9 10-19 20-29 30-39 40-49 50-59
F: 5 8 7 11 9 10
[Ans: 32.7]
5. From the following data relating to marks in Statistics secured by a batch of +3 Commerce students, find out the mean marks:
Marks above: 0 10 20 30 40 50
No. of Students: 50 40 35 27 15 8
[Ans: 30]
6. Calculate the average marks of the students from the following data:
Marks below: 10 20 30 40 50 60 70 80
No. of Students: 15 35 60 84 96 127 198 250
[Ans: 50.4]
7. From the following data compute the arithmetic average under the step deviation method:
Marks below: 100 80 60 40 20
No. of students: 60 55 40 35 5
[Ans: 45]
8. Find the missing frequencies of the following series, if the arithmetic average is 29.75 and the total number of items is 200:
Wages (Rs): 0-10 10-20 20-30 30-40 40-50 50-60
No. of workers: 25 ? 45 ? 25 30
[Ans: 35, 40]
9. Find the missing frequencies of the following series, if the arithmetic average is 39.5 and the total number of items is 100:
Marks: 0-10 10-20 20-30 30-40 40-50 50-60 60-70
Frequency: 5 10 ? 4 20 3 ?
[Ans: 30; 28]
10. From the following frequency distribution, find the value of the median:
Marks | No. of Students | Marks | No. of Students
Less than 10 | 5 | 40-50 | 10
Less than 20 | 20 | 50 and above | 25
10-30 | 35 | 60 and above | 9
30 and above | 60 | |
11. Calculate the arithmetic mean from the following data:
Wages (in Rs) | No. of workers | Wages (in Rs) | No. of workers
Less than 48 | 5 | 72-80 | 8
Less than 56 | 12 | 80 and above | 19
48-64 | 29 | 88 and above | 5
64 and above | 31 | |
12. Calculate the mean from the following data:
• 5 persons get less than Rs 5
• 12 persons get less than Rs 10
• 22 persons get less than Rs 15
• 30 persons get less than Rs 20
• 36 persons get less than Rs 25
• 40 persons get less than Rs 30
13. The average percentage of marks secured by 200 students of Arts and Commerce is 50. The mean percentage of marks of the Arts
students is 40 and that of the Commerce students is 60. Find the number of Arts and Commerce students separately. [Ans: 100; 100]
14. The average marks secured in Economics by all the Commerce and Arts students in their Board examination is 60. The average
mark of the Commerce students is 70, and that of the Arts students is 50. Find the ratio of the number of students in the Commerce
and Arts classes. [Ans: 1:1]
15. The arithmetic average of a series of 20 items has been computed as 400. While computing, two values 450 and 360 were wrongly
taken as 540 and 630. Find the correct value of the mean. [Ans: 382]
16. In a B. Com class of 128 students, 48 have failed, securing 25 marks on average. If the total marks of all the students are 5120, find
the average marks secured by the students passing the test. [Ans: 49]
GEOMETRIC MEAN
G.M. is the nth root of the product of 'n' items of a series. It is found by multiplying all the 'n' values of a series and extracting the
nth root of the product.
Direct method
Individual Series: $G.M. = \sqrt[N]{X_1 \times X_2 \times X_3 \times \cdots \times X_n}$
Discrete and Continuous Series: $G.M. = \sqrt[N]{X_1^{F_1} \times X_2^{F_2} \times X_3^{F_3} \times \cdots \times X_n^{F_n}}$
Logarithmic Method
Individual Series: $G.M. = \text{Antilog of } \frac{\sum \log X}{N}$
Discrete and Continuous Series: $G.M. = \text{Antilog of } \frac{\sum (F \log X)}{N}$
Uses of geometric Mean:
1. Geometric mean is useful in calculating the average of ratios.
2. It is useful in calculating the average of changes, i.e., percentage increase or decrease in sales, production, population, rate of
interest or any other variable.
3. It is considered the best average where more weight is to be given to small items and less weight to large items.
4. It is most suitable in constructing index numbers.
Properties of G.M.
1. The product of items of a series will remain unchanged if each item is replaced by the geometric mean.
2. The sums of the deviations of the logarithms of the original observations above and below the logarithm of the geometric mean
are equal.
3. If the geometric means and the number of items of two series are known, the combined geometric mean can be computed.
4. If the G.M. and the number of items are known, the product of the values can be found by using the formula $(G.M.)^N$.
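The logarithmic method and property 4 can be illustrated with a small sketch. It uses the series from exercise 1 of the geometric mean exercises (answer 149) and the index numbers and weights from exercise 5 (answer 139.8). The function name `geometric_mean` and its optional `weights` argument are hypothetical. Base-10 logarithms are an assumption here, and any base gives the same result.

```python
import math


def geometric_mean(values, weights=None):
    """G.M. = antilog of sum(F * log X) / N; weights default to 1 each."""
    if weights is None:
        weights = [1] * len(values)
    if any(v <= 0 for v in values):
        # The logarithm is undefined for zero or negative values.
        raise ValueError("geometric mean needs strictly positive values")
    total_log = sum(w * math.log10(v) for v, w in zip(values, weights))
    return 10 ** (total_log / sum(weights))


series = [133, 141, 125, 173, 182]
gm = geometric_mean(series)

# Weighted G.M.: index numbers with their weights (Food, Clothing,
# Fuel and Lighting, House Rent, Miscellaneous).
index_numbers = [125, 133, 141, 173, 182]
weights = [7, 5, 4, 1, 3]
weighted_gm = geometric_mean(index_numbers, weights)

print(round(gm, 1), round(weighted_gm, 1))

# Property 4: (G.M.)^N reproduces the product of the values.
print(math.isclose(gm ** len(series), math.prod(series), rel_tol=1e-9))
```

The sketch rejects zero and negative values because their logarithms do not exist. This is related to the disadvantage that a single zero item makes the G.M. zero.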
Advantages of G.M
1. It is based on all the items of the series.
2. It is capable of further algebraic treatment.
3. It is less affected by extreme items.
4. It is specially useful in determining the average of ratios and percentages.
5. It is a suitable average for determining rates of change in any variable.
6. It is very useful in the construction of an ideal index number.
7. It is hardly affected by the fluctuation of sampling.
Disadvantages of Geometric Mean
1. It is not easily understood and is difficult to calculate.
2. If any value of a series is zero, then the value of the G.M. will also be zero.
3. It gives comparatively more weights to smaller items and less weight to larger items.
Exercise B [Geometric Mean]
1. Find the G.M. of the series: 133; 141; 125; 173; 182 [Ans: 149]
2. Calculate the G.M. of the figures: 5, 10, 192, 14374, 20498, 120674. [Ans: 126.9]
3. From the following figures find the G.M.:
X: 10 20 30 40 50 60
F: 12 15 25 10 6 2
[Ans: 25.30]
4. Calculate the G.M. for the following distribution:
X: 0-10 10-20 20-30 30-40 40-50
F: 14 23 27 21 15
[Ans: 20.80]
5. Calculate the weighted Geometric Mean from the following data:
Groups | Index Number | Weights
Food | 125 | 7
Clothing | 133 | 5
Fuel and Lighting | 141 | 4
House Rent | 173 | 1
Miscellaneous | 182 | 3
[Ans: 139.8]
Median
Median refers to that value of the variable which divides the series into two equal parts, one part consists of all values
greater than the median and other part consists of all values less than the median. It is a positional average.
Direct method:
Individual & Discrete Series: $M = \text{Value of } \left(\frac{N+1}{2}\right)\text{th item}$
Continuous Series: $M = \text{Value of } \left(\frac{N}{2}\right)\text{th item}$
Interpolation method:
For ascending series: $M = L_1 + \frac{i}{f}(m - c)$
Where;
$L_1$ = Lower Limit of Median Class
$i$ = Class interval of Median Class
$f$ = Respective frequency of Median Class
$m = \frac{N}{2}$
$c$ = Previous Cumulative Frequency of Median Class
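Both median rules can be sketched in code. The individual-series example uses the data from exercise 1 of the median exercises (answer 9) and the grouped example uses exercise 3 (answer 30.625). The function names are hypothetical. For an even number of items the sketch averages the two middle items, which is the usual reading of a fractional (N+1)/2 position.

```python
def median_individual(values):
    """Value of the (N + 1) / 2 th item after arranging in ascending order."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]
    # Even N: position (N + 1) / 2 falls between two items, so average them.
    return (ordered[middle - 1] + ordered[middle]) / 2


def median_grouped(classes, freqs):
    """Interpolation method: M = L1 + (i / f) * (m - c), with m = N / 2."""
    m = sum(freqs) / 2
    c = 0  # cumulative frequency of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if f > 0 and c + f >= m:
            return lower + (upper - lower) / f * (m - c)
        c += f
    raise ValueError("no median class found")


x = [5, 7, 9, 12, 10, 8, 7, 15, 21]
print(median_individual(x))

classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60), (60, 70)]
freqs = [7, 18, 24, 32, 10, 6, 5]
grouped = median_grouped(classes, freqs)
print(grouped)
```

In the grouped example N = 102, so m = 51. The median class is 30-40 with c = 49 and f = 32, giving 30 + (10 / 32) × 2 = 30.625.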
Properties of Median:
1. Median is an average of position.
2. The sum of the deviations taken from the median ignoring plus and minus signs will be less than the sum of deviations
from any other arbitrary point.
3. If median and number of items are known, missing frequencies can be traced out.
Advantages of median:
1. It is easy to calculate and simple to understand. It is rigidly defined.
2. It is not affected by the extreme items of a series.
3. It can be determined easily in open end series and unequal class intervals.
4. It can be calculated graphically.
5. It is useful when the data cannot be measured quantitatively such as honesty, wealth, intelligence etc.
6. It can be located by inspection from the series.
Disadvantages of median:
1. It is not based on all the observations of the series, hence may not be representative in many cases.
2. It is not capable of further algebraic treatment.
3. It is very much affected by fluctuations in sampling.
4. Median ignores the values of extreme items.
5. It is erratic if the number of items is small.
6. It cannot be determined if the data are not arranged in proper form either ascending or descending order.
[Median]
1. Determine the value of the median from the following series:
X: 5 7 9 12 10 8 7 15 21
[Ans: 9]
2. From the following frequency distribution determine the value of the median:
Wages (Rs): 35 55 45 60 70 65 75 80
No. of Workers: 25 10 12 9 16 8 15 5
[Ans: 60]
3. From the data given below calculate the median:
Classes: 0-10 10-20 20-30 30-40 40-50 50-60 60-70
Frequency: 7 18 24 32 10 6 5
[Ans: 30.625]
4. From the following data determine the value of the median:
X: 0-9 10-19 20-29 30-39 40-49 50-59
F: 5 10 12 8 9 6
[Ans: 27.83]
5. From the following data find out the value of the median:
Marks: Below 20 | 20-30 | 30-50 | 50-70 | 70 and above
No. of Students: 3 | 4 | 10 | 5 | 3
[Ans: 41]
6. Locate the value of the median from the following series:
Marks less than: 10 20 30 40 50 60 70
No. of Students: 3 10 18 24 33 38 40
[Ans: 33.34]
7. From the following data find out the value of the median:
Marks above: 0 10 20 30 40 50 60
No. of students: 100 80 65 53 43 25 12
[Ans: 33]
8. From the following frequency distribution, find the value of the median:
Marks | No. of Students | Marks | No. of Students
Less than 10 | 5 | 40-50 | 10
Less than 20 | 20 | 50 and above | 25
10-30 | 35 | 60 and above | 9
30 and above | 60 | |
[Ans: 34]
9. From the data given below, trace out the missing frequency when the median is 70:
X: 0-20 20-40 40-60 60-80 80-100 100-120 120-140
F: 5 7 8 ? 10 6 4
[Ans: 20]
10. From the following series, find out the missing frequencies, if its median is 25 and the number of students is 100:
Marks: 0-10 10-20 20-30 30-40 40-50 50-60
No. of students: 20 10 ? 15 ? 5
[Ans: 40, 10]
11. From the following series, trace out the missing frequencies, if its median is 27.5 and the number of items is 50:
X: 0-10 10-20 20-30 30-40 40-50 50-60
F: 4 ? 20 ? 7 3
[Ans: 6, 10]
12. Assume N = 100 and that all class intervals are of equal size. The first class interval is '10 and under 20'. The cumulative
frequencies of the 5th, 6th, 7th and 8th class intervals are 45, 70, 90 and 99 respectively. Find the median.
Mode
Mode is that value in a series which occurs with the greatest frequency. In the words of Croxton and Cowden, "The mode of
a distribution is value at the point around which the items tend to be most heavily concentrated. It may be regarded as
the most typical of a series of values."
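For an individual series the mode can be found by counting, as in the minimal sketch below. The first data set is the series from exercise 1 of the median exercises. The second is a hypothetical bimodal series, and the function name `modes` is also hypothetical. The sketch returns every value that shares the greatest frequency, so a bimodal or multimodal series yields more than one value.

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the greatest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())  # raises ValueError on an empty series
    return sorted(v for v, c in counts.items() if c == highest)


series = [5, 7, 9, 12, 10, 8, 7, 15, 21]
print(modes(series))

# Hypothetical bimodal series: two values tie for the greatest frequency.
bimodal = [1, 1, 2, 2, 3]
print(modes(bimodal))
```

The same counting works for qualitative data such as colours or shirt sizes, since only the frequencies matter.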
Advantages of Mode:
1. It is very simple to calculate, as it can be found even by inspection.
2. It is not affected by extreme items.
3. For open end class intervals it can be determined straight away without estimating the two extreme class limits.
4. It can also be used in case of qualitative phenomenon as its calculation depends on the frequencies.
5. It can be determined graphically.
6. It is understood by a layman as it refers to a value containing maximum frequency.
Disadvantages of mode:
1. It is not rigidly defined.
2. It is not based on all the observations. Any change in extreme items will not affect the mode value.
3. It is affected by the fluctuation of sample.
4. It cannot be determined directly in case of bimodal or multimodal series.
5. It is not capable of further algebraic treatment.
6. It cannot be determined from a series of unequal class intervals unless they are arranged in a proper manner.
Choice of a suitable average
It is known that not a single average is suitable for all practical purposes. The different averages have different
characteristics and there is no universally accepted average. The choice of a particular average is usually determined on the
basis of the purpose for which investigation is undertaken. For sound statistical analysis, the choice of the average depends
upon:
1. The nature and availability of data;
2. The nature of the variable involved;
3. The purpose of the investigation;
4. The system of classification adopted, and
5. The use of the average for further statistical computations.
Choice of a suitable average is very important because a wrong choice may lead to fallacious conclusions. The following points should
be remembered while selecting a particular average:
A. Arithmetic mean should be used when:
1. The distribution is not very asymmetrical.
2. The series does not have very large or very small items.
3. The series does not have open end class intervals.
4. All values of the series are considered as equally important.
B. Median should be used when:
1. The series has unequal class intervals.
2. The series has open end class intervals.
3. The purpose is to determine the rank of various values.
C. Mode should be used when:
1. The purpose is to find out the most frequently occurring item of a series.
2. The data are qualitative in nature.
3. The purpose is to find out the most common item of a series.
4. The purpose is to find the average number of children per household, average size of the shirt collar or shoes, average
number of rooms per household etc.
D. Geometric mean should be used when:
1. Ratios, rates and percentages are to be averaged
2. More weights are to be given to small items and less weights to large items.
3. It is required to construct index numbers.
E. Harmonic mean should be used when:
1. It is required to find out the average speed, average time to do a particular work, and average price at which an item
can be bought or sold.
2. It is required to compute the average rate of change in profit or loss of a concern.
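As a small illustration of points D and E, the sketch below uses Python's standard `statistics` module (Python 3.8 or later for `geometric_mean`). The speeds and growth factors are hypothetical figures chosen only for the example; they do not come from these notes.

```python
import statistics

# Hypothetical trip: the same distance covered at 40 km/h and then at 60 km/h.
speeds = [40, 60]
arithmetic_speed = statistics.mean(speeds)        # overstates the true average speed
average_speed = statistics.harmonic_mean(speeds)  # appropriate for rates over equal distances

# Hypothetical growth factors for two successive years (10% and 20% growth).
growth_factors = [1.10, 1.20]
average_growth = statistics.geometric_mean(growth_factors)

print(arithmetic_speed, average_speed, round(average_growth, 4))
```

The harmonic mean gives 48 km/h rather than the arithmetic mean of 50 km/h, because more time is spent on the slower leg; the geometric mean gives the single yearly factor that compounds to the same total growth.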
Limitations of averages:
1. Sometimes an average might give a seemingly absurd result. For example, the average number of children per family might
come out as a fraction, which no actual family can have.
2. An average, being a single figure, gives only the central idea of a phenomenon and does not reveal its entire story.
3. In certain types of distributions, such as U-shaped distributions, an average fails to represent the entire series.
4. Since an average is a single figure representing the characteristics of a given distribution, proper care should be taken in its
interpretation; otherwise it might lead to very misleading conclusions.
[Mode]
1. The following are the size of shoes worn by 9 persons. Calculate the modal size:
Size: 5 4 4.5 5.5 4.5 6 4.5 4 4.5
[Ans: 4.5]
2. Find out the mode from the following observations:
Income (in ₹): 300 600 900 1200 1500 1800 2100
Employees: 4 8 29 11 18 13 5
[Ans: ₹900]
3. Find out the mode from the following data using an analysis table:
X: 3 4 5 6 7 8 9 10 11 12
F: 30 40 38 44 45 42 38 35 30 45
[Ans: 7]
4. Calculate the mode from the following data:
Marks: 5–10 10–15 15–20 20–25 25–30
Students: 10 15 25 20 12
[Ans: 18.3]
5. Calculate the modal value from the following frequency distribution:
X: 0–9 10–19 20–29 30–39 40–49 50–59 60–69 70–79 80–89 90–99
F: 6 29 87 181 247 263 133 43 9 2
[Ans: 47.55]
6. Find out the mode from the following data:
Less than: 5 10 15 20 25 30 35 40 45
No. of items: 29 224 465 582 634 644 650 653 655
[Ans: 11.35]
7. From the data given below, find the mode:
Wages ₹ (above): 30 40 50 60 70 80 90
No. of Workers: 520 470 399 210 105 45 7
[Ans: ₹55.84]
8. From the following series, determine the value of the mode:
Marks below: 100 90 80 70 60 50 40 30 20 10
No. of Students: 50 45 43 36 30 20 16 11 6 3
[Ans: 56]
[Ans: 56]
9. Locate the value of the mode from the data given below by the appropriate method:
X: 0–10 10–20 20–30 30–40 40–50 50–60 60–70 70–80
F: 4 6 20 32 33 17 8 2
[Ans: 40.05]
10. Find the missing frequencies in the following series, if the mode is 34 and the number of items is 60:
Wages (₹): 0–10 10–20 20–30 30–40 40–50 50–60 60–70
No. of Students: 8 7 ? 20 ? 6 4
[Ans: 10, 5]
11. From the data given below, find the missing frequencies, if the median is 67, the mode is 68 and the number of observations is 115:
X: 0–20 20–40 40–60 60–80 80–100 100–120 120–140
F: 2 8 30 ? ? ? 2
[Ans: 50, 20, 3]
12. In the following wage distribution, the median and mode are ₹33.5 and ₹34 respectively, but three class frequencies are missing.
Find them:
X: 0–10 10–20 20–30 30–40 40–50 50–60 60–70 Total
F: 4 16 ? ? ? 6 4 230
[Ans: 60, 100, 40]
Unit-3
MEASURE OF VARIATION
PT R.S.S.UNIVERSITY PREVIOUS YEAR QUESTION PAPERS
2016
Q.1 A factory produces two types of electric lamps, A and B. In an experiment relating to their lives, the following results were
obtained:
Length of life (hours)   No. of lamps (A)   No. of lamps (B)
500–700                  5                  4
700–900                  11                 30
900–1100                 26                 12
1100–1300                10                 8
1300–1500                8                  6
Ans: C.V. of A = 21.64%, C.V. of B = 23.40%
A is more consistent.
Q.2 Calculate the standard deviation of the following distribution by taking an assumed mean:
Age: 20–25 25–30 30–35 35–40 40–45 45–50
No. of persons: 170 110 80 45 40 35
Ans: 7.936
2015
Q.1 (A) What do you mean by mean deviation? How is it different from standard deviation?
(B) For a certain distribution, the arithmetic mean is 45, the median is 48 and Karl Pearson's coefficient of skewness is 0.4.
Calculate (1) the mode, (2) the standard deviation, (3) the coefficient of variation.
Ans: (1) mode = 54 (2) standard deviation = −22.5 (3) coefficient of variation = −50
Q.2 Calculate the standard deviation of the marks obtained by 5 students in a group: 8, 12, 13, 15, 22. Ans: 4.60
2014
Q.1 (A) What do you mean by deviation? How is it different from standard deviation?
(B) Karl Pearson's coefficient of skewness is 0.5, the median is 42 and the mode is 32. Calculate (1) the mean, (2) the standard deviation,
(3) the coefficient of variation. Ans: (1) mean = 47 (2) standard deviation = 30 (3) coefficient of variation = 63.83%
Q.2 Calculate the standard deviation of the following data:
X: 20 30 40 50 60 70
Frequency: 8 12 20 10 6 4
Ans: 13.75
2013
Q.1 (A) Explain the meaning of the coefficient of variation. Mention how it is different from variance.
(B) Calculate the standard deviation of the following data: 160, 160, 161, 162, 163, 163, 163, 164, 164, 170
Ans: 2.72
Q.2 Calculate the coefficient of skewness of the given data by any method.
Wages: 0–10 10–20 20–30 30–40 40–50 50–60 60–70
No. of persons: 1 3 11 21 43 32 9
Ans: −0.18
Measure of Variation : Standard Deviation and Skewness
PARTITION VALUES
Quartiles Deciles Percentiles
Quartiles
The median of a distribution splits the data into two equally-sized groups. In the same way, the quartiles are the three values that
split a data set into four equal parts. Note that the 'middle' quartile is the median. The upper quartile describes a 'typical' mark for
the top half of a class and the lower quartile is a 'typical' mark for the bottom half of the class. The quartiles are closely related to
the histogram of a data set. Since area equals the proportion of values in a histogram, the quartiles split the histogram into four
approximately equal areas.
Individual Series
For an odd series: Q1 = value of the [(N + 1) × 1 / 4]th item
For an even series: Q1 = value of the [(N/4 + (1 + N)/4) × 1 / 4]th item
Discrete Series
Q1 = value of the [(N + 1) × 1 / 4]th item
Q2 = value of the [(N + 1) × 2 / 4]th item
Q3 = value of the [(N + 1) × 3 / 4]th item
Continuous Series
For an ascending series: M = L1 + (i / f) × (m − c)
Where:
L1 = lower limit of the median class (here, the class containing the required quartile)
i = class interval of that class
f = frequency of that class
m = (N × 1) / 4 for Q1, (N × 2) / 4 for Q2, (N × 3) / 4 for Q3
c = cumulative frequency of the preceding class
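The continuous-series formula L1 + (i / f) × (m − c) can be sketched in Python as below. The function name `grouped_partition_value` and its parameters are hypothetical, and the sketch assumes equal class widths and classes listed in ascending order. The data are the marks distribution (0–10 to 40–50, frequencies 5, 8, 15, 16, 6) from the exercise at the end of this part.

```python
def grouped_partition_value(lower_limits, width, freqs, k, parts):
    """k-th of `parts` partition values of a continuous series.

    Applies L1 + (i / f) * (m - c), with m = N * k / parts and
    c the cumulative frequency of the classes before the located class.
    """
    n = sum(freqs)
    m = n * k / parts
    cumulative = 0  # cumulative frequency of the preceding classes (c)
    for lower, f in zip(lower_limits, freqs):
        if f > 0 and cumulative + f >= m:
            return lower + (width / f) * (m - cumulative)
        cumulative += f
    raise ValueError("m exceeds the total frequency; check k and parts")


lower_limits = [0, 10, 20, 30, 40]
freqs = [5, 8, 15, 16, 6]

q1 = grouped_partition_value(lower_limits, 10, freqs, 1, 4)
q2 = grouped_partition_value(lower_limits, 10, freqs, 2, 4)
q3 = grouped_partition_value(lower_limits, 10, freqs, 3, 4)
print(q1, q2, q3)
```

Passing `parts=10` or `parts=100` gives deciles and percentiles from the same routine, since only the value of m changes.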
Deciles
In a similar way, the deciles of a distribution are the nine values that split the data set into ten equal parts. You should not try to
calculate deciles from small data sets: a single class of marks is too small to give useful values, since the extreme deciles are very
variable. However, the deciles can be useful descriptions for larger data sets such as national distributions of marks from standard
tests.
Individual Series
For an odd series: D1 = value of the [(N + 1) × 1 / 10]th item
For an even series: D1 = value of the [(N/10 + (1 + N)/10) × 1 / 10]th item
Discrete Series
D1 = value of the [(N + 1) × 1 / 10]th item
D2 = value of the [(N + 1) × 2 / 10]th item
D9 = value of the [(N + 1) × 9 / 10]th item
Continuous Series
For an ascending series: M = L1 + (i / f) × (m − c)
Where:
L1 = lower limit of the median class (here, the class containing the required decile)
i = class interval of that class
f = frequency of that class
m = (N × 1) / 10 for D1, (N × 2) / 10 for D2, (N × 9) / 10 for D9
c = cumulative frequency of the preceding class
Percentiles
In a similar way, the percentiles of a distribution are the 99 values that split the data set into a hundred equal parts. These
percentiles can be used to categorise the individuals into percentile 1, ..., percentile 100. A very large data set is required before
the extreme percentiles can be estimated with any accuracy. (The 'random' variability in marks is especially noticeable in the
extremes of a data set.)
Individual Series
For an odd series: P1 = value of the [(N + 1) × 1 / 100]th item
For an even series: P1 = value of the [(N/100 + (1 + N)/100) × 1 / 100]th item
Discrete Series
P1 = value of the [(N + 1) × 1 / 100]th item
P2 = value of the [(N + 1) × 2 / 100]th item
P65 = value of the [(N + 1) × 65 / 100]th item
Continuous Series
For an ascending series: M = L1 + (i / f) × (m − c)
Where:
L1 = lower limit of the median class (here, the class containing the required percentile)
i = class interval of that class
f = frequency of that class
m = (N × 1) / 100 for P1, (N × 2) / 100 for P2, (N × 65) / 100 for P65
c = cumulative frequency of the preceding class
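For an individual series, the "(N + 1) × k / parts"th item rule can be sketched as follows. Two points are assumptions rather than something stated in these notes: a fractional position is resolved by linear interpolation between the two neighbouring items, and positions falling outside 1..N are clamped to the first or last item. The function names are hypothetical. The data are the nine weights from the first exercise that follows.

```python
def item_at(ordered, position):
    """Value of the position-th item (1-based) of an ordered list.

    Assumption: fractional positions are interpolated linearly, and
    positions outside 1..N are clamped to the nearest end.
    """
    position = min(max(position, 1), len(ordered))
    whole = int(position)
    fraction = position - whole
    value = ordered[whole - 1]
    if fraction:
        value += fraction * (ordered[whole] - value)
    return value


def individual_partition_value(values, k, parts):
    """k-th of `parts` partition values using the (N + 1) * k / parts rule."""
    ordered = sorted(values)
    return item_at(ordered, (len(ordered) + 1) * k / parts)


weights = [47, 50, 58, 45, 53, 59, 47, 60, 49]
q1 = individual_partition_value(weights, 1, 4)       # 2.5th item
q3 = individual_partition_value(weights, 3, 4)       # 7.5th item
d9 = individual_partition_value(weights, 9, 10)      # 9th item
p65 = individual_partition_value(weights, 65, 100)   # 6.5th item
print(q1, q3, d9, p65)
```

With only nine observations the extreme deciles and percentiles simply fall on the first or last item, which illustrates the warning above about small data sets.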
1. From the following data, find the quartiles, deciles and percentiles:
Weight in kg: 47 50 58 45 53 59 47 60 49
2. From the following data, find the quartiles, deciles and percentiles:
Size of items: 5 15 25 35 45 55 65 75 85
Frequency: 3 8 15 20 25 10 9 6 4
3. From the following data, find the quartiles, deciles and percentiles:
Marks: 0–10 10–20 20–30 30–40 40–50
No. of Students: 5 8 15 16 6
Measures of Dispersion
Formulae of Measures of Dispersion
On dispersion by the methods of limits
1. Range = L − S
2. Coefficient of Range = (L − S) / (L + S)
3. Inter-quartile Range = Q3 − Q1
4. Coefficient of Inter-quartile Range = (Q3 − Q1) / (Q3 + Q1)
5. Semi-inter-quartile range or Quartile Deviation: Q.D. = (Q3 − Q1) / 2
6. Coefficient of Q.D. = (Q3 − Q1) / (Q3 + Q1)
On dispersion by the method of computation:
1. Mean Deviation
Individual series: δ = ∑|D| / N
Discrete and continuous series: δ = ∑f|D| / N
2. Coefficient of M.D.
From Mean: Coeff. M.D. = δ / Mean
From Median: Coeff. M.D. = δ / Median
From Mode: Coeff. M.D. = δ / Mode
3. Standard Deviation
Direct method (based on deviations from the mean):
Individual series: σ = √(∑x² / N)
Discrete / continuous series: σ = √(∑fx² / N)
Short-cut method (on assumed mean):
Individual series: σ = √[∑dx² / N − (∑dx / N)²]
Discrete / continuous series: σ = √[∑fdx² / N − (∑fdx / N)²]
Step-deviation method (d = dx / i, where i is the common factor):
Individual series: σ = √[∑d² / N − (∑d / N)²] × i
Discrete / continuous series: σ = √[∑fd² / N − (∑fd / N)²] × i
Method based on values (when the assumed mean is taken as zero):
Individual series: σ = √[∑X² / N − (∑X / N)²]
Discrete / continuous series: σ = √[∑fX² / N − (∑fX / N)²]
4. Other Formulae
Variance: V = σ²
Standard deviation of the first N natural numbers: σ = √[(N² − 1) / 12]
Coefficient of Standard Deviation = σ / Mean
Coefficient of Variation: C.V. = (σ / Mean) × 100
For a normal distribution, approximately: Range = 6σ; Q.D. = (2/3) × σ; M.D. = (4/5) × σ
DISPERSION
Meaning:
The word dispersion means deviation or difference. In statistics dispersion refers to deviation of the values of a variable from
their central value. Measures of dispersion indicate the extent to which individual items vary from their averages i.e., Mean,
Median or Mode. It shows the spread of items of a series from their central value.
Definition:
1. According to A. L. Bowley, “Dispersion is the measure of variation of the items.”
2. According to L. R. Connor, “Dispersion is a measure of the extent to which the individual items vary.”
3. According to Spiegel, “The degree to which numerical data tend to spread about an average value is called the variation
or dispersion of the data.”
Characteristics of dispersion:
From the foregoing definitions, the essential characteristics of a measure of dispersion can be outlined as under:
1. It consists of different methods through which variations can be measured in quantitative manner.
2. It deals with a statistical series.
3. It indicates the degree, or extent to which the various items of a series deviate from its central value.
4. It supplements the measures of central tendency in revealing the characteristics of a frequency distribution.
5. It speaks of the reliability, or otherwise of the average value of a series.
Characteristics of an ideal measure of dispersion
• It should be rigidly defined.
• It should be easy to calculate and simple to understand.
• It should be based on all the observations of the series.
• It should be capable of further algebraic treatment.
• It should not be affected much by fluctuations of sampling.
• It should not be unduly affected by the extreme items of the series.
Objectives of dispersion
• A measure of dispersion tells us whether an average is a true representative of the series or not.
• The extent of variability between two or more series can be compared with the help of measures of dispersion. It is
useful in determining the degree of uniformity, reliability and consistency amongst two or more sets of data.
• Measures of dispersion facilitate the use of other statistical measures like correlation, regression etc. for further analysis.
• Measures of dispersion serve as a basis for control of the variability itself.
Types of Measures of Dispersion
A. Methods of Limit
• Range
• Inter-quartile range
• Semi-inter-quartile range
• Decile range
• Percentile range
B. Methods of Moment
• Mean deviation
• Standard deviation
• Coefficient of variation
• Variance
C. Graphic Method – Lorenz Curve
Range
Range is defined as the difference between the two extreme values of a series. Thus, it is merely the difference between the
largest and smallest items of the series.
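A minimal sketch of the range and its coefficient, (L − S) / (L + S), is given below. The function names are hypothetical; the ten values are those of the first problem in Exercise A later in this unit.

```python
def value_range(values):
    """Range = largest item (L) minus smallest item (S)."""
    return max(values) - min(values)


def coefficient_of_range(values):
    """Coefficient of range = (L - S) / (L + S)."""
    largest, smallest = max(values), min(values)
    return (largest - smallest) / (largest + smallest)


data = [10, 15, 20, 25, 30, 40, 50, 55, 60, 70]
print(value_range(data), coefficient_of_range(data))
```

Only the two extreme items enter the calculation, which is why the range ignores the frequencies and all intermediate values.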
Advantages of Range:
1. It is easy to calculate and simple to understand.
2. It is rigidly defined
3. It takes the least possible time for calculation
4. In certain types of problems like quality control, weather forecasts etc. use of range is very useful.
Disadvantages of Range:
1. It is influenced very much by fluctuation of sampling
2. It does not take into consideration all the items of the series.
3. It is not capable of further algebraic treatment.
4. It does not take into consideration the frequencies of a series
Uses of Range:
• Quality control: Range has a special application in quality control measures. Control charts are prepared on
the basis of range for controlling the quality of products.
• Weather forecasting: Range is used advantageously by the meteorological department to forecast weather conditions.
• Measurement of fluctuations: Range is a very useful measure for studying the fluctuation of prices of certain commodities,
viz. stocks and shares, gold, silver etc.
Inter-quartile Range
Inter-quartile range is computed by deducting the value of the first quartile from the value of the third quartile. It is
defined as the difference between the two extreme quartiles of a series.
Advantages of inter-quartile range:
1. It is rigidly defined.
2. It can be easily calculated and simple to understand.
3. Its calculation is not affected even if first 25% and last 25% of a series are missing or changed.
Disadvantages of inter-quartile range:
1. It is not based on all the observations of the series.
2. It is not capable of further algebraic treatment.
3. It is affected by fluctuation in sampling.
Quartile deviation
Quartile deviation is based on the central 50% of items. Quartile range is the difference between Q3 and Q1, and when this difference
is divided by 2 we get the quartile deviation. Thus quartile deviation is defined as half of the difference between the two extreme
quartiles of a series.
Advantages of quartile deviation:
1. It is easy to calculate and simple to understand.
2. Its calculation is based on the middle 50% of items; hence it is a good measure of dispersion.
3. It is rigidly defined, and it is not much affected by the extreme values of a series.
4. It is easy to calculate in the case of open-end series.
Disadvantages of quartile deviation:
1. It is not capable of further algebraic treatment.
2. It is much affected by fluctuations of sampling.
3. It is not based on all the observations of a series.
4. It does not show the scatter around any average.
Mean deviation
Mean deviation is the average of the absolute differences between the items of a series and their mean, median or mode.
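The calculation δ = ∑|D| / N and its coefficient can be sketched as below. The helper name `mean_deviation` is hypothetical; the data are the nine values from the first mean-deviation exercise later in this unit, so the printed figures can be compared with the answers given there.

```python
import statistics


def mean_deviation(values, centre):
    """Average of the absolute deviations |D| taken from `centre`."""
    return sum(abs(x - centre) for x in values) / len(values)


weights = [47, 50, 58, 45, 53, 59, 47, 60, 49]
mean = statistics.mean(weights)
median = statistics.median(weights)

md_from_mean = mean_deviation(weights, mean)
md_from_median = mean_deviation(weights, median)

# Coefficient of M.D. = mean deviation divided by the average it was taken from.
print(round(md_from_mean, 2), round(md_from_mean / mean, 3))
print(round(md_from_median, 2), round(md_from_median / median, 4))
```

The absolute value is what "ignoring the ± signs" means in practice: without it, deviations from the mean would always sum to zero.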
Merits:
• It is a better measure for comparison.
• It is extensively used in other fields.
• Mean deviation is less affected by the value of extreme items than the standard deviation.
Demerits:
• It ignores ± signs in its calculation.
• It is difficult to compute when the average is a fraction.
• It is rarely used in sociological studies.
Standard Deviation
S.D. is the square root of the mean of the squared deviations from the actual mean. It was introduced by Karl Pearson in 1893. It is by
far the most important and widely used measure of dispersion.
Note: when comparing the consistency of two groups, the group with the smaller S.D. (or, where the means differ, the smaller coefficient of variation) is the more consistent.
Merits:
• All individual values are taken into account in the calculation of S.D.
• It is capable of further algebraic treatment.
• It is the most rigidly defined measure of dispersion.
• It is used as an important instrument in higher statistical analysis, viz. correlation, regression etc.
Demerits:
• It is not easy to calculate.
• It is not readily understood by a layman.
• It is affected very much by the extreme items of a series.
Difference between M.D. and S.D.
• While calculating standard deviation, the algebraic signs (±) are not ignored, whereas in mean deviation they are
completely ignored.
• Standard deviation is always calculated from the arithmetic mean, whereas mean deviation can be calculated from the
mean, median or mode.
• Standard deviation is much affected by the extreme observations of the series, but that is not the case with mean
deviation.
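The direct, short-cut and step-deviation methods listed under the formulae of dispersion should all give the same standard deviation. The sketch below checks this for an individual series; the function names, the assumed mean of 640 and the common factor of 10 are choices made for this illustration only. The incomes are those of the first standard-deviation exercise later in this unit.

```python
from math import sqrt


def sd_direct(values):
    """Square root of the mean squared deviation from the actual mean."""
    n = len(values)
    mean = sum(values) / n
    return sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_shortcut(values, assumed_mean):
    """Deviations dx are taken from an assumed mean, then corrected."""
    n = len(values)
    dx = [x - assumed_mean for x in values]
    return sqrt(sum(d * d for d in dx) / n - (sum(dx) / n) ** 2)


def sd_step(values, assumed_mean, step):
    """Deviations are divided by a common factor, which is multiplied back at the end."""
    n = len(values)
    d = [(x - assumed_mean) / step for x in values]
    return step * sqrt(sum(v * v for v in d) / n - (sum(d) / n) ** 2)


incomes = [600, 620, 640, 620, 680, 670, 680, 640, 700, 650]
print(round(sd_direct(incomes), 2))
print(round(sd_shortcut(incomes, 640), 2))
print(round(sd_step(incomes, 640, 10), 2))
```

The choice of assumed mean does not change the result; it only keeps the intermediate figures small when working by hand.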
Variance:
Variance is the square of the standard deviation: Variance = (S.D.)² = σ².
The term variance was first used by R. A. Fisher in 1918. If a phenomenon is affected by a number of variables, variance helps in
isolating the effects of the different factors.
Coefficient of Variation
Coefficient of variation is defined as “the percentage of variation in mean, standard deviation being considered as the total
variation in the mean.”
This measure, developed by Karl Pearson, is the most commonly used measure of relative variation. It is used in problems
where we want to compare the variability of two or more series.
Lorenz Curve
To study the dispersion of a series graphically, we draw a Lorenz curve, a device named after the American economist
Max O. Lorenz. This curve was first used for measuring the distribution of wealth and income.
Coefficient of Variation (CV)
The coefficient of variation (CV) is a statistical measure of the dispersion of data points in a data series around the mean. The
coefficient of variation represents the ratio of the standard deviation to the mean, and it is a useful statistic for comparing the
degree of variation from one data series to another, even if the means are drastically different from one another.
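As an illustration of comparing two series by their coefficients of variation, the sketch below uses the lamp-life data for types A and B that appears in the question papers and exercises of this unit. The function name `grouped_cv` is hypothetical, and class mid-points are assumed to represent each class, as is usual for a continuous series.

```python
from math import sqrt


def grouped_cv(midpoints, freqs):
    """Coefficient of variation (in %) of a frequency distribution."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(midpoints, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(midpoints, freqs)) / n
    return sqrt(variance) / mean * 100


# Mid-points of the classes 500-700, 700-900, ..., 1300-1500 hours.
midpoints = [600, 800, 1000, 1200, 1400]
cv_a = grouped_cv(midpoints, [5, 11, 26, 10, 8])
cv_b = grouped_cv(midpoints, [4, 30, 12, 8, 6])

print(round(cv_a, 2), round(cv_b, 2))
print("A is more consistent" if cv_a < cv_b else "B is more consistent")
```

The two standard deviations are almost equal here (about 220 hours each); it is the difference in means that makes type A relatively less variable.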
Exercise A
1. From the following distribution, ascertain the value of the range and its coefficient.
10 15 20 25 30 40 50 55 60 70
[Ans: 60; 0.75]
2. From the following series, determine the value of range and its coefficient:
Salary (per month) 1000 1500 2000 2500 3000 3500 4000 5000
No. of worker 30 20 15 3 7 10 9 6
[Ans: 4000; 0.67]
3. From the following distribution, determine the value of the range and its coefficient:
Wages (per day): 20–25 25–30 30–35 35–40 40–45 45–50
No. of labourers: 2 14 6 8 11 9
[Ans: 30; 0.43]
4. From the following data, determine the Range and the Coefficient of Range of marks awarded in statistics to the +2 Commerce
students of Swami Vivekananda College:
Marks: 10–19 20–29 30–39 40–49 50–59 60–69
No. of Students: 15 5 12 14 10 8
[Ans: 60; 0.76]
5. From the following distribution, find the range and its coefficient:
Group: Below 50 50–60 60–80 80–110 110–150 150 & above
Frequency: 5 10 8 7 13 7
[Ans: 155; 0.224]
6. Calculate the semi-inter-quartile range, or quartile deviation, and its coefficient from the following data:
Wages in ₹: 20 30 40 50 60 70 80
No. of workers: 3 61 132 153 140 51 3
[Ans: ₹10; 0.2]
7. From the following discrete series, find out the deciles range, semi deciles range, and their coefficients:
Age 15 16 17 18 19 20 21 22
No of students 5 20 18 17 10 5 3 1
[Ans: 4; 2; 0.8]
8. Calculate quartile deviation and its relative measure for the following distribution:
Group: 20–29 30–39 40–49 50–59 60–69 70–79
Frequency: 306 182 144 96 42 34
[Ans: 10.71; 0.29]
Mean Deviation (δ)
1. From the following series relating to the marks obtained by a batch of 9 students in a certain test, calculate the mean deviation
from the mean and the median, and also calculate their coefficients.
Weight in Kg. 47 50 58 45 53 59 47 60 49
[Ans: 4.89; 0.094; 4.67; 0.0934]
2. Find the mean deviation from the mean, median and mode, and also their coefficients, from the following series:
Size of items: 5 15 25 35 45 55 65 75 85
Frequency: 3 8 15 20 25 10 9 6 4
[Ans: 14.99; 14.8; 14.8]
3. Calculate the mean deviation from the mean for the following series. Also, find its coefficient:
Marks: 0–10 10–20 20–30 30–40 40–50
No. of Students: 5 8 15 16 6
[Ans: 9.44; 0.35]
4. Calculate the mean deviation from the median from the following data:
Marks secured (below): 80 70 60 50 40 30 20 10
No. of students: 100 90 80 60 32 20 13 5
[Ans: 14.31]
5. Calculate median, and mean deviation from median for the following frequency distribution:
Age in years 1-5 6-10 11-15 16-20 21-25 26-30 31-35 36-40 41-45
No of person 7 10 16 32 24 18 10 5 1
[Ans: 19.95; 7.1]
Standard Deviation
1. Calculate the standard deviation from the following data on the income of 10 employees of a firm by the direct method, the short-cut
method and the step-deviation method:
Income (₹): 600 620 640 620 680 670 680 640 700 650
[Ans: ₹30.33]
2. From the following discrete series, find out the standard deviation by all the possible methods:
Marks: 10 20 30 40 50 60
No. of students 8 12 20 10 7 3
[Ans: 13.45]
3. Calculate the standard deviation for the following data in different possible methods:
Class interval: 0–10 10–20 20–30 30–40 40–50
No of students: 7 12 24 10 7
[Ans: 11.397]
4. Calculate the standard deviation from the following data:
Age in years: 10–19 20–29 30–39 40–49 50–59 60–69 70–79
Frequency: 3 61 233 137 53 79 4
[Ans: 12.4]
5. Calculate standard deviation and coefficient of standard deviation of the following series:
Wages in ₹ (up to): 10 20 30 40 50 60 70 80
No. of workers: 12 30 45 107 165 202 222 230
[Ans: 16.52; 41]
6. The following data relate to the profit/loss made by engineering companies in Odisha during the year 2012-13:
Wages in ₹: (−10)–0 0–10 10–20 20–30 30–40 40–50
Less than 10 19 24 49 87 31 27
Calculate the standard deviation, and its coefficients. Also, calculate the coefficient of variation. [Ans: 13.55; 0.6134; and
61.34%]
7. The following are the marks obtained by 40 students of a class. Calculate the coefficient of variation:
Marks: 80–84 75–79 70–74 65–69 60–64 55–59 50–54 45–49 40–44 35–39 30–34 25–29
Students: 1 1 1 4 4 7 6 6 6 3 0 1
[Ans: 21.8%]
8. A factory produces two types of lamps. In an experiment on the working life of these lamps, the following results were obtained:
Length of life (in hours): 500–700 700–900 900–1100 1100–1300 1300–1500
No. of lamps, Type A: 5 11 26 10 8
No. of lamps, Type B: 4 30 12 8 6
Compare the variability using the coefficient of variation. [Ans: 21.64; 23.40]
Skewness
If one tail is longer than the other, the distribution is skewed. Such distributions are sometimes called asymmetric or asymmetrical
distributions, as they do not show any kind of symmetry. Symmetry means that one half of the distribution is a mirror image of the
other half. For example, the normal distribution is a symmetric distribution with no skew: the tails are exactly the same.
A left-skewed distribution has a long left tail. Left-skewed distributions are also called negatively skewed distributions, because
there is a long tail in the negative direction on the number line. The mean is also to the left of the peak.
A right-skewed distribution has a long right tail. Right-skewed distributions are also called positively skewed distributions, because
there is a long tail in the positive direction on the number line. The mean is also to the right of the peak.
Mean and Median in Skewed Distributions
In a normal distribution, the mean and the median are the same number, while in a skewed distribution they
become different numbers:
• A left-skewed (negative) distribution will have the mean to the left of the median.
• A right-skewed (positive) distribution will have the mean to the right of the median.
Effects on Statistics
The normal distribution is the easiest distribution to work with in order to gain an understanding of statistics. Real-life
distributions are usually skewed. If the skewness is too great, many statistical techniques do not work. As a result, more advanced
mathematical techniques, including logarithmic transformations and quantile regression, are used.
Skewed Left (Negative Skew): A left-skewed distribution is sometimes called a negatively skewed distribution because its
long tail is in the negative direction on the number line. A common misconception is that the position of the peak is what defines
skewness, in other words that a distribution whose peak lies to the left is left-skewed. This is incorrect. There are two main things
that make a distribution skewed left:
1. The mean is to the left of the peak. This is the main idea behind “skewness”, which is
technically a measure of the distribution of values around the mean.
2. The tail is longer on the left.
In most cases, the mean is also to the left of the median. This is not a reliable test for skewness, though, as some distributions
(e.g. many multimodal distributions) violate this rule. It should be treated as a general guide, not a set-in-stone rule.
Skewed Right (Positive Skew): A right-skewed distribution is sometimes called a positively skewed distribution, because the
tail is longer in the positive direction on the number line.
Formula
Karl Pearson's Coefficient of Skewness
1. Pearson's first coefficient of skewness uses the mode. The formula is:
Sk1 = (x̄ − Mo) / s
Where x̄ = the mean, Mo = the mode and s = the standard deviation
2. Pearson's second coefficient of skewness uses the median. The formula is:
Sk2 = 3 (x̄ − Md) / s
Where x̄ = the mean, Md = the median and s = the standard deviation
Bowley's coefficient of skewness
Absolute measure = (Q3 − M) − (M − Q1) = Q3 + Q1 − 2M
Relative measure = (Q3 + Q1 − 2M) / (Q3 − Q1)
Kelly's coefficient of skewness
Based on percentiles: j = (P90 + P10 − 2 P50) / (P90 − P10)
Based on deciles: j = (D9 + D1 − 2 D5) / (D9 − D1)
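The skewness coefficients can be sketched as plain functions. The two Pearson functions use the standard textbook forms (mean − mode) / s and 3 (mean − median) / s, which is an assumption about the formulae this section intends; the Bowley function follows the relative measure written above. All function names are hypothetical. The figures used come from problems 7 and 9 of the skewness exercise that follows.

```python
def pearson_skewness_mode(mean, mode, sd):
    """Pearson's first coefficient: (mean - mode) / standard deviation."""
    return (mean - mode) / sd


def pearson_skewness_median(mean, median, sd):
    """Pearson's second coefficient: 3 * (mean - median) / standard deviation."""
    return 3 * (mean - median) / sd


def bowley_skewness(q1, median, q3):
    """Bowley's relative measure: (Q3 + Q1 - 2M) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)


# Problem 9: mean 82, mode 50 and S.D. 50 give a coefficient of 0.64.
print(pearson_skewness_mode(82, 50, 50))

# Problem 7: quartiles 44.1 and 56.6 with median 55.35 give about -0.8.
print(round(bowley_skewness(44.1, 55.35, 56.6), 4))
```

Bowley's measure always lies between −1 and +1 because the median cannot fall outside the two quartiles, whereas Pearson's coefficients have no such fixed bounds.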
1. Calculate Karl Pearson's coefficient of skewness from the following data:
Size: 1 2 3 4 5 6 7
Frequency: 10 18 30 25 12 3 2
[Ans: 0.184]
2. Calculate the coefficient of skewness based on mean and median from the following distribution:
X: 0–10 10–20 20–30 30–40 40–50 50–60 60–70 70–80
F: 6 12 22 48 56 32 18 6
[Ans: 41.7; 42.14; −0.086]
3. Calculate Karl Pearson's coefficient of skewness from the following data:
X: 10–15 15–20 20–25 25–30 30–35 35–40 40–45 45–50
F: 8 16 30 45 62 32 15 6
[Ans: −0.22]
4. Calculate the coefficient of variation and Karl Pearson's coefficient of skewness from the following data:
Sales (crores) less than: 20 40 60 80 100
No. of companies: 8 20 50 70 80
[Ans: 42.65; 0.0063]
5. From the following data, find Bowley's coefficient of skewness:
Marks in Maths: 90 50 52 86 87 76 80 85 58 61 65
[Ans: −0.286]
6. Calculate the quartile coefficient of skewness for the following:
Monthly Income (₹): 501–600 601–700 701–800 801–900 901–1000 1001–1100 1101–1200 1201–1300
No. of families: 5 17 80 186 208 134 68 18
[Ans: 0.025]
7. The measure of skewness for a certain distribution is −0.8. If the lower and upper quartiles are 44.1 and 56.6 respectively,
find the median. [Ans: 55.35]
8. In a frequency distribution, the coefficient of skewness based on quartiles is 0.6. If the sum of the upper and lower quartiles is
100 and the median is 38, find the value of the upper quartile. [Ans: 70]
9. Pearson's coefficient of skewness of a distribution is 0.64. Its mean is 82 and its mode is 50. Find the standard deviation. [Ans:
50]
10. When the mean is 86, the median is 80 and Karl Pearson's coefficient of skewness is 0.42, find the coefficient of variation. [Ans: 49.83]
Unit-4
CORRELATION
PREVIOUS YEAR PT R.S.S.U QUESTION PAPERS
2016
Q.1 Calculate Karl Pearson's coefficient of correlation from the data given below:
X: 3 7 5 4 6 8 2 7
Y: 7 12 8 8 10 13 5 10
[Ans: 0.963]
Q.2 What is correlation? Explain the implications of positive and negative correlation. Show by means of scatter diagrams the presence
of perfect positive and perfect negative correlation.
2015
Q.1 Define correlation. Explain the different types of correlation with suitable examples.
Q.2 Calculate Karl Pearson's coefficient of correlation from the data given below:
X: 6 2 10 4 8
Y: 9 11 5 8 7
Ans: −0.92
Q.3 Define Karl Pearson's coefficient of correlation. What is it intended to measure?
2014
Q.1 Define correlation. Explain the different types of correlation with suitable examples.
Q.2 Calculate Spearman's coefficient of rank correlation from the following data:
X: 57 16 24 65 16 16 9 40 33 48
Y: 19 6 9 20 4 15 6 24 13 13
Ans: 0.7333
Q.3 Find the coefficient of correlation between the ages of husbands and wives from the following data:
Rows: age of wife; columns: age of husband
Age      20–30  30–40  40–50  50–60  60–70  Total
15–25      4      9      4      -      -     17
25–35      -      8     24      5      -     37
35–45      -      2     11      2      -     15
45–55      -      -      6     14      5     25
55–65      -      -      -      4      2      6
Total      4     19     45     25      7    100
Ans: 0.73
2013
Q.1 Define Karl Pearson's coefficient of correlation. What is it intended to measure? How would you interpret the sign of the
correlation coefficient?
Q.2 Explain, with examples, the importance of correlation in statistical analysis for management decision situations.
Q.3 Calculate the coefficient of correlation from the data given below:
X: 1 2 3 4 5
Y: 3 3 7 9 12
[Ans: 0.97]
Correlation Analysis – Karl Pearson's Coefficient of Correlation
CORRELATION
Correlation is a statistical measure for finding out the degree or strength of association between two (or more) variables. By
‘association’ we mean the tendency of the variables to move together. If two variables X and Y are so related that movements (or
variations) in one, say X, tend to be accompanied by corresponding movements (or variations) in the other variable Y, then X
and Y are said to be correlated. The movements may be in the same direction (i.e., both X and Y increase, or both decrease) or in
opposite directions (i.e., one, say X, increases and the other, Y, decreases). Correlation is said to be positive or negative according
as these movements are in the same or in opposite directions. If Y is unaffected by any change in X, then X and Y are said to be uncorrelated.
Definition
L. R. Connor: "If two or more quantities vary in sympathy so that movements in the one tend to be accompanied by
corresponding movements in the other, then they are said to be correlated."
Correlation may be linear or non-linear. If the amount of variation in X bears a constant ratio to the corresponding amount of
variation in Y, then the correlation between X and Y is said to be linear. Otherwise it is non-linear. The correlation coefficient,
or coefficient of correlation [r], measures the degree of linear relationship (i.e., linear correlation) between two variables.
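The point that r captures only the linear part of a relationship can be checked numerically. The following is a minimal Python sketch, not part of the original notes; the function name pearson_r and the sample values are illustrative assumptions. It uses the deviation form r = Σdxdy / √(Σdx² × Σdy²), where dx and dy are deviations from the respective means.

```python
import math

def pearson_r(xs, ys):
    """Coefficient of correlation from deviations about the means (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)

xs = [-2, -1, 0, 1, 2]
linear = [2 * x + 1 for x in xs]   # change in Y bears a constant ratio to change in X
curved = [x ** 2 for x in xs]      # Y depends on X, but not linearly

r_linear = pearson_r(xs, linear)
r_curved = pearson_r(xs, curved)
print(r_linear)  # 1.0
print(r_curved)  # 0.0
```

In the second case Y is completely determined by X, yet r is zero, so a value of r near zero rules out only a linear relationship, not every relationship.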
Utility
The study of correlation is of immense utility in both the physical and the social sciences. However, we shall confine
ourselves to the utility of correlation studies in the social sciences only.
1. The study of correlation reduces the range of uncertainty associated with decision making. In social sciences,
particularly in the business world, forecasting is an important phenomenon, and correlation studies help us to make
relatively more dependable forecasts.
2. Correlation analysis is very helpful in understanding economic behavior; it helps us in locating the variables on
which other variables depend. This is helpful in studying the factors by which economic events are affected. For example,
we can find out the factors responsible for a price rise or for low productivity.
3. Correlation study helps us in identifying such factors which can stabilize a disturbed economic situation.
4. Correlation study helps us to estimate the likely change in a variable for a particular amount of change in a related
variable. For example, a correlation study can help us in finding out the change in demand for a certain amount of
change in price.
5. Inter-relationship studies between different variables are very helpful tools in promoting research and opening new
frontiers of knowledge.
TYPES OF CORRELATION
Correlation can be: [1] Positive or Negative; [2] Simple, Multiple or Partial; [3] Linear or Non-linear.
1. Positive and Negative correlation: Correlation can be either positive or negative. When the values of two variables
move in the same direction, i.e., when an increase in the value of one variable is associated with an increase in the value
of the other variable and a decrease in the value of one variable is associated with a decrease in the value of the other
variable, correlation is said to be positive.
If, on the other hand, the values of two variables move in opposite directions, so that with an increase in the
values of one variable the values of the other variable decrease, and with a decrease in the values of one variable the
values of the other variable increase, correlation is said to be negative. Some pairs of variables are generally
positively correlated, while others are generally negatively correlated.
2. Simple, Multiple and Partial correlation: In simple correlation we study only two variables, say price and demand.
In multiple correlation we study together the relationship between three or more factors, such as production, rainfall and
use of fertilizers. In partial correlation more than two factors are involved, but correlation is studied only
between two factors while the other factors are assumed to be constant.
$$ r = \frac{\text{Covariance of } X \text{ and } Y}{\sigma_x \times \sigma_y} $$
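As a rough check of the formula, the sketch below computes the covariance and the two standard deviations separately and then divides, using the data from the 2015 (Q.2) and 2013 (Q.3) questions listed earlier in this unit. It is an illustrative Python sketch rather than the method prescribed in the notes; the function name pearson_r is an assumption, and population formulas (division by n) are used throughout.

```python
import math

def pearson_r(xs, ys):
    """r = Cov(X, Y) / (sigma_x * sigma_y), all computed with divisor n."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # deviations of X from its mean
    dy = [y - mean_y for y in ys]          # deviations of Y from its mean
    covariance = sum(a * b for a, b in zip(dx, dy)) / n
    sigma_x = math.sqrt(sum(a * a for a in dx) / n)
    sigma_y = math.sqrt(sum(b * b for b in dy) / n)
    return covariance / (sigma_x * sigma_y)

# 2015, Q.2: Y falls as X rises, so r is negative
r_2015 = pearson_r([6, 2, 10, 4, 8], [9, 11, 5, 8, 7])
# 2013, Q.3: Y rises with X, so r is positive
r_2013 = pearson_r([1, 2, 3, 4, 5], [3, 3, 7, 9, 12])

print(round(r_2015, 2))  # -0.92
print(round(r_2013, 2))  # 0.97
```

Both results agree with the answers quoted in the question papers. Because the same divisor n appears in the covariance and in both standard deviations, it cancels, which is why the shortcut form Σdxdy / √(Σdx² × Σdy²) gives the same value.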
BBA 2ND SEM STATISTIC.pdf

  • 1. Prof . T RAMA KRISHNA RAO (8839271225 ) BBA 2nd SEM STATISTICS
  • 2. Prof . T RAMA KRISHNA RAO (8839271225 ) PT R.S.S.U BBA II Statistics Unit-I Meaning and definition of Statistics; Scope and Limitations of Statistics; Processing and Presentation of Data. Unit-II Measures of Central Tendencies; Mean, Geometric Mean , Median, Mode. Unit-III Measure of Variation : Standard Deviation and Skewness. Unit-IV Correlation Analysis โ€“ Karlpearsonโ€™s co-efficient of Correlation. Unit-V Index Number, Time Series Analysis
  • 3. Prof . T RAMA KRISHNA RAO (8839271225 ) Unit 1 Statistics PT R.S.S.UNIVERSITY PREVIOUS YEAR QUESTION PAPERS 2016 Q.1 Define statistic Explain the ways in which statistical data can be presented with the help of suitable example? Q.2 Different between classification and tabulation , mention the requisites of a good statistical table? 2015 Q.1 Explain the meaning and scope of statistics bringing out its importance in field of business? Q.2 What do you mean by data ?what are objectives Explain different kind of classification of data? Q.3Draw a histogram to represent the following frequency distribution . Marks 0-10 10-20 20-40 40-50 50-60 60-70 70-90 90-100 No of students 4 6 14 16 14 10 16 5 2014 Q.1 Define statistics ,what are the main function ?discuss briefly the limitation of statistics ? Q.2 What is tabulation ? what are its use ? mention the items that a good statistical table contain? Q.3 Draw a frequency polygon for the following distribution Class interval 15-25 25-35 35-45 45-55 55-65 65-75 Frequency 10 16 18 15 13 4 2013 Q.1 Explain the meaning and scope of statistics bringing out its importance in field of business? Q.2 What is meant by classification ? what precaution are to be taken in selecting class intervals? Q.3 Represent the following data by a Pie chart? Food 87 Clothing 24 Recreation 11 Education 13 Rent 25 Miscellaneous 20 Meaning and definition of Statistics; Scope and Limitations of Statistics; Processing and Presentation of Data
  • 4. Prof . T RAMA KRISHNA RAO (8839271225 ) STATISTICS Meaning: โ€œStatisticsโ€, that a word is often used, has been derived from the Latin word โ€˜Statusโ€™ that means a group of numbers or figures; those represent some information of our human interest. collecting information about states and other information which was needed about their people, their number, revenue of the state etc. Definition: The term โ€˜Statisticsโ€™ has been defined in two senses, i.e. in Singular and in Plural sense. In plural sense, it means a systematic collection of numerical facts and in singular sense; it is the science of collecting, classifying and using statistics. A. In the Plural Sense: โ€œStatistics are numerical statements of facts in any department of enquiry placed in relation to each other.โ€ โ€”A.L. Bowley โ€œThe classified facts respecting the condition of the people in a stateโ€”especially those facts which can be stated in numbers or in tables of numbers or in any tabular or classified arrangement.โ€ โ€”Webster These definitions given above give a narrow meaning to the statistics as they do not indicate its various aspects as are witnessed in its practical applications. From the this point of view the definition given by Prof. Horace Sacrist appears to be the most comprehensive and meaningful: โ€œBy statistics we mean aggregates of facts affected to a marked extent by multiplicity of causes, numerically expressed, enumerated or estimated according to reasonable standard of accuracy, collected in a systematic manner for a predetermined purpose, and placed in relation to each other.โ€โ€”Horace Sacrist B. 
In the Singular Sense: โ€œStatistics refers to the body of technique or methodology, which has been developed for the collection, presentation and analysis of quantitative data and for the use of such data in decision making.โ€ โ€”Ncttor and Washerman โ€œStatistics may rightly be called the science of averages.โ€ โ€”Bowleg โ€œStatistics may be defined as the collection, presentation, analysis, and interpretation of numerical data.โ€ โ€”Croxton and Cowden Some Modern Definitions: โ€œStatistics is a body of methods for making wise decisions on the face of uncertainty.โ€ โ€”Wallis and Roberts โ€œStatistics is a body of methods for obtaining and analyzing numerical data in order to make better decisions in an uncertain world.โ€ โ€” Edward N. Dubois Stages of Investigations: 1. Collection of Data:It is the first stage of investigation and is regarding collection of data. It is determined that which method of collection is needed in this problem and then data are collected.
  • 5. Prof . T RAMA KRISHNA RAO (8839271225 ) 2. Organisation of Data:It is second stage. The data are simplified and made comparative and are classified according to time and place. 3. Presentation of Data:In this third stage, organised data are made simple and attractive. These are presented in the form of tables diagrams and graphs. 4. Analysis of Data:Forth stage of investigation is analysis. To get correct results, analysis is necessary. It is often undertaken using Measures of central tendencies, Measures of dispersion, correlation, regression and interpolation etc. 5. Interpretation of Data:In this last stage, conclusions are enacted. Use of comparisons is made. On this basis, forecasting is made Nature of Statistics 1. Statistics is Science :- Science, by definition, is a systematic body of knowledge which studies the cause and effect relationship and endeavors to find out generalization. If we take the various statistical methods in consideration, we can define statistics as a science in which we study:Numerous methods of collecting, editing, classifying, tabulating and presenting facts using graphs and diagrams Several ways of condensing data regarding various social, political, and economic problems This is done to establish a relationship between various facts. Also, it helps in analyzing and interpreting problems and forecast them too. 2. Statistics is Art :- If Science is knowledge, Art is action or the actual application of science. While Science teaches us to know, Art teaches us to do. statistics as an art of applying the science of scientific methods. As an art, statistics offer a better understanding and solution to problems in real life as it offers quantitative information.While there are several statistical methods, the successful application of the methods is dependent on the statisticianโ€™s degree of skill and experience. According to Tippet, โ€œStatistic is both a science and an art. 
It is a science in that its methods are basically systematic and have general application and art in that their successful application depends, to a considerable degree, on the skill and special experience of the statistician, and on his knowledge of the field of application.โ€ Characteristics 1. Statistics are Aggregate of Facts: Only those facts which are capable of being studied in relation to time, place or frequency can be called statistics. Individual, single or unconnected figures are not statistics because they cannot be studied in relation to each other. Due to this reason, only aggregate of facts e.g., data relating to I.Q. of a group of students, academic achievement of students, etc. are called statistics and are studied in relation to each other. 2. Statistics are Affected to a marked Extent by Multiplicity, of Causes:Statistical data are more related to social sciences and as such, changes are affected to a combined effect of many factors. We cannot study the effect of a particular cause on a phenomenon. It is only in physical sciences that individual causes can be traced and their impact is clearly known. In statistical study of social sciences, we come to know the combined effect of multiple causes. 3. Statistics are Numerically Expressed:Qualitative phenomena which cannot be numerically expressed, cannot be described as statistics e.g. honesty, goodness, ability, etc. But if we assign numerical expression, it maybe described as โ€˜statisticsโ€™. 4. Statistics are Enumerated or estimated according to Reasonable Standards of Accuracy:The standard of estimation and of accuracy differs from enquiry to enquiry or from purpose to purpose. There cannot be one standard of uniformity for all types of enquiries and for all purposes. A single student cannot be ignored while calculating I.Q. of 100 students in group whereas 10 soldiers can be easily ignored while finding out I.Q. of soldiers of whole country. 5. 
Statistics are Collected in a Systematic Manner:In order to have reasonable standard of accuracy statistics must be collected in a very systematic manner. Any rough and haphazard method of collection will not be desirable for that may lead to improper and wrong conclusion. Accuracy will also be not definite and as such cannot be believed. 6. Statistics for a Pre-determined Purpose:The investigator must have a purpose beforehand and then should start the work of collection. Data collected without any purpose is of no use. Suppose we want to know intelligence of a section of people, we must not collect data relating to income, attitude and interest. Without having a clear idea about the purpose we will not be in a position to distinguish between necessary data and unnecessary data or relevant data and irrelevant data. 7. Statistics are Capable of being Placed in Relation to each other:Statistics is a method for the purpose of comparison etc. It must be capable of being compared, otherwise, it will lose much of its value and significance. Comparison can be made only if the data are homogeneous. Importance and Scope of Statistics: (i) Statistics in Planning:Statistics is indispensable in planningโ€”may it be in business, economics or government level. The modern age is termed as the โ€˜age of planningโ€™ and almost all organisations in the government or business or management are resorting to planning for efficient working and for formulating policy decision.To achieve this end, the statistical data relating to production, consumption, birth, death, investment, income are of paramount importance. Today efficient planning is a must for almost all countries, particularly the developing economies for their economic development. (ii) Statistics in Mathematics:Statistics is intimately related to and essentially dependent upon mathematics. The modern theory of Statistics has its foundations on the theory of probability which in turn is a particular branch of
  • 6. Prof . T RAMA KRISHNA RAO (8839271225 ) more advanced mathematical theory of Measures and Integration. Ever increasing role of mathematics into statistics has led to the development of a new branch of statistics called Mathematical Statistics.Thus Statistics may be considered to be an important member of the mathematics family. In the words of Connor, โ€œStatistics is a branch of applied mathematics which specialises in data.โ€ (iii) Statistics in Economics:Statistics and Economics are so intermixed with each other that it looks foolishness to separate them. Development of modern statistical methods has led to an extensive use of statistics in Economics.All the important branches of Economicsโ€”consumption, production, exchange, distribution, public financeโ€”use statistics for the purpose of comparison, presentation, interpretation, etc. Problem of spending of income on and by different sections of the people, production of national wealth, adjustment of demand and supply, effect of economic policies on the economy etc. simply indicate the importance of statistics in the field of economics and in its different branches.Statistics of Public Finance enables us to impose tax, to provide subsidy, to spend on various heads, amount of money to be borrowed or lent etc. So we cannot think of Statistics without Economics or Economics without Statistics. (iv) Statistics in Social Sciences:Every social phenomenon is affected to a marked extent by a multiplicity of factors which bring out the variation in observations from time to time, place to place and object to object. Statistical tools of Regression and Correlation Analysis can be used to study and isolate the effect of each of these factors on the given observation.Sampling Techniques and Estimation Theory are very powerful and indispensable tools for conducting any social survey, pertaining to any strata of society and then analysing the results and drawing valid inferences. 
The most important application of statistics in sociology is in the field of demography, for studying mortality (death rates), fertility (birth rates), marriages, population growth and so on. In this context Croxton and Cowden have rightly remarked: "Without an adequate understanding of the statistical methods, the investigators in the social sciences may be like the blind man groping in a dark room for a black cat that is not there. The methods of statistics are useful in an ever-widening range of human activities in any field of thought in which numerical data may be had."

(v) Statistics in Trade: As already mentioned, statistics is a body of methods for making wise decisions in the face of uncertainty. Business is full of uncertainties and risks, and we have to forecast at every step. Speculation is simply gaining or losing by way of forecasting. Can we forecast without taking the past into view? Perhaps not. The future trend of the market can be anticipated only if we make use of statistics, and failure in anticipation means failure in business. Changes in demand, supply, habits and fashion can be anticipated with the help of statistics. Statistics is of utmost significance in determining the prices of products and in identifying phases of boom and depression. The use of statistics helps in the smooth running of a business, reduces uncertainty and thus contributes to the success of the business.

(vi) Statistics in Research Work: The job of a research worker is to present the results of his research to the community. The effect of a variable on a particular problem, under differing conditions, can be known only if the research worker makes use of statistical methods. Statistics are basic to research activities everywhere. To keep his research interests and activities alive, the researcher has to lean upon his knowledge of and skill in statistical methods.

Limitations of Statistics

1. Qualitative aspect ignored: Statistical methods do not study phenomena that cannot be expressed in quantitative terms. Such phenomena, for example health, riches and intelligence, cannot form part of a statistical study unless the qualitative data are first converted into quantitative data.

2. It does not deal with individual items: It is clear from the definition given by Prof. Horace Secrist, "By statistics we mean aggregates of facts ... and placed in relation to each other", that statistics deals only with aggregates of facts and does not recognise any individual item. Individual items such as the death of 6 persons in an accident, or an 85% result for one class of a school in a particular year, do not amount to statistics because they are not placed in a group of similar items. Statistics does not deal with individual items, however important they may be.

3. It does not depict the entire story of a phenomenon: Whenever a phenomenon occurs, it is due to many causes, but not all of these causes can be expressed as data, so we cannot always reach the correct conclusions. The development of a group depends upon many social factors such as the parents' economic condition, education, culture, region and government administration, and not all of these factors can be captured in data. We analyse only what can be measured quantitatively, not what is qualitative, so the results or conclusions are not 100% correct because many aspects are ignored.

4. It is liable to be misused: As W. I. King points out, "One of the short-comings of statistics is that they do not bear on their face the label of their quality." We can check the data and the procedure by which conclusions were reached, but the data may have been collected by inexperienced persons, or by persons who were dishonest or biased. Statistics is a delicate science and can easily be misused by an unscrupulous person, so data must be used with caution; otherwise the results may prove disastrous.

5. Laws are not exact: The two fundamental laws of statistics, (i) the law of inertia of large numbers and (ii) the law of statistical regularity, are not as exact as the laws of the natural sciences. They are based on probability, so statistical results will not always be as precise as those of scientific laws. On the basis of probability or interpolation we can only estimate the production of paddy in 2008; we cannot claim that the estimate is 100% accurate. Only approximations are made.

6. Results are true only on average: Results are obtained by interpolation, for which time series, regression or probability methods may be used, and they are not absolutely true. If the average marks in statistics of two sections of students are the same, it does not mean that all 50 students in section A obtained the same marks as those in section B. There may be much variation between the two, so we obtain only average results. "Statistics largely deals with averages and these averages may be made up of individual items radically different from each other." (W. I. King)

7. Too many methods to study problems: In this subject many methods may be used to find a single result. Variation can be measured by quartile deviation, mean deviation or standard deviation, and the results differ in each case. "It must not be assumed that the statistics is the only method to use in research, neither should this method be considered the best attack for the problem." (Croxton and Cowden)

Data

Facts and figures that can be measured numerically are studied in statistics. A numerical measure of a characteristic is known as an observation, and a collection of observations is termed data. Data are collected by individual research workers or by organizations through sample surveys or experiments, keeping in view the objectives of the study. The data collected may be:
• Primary Data
• Secondary Data

Primary Data

Primary data are raw data (data that have not been processed or tailored) which have just been collected from the source and have not undergone any statistical treatment such as sorting or tabulation. The term primary data is sometimes also used to refer to first-hand information.
Sources of Primary Data

The sources of primary data are primary units such as basic experimental units, individuals and households. The following methods are commonly used to collect data from primary units, and the choice of method depends on the nature of the primary unit. (Published data and data collected in the past are, by contrast, called secondary data.)

• Personal Investigation: The researcher conducts the experiment or survey himself or herself and collects the data from it. Data collected in this way are generally accurate and reliable. This method is feasible only for small-scale laboratory or field experiments and pilot surveys. It is not practicable for large-scale experiments and surveys because it takes too much time.

• Investigators: Trained (experienced) investigators are employed to collect the required data. In surveys they contact the individuals and fill in questionnaires after asking for the required information. A questionnaire is an inquiry form with a number of questions designed to obtain information from the respondents. This method is used by most organizations and gives reasonably accurate information, but it is costly and may also be time-consuming.

• Questionnaire: The required information is obtained by sending a questionnaire (printed or in soft form) by mail to the selected individuals (respondents), who fill it in and return it to the investigator. This method is relatively cheap compared with the "through investigators" method, but the non-response rate is very high because most respondents do not bother to fill in the questionnaire and send it back.

• Local Sources: Local representatives or agents are asked to send the requisite information, which they provide on the basis of their own experience. This method is quick but gives only rough estimates.

• Telephone: The information may be obtained by contacting individuals by telephone. This method is quick and provides the required information accurately.

• Internet: With the introduction of information technology, people may be contacted through the internet and asked to provide the pertinent information. Google survey is widely used nowadays as an online method of data collection, and there are many paid online survey services too.

Secondary Data

Secondary data are data that have already been collected by someone else and may already have been sorted, tabulated and subjected to statistical treatment. In this sense they are processed or "tailored" data.

Sources of Secondary Data

• Government Organizations: Federal and Provincial Bureau of Statistics, Crop Reporting Service (Agriculture Department), Census and Registration Organization, etc.
• Semi-Government Organizations: Municipal committees, District Councils, commercial and financial institutions such as banks, etc.
• Teaching and Research Organizations: Research journals and newspapers
• Internet

Primary and Secondary Data in Statistics

The difference between primary and secondary data is that primary data are collected first-hand by a researcher (an organization, person, authority, agency or party) through experiments, surveys, questionnaires, focus groups, interviews and direct measurement, while secondary data have been collected by someone else and are readily available to the public through publications, journals and newspapers.

Classification of Data

Data classification is broadly defined as the process of organizing data by relevant categories so that it may be used and protected more efficiently. On a basic level, the classification process makes data easier to locate and retrieve. Data classification is of particular importance when it comes to risk management, compliance and data security. It involves tagging data to make it easily searchable and trackable. It also eliminates multiple duplications of data, which can reduce storage and backup costs while speeding up the search process. Though the classification process may sound highly technical, it is a topic that should be understood by your organization's leadership.

Data are of two types: qualitative data and quantitative data. Qualitative data represent a quality, whereas quantitative data represent a numeric quantity.

Definition of Classification of Data

According to Secrist, "Classification is the process of arranging data into sequences and groups according to their common characteristics". In other words, classification of data is the process of organizing data into groups according to various parameters. The most crucial parameter is the similarity that exists among the data.
For example, the students who have registered for a sports event can be classified on the following bases:
• Gender
• Age
• Weight
• Height
• Institutions/Colleges
• Sports played by them, etc.

Functions of Classification of Data

1. Studying relations: Classifying the collected data helps in analysing and studying the relationships within it. Moreover, organized statistical data enable effective decision making.

2. Condensing the data: The data collected for statistical work are often extensive and raw. To make decisions based on the data, it is crucial to make them more comprehensible. This can be done with the help of tabulation. Classification thus provides a condensed form of the data that is easy to understand.

3. Treatment of data: Data collected from various sources are meaningless by themselves and must be processed to be useful for decision making. Raw, unclassified data are difficult to treat, so it is important to classify the data first. Classification facilitates the statistical treatment of the data.

4. Comparisons: It is impossible to arrive at any conclusion from extensive, raw and unclassified data. Conclusions cannot be reached without treating the data and carrying out a statistical analysis. Classified, organized and tabulated data enable analysts to make meaningful comparisons on various criteria.

Rules for Classifying Data

Classification of the collected data is a very important step in statistical treatment, so it is all the more important to remember the rules for classifying data. These rules act as guiding principles for well-classified data:

1. Unambiguous: The classes should be rigid and unambiguous (clear). An unclear classification can have severe consequences and can affect all further statistical treatment.

2. Exhaustive: The classification must be exhaustive, in the sense that every item of data should belong to one of the classes or categories.

3. Stability: To allow effective comparison, the classification must be stable: the same classification pattern must be adopted throughout the analysis. Adopting different classification techniques within the same analysis leads to ambiguity.

4. Suitable for the purpose: It is crucial to keep the objective of the report or analysis in mind while classifying data. Avoid classifying the data in a manner that does not suit the purpose of the inquiry.

5. Flexibility: It is important to classify data in a manner that allows future modification. Changing conditions may make it necessary to change the statistical methods and the classification, and a flexible classification makes such changes easier.

Problems with Classifying Data

Classification of data has many functions and benefits, but there are also some key difficulties in organizing data. The most important are mentioned below:

1. Organizing data can be a very tedious and complex task for many companies and individuals.
2. Classifying data relies largely on the classifier's own judgement, which can lead to misjudgements. These misjudgements can cause a lot of inconvenience and error.
3. Redoing the entire process of classification can be very time-consuming and nerve-racking.
4. Classifying data usually requires the help of a statistical analyst.
5. It is not possible to classify data without at least a moderate knowledge of the subject matter.

Organization of Data

1. Chronological Classification: Chronological classification emphasizes the time of occurrence. Data are classified on the basis of differences in time. Time series data (used frequently in economic and business statistics) are an example of data classified chronologically.

2. Geographical Classification: Geographical classification emphasizes the geographical representation of data. Data are classified on the basis of geographical boundaries and differences in location. Classifying by state, city or district is a geographical classification.
Classifying by country or continent is also an example of geographical classification.

3. Qualitative Classification: Qualitative classification emphasizes a qualitative characteristic of the data. Data are classified on the basis of qualities such as honesty, intelligence and aptitude.

4. Quantitative Classification: Quantitative classification emphasizes a quantitative characteristic of the data. Data are classified on the basis of quantities such as sales, profits, age, height and weight.

Introducing Tabulation

Tabulation is the process of arranging the collected data in a tabular format, that is, the systematic presentation of data in rows and columns. Rows are horizontal arrangements, whereas columns are vertical arrangements. Tabulation is an important device for presenting data in a condensed manner that is easily understandable and furnishes maximum information. It also facilitates easy comparison between two or more parameters.

There are 7 key parts of a table:
1. Table number
2. Table title
3. Headnotes (also known as prefatory notes)
4. Captions
5. The body of the table
6. Foot-note
7. Source note

Tabulation is normally a prerequisite for creating charts and graphical representations. Data, tabulation and these diagrammatic representations are very important in policy making, decision making and the formulation of strategies.

STEPS FOR EFFECTIVE DATA CLASSIFICATION

1. Understand the current setup: Taking a detailed look at the location of current data and at all regulations that pertain to your organization is perhaps the best starting point for classifying data effectively. You must know what data you have before you can classify it.

2. Create a data classification policy: Staying compliant with data protection principles in an organization is nearly impossible without a proper policy. Creating a policy should be your top priority.

3. Prioritize and organize data: Once you have a policy and a picture of your current data, it is time to classify the data properly. Decide on the best way to tag your data based on its sensitivity and privacy.

Difference between Classification and Tabulation

Basis for comparison | Classification | Tabulation
Meaning | The process of grouping data into different categories on the basis of nature, behaviour or common characteristics. | The process of summarizing data and presenting it in a compact form by putting it into a statistical table.
Order | After data collection | After classification
Arrangement | Attributes and variables | Columns and rows
Purpose | To analyse data | To present data
Bifurcates data into | Categories and sub-categories | Headings and sub-headings

Requisites of a Good Statistical Table

1. Suits the purpose
2. Scientifically prepared
3. Clarity
4. Manageable size
5. Columns and rows should be numbered
6. Suitably approximated
7. Attractive get-up
8. Units should be mentioned
9. Averages and totals should be given
10. Logically arranged
11. Proper lettering

Frequency

The frequency of any value is the number of times that value appears in a data set. For example, if two children in a group name blue as their favourite colour, the frequency of blue is two. To make sense of raw data we must organize it, and finding the frequency of each data value is how this organization is done.

Frequency Distribution

It is often not easy or feasible to read the frequencies directly from a very large data set. To make sense of such data we prepare a frequency table and graphs.

Types of Frequency Distribution

Frequency distributions are classified into five types:
1. Exclusive Series
2. Inclusive Series
3. Open End Series
4. Cumulative Frequency Series
5. Mid-Values Frequency Series

Exclusive Series

In an exclusive series, each class interval counts all the data items with values from its lower limit up to just below its upper limit. In other words, a class does not include items with values less than the lower limit, equal to the upper limit or greater than the upper limit. Note that the upper limit of one class is repeated as the lower limit of the next class. This is the most commonly used type of frequency distribution.

Weight    Frequency
40-50     2
50-60     10
60-70     5
70-80     3

Inclusive Series

In contrast to an exclusive series, an inclusive series includes both the lower and the upper limit of each class. Items with values less than the lower limit or greater than the upper limit are not included.

Marks     Frequency
10-19     5
20-29     13
30-39     6

Open End Series

In an open-end series, the lower limit of the first class and the upper limit of the last class are missing. Instead, the first class is stated as 'below' its upper limit and the last class as its lower limit 'and above'.

Age            Frequency
Below 5        4
5-10           6
10-20          10
20 and above   8

Cumulative Frequency Series

In a cumulative frequency series, the frequencies of all the preceding class intervals are added (or subtracted) to determine the frequency for a particular class. The classes are converted into either 'less than the upper limit' or 'more than the lower limit' form.

Mid-Values Frequency Series

A mid-values frequency series gives the mid-values of the class intervals and the corresponding frequencies. In other words, each mid-value stands for the range of a particular class interval.

GRAPHS OF FREQUENCY DATA

1. Histogram
2. Bar Graphs
3. Polygons
4. Pie Chart
5. Line Graphs
6. Ogive Graph / Cumulative Frequency

Histogram

A histogram is a plot that lets you discover, and show, the underlying frequency distribution (shape) of a set of continuous data. This allows the data to be inspected for its underlying distribution (e.g., normal distribution), outliers, skewness, etc. The raw data for an example histogram are:

36 25 38 46 55 68 72 55 36 38
67 45 22 48 91 46 52 61 58 55

Constructing a histogram from a continuous variable

To construct a histogram from a continuous variable, first split the data into intervals, called bins.
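The binning step can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `bin_scores` is hypothetical, and the choice of 8 bins of width 10 starting at 20 is taken from the example's bin table. It applies the exclusive-series rule described earlier (lower limit included, upper limit excluded) to the 20 raw values listed above.

```python
# Raw values from the histogram example in the text.
scores = [36, 25, 38, 46, 55, 68, 72, 55, 36, 38,
          67, 45, 22, 48, 91, 46, 52, 61, 58, 55]


def bin_scores(values, start, width, n_bins):
    """Group values into bins using the exclusive-series rule:
    a value v falls in the bin [lower, upper) with lower <= v < upper."""
    bins = [[] for _ in range(n_bins)]
    for v in values:
        index = (v - start) // width  # which bin the value belongs to
        if 0 <= index < n_bins:       # ignore values outside the bin range
            bins[index].append(v)
    return bins


# Assumption: bins 20-30, 30-40, ..., 90-100, as in the example.
bins = bin_scores(scores, start=20, width=10, n_bins=8)
frequencies = [len(members) for members in bins]

for i, members in enumerate(bins):
    lower = 20 + 10 * i
    print(f"{lower}-{lower + 10}: frequency {len(members)}, values {members}")
```

A value that sits exactly on a boundary (say 30) would go into the 30-40 bin under this rule. A different rounding convention would need a different index calculation.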
In the histogram example, age has been split into bins, with each bin representing a 10-year period starting at 20 years. Each bin contains the number of occurrences of scores in the data set that fall within that bin. For the data set above, the frequency in each bin is tabulated below, along with the scores that contribute to it:

Bin       Frequency   Scores included in bin
20-30     2           25, 22
30-40     4           36, 38, 36, 38
40-50     4           46, 45, 48, 46
50-60     5           55, 55, 52, 58, 55
60-70     3           68, 67, 61
70-80     1           72
80-90     0           -
90-100    1           91

Notice that, unlike a bar chart, a histogram has no "gaps" between the bars (although some bars may be "absent", reflecting zero frequency). This is because a histogram represents a continuous data set, and as such there are no gaps in the data (although you have to decide whether to round up or round down scores that fall on the boundaries of bins).

Bar Graph

A bar graph is a chart that uses bars to show comparisons between categories of data. The bars can be either horizontal or vertical. Bar graphs with vertical bars are sometimes called vertical bar graphs. A bar graph has two axes. One axis describes the categories being compared, and the other carries the numerical values of the data. It does not matter which axis is which, but the choice determines which kind of bar graph is shown: if the categories are on the horizontal axis the bars are vertical, and if the values are along the horizontal axis the bars are horizontal.

Types of Bar Graphs

There are many different types of bar graphs, and they are not always interchangeable. Each type works best with a different type of comparison, so the comparison you want to make helps determine which type to use. Some simple bar graphs are described first.

Vertical bar graph: A simple vertical bar graph is best when you have to compare two or more independent variables, each of which relates to a fixed value. The values are positive and can therefore be measured up from the horizontal axis.

Horizontal bar graph: If the data have both negative and positive values but are still a comparison between two or more fixed independent variables, they are best suited to a horizontal bar graph. The vertical axis can be placed in the middle of the horizontal axis, allowing both negative and positive values to be represented.
Range bar graph: A range bar graph represents a range of data for each independent variable. Temperature ranges and price ranges are common data sets for range graphs. Unlike the graphs above, the bars do not start from a common zero point. Each bar begins at the low end of the range for that particular item. A range bar graph can be either horizontal or vertical.
Difference Between a Bar Chart and a Histogram

The major difference is that a histogram is used only to plot the frequency of score occurrences in a continuous data set that has been divided into classes, called bins. Bar charts, on the other hand, can be used for many other types of variables, including ordinal and nominal data.

Polygons

A frequency polygon is closely related to a histogram and is used to compare sets of data or to display a cumulative frequency distribution. It uses a line graph to represent quantitative data.

Statistics deals with the collection of data and information for a particular purpose. The tabulation of the runs scored off each ball in cricket, for example, gives the statistics of the game. Tables, graphs, pie charts, bar graphs, histograms, polygons, etc. are used to represent statistical data pictorially. Frequency polygons are a visually effective method of representing quantitative data and their frequencies.

To draw a frequency polygon, we begin by drawing the histogram and then follow these steps:
Step 1: Choose the class intervals and mark the values on the horizontal axis.
Step 2: Mark the mid-value of each interval on the horizontal axis.
Step 3: Mark the frequencies on the vertical axis.
Step 4: For each class interval, mark a point above the middle of the interval at the height of its frequency.
Step 5: Connect these points using line segments.
Step 6: The resulting figure is the frequency polygon.

Solution: The following steps are used to construct a histogram, and from it the frequency polygon, for the given data:
• The heights are represented on the horizontal axis on a suitable scale.
• The number of students is represented on the vertical axis on a suitable scale.
• Rectangular bars are drawn with width equal to the class size and height corresponding to the frequency of the class interval.
• The polygon ABCDEF then represents the given data graphically in the form of a frequency polygon.
[Figure: histogram with frequency polygon ABCDEF]
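The polygon-drawing steps can be sketched in Python as a calculation of the points to plot. This is an illustrative sketch only: the data are taken from the 2014 question paper listed at the start of this unit (class intervals 15-25 to 65-75), the function name `polygon_points` is hypothetical, and closing the polygon with zero-frequency points one class width beyond each end is a common convention assumed here, not something stated in the notes.

```python
# Frequency distribution from the 2014 question paper (frequency polygon question).
classes = [(15, 25), (25, 35), (35, 45), (45, 55), (55, 65), (65, 75)]
frequencies = [10, 16, 18, 15, 13, 4]


def polygon_points(classes, frequencies):
    """Return the (mid-value, frequency) points of a frequency polygon.

    Assumption: all classes have equal width, and the polygon is closed
    on the horizontal axis one class width before the first mid-value
    and one class width after the last mid-value.
    """
    width = classes[0][1] - classes[0][0]
    mids = [(lower + upper) / 2 for lower, upper in classes]  # Step 2
    points = list(zip(mids, frequencies))                     # Steps 3-4
    start = (mids[0] - width, 0)
    end = (mids[-1] + width, 0)
    return [start] + points + [end]                           # Step 5: join in order


points = polygon_points(classes, frequencies)
for x, f in points:
    print(f"mid-value {x}: frequency {f}")
```

Joining the returned points in order with straight line segments gives the frequency polygon. The mid-values here are 20, 30, ..., 70.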
PIE CHART

A pie chart (or circle chart) is a circular statistical graphic that is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice (and consequently its central angle and area) is proportional to the quantity it represents. While it is named for its resemblance to a pie that has been sliced, there are variations in the way it can be presented. The earliest known pie chart is generally credited to William Playfair's Statistical Breviary of 1801.

Represent the following data by a pie chart:

Food            87
Clothing        24
Recreation      11
Education       13
Rent            25
Miscellaneous   20

The worked table below lists the amounts as 8700, 2400 and so on (total 18000). The proportions are the same as in the question.

Item            Expenditure   Percentage   Degrees
Food            8700          48.33        174
Clothing        2400          13.33        48
Recreation      1100          6.11         22
Education       1300          7.22         26
Rent            2500          13.89        50
Miscellaneous   2000          11.11        40
Total salary    18000         100          360

To convert a percentage to degrees: Degrees = (360 x Percentage) / 100
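The percentage and degree columns can be checked with a short Python sketch. The function name `pie_rows` is hypothetical. The data are the six expenditure items from the question, and the conversion is the formula given above.

```python
# Expenditure items from the pie chart question.
expenditure = {
    "Food": 87,
    "Clothing": 24,
    "Recreation": 11,
    "Education": 13,
    "Rent": 25,
    "Miscellaneous": 20,
}


def pie_rows(data):
    """Return {item: (percentage, degrees)} for a pie chart.

    percentage = 100 * value / total
    degrees    = 360 * percentage / 100
    """
    total = sum(data.values())
    rows = {}
    for item, value in data.items():
        percentage = 100 * value / total
        degrees = 360 * percentage / 100
        rows[item] = (percentage, degrees)
    return rows


rows = pie_rows(expenditure)
for item, (percentage, degrees) in rows.items():
    print(f"{item:<14} {percentage:6.2f}%  {degrees:6.1f} degrees")
```

Because the total is 180, each unit of expenditure corresponds to exactly 2 degrees, which is why the angles come out as whole numbers (174, 48, 22, 26, 50 and 40).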
Line Graphs

Line graphs are used to display quantitative values over a continuous interval or time period. A line graph is most frequently used to show trends and to analyse how the data have changed over time. Line graphs are drawn by first plotting data points on a Cartesian coordinate grid and then connecting these points with a line. Typically, the y-axis carries a quantitative value, while the x-axis is a timescale or a sequence of intervals. Negative values can be displayed below the x-axis.

The direction of the line works as a natural metaphor for the data: an upward slope indicates where values have increased and a downward slope indicates where values have decreased. The line's path across the graph can create patterns that reveal trends in a data set. When several lines (data series) are plotted together, they can be compared with one another. However, avoid using more than 3-4 lines per graph, as this makes the chart cluttered and harder to read. One solution is to divide the chart into small multiples (a small line graph for each data series).

Food            8700
Clothing        2400
Recreation      1100
Education       1300
Rent            2500
Miscellaneous   2000
Total salary    18000

[Figure: chart of the expenditure data above, with the items on the x-axis and values from 0 to 10000 on the y-axis]

Ogive Graph / Cumulative Frequency

An ogive (oh-jive), sometimes called a cumulative frequency polygon, is a type of frequency polygon that shows cumulative frequencies. In other words, the cumulative percentages are added on the graph from left to right.
An ogive graph plots cumulative frequency on the y-axis and class boundaries along the x-axis. It is very similar to a histogram, except that instead of rectangles an ogive has a single point marking where the top right corner of each rectangle would be. It is usually easiest to create this kind of graph from a frequency table.

Draw an Ogive Graph

Example question: Draw an ogive graph for the following set of data:
02, 07, 16, 21, 31, 03, 08, 17, 21, 55, 03, 13, 18, 22, 55, 04, 14, 19, 25, 57, 06, 15, 20, 29, 58.

Step 1: Make a relative frequency table from the data. The first column has the class limits, the second column has the frequency (the count) and the third column has the relative frequency (class frequency / total number of items).

Step 2: Add a fourth column and cumulate (add up) the frequencies in column 2, going down from top to bottom. For example, the second entry is the sum of the first and second rows in the frequency column (5 + 5 = 10), and the third entry is the sum of the first, second and third rows (5 + 5 + 6 = 16).

Step 3: Add a fifth column and cumulate the relative frequencies from column 3. If you do this step correctly, the final value should be 100% (or 1 as a decimal).

Step 4: Draw an x-y graph with percent cumulative relative frequency on the y-axis (from 0 to 100%, or from 0 to 1 as a decimal). Mark the x-axis with the class boundaries.

Step 5: Plot your points. Note: each point should be plotted at the upper class boundary. For example, if the first class is 0 to 10, its point should be plotted at 10.

Step 6: Connect the dots with straight lines. The ogive is one continuous line, made up of several smaller line segments that connect pairs of dots, moving from left to right.

Exercises

Draw a histogram, bar graph, frequency polygon, pie chart, line graph and ogive (cumulative frequency graph) for the following data.

Q.1
X:   0-9   10-19   20-29   30-39   40-49   50-59
F:   5     8       7       11      9       10

Q.2
Section   Average marks in Mathematics   No. of students
A         75                             50
B         60                             60
C         55                             50

Q.3
Wages (Rs.):      0-10   10-20   20-30   30-40   40-50   50-60
No. of workers:   25     30      45      15      25      30
[Ans: 35, 40]

Q.4
Marks:       0-10   10-20   20-30   30-40   40-50   50-60   60-70
Frequency:   50     10      20      40      20      30      30
[Ans: 30; 28]

Q.5
Marks          No. of students     Marks          No. of students
Less than 10   5                   40-50          10
Less than 20   20                  50 and above   25
10-30          35                  60 and above   9
30 and above   60
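As a starting point for the ogive part of Q.1 above, the points to plot can be computed with a short Python sketch. It is illustrative only: the function name `ogive_points` is hypothetical, and because Q.1 is an inclusive series (0-9, 10-19, ...) the sketch assumes the usual conversion to class boundaries, moving each upper limit up by half the gap between consecutive classes (9 becomes 9.5, and so on).

```python
# Inclusive class intervals and frequencies from exercise Q.1.
classes = [(0, 9), (10, 19), (20, 29), (30, 39), (40, 49), (50, 59)]
frequencies = [5, 8, 7, 11, 9, 10]


def ogive_points(classes, frequencies):
    """Return (upper class boundary, cumulative frequency, cumulative %)
    for a 'less than' ogive.

    Assumption: classes are inclusive and equally spaced, so the upper
    boundary is the upper limit plus half the gap between classes.
    """
    gap = classes[1][0] - classes[0][1]   # 10 - 9 = 1 for this series
    total = sum(frequencies)
    points = []
    running = 0
    for (lower, upper), f in zip(classes, frequencies):
        running += f                      # Step 2: cumulate the frequencies
        boundary = upper + gap / 2        # Step 5: plot at the upper boundary
        points.append((boundary, running, 100 * running / total))
    return points


points = ogive_points(classes, frequencies)
for boundary, cumulative, percent in points:
    print(f"less than {boundary}: {cumulative} ({percent:.0f}%)")
```

Plotting the cumulative percentage against the upper boundary and joining the points from left to right gives the ogive. The last point reaches 100% because all 50 observations lie below 59.5.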
  • 18. UNIT 2 STATISTICS
UNIVERSITY PREVIOUS YEAR QUESTION PAPERS
2016
Q.1 Calculate the average daily sales from the following data by the assumed mean method.
Daily sales: 40 50 60 70 80
No. of salesmen: 5 6 10 12 3
Ans: 60.55
Q.2 Find out the median from the following table:
Daily wages: 50–59 60–69 70–79 80–89 90–99 100–109 110–119
No. of employees: 15 40 50 60 45 40 15
Ans: 84.08
2015
Q.1 What do you mean by arithmetic mean? Discuss its merits and demerits. Also state its important properties.
Q.2 An incomplete distribution is given below:
Class: 0–10 10–20 20–30 30–40 40–50 50–60 60–70
Frequency: 10 20 ? 40 ? 25 12
Find the missing frequencies, given the total frequency and a median value of 35. (Ans: 35, 25)
Q.3 Calculate the mode from the following:
Marks: 0–10 10–20 20–40 40–50 50–70
No. of students: 2 7 18 15 8
Ans: 35.71
2014
Q.1 What do you mean by central tendency? What are the common measures of central tendency?
Q.2 Given below is the distribution of weights of a group of 60 students in a class:
Weight: 30–34 35–39 40–44 45–49 50–54 55–59 60–64
No. of students: 3 5 12 18 14 6 2
Ans: 47.5
Q.3 Find the geometric mean from the following data:
Diameter: 130 135 140 145 143 148 149 150
No. of screws: 3 4 6 6 3 5 2 2
Ans: 142.3
2013
Q.1 "The purpose of an average is to represent a group of individual values in a simple and concise manner so that a quick understanding of the general size of the individuals in the group can be made easily." Explain.
Q.2 Find the missing frequency from the following data:
Class interval: 0–10 10–20 20–30 30–40 40–50
Frequency: 3 5 ? 3 2
The mean of the distribution is 23. Ans: 7
Q.3 Calculate the median from the following data:
Value: Less than 10; Less than 20; Less than 30; Less than 40; Less than 50; Less than 60; Less than 70; Less than 80
Frequency: 4; 16; 40; 76; 96; 112; 120; 125
Ans: 36.25
Measures of Central Tendencies; Mean, Median, Mode, Geometric Mean.
  • 19. MEASURES OF CENTRAL TENDENCY
Meaning
The word "measures" means "methods" and the term "central tendency" means the "average value" of any statistical series. Thus, the combined term "measures of central tendency" means the methods of finding out the central value or average value of a statistical series or any series of quantitative information.
Definitions
According to Croxton and Cowden, "An average value is a single value within the range of the data that is used to represent all the values in the series. Since an average is somewhere within the range of the data, it is sometimes called a measure of central value."
In the words of Clark, "Average is an attempt to find one single figure to describe whole of figures."
Characteristics
1. It is a single figure expressed in some quantitative form.
2. It lies between the extreme values of a series.
3. It is a typical value that represents all the values of a series.
4. It is capable of giving a central idea about the series it represents.
5. It is determined by some method or procedure.
Essentials of a Good Average
1. It should have a clear definition: The definition of an average should be clear and unambiguous. It should be defined in the form of an algebraic formula, so that each person calculating the average from a set of data arrives at the same figure.
2. It should be simple to understand and easy to calculate: An average should be simple enough for everybody to understand without any doubt about its meaning. The method of calculation should be such that everybody can compute it easily.
3. It should be based on all the observations: An average is not representative unless the entire data are taken for its calculation. So, in order to make an average ideal, it should be based on all the items of the series.
4. It should be suitable for further mathematical treatment: An ideal average should possess some important mathematical properties, so that it is easy to use in further mathematical or statistical analysis. The use of the average should not be restricted to a single purpose; it should also be usable for the calculation of other statistical measures like dispersion, correlation, regression and others.
5. It should not be affected by extreme items: In a sample, there may be wide variation of figures. The extreme items, i.e., the highest and lowest values, may be much higher or lower than the other values. In such a case, an average that is greatly influenced by these extreme values cannot be treated as the true representative of the whole distribution.
Various Measures of Central Tendency
A. Mathematical Averages: Arithmetic Average or Mean; Geometric Mean; Harmonic Mean
B. Positional Averages: Median; Mode; Quartiles; Deciles; Percentiles
C. Miscellaneous Averages: Moving Average; Progressive Average
  • 20. Mean
"Mean of a series is the sum of the values of a variable divided by the number of observations."
$\bar{X} = \frac{\sum X}{N}$
Method: Individual Series; Discrete Series; Continuous Series
Direct method: $\bar{X} = \frac{\sum X}{N}$; $\bar{X} = \frac{\sum FX}{N}$; $\bar{X} = \frac{\sum Fm}{N}$
Short-cut method: $\bar{X} = A + \frac{\sum d}{N}$; $\bar{X} = A + \frac{\sum Fd}{N}$; $\bar{X} = A + \frac{\sum Fd}{N}$
Step deviation method: $\bar{X} = A + \frac{\sum d'}{N} \times i$; $\bar{X} = A + \frac{\sum Fd'}{N} \times i$; $\bar{X} = A + \frac{\sum Fd'}{N} \times i$
Shortest method: $\bar{X} = m_L - i\left(\frac{\sum CF}{N} - 1\right)$, where $m_L$ = mid value of the last class
Here A = assumed mean, d = deviation of an item (or mid value m) from A, d' = d / i, i = common factor (class interval), F = frequency and CF = cumulative frequency.
Combined Mean: $\bar{X}_{1.2.3} = \frac{N_1\bar{X}_1 + N_2\bar{X}_2 + N_3\bar{X}_3}{N_1 + N_2 + N_3}$
Properties of Arithmetic Mean
1. The sum of the deviations of the items from the actual mean is always zero.
2. The sum of the squares of deviations of items from the arithmetic mean is minimum, i.e., less than the sum of the squares of deviations of items from any other value.
3. The sum of the given values of a series is equal to the product of their arithmetic average and the number of items of the series.
4. The number of items of a series is equal to the quotient of the sum of the values of the items and their arithmetic mean.
Advantages of Mean
1. It is easy to understand and simple to compute.
2. It is rigidly defined and there is no scope for ambiguity or misunderstanding about its meaning and nature.
3. Its value is based on each and every item of the data. With every change in any item, the value of the average will change.
4. Arranging the data in ascending or descending order is not required while computing the arithmetic mean.
5. It is not very much affected by fluctuations in sampling and thus its result is relatively dependable.
6. It can be reused for further statistical computations.
Disadvantages of Mean
1. In cases where extreme items are either too big or too small, the average is greatly affected by the values of these extreme items. Thus it fails to be the true representative of the series.
2. Its value cannot be determined graphically.
3. In certain cases, the arithmetic mean may give an absurd result.
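The direct, step deviation and shortest methods for a continuous series can be compared with a short Python sketch. The data are the wage distribution from the practice problems (classes 0–10 to 60–70, frequencies 7, 8, 9, 12, 16, 6, 2, stated answer 33); the function names and the choice of assumed mean A = 35 are illustrative assumptions, not part of the notes.

```python
from itertools import accumulate

def mean_direct(mids, freqs):
    # Direct method: sum(F * m) / N
    return sum(f * m for m, f in zip(mids, freqs)) / sum(freqs)

def mean_step_deviation(mids, freqs, assumed, width):
    # Step deviation method: A + (sum(F * d') / N) * i, with d' = (m - A) / i
    n = sum(freqs)
    total = sum(f * (m - assumed) / width for m, f in zip(mids, freqs))
    return assumed + total / n * width

def mean_shortest(mids, freqs, width):
    # Shortest method: mL - i * (sum(CF) / N - 1), CF = less-than cumulative frequencies
    n = sum(freqs)
    return mids[-1] - width * (sum(accumulate(freqs)) / n - 1)

mids = [5, 15, 25, 35, 45, 55, 65]   # mid values of classes 0-10 ... 60-70
freqs = [7, 8, 9, 12, 16, 6, 2]

direct = mean_direct(mids, freqs)
step = mean_step_deviation(mids, freqs, assumed=35, width=10)  # A = 35 is an arbitrary choice
shortest = mean_shortest(mids, freqs, width=10)
print(direct, step, shortest)
```

All three methods are algebraically equivalent, so they agree up to floating-point rounding; the short-cut forms only reduce the arithmetic when working by hand.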
  • 21. [Arithmetic Mean]
1. What do you understand by measures of averages? Explain the features and functions of averages.
2. Define the term "averages". Discuss the functions and types of statistical averages.
3. Explain different methods of measuring averages with examples.
4. State various functions of measures of averages.
5. Why are the averages also known as central tendency? Examine the features of central tendency.
6. What is a statistical average? Explain the features of a good average.
7. What are the functions and limitations of averages?
8. What is arithmetic mean? Explain its properties, merits and limitations.
Practical Problems:
1. Find the mean income of 10 employees in an organization.
Income in Rs. (000): 10.2 15.5 18.9 20.2 25.4 26.2 29.3 31.4 32.5 32.9
[Ans: 24.25]
2. The following are the daily savings of a group of workers in a factory. Calculate the average saving.
Savings (Rs.): 10 11 12 13 15 16 18 20 22 23 25
No. of workers: 2 3 5 8 9 10 15 8 6 5 4
[Ans: 17.19]
3. From the following data relating to daily wages of certain workers in a factory, compute the average wage under the direct and short-cut methods.
Wages (Rs.): 0–10 10–20 20–30 30–40 40–50 50–60 60–70
No. of workers: 7 8 9 12 16 6 2
[Ans: 33]
4. From the data given below find the mean under the step deviation method.
X: 0–9 10–19 20–29 30–39 40–49 50–59
F: 5 8 7 11 9 10
[Ans: 32.7]
5. From the following data relating to marks in Statistics secured by a batch of +3 Commerce students, find out the mean marks:
Marks above: 0 10 20 30 40 50
No. of Students: 50 40 35 27 15 8
[Ans: 30]
6. Calculate the average marks of the students from the following data:
Marks below: 10 20 30 40 50 60 70 80
No. of Students: 15 35 60 84 96 127 198 250
[Ans: 50.4]
7. From the following data compute the arithmetic average under the step deviation method:
Marks below: 100 80 60 40 20
No. of students: 60 55 40 35 5
[Ans: 45]
8. Find the missing frequencies of the following series, if the arithmetic average is 29.75 and the total number of items is 200:
Wages (Rs.): 0–10 10–20 20–30 30–40 40–50 50–60
No. of workers: 25 ? 45 ? 25 30
[Ans: 35, 40]
9. Find the missing frequencies of the following series, if the arithmetic average is 39.5 and the total number of items is 100:
Marks: 0–10 10–20 20–30 30–40 40–50 50–60 60–70
Frequency: 5 10 ? 4 20 3 ?
[Ans: 30; 28]
10. From the following frequency distribution, find the value of the median:
Marks: Less than 10; Less than 20; 10–30; 30 and above; 40–50; 50 and above; 60 and above
No. of Students: 5; 20; 35; 60; 10; 25; 9
11. Calculate the arithmetic mean from the following data:
Wages (in Rs.): Less than 48; Less than 56; 48–64; 64 and above; 72–80; 80 and above; 88 and above
No. of workers: 5; 12; 29; 31; 8; 19; 5
  • 22. 12. Calculate the mean from the following data:
5 persons get less than Rs. 5
12 persons get less than Rs. 10
22 persons get less than Rs. 15
30 persons get less than Rs. 20
36 persons get less than Rs. 25
40 persons get less than Rs. 30
13. The average percentage of marks secured by 200 students of Arts and Commerce is 50. The mean percentage of marks of the Arts students is 40 and that of the Commerce students is 60. Find the number of Arts and Commerce students separately. [Ans: 100; 100]
14. The average marks secured in Economics by all the Commerce and Arts students in their Board examination is 60. The average of such marks for the Commerce students is 70, and that for the Arts students is 50. Find the ratio of the number of students in the Commerce and Arts classes. [Ans: 1:1]
15. The arithmetic average of a series of 20 items has been computed as 400. While computing, two values 450 and 360 have been taken as 540 and 630. Find the correct value of the mean. [Ans: 382]
16. In a B.Com class of 128 students, 48 have failed, securing 25 marks on an average. If the total marks of all the students are 5120, find the average marks secured by the students passing the test. [Ans: 49]
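The last two problems above (the miscopied values in a series of 20 items, and the pass/fail split of 128 students) both rest on the property that the sum of the values equals the mean times the number of items. A minimal Python sketch, with hypothetical helper names:

```python
def corrected_mean(mean, n, wrong, right):
    # Rebuild the total from mean * N, remove the wrong values, add the right ones.
    return (mean * n - sum(wrong) + sum(right)) / n

def combined_mean(groups):
    # groups: list of (number of items, mean) pairs
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n

# Series of 20 items with mean 400; 450 and 360 were wrongly taken as 540 and 630.
fixed = corrected_mean(400, 20, wrong=[540, 630], right=[450, 360])

# 128 students, total marks 5120; 48 failed with an average of 25 marks.
pass_mean = (5120 - 48 * 25) / (128 - 48)

# Cross-check: combining the two groups must give back the class average 5120 / 128.
class_mean = combined_mean([(48, 25), (80, pass_mean)])
print(fixed, pass_mean, class_mean)
```

The cross-check with `combined_mean` is a quick way to confirm that a group mean recovered by subtraction is consistent with the overall total.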
  • 23. GEOMETRIC MEAN
G.M. is the nth root of the product of n items of a series. It is found by multiplying all the n values of a series and extracting the nth root of the product.
Direct method
Individual Series: $G.M. = \sqrt[N]{X_1 \times X_2 \times X_3 \times \cdots \times X_n}$
Discrete and Continuous Series: $G.M. = \sqrt[N]{X_1^{F_1} \times X_2^{F_2} \times X_3^{F_3} \times \cdots \times X_n^{F_n}}$, where $N = \sum F$
Logarithmic method
Individual Series: $G.M. = \text{Antilog}\left(\frac{\sum \log X}{N}\right)$
Discrete and Continuous Series: $G.M. = \text{Antilog}\left(\frac{\sum (F \log X)}{N}\right)$
Uses of Geometric Mean:
1. Geometric mean is useful in calculating the average of ratios.
2. It is useful in calculating the average of changes, i.e., percentage increase or decrease in sales, production, population, rate of interest or any other variable.
3. It is considered the best of averages where more weight is to be given to small items and less weight to large items.
4. It is most suitable in constructing index numbers.
Properties of G.M.
1. The product of the items of a series will remain unchanged if each item is replaced by the geometric mean.
2. The sum of the deviations of the logarithms of the original observations above and below the logarithm of the geometric mean is equal to zero.
3. If the geometric means and the number of items of two series are known, the combined geometric mean can be computed.
4. If the G.M. and the number of items are known, the product of the values can be found by using the formula $(G.M.)^n$.
Advantages of G.M.
1. It is based on all the items of the series.
2. It is capable of further algebraic treatment.
3. It is less affected by extreme items.
4. It is specially useful in determining the average of ratios and percentages.
5. It is a suitable average in determining rates of change in any variable.
6. It is very useful in the construction of an ideal index number.
7. It is hardly affected by the fluctuation of sampling.
Disadvantages of Geometric Mean
1. It is not easily understood and is difficult to calculate.
2. If any value of a series is zero, then the value of the G.M. will also be zero.
3. It gives comparatively more weight to smaller items and less weight to larger items.
Exercise – B [Geometric Mean]
1. Find the G.M. of the series: 133; 141; 125; 173; 182 [Ans: 149]
2. Calculate the G.M. of the figures: 5, 10, 192, 14374, 20498, 120674. [Ans: 126.9]
3. From the following figures find the G.M.:
X: 10 20 30 40 50 60
F: 12 15 25 10 6 2
[Ans: 25.30]
4. Calculate the G.M. for the following distribution:
X: 0–10 10–20 20–30 30–40 40–50
F: 14 23 27 21 15
[Ans: 20.80]
5. Calculate the weighted Geometric Mean from the following data:
Groups: Food; Clothing; Fuel and Lighting; House Rent; Miscellaneous
Index Number: 125; 133; 141; 173; 182
Weights: 7; 5; 4; 1; 3
[Ans: 139.8]
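The logarithmic method for the geometric mean translates directly into code. The sketch below is illustrative (the function name and the use of base-10 logarithms are choices made here, mirroring hand calculation with log tables); it is checked against Exercise B, problems 1 and 5.

```python
import math

def geometric_mean(values, weights=None):
    """G.M. = antilog(sum(F * log X) / N); weights default to 1 for an individual series."""
    if weights is None:
        weights = [1] * len(values)
    if any(v <= 0 for v in values):
        # The logarithm is undefined for zero or negative values.
        raise ValueError("geometric mean needs strictly positive values")
    log_sum = sum(w * math.log10(v) for v, w in zip(values, weights))
    return 10 ** (log_sum / sum(weights))

# Problem 1: individual series
gm_simple = geometric_mean([133, 141, 125, 173, 182])

# Problem 5: weighted G.M. of index numbers
gm_weighted = geometric_mean([125, 133, 141, 173, 182], weights=[7, 5, 4, 1, 3])

print(round(gm_simple, 1), round(gm_weighted, 1))
```

The explicit check for non-positive values reflects the disadvantage noted above: a single zero makes the product, and hence the G.M., zero, and the logarithmic method cannot be applied at all.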
  • 24. Median
Median refers to that value of the variable which divides the series into two equal parts: one part consists of all values greater than the median and the other part consists of all values less than the median. It is a positional average.
Direct method:
Individual and Discrete Series: $M = \text{value of the } \left(\frac{N+1}{2}\right)\text{th item}$
Continuous Series: $M = \text{value of the } \left(\frac{N}{2}\right)\text{th item}$
Interpolation method (for an ascending series): $M = L_1 + \frac{i}{f}(m - c)$
Where: $L_1$ = lower limit of the median class; i = class interval of the median class; f = frequency of the median class; $m = \frac{N}{2}$; c = cumulative frequency of the class preceding the median class.
Properties of Median:
1. Median is an average of position.
2. The sum of the deviations taken from the median, ignoring plus and minus signs, will be less than the sum of deviations from any other arbitrary point.
3. If the median and the number of items are known, missing frequencies can be traced out.
Advantages of median:
1. It is easy to calculate, simple to understand and rigidly defined.
2. It is not affected by the extreme items of a series.
3. It can be determined easily in open end series and with unequal class intervals.
4. It can be calculated graphically.
5. It is useful when the data cannot be measured quantitatively, such as honesty, wealth, intelligence etc.
6. It can be located by inspection from the series.
Disadvantages of median:
1. It is not based on all the observations of the series, hence may not be representative in many cases.
2. It is not capable of further algebraic treatment.
3. It is very much affected by fluctuations in sampling.
4. Median ignores the values of extreme items.
5. It is erratic if the number of items is small.
6. It cannot be determined if the data are not arranged in proper form, in either ascending or descending order.
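The interpolation method for a continuous series can be sketched as follows. The function name is hypothetical and equal class widths are assumed; the two data sets are taken from the median exercises and past papers in these notes (stated answers 30.625 and 36.25).

```python
def grouped_median(lower_limits, width, freqs):
    """M = L1 + (i / f) * (m - c), with m = N / 2 and c the cumulative
    frequency of the classes before the median class."""
    n = sum(freqs)
    m = n / 2
    cum = 0  # cumulative frequency before the current class
    for lower, f in zip(lower_limits, freqs):
        if cum + f >= m:  # first class whose cumulative frequency reaches N/2
            return lower + width / f * (m - cum)
        cum += f
    raise ValueError("empty distribution")

# Classes 0-10 ... 60-70 with frequencies 7, 18, 24, 32, 10, 6, 5
median_a = grouped_median([0, 10, 20, 30, 40, 50, 60], 10, [7, 18, 24, 32, 10, 6, 5])

# "Less than" cumulative frequencies 4, 16, 40, 76, 96, 112, 120, 125 converted
# to simple class frequencies by successive subtraction
cumulative = [4, 16, 40, 76, 96, 112, 120, 125]
freqs_b = [c - p for c, p in zip(cumulative, [0] + cumulative[:-1])]
median_b = grouped_median([0, 10, 20, 30, 40, 50, 60, 70], 10, freqs_b)

print(median_a, round(median_b, 2))
```

For a cumulative ("less than" or "more than") series, the frequencies must first be converted to simple class frequencies, as done for the second data set.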
  • 25. [Median]
1. Determine the value of the median from the following series:
X: 5 7 9 12 10 8 7 15 21
[Ans: 9]
2. From the following frequency distribution determine the value of the median:
Wages (Rs.): 35 55 45 60 70 65 75 80
No. of Workers: 25 10 12 9 16 8 15 5
[Ans: 60]
3. From the data given below calculate the median:
Classes: 0–10 10–20 20–30 30–40 40–50 50–60 60–70
Frequency: 7 18 24 32 10 6 5
[Ans: 30.625]
4. From the following data determine the value of the median:
X: 0–9 10–19 20–29 30–39 40–49 50–59
F: 5 10 12 8 9 6
[Ans: 27.83]
5. From the following data find out the value of the median:
Marks: Below 20; 20–30; 30–50; 50–70; 70 and above
No. of Students: 3; 4; 10; 5; 3
[Ans: 41]
6. Locate the value of the median from the following series:
Marks less than: 10 20 30 40 50 60 70
No. of Students: 3 10 18 24 33 38 40
[Ans: 33.34]
7. From the following data find out the value of the median:
Marks above: 0 10 20 30 40 50 60
No. of students: 100 80 65 53 43 25 12
[Ans: 33]
8. From the following frequency distribution, find the value of the median:
Marks: Less than 10; Less than 20; 10–30; 30 and above; 40–50; 50 and above; 60 and above
No. of Students: 5; 20; 35; 60; 10; 25; 9
[Ans: 34]
9. From the data given below, trace out the missing frequency when the median is 70:
X: 0–20 20–40 40–60 60–80 80–100 100–120 120–140
F: 5 7 8 ? 10 6 4
[Ans: 20]
10. From the following series, find out the missing frequencies, if its median is 25 and the number of students is 100:
Marks: 0–10 10–20 20–30 30–40 40–50 50–60
No. of students: 20 10 ? 15 ? 5
[Ans: 40, 10]
11. From the following series, trace out the missing frequencies, if its median is 27.5 and the number of items is 50:
X: 0–10 10–20 20–30 30–40 40–50 50–60
F: 4 ? 20 ? 7 3
[Ans: 6, 10]
12. Assume N = 100 and that the class intervals are all of equal size. The first class interval is "10 and under 20". The cumulative frequencies of the 5th, 6th, 7th and 8th class intervals are 45, 70, 90 and 99 respectively. Find the median.
  • 26. Mode
Mode is that value in a series which occurs with the greatest frequency. In the words of Croxton and Cowden, "The mode of a distribution is the value at the point around which the items tend to be most heavily concentrated. It may be regarded as the most typical of a series of values."
Advantages of Mode:
1. It is very simple to calculate, as it can be found even by inspection.
2. It is not affected by extreme items.
3. For open end class intervals it can be determined straight away without estimating the two extreme class limits.
4. It can also be used for qualitative phenomena, as its calculation depends on the frequencies.
5. It can be determined graphically.
6. It is understood by a layman, as it refers to the value with maximum frequency.
Disadvantages of mode:
1. It is not rigidly defined.
2. It is not based on all the observations. Any change in extreme items will not affect the mode value.
3. It is affected by the fluctuation of sampling.
4. It cannot be determined directly in case of bimodal or multimodal series.
5. It is not capable of further algebraic treatment.
6. It cannot be determined from a series of unequal class intervals unless they are arranged in a proper manner.
Choice of a suitable average
No single average is suitable for all practical purposes. The different averages have different characteristics and there is no universally accepted average. The choice of a particular average is usually determined by the purpose for which the investigation is undertaken. For sound statistical analysis, the choice of the average depends upon:
1. The nature and availability of data;
2. The nature of the variable involved;
3. The purpose of the investigation;
4. The system of classification adopted; and
5. The use of the average for further statistical computations.
The choice of a suitable average is very important because a wrong choice may lead to fallacious conclusions. The following points should be remembered while selecting a particular average:
A. Arithmetic mean should be used when:
1. The distribution is not very asymmetrical.
2. The series does not have very large or very small items.
3. The series does not have open end class intervals.
4. All values of the series are considered equally important.
B. Median should be used when:
1. The series has unequal class intervals.
2. The series has open end class intervals.
3. The purpose is to determine the rank of various values.
C. Mode should be used when:
1. The purpose is to find out the most frequent item of a series.
  • 27. 2. The data are qualitative in nature.
3. The purpose is to find out the most common item of a series.
4. The purpose is to find the typical number of children per household, size of shirt collar or shoes, number of rooms per household etc.
D. Geometric mean should be used when:
1. Ratios, rates and percentages are to be averaged.
2. More weight is to be given to small items and less weight to large items.
3. It is required to construct index numbers.
E. Harmonic mean should be used when:
1. It is required to find out the average speed, average time to do a particular work, and average price at which an item can be bought or sold.
2. It is required to compute the average rate of change in profit or loss of a concern.
Limitations of averages:
1. Sometimes an average might give a very absurd result. For example, the average number of children per family might come out as a fraction, which is obviously absurd.
2. An average, being a single figure, gives only the central idea of a phenomenon and does not reveal its entire story.
3. In certain types of distributions, like U-shaped distributions, an average fails to represent the entire series.
4. Since an average is a single figure representing the characteristics of a given distribution, proper care should be taken in its interpretation, otherwise it might lead to very misleading conclusions.
[Mode]
1. The following are the sizes of shoes worn by 9 persons. Calculate the modal size:
Size: 5 4 4.5 5.5 4.5 6 4.5 4 4.5
[Ans: 4.5]
2. Find out the mode from the following observations:
Income (in Rs.): 300 600 900 1200 1500 1800 2100
Employees: 4 8 29 11 18 13 5
[Ans: Rs. 900]
3. Find out the mode from the following data using an analysis table:
X: 3 4 5 6 7 8 9 10 11 12
F: 30 40 38 44 45 42 38 35 30 45
[Ans: 7]
4. Calculate the mode from the following data:
Marks: 5–10 10–15 15–20 20–25 25–30
Students: 10 15 25 20 12
[Ans: 18.3]
5. Calculate the modal value from the following frequency distribution:
X: 0–9 10–19 20–29 30–39 40–49 50–59 60–69 70–79 80–89 90–99
F: 6 29 87 181 247 263 133 43 9 2
[Ans: 47.55]
6. Find out the mode from the following data:
Less than: 5 10 15 20 25 30 35 40 45
No. of items: 29 224 465 582 634 644 650 653 655
[Ans: 11.35]
7. From the data given below find the mode:
  • 28. Wages in Rs. (above): 30 40 50 60 70 80 90
No. of Workers: 520 470 399 210 105 45 7
[Ans: Rs. 55.84]
8. From the following series, determine the value of the mode:
Marks below: 100 90 80 70 60 50 40 30 20 10
No. of Students: 50 45 43 36 30 20 16 11 6 3
[Ans: 56]
9. Locate the value of the mode from the data given below by the appropriate method:
X: 0–10 10–20 20–30 30–40 40–50 50–60 60–70 70–80
F: 4 6 20 32 33 17 8 2
[Ans: 40.05]
10. Find out the missing frequencies in the following series, if the mode is 34 and the number of items is 60:
Wages (Rs.): 0–10 10–20 20–30 30–40 40–50 50–60 60–70
No. of Students: 8 7 ? 20 ? 6 4
[Ans: 10, 5]
11. From the data given below, find out the missing frequencies, if the median is 67, the mode is 68 and the number of observations is 115:
X: 0–20 20–40 40–60 60–80 80–100 100–120 120–140
F: 2 8 30 ? ? ? 2
[Ans: 50, 20, 3]
12. In the following wage distribution, the median and mode are Rs. 33.5 and Rs. 34 respectively, but three class frequencies are missing. Find them:
X: 0–10 10–20 20–30 30–40 40–50 50–60 60–70 Total
F: 4 16 ? ? ? 6 4 230
[Ans: 60, 100, 40]
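The notes above do not state a formula for the mode of a continuous series, so the sketch below assumes the standard interpolation formula for equal class intervals, Mode = L + (f1 − f0) / (2·f1 − f0 − f2) × i, together with Karl Pearson's empirical relation Mode = 3 Median − 2 Mean. Function names are hypothetical, and the sketch does not cover irregular distributions that need a grouping (analysis) table.

```python
def grouped_mode(lower_limits, width, freqs):
    """Interpolated mode for equal class intervals.
    f1 = frequency of the modal class, f0 and f2 = frequencies of the
    preceding and succeeding classes (taken as 0 at either end)."""
    k = max(range(len(freqs)), key=freqs.__getitem__)  # index of the modal class
    f1 = freqs[k]
    f0 = freqs[k - 1] if k > 0 else 0
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0
    return lower_limits[k] + (f1 - f0) / (2 * f1 - f0 - f2) * width

def empirical_mode(mean, median):
    # Empirical relation for moderately asymmetrical distributions.
    return 3 * median - 2 * mean

# Mode exercise 4: marks 5-10 ... 25-30 with 10, 15, 25, 20, 12 students
mode_marks = grouped_mode([5, 10, 15, 20, 25], 5, [10, 15, 25, 20, 12])

# Mean 45 and median 48, as in one of the Unit 3 past-paper questions
mode_empirical = empirical_mode(mean=45, median=48)

print(round(mode_marks, 1), mode_empirical)
```

When two classes share the highest frequency, or the frequencies rise and fall irregularly, `max` simply picks the first highest class; in such cases the grouping method or the empirical relation is the safer route.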
  • 29. Unit-3 MEASURE OF VARIATION
PT R.S.S.UNIVERSITY PREVIOUS YEAR QUESTION PAPERS
2016
Q.1 A factory produces two types of electric lamps, A and B. In an experiment relating to their lives the following results were obtained:
Length of life: 500–700 700–900 900–1100 1100–1300 1300–1500
No. of lamps (A): 5 11 26 10 8
No. of lamps (B): 4 30 12 8 6
Ans: C.V. of A = 21.64%, C.V. of B = 23.41%; A is more consistent.
Q.2 Calculate the standard deviation of the following distribution by taking an assumed mean:
Age: 20–25 25–30 30–35 35–40 40–45 45–50
No. of persons: 170 110 80 45 40 35
Ans: 7.936
2015
Q.1 (A) What do you mean by mean deviation? How is it different from standard deviation?
(B) For a certain distribution the arithmetic mean is 45, the median is 48 and Karl Pearson's coefficient of skewness is 0.4. Calculate (1) the mode, (2) the standard deviation, (3) the coefficient of variation.
Ans: (1) mode = 54, (2) standard deviation = -22.5, (3) coefficient of variation = -50
Q.2 Calculate the standard deviation of the following marks obtained by 5 students in a group: 8, 12, 13, 15, 22
Ans: 4.60
2014
Q.1 (A) What do you mean by deviation? How is it different from standard deviation?
(B) Karl Pearson's coefficient of skewness is 0.5, the median is 42 and the mode is 32. Calculate (1) the mean, (2) the standard deviation, (3) the coefficient of variation.
Ans: (1) mean = 47, (2) standard deviation = 30, (3) coefficient of variation = 63.83%
Q.2 Calculate the standard deviation of the following data:
X: 20 30 40 50 60 70
Frequency: 8 12 20 10 6 4
Ans: 13.75
2013
Q.1 (A) Explain the meaning of the coefficient of variation. Mention how it is different from variance.
(B) Calculate the standard deviation of the following data: 160, 160, 161, 162, 163, 163, 163, 164, 164, 170
Ans: 2.72
Q.2 Calculate the coefficient of skewness by any method from the given data:
Wages: 0–10 10–20 20–30 30–40 40–50 50–60 60–70
No. of persons: 1 3 11 21 43 32 9
Ans: -0.18
Measure of Variation: Standard Deviation and Skewness
  • 30. PARTITION VALUES
Quartiles, Deciles, Percentiles
Quartiles
The median of a distribution splits the data into two equally-sized groups. In the same way, the quartiles are the three values that split a data set into four equal parts. Note that the "middle" quartile is the median. The upper quartile describes a "typical" mark for the top half of a class and the lower quartile is a "typical" mark for the bottom half of the class. The quartiles are closely related to the histogram of a data set. Since area equals the proportion of values in a histogram, the quartiles split the histogram into four approximately equal areas.
Individual Series
For odd series: $Q_1 = \text{value of the } \frac{(N+1) \times 1}{4}\text{th item}$
For even series: $Q_1 = \text{value of the } \left(\frac{N}{4} + \frac{1+N}{4}\right) \times \frac{1}{4}\text{th item}$
Discrete Series
$Q_1 = \text{value of the } \frac{(N+1) \times 1}{4}\text{th item}$
$Q_2 = \text{value of the } \frac{(N+1) \times 2}{4}\text{th item}$
$Q_3 = \text{value of the } \frac{(N+1) \times 3}{4}\text{th item}$
Continuous Series (for an ascending series): $Q = L_1 + \frac{i}{f}(m - c)$
Where: $L_1$ = lower limit of the quartile class; i = class interval of the quartile class; f = frequency of the quartile class; for $Q_1$, $m = \frac{N \times 1}{4}$; for $Q_2$, $m = \frac{N \times 2}{4}$; for $Q_3$, $m = \frac{N \times 3}{4}$; c = cumulative frequency of the preceding class.
Deciles
In a similar way, the deciles of a distribution are the nine values that split the data set into ten equal parts. You should not try to calculate deciles from small data sets: a single class of marks is too small to get useful values since the extreme deciles are very variable. However, the deciles can be useful descriptions for larger data sets such as national distributions for marks from standard tests.
Individual Series
For odd series: $D_1 = \text{value of the } \frac{(N+1) \times 1}{10}\text{th item}$
For even series: $D_1 = \text{value of the } \left(\frac{N}{10} + \frac{1+N}{10}\right) \times \frac{1}{10}\text{th item}$
Discrete Series
$D_1 = \text{value of the } \frac{(N+1) \times 1}{10}\text{th item}$
$D_2 = \text{value of the } \frac{(N+1) \times 2}{10}\text{th item}$
$D_9 = \text{value of the } \frac{(N+1) \times 9}{10}\text{th item}$
Continuous Series (for an ascending series): $D = L_1 + \frac{i}{f}(m - c)$
Where: $L_1$ = lower limit of the decile class; i = class interval of the decile class; f = frequency of the decile class; for $D_1$, $m = \frac{N \times 1}{10}$; for $D_2$, $m = \frac{N \times 2}{10}$; for $D_9$, $m = \frac{N \times 9}{10}$; c = cumulative frequency of the preceding class.
  • 31. Percentiles
In a similar way, the percentiles of a distribution are the 99 values that split the data set into a hundred equal parts. These percentiles can be used to categorise the individuals into percentile 1, ..., percentile 100. A very large data set is required before the extreme percentiles can be estimated with any accuracy. (The "random" variability in marks is especially noticeable in the extremes of a data set.)
Individual Series
For odd series: $P_1 = \text{value of the } \frac{(N+1) \times 1}{100}\text{th item}$
For even series: $P_1 = \text{value of the } \left(\frac{N}{100} + \frac{1+N}{100}\right) \times \frac{1}{100}\text{th item}$
Discrete Series
$P_1 = \text{value of the } \frac{(N+1) \times 1}{100}\text{th item}$
$P_2 = \text{value of the } \frac{(N+1) \times 2}{100}\text{th item}$
$P_{65} = \text{value of the } \frac{(N+1) \times 65}{100}\text{th item}$
Continuous Series (for an ascending series): $P = L_1 + \frac{i}{f}(m - c)$
Where: $L_1$ = lower limit of the percentile class; i = class interval of the percentile class; f = frequency of the percentile class; for $P_1$, $m = \frac{N \times 1}{100}$; for $P_2$, $m = \frac{N \times 2}{100}$; for $P_{65}$, $m = \frac{N \times 65}{100}$; c = cumulative frequency of the preceding class.
1. From the following data find out the quartiles, deciles and percentiles.
Weight in kg: 47 50 58 45 53 59 47 60 49
2. From the following data find out the quartiles, deciles and percentiles.
Size of items: 5 15 25 35 45 55 65 75 85
Frequency: 3 8 15 20 25 10 9 6 4
3. From the following data find out the quartiles, deciles and percentiles.
Marks: 0–10 10–20 20–30 30–40 40–50
No. of Students: 5 8 15 16 6
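For an individual series, the (N + 1)k/4, (N + 1)k/10 and (N + 1)k/100 position rules differ only in the number of parts, so one helper can serve quartiles, deciles and percentiles. The sketch below applies it to the weights in problem 1 above. Two points are assumptions rather than part of the notes: fractional positions are resolved by linear interpolation between neighbouring items, and positions falling outside 1..N (such as P1 in a small series) are clamped to the nearest item.

```python
def partition_value(values, k, parts):
    """Value of the ((N + 1) * k / parts)th item of the sorted series."""
    data = sorted(values)
    n = len(data)
    pos = (n + 1) * k / parts      # 1-based position in the ordered series
    pos = min(max(pos, 1), n)      # assumption: clamp positions outside 1..N
    whole = int(pos)
    frac = pos - whole
    if whole == n:
        return data[-1]
    # Interpolate between the whole-th item and the next one.
    return data[whole - 1] + frac * (data[whole] - data[whole - 1])

weights = [47, 50, 58, 45, 53, 59, 47, 60, 49]  # kg, from problem 1

q1 = partition_value(weights, 1, 4)
q2 = partition_value(weights, 2, 4)
q3 = partition_value(weights, 3, 4)
d9 = partition_value(weights, 9, 10)
p65 = partition_value(weights, 65, 100)
print(q1, q2, q3, d9, p65)
```

With only nine observations the extreme deciles and percentiles collapse onto the smallest and largest items, which illustrates the warning above about computing them from small data sets.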
  • 32. Measures of Dispersion
Formulae of Measures of Dispersion
On dispersion by the methods of limits:
1. Range = L – S
2. Coefficient of Range = $\frac{L - S}{L + S}$
3. Inter-quartile Range = $Q_3 - Q_1$
4. Coefficient of Inter-quartile Range = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
5. Semi inter-quartile range or Quartile deviation: Q.D. = $\frac{Q_3 - Q_1}{2}$
6. Coefficient of Q.D. = $\frac{Q_3 - Q_1}{Q_3 + Q_1}$
On dispersion by the method of computation:
1. Mean deviation
Individual series: $\delta = \frac{\sum |D|}{N}$
Discrete and Continuous series: $\delta = \frac{\sum f|D|}{N}$
2. Coefficient of M.D.
From Mean: Coeff. M.D. = $\frac{\delta}{\text{Mean}}$; From Median: Coeff. M.D. = $\frac{\delta}{\text{Median}}$; From Mode: Coeff. M.D. = $\frac{\delta}{\text{Mode}}$
3. Standard Deviation
Methods: Individual series; Discrete / Continuous series
Direct method (based on deviations from the mean): $\sigma = \sqrt{\frac{\sum x^2}{N}}$; $\sigma = \sqrt{\frac{\sum fx^2}{N}}$
Short-cut method (on assumed mean): $\sigma = \sqrt{\frac{\sum dx^2}{N} - \left(\frac{\sum dx}{N}\right)^2}$; $\sigma = \sqrt{\frac{\sum f\,dx^2}{N} - \left(\frac{\sum f\,dx}{N}\right)^2}$
Step-deviation method: $\sigma = \sqrt{\frac{\sum d^2}{N} - \left(\frac{\sum d}{N}\right)^2} \times i$; $\sigma = \sqrt{\frac{\sum fd^2}{N} - \left(\frac{\sum fd}{N}\right)^2} \times i$, where d = dx / i and i = common factor
Method based on values (when the assumed mean is taken as zero): $\sigma = \sqrt{\frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2}$; $\sigma = \sqrt{\frac{\sum FX^2}{N} - \left(\frac{\sum FX}{N}\right)^2}$
4. Other Formulae
Variance: $V = \sigma^2$
Standard deviation of the first N natural numbers: $\sigma = \sqrt{\frac{1}{12}(N^2 - 1)}$
Coefficient of Standard Deviation: Coeff. $\sigma = \frac{\sigma}{\text{Mean}}$
Coefficient of Variation: C.V. = $\frac{\sigma}{\text{Mean}} \times 100$
Approximate relations for a normal distribution: Range = $6\sigma$; Q.D. = $\frac{2}{3}\sigma$; M.D. = $\frac{4}{5}\sigma$
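The direct and short-cut formulas for the standard deviation give the same result, which the following Python sketch checks on data from the Unit 3 past papers (stated answers 4.60 and 13.75, and a coefficient of variation of 63.83% for σ = 30, mean = 47). Function names and the assumed mean of 40 are illustrative choices.

```python
import math

def std_dev(values, freqs=None):
    """Direct method: sqrt(sum(f * x^2) / N), x = deviation from the actual mean."""
    if freqs is None:
        freqs = [1] * len(values)
    n = sum(freqs)
    mean = sum(f * v for v, f in zip(values, freqs)) / n
    return math.sqrt(sum(f * (v - mean) ** 2 for v, f in zip(values, freqs)) / n)

def std_dev_assumed_mean(values, freqs, assumed):
    """Short-cut method: sqrt(sum(f*dx^2)/N - (sum(f*dx)/N)^2), dx = X - A."""
    n = sum(freqs)
    s1 = sum(f * (v - assumed) for v, f in zip(values, freqs))
    s2 = sum(f * (v - assumed) ** 2 for v, f in zip(values, freqs))
    return math.sqrt(s2 / n - (s1 / n) ** 2)

def coefficient_of_variation(sd, mean):
    return sd / mean * 100

sd_marks = std_dev([8, 12, 13, 15, 22])                       # individual series
x = [20, 30, 40, 50, 60, 70]
f = [8, 12, 20, 10, 6, 4]
sd_direct = std_dev(x, f)                                     # discrete series
sd_shortcut = std_dev_assumed_mean(x, f, assumed=40)          # A = 40 is arbitrary
cv = coefficient_of_variation(30, 47)

print(round(sd_marks, 2), round(sd_direct, 2), round(sd_shortcut, 2), round(cv, 2))
```

These functions divide by N, as in the formulas above, rather than by N − 1 as is common for sample estimates.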
  • 33. DISPERSION
Meaning: The word dispersion means deviation or difference. In statistics, dispersion refers to the deviation of the values of a variable from their central value. Measures of dispersion indicate the extent to which individual items vary from their averages, i.e., mean, median or mode. They show the spread of the items of a series from their central value.
Definition:
1. According to A. L. Bowley, "Dispersion is the measure of variation of the items."
2. According to L. R. Connor, "Dispersion is a measure of the extent to which the individual items vary."
3. According to Spiegel, "The degree to which numerical data tend to spread about an average value is called the variation or dispersion of the data."
Characteristics of dispersion: From the foregoing definitions, the essential characteristics of a measure of dispersion can be outlined as under:
1. It consists of different methods through which variations can be measured in a quantitative manner.
2. It deals with a statistical series.
3. It indicates the degree or extent to which the various items of a series deviate from its central value.
4. It supplements the measures of central tendency in revealing the characteristics of a frequency distribution.
5. It speaks of the reliability, or otherwise, of the average value of a series.
Characteristics of an ideal measure of dispersion
- It should be rigidly defined.
- It should be easy to calculate and simple to understand.
- It should be based on all the observations of the series.
- It should be capable of further algebraic treatment.
- It should not be affected much by the fluctuation of sampling.
- It should not be unduly affected by the extreme items of the series.
Objectives of dispersion
- A measure of dispersion tells us whether an average is a true representative of the series or not.
- The extent of variability between two or more series can be compared with the help of measures of dispersion.
- It is useful to determine the degree of uniformity, reliability and consistency amongst two or more sets of data.
- Measures of dispersion facilitate the use of other statistical measures like correlation, regression etc. for further analysis.
- Measures of dispersion serve as a basis for control of the variability itself.
Types of Measures of Dispersion
A. Methods of Limit: Range; Inter-quartile range; Semi inter-quartile range; Decile range; Percentile range
B. Methods of Moment: Mean deviation; Standard deviation; Coefficient of variation; Variance
C. Graphic Method: Lorenz Curve
  • 34. Range
Range is defined as the difference between the two extreme values of a series. Thus, it is merely the difference between the largest and smallest items of the series.
Advantages of Range:
1. It is easy to calculate and simple to understand.
2. It is rigidly defined.
3. It takes the least possible time for calculation.
4. In certain types of problems like quality control, weather forecasts etc., use of range is very useful.
Disadvantages of Range:
1. It is influenced very much by fluctuations of sampling.
2. It does not take into consideration all the items of the series.
3. It is not capable of further algebraic treatment.
4. It does not take into consideration the frequencies of a series.
Uses of Range:
- Quality control: Range has a special application in quality control measures. Control charts are prepared on the basis of range for controlling the quality of products.
- Weather forecast: Range is used advantageously by a meteorological department to forecast weather conditions.
- Measurement of fluctuations: Range is a very useful measure to study the fluctuation of prices of certain commodities, viz., stocks and shares, gold, silver etc.
Inter-quartile Range
Inter-quartile range is computed by deducting the value of the first quartile from the value of the third quartile. It is defined as the difference between the two extreme quartiles of a series.
Advantages of inter-quartile range:
1. It is rigidly defined.
2. It is easy to calculate and simple to understand.
3. Its calculation is not affected even if the first 25% and last 25% of a series are missing or changed.
Disadvantages of inter-quartile range:
1. It is not based on all the observations of the series.
2. It is not capable of further algebraic treatment.
3. It is affected by fluctuations in sampling.
Quartile deviation
Quartile deviation is based on the central 50% of items. The inter-quartile range is the difference between Q3 and Q1, and when this difference is divided by 2 we get the quartile deviation. Thus quartile deviation is defined as half the difference between the two extreme quartiles of a series.
Advantages of quartile deviation:
1. It is easy to calculate and simple to understand.
2. Its calculation is based on the middle 50% of items; hence it is a good measure of dispersion.
3. It is rigidly defined, and it is not very much affected by the extreme values of a series.
4. It is easy to calculate in case of open-end series.
Disadvantages of quartile deviation:
1. It is not capable of further algebraic treatment.
2. It is too much affected by fluctuations of sampling.
3. It is not based on all the observations of a series.
4. It does not show the scatter around any average.
Mean deviation
Mean deviation is the average of the absolute differences between the items in a series and the mean, median or mode.
Merits:
- It is a better measure for comparison.
- It is extensively used in other fields.
- Mean deviation is less affected by the value of extreme items than the standard deviation.
Demerits:
- It ignores ± signs in its calculation.
- It is difficult to compute when the average is a fraction.
- It is rarely used in sociological studies.
  • 35. Standard Deviation
S.D. is the square root of the mean of the squared deviations from the actual mean. It was introduced by Karl Pearson in 1893. It is by far the most important and widely used measure for studying dispersion.
Note: when the consistency of two groups is compared, the group with the smaller S.D. is the more consistent (when the means differ, compare the coefficients of variation instead).
Merits:
- All individual values are taken into account for calculation of S.D.
- It is capable of further algebraic treatment.
- It is the most rigidly defined measure of dispersion.
- It is used as an important instrument in higher statistical analysis, viz., correlation, regression etc.
Demerits:
- It is not easy to calculate S.D.
- It is not understood by a common man.
- It is affected very much by the extreme items of a series.
Difference between M.D. and S.D.:
- While calculating standard deviation, algebraic signs ± are not ignored, whereas in mean deviation algebraic signs are completely ignored.
- Standard deviation is always calculated from the arithmetic mean, whereas mean deviation can be calculated from the mean, median or mode.
- Standard deviation is much affected by the extreme observations of the series, but that is not the case with mean deviation.
Variance: Variance is the square of the standard deviation. Thus, variance is calculated as (S.D.)². The term variance was first used by R.A. Fisher in 1918. If a phenomenon is affected by a number of variables, variance helps in isolating the effects of the different factors.
Coefficient of Variation (CV): Coefficient of variation is defined as "the percentage of variation in mean, standard deviation being considered as the total variation in the mean." This measure, developed by Karl Pearson, is the most commonly used measure of relative variation. It is the ratio of the standard deviation to the mean, expressed as a percentage: C.V. = (σ / Mean) × 100. It is used where we want to compare the variability of two or more series, even if their means are drastically different from one another.
Lorenz Curve: For studying the dispersion of a series graphically, we draw a Lorenz curve, as devised by the American economist Max O. Lorenz. This curve was used for the first time for measuring the distribution of wealth and income.
  • 36. Exercise A
1. From the following distribution, ascertain the value of the range and its coefficient.
10 15 20 25 30 40 50 55 60 70
[Ans: 60; 0.75]
2. From the following series, determine the value of the range and its coefficient:
Salary (per month): 1000 1500 2000 2500 3000 3500 4000 5000
No. of workers: 30 20 15 3 7 10 9 6
[Ans: 4000; 0.67]
3. From the following distribution, determine the value of the range and its coefficient:
Wages (per day): 20–25 25–30 30–35 35–40 40–45 45–50
No. of labourers: 2 14 6 8 11 9
[Ans: 30; 0.43]
4. From the following data, determine the Range and the Coefficient of Range of marks awarded in statistics to the +2 Commerce students of Swami Vivekananda College:
Marks: 10–19 20–29 30–39 40–49 50–59 60–69
No. of students: 15 5 12 14 10 8
[Ans: 60; 0.76]
5. From the following distribution, find the range and its coefficient:
Group: Below 50, 50–60, 60–80, 80–110, 110–150, 150 & above
Frequency: 5, 10, 8, 7, 13, 7
[Ans: 155; 0.224]
6. Calculate the semi inter-quartile range, or quartile deviation, and its coefficient for the following data:
Wages in Rs.: 20 30 40 50 60 70 80
No. of workers: 3 61 132 153 140 51 3
[Ans: Rs. 10; 0.2]
7. From the following discrete series, find out the decile range, semi-decile range, and their coefficients:
Age: 15 16 17 18 19 20 21 22
No. of students: 5 20 18 17 10 5 3 1
[Ans: 4; 2; 0.8]
8. Calculate quartile deviation and its relative measure for the following distribution:
Group: 20–29 30–39 40–49 50–59 60–69 70–79
Frequency: 306 182 144 96 42 34
[Ans: 10.71; 0.29]
Mean Deviation (δ)
1. From the following series relating to the marks obtained by a batch of 9 students in a certain test, calculate the mean deviation from mean and median and also calculate their coefficients.
Weight in Kg.: 47 50 58 45 53 59 47 60 49
[Ans: 4.89; 0.094; 4.67; 0.0934]
2. Find out the mean deviation from mean, median and mode, and also their coefficients, from the following series:
Size of items: 5 15 25 35 45 55 65 75 85
Frequency: 3 8 15 20 25 10 9 6 4
[Ans: 14.99; 14.8; 14.8]
3. Calculate the mean deviation from mean for the following series. Also, find out its coefficient:
Marks: 0–10 10–20 20–30 30–40 40–50
No. of students: 5 8 15 16 6
[Ans: 9.44; 0.35]
4. Calculate mean deviation from median from the following data:
Marks secured below: 80 70 60 50 40 30 20 10
No. of students: 100 90 80 60 32 20 13 5
[Ans: 14.31]
  • 37. Exercise A (continued)
5. Calculate median, and mean deviation from median, for the following frequency distribution:
Age in years: 1–5 6–10 11–15 16–20 21–25 26–30 31–35 36–40 41–45
No. of persons: 7 10 16 32 24 18 10 5 1
[Ans: 19.95; 7.1]
Standard Deviation
1. Calculate the standard deviation from the following data of income of 10 employees of a firm by the direct method, short-cut method, and step-deviation method:
Income (Rs.): 600 620 640 620 680 670 680 640 700 650
[Ans: Rs. 30.33]
2. From the following discrete series, find out the standard deviation by all the possible methods:
Marks: 10 20 30 40 50 60
No. of students: 8 12 20 10 7 3
[Ans: 13.45]
3. Calculate the standard deviation for the following data by the different possible methods:
Class interval: 0–10 10–20 20–30 30–40 40–50
No. of students: 7 12 24 10 7
[Ans: 11.397]
4. Calculate the standard deviation from the following data:
Age in years: 10–19 20–29 30–39 40–49 50–59 60–69 70–79
Frequency: 3 61 233 137 53 79 4
[Ans: 12.4]
5. Calculate standard deviation and coefficient of standard deviation of the following series:
Wages in Rs. (up to): 10 20 30 40 50 60 70 80
No. of workers: 12 30 45 107 165 202 222 230
[Ans: 16.52; 41]
6. The following data relate to the profit/loss made by engineering companies in Odisha during the year 2012-13:
Wages in Rs.: -10–0, 0–10, 10–20, 20–30, 30–40, 40–50
Less than 10 19 24 49 87 31 27
Calculate the standard deviation and its coefficient. Also, calculate the coefficient of variation.
[Ans: 13.55; 0.6134; and 61.34%]
7. The following are the marks obtained by 40 students of a class. Calculate the coefficient of variation:
Marks: 80–84 75–79 70–74 65–69 60–64 55–59 50–54 45–49 40–44 35–39 30–34 25–29
Students: 1 1 1 4 4 7 6 6 6 3 0 1
[Ans: 21.8%]
8. A factory produces two types of lamps. In an experiment on the working life of these lamps, the following results were obtained:
Length of life (in hours): 500–700 700–900 900–1100 1100–1300 1300–1500
No. of lamps, Type A: 5 11 26 10 8
No. of lamps, Type B: 4 30 12 8 6
Compare the variability using the coefficient of variation.
[Ans: 21.64; 23.40]
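The lamp exercise above can be checked with a short Python sketch of the coefficient of variation for a continuous series, taking class mid-points as the values. The function name `grouped_cv` is a hypothetical helper, and the assignment of the ten frequencies to Type A (first five) and Type B (last five) is an assumption about how the flattened table should be read; it does reproduce the stated answers.

```python
from math import sqrt

def grouped_cv(class_limits, frequencies):
    """Coefficient of variation (%) of a continuous series, using class mid-points."""
    mids = [(lower + upper) / 2 for lower, upper in class_limits]
    n = sum(frequencies)
    mean = sum(f * m for f, m in zip(frequencies, mids)) / n
    variance = sum(f * (m - mean) ** 2 for f, m in zip(frequencies, mids)) / n
    return sqrt(variance) / mean * 100

# Length of life in hours (class limits) and number of lamps of each type
classes = [(500, 700), (700, 900), (900, 1100), (1100, 1300), (1300, 1500)]
type_a = [5, 11, 26, 10, 8]
type_b = [4, 30, 12, 8, 6]

cv_a = grouped_cv(classes, type_a)
cv_b = grouped_cv(classes, type_b)
print(round(cv_a, 2), round(cv_b, 2))  # 21.64 23.4
```

Type A has the smaller coefficient of variation, so under this reading its working life is the less variable of the two.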
  • 38. Skewness
If one tail is longer than the other, the distribution is skewed. These distributions are sometimes called asymmetric or asymmetrical distributions, as they don't show any kind of symmetry. Symmetry means that one half of the distribution is a mirror image of the other half. For example, the normal distribution is a symmetric distribution with no skew. The tails are exactly the same.
A left-skewed distribution has a long left tail. Left-skewed distributions are also called negatively-skewed distributions. That's because there is a long tail in the negative direction on the number line. The mean is also to the left of the peak.
A right-skewed distribution has a long right tail. Right-skewed distributions are also called positively-skewed distributions. That's because there is a long tail in the positive direction on the number line. The mean is also to the right of the peak.
Mean and Median in Skewed Distributions
In a normal distribution, the mean and the median are the same number, while the mean and median in a skewed distribution become different numbers:
- A left-skewed (negative) distribution will have the mean to the left of the median.
- A right-skewed (positive) distribution will have the mean to the right of the median.
  • 39. Effects on Statistics
The normal distribution is the easiest distribution to work with in order to gain an understanding of statistics. Real-life distributions are usually skewed. With too much skewness, many statistical techniques don't work. As a result, more advanced mathematical techniques, including logarithmic transformations and quantile regression, are used.
Skewed Left (Negative Skew): A left-skewed distribution is sometimes called a negatively skewed distribution because its long tail is in the negative direction on a number line. A common misconception is that the position of the peak is what defines skewness; in other words, that a distribution whose peak lies toward the left is left-skewed. This is incorrect. There are two main things that make a distribution skewed left:
- The mean is to the left of the peak. This is the main definition behind "skewness", which is technically a measure of the distribution of values around the mean.
- The tail is longer on the left.
In most cases, the mean is to the left of the median. This isn't a reliable test for skewness though, as some distributions (e.g. many multimodal distributions) violate this rule. You should think of this as a "general idea" kind of rule, and not a set-in-stone one.
Skewed Right (Positive Skew): A right-skewed distribution is sometimes called a positively skewed distribution. That's because the tail is longer in the positive direction of the number line.
Formula
Karl Pearson's Coefficient of Skewness
1. Pearson's Coefficient of Skewness #1 uses the mode. The formula is: $Sk_1 = \dfrac{\bar{x} - Mo}{s}$, where $\bar{x}$ = the mean, $Mo$ = the mode and $s$ = the standard deviation.
2. Pearson's Coefficient of Skewness #2 uses the median. The formula is: $Sk_2 = \dfrac{3(\bar{x} - Md)}{s}$, where $\bar{x}$ = the mean, $Md$ = the median and $s$ = the standard deviation.
Bowley's coefficient of skewness
Absolute measure $= (Q_3 - M) - (M - Q_1) = Q_3 + Q_1 - 2M$
Relative measure $= \dfrac{Q_3 + Q_1 - 2M}{Q_3 - Q_1}$
Kelly's coefficient of skewness
Based on percentiles: $J_{\text{percentile}} = \dfrac{P_{90} + P_{10} - 2P_{50}}{P_{90} - P_{10}}$
  • 40. Based on deciles: $J_{\text{decile}} = \dfrac{D_9 + D_1 - 2D_5}{D_9 - D_1}$
1. Calculate Karl Pearson's coefficient of Skewness from the following data:
Size: 1 2 3 4 5 6 7
Frequency: 10 18 30 25 12 3 2
[Ans: 0.184]
2. Calculate the coefficient of Skewness based on mean and median from the following distribution:
X: 0–10 10–20 20–30 30–40 40–50 50–60 60–70 70–80
F: 6 12 22 48 56 32 18 6
[Ans: 41.7; 42.14; –0.086]
3. Calculate Karl Pearson's Coefficient of Skewness from the following data:
X: 10–15 15–20 20–25 25–30 30–35 35–40 40–45 45–50
F: 8 16 30 45 62 32 15 6
[Ans: –0.22]
4. Calculate the coefficient of variation and Karl Pearson's coefficient of Skewness from the following data:
Sales (crores) less than: 20 40 60 80 100
No. of companies: 8 20 50 70 80
[Ans: 42.65; 0.0063]
5. From the following data find out Bowley's coefficient of Skewness:
Marks in Maths: 90 50 52 86 87 76 80 85 58 61 65
[Ans: –0.286]
6. Calculate the Quartile coefficient of Skewness for the following:
Monthly Income (Rs.): 501–600 601–700 701–800 801–900 901–1000 1001–1100 1101–1200 1201–1300
No. of families: 5 17 80 186 208 134 68 18
[Ans: 0.025]
7. The measure of Skewness for a certain distribution is –0.8. If the lower and upper quartiles are 44.1 and 56.6 respectively, find the median. [Ans: 55.35]
8. In a frequency distribution, the coefficient of Skewness based on quartiles is 0.6. If the sum of the upper and lower quartiles is 100 and the median is 38, find the value of the upper quartile. [Ans: 70]
9. Pearson's coefficient of Skewness of a distribution is 0.64. Its mean is 82 and mode 50. Find the standard deviation. [Ans: 50]
10. When the mean is 86, the median 80 and Karl Pearson's coefficient of Skewness 0.42, find the coefficient of variation. [Ans: 49.83]
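The skewness coefficients above can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names are hypothetical, and the quartiles are located with the textbook position rule, taking the value at position p(N + 1) in the sorted series with linear interpolation. That rule is an assumption; statistical software often uses other quantile conventions and may give slightly different quartiles.

```python
def position_quantile(values, p):
    """Value at position p * (N + 1) of the sorted series, interpolating linearly."""
    data = sorted(values)
    n = len(data)
    pos = p * (n + 1)
    k = int(pos)                 # whole part of the position (1-based)
    if k < 1:
        return data[0]
    if k >= n:
        return data[-1]
    return data[k - 1] + (pos - k) * (data[k] - data[k - 1])

def bowley_skewness(values):
    """Relative measure: (Q3 + Q1 - 2M) / (Q3 - Q1)."""
    q1 = position_quantile(values, 0.25)
    median = position_quantile(values, 0.50)
    q3 = position_quantile(values, 0.75)
    return (q3 + q1 - 2 * median) / (q3 - q1)

def pearson_skewness_mode(mean, mode, sd):
    """Pearson's first coefficient: (mean - mode) / s."""
    return (mean - mode) / sd

def pearson_skewness_median(mean, median, sd):
    """Pearson's second coefficient: 3 * (mean - median) / s."""
    return 3 * (mean - median) / sd

marks = [90, 50, 52, 86, 87, 76, 80, 85, 58, 61, 65]  # exercise 5
print(round(bowley_skewness(marks), 3))   # -0.286
print(pearson_skewness_mode(82, 50, 50))  # 0.64, the figures of exercise 9
```

For the marks series the sorted quartiles are Q1 = 58, M = 76 and Q3 = 86, so the coefficient is (86 + 58 - 152) / 28, a mild negative skew.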
  • 41. Unit-4 CORRELATION
PREVIOUS YEAR PT R.S.S.U QUESTION PAPERS
2016
Q.1 Calculate Karl Pearson's coefficient of correlation from the data given below:
X: 3 7 5 4 6 8 2 7
Y: 7 12 8 8 10 13 5 10
[Ans: 0.963]
Q.2 What is correlation? Explain the implications of positive and negative correlation. Show by means of scatter diagrams the presence of perfect positive and perfect negative correlation.
2015
Q.1 Define correlation. Explain the different types of correlation with suitable examples.
Q.2 Calculate Karl Pearson's coefficient of correlation from the data given below:
X: 6 2 10 4 8
Y: 9 11 5 8 7
[Ans: -0.92]
Q.3 Define Karl Pearson's coefficient of correlation. What is it intended to measure?
2014
Q.1 Define correlation. Explain the different types of correlation with suitable examples.
Q.2 Calculate Spearman's coefficient of rank correlation from the following data:
X: 57 16 24 65 16 16 9 40 33 48
Y: 19 6 9 20 4 15 6 24 13 13
[Ans: 0.7333]
Q.3 Find out the coefficient of correlation between the age of husband and wife from the following data (a dash marks an empty cell):
Age of wife    Age of husband
               20–30   30–40   40–50   50–60   60–70   Total
15–25            4       9       4       -       -      17
25–35            -       8      24       5       -      37
35–45            -       2      11       2       -      15
45–55            -       -       6      14       5      25
55–65            -       -       -       4       2       6
Total            4      19      45      25       7
[Ans: 0.73]
2013
Q.1 Define Karl Pearson's coefficient of correlation. What is it intended to measure? How would you interpret the sign of the correlation coefficient?
Q.2 Explain the importance of correlation in statistical analysis in management decision situations, with examples.
Q.3 Calculate the coefficient of correlation from the data given below:
X: 1 2 3 4 5
Y: 3 3 7 9 12
[Ans: 0.97]
Correlation Analysis – Karl Pearson's coefficient of Correlation.
  • 42. CORRELATION
Correlation is a statistical measure for finding out the degree or strength of association between two (or more) variables. By 'association' we mean the tendency of the variables to move together. If two variables X and Y are so related that movements (or variations) in one, say X, tend to be accompanied by corresponding movements (or variations) in the other variable Y, then X and Y are said to be correlated. The movements may be in the same direction (i.e., both X and Y increase or both decrease) or in opposite directions (i.e., one, say X, increases and the other, Y, decreases). Correlation is said to be positive or negative according as these movements are in the same or in opposite directions. If Y is unaffected by any change in X, then X and Y are said to be uncorrelated.
Definition
L. R. Connor: "If two or more quantities vary in sympathy so that movements in the one tend to be accompanied by corresponding movements in the other, then they are said to be correlated."
Correlation may be linear or non-linear. If the amount of variation in X bears a constant ratio to the corresponding amount of variation in Y, then the correlation between X and Y is said to be linear. Otherwise it is non-linear. The correlation coefficient, or coefficient of correlation [r], measures the degree of linear relationship (i.e., linear correlation) between two variables.
Utility
The utility of the study of correlation is immense in both the physical and the social sciences. However, we shall confine ourselves to the utility of correlation studies in the social sciences only.
1. The study of correlation reduces the range of uncertainty associated with decision making. In the social sciences, particularly in the business world, forecasting is an important activity, and correlation studies help us to make relatively more dependable forecasts.
2. Correlation analysis is very helpful in understanding economic behaviour; it helps us in locating the variables on which other variables depend. This is helpful in studying the factors by which economic events are affected. For example, we can find out the factors responsible for a price rise or low productivity.
3. Correlation study helps us in identifying the factors which can stabilize a disturbed economic situation.
4. Correlation study helps us to estimate the likely change in a variable for a particular amount of change in a related variable. For example, correlation study can help us in finding out the change in demand for a certain amount of change in price.
5. Inter-relationship studies between different variables are very helpful tools in promoting research and opening new frontiers of knowledge.
TYPES OF CORRELATION
Correlation can be: [1] Positive or Negative; [2] Simple, Multiple or Partial; [3] Linear or Non-linear.
1. Positive and Negative correlation: Correlation can be either positive or negative. When the values of two variables move in the same direction, i.e., when an increase in the value of one variable is associated with an increase in the value of the other variable, and a decrease in the value of one variable is associated with a decrease in the value of the other variable, correlation is said to be positive. If, on the other hand, the values of two variables move in opposite directions, so that with an increase in the values of one variable the values of the other variable decrease, and with a decrease in the values of one variable the values of the other variable increase, correlation is said to be negative. In some data correlation is generally positive, while in others it is negative.
2. Simple, Multiple and Partial correlation: In simple correlation we study only two variables, say price and demand. In multiple correlation we study together the relationship between three or more factors, like production, rainfall and use of fertilizers. In partial correlation, though more than two factors are involved, correlation is studied only between two factors and the other factors are assumed to be constant.
Karl Pearson's coefficient of correlation is given by:
$r = \dfrac{\text{Covariance of } X \text{ and } Y}{\sigma_x \times \sigma_y}$
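The formula for r can be sketched in Python as below. The function name `pearson_r` is an illustrative choice, and the sketch assumes population (divide-by-N) forms for both the covariance and the standard deviations, matching the σ formulas used elsewhere in these notes. The two data sets are taken from the 2013 and 2015 question papers listed for this unit.

```python
from math import sqrt

def pearson_r(xs, ys):
    """r = Cov(X, Y) / (sigma_x * sigma_y), using divide-by-N throughout."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sigma_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return covariance / (sigma_x * sigma_y)

# 2013 Q.3: X and Y rise together, so r is close to +1
print(round(pearson_r([1, 2, 3, 4, 5], [3, 3, 7, 9, 12]), 2))   # 0.97

# 2015 Q.2: Y falls as X rises, so r is close to -1
print(round(pearson_r([6, 2, 10, 4, 8], [9, 11, 5, 8, 7]), 2))  # -0.92
```

The sign of r reflects the direction of the relationship (positive or negative correlation) and its magnitude reflects the strength of the linear association.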