1. Presented by: Aleena Alvi
Topic: Misuses of Statistics, Misleading Results of Statistics &
Limitations of Statistics
PAKISTAN
2.
3. MISUSE OF STATISTICS
Statistics is the practice of collecting, organizing and representing large amounts
of numerical data. Statistics can tell us about trends that are happening in the
world. Misuse of statistics occurs when a statistical argument asserts a
falsehood. In some cases, the misuse may be accidental. In others, it is purposeful
and for the gain of the perpetrator.
4. Organizations that do not publish every study they carry out can support a claim
by reporting only the favorable results. Examples include tobacco companies denying
a link between smoking and cancer, and anti-smoking advocacy groups and media
outlets trying to prove a link between smoking and various ailments.
For example, all a company has to do to promote a neutral (useless) product is to find
or conduct, say, 40 studies with a confidence level of 95%. If the product is
really useless, this would on average produce one study showing the product was
beneficial, one study showing it was harmful and thirty-eight inconclusive studies
(38 is 95% of 40).
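The arithmetic behind the 40-study example can be sketched as follows. This is a minimal illustration, assuming the studies are independent and that false positives split evenly between "beneficial" and "harmful" (2.5% each); the variable names are illustrative only.

```python
# Expected outcomes when a useless product is tested many times.
# Assumptions (not stated in the slides): independent studies, two-sided
# tests, false positives split evenly between the two directions.
n_studies = 40
confidence = 0.95
alpha = 1 - confidence  # chance that a single study gives a false positive

expected_beneficial = n_studies * alpha / 2
expected_harmful = n_studies * alpha / 2
expected_inconclusive = n_studies * confidence

# Chance that at least one of the studies appears to show a benefit.
p_at_least_one_beneficial = 1 - (1 - alpha / 2) ** n_studies

print(f"beneficial: {expected_beneficial:.1f}")
print(f"harmful: {expected_harmful:.1f}")
print(f"inconclusive: {expected_inconclusive:.1f}")
print(f"P(at least one 'beneficial' study): {p_at_least_one_beneficial:.2f}")
```

Under these assumptions, a company that publishes only the "beneficial" study has better than even odds of having one to publish.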
5. The answers to surveys can often be manipulated by wording the question in such
a way as to nudge the respondent towards a certain answer. For
example, responses to a question such as:
Do you support cuts in income tax?
can shift depending on the information that precedes it.
The point should be that the person being asked has no way of guessing from the
wording what the questioner might want to hear. The proper formulation of
questions can be very subtle. The responses to two questions can vary dramatically
depending on the order in which they are asked.
6. Overgeneralization is a fallacy occurring when a statistic about a particular
population is asserted to hold among members of a group for which the original
population is not a representative sample.
For example, as young people are more likely to lack a conventional "landline"
phone, a telephone poll that exclusively surveys people who answer calls to landline
phones may undersample the views of young people, if no
other measures are taken to account for this skewing of the sampling.
7. There are also many other measurement problems in population surveys. People
may think that it is impossible to get data on the opinion of tens of millions of
people by just polling a few thousand. This is inaccurate: a poll with perfectly
unbiased sampling and truthful answers has a mathematically determined margin of
error, which depends only on the number of people polled.
For example, a survey of 1000 people may contain 100 people from a certain
ethnic or economic group. The results focusing on that group will be much less
reliable than results for the full population. If the margin of error for the full sample
was 4%, say, then the margin of error for such a subgroup could be around 13%.
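The jump from 4% to about 13% follows from the way the margin of error shrinks with the square root of the sample size. The sketch below uses the usual normal-approximation formula for a proportion; the function name, the worst-case share p = 0.5 and the multiplier z = 1.96 (95% confidence) are assumptions made for illustration, not taken from the slides.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error for a proportion from a simple random sample."""
    return z * math.sqrt(p * (1 - p) / n)

full_sample = margin_of_error(1000)
subgroup = margin_of_error(100)

# A subgroup ten times smaller has a margin of error sqrt(10) times larger.
ratio = subgroup / full_sample

# Applying that ratio to the 4% figure quoted in the example.
subgroup_from_4_percent = 0.04 * ratio

print(f"ratio: {ratio:.2f}")
print(f"subgroup margin if full sample is 4%: {subgroup_from_4_percent:.1%}")
```

The ratio depends only on the two sample sizes, which is why the subgroup figure can be estimated from the full-sample figure without knowing anything else about the survey.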
8. When a statistical test shows a correlation between A and B, there are usually four
possibilities:
1. A causes B.
2. B causes A.
3. A and B are both caused by a third factor, C.
4. The observed correlation was due purely to chance.
9. The fourth possibility can be quantified by statistical tests that can calculate the
probability that the correlation observed would be as large as it is just by chance if,
in fact, there is no relationship between the variables.
If the number of people buying ice cream at the beach is statistically related to the
number of people who drown at the beach, then nobody would claim ice cream
causes drowning because it's obvious that it isn't so. (In this case, both drowning
and ice cream buying are clearly related by a third factor: the number of people at
the beach).
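The beach example can be imitated with made-up numbers. In the sketch below, a hypothetical daily crowd size drives both ice cream sales and drownings, and neither affects the other. All coefficients and noise levels are invented for illustration. The two series correlate strongly, but the correlation largely disappears once the effect of crowd size is removed from each.

```python
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def residuals(y, x):
    """What is left of y after subtracting a least-squares line fitted on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return [b - (my + slope * (a - mx)) for a, b in zip(x, y)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical data: the crowd is the third factor C behind both A and B.
crowd = [rng.uniform(100, 5000) for _ in range(365)]
ice_cream = [0.3 * c + rng.gauss(0, 100) for c in crowd]
drownings = [0.001 * c + rng.gauss(0, 0.5) for c in crowd]

raw_r = pearson(ice_cream, drownings)
adjusted_r = pearson(residuals(ice_cream, crowd), residuals(drownings, crowd))

print(f"raw correlation: {raw_r:.2f}")
print(f"after removing crowd size: {adjusted_r:.2f}")
```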
10. In data dredging, large compilations of data are examined in order to find a
correlation, without any pre-defined choice of a hypothesis to be tested. Since the
confidence level required to establish a relationship between two parameters is
usually chosen to be 95% (meaning that, if the variables were in fact unrelated, a
correlation this strong would appear by chance only 5% of the time), there is thus a
5% chance of finding a correlation between any two sets of completely random variables.
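The 5% figure can be checked with a small simulation: generate many pairs of unrelated random series and count how often a test at the 95% level declares them correlated. This is a rough sketch. The significance check uses the Fisher z-transformation, which is an approximation, and the number of pairs, the series length and the seed are arbitrary choices.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def looks_significant(x, y, z_crit=1.96):
    """Approximate two-sided test at the 95% level via the Fisher z-transformation."""
    z = math.atanh(pearson(x, y)) * math.sqrt(len(x) - 3)
    return abs(z) > z_crit

rng = random.Random(42)  # fixed seed so the run is repeatable
n_pairs, n_obs = 1000, 30

hits = 0
for _ in range(n_pairs):
    # Two series with no relationship to each other at all.
    x = [rng.gauss(0, 1) for _ in range(n_obs)]
    y = [rng.gauss(0, 1) for _ in range(n_obs)]
    hits += looks_significant(x, y)

false_positive_rate = hits / n_pairs
print(f"'significant' correlations among random pairs: {false_positive_rate:.1%}")
```

Searching a large data set for any correlation at all amounts to running this loop and reporting only the hits.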
A separate data-quality problem is data decay: magnetic media, such as hard disk
drives, floppy disks and magnetic tapes, may lose data as bits lose their magnetic
orientation. Periodic refreshing by rewriting the data can alleviate this problem.
11. Data manipulation is a serious consideration even in the most honest of statistical
analyses. Outliers, missing data and non-normality can all adversely affect the
validity of statistical analysis.
Informally called "fudging the data," this practice includes selective reporting and
even simply making up false data.
Examples of selective reporting abound. The easiest and most common examples
involve choosing a group of results that follow a pattern consistent with the
preferred hypothesis while ignoring other results or "data runs" that contradict the
hypothesis.
12. MISLEADING STATISTICS
Misleading statistics are the misuse of numerical data, either intentionally or
through error, in a way that results in misleading information. Misleading statistics
can deceive the receiver of the information if the receiver is not careful to notice
the error or deception. Statistics can be misleading in a number of ways, for example
inventing false statistical information, misinformation, neglecting the baseline and
making fallacious comparisons.
13. An obvious problem with statistics is that they can simply be fabricated. Of
course this could be true of any claim, but because statistics use specific
numbers, they have a quality of authority about them, and we may be a little less
suspicious that a statistical claim is false than we would be for a more descriptive
argument.
For example, "83% of high school students admit cheating on tests" just sounds more
authoritative than "most high school students admit they cheat on tests."
14. Statistics are obtained by taking a sample from a larger group and assuming the
whole group has the same characteristics as the sample.
For example, if we ask 100 people who they are going to vote for in the next
election, and 55 of them say they will vote for PTI, we might assume that about
55% of all the voters will vote for PTI. This is very useful, since we can't possibly
ask all the voters, but it has some important limitations.
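One of those limitations is sampling error. As a rough illustration, suppose (hypothetically) that the electorate is in fact split exactly 50/50. The sketch below computes, from the binomial distribution, how likely a random sample of 100 voters is to show 55 or more supporters anyway. The function name and the 50/50 assumption are illustrative only.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of k or more successes in n independent trials with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical scenario: true support is 50%, and 100 voters are sampled at random.
p_55_or_more = prob_at_least(55, 100, 0.5)

print(f"P(55 or more out of 100 | true support 50%): {p_55_or_more:.3f}")
```

A result of 55 out of 100 is therefore not strong evidence of a majority on its own; a larger sample narrows the range of plausible values.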
15. Statistics based on polls can be faulty if the poll is constructed in such a way as to
encourage a particular answer.
If a question is worded, "Do you feel you should be taxed so some people can get
paid for staying home and doing nothing?" it is likely to get a lot of "no" responses.
On the other hand, the question "Do you think the government should help people
who are unable to find work?" is likely to get a lot more positive responses.
16. Statistics are often given in the form of rankings: "He is ranked fifth among hitters
for most career home runs" or "this is the third leading cause of accidents in the
home." Since these are based on comparisons with other quantities rather than on
specific amounts, there are special problems we need to be aware of.
The problem with ranking is that it does not tell us much about the actual amount
involved. The most popular restaurant in the city might only do one one-thousandth
of the business in the city, while the most popular brand of soup might have 70% of
the sales, so simply being ranked number one doesn't tell us much about the actual
percentage or amount of business.
17. LIMITATIONS OF STATISTICS
1. Statistics does not deal with isolated measurements.
2. Statistics deals only with quantitative characteristics.
3. Statistical laws are true on average. Statistics are aggregates of facts, so a single
observation is not a statistic; statistics deals with groups and aggregates only.
18. 4. Statistical methods are best applied to quantitative data.
5. If sufficient care is not exercised in collecting, analyzing and interpreting the
data, statistical results might be misleading.
6. Some errors are possible in statistical decisions. Inferential statistics in
particular involves certain errors, and we do not know whether an error has been
committed or not.
19. Statistics deals with facts and figures, so the qualitative aspect of a variable, or
a subjective phenomenon, falls outside the scope of statistics. For example, qualities
like beauty, honesty and intelligence cannot be expressed numerically in a direct way,
so these characteristics cannot be examined statistically. This limits the scope of the
subject.
20. Statistical laws are not exact, as laws are in the natural sciences. They are true
only on average and hold good only under certain conditions, so they cannot be
universally applied. This limits the practical utility of statistics.
21. Statistics deals with aggregate of facts. Single or isolated figures are not statistics.
This is considered to be a major handicap of statistics.
22. Statistics does not prove or disprove anything. It is just a means to an end. For
this reason, statistics is often misused. Statistical methods rightly used are
beneficial, but if misused they become harmful. Statistical methods in inexpert hands
will lead to inaccurate results. Here the fault does not lie with the subject of
statistics but with the person who makes wrong use of it.