2. Introduction to Data Analytics
Most companies are collecting loads of data all the time—but, in its raw form, this
data doesn’t really mean anything. This is where data analytics comes in. Data
analytics is the process of analyzing raw data in order to draw out meaningful,
actionable insights, which are then used to inform and drive smart business
decisions.
A data analyst will extract raw data, organize it, and then analyze it, transforming
it from incomprehensible numbers into coherent, intelligible information.
3. Meaning of the Term Statistics
Statistics is a branch of applied mathematics that involves the collection, description,
analysis, and inference of conclusions from quantitative data. The mathematical
theories behind statistics rely heavily on differential and integral calculus, linear
algebra, and probability theory.
Statisticians, people who do statistics, are particularly concerned with determining
how to draw reliable conclusions about large groups and general events from the
behavior and other observable characteristics of small samples. These small samples
represent a portion of the large group or a limited number of instances of a
general phenomenon.
4. Importance of Data Interpretation
The interpretation of data helps researchers to categorize, manipulate, and
summarize the information in order to answer critical questions. The
importance of data interpretation is evident and this is why it needs to be
done properly.
While several different processes are used depending on the nature of the
data, the two broadest and most common categories are “quantitative
analysis” and “qualitative analysis”.
5. What are Quantitative and Qualitative Data?
Quantitative data are measures of values or counts and are expressed as
numbers.
Quantitative data are data about numeric variables (e.g. how many; how
much; or how often).
Qualitative data are measures of 'types' and may be represented by a name,
symbol, or a number code.
Qualitative data are data about categorical variables (e.g. what type).
Statistical Language - Quantitative and Qualitative Data
6. Graphical Representation
Graphical representation is a way of analysing numerical data. It shows the
relation between data, ideas, information and concepts in a diagram. It is easy
to understand and is one of the most important learning strategies. The choice
of graph depends on the type of information in a particular domain. There are
different types of graphical representation.
Some of them are as follows:
7. Line Graphs – A line graph (or linear graph) displays continuous data and is
useful for showing trends and predicting future values over time.
Bar Graphs – A bar graph displays categorical data and compares the categories
using solid bars to represent the quantities.
Histograms – A histogram uses bars to represent the frequency of numerical data that
is organised into intervals. Since all the intervals are equal and continuous, all the bars
have the same width.
Line Plot – A line plot shows the frequency of data on a number line. An 'x' is placed above
the number line each time that value occurs.
Frequency Table – A frequency table shows the number of data values that fall within each
given interval.
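The frequency table and line plot described above can be sketched in a few lines of Python. This is only an illustration: the `scores` and `ratings` lists are made-up data, and the function names `frequency_table` and `line_plot` are hypothetical helpers, not part of any library. The line plot is printed sideways (one row per value) to keep the sketch short.

```python
from collections import Counter

# Made-up data for illustration only (not from the slides).
scores = [12, 25, 37, 41, 18, 29, 33, 45, 22, 38, 27, 31]
ratings = [1, 2, 2, 3, 3, 3, 5]

def frequency_table(values, start, width, n_bins):
    """Count how many values fall in each equal-width interval [low, high)."""
    table = []
    for i in range(n_bins):
        low = start + i * width
        high = low + width
        count = sum(1 for v in values if low <= v < high)
        table.append((low, high, count))
    return table

def line_plot(values):
    """One row per number on the line; one 'x' for each time the value occurs."""
    counts = Counter(values)
    return [f"{v:>3} | " + "x" * counts[v]
            for v in range(min(values), max(values) + 1)]

table = frequency_table(scores, start=10, width=10, n_bins=4)
for low, high, count in table:
    print(f"{low}-{high}: {count}")

for row in line_plot(ratings):
    print(row)
```

The same interval counts are what a histogram would draw as bar heights.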
8. Advantages of graphical representation
1) Graphs are attractive, interesting and impressive.
2) No knowledge of mathematics is required.
3) It is the simplest method of presenting data.
4) Comparison is made easy.
5) Certain statistical measures can be ascertained with care.
9. Disadvantages of Graphical Representation
1. There are various methods of representing data, which can make it
difficult for the researcher to select a suitable method.
2. It may be complicated and difficult for some people to understand
graphical representation.
3. As graphical representations are complex, there are higher chances of
mistakes and errors.
4. Graphical representation can be costly as it requires images and colours.
5. It takes more effort and time to prepare compared with ordinary reports.
10. Sampling Design
A sampling design is a procedure or plan used to select a sample from a given population.
Sampling means selecting the group that you will actually collect data from in
your research. For example, if you are researching the opinions of students in
your university, you could survey a sample of 100 students. In statistics,
sampling allows you to test a hypothesis about the characteristics of a
population.
11. Type of Sampling Design
There are two major types of sampling design:
First: Probability sampling design is the selection of a sample from a population
where the selection is based on the principle of randomization, that is, random
selection or chance.
Second: Non-probability sampling design is a method of selecting units from a
population using a subjective (non-random) method.
12. Probability Sampling Design
Simple random sampling is a type of probability sampling in which the researcher
randomly selects a subset of participants from a population.
Stratified random sampling is a method of sampling that involves dividing a
population into smaller subgroups known as strata and sampling from each stratum.
Systematic sampling is a probability sampling method in which researchers select
members of the population at a regular interval.
Cluster sampling is a probability sampling method in which you divide a population
into clusters (such as schools) and then randomly select some of these clusters as
your sample.
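The four probability sampling methods above can be compared side by side in a rough Python sketch. Everything here is assumed for illustration: the `population` of 1000 students, its `year` and `school` fields, and the four function names are hypothetical, and real studies usually need more care (for example, proportional allocation across strata).

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1000 students with a year of study and a school.
population = [
    {"id": i, "year": 1 + i % 4, "school": f"school_{i % 10}"}
    for i in range(1000)
]

def simple_random_sample(units, n):
    """Every unit has the same chance of being picked."""
    return rng.sample(units, n)

def systematic_sample(units, n):
    """Pick a random start, then take every step-th unit."""
    step = len(units) // n
    start = rng.randrange(step)
    return units[start::step][:n]

def stratified_sample(units, key, n_per_stratum):
    """Split units into strata by `key`, then sample randomly within each."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(units, key, n_clusters):
    """Randomly choose whole clusters and keep every unit inside them."""
    clusters = sorted({unit[key] for unit in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in units if unit[key] in chosen]

print(len(simple_random_sample(population, 100)))
print(len(systematic_sample(population, 100)))
print(len(stratified_sample(population, "year", 25)))
print(len(cluster_sample(population, "school", 3)))
```

Note the difference in what is randomized: stratified sampling draws from every subgroup, while cluster sampling keeps only some subgroups but takes all of their members.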
13. Type of Sampling Design
Non-probability sampling is a method of selecting units from a population using a subjective
(i.e. non-random) method. Non-probability sampling design also has four types:
Convenience sampling is a non-probability sampling method where units are selected for
inclusion in the sample because they are convenient to reach.
Judgment sampling, also referred to as judgmental sampling or authoritative
sampling, is a non-probability sampling technique where the researcher selects units to be
sampled based on their own existing knowledge or professional judgment.
Quota sampling is a non-probability sampling method that relies on the non-random selection
of a predetermined number or proportion of units.
Snowball sampling is a recruitment technique in which research participants are asked to
assist researchers in identifying other potential subjects.
14. Ex:
To study the work-life balance of working women in the IT industry of Pune.
Stratified Sampling
We will take the sample on the basis of the women's status, for example by age group:
23-30
31-40
41-50
50 and above
15. Ex:
To study employee satisfaction with employee welfare policies among the 100 employees
of a small-scale company.
16. Ex: Draw a histogram of the following data:
Salary in Rs. (thousand):  30-40  40-50  50-60  60-70  70-80  80-90  90-100
No. of employees:          20     30     60     75     115    100    60
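One possible way to draw this histogram is a plain-text sketch in Python, using the class intervals and frequencies from the table above. The function name `text_histogram` and the scale of one `#` per 5 employees are assumptions made for this illustration; a plotting library could be used instead.

```python
# Class intervals (salary in Rs. thousand) and frequencies from the exercise.
intervals = ["30-40", "40-50", "50-60", "60-70", "70-80", "80-90", "90-100"]
employees = [20, 30, 60, 75, 115, 100, 60]

def text_histogram(labels, counts, unit=5):
    """Return one row per class; each '#' stands for `unit` employees."""
    rows = []
    for label, count in zip(labels, counts):
        bar = "#" * (count // unit)
        rows.append(f"{label:>6} | {bar} {count}")
    return rows

for row in text_histogram(intervals, employees):
    print(row)

# The class with the tallest bar is the modal class.
modal_class = intervals[employees.index(max(employees))]
print("Modal class:", modal_class)
```

Because all the class intervals have the same width (10 thousand), the bar lengths can be compared directly.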
17. Mean, Median and Mode
Mean, Median and Mode are three types of averages commonly used in statistics.
These averages are also called the measures of central tendency.
Mean
Mean is also known as arithmetic mean (A.M.)
Mean is the average value for a set of data.
Formula:
Mean = Sum of data values / Number of values
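As a quick check of the formula, here is a small Python sketch. The list `values` reuses the numbers from the Mode example in these slides; the variable names are only illustrative.

```python
from statistics import mean

# Numbers taken from the Mode example in these slides.
values = [6, 3, 5, 6, 3, 8, 2, 6, 9]

# Mean = sum of data values / number of values
manual_mean = sum(values) / len(values)

print(sum(values), len(values))  # 48 9
print(round(manual_mean, 2))     # 5.33

# The standard library gives the same result.
library_mean = mean(values)
```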
18. Median
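Median is the middle value in a set of values arranged in order of magnitude.
It is denoted by M.
Formula:
M = value of the ((N + 1) / 2)th observation, where N is the number of values.
When N is even, this position falls between two observations and the median is
the average of those two values.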
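The position rule can be tried out with a short Python sketch. The function name `median_by_position` is a hypothetical helper written for this illustration; it sorts the data first and, for an even number of values, averages the two middle observations.

```python
from statistics import median

def median_by_position(values):
    """Median using the (N + 1) / 2 position rule on sorted data."""
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        position = (n + 1) // 2       # 1-based position of the middle value
        return ordered[position - 1]  # convert to 0-based index
    # Even N: (N + 1) / 2 falls between two observations, so average them.
    lower = ordered[n // 2 - 1]
    upper = ordered[n // 2]
    return (lower + upper) / 2

odd_case = median_by_position([6, 3, 5, 6, 3, 8, 2, 6, 9])  # 9 values, 5th is the median
even_case = median_by_position([1, 2, 3, 4])                # 4 values, average of 2nd and 3rd
print(odd_case, even_case)
```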
19. Mode
Mode is the value that occurs with the highest frequency in
a set of data.
Example:
Find the mode for the numbers: 6, 3, 5, 6, 3, 8, 2, 6 and 9
Solution: Mode = 6, frequency = 3
The number 6 occurs three times. No other number occurs as often
(3 occurs twice), so 6 has the highest frequency.
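The same example can be checked in Python by counting how often each number occurs. This is a minimal sketch using `collections.Counter` from the standard library; the variable names are only illustrative.

```python
from collections import Counter

numbers = [6, 3, 5, 6, 3, 8, 2, 6, 9]

# Count the frequency of every value, then take the most frequent one.
counts = Counter(numbers)
mode_value, frequency = counts.most_common(1)[0]

print(mode_value, frequency)  # prints: 6 3
```

If two values tie for the highest frequency, the data has more than one mode and `most_common(1)` reports only one of them.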