data analysis


- 1. Data Analysis Lab - 1 Introduction By Dr. Abhishek Kumar Singh
- 2. Student Introduction • Name • City and State • Education detail (graduation, XII and X)
- 3. • PhD (IIT BHU Varanasi) • M Tech (IIT BHU Varanasi) • B Tech (GBTU) • 3 research papers in SCOPUS/ABDC-indexed journals • 8 papers reviewed as a reviewer • Six Sigma Green Belt
- 4. Content • Syllabus • Data Analysis • Variables • Univariate • Bivariate
- 5. Univariate Descriptive Analysis • Measures of Central Tendency- Mean, Median, Mode • Measures of Variability- Range, Variance, Standard Deviation, Co-efficient of Deviation • Measures of Shape- Skewness and Kurtosis • Measures of Stability- Standard Error
- 6. Bivariate Descriptive Analysis • Covariance • Correlation
- 7. Data Analysis • The process of cleaning, transforming, interpreting, analyzing and visualizing data to extract useful information and gain insights that support more effective business decisions is called data analysis.
- 8. Variables • Variables: Any characteristic or quality that varies is termed a variable. • E.g.: When collecting basic clinical and demographic information on patients with a particular illness, the variables of interest may include the gender (M/F), age and height of the patients.
- 9. Variable • Categorical – Nominal: categories are mutually exclusive and unordered. E.g. gender (M/F), blood group (A/B/AB/O) – Ordinal: categories are mutually exclusive and ordered. E.g. disease severity (mild, moderate, severe) • Numerical – Discrete: integer values, typically counts. E.g. number of children vaccinated, days sick per year – Continuous: takes any value within a range; values have a magnitude. E.g. weight in kg, height in cm
- 10. Statistics • Descriptive – Collecting – Organizing – Summarizing – Presenting data • Inferential – Making inference – Hypothesis testing – Determining relationship – Making prediction
- 11. Three types of analysis • Univariate analysis: the examination of cases on only one variable at a time (e.g., weight of college students). • Bivariate analysis: the examination of two variables simultaneously (e.g., the relation between gender and weight of college students). • Multivariate analysis: the examination of more than two variables simultaneously (e.g., the relationship between gender, race, and weight of college students).
- 12. Purpose of different types of analysis • Univariate analysis: mainly description. • Bivariate analysis: determining the empirical relationship between two variables. • Multivariate analysis: determining the empirical relationship among multiple variables.
- 13. Univariate • The objective of univariate analysis is to describe the data, summarize it and analyze the patterns present in it. • Univariate techniques are appropriate when there is a single measurement of each element in the sample, or when there are several measurements of each element but each variable is analyzed in isolation.
- 14. Univariate • Descriptive – Measures of Central Tendency- Mean, Median, Mode – Measures of Variability- Range, Variance, Standard Deviation, Co-efficient of Deviation – Measures of Shape- Skewness and Kurtosis – Measures of Stability- Standard Error • Inferential – z test – t test – Chi square test
- 15. Numerical Methods • Mean – Let X1, X2, X3, …, Xn be the n data points; then the mean of the data is defined as $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ – The mean provides the central value about which the data are spread out.
- 16. Numerical Methods • Median – The median is the value that divides the data into two halves – Let X1, X2, X3, …, Xn be the n data points – Order the n data values – If the number of data points is odd, the sample median is the value in position (n+1)/2 – If the number of data points is even, the sample median is the average of the values in positions n/2 and (n/2)+1
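
The definitions of the mean and the median on the two slides above can be sketched directly in Python. This is a minimal illustration rather than part of the original deck; the function names `mean` and `median` and the sample data are assumptions chosen for the example.

```python
def mean(values):
    """Arithmetic mean: the sum of the data points divided by their count."""
    if not values:
        raise ValueError("mean requires at least one data point")
    return sum(values) / len(values)


def median(values):
    """Sample median following the ordered-position rule from the slide."""
    ordered = sorted(values)           # step 1: order the n data values
    n = len(ordered)
    if n == 0:
        raise ValueError("median requires at least one data point")
    mid = n // 2
    if n % 2 == 1:
        # Odd n: 1-based position (n+1)/2 is 0-based index n // 2.
        return ordered[mid]
    # Even n: average the values at 1-based positions n/2 and (n/2)+1.
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [12, 8, 10, 9, 11]              # hypothetical sample, unordered on purpose
print("mean:", mean(data))
print("median:", median(data))
print("median of an even-sized sample:", median([4, 1, 3, 2]))
```

Sorting first is what makes the position rule work; the input does not need to arrive in order.
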
- 17. Mean or Median? • Both measures provide a “middle” value of the data, so how do they compare? – The median is robust against extreme values in the data – The mean is affected by extreme values • Example: let 8, 9, 10, 11, 12 be the five data points – Mean = 10 and Median = 10 – Replace 12 by 18 • Mean = 11.2 but Median = 10
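
The slide's example can be checked with Python's standard `statistics` module. The variable names below are illustrative assumptions; the data points are the ones given on the slide.

```python
import statistics

original = [8, 9, 10, 11, 12]
with_outlier = [8, 9, 10, 11, 18]      # 12 replaced by 18, as on the slide

for label, data in (("original", original), ("with outlier", with_outlier)):
    # The mean shifts when one value becomes extreme; the median does not.
    print(f"{label}: mean = {statistics.mean(data)}, "
          f"median = {statistics.median(data)}")
```
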
- 18. Numerical Methods • Mode – The mode is the value in the data that occurs with the highest frequency – It is the most probable value of the data – Data can have more than one mode; such data is called multimodal.
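
A short sketch of finding the mode, including the multimodal case mentioned above. The helper name `modes` and the sample data are assumptions made for illustration.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, sorted."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([5, 5, 1]))             # a single mode
print(modes([1, 2, 2, 3, 3, 4]))    # two modes: the data is multimodal
```

Returning a list rather than a single value keeps the multimodal case explicit instead of silently picking one of the tied values.
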
- 19. Measures of Variability • Percentile – Order the data in ascending order • Then p1 is called the first percentile if 1% of the points lie below this value • Similarly, pk is called the k-th percentile if k% of the data points lie below this value, where 0≤k≤100 • Quartile – P25 is called the 1st quartile Q1 – P75 is called the 3rd quartile Q3 – P50 is the median
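
As a rough illustration of the relationship between percentiles and quartiles, the sketch below uses `statistics.quantiles` from the Python standard library. Several interpolation conventions for percentiles exist; this one uses the library's default ("exclusive") method, which is an assumption and not necessarily the convention intended in the lecture. The sample data is hypothetical.

```python
import statistics

data = [12, 7, 3, 18, 9, 15, 5, 21, 11, 14, 8]   # hypothetical sample

# Three cut points that split the ordered data into four equal parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Ninety-nine cut points p1..p99; index k-1 holds the k-th percentile.
percentiles = statistics.quantiles(data, n=100)
p25, p50, p75 = percentiles[24], percentiles[49], percentiles[74]

print("Q1, Q2, Q3:", q1, q2, q3)
print("P25, P50, P75:", p25, p50, p75)
print("median:", statistics.median(data))
```
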
- 20. Measure of Dispersion • Measures the spread of data – Range – Variance or standard deviation: measures the spread about the mean (average) value of the data – Interquartile range: measures the spread about the median value of the data
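- 21. Measure of Dispersion • Range = M − m, where – M = max(x1, x2, …, xn) – m = min(x1, x2, …, xn) • Variance – $S^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ (sample variance, with $\bar{x}$ the mean of the data) – Standard deviation = S • Interquartile range: Q3 − Q1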
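
The dispersion measures on this slide can be computed as follows. This is a sketch under stated assumptions: the variance uses the sample form with an n − 1 divisor (the slide text does not show which divisor was intended), the quartiles use the default method of `statistics.quantiles`, and the helper name `sample_variance` and the data are illustrative.

```python
import math
import statistics


def sample_variance(values):
    """Sample variance: squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance requires at least two data points")
    centre = sum(values) / n
    return sum((x - centre) ** 2 for x in values) / (n - 1)


data = [8, 9, 10, 11, 12]                 # hypothetical sample

data_range = max(data) - min(data)        # Range = M - m
variance = sample_variance(data)          # S^2 (assumed n - 1 divisor)
std_dev = math.sqrt(variance)             # S
q1, _, q3 = statistics.quantiles(data, n=4)
iqr = q3 - q1                             # Interquartile range = Q3 - Q1

print("range:", data_range)
print("variance:", variance)
print("standard deviation:", std_dev)
print("interquartile range:", iqr)
```
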
- 22. Standard Deviation • Standard deviation is the most commonly used measure of dispersion. – Under the assumption of normality, the range $\bar{x} \pm S$ covers about 68% of the data. • Hence, it is commonly used to show the possible error in an observed value.
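
For normally distributed data, roughly 68% of values fall within one standard deviation of the mean. The sketch below checks this in two ways: from the normal distribution itself via `math.erf`, and empirically on simulated data. The seed, the sample size and the distribution parameters (mean 50, standard deviation 10) are arbitrary assumptions for the demonstration.

```python
import math
import random
import statistics

# Theoretical share of a normal distribution within one standard deviation.
theoretical = math.erf(1 / math.sqrt(2))

# Empirical check on simulated normal data (fixed seed for repeatability).
rng = random.Random(0)
sample = [rng.gauss(50, 10) for _ in range(20000)]
centre = statistics.fmean(sample)
spread = statistics.stdev(sample)
within = sum(1 for x in sample if centre - spread <= x <= centre + spread) / len(sample)

print(f"theoretical share within one standard deviation: {theoretical:.4f}")
print(f"share observed in the simulated sample: {within:.4f}")
```
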
- 23. Graphical Method • Histogram or Bar chart – Frequency Plot • Pie Chart • Cumulative frequency plot • Box and Whisker plot
- 24. Bivariate • "Bi" means two and "variate" means variable, so here there are two variables. The analysis concerns the relationship between the two variables and its possible cause. • Correlation • Covariance
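
The slides name covariance and correlation without giving formulas, so the sketch below assumes the usual sample covariance (n − 1 divisor) and the Pearson correlation coefficient. The function names and the paired data are hypothetical.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (n - 1 divisor)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(xs, ys) / math.sqrt(covariance(xs, xs) * covariance(ys, ys))


xs = [1, 2, 3, 4, 5]        # hypothetical paired observations
ys = [2, 4, 6, 8, 10]       # exactly 2 * xs, so perfectly linearly related

print("covariance:", covariance(xs, ys))
print("correlation:", correlation(xs, ys))
```

Covariance depends on the units of both variables, while correlation is unit-free and lies between -1 and 1, which makes it easier to compare across variable pairs.
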
