Measures in Statistics and Basics of Data

Published in:
Health & Medicine

- 1. Introduction to Statistics By: Mr. Johny Kutty Joseph, Asst. Professor
- 2. Concepts & Definition • Statistics is used to organize, interpret, and communicate numeric information. • It requires logical thinking more than mathematical ability. • The word statistics comes from the Italian word statista (statesman) and the German word Statistik (the study of the political state). • It is the science of learning from numbers/data. • It is the science of collecting, classifying, analyzing, and interpreting data.
- 3. Concepts & Definition • "A branch of mathematics dealing with the collection, analysis, interpretation, and presentation of masses of numerical data." (Merriam-Webster) • "Statistics is defined as collection, presentation, analysis and interpretation of numerical data." (Croxton & Cowden) • It is the science and art of dealing with figures and facts.
- 4. Uses of Statistics • To make raw data meaningful. • To test the null hypothesis. • To test the statistical significance of data. • To draw inferences and make generalizations. • To estimate parameters. • To make decisions and predictions based on data. • To facilitate comparison.
- 5. Biostatistics • Biostatistics is the branch of statistics applied to the biological or medical sciences. • It comprises the methods used in dealing with statistics in health science fields such as biology, medicine, nursing, and public health. • Biostatistics is also called "biometry".
- 6. Data • Data are defined as things known or assumed as facts, forming the basis of reasoning or calculation. • Broadly, there are quantitative and qualitative data. • Quantitative data deal with numbers and things you can measure objectively, e.g., height, weight, length, temperature, volume, area. They are numeric values. • Qualitative data deal with characteristics and descriptors that can't be easily measured but can be observed subjectively, e.g., smells, tastes, textures, attractiveness, and color.
- 7. Data • Quantitative data are either continuous or discrete. • Discrete data are counts that can't be made more precise. For instance, the number of children in your family is discrete data, because you are counting whole, indivisible entities: you can't have 2.5 kids. • Continuous data can be divided and reduced to finer and finer levels. E.g., the height of children can be measured more precisely in meters, centimeters, millimeters, and beyond, so height is continuous data.
- 8. Data • Qualitative data are also referred to as attribute data and may be binary, nominal (unordered), or ordinal (ordered). • Binary data place things in one of two mutually exclusive categories: right/wrong, true/false, or accept/reject. • Nominal data: individual items are assigned a number or category that has no implicit or natural value or rank (e.g., gender: 1 = male, 2 = female). • Ordinal data: items are assigned to categories that have some implicit or natural order, e.g., "short, medium, or tall", or a rating from 1 to 5 on a scale where 5 is most appropriate.
- 9. Scales of Measurement • Measurement is the process of assigning numbers or labels to objects, persons, states, or events in accordance with specific rules to represent quantities or qualities of attributes. • We do not measure specific objects, persons, etc.; we measure the attributes or features that define them. • A scale of measurement is a system of classifying measurements according to the nature of the measurement and the type of mathematical operations it permits.
- 10. Scales of Measurement
- 11. Data Classification in Science
- 12. Nominal Measurement • The lowest level of measurement, also referred to as categorical data. • It represents characteristics, e.g., gender, language, locality. Numerical values may be assigned but have no mathematical meaning. • The values act as labels, so changing their order has no significance.
- 13. Ordinal Measurement • It is the second level, in which scores are assigned so that as the number increases, the status or condition also increases. • The limitation of this type of data is that the differences between adjacent options are not known or not equal. • It is mainly used to measure non-numerical features such as patient satisfaction. E.g., How often do you feel back pain? No pain: 1, Mild pain: 2, Moderate: 3, Severe: 4.
- 14. Interval Measurement • An interval scale has the characteristics of an ordinal scale. • An interval scale places data at equally spaced intervals across the spread of the variable. • The scale has a starting and a terminating point, and the range between them is divided into equal intervals. • The limitation of interval data is that they have no true zero. E.g., What is the room temperature? a) -20 to -10; b) -10 to 0; c) 0 to 10; d) 10 to 20.
- 15. Ratio Measurement • It is the highest level of measurement. • A ratio scale measures in equal intervals from an absolute zero point of origin. It has all the properties of the nominal, ordinal, and interval scales. • Bio-physiological characteristics such as age, weight, and height are examples. • Variables measured on either an interval or a ratio scale are considered continuous. • E.g., it can be stated that someone who weighs 80 kg is twice as heavy as someone who weighs 40 kg.
- 16. Comparison of levels • The levels of measurement form a hierarchy, with ratio at the top and nominal at the base. • The higher the level of measurement, the more precise the data. • Data can be converted to a lower level, but not the reverse. • Ratio data may be converted to ordinal, but ordinal data cannot be converted to ratio. E.g., assessing the weight of people: Ordinal: a. Below 50; b. 50 to 70; c. Above 70. Ratio: a. 40 to 50; b. 50 to 60; c. 60 to 70; d. 70 to 80. • Some psychological scales (e.g., the Likert scale) are treated as ordinal as well as interval.
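The one-way conversion described on slide 16 can be sketched in Python. This is a minimal illustration, not part of the original deck: the function name `weight_category`, the sample weights, and the choice to place exactly 50 kg and 70 kg in the middle category are all assumptions.

```python
def weight_category(weight_kg: float) -> str:
    """Map a ratio-scale weight (kg) to the slide's ordinal categories.

    Assumption: the boundary values 50 and 70 fall in the middle category.
    """
    if weight_kg < 50:
        return "Below 50"
    if weight_kg <= 70:
        return "50 to 70"
    return "Above 70"


# Hypothetical ratio-scale measurements in kg (not from the slides).
weights_kg = [42.5, 50.0, 63.2, 70.0, 81.4]

# Ratio -> ordinal works; the exact weights cannot be recovered afterwards.
categories = [weight_category(w) for w in weights_kg]
print(categories)
```

Once the weights are reduced to categories, 50.0, 63.2, and 70.0 become indistinguishable, which is why the conversion cannot be reversed.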
- 17. Classification of Statistics • Descriptive statistics: the enumeration, organization, and graphical representation of data. It helps summarize the meaning of data, e.g., demographic variables. • Inferential statistics: also called sampling statistics. It infers the conditions that exist in a large set of observations (the population) from a sample, e.g., testing the efficacy of a new antihypertensive drug in a particular population.
- 18. Descriptive Statistics • It is classified as follows: • Frequency distribution and graphical presentation (measures of condensation). • Measures of central tendency (mean, median, mode). • Measures of dispersion (variability), e.g., range, mean deviation, standard deviation, quartile deviation. • Measures of relationship (correlation coefficient, regression, etc.).
- 19. Frequency Distribution • A set of data can be described in terms of three characteristics: distribution of values, central tendency, and variability (dispersion and relationship). • A frequency distribution is used to organize numeric data. • It is a systematic arrangement of values from lowest to highest, together with a count of the number of times (frequency) each value was obtained.
- 20. Frequency Distribution • Observe the table below of anxiety scores of 60 patients. • Inspection of these numbers alone does not help us understand the patients' anxiety. 22 24 25 19 24 25 23 23 24 20 25 16 20 25 17 22 24 18 22 23 15 24 23 22 21 24 20 25 18 25 24 23 16 25 30 20 19 21 23 24 19 18 20 21 17 25 22 24 20 17 20 25 21 24 23 19 21 21 25 21
- 21. Frequency Distribution • A frequency distribution consists of two parts: observed values (X) and frequencies (f). N is the sample size. • Scores are listed in order in one column and the corresponding frequencies in another. • The sum of the frequencies must equal N (Σf = N). • The following frequency distribution table of the patients' anxiety scores gives a clearer understanding of the data.
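- 22. Frequency Distribution

  | Score (X) | Frequency (f) | Percentage (%) |
  |-----------|---------------|----------------|
  | 15        | 1             | 1.7            |
  | 16        | 2             | 3.3            |
  | 17        | 3             | 5.0            |
  | 18        | 4             | 6.7            |
  | 19        | 4             | 6.7            |
  | 20        | 7             | 11.7           |
  | 21        | 7             | 11.7           |
  | 22        | 4             | 6.7            |
  | 23        | 8             | 13.3           |
  | 24        | 10            | 16.7           |
  | 25        | 10            | 16.7           |
  | Total     | N = 60 = Σf   | Σ% = 100.0%    |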
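As a quick check of the rule Σf = N, the percentage column can be recomputed from the frequencies. The sketch below copies the (X, f) pairs from slide 22; the variable names are illustrative choices, not from the deck.

```python
# Frequencies (f) for each anxiety score (X), copied from the slide 22 table.
frequency = {15: 1, 16: 2, 17: 3, 18: 4, 19: 4, 20: 7,
             21: 7, 22: 4, 23: 8, 24: 10, 25: 10}

# Sum of frequencies; by the rule on slide 21 this must equal N.
n = sum(frequency.values())

# Percentage of the sample at each score, rounded to one decimal place.
percentage = {score: round(100 * f / n, 1) for score, f in frequency.items()}

for score, f in frequency.items():
    print(f"{score:>5} {f:>4} {percentage[score]:>6.1f}")
print("N =", n)
```

Because each percentage is rounded separately, the rounded column may not add up to exactly 100.0 even though the unrounded percentages do.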
- 23. Tables • Tables present data concisely and systematically from masses of statistical data. • Tabulation is the first step in data analysis. • A table consists of a table number, title, contents, footnotes, etc. • Tables are broadly classified into: A. Frequency distribution table; B. Contingency table; C. Multiple response table; D. Miscellaneous table.
- 24. Tables • Frequency distribution tables: they present the frequency and percentage distribution of the collected information. Usually the number of classes varies between 3 and 8. Too many or too few classes may fail to reveal the salient features of the data.

  Socio-demographic profile of patients (N = 60)

  | Variable       | Category    | f (%)     |
  |----------------|-------------|-----------|
  | Age (years)    | 20-40       | 18 (30.0) |
  |                | 41-60       | 42 (70.0) |
  | Gender         | Male        | 39 (65.0) |
  |                | Female      | 21 (35.0) |
  |                | Transgender | 0 (0.0)   |
  | Marital status | Married     | 52 (86.7) |
  |                | Unmarried   | 8 (13.3)  |
  |                | Divorced    | 0 (0.0)   |
  | Locality       | Urban       | 31 (51.7) |
  |                | Rural       | 29 (48.3) |
- 25. Tables • Contingency tables: they present the frequency distribution of two nominal variables simultaneously, each with mutually exclusive categories. They are also called cross tables. These tables may be 2x2, 2x3, 3x3, etc., depending on the number of categories of each variable. The number of subjects in a cell is called the cell frequency. These tables are usually used for the chi-square test.

  Type of ventilation and bowel movements in patients

  | Bowel movements | Spontaneous ventilation, f (%) | Mechanical ventilation, f (%) | Total frequency | χ² value                    |
  |-----------------|--------------------------------|-------------------------------|-----------------|-----------------------------|
  | Present         | 391 (64.0)                     | 32 (29.4)                     | 423             | 45.87, df = (c-1)(r-1) = 1  |
  | Absent          | 220 (36.0)                     | 77 (70.6)                     | 297             |                             |
  | Total           | 611                            | 109                           | 720 (N)         |                             |
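To show where a chi-square value for such a table comes from, here is a minimal sketch of the Pearson chi-square statistic, Σ (O - E)² / E, applied to the four cell frequencies on slide 25. It assumes no continuity correction; the slide does not say which variant was used. For these counts the uncorrected statistic comes out at about 45.79, slightly different from the 45.87 printed on the slide.

```python
# Observed cell frequencies from slide 25.
# Rows: bowel movements present / absent.
# Columns: spontaneous / mechanical ventilation.
observed = [[391, 32],
            [220, 77]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / N.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi_square += (o - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(observed) - 1) * (len(observed[0]) - 1)

print(round(chi_square, 2), df)
```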
- 26. Tables • Multiple response table: it is used to present data whose categories are neither mutually exclusive nor exhaustive. It is used when the sum of the frequencies (Σf) exceeds N. It is made to present the percentage distribution.

  Factors contributing to sleep deprivation among patients (N = 60)

  | Factor                 | f (%)     |
  |------------------------|-----------|
  | Blood sampling         | 35 (58.3) |
  | Diagnostic tests       | 33 (55.0) |
  | Medication             | 33 (55.0) |
  | Vital signs monitoring | 32 (53.3) |
  | Noise                  | 32 (53.3) |
  | Bright lights          | 30 (50.0) |
- 27. Tables • Miscellaneous table: a table that presents data other than frequency or percentage distributions, such as mean, median, mode, SD, etc.
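The summary values a miscellaneous table would hold can be computed with Python's standard `statistics` module. The sketch below rebuilds the 60 individual scores from the frequency table on slide 22; using that table as input, and reporting the sample (n - 1) standard deviation, are assumptions made for illustration.

```python
import statistics

# Frequencies (f) for each anxiety score (X), copied from the slide 22 table.
frequency = {15: 1, 16: 2, 17: 3, 18: 4, 19: 4, 20: 7,
             21: 7, 22: 4, 23: 8, 24: 10, 25: 10}

# Expand the table into one list entry per patient.
scores = [score for score, f in frequency.items() for _ in range(f)]

mean = statistics.mean(scores)
median = statistics.median(scores)
modes = statistics.multimode(scores)    # every value tied for most frequent
sd = statistics.stdev(scores)           # sample standard deviation (n - 1)
score_range = max(scores) - min(scores)

print(f"mean={mean:.2f} median={median} modes={modes} "
      f"sd={sd:.2f} range={score_range}")
```

`multimode` is used instead of `mode` because two scores share the highest frequency in this table.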
- 28. Graphical Representation of Data • It is the most convenient and appealing way to present statistical results. • It gives an overall view of the entire data set and is visually attractive. • It facilitates comparison.
- 29. Types of Graphs and Diagram • Bar diagram: useful for displaying nominal or ordinal data. It gives a visual comparison of the magnitude of a variable and its frequency. It may be drawn vertically or horizontally. • There are three main types of bar diagram: simple, multiple, and proportional bar charts. See the following examples.
- 30. Types of Graphs and Diagram [Figure: simple bar diagram showing dietary pattern of people. Vegetarian: 72; Non-vegetarian: 28.]
- 31. Types of Graphs and Diagram [Figure: multiple bar diagram showing the percentage of population and land. Population: Asia 60, Africa 14, Europe 26. Land: Asia 40, Africa 30, Europe 30.]
- 32. Types of Graphs and Diagram [Figure: proportionate bar graph showing the world's population and land area. Population: Asia 60, Africa 14, Europe 26. Land: Asia 40, Africa 30, Europe 30.]
- 33. Types of Graphs and Diagram • Pie diagram/sector diagram: useful for presenting discrete data such as age groups or gender in a population. The input must be in percentages. The angle of each sector is calculated by the formula (class frequency / total observations) x 360 degrees. [Figure: pie diagram of health problems of the old age in Jammu. Hypertension: 32; Diabetes: 40; Arthritis: 8; Sensory: 20.]
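The angle formula on slide 33 can be applied directly to the four values shown in the pie diagram. In this sketch, the pairing of each number with a label follows the order in which they appear on the slide, which is an assumption.

```python
# Percentages from the pie diagram on slide 33.
# Assumption: values and labels pair up in the order listed on the slide.
health_problems = {"Hypertension": 32, "Diabetes": 40,
                   "Arthritis": 8, "Sensory": 20}

total = sum(health_problems.values())

# Sector angle = class frequency / total observations * 360 degrees.
angles = {label: value / total * 360
          for label, value in health_problems.items()}

for label, angle in angles.items():
    print(f"{label:<13} {angle:6.1f} degrees")
```

The sector angles always add up to 360 degrees, which is a handy check before drawing the chart.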
- 34. Types of Graphs and Diagram • Histogram: the most commonly used graphical representation of a grouped frequency distribution. • The classes (groups) of the variable are on the x axis and their respective frequencies on the y axis. • The frequency of each group forms a column or rectangle. • The area of each rectangle is proportional to the frequency of the class interval. • E.g.:

  | Age group (years) | 15-20 | 20-25 | 25-30 | 30-35 | 35-40 |
  |-------------------|-------|-------|-------|-------|-------|
  | No. of males      | 15    | 20    | 40    | 60    | 50    |
- 35. Types of Graphs and Diagram
- 36. Types of Graphs and Diagram • Frequency polygon: the two-dimensional curve obtained by joining the midpoints of the tops of the rectangles in a histogram with straight lines. • The two end points of the line are joined to the x axis at the midpoints of the empty class intervals on either side. • It is simpler than a histogram and sketches the outline of the data more clearly. • E.g.:

  | Age group (years) | 15-20 | 20-25 | 25-30 | 30-35 | 35-40 |
  |-------------------|-------|-------|-------|-------|-------|
  | No. of males      | 15    | 20    | 40    | 60    | 50    |
- 37. Types of Graphs and Diagram [Figure: frequency polygon of Number of Males by age group (15-20 to 35-40); plotted values 0, 15, 20, 40, 60, 50, 0.]
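The points of the frequency polygon can be derived from the age-group table on slide 36. The following sketch computes the class midpoints and adds one empty class on each side so that the line returns to the x axis; the variable names are illustrative.

```python
# Class intervals and frequencies from the table on slide 36.
classes = [(15, 20), (20, 25), (25, 30), (30, 35), (35, 40)]
males = [15, 20, 40, 60, 50]

width = classes[0][1] - classes[0][0]            # all classes are 5 years wide
midpoints = [(low + high) / 2 for low, high in classes]

# Close the polygon at the midpoints of the empty classes on either side.
xs = [midpoints[0] - width] + midpoints + [midpoints[-1] + width]
ys = [0] + males + [0]

points = list(zip(xs, ys))
print(points)
```

Joining these points in order with straight lines gives the polygon; the y values match the sequence 0, 15, 20, 40, 60, 50, 0 shown on slide 37.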
- 38. Types of Graphs and Diagram • Line graph: frequencies are depicted by a line, as in a frequency polygon. • It is commonly used to present data collected over a long period of time. • The independent variable is shown on the x axis and the dependent variable on the y axis. • The plotted points are joined by straight lines.

  | Year                             | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 |
  |----------------------------------|------|------|------|------|------|------|------|
  | Cars sold in Delhi (thousands)   | 123  | 203  | 328  | 298  | 337  | 417  | 486  |
  | Cars sold in Mumbai (thousands)  | 456  | 402  | 387  | 347  | 342  | 307  | 298  |
- 39. Types of Graphs and Diagram [Figure: line graph presenting the number of cars sold in Delhi and Mumbai during 2001-2007; series: In Delhi, In Mumbai.]
- 40. Types of Graphs and Diagram • Cumulative frequency curve ("ogive"): a graphical representation of cumulative frequency. • First convert the frequency table to cumulative frequencies, then plot them as a line.

  | Age group (years)    | 15-20 | 20-25 | 25-30 | 30-35 | 35-40 |
  |----------------------|-------|-------|-------|-------|-------|
  | No. of males         | 15    | 20    | 40    | 60    | 50    |
  | Cumulative frequency | 15    | 35    | 75    | 135   | 185   |
- 41. Types of Graphs and Diagram [Figure: cumulative frequency curve of Number of Males by age group (15-20 to 35-40); plotted values 15, 35, 75, 135, 185.]
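The conversion from frequencies to cumulative frequencies on slide 40 is a running total. A minimal sketch using `itertools.accumulate` follows; pairing each total with the upper boundary of its class is a common convention for a "less than" ogive and is assumed here rather than taken from the slides.

```python
from itertools import accumulate

# Frequencies from the table on slide 40.
age_groups = ["15-20", "20-25", "25-30", "30-35", "35-40"]
males = [15, 20, 40, 60, 50]

# Running total of the frequencies.
cumulative = list(accumulate(males))

# Assumption: each point is plotted at the upper boundary of its class.
upper_bounds = [int(group.split("-")[1]) for group in age_groups]
ogive_points = list(zip(upper_bounds, cumulative))

print(cumulative)
print(ogive_points)
```

The last cumulative frequency always equals the total number of observations, here 185.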
- 42. Types of Graphs and Diagram • Scatter or dot diagram: a graphic representation that shows the nature of the correlation between two variables, e.g., student marks in an examination. • It is also called a correlation diagram. • The correlation may be positive (upward) or negative (downward).

  | Number of students | Marks obtained out of 100 |
  |--------------------|---------------------------|
  | 12                 | 40-50                     |
  | 10                 | 50-60                     |
  | 8                  | 60-70                     |
  | 7                  | 70-80                     |
  | 5                  | 80-90                     |
  | 2                  | 90-100                    |
- 43. Types of Graphs and Diagram [Figure: scatter diagram showing the negative correlation; x axis: marks obtained out of 100; y axis: number of students.]
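The direction of the correlation in the scatter diagram can be checked numerically with Pearson's correlation coefficient. In the sketch below, each marks class from slide 42 is represented by its midpoint, which is an assumption made so that the classes can be treated as numbers; the helper `pearson_r` is written out by hand for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Data from slide 42. Assumption: each class (40-50, 50-60, ...) is
# represented by its midpoint.
mark_midpoints = [45, 55, 65, 75, 85, 95]
students = [12, 10, 8, 7, 5, 2]

r = pearson_r(mark_midpoints, students)
print(f"r = {r:.3f}")
```

A value of r close to -1 corresponds to the downward pattern in the diagram: the higher the marks class, the fewer students fall in it.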
- 44. Types of Graphs and Diagram • Pictogram or picture diagram: uses pictures to plot the frequency of a characteristic. • Map diagram or spot map: maps are prepared to show the geographical distribution of the frequencies of a characteristic.
- 45. Limitations of Graphs • They can be confusing (depending on the type). • They present only quantitative data. • They show only one aspect or a limited number of characteristics. • They can present only approximate values.
