Biostatistics

- 1. Introduction to Biostatistics Dr. Karunambigai.M Public Health Sciences Department KI University
- 2. Data • Research is any process by which information is systematically and carefully gathered for the purpose of answering questions, examining ideas, or testing theories. • Numerical information collected as part of any research is called Data. Depending on the nature of the problem, the data may relate to individuals, families, houses, villages etc… • The data collected are known as observations. The individual subjects upon whom the data are collected are known as statistical units.
- 3. Variables • The characteristics or events that are measured on a subject, in a research study are called variables, because they vary. (i.e., they take different values in different subjects or vary from one subject to another). • Variables are measured according to two broad types of measurement scales: Numerical & Categorical (otherwise known as Quantitative & Qualitative).
- 4. Types of datasets and their measures • Population - dataset consisting of all outcomes, measurements, or responses of interest. • Sample - dataset which is a subset of the population. • Parameter - a numerical measurement made using the population. • Statistic - a numerical measurement made using a sample.
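The distinction between a parameter and a statistic can be sketched in a few lines of Python. The blood pressure values, the sample size of 4, and the fixed seed below are all hypothetical, chosen only for illustration.

```python
import random
from statistics import mean

# Hypothetical population: systolic blood pressure (mmHg) of all 12 people of interest.
population = [118, 122, 130, 125, 140, 135, 128, 119, 121, 133, 127, 138]

# Parameter: a numerical measurement made using the whole population.
population_mean = mean(population)

# Sample: a subset of the population (seed fixed so the draw is repeatable).
rng = random.Random(0)
sample = rng.sample(population, 4)

# Statistic: the same measurement made using only the sample.
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The sample mean usually differs from the population mean; that gap is what inferential methods try to quantify.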
- 5. Properties of Measurement • Difference - Different numerals denote different values the variable can take • Magnitude - One value can be said to be more or less than another • Equal Appearing Interval - Successive numerals are separated by equal distances • True Zero - Zero has an absolute meaning
- 6. Level of Measurement (Measurement Scale) • Nominal • Ordinal • Interval • Ratio The measurement levels are considered in the following hierarchy: (LOWEST) Nominal - Ordinal - Interval - Ratio (HIGHEST)
- 7. Nominal Scale • Numbers serve as labels. • Numbers are used only for identification, in one-to-one correspondence with the objects. • The only permissible operation is counting. • Statistical analysis is based on frequency counts, such as percentage and mode. Example: gender, religion, locality, party affiliation etc.
- 8. Ordinal Scale • A ranking scale: numbers are assigned to indicate the relative extent to which an object possesses some characteristic. • Can determine whether an object has more or less of a characteristic than another object, but not how much more or less. • Any series of numbers that preserves the ordered relationship among objects can be used. • Along with the counting operation of the nominal scale, this scale supports statistics based on percentiles, quartiles and the median. Example: social class, severity of a behavior disorder
- 9. Interval Scale • Distance between any two adjacent points on the scale is fixed and equal • It allows comparison of the difference between two objects • Meaningful addition and subtraction of scale values are possible • The zero point and the unit of measurement are arbitrary • In addition to the statistical techniques applied to nominal and ordinal data, the arithmetic mean and standard deviation are used. Example: Temperature (Fahrenheit or Celsius)
- 10. Ratio Scale • Possesses all the properties of the nominal, ordinal and interval scales • Has an absolute zero point • It is meaningful to calculate ratios of scale values. • All statistical techniques can be applied. Examples: income, age, weight, height, and so on
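The statistics each scale supports can be illustrated with a short Python sketch. All data values and the helper `c_to_f` are hypothetical; the point is only which summaries are meaningful at each level, and why ratios fail on an interval scale but hold on a ratio scale.

```python
from statistics import mean, median, mode, stdev

# Nominal: labels only, so counting and the mode are the meaningful summaries.
blood_groups = ["A", "O", "B", "O", "AB", "O", "A"]
nominal_mode = mode(blood_groups)

# Ordinal: severity coded 1 < 2 < 3; order is meaningful, so the median applies.
severity = [1, 3, 2, 2, 1, 3, 2]
ordinal_median = median(severity)

# Interval: Celsius temperatures; differences are meaningful, so mean and SD apply.
temps_c = [10.0, 20.0, 30.0]
interval_mean = mean(temps_c)
interval_sd = stdev(temps_c)

def c_to_f(c):
    """Convert Celsius to Fahrenheit (a change of arbitrary zero and unit)."""
    return c * 9 / 5 + 32

# The ratio of two temperatures changes with the unit, so it carries no meaning.
ratio_c = temps_c[1] / temps_c[0]
ratio_f = c_to_f(temps_c[1]) / c_to_f(temps_c[0])

# Ratio: weight has a true zero, so the ratio survives a change of unit.
weights_kg = [40.0, 80.0]
ratio_kg = weights_kg[1] / weights_kg[0]
ratio_g = (weights_kg[1] * 1000) / (weights_kg[0] * 1000)

print(nominal_mode, ordinal_median, interval_mean, interval_sd)
print(ratio_c, ratio_f, ratio_kg, ratio_g)
```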
- 11. Categorical variables • They can be placed into one of two (dichotomous) or more (polychotomous) categories. • Examples of dichotomous categorical variables: Male / Female; Pregnant / Not pregnant; Smoker / Non-smoker; Married / Single • However, many classifications require more than two categories. For example, Married / Single / Divorced / Separated / Widowed; Blood group: A / B / AB / O; Religion: Hindu / Christian / Muslim etc. There is no ordering of the categories. • These are examples of the nominal scale, in which the values fall into unordered categories or classes.
- 12. Categorical variables • But often there is a natural order, as with the varying stages of cancer and social class. • Example: degree of smoking can be divided into non-smokers / ex-smokers / light smokers / heavy smokers. This is an example of the ordinal scale. • In ordinal scales, the categories bear an ordered relationship to one another.
- 13. Numerical variables • Also called quantitative or interval variables. They are expressed as integers, fractions or decimals, in which equal distances exist between successive intervals. Age, systolic & diastolic blood pressure, and height are examples of continuous variables. • Numerical variables can be further divided into discrete & continuous. A discrete numerical variable can take only certain separated values over a range; these values differ by a fixed amount, and no intermediate values are possible. • Examples of discrete numerical variables are number of children, number of ectopic heart beats etc.
- 14. Numerical variables • Data that represent measurable quantities but are not restricted to taking on specified values such as integers are known as continuous data. • If the values of the measurement take any number in a range, the data are said to be continuous. • The difference between any two possible data values can be very small. Common examples include height, weight, temperature etc… • Continuous data can be reduced to several categories.
- 15. Discrete vs. Continuous Data • Discrete data: gaps between possible values • Continuous data: no gaps between possible values
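A minimal sketch of the discrete/continuous distinction, and of reducing continuous data to categories. The data, the function name `age_group`, and the cut points of 18 and 65 years are hypothetical and chosen only for illustration.

```python
# Discrete data: counts of children per family; only whole numbers are possible.
children_per_family = [0, 2, 1, 3, 2]

# Continuous data: age in years; any value within the range is possible.
ages = [4.5, 17.9, 18.0, 42.3, 70.1]

def age_group(age_years):
    """Reduce a continuous age to one of three ordered categories (assumed cut points)."""
    if age_years < 18:
        return "child"
    if age_years < 65:
        return "adult"
    return "older adult"

groups = [age_group(a) for a in ages]
print(groups)
```

Grouping discards detail: 4.5 and 17.9 years end up in the same category even though they are far apart on the original scale.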
- 16. Derived Variables • Used to measure diseases in epidemiological studies. • Rate, ratio and proportion. Ratio: quantifies the magnitude of one occurrence or condition relative to another. Expresses the relationship between two numbers. Example: The ratio of males to females in Ethiopia. Proportion: quantifies occurrences in relation to the population in which these occurrences take place. Often expressed as a percentage. Example: The proportion of all births that were male.
- 17. Derived Variables… • Rate: expresses probability or risk of disease in a defined population over a specified period of time. Considered to be a basic measure of disease occurrence. Example: The number of newly diagnosed breast cancer cases per 100,000 women during a given year.
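The three derived variables can be worked through with a small Python sketch. Every count below is hypothetical (none comes from real Ethiopian or cancer registry data), and the one-year observation period is an assumption.

```python
# Hypothetical birth counts in one year.
male_births = 520
female_births = 480
total_births = male_births + female_births

# Ratio: one quantity relative to another; the numerator is not part of the denominator.
sex_ratio = male_births / female_births

# Proportion: occurrences relative to the whole in which they occur, often shown as a percentage.
proportion_male = male_births / total_births
percent_male = 100 * proportion_male

# Rate: new cases in a defined population over a specified period,
# scaled to a convenient base (here per 100,000 women per year).
new_cases_in_year = 150
women_at_risk = 250_000
rate_per_100k = new_cases_in_year * 100_000 / women_at_risk

print(round(sex_ratio, 3), percent_male, rate_per_100k)
```

A proportion always lies between 0 and 1 because its numerator is included in its denominator, while a ratio has no such bound.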
- 18. Data collection • There are two sources of data: • Primary Data: data measured or collected first hand by the investigator or the user, directly from the source. • Secondary Data: data gathered or compiled from published and unpublished sources or files.
- 19. Planning & Measuring Planning: • Identify the source and elements of the data. • Decide whether to use a sample or a census. • If sampling is preferred, decide on sample size, selection method, etc. • Decide on the measurement procedure. • Set up the necessary organizational structure. Measuring: • There are different methods.
- 20. Methods of collecting primary data • Survey method - The investigator makes personal contact with the informants, either directly or indirectly, and collects the data (telephone interview, mail questionnaires) - Collected information is more reliable/accurate • Experimental method - Determines whether, and in what manner, variables are related to each other - Used by large-scale organizations with R & D departments to determine cause-and-effect relationships - Example: studying the effect of fertilizer on crops
- 21. Methods of collecting primary data… • Observation method - The investigator observes the overall nature of the event and collects the required data. - Devices used include automatic recorders, motion pictures etc. - Example: an individual researching the growth of plants or the behavior of bats observes closely and records the required information. - Gives more accurate results and supplementary information, but is costly and time consuming.
- 22. Secondary data sources • Official publications of Government • Publications of research institutions • Professional bodies • Economic, trade and scientific journals
- 23. When the source is secondary data check that: • The type and objective of the situations. • The purpose for which the data were collected is compatible with the present problem. • The nature and classification of the data are appropriate to our problem. • There are no biases or misreporting in the published data. Note: Data which are primary for one may be secondary for the other.
- 24. Descriptive Vs Inferential Statistics Depending on how data can be used, statistics is sometimes divided into two main areas or branches. • Descriptive Statistics: is concerned with summary calculations, graphs, charts and tables. It characterizes or describes a set of data elements by graphically displaying the information or describing its central tendencies and how it is distributed.
- 25. • Inferential Statistics: consists of generalizing from samples to populations, performing estimations and hypothesis tests, determining relationships among variables, and making predictions. Statistical techniques based on probability theory are required. • Example: the following is the number of malaria patients who were treated in a hospital each year from 2001 to 2005: 3645; 4568; 5432; 6751; 7369. If we calculate the average number of malaria patients from 2001 to 2005, then our work belongs to the domain of descriptive statistics. If we predict the number of malaria patients in the year 2015 to be 9917, then our work belongs to the domain of inferential statistics.
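The malaria example can be sketched in Python. The average is the descriptive step. For the predictive step, the slide does not say how its 2015 figure was obtained, so the least-squares straight line below is purely an assumption, one of many possible models, and it need not reproduce that figure. The function name `predict` is hypothetical.

```python
from statistics import mean

years = [2001, 2002, 2003, 2004, 2005]
patients = [3645, 4568, 5432, 6751, 7369]

# Descriptive statistics: summarize the data actually observed.
average = mean(patients)

# Predictive step (assumed model): fit a least-squares straight line
# through the yearly counts and extend it beyond the observed years.
year_bar = mean(years)
slope = (
    sum((x - year_bar) * (y - average) for x, y in zip(years, patients))
    / sum((x - year_bar) ** 2 for x in years)
)

def predict(year):
    """Extrapolate the fitted line to a given year (illustrative only)."""
    return average + slope * (year - year_bar)

print("average patients per year, 2001-2005:", average)
print("fitted yearly increase:", slope)
print("extrapolated count for 2015:", predict(2015))
```

Extrapolating ten years past the data rests heavily on the assumption that the trend stays linear, which is why inferential work also reports uncertainty rather than a single number.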
- 26. Thank You