This document discusses rank correlation and Spearman's coefficient of rank correlation. Rank correlation is used to measure the relationship between two variables when only rank orders are available rather than exact numerical values. Spearman's coefficient of rank correlation (rs) is calculated using the differences in ranks between two data sets. A higher rs value indicates a closer relationship between the rankings. The document provides two examples to demonstrate calculating rs and interpreting the results.
Deforestation is one of the most serious environmental issues in Sri Lanka. ... Between 1990 and 2000, Sri Lanka lost an average of 26,800 ha of forests per year. This amounts to an average annual deforestation rate of 1.14%. Between 2000 and 2005 the rate accelerated to 1.43% per annum.
This document discusses correlation analysis and its various types. Correlation is the degree of relationship between two or more variables. There are three stages to solve correlation problems: determining the relationship, measuring significance, and establishing causation. Correlation can be positive, negative, simple, partial, or multiple depending on the direction and number of variables. It is used to understand relationships, reduce uncertainty in predictions, and present average relationships. Conditions like probable error and coefficient of determination help interpret correlation values.
Properties of coefficient of correlation - Nadeem Uddin
The document discusses properties of the coefficient of correlation (r) including:
1) r always lies between -1 and 1
2) r is the geometric mean of the two regression coefficients
3) Several examples are shown calculating r from regression coefficients and comparing to Pearson's coefficient of correlation.
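The second property can be sketched in a few lines of Python. This is a minimal illustration rather than the method used in the slides: the function name `r_from_regression` and the sample coefficients are hypothetical, and the sign convention (r takes the common sign of the two regression coefficients) is stated as an assumption in the code.

```python
import math


def r_from_regression(b_yx: float, b_xy: float) -> float:
    """Correlation coefficient as the geometric mean of the two regression coefficients.

    b_yx is the regression coefficient of y on x, b_xy that of x on y.
    Assumption: both coefficients share the same sign, and r takes that sign.
    """
    product = b_yx * b_xy
    if product < 0:
        raise ValueError("regression coefficients must have the same sign")
    r = math.sqrt(product)
    # The square root is non-negative, so restore the common sign.
    return -r if b_yx < 0 else r


# Hypothetical coefficients, chosen only for illustration.
r = r_from_regression(0.8, 0.45)
print(round(r, 4))
```

Because the product of the two regression coefficients equals r squared, the result always lies between -1 and 1 for coefficients estimated from the same data, which ties the second property back to the first.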
The term "central tendency" dates from the late 1920s (wikipedia). In statistics, especially in social research, central tendency is a type of average. There are generally three types of average: mean, median, and mode. An average is a number that represents the central value of a set of scores or a group of individuals (Guilford & Fruchter, 1978).
The document discusses resource conflicts between indigenous communities and governments/corporations in India. It provides historical context on colonial-era forest policies that alienated tribes from their traditional lands. Subsequent policies failed to recognize tribal rights, leading to current conflicts over mining, plantations, and tiger reserves. The Forest Rights Act of 2006 aimed to address this, but implementation has been problematic, fueling activism and the ongoing Maoist insurgency. Cases from Odisha, Chhattisgarh, Jharkhand, and Telangana illustrate persisting tensions.
This document provides an overview of the nature and scope of human geography. It discusses how human geography studies the relationship between human societies and the earth's surface. Key points covered include the different approaches to geography like environmental determinism and possibilism. Environmental determinism suggests that the environment determines human activities, while possibilism argues that humans can modify their environment. The document also discusses new determinism as a middle path between these views. It outlines the different schools of thought in human geography like welfare, radical, and behavioral schools. Finally, it discusses how human geography relates to other social science disciplines through different time periods.
Partial differential equations play an important role in daily life. In mathematics, a partial differential equation (PDE) is a differential equation that contains unknown multivariable functions and their partial derivatives. PDEs are used to formulate problems involving functions of several variables, and are either solved by hand, or used to create a computer model. A special case is ordinary differential equations (ODEs), which deal with functions of a single variable and their derivatives.
PDEs can be used to describe a wide variety of phenomena such as sound, heat, diffusion, electrostatics, electrodynamics, fluid dynamics, elasticity, or quantum mechanics. These seemingly distinct physical phenomena can be formalised similarly in terms of PDEs. Just as ordinary differential equations often model one-dimensional dynamical systems, partial differential equations often model multidimensional systems. PDEs find their generalisation in stochastic partial differential equations.
This document discusses correlation and regression analysis. It defines correlation analysis as examining the relationship between two or more variables, and regression analysis as examining how one variable changes when another specific variable changes in volume. It covers positive and negative correlation, linear and non-linear correlation, and how to calculate the coefficient of correlation. Regression analysis and regression equations are introduced for using a known variable to predict an unknown variable. Examples are provided to illustrate key concepts.
History of Environmental Grassroots movement - meharoof786
This document provides a brief history of grassroots environmental movements around the world. It discusses early movements in India like the Bishnoi movement in the 18th century and the Chipko movement in the 1970s. It also summarizes key environmental movements in the UK, US, and climate movements including the Kyoto Protocol, Copenhagen conference, People's Climate March, fossil fuel divestment movement, and Paris Agreement. It concludes by outlining the goals and campaigns of the environmental organization 350.org.
The document discusses various coordinate transformations between Cartesian, cylindrical, and spherical coordinate systems. It provides the transformation equations for scalar and vector variables between these coordinate systems. Examples are included to demonstrate transforming between Cartesian and cylindrical coordinates for points in both scalar and vector form. The key topics covered are the four types of coordinate transformations, the transformation equations, and examples to illustrate the transformations.
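As a rough sketch of the Cartesian and cylindrical case mentioned above, the standard transformation equations can be written as follows. The function names are hypothetical and the angle is assumed to be in radians; this is not taken from the document itself.

```python
import math


def cartesian_to_cylindrical(x, y, z):
    """Convert a point (x, y, z) to cylindrical (rho, phi, z), phi in radians."""
    rho = math.hypot(x, y)      # distance from the z-axis
    phi = math.atan2(y, x)      # azimuth angle, in (-pi, pi]
    return rho, phi, z


def cylindrical_to_cartesian(rho, phi, z):
    """Convert a point (rho, phi, z) back to Cartesian (x, y, z)."""
    return rho * math.cos(phi), rho * math.sin(phi), z


def vector_cartesian_to_cylindrical(ax, ay, az, phi):
    """Transform vector components (ax, ay, az) at azimuth phi to (a_rho, a_phi, a_z)."""
    a_rho = ax * math.cos(phi) + ay * math.sin(phi)
    a_phi = -ax * math.sin(phi) + ay * math.cos(phi)
    return a_rho, a_phi, az


rho, phi, z = cartesian_to_cylindrical(3.0, 4.0, 5.0)
print(rho, z)  # rho is 5.0 for the 3-4-5 triangle, z is unchanged
```

Note that a point needs only its own coordinates, while a vector also needs the azimuth of the point at which it is evaluated, because the cylindrical unit vectors change direction with phi.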
Spearman’s Rank Correlation Coefficient is a non-parametric statistical measure used to study the strength of association between two ranked variables. The method is applied to ordinal data, that is, values that can be arranged in order, one after the other, so that a rank can be given to each. These presentation slides explain the procedure for finding the Rank Difference correlation and its applications.
This document discusses various measures of dispersion in statistics. It defines dispersion as the extent to which items in a data set vary from the central value. Some key measures of dispersion discussed include range, interquartile range, quartile deviation, mean deviation, and standard deviation. Formulas and examples are provided for calculating range, quartile deviation, and mean deviation from data sets. The objectives, properties, merits and demerits of each measure are outlined.
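The three measures for which the document gives formulas can be sketched as below. The data set is made up for illustration, and the quartile convention is an assumption: Python's `statistics.quantiles` defaults to the "exclusive" method, and textbooks that locate quartiles differently may give slightly different values.

```python
import statistics


def value_range(data):
    """Range: largest value minus smallest value."""
    return max(data) - min(data)


def quartile_deviation(data):
    """Quartile deviation (semi-interquartile range): (Q3 - Q1) / 2.

    Assumption: quartiles follow statistics.quantiles' default 'exclusive' method.
    """
    q1, _median, q3 = statistics.quantiles(data, n=4)
    return (q3 - q1) / 2


def mean_deviation(data):
    """Mean absolute deviation of the values about their arithmetic mean."""
    mean = statistics.fmean(data)
    return sum(abs(x - mean) for x in data) / len(data)


sample = [2, 4, 6, 8, 10, 12, 14]   # hypothetical data
print(value_range(sample))          # 12
print(quartile_deviation(sample))   # 4.0
print(round(mean_deviation(sample), 4))
```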
Green’s Function Solution of Non-homogenous Singular Sturm-Liouville Problem - IJSRED
This document discusses solving non-homogeneous singular Sturm-Liouville problems using Green's function methods. It begins with an introduction to Green's functions and their use in solving differential equations with singularities. It then provides examples of applying Green's functions to solve two specific singular Sturm-Liouville problems - Bessel's equation and a second-order differential equation with a singular point at 0. The document derives the Green's function for each example problem and uses it to find the solution that satisfies the given boundary conditions.
An ideal in a ring is called a principal ideal if it is generated by a single element of the ring. The First Fundamental Theorem of Ideals states that every ideal in a principal ideal domain is a principal ideal.
Ecologically sustainable development involves meeting human needs while maintaining or enhancing natural ecosystems. It requires using resources efficiently and producing less waste. Tools to achieve ESD include life cycle analysis, environmental impact assessments, and environmental management systems. An EMS establishes procedures to manage environmental impacts and continually improve performance. ISO 14001 provides standards for EMS certification. Risk assessment, the precautionary principle, and regulatory frameworks also support ecologically sustainable development.
Road to Rio+20, UN Conference on Sustainable Development 2012 - ISCIENCES, L.L.C.
Road to Rio+20 is a summary of preparations for the United Nations Conference on Sustainable Development (UNCSD) called “Rio+20” to be held in Rio de Janeiro, Brazil June 20-22, 2012.
A relation maps elements from one set to another set through ordered pairs. The domain is the set of first elements in the ordered pairs and the range is the set of second elements. Relations can have properties like being reflexive, symmetric, transitive, or an equivalence relation. Relations are used in applications like relational databases, project scheduling, and communication networks.
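The definitions above translate directly into set operations. The following is a small sketch, assuming a relation is stored as a Python set of ordered pairs; the helper names and the example relation (same parity on {1, 2, 3, 4}) are illustrative only.

```python
def domain(relation):
    """Set of first elements of the ordered pairs."""
    return {a for a, _ in relation}


def value_range(relation):
    """Set of second elements of the ordered pairs."""
    return {b for _, b in relation}


def is_reflexive(relation, elements):
    """Every element of the underlying set is related to itself."""
    return all((a, a) in relation for a in elements)


def is_symmetric(relation):
    """Whenever (a, b) is in the relation, so is (b, a)."""
    return all((b, a) in relation for a, b in relation)


def is_transitive(relation):
    """Whenever (a, b) and (b, d) are in the relation, so is (a, d)."""
    return all((a, d) in relation
               for a, b in relation
               for c, d in relation
               if b == c)


def is_equivalence(relation, elements):
    """An equivalence relation is reflexive, symmetric, and transitive."""
    return (is_reflexive(relation, elements)
            and is_symmetric(relation)
            and is_transitive(relation))


elements = {1, 2, 3, 4}
same_parity = {(a, b) for a in elements for b in elements if a % 2 == b % 2}
print(is_equivalence(same_parity, elements))  # True
```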
Correlation- an introduction and application of spearman rank correlation by... - Gunjan Verma
This presentation covers the types of correlation, their uses and limitations, an introduction to Spearman rank correlation, and its application. A worked numerical example is also given.
The document discusses the roles of several major international non-governmental organizations (NGOs) working in the area of environmental protection and conservation. It provides brief overviews of 15 NGOs, including their founding year, type of organization, focus areas, and roles in areas like biodiversity conservation, sustainable development, climate change mitigation, wildlife and habitat protection, and sustainable resource management. The NGOs discussed work to protect the environment through advocacy, research, education, community initiatives, and policy influence on both national and international levels.
Measures Taken to Preserve Fauna And Flora Of Our Country.pptx - NishathAnjum4
The document summarizes the steps taken by the Indian government to protect the country's flora and fauna. These include implementing the Wildlife Protection Act in 1972, establishing 14 biosphere reserves to protect plants and animals in their natural habitats, providing financial assistance to botanical gardens since 1992, and introducing projects like Project Tiger, Project Rhino, and others. The government has also set up 89 national parks, 490 wildlife sanctuaries, and zoological gardens to conserve India's natural heritage.
This document discusses the meaning and types of correlation. It defines correlation as a statistical tool that measures the relationship between two variables. The degree of relationship is measured by the correlation coefficient, which ranges from -1 to 1. A positive correlation means the variables change in the same direction, while a negative correlation means they change in opposite directions. Common methods for studying correlation include scatter plots, Karl Pearson's coefficient, and Spearman's rank correlation coefficient. The coefficient of correlation, denoted by r, measures the strength and direction of the linear relationship between variables.
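For reference, Karl Pearson's coefficient can be computed from paired observations as the covariance divided by the product of the standard deviations. The sketch below uses the equivalent sums-of-deviations form; the function name and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r from paired observations using sums of deviations from the means."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y rises exactly in step with x, then falls exactly in step.
print(pearson_r([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))   # perfect positive correlation
print(pearson_r([1, 2, 3, 4, 5], [10, 8, 6, 4, 2]))   # perfect negative correlation
```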
This document discusses fractional calculus and its applications. It begins with an introduction to fractional calculus, which involves defining derivatives and integrals of arbitrary real or complex order. Naive approaches to defining fractional derivatives are inconsistent. The document then motivates a rigorous definition by generalizing the formula for differentiation to non-integer orders. This generalized formula reduces to the standard formula when the order is a positive integer.
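One common way to generalize differentiation to non-integer orders is the power rule written with the gamma function: the derivative of order a of x^k is Γ(k+1)/Γ(k-a+1) · x^(k-a). Whether this is the exact formula the document uses is an assumption; the sketch below only illustrates how such a formula reduces to the familiar rule for positive integer orders. The function name is hypothetical.

```python
import math


def fractional_power_derivative(k: float, order: float, x: float) -> float:
    """Derivative of the given (possibly fractional) order of x**k at x > 0.

    Uses the gamma-function power rule, assumed here for k > -1.
    """
    arg = k - order + 1
    # Gamma has poles at 0, -1, -2, ...; there the coefficient is taken as 0,
    # which matches ordinary calculus (e.g. the second derivative of x is 0).
    if arg <= 0 and arg == int(arg):
        return 0.0
    return math.gamma(k + 1) / math.gamma(arg) * x ** (k - order)


# Integer order: first derivative of x**2 at x = 3 is 2 * 3.
print(fractional_power_derivative(2, 1, 3.0))
# Half derivative of x at x = 1, which equals 2 / sqrt(pi).
print(fractional_power_derivative(1, 0.5, 1.0))
```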
Econometrics combines economic theory, mathematics, statistics, and economic data to empirically test economic relationships and quantify economic models. It involves stating an economic theory, specifying the mathematical and econometric models, obtaining data, estimating model parameters, testing hypotheses, forecasting, and using models for policy purposes. The econometrician adds a stochastic error term to account for uncertainty from omitted variables, data limitations, intrinsic randomness, and incorrect model specification. Econometrics aims to numerically measure relationships posited by economic theories.
Environmental determinism and possibilism - Amstrongofori
Human-environment relationships involve how people use and are limited by their environment. There are three main aspects of this relationship:
1) Humans depend on the environment for survival.
2) Humans adapt to environmental conditions.
3) Humans modify their environment.
A binomial random variable is the number of successes x in n repeated trials of a binomial experiment. The probability distribution of a binomial random variable is called a binomial distribution. Suppose we flip a coin two times and count the number of heads (successes).
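The two-flip coin example can be worked through with the standard binomial probability formula, P(X = x) = C(n, x) p^x (1 - p)^(n - x). The sketch assumes a fair coin (p = 0.5); the function name is illustrative.

```python
import math


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials with success probability p."""
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)


# Two flips of a fair coin: probability of 0, 1, or 2 heads.
distribution = [binomial_pmf(x, 2, 0.5) for x in range(3)]
print(distribution)  # [0.25, 0.5, 0.25]
```

The three probabilities sum to 1, as they must for a probability distribution over all possible numbers of heads.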
The document discusses Spearman's rank correlation coefficient, a nonparametric measure of statistical dependence between two variables. It assumes values between -1 and 1, with -1 indicating a perfect negative correlation and 1 a perfect positive correlation. The steps involve converting values to ranks, calculating the differences between ranks, and determining if there is a statistically significant correlation based on the test statistic and critical values. An example calculates Spearman's rho using rankings of cricket teams in test and one day international matches.
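The first step, converting values to ranks, can be sketched as follows. Two choices here are assumptions rather than anything stated in the summary: rank 1 goes to the smallest value, and tied values share the average of the ranks they occupy.

```python
def average_ranks(values):
    """Rank values from 1 (smallest) upward; tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' across the run of values tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Average of the 1-based positions pos+1 .. end+1.
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

When ties are present, the simple rank-difference formula for Spearman's coefficient is only approximate, so tied ranks are usually handled with a corrected formula or by applying Pearson's coefficient to the ranks.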
The document discusses correlation analysis and different types of correlation. It defines correlation as the linear association between two random variables. There are three main types of correlation:
1) Positive vs negative vs no correlation based on the relationship between two variables as one increases or decreases.
2) Linear vs non-linear correlation based on the shape of the relationship when plotted on a graph.
3) Simple vs multiple vs partial correlation based on the number of variables.
The document also discusses methods for studying correlation including scatter plots, Karl Pearson's coefficient of correlation r, and Spearman's rank correlation coefficient. It provides interpretations of the correlation coefficient r and coefficient of determination $r^2$.
Zach Mosier is a student who grew up and currently lives in Kentucky. He attends the University of Kentucky where he studies sociology and is a member of the sociology club. He has worked at a hospital and hopes to pursue counseling or sociology after graduation. His strengths include being hardworking, creative, insightful, consistent, and adaptable.
The study aimed to determine if there is a correlation between age and number of personal electronic devices owned. Data was collected through surveys of randomly selected individuals aged 15-70 across different public locations and times. The results showed a significant negative correlation, with older individuals owning fewer devices than younger individuals.
This document provides an overview of a presentation on Ordinary Least Squares (OLS) estimation in econometrics. OLS is introduced as a method used to estimate parameters of economic relationships from data by minimizing errors. Key points covered include: what OLS estimates, why it is used in econometrics to estimate regression parameters from a sample regression function in order to approximate the true population regression function, and details on how the OLS criterion minimizes the sum of squared residuals to obtain parameter estimates. Goodness of fit is also discussed as a measure of how well the estimated regression line fits the sample data.
This document discusses rank correlation and Spearman's rank correlation coefficient. The coefficient was developed in 1904 by the British psychologist Charles Spearman. The rank correlation coefficient (RCC) is applied to a set of ordinal rank numbers from 1 to n. Spearman's rank correlation coefficient formula is provided as $R = 1 - \frac{6\sum D^2}{N(N^2 - 1)}$, where R is the coefficient, D is the difference between the ranks of paired items, and N is the number of items. An example is given to calculate Spearman's coefficient using two sets of ranked data.
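The formula quoted above can be applied directly to two lists of ranks. This is a minimal sketch that assumes the ranks contain no ties; the function name and the example rankings are hypothetical and are not the data used in the document.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's coefficient R = 1 - 6*sum(D^2) / (N*(N^2 - 1)) for untied ranks."""
    n = len(rank_x)
    if n != len(rank_y):
        raise ValueError("both rankings must have the same number of items")
    if n < 2:
        raise ValueError("at least two items are needed")
    # D is the difference between the two ranks of each paired item.
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Hypothetical rankings of five items by two judges.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
print(spearman_from_ranks(judge_a, judge_b))  # 0.8
```

Here the rank differences are -1, 1, -1, 1, 0, so the sum of squared differences is 4 and R = 1 - 24/120 = 0.8, indicating close but not perfect agreement between the two rankings.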
Ordinary least squares linear regression - Elkana Rorio
Ordinary Least Squares Linear Regression is commonly used but often misunderstood and misapplied. It works by minimizing the sum of squared errors between predictions and actual values in the training data to determine coefficients for the linear regression equation. However, it is very sensitive to outliers in the data which can dramatically affect the determined coefficients and reduce prediction accuracy. Alternative regression techniques like least absolute deviations are more robust to outliers but less computationally efficient. Preprocessing data to remove or de-emphasize outliers can help address these issues with Ordinary Least Squares regression.
STATA is data analysis software that can be used via menu options or typed commands. It has a wide range of econometric techniques and can open, examine, and run regressions on datasets. The tutorials on www.STATA.org.uk provide step-by-step guides for using STATA to perform tasks like data management, statistical analysis, importing data, summary statistics, graphs, regressions, and other analyses.
This document provides an overview of report writing. It defines a report as a statement of the results of an investigation or matter where definite information is required. Report writing is an essential skill for professionals in many fields as reports aim to clearly and succinctly inform readers. The document outlines the common structures of reports, including cover letters, titles, executive summaries, introductions, bodies, conclusions, and appendices. It also discusses the process of report writing, including planning, collecting and organizing information, considering the audience, and finishing touches. Reports differ from essays in their objective to present information rather than arguments.
Measures of correlation (pearson's r correlation coefficient and spearman rho)Jyl Matz
This document defines and provides formulas for several measures of correlation:
1) Pearson's product-moment correlation coefficient (Pearson's r) measures the linear relationship between two variables.
2) Spearman's rank correlation coefficient (Spearman's rho) measures the relationship between paired ranks assigned to scores on two variables.
3) An example is provided to demonstrate calculating Spearman's rho between capital and profit for dried fish businessmen.
4) Guidelines are given for interpreting the strength of correlation based on the correlation coefficient value.
- Regression analysis is a statistical technique used to measure the relationship between two quantitative variables and make causal inferences.
- A regression model graphs the relationship between a dependent variable (Y axis) and one or more independent variables (X axis). The goal is to find the linear equation that best fits the data.
- The regression equation takes the form Y = a + bX, where a is the intercept, b is the slope coefficient, and X and Y are the variables. The coefficient b indicates the strength and direction of the relationship.
This document explains how to use Spearman's rank correlation coefficient to determine the strength and significance of the relationship between two variables. It provides steps to calculate the coefficient using birth rate and economic development data from 12 Central and South American countries. These steps are then applied to determine if there is a correlation between life expectancy and economic development in the same countries.
Overviews non-parametric and parametric approaches to (bivariate) linear correlation. See also: http://en.wikiversity.org/wiki/Survey_research_and_design_in_psychology/Lectures/Correlation
This document discusses Spearman's rank correlation coefficient, a non-parametric measure of statistical dependence between two variables. It does not assume a normal distribution like other correlation coefficients. The Spearman coefficient is calculated by ranking the values of each variable separately and then calculating the difference between their ranks, summing the squared differences, and dividing by the number of samples. The document provides an example calculation of the Spearman coefficient between two variables and its interpretation.
The document presents the results of a simple linear regression analysis conducted by a black belt to predict the number of calls answered (dependent variable) based on staffing levels (independent variable) using data collected over 240 samples in a call center. The regression equation found 83.4% of the variation in calls answered was explained by staffing levels. Notable outliers and leverage points were identified that could impact the strength of the predicted relationship between calls answered and staffing.
Pearson Correlation, Spearman Correlation &Linear RegressionAzmi Mohd Tamil
This document discusses correlation and linear regression. It defines correlation as a statistic that measures the strength and direction of the linear relationship between two continuous variables. Positive correlation indicates that as one variable increases, so does the other. Negative correlation means the variables are inversely related. Linear regression can be used to predict a continuous outcome variable based on a continuous predictor variable using the regression equation y=a+bx. The regression line minimizes the sum of squared differences between the data points and the line. The slope coefficient b indicates the strength of the linear prediction and can be tested for significance.
What is a Spearman's Rank Order Correlation (independence)?Ken Plummer
This document provides an overview of Spearman's rank-order correlation test. It explains that Spearman's rho is a non-parametric analogue to the Pearson product-moment correlation coefficient that can be used to measure the strength of association between two ranked variables. It compares Spearman's rho to other correlation tests and notes that it produces identical results to Pearson correlation but can handle ordinal data and situations where variables are skewed or tied ranks.
- Regression analysis is a statistical tool used to examine relationships between variables and can help predict future outcomes. It allows one to assess how the value of a dependent variable changes as the value of an independent variable is varied.
- Simple linear regression involves one independent variable, while multiple regression can include any number of independent variables. Regression analysis outputs include coefficients, residuals, and measures of fit like the R-squared value.
- An example uses home size and price data from 10 houses to generate a linear regression equation predicting that price increases by around $110 for each additional square foot. This model explains 58% of the variation in home prices.
Rank correlation- some features and an application
1. On some interesting features and an
application of rank correlation
Kushal Kr. Dey
Indian Statistical Institute
D.Basu Memorial Award Talk 2011
2. List of contents
1 Historical overview of rank correlation.
2 Some properties of rank correlation.
3 A practical example of rank correlation.
3. Historical Overview—Correlation
In 1886, Sir Francis Galton coined the term correlation, writing:
“length of a human arm is said to be correlated with
that of the leg, because a person with long arm has
usually long legs and conversely.”
Galton wanted a measure of correlation that takes value +1
for perfect correspondence, 0 for independence, and -1 for
perfect inverse correspondence.
5. Historical overview—contd.
Karl Pearson, a student of Galton, worked on his idea and
formulated his “product moments” measure of correlation in
1896.
\[ r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} \tag{1} \]
Spearman observed that for characteristics not quantitatively
measurable, the Pearsonian measure fails to measure the
association. This motivated him to use rank-based methods
for association and develop his rank correlation coefficient in
1904. [“The proof and measurement of association between
two things” by C. Spearman in The American Journal of
Psychology (1904)].
7. Historical overview contd
In 1938, two years after the death of Pearson, Maurice
Kendall, a British scientist, while working on psychological
experiments, came up with a new measure of correlation
popularly known as Kendall’s τ. [“A new measure of rank
correlation”, M. Kendall, Biometrika, (1938)].
The next few years saw extensive research in this area due to
Kendall, Daniels, Hoeffding and others.
In 1954, a modification to Kendall’s coefficient in case of ties
was made by Goodman and Kruskal. [“Measures of
association for cross classifications” Part I, L.A. Goodman and
W.H. Kruskal, J. Amer. Statist. Assoc. (1954)]
9. Daniels’ Generalized correlation coefficient
H.E. Daniels of Cambridge University, a close associate of
Kendall, proposed a measure in 1944 to unify Pearson’s r,
Spearman’s ρ and Kendall’s τ [The relation between
measures of correlation in the universe of sample
permutations, H.E. Daniels, Biometrika, (1944)].
Consider n data points $(X_i, Y_i)$, $i = 1, \dots, n$. For each
pair of X’s, $(X_i, X_j)$, we may allot a score $a_{ij}$ with $a_{ij} = -a_{ji}$ and $a_{ii} = 0$;
similarly, we may allot $b_{ij}$ to the pair $(Y_i, Y_j)$. Then Daniels’
generalized coefficient D is given by
\[ D = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij} b_{ij}}{\left(\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}^{2} \cdot \sum_{i=1}^{n}\sum_{j=1}^{n} b_{ij}^{2}\right)^{1/2}} \tag{2} \]
11. Daniels’ generalized coefficient contd.
Special cases
Put $a_{ij} = X_j - X_i$ and $b_{ij} = Y_j - Y_i$ to get Pearson’s r.
Put $a_{ij} = \mathrm{Rank}(X_j) - \mathrm{Rank}(X_i)$ and
$b_{ij} = \mathrm{Rank}(Y_j) - \mathrm{Rank}(Y_i)$ to get Spearman’s ρ.
Put $a_{ij} = \mathrm{sgn}(X_j - X_i)$ and $b_{ij} = \mathrm{sgn}(Y_j - Y_i)$ to get
Kendall’s τ.
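The three special cases can be checked numerically. The sketch below is only an illustration: the data in `x` and `y` are made up, the helper names (`score_matrix`, `daniels_coefficient`, `ranks`, `sign`) are ours, and the rank helper assumes there are no ties.

```python
import math

def score_matrix(values, transform=lambda d: d):
    """Antisymmetric scores with entry (i, j) = transform(values[j] - values[i])."""
    n = len(values)
    return [[transform(values[j] - values[i]) for j in range(n)] for i in range(n)]

def daniels_coefficient(a, b):
    """Generalized coefficient D for two score matrices of equal size."""
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    num = sum(a[i][j] * b[i][j] for i, j in pairs)
    den = math.sqrt(sum(a[i][j] ** 2 for i, j in pairs) *
                    sum(b[i][j] ** 2 for i, j in pairs))
    return num / den

def sign(d):
    return (d > 0) - (d < 0)

def ranks(values):
    """Ranks 1..n (assumes no ties)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

# Hypothetical data, for illustration only
x = [1.2, 3.4, 2.2, 5.0, 4.1]
y = [2.0, 3.0, 1.0, 6.0, 3.5]

pearson = daniels_coefficient(score_matrix(x), score_matrix(y))
spearman = daniels_coefficient(score_matrix(ranks(x)), score_matrix(ranks(y)))
kendall = daniels_coefficient(score_matrix(x, sign), score_matrix(y, sign))
print(pearson, spearman, kendall)
```

Only the choice of scores changes between the three calls; the formula for D stays the same.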
12. Alternative expression for τ and ρ
First, we define $d_{ij}$ to be +1 when the rank j (j > i) precedes
the rank i in the second ranking and zero otherwise.
We can write Kendall’s τ as the following
\[ \tau = 1 - \frac{4Q}{n(n-1)} \tag{3} \]
where Q is the total score, $Q = \sum_{i<j} d_{ij}$, and n is the total
number of elements in the sample.
Similarly, we can write Spearman’s ρ as the following
\[ \rho = 1 - \frac{12V}{n(n^{2}-1)} \tag{4} \]
where $V = \sum_{i<j} (j-i)\, d_{ij}$ is the sum of inversions weighted
by the numerical difference between the ranks inverted. This
difference is called the weight of inversion.
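A small sketch of the inversion-based expressions follows. It assumes the items are listed in the order of the first ranking, so that `second[i-1]` is the second-ranking rank of the item ranked i first; the function names and the example ranking are ours. The last helper uses the familiar sum of squared rank differences as a cross-check.

```python
def inversion_scores(second):
    """Return (Q, V): the number of inversions and their weighted sum."""
    n = len(second)
    q = v = 0
    for i in range(n):
        for j in range(i + 1, n):
            if second[j] < second[i]:  # rank j precedes rank i, so d_ij = 1
                q += 1
                v += j - i             # weight of this inversion
    return q, v

def tau_from_q(q, n):
    return 1 - 4 * q / (n * (n - 1))

def rho_from_v(v, n):
    return 1 - 12 * v / (n * (n * n - 1))

def rho_from_rank_differences(second):
    """Spearman's rho from the sum of squared rank differences (no ties)."""
    n = len(second)
    d2 = sum((i - s) ** 2 for i, s in enumerate(second, start=1))
    return 1 - 6 * d2 / (n * (n * n - 1))

# Hypothetical second ranking of five items
second = [2, 1, 4, 3, 5]
q, v = inversion_scores(second)
print(q, v, tau_from_q(q, len(second)), rho_from_v(v, len(second)))
```

For rankings without ties, the weighted inversion sum V is half the sum of squared rank differences, which is why (4) agrees with the usual formula.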
14. An interesting result
We simulated large samples from a bivariate normal
distribution and plotted the mean values of
Spearman’s ρ and Kendall’s τ against Pearson’s r. We
obtained the following graph.
15. The graph
[Figure: mean values of Spearman’s ρ and Kendall’s τ plotted against Pearson’s r for simulated bivariate normal samples.]
16. Relation of τ and ρ with r for BVN
In 1907, Pearson, in his book [“On Further Methods of
Determining Correlation”, Karl Pearson, Biometric series IV,
(1907)], established the following relation between
Spearman’s ρ and his r for bivariate normal distribution.
\[ r = 2 \sin\left(\frac{\pi}{6}\rho\right) \tag{5} \]
Cramér, in 1946, also established a relation between Kendall’s
τ and Pearson’s r for bivariate normal.
\[ r = \sin\left(\frac{\pi}{2}\tau\right) \tag{6} \]
However, it is easy to show that the above two relations hold
for any elliptic distribution.
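Relations (5) and (6) can be compared with a simulation. The sketch below is not the simulation used for the talk: the sample size, seed and helper names are our own choices, and it draws one sample per value of r rather than averaging over many. It inverts (5) and (6) to get ρ = (6/π) sin⁻¹(r/2) and τ = (2/π) sin⁻¹ r, then prints these next to the sample values.

```python
import math
import random

def ranks(values):
    """Ranks 1..n (continuous data, so ties are assumed absent)."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def spearman_rho(x, y):
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))

def kendall_tau(x, y):
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[j] - x[i]) * (y[j] - y[i])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2 * s / (n * (n - 1))

def simulate_bvn(r, n, rng):
    """Draw n pairs from BVN(0, 0, 1, 1, r) using two independent normals."""
    xs, ys = [], []
    for _ in range(n):
        v, w = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(w)
        ys.append(r * w + math.sqrt(1 - r * r) * v)
    return xs, ys

def rho_from_r(r):
    return (6 / math.pi) * math.asin(r / 2)  # inverse of relation (5)

def tau_from_r(r):
    return (2 / math.pi) * math.asin(r)      # inverse of relation (6)

rng = random.Random(2011)  # arbitrary seed
for r in (0.0, 0.3, 0.6, 0.9):
    x, y = simulate_bvn(r, 300, rng)
    print(r, round(spearman_rho(x, y), 3), round(rho_from_r(r), 3),
          round(kendall_tau(x, y), 3), round(tau_from_r(r), 3))
```

With a single sample of moderate size, the sample values scatter around the theoretical ones; averaging over many replications, as in the talk, would smooth this out.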
19. Relation between Kendall’s τ and r for bivariate normal
Let $(X_1, Y_1), (X_2, Y_2), \dots, (X_n, Y_n)$ be a sample drawn from
BVN(0,0,1,1,r). Then Kendall’s τ computed from the data is
an unbiased estimator of
\[ 2P((X_1 - X_2)(Y_1 - Y_2) > 0) - 1 = 2P(Z_1 Z_2 > 0) - 1 \tag{7} \]
where $(Z_1, Z_2) \sim$ BVN(0, 0, 2, 2, 2r), that is, both variances are 2 and the covariance is 2r.
Note that $(Z_1, Z_2) \overset{d}{=} \sqrt{2}\,(V\sqrt{1-r^{2}} + Wr,\ W)$ where V and W
are independent standard normal variables. Since $(Z_1, Z_2)$ is symmetric
about (0, 0), the quantity in (7) equals
\[ 4P(Z_1 > 0, Z_2 > 0) - 1 = 4P(V\sqrt{1-r^{2}} + Wr > 0,\ W > 0) - 1 \tag{8} \]
Use polar transformation on (V, W) and evaluate this
expression to get $\frac{2}{\pi}\sin^{-1} r$.
21. Relation between Spearman’s ρ and r for bivariate normal
Now we try to give a sketch of a proof of the relationship
between Pearson’s r and Spearman’s ρ for bivariate normal
distribution.
Let $R(X_i)$ and $R(Y_i)$ be the ranks of $X_i$ and $Y_i$. Define
$H(t) = I_{\{t>0\}}$. Then, observe that
\[ R(X_i) = \sum_{j=1}^{n} H(X_i - X_j) + 1 \tag{9} \]
Note that Spearman’s ρ is the Pearson’s correlation coefficient
between $R(X_i)$ and $R(Y_i)$, which is
\[ \frac{h - \frac{1}{4} n(n-1)^{2}}{\frac{1}{12} n(n^{2}-1)} \]
where $h = \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} H(X_i - X_j) H(Y_i - Y_k)$.
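The expression in terms of h can be checked directly on a small data set. The sketch below uses made-up data without ties; the function names are ours, and the triple sum is written out literally rather than efficiently.

```python
def heaviside(t):
    """H(t) = 1 if t > 0, else 0."""
    return 1 if t > 0 else 0

def rank_via_h(values, i):
    """Rank of values[i] using relation (9)."""
    return sum(heaviside(values[i] - v) for v in values) + 1

def spearman_via_h(x, y):
    """Spearman's rho from the triple sum h (assumes no ties)."""
    n = len(x)
    h = sum(heaviside(x[i] - x[j]) * heaviside(y[i] - y[k])
            for i in range(n) for j in range(n) for k in range(n))
    return (h - n * (n - 1) ** 2 / 4) / (n * (n * n - 1) / 12)

# Hypothetical data, for illustration only
x = [1.2, 3.4, 2.2, 5.0, 4.1]
y = [2.0, 3.0, 1.0, 6.0, 3.5]
print([rank_via_h(x, i) for i in range(len(x))])
print(spearman_via_h(x, y))
```

For these data the ranks differ by (-1, 0, 1, 0, 0), so the usual formula 1 - 6Σd²/(n(n²-1)) gives the same value as the expression in h.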
22. Proof continued
Case 1
If i, j, k are distinct, then $(X_i - X_j, Y_i - Y_k)$ is distributed as
BVN(0, 0, 2, 2, r/2), where the last argument here denotes the correlation
(both variances are 2 and the covariance is r).
$E\{H(X_i - X_j) H(Y_i - Y_k)\}$ will reduce to the integral of the
probability density over the positive quadrant.
We can check, following a similar technique as in the case of τ,
that this integral is $\frac{1}{2}\left(1 - \frac{1}{\pi}\cos^{-1}\frac{r}{2}\right)$.
23. Proof continued
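Case 2
If j = k ≠ i, then $(X_i - X_j, Y_i - Y_k)$ is distributed as
BVN(0, 0, 2, 2, r), with the last argument denoting the correlation, and the above expectation would reduce to
$\frac{1}{2}\left(1 - \frac{1}{\pi}\cos^{-1} r\right)$. (Terms with j = i or k = i vanish because H(0) = 0.) Then,
\[ E\left[\frac{h - \frac{1}{4} n(n-1)^{2}}{\frac{1}{12} n(n^{2}-1)}\right] = \frac{6}{\pi}\left\{\frac{n-2}{n+1}\sin^{-1}\frac{r}{2} + \frac{1}{n+1}\sin^{-1} r\right\} \tag{10} \]
As n goes to infinity, the R.H.S. reduces to $\frac{6}{\pi}\sin^{-1}\frac{r}{2}$.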
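To see how quickly the finite-sample expectation approaches its limit, one can evaluate the right-hand side of (10) for a few sample sizes. This is a minimal sketch: the function names and the value r = 0.5 are chosen only for illustration.

```python
import math

def expected_sample_rho(r, n):
    """Right-hand side of (10) for a bivariate normal with correlation r."""
    return (6 / math.pi) * ((n - 2) / (n + 1) * math.asin(r / 2)
                            + math.asin(r) / (n + 1))

def limiting_rho(r):
    """Limit of (10) as n goes to infinity."""
    return (6 / math.pi) * math.asin(r / 2)

for n in (5, 20, 100, 10_000):
    print(n, round(expected_sample_rho(0.5, n), 4), round(limiting_rho(0.5), 4))
```

The difference between the two columns is of order 1/(n+1), so it is already small for moderate n.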
24. Reason for approximate linear relationship between Spearman’s ρ and Pearson’s r for BVN
As observed from the graph, Spearman’s ρ for bivariate
normal is almost linearly related with Pearson’s r. This may
be attributed to the fact that
\[ \rho = \frac{6}{\pi}\sin^{-1}\frac{r}{2} = \frac{6}{\pi}\left(\frac{r}{2} + \frac{1}{6}\cdot\frac{r^{3}}{8} + \dots\right) = \frac{3}{\pi} r + \text{terms very small compared to the first-order term} \approx \frac{3}{\pi} r \]
For Kendall’s τ, using a similar expansion, we can also show
that τ is a convex function of r in the interval [0,1].
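The size of the neglected terms can be checked numerically. The sketch below compares ρ = (6/π) sin⁻¹(r/2) with the linear term 3r/π on a grid over [0, 1]; the grid spacing and function names are arbitrary choices of ours.

```python
import math

def rho_exact(r):
    return (6 / math.pi) * math.asin(r / 2)

def rho_linear(r):
    return 3 * r / math.pi

grid = [k / 100 for k in range(101)]  # r = 0.00, 0.01, ..., 1.00
max_gap = max(abs(rho_exact(r) - rho_linear(r)) for r in grid)
print(f"largest gap on [0, 1]: {max_gap:.4f}")
```

The gap grows with r and is largest at r = 1, where ρ = 1 and the linear term equals 3/π.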
25. Kendall’s comparative assessment of τ and ρ
Kendall in his paper admitted that ρ can take $\frac{n^{3}-n}{6}$ values
between −1 and +1, whereas τ can take only $\frac{n^{2}-n}{2}$ values in
the range, but according to him, this does not seriously affect
the sensitivity of τ.
Both Kendall’s τ and Spearman’s ρ computed from the
sample have asymptotically normal distributions.
But Kendall showed using simulation experiments that the
distribution for his correlation coefficient is surprisingly close
to normal even for small values of n, which is not the case for
Spearman’s correlation.
26. Bias properties of Kendall’s τ and Spearman’s ρ
Consider a finite population. Let $\rho_{pop}$ and $\tau_{pop}$ be Spearman’s
and Kendall’s rank correlation coefficients computed from the
entire population.
Suppose that we have a simple random sample without
replacement from that population, and we compute
Spearman’s ρ and Kendall’s τ from the sample.
Then the sample τ is an unbiased estimator of $\tau_{pop}$, but the sample ρ is a biased
estimator of $\rho_{pop}$.
If the population size N tends to infinity, the expected value of
the sample Spearman’s ρ goes to $\frac{1}{n+1}\{3\tau_{pop} + (n-2)\rho_{pop}\}$, where n is the
size of the sample.
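The limiting expectation implies a bias of 3(τ − ρ)/(n + 1) for the sample Spearman coefficient, where τ and ρ here are the population values. A minimal sketch follows; the population values passed in are hypothetical and the function names are ours.

```python
def expected_sample_spearman(tau_pop, rho_pop, n):
    """Large-population limit of the expected sample Spearman coefficient."""
    return (3 * tau_pop + (n - 2) * rho_pop) / (n + 1)

def large_population_bias(tau_pop, rho_pop, n):
    """Expected sample value minus the population value of Spearman's rho."""
    return expected_sample_spearman(tau_pop, rho_pop, n) - rho_pop

# Hypothetical population values, for illustration only
for n in (9, 29, 99):
    print(n, large_population_bias(tau_pop=0.40, rho_pop=0.55, n=n))
```

The bias vanishes as n grows, and it is zero whenever the population values of τ and ρ coincide.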
27. Small sample distribution of τ, ρ and r
It is well-known that for a simple random sample of size n
drawn from a bivariate normal distribution, under the
assumption of zero correlation, Pearson’s r satisfies
\[ \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} \sim t_{n-2} \tag{11} \]
But the distribution of r for small samples from normal
distribution with non-zero correlation and from non-normal
distributions, is not tractable.
τ and ρ are distribution free statistics in the sense that their
distributions do not depend on the distribution of the data so
long as X and Y are independent. Consequently, their
distributions under the hypothesis of independence of X and
Y can be tabulated.
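The tabulation can be sketched for Kendall’s τ with small n. The code assumes continuous data (no ties), so that under independence every ordering of the second ranking is equally likely; it enumerates all n! orderings, which is only practical for small n. The function names are ours.

```python
import math
from collections import Counter
from fractions import Fraction
from itertools import permutations

def kendall_tau(second):
    """Kendall's tau between the natural order 1..n and the ranking `second`."""
    n = len(second)
    q = sum(1 for i in range(n) for j in range(i + 1, n) if second[i] > second[j])
    return 1 - Fraction(4 * q, n * (n - 1))

def exact_null_distribution(n):
    """Exact distribution of tau under independence, by full enumeration."""
    counts = Counter(kendall_tau(p) for p in permutations(range(1, n + 1)))
    total = math.factorial(n)
    return {tau: Fraction(c, total) for tau, c in sorted(counts.items())}

for tau, prob in exact_null_distribution(4).items():
    print(f"tau = {str(tau):>5}  P = {prob}")
```

The same enumeration works for Spearman’s ρ by replacing `kendall_tau` with a function of the rank differences.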
28. Asymptotic normality of r , ρ and τ
Note that each of Pearson’s r, Spearman’s ρ and Kendall’s τ
computed from bivariate data is asymptotically normally
distributed.
Asymptotic normality of Pearson’s r can be derived using
Central Limit Theorem applied to various bivariate sample
moments.
Asymptotic normality of Spearman’s ρ follows from
asymptotic normality of linear rank statistics.
Asymptotic normality of Kendall’s τ follows from asymptotic
normality of U-statistics.
30. A practical application of rank correlation
Recently, the Ministry of Human Resource Development
(MHRD) considered giving weightage to the marks scored in
the 10+2 Board exams for admission to engineering colleges
in India.
The raw scores across the Boards are not comparable. So,
they wanted help in this regard from the Indian Statistical
Institute.
The use of percentile ranks of students based on their
aggregate scores was recommended by the Indian Statistical
Institute.
32. The Data
The Indian Statistical Institute was provided data from 4 boards
(namely, ICSE, CBSE, West Bengal Board and
Tamil Nadu Board) for two consecutive years, 2008 and 2009.
Though the recommendation from the Indian Statistical Institute
was to use aggregate scores of a student for computing the
percentile rank of the student (and that recommendation was
favorably accepted by MHRD), a statistically interesting
question is what happens if we consider various subject scores
separately instead of the aggregate score.
We intend to investigate this issue under some appropriate
assumptions.
35. The Model
For convenience, let us consider only two subjects namely
Mathematics and Physics.
Let us denote the observed score of a student in Mathematics
and Physics as $X_M$ and $X_P$. Assume the existence of
unobserved merit variables $W_P$ and $W_M$ such that the scores
in the two subjects are related as
\[ X_M \approx g_M(W_M), \qquad X_P \approx g_P(W_P) \tag{12} \]
$W_M$ and $W_P$ may be treated as attributes of the student
which depend on the knowledge and understanding of Maths
and Physics respectively and also on other factors like
schooling, intelligence etc.
$g_M$ and $g_P$ relate to the examination procedure corresponding
to the two subjects. They may vary across the boards.
38. Formulation of the model
Two students may obtain different scores in Mathematics and
Physics because of the difference in their merit variables $W_M$
and $W_P$ or due to the difference in examination procedure $g_M$
and $g_P$ across the boards.
It is time that we lay down our assumptions about $W_M$, $W_P$,
$g_M$ and $g_P$.
39. Assumptions of the model
Assumption 1
The functions gP and gM are monotonically increasing. This
implies the scores of the students are expected to increase
from less meritorious to more meritorious students for each of
the two subjects.
Assumption 2
The joint distribution of (WP, WM) for the students is the
same in different boards.
40. How Assumptions can be checked
Imagine a common test in Mathematics and Physics taken by
students of all the boards.
Mathematics score in the common test would be a monotone
function of the Mathematics score in the board examination,
as both are monotone functions of the same merit variable.
(The same holds for Physics scores).
This can be tested by using Spearman’s ρ and Kendall’s τ
statistics.
Mathematics and Physics scores in the common test would
have the same distribution in the subpopulations
corresponding to different boards.
This can be tested using any non-parametric test for equality
of bivariate distributions.
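As an illustration of the monotonicity check above, the following is a minimal sketch of Spearman's ρ and Kendall's τ in plain Python. The score lists board_maths and common_maths are hypothetical values made up for this example, not data from the talk, and the τ computed here is the simple tau-a variant (no tie correction).

```python
from itertools import combinations


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Ordinary product-moment correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    s = 0
    for i, j in combinations(range(len(x)), 2):
        prod = (x[i] - x[j]) * (y[i] - y[j])
        s += (prod > 0) - (prod < 0)
    n = len(x)
    return s / (n * (n - 1) / 2)


# Hypothetical scores of seven students (illustration only).
board_maths = [35, 48, 52, 61, 70, 84, 90]   # board examination
common_maths = [20, 31, 30, 45, 58, 66, 79]  # imagined common test

rho = spearman_rho(board_maths, common_maths)
tau = kendall_tau(board_maths, common_maths)
print(f"Spearman rho = {rho:.4f}, Kendall tau = {tau:.4f}")
```

Values of ρ and τ close to 1 would be consistent with the common-test score being a monotone function of the board score; a formal test would compare them with their null distributions.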
41. Is there a way to check the validity of these
assumptions using currently available data?
42. How assumptions can be checked without a
common test
According to Assumption 2, the dependence between merits
in Physics and Mathematics should be similar in all the
boards.
Rank correlation between Physics and Mathematics scores in
a particular board should not depend on the board-specific
monotone functions gM and gP, because monotonically
increasing transformations leave ranks unchanged.
Therefore, the rank correlation between Physics and
Mathematics scores should be the same across the boards.
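The invariance argument can be checked numerically. The sketch below simulates merit variables and applies two different pairs of strictly increasing functions, standing in for two boards. Everything here is an assumption made for illustration: the simulated merits, the particular functions g_m_board_a, g_p_board_a, g_m_board_b and g_p_board_b, and the use of the exact relation X = g(W) in place of the approximate one in (12).

```python
import math
import random


def ranks(values):
    """Ranks 1..n; assumes no ties, which holds for the simulated data."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for pos, idx in enumerate(order, start=1):
        r[idx] = pos
    return r


def spearman_rho(x, y):
    """Spearman's rho via the rank-difference formula (no ties)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d2 / (n * (n * n - 1))


# Simulated, correlated merit variables (hypothetical, illustration only).
rng = random.Random(0)
w_m = [rng.gauss(0, 1) for _ in range(200)]
w_p = [0.7 * m + 0.5 * rng.gauss(0, 1) for m in w_m]


# Two hypothetical boards, each with its own strictly increasing g functions.
def g_m_board_a(w):
    return 50 + 15 * w


def g_p_board_a(w):
    return 100 / (1 + math.exp(-w))


def g_m_board_b(w):
    return math.exp(0.5 * w)


def g_p_board_b(w):
    return w ** 3 + w


rho_merit = spearman_rho(w_m, w_p)
rho_a = spearman_rho([g_m_board_a(w) for w in w_m],
                     [g_p_board_a(w) for w in w_p])
rho_b = spearman_rho([g_m_board_b(w) for w in w_m],
                     [g_p_board_b(w) for w in w_p])

print(rho_merit, rho_a, rho_b)  # the three values coincide
```

Because each g only relabels the scores without changing their order, the rank correlation of the scores equals that of the underlying merits, whichever board's functions are applied.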
43. Rank correlation between Physics & Maths for
different boards and years
44. Rank correlation Physics & Chemistry
Figure: Rank correlation between Physics and Chemistry marks over
years
45. Bar chart of rank correlation: Chemistry & Maths
Figure: Rank correlation between Chemistry and Maths marks over years
46. Subject percentile graph WBHS 2008
47. Variation of a subject across a board same year
48. Inference from the data analysis
Between-board variation is significantly higher than
within-board variation across the two years.
Visibly, there is high correlation in the Tamil Nadu Board,
whereas low correlation is observed in the CBSE Board.
If we interpret the available data as a large sample from a
larger hypothetical population, the rank correlation computed
for a board in a particular year will have an approximately
normal distribution.
So, we can use these rank correlation values to carry out an
ANOVA-type statistical analysis to see whether the values
differ significantly across different boards and across
different years. When this is done, the difference in rank
correlation appears to be significant across different boards.
This essentially implies a breakdown of Assumption 2.
Study of the rank correlation brings out this fact even without
scores from a common test.
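One way such an ANOVA-type analysis could be set up is a two-way layout with boards as rows, years as columns, and one rank correlation per cell. The sketch below is only an assumed form of the analysis, not the computation used in the talk, and the values in rank_corr are hypothetical numbers chosen for illustration.

```python
def two_way_anova(table):
    """Two-way ANOVA without replication.

    table[i][j] is the rank correlation for board i in year j.
    Returns (F_boards, F_years); both use the residual mean square
    with (r - 1) * (c - 1) degrees of freedom as the denominator.
    """
    r, c = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (r * c)
    row_means = [sum(row) / c for row in table]
    col_means = [sum(table[i][j] for i in range(r)) / r for j in range(c)]

    ss_rows = c * sum((m - grand) ** 2 for m in row_means)
    ss_cols = r * sum((m - grand) ** 2 for m in col_means)
    ss_err = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(r)
        for j in range(c)
    )

    ms_err = ss_err / ((r - 1) * (c - 1))
    f_boards = (ss_rows / (r - 1)) / ms_err
    f_years = (ss_cols / (c - 1)) / ms_err
    return f_boards, f_years


# Hypothetical rank correlations: three boards (rows) by two years (columns).
rank_corr = [
    [0.80, 0.82],
    [0.60, 0.61],
    [0.70, 0.73],
]

f_boards, f_years = two_way_anova(rank_corr)
print(f"F for boards = {f_boards:.1f}, F for years = {f_years:.1f}")
```

Each F statistic would then be compared with the critical value of the F distribution for the corresponding degrees of freedom; a large F for boards relative to that value would point to a board effect, in line with the conclusion drawn on this slide.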
53. Acknowledgement
I would like to express my gratitude towards my mentors for this
project, Prof. Probal Chaudhuri and Prof. Debasis Sengupta,
for their immense co-operation. I would also like to thank all those
who have been associated with this work in some way or the other.
54. Thank You