The document discusses using data to detect anomalies, see the real impact of policies, determine performance factors, and monitor outcomes effectively. It provides examples of analyzing exam grade distributions, subject failure rates, performance differences between genders, and birth month correlations with test scores. Overall, the document advocates embracing data to gain insights, find hidden correlations, compare performance, and drive decisions.
3. Richard Quinn
Strategic Management, UCF
“The exam was running at a
grade and a half higher than it
had ever run before... You don’t
see that kind of grade
improvement by chance.”
[Charts: Summer 2010 mid-term, Fall 2010 mid-term]
“A bimodal distribution exists when an
external force is applied to the dataset
that creates a systematic bias.”
15. PERFORMANCE: GIRLS VS BOYS
Subject       Girls higher by   Girls   Boys
Physics                     0     119    119
Chemistry                   1     123    122
English                     4     130    126
Computers                   6     137    131
Biology                     6     129    123
Mathematics                11     123    112
Language                   11     152    141
Accounting                 12     138    126
Commerce                   13     127    114
Economics                  16     142    126
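The "Girls higher by" column is simply the girls' figure minus the boys' figure for each subject. As a minimal sketch, the snippet below recomputes that column from the numbers on the slide and sorts the subjects by gap. The names `MARKS` and `gaps` are illustrative, and treating the figures as per-subject marks for each group is an assumption, since the slide does not say how they were aggregated.

```python
# (girls, boys) figures per subject, copied from the slide's table.
# Assumption: these are comparable per-subject marks for each group.
MARKS = {
    "Physics": (119, 119),
    "Chemistry": (123, 122),
    "English": (130, 126),
    "Computers": (137, 131),
    "Biology": (129, 123),
    "Mathematics": (123, 112),
    "Language": (152, 141),
    "Accounting": (138, 126),
    "Commerce": (127, 114),
    "Economics": (142, 126),
}


def gaps(marks):
    """Return (subject, girls - boys) pairs, smallest gap first."""
    pairs = [(subject, girls - boys) for subject, (girls, boys) in marks.items()]
    # sorted() is stable, so subjects with equal gaps keep the table's order.
    return sorted(pairs, key=lambda pair: pair[1])


for subject, gap in gaps(MARKS):
    print(f"{subject:<12} {gap:>3}")
```

Sorting by the gap is what makes the pattern readable: the subjects where the two groups are level sit at one end and the largest differences at the other.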
16. Based on the results of the 20 lakh students who took the Class XII exams in Tamil Nadu over the last 3 years, it appears that the month you were born in can make a difference of as much as 120 marks out of 1,200.
June-borns score the lowest
The marks shoot up for Aug-borns
… and peak for Sep-borns
120 marks out of 1,200 explainable by month of birth
An identical pattern was observed in 2009 and 2010…
… and across districts, gender, subjects, and class X & XII.
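A comparison like this boils down to grouping students by birth month and averaging their total marks. The sketch below shows one way to do that. The function name `mean_marks_by_birth_month` and the record layout (date of birth, total marks) are assumptions for illustration, and the three sample records are made up purely to show the call; they are not exam data.

```python
from collections import defaultdict
from datetime import date


def mean_marks_by_birth_month(records):
    """Map birth month (1-12) to the mean total marks of students born in it.

    `records` is an iterable of (date_of_birth, total_marks) pairs.
    """
    totals = defaultdict(lambda: [0, 0])  # month -> [sum of marks, student count]
    for dob, marks in records:
        bucket = totals[dob.month]
        bucket[0] += marks
        bucket[1] += 1
    return {month: s / n for month, (s, n) in sorted(totals.items())}


# Made-up records, only to demonstrate the call (NOT real results).
toy_records = [
    (date(1993, 6, 12), 700),
    (date(1993, 6, 30), 740),
    (date(1993, 9, 3), 820),
]

by_month = mean_marks_by_birth_month(toy_records)
# Gap between the best and worst month, the quantity the slide reports.
spread = max(by_month.values()) - min(by_month.values())
print(by_month, spread)
```

Running the same grouping separately per year, district, gender, or subject is how one would check whether the pattern repeats across those slices.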
“It’s simply that in Canada the eligibility
cutoff for age-class hockey is January 1. A
boy who turns ten on January
2, then, could be playing alongside
someone who doesn’t turn ten until the
end of the year—and at that age, in
preadolescence, a twelve-month gap in
age represents an enormous difference in
physical maturity.”
-- Malcolm Gladwell, Outliers
We took the results of the Class XII state board examination in Tamil Nadu, looked at the most popular names (the top 5,000, to be precise) and plotted them based on their marks. The visualisation plots large boxes for the popular names. For example, the big rectangle at the top left represents students named Kumar. The colour of each box indicates the average percentage scored by those students: the darker the blue, the higher the marks; the closer to white, the lower the marks.
There are some fairly interesting patterns here. Names such as Jain, Shah, Agarwal and Gupta tend to score fairly high marks; these are typical north Indian names. Names like Ashwin, Shweta, Sneha, Pooja, Harini, Sanjana, Varshini and Deepti tend to score high marks as well. These are classic urban names, and the vast majority of them are girls' names. Names such as Manigandan, Venkatesan, Ezhumalai, Silambarasan, Pandiyan, Kumaresan and Tirupathi tend to score relatively low marks. These are classic rural names and predominantly male.
This is NOT an indication that names predict marks. Rather, both marks and names are a consequence of the socio-economic and cultural background of students.
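The two quantities behind each box in such a visualisation are a count per name (box size) and an average percentage per name (box colour). A minimal sketch of that aggregation follows. The function `name_summary`, the (name, percentage) record layout, and the sample rows are all hypothetical, and how a single name is extracted from a student's full name is left out because the notes above do not describe it.

```python
from collections import defaultdict


def name_summary(records, top_n=5000):
    """Return (name, count, mean_percentage) for the top_n most common names.

    `records` is an iterable of (name, percentage) pairs.
    The count would drive box size and the mean would drive box colour.
    """
    stats = defaultdict(lambda: [0, 0.0])  # name -> [count, sum of percentages]
    for name, percent in records:
        entry = stats[name]
        entry[0] += 1
        entry[1] += percent
    # Most popular names first, then keep only the top_n.
    ranked = sorted(stats.items(), key=lambda item: item[1][0], reverse=True)
    return [(name, count, total / count) for name, (count, total) in ranked[:top_n]]


# Made-up rows, only to demonstrate the call (NOT real results).
toy_rows = [("Kumar", 60.0), ("Kumar", 70.0), ("Sneha", 90.0)]
summary = name_summary(toy_rows)
print(summary)
```

The output of such a step is descriptive only: it summarises marks by name and says nothing about why the averages differ.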