This presentation was made at the Open Data Camp in Bangalore on 2nd Mar 2013. It explores the question of government data quality by using the rural sanitation scheme as an example.
1. QUALITY OF
GOVERNMENT DATA
ODC, 2013 Amrtha Kasturi Rangan
2. Rural Sanitation
“India is well on track to meeting the MDG on
water coverage, though quality and
sustainability remain key issues. On
sanitation, achieving the MDG will demand
massive investments in facilities and even
more in changing hygiene practices.”
UNICEF
http://www.unicef.org/india/about_unicef_3708.htm
3. What is India doing?
GoI has launched a programme to accelerate rural
sanitation coverage – Nirmal Bharat Abhiyan (earlier the Total
Sanitation Campaign, and before that CRSP)
GoI provides incentives to households to build toilets
More than 27 lakh toilets were constructed in rural
households in 2012-13 (up to November 2012).
Massive online repository of data, shared through an open IMIS
data-sharing platform
4.
5. [Map of India; states labelled: Rj, Pb, J&K, Or, UP, Ch, Jh, Br]
The data shows that almost every state has achieved close
to 100% sanitation coverage
7. Why?
Problems with the way data is presented
Problems with data entry & collection
Implementation issues
8. Problems with the way data is
presented
The way information is presented makes it
difficult to get a quick snapshot of the status
of sanitation
There is no single way to look at the data and
understand what it is saying
9. Problems with data entry & collection
Data entry & Monitoring
Entered at district level – no central mechanism to
monitor data inputs
Data has not changed since the scrapers were set up in Dec
(it is supposed to change every month)
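One way to confirm that monthly data has not changed is to fingerprint each scraped snapshot and compare consecutive months. The sketch below is illustrative only: the function names `fingerprint` and `unchanged_months`, and the shape of the scraped rows (a list of dictionaries per month), are assumptions and not part of the Arghyam scrapers.

```python
import hashlib
import json


def fingerprint(rows):
    """Return a SHA-256 digest of a list of row dicts (hypothetical scraper output)."""
    # Serialise each row with sorted keys, then sort the rows, so that neither
    # key order nor row order affects the digest.
    serialised = sorted(json.dumps(row, sort_keys=True) for row in rows)
    return hashlib.sha256("\n".join(serialised).encode("utf-8")).hexdigest()


def unchanged_months(snapshots):
    """Given {month: rows}, return the months whose data equals the previous month's."""
    months = sorted(snapshots)  # assumes sortable labels such as "2012-12"
    stale = []
    for previous, current in zip(months, months[1:]):
        if fingerprint(snapshots[previous]) == fingerprint(snapshots[current]):
            stale.append(current)
    return stale
```

Months returned by `unchanged_months` are candidates for follow-up, since a figure that is meant to be updated monthly should rarely stay identical.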
11. Problems with data entry & collection
Cannot track progress because data is over-written
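Over-writing can be worked around on the consumer side by appending each scraped snapshot to a dated history file instead of replacing it. This is a minimal sketch under stated assumptions: the CSV layout and the field names `snapshot_date`, `district` and `toilets_built` are hypothetical, not the IMIS schema.

```python
import csv
import os

# Hypothetical column layout for the history file.
FIELDS = ["snapshot_date", "district", "toilets_built"]


def append_snapshot(path, snapshot_date, rows):
    """Append one dated snapshot to the history file instead of replacing it."""
    new_file = not os.path.exists(path)
    with open(path, "a", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if new_file:
            writer.writeheader()
        for row in rows:
            writer.writerow({"snapshot_date": snapshot_date, **row})


def progress(path, district):
    """Return [(snapshot_date, change since the previous snapshot)] for one district."""
    with open(path, newline="") as handle:
        series = sorted(
            (record["snapshot_date"], int(record["toilets_built"]))
            for record in csv.DictReader(handle)
            if record["district"] == district
        )
    return [(date, value - prior) for (_, prior), (date, value) in zip(series, series[1:])]
```

Dates are assumed to be ISO-formatted strings (for example `2013-01-01`) so that sorting them as text gives chronological order.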
12. Problems with data entry & collection
46% cumulative achievement in the category
of people who did not receive incentives from the
government scheme (above poverty line)
There is a high chance that this number is not
accurate
13. Implementation issues
Toilet subsidies may still get disbursed but actual
toilets do not get constructed
Even when good-quality toilets get built, people are
sometimes not aware of their purpose – they use them
to store fodder or as an extra room
All these toilets are counted as constructed –
skewing the data on actually usable & used
toilets
These numbers need to be accurate to build
accountability in the scheme
17. Correlation & monitoring through
data
Not possible to arrive at cost per unit from data
provided
Govt collects various kinds of data for different
departments:
Health(IDSP)
Water Quality
We could correlate these to provide checks
and balances for sanitation scheme reach
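A cross-check of this kind could be sketched as follows: join reported sanitation coverage with a health indicator by district and flag places where high coverage coincides with high disease incidence. Everything here is an assumption for illustration: the function name, the dictionary inputs and both thresholds are hypothetical, and a flagged district only merits a closer look, since disease incidence has many causes besides sanitation.

```python
def flag_for_review(coverage_pct, cases_per_1000, min_coverage=90.0, max_cases=20.0):
    """Return districts reporting high coverage yet high disease incidence.

    coverage_pct and cases_per_1000 map district name -> value.
    The default thresholds are arbitrary placeholders, not official cut-offs.
    """
    flagged = []
    # Only districts present in both datasets can be compared.
    for district in sorted(coverage_pct.keys() & cases_per_1000.keys()):
        high_coverage = coverage_pct[district] >= min_coverage
        high_incidence = cases_per_1000[district] > max_cases
        if high_coverage and high_incidence:
            flagged.append(district)
    return flagged
```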
18. Good things
Earnest attempt to create and publicise
information
Very detailed set of data
Tie up with WSP to create performance
benchmarking (nirmalbharat.org)
New scheme - new start to data
Annual Information Reports shared on the site
More vigorous monitoring mechanisms
New baseline because of the Census
19. Challenges still remain
Data entry even in baseline is not error-free
Over-writing takes away chances of actually
understanding progress over time.
20. Why are we interested?
Arghyam – foundation set up by Rohini
Nilekani in 2005 – Safe, sustainable water for
all
Arghyam is interested in looking at
government spending on sanitation and seeing
if there are measures that can sharpen it
Arghyam is collaborating with Gramener to
understand the sanitation data better & help
actually open up the data
21. What we can do
Provide a platform for people interested in sanitation
data to access and interpret this data easily -
http://arghyam.github.com/arghyam-scrapers/
Create and provide tools for scraping data off sites
similar to the sanitation website
Consolidate these various data sheets into a few
critical sheets & then try to get government to buy
into this format
Identify a few visualisations/ data analyses that
will help govt quickly assess if data is correct
Set up dashboards to correlate data with other data –
Census, IDSP – then see if govt will pick these up as
methods of monitoring sanitation coverage
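A quick check of whether entered data is plausible could start with simple per-record rules, sketched below. The record layout (a dictionary with `target` and `achieved` counts) and the rules themselves are assumptions chosen for illustration; the actual sheets may need different fields and checks.

```python
def validate_record(record):
    """Return a list of problems found in one district record (hypothetical layout)."""
    target = record.get("target")
    achieved = record.get("achieved")
    if target is None or achieved is None:
        return ["missing target or achieved value"]
    problems = []
    if target < 0 or achieved < 0:
        problems.append("negative count")
    if achieved > target:
        # Achievement above target implies coverage over 100%.
        problems.append("achievement exceeds target")
    return problems


def validate_all(records):
    """Map district name -> problems, keeping only districts with at least one problem."""
    report = {}
    for record in records:
        problems = validate_record(record)
        if problems:
            report[record.get("district", "unknown")] = problems
    return report
```

The output of `validate_all` could feed a dashboard that lists only the districts needing a second look.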