Data Literacy Training - case of CA Election 70

This hands-on training was given to journalists in December 2013 under the OpenNepal banner.

Published in: Technology

  1. Hands-on Training: Data – what and how? A case of CA Election 70 (YoungInnovations, OpenNepal)
  2. Data → Story
     ● Find data
     ● Wrangle/clean up the data
     ● Merge data with others (if any)
     ● Filter and sort the data
     ● Analyze data
     ● Visualize data (story)
  3. CA Election 2070
     ● What is data?
       – The candidates (age, gender, party)
       – The constituencies (VDC, ward, party)
       – The results (with votes, winner)
       – …
  4. Where to find it?
     ● http://election.gov.np
     ● FPTP results data, available in XML
  5. Not always lucky finding data
     ● Scraping (requires programming knowledge)
       – Using the Chrome Scraper extension
     ● PDF conversion
     ● Manual transcription from PDF
  6. Chrome Scraper Extension
     ● Search for “Chrome extension Scraper” in the Chrome browser to install it
  7. Scraper in Action
  8. PDF to Text
     ● Online tools available
     ● Linux has a different set of utilities
     ● PDF is still a big nuisance (though something is better than nothing)
  9. PDF to Text
     http://www.election.gov.np/election/uploads/files/ecn_report/constwisecandidatecount.pdf
  10. PDF to Text
      ● Linux utility: pdftotext
  11. CSV
      ● CSV: Comma-Separated Values
      ● Opens in MS Excel, OpenOffice, Google Spreadsheet
      ● Easy to work with
  12. CA XML Data to CSV
  13. XML to CSV?
      ● Online services are available
      ● Might need help from a technologist
      ● In Linux (there might be several ways), e.g.
        xml2 < FPTP-CA70.xml | 2csv FPTP DISTNAME CONST CANDIDATE AGE SEX PARTYNAME SYMBOLNAME TOTALVOTE STATUS > FPTP-CA70.csv
  14. OpenNepal
      ● Repository of datasets
        – data in CSV, XML or JSON format
      ● Request for dataset
      ● Request for help in conversion from one format to another, scraping data, ...
      ● OpenNepal Community (Google Group) is very vibrant
  15. CA Results CSV data
      ● Converted from XML
        http://dev.yipl.com.np/data-training/data/FPTP-CA70.csv
  16. Processing/Cleaning CSV – Basics
      ● Add header
      ● Sorting (by different fields)
      ● Filter
      ● Simple formulas
  17. Add headers
      ● Insert a row at the top
      ● Add a header for each column
  18. Sorting
      ● Sorting by age
        – Ascending, descending
      ● Find the age of the youngest winning candidate
  19. Filtering
      ● Filter the list of winning female candidates
  20. Some exercises
      ● Are there candidates who didn't receive a single vote?
      ● What are the highest and lowest vote counts among candidates who didn't win?
      ● Find the percentage of female and male candidates, and the percentage of winning female candidates
      ● Try the above exercises for one district of your interest
      ● Think of other things you can do with these basic skills
  21. More questions
      ● How many parties have candidates in all 240 constituencies?
      ● How many male and female candidates are there in Nepali Congress?
      ● What is the male-female ratio in the far-west districts?
      ● Which party has the highest number of female candidates?
  22. Data Processing – PivotTable
  23. PivotTable – more
      ● Breakdown of independent candidates
  24. Let's see the numbers again
      ● Sorted by total number of candidates
  25. Visualization
      ● Bar graph of male-female candidates of the top few districts
  26. What other visualizations are possible?
      ● https://github.com/mbostock/d3/wiki/Gallery
  27. What other visualizations are possible?
      ● https://github.com/mbostock/d3/wiki/Gallery
  28. Geocoding
      ● Geo-coding
        – the conversion of a human-readable location name into a numeric (or other machine-processable) location such as a longitude and latitude
        – Kathmandu => [geocoding] => {latitude: 27.70169, longitude: 85.3206}
      ● Online tools available for geocoding
        – Google Fusion Tables
        – CartoDB
  29. Lat-long in maps.google.com
      ● Put the lat-long (27.70169 85.3206) in the Google Maps search box
  30. Services available for geocoding
      http://open.mapquestapi.com/nominatim/v1/search?format=xml&q=Kathmandu,Nepal
  31. Problems with this CSV
      ● District names are in Unicode (not English)
      ● Can't geocode them (currently only English names work)
  32. Adding English district names
      http://dev.yipl.com.np/data-training/data/FPTP-CA70-eng.csv
  33. Google Fusion Tables
      ● tables.googlelabs.com (requires a @gmail account)
  34. Imported data
  35. Geocoding
  36. Using filter in the map
  37. Use of heatmap based on votes
  38. Thank you
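
For readers without the xml2 and 2csv utilities, the XML-to-CSV step from slide 13 can be sketched in Python using only the standard library. This is a minimal sketch, not the method used in the training. It assumes each record is an `<FPTP>` element whose children are named after the field list in the 2csv command (DISTNAME, CONST, and so on); the real layout of FPTP-CA70.xml may differ. The function name `xml_to_csv` and the sample records are made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Field names taken from the 2csv command on slide 13.
FIELDS = ["DISTNAME", "CONST", "CANDIDATE", "AGE", "SEX",
          "PARTYNAME", "SYMBOLNAME", "TOTALVOTE", "STATUS"]


def xml_to_csv(xml_source, csv_target, record_tag="FPTP", fields=FIELDS):
    """Write one CSV row per <record_tag> element and return the row count.

    ASSUMPTION: every field is a child element of the record element.
    Missing fields are written as empty cells.
    """
    writer = csv.writer(csv_target)
    writer.writerow(fields)  # header row, so it need not be added by hand
    count = 0
    for record in ET.parse(xml_source).iter(record_tag):
        writer.writerow([(record.findtext(name) or "").strip()
                         for name in fields])
        count += 1
    return count


# Made-up sample in the ASSUMED shape; these are not real election records.
SAMPLE_XML = """<ROOT>
  <FPTP>
    <DISTNAME>District A</DISTNAME><CONST>1</CONST>
    <CANDIDATE>Candidate One</CANDIDATE><AGE>45</AGE><SEX>M</SEX>
    <PARTYNAME>Party X</PARTYNAME><SYMBOLNAME>Sun</SYMBOLNAME>
    <TOTALVOTE>12000</TOTALVOTE><STATUS>Elected</STATUS>
  </FPTP>
  <FPTP>
    <DISTNAME>District A</DISTNAME><CONST>1</CONST>
    <CANDIDATE>Candidate Two</CANDIDATE><AGE>38</AGE><SEX>F</SEX>
    <PARTYNAME>Party Y</PARTYNAME><SYMBOLNAME>Tree</SYMBOLNAME>
  </FPTP>
</ROOT>"""

if __name__ == "__main__":
    out = io.StringIO()
    xml_to_csv(io.StringIO(SAMPLE_XML), out)
    print(out.getvalue())
```

To convert a real file, pass open file objects instead of the in-memory samples, opening the output with `newline=""` and `encoding="utf-8"` so that Unicode district names survive the conversion.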

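The spreadsheet steps on slides 16 to 24 (sorting, filtering, simple percentages, and a PivotTable-style count) can also be scripted. The sketch below uses only Python's standard library and rests on several assumptions: the column headers match the field names from slide 13, winners are marked with the hypothetical STATUS value "Elected", and female candidates with the hypothetical SEX value "F". The actual codes in FPTP-CA70.csv should be checked before reusing it. The sample rows are invented.

```python
import csv
import io
from collections import Counter, defaultdict

WINNER = "Elected"  # ASSUMPTION: hypothetical marker for a winning candidate
FEMALE = "F"        # ASSUMPTION: hypothetical code for female candidates

# Made-up rows for illustration only; these are not real election results.
SAMPLE_CSV = """DISTNAME,CONST,CANDIDATE,AGE,SEX,PARTYNAME,SYMBOLNAME,TOTALVOTE,STATUS
District A,1,Candidate One,45,M,Party X,Sun,12000,Elected
District A,1,Candidate Two,38,F,Party Y,Tree,9000,
District A,1,Candidate Three,52,M,Independent,Cup,0,
District B,1,Candidate Four,29,F,Party X,Sun,15000,Elected
District B,1,Candidate Five,61,M,Party Y,Tree,14000,
"""


def load_rows(source):
    """Read CSV rows as dicts and convert the numeric columns to int."""
    rows = list(csv.DictReader(source))
    for row in rows:
        row["AGE"] = int(row["AGE"])
        row["TOTALVOTE"] = int(row["TOTALVOTE"])
    return rows


def winners(rows):
    """Filter: keep only winning candidates."""
    return [r for r in rows if r["STATUS"] == WINNER]


def youngest_winner(rows):
    """Sorting by age, reduced to picking the minimum (slide 18)."""
    return min(winners(rows), key=lambda r: r["AGE"])


def winning_women(rows):
    """Filter: winning female candidates (slide 19)."""
    return [r for r in winners(rows) if r["SEX"] == FEMALE]


def zero_vote_candidates(rows):
    """Exercise: candidates who did not receive a single vote."""
    return [r for r in rows if r["TOTALVOTE"] == 0]


def percent_female(rows):
    """Exercise: percentage of candidates who are female."""
    return 100.0 * sum(r["SEX"] == FEMALE for r in rows) / len(rows)


def pivot_counts(rows, row_field="PARTYNAME", col_field="SEX"):
    """PivotTable-style count of candidates per (row_field, col_field)."""
    table = defaultdict(Counter)
    for r in rows:
        table[r[row_field]][r[col_field]] += 1
    return table


if __name__ == "__main__":
    data = load_rows(io.StringIO(SAMPLE_CSV))
    print("Youngest winner:", youngest_winner(data)["CANDIDATE"])
    print("Female candidates (%):", percent_female(data))
    # Parties sorted by total number of candidates, as on slide 24.
    table = pivot_counts(data)
    for party, counts in sorted(table.items(),
                                key=lambda kv: sum(kv[1].values()),
                                reverse=True):
        print(party, dict(counts))
```

Real data may contain blank or non-numeric AGE and TOTALVOTE cells, in which case `load_rows` would need to skip or repair those rows before converting them.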
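
The geocoding request on slide 30 can be built and parsed in a script as well. The following is a sketch under stated assumptions: the base URL is the one shown on slide 30, and the service may have changed or may now require an API key since the 2013 training. The response is assumed to follow the Nominatim XML layout, a `<searchresults>` root containing `<place>` elements with `lat` and `lon` attributes. The helper names and the sample response are made up; the coordinates in the sample are the Kathmandu values from slide 28.

```python
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

# Endpoint as shown on slide 30; ASSUMPTION: it may no longer be available.
BASE_URL = "http://open.mapquestapi.com/nominatim/v1/search"


def build_geocode_url(place, base_url=BASE_URL):
    """Return the search URL for a place name such as 'Kathmandu,Nepal'."""
    # safe="," keeps the comma readable, matching the URL on the slide.
    query = urllib.parse.urlencode({"format": "xml", "q": place}, safe=",")
    return base_url + "?" + query


def parse_first_match(xml_text):
    """Return (latitude, longitude) of the first <place>, or None."""
    place = ET.fromstring(xml_text).find("place")
    if place is None:
        return None
    return float(place.get("lat")), float(place.get("lon"))


def geocode(place):
    """Query the service over the network (not exercised in this sketch)."""
    with urllib.request.urlopen(build_geocode_url(place), timeout=10) as resp:
        return parse_first_match(resp.read())


# Made-up response in the ASSUMED Nominatim XML shape.
SAMPLE_RESPONSE = """<searchresults>
  <place lat="27.70169" lon="85.3206" display_name="Kathmandu, Nepal"/>
</searchresults>"""

if __name__ == "__main__":
    print(build_geocode_url("Kathmandu,Nepal"))
    print(parse_first_match(SAMPLE_RESPONSE))
```

Looping `geocode` over the English district names from FPTP-CA70-eng.csv would add coordinates to each row. Public geocoding services usually limit request rates, so a short pause between calls is advisable.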