Google Public Data Explorer


  1. Google Public Data Explorer
     Aftab Iqbal
     Digital Enterprise Research Institute, www.deri.ie
     Stefan.Decker@deri.org, http://www.StefanDecker.org/
     Copyright 2010 Digital Enterprise Research Institute. All rights reserved.
  2. Introduction
     • DSPL (Dataset Publishing Language) consists of:
       - XML
       - CSV files
  3. DSPL Dataset
     • General information
       - About the dataset
     • Concepts
       - Definitions of "things" that appear in the dataset (e.g., counties, unemployment rate, gender, etc.)
     • Slices
       - Combinations of concepts for which there are data
     • Tables
       - Data for concepts and slices. Concept tables hold enumerations and slice tables hold statistical data
     • Topics
       - Organize the concepts of the dataset in a meaningful hierarchy through labeling
  4. School Enrollment 2009_2010 *

     School_Roll_No  Short_Name      Level    Male  Female
     00697S          ST BRIDGIDS NS  Primary  377   447
     01170G          NAUL NS         Primary  40    61
     09492W          BALSCADDEN NS   Primary  98    133
     …               …               …        …     …

     * Snapshot taken from http://data.fingal.ie/ViewDataSets/Details/default.aspx?datasetID=385
  5. DSPL (contd.): General Information
     • General information about the provider of the dataset

       <info>
         <name>
           <value>School</value>
         </name>
         <description>
           <value>Statistics about Fingal County Schools</value>
         </description>
         <url>
           <value></value>
         </url>
       </info>
       <provider>
         <name>
           <value>County Fingal School Enrollment Statistics</value>
         </name>
         <url>
           <value>http://data.fingal.ie/ViewDataSets/Details/default.aspx?datasetID=385</value>
         </url>
       </provider>
  6. DSPL (contd.): Concepts
     • Type of data that appears in a dataset

       <concept id="Schools" extends="geo:location">
         <info>
           <name>
             <value>Schools</value>
           </name>
           <description>
             <value>List of schools for Co. Fingal</value>
           </description>
         </info>
         <type ref="string"/>
         <table ref="schools_table"/>
       </concept>

       <table id="schools_table">
         <column id="School" type="string"/>
         <column id="School_Roll_No" type="string"/>
         <column id="latitude" type="float"/>
         <column id="longitude" type="float"/>
         <data>
           <file format="csv" encoding="utf-8">schools.csv</file>
         </data>
       </table>

       school  name                                  latitude  longitude
       00697S  Saint Bridgids National School        53.37514  -6.36221
       01170G  S N Na H Aille Naul National School   53.57887  -6.28564
       09492W  Balscadden National School            53.61528  -6.23218
       09642P  Burrow National School                53.39129  -6.10028
       …       …                                     …         …
  7. DSPL (contd.): Slices
     • A slice is a combination of concepts for which data exists
     • A slice contains two kinds of concept references: dimensions and metrics

       <slice id="enrolment_slice">
         <dimension concept="school"/>
         <dimension concept="time:year"/>
         <metric concept="M"/>
         <metric concept="F"/>
         <table ref="enrolment_slice_table"/>
       </slice>

       <table id="enrolment_slice_table">
         <column id="school" type="string"/>
         <column id="M" type="integer"/>
         <column id="F" type="integer"/>
         <column id="year" type="date" format="yyyy"/>
         <data>
           <file format="csv" encoding="utf-8">school_enrolment_slice.csv</file>
         </data>
       </table>
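
The relationship between the slice and its table on slide 7 can be checked mechanically. The following is a minimal sketch, not part of the original deck: it assumes that each dimension or metric is matched to a table column by the local part of its concept id (so `time:year` is matched to the column `year`), and it wraps the two elements in a hypothetical `<root>` element purely so that they parse as one document. The helper names `local_id` and `check_slice` are illustrative.

```python
import xml.etree.ElementTree as ET

# Slice and table copied from slide 7. The <root> wrapper is a hypothetical
# container added only so that the two elements form one parseable document.
FRAGMENT = """
<root>
  <slice id="enrolment_slice">
    <dimension concept="school"/>
    <dimension concept="time:year"/>
    <metric concept="M"/>
    <metric concept="F"/>
    <table ref="enrolment_slice_table"/>
  </slice>
  <table id="enrolment_slice_table">
    <column id="school" type="string"/>
    <column id="M" type="integer"/>
    <column id="F" type="integer"/>
    <column id="year" type="date" format="yyyy"/>
    <data>
      <file format="csv" encoding="utf-8">school_enrolment_slice.csv</file>
    </data>
  </table>
</root>
"""


def local_id(concept_ref):
    """Drop an import prefix such as 'time:' (assumed column-matching rule)."""
    return concept_ref.split(":")[-1]


def check_slice(root, slice_id):
    """Return the concepts of a slice that have no matching table column."""
    slice_el = root.find(f"slice[@id='{slice_id}']")
    table_ref = slice_el.find("table").get("ref")
    table_el = root.find(f"table[@id='{table_ref}']")
    columns = {col.get("id") for col in table_el.findall("column")}
    concepts = [
        el.get("concept")
        for el in slice_el
        if el.tag in ("dimension", "metric")
    ]
    return [c for c in concepts if local_id(c) not in columns]


root = ET.fromstring(FRAGMENT)
missing = check_slice(root, "enrolment_slice")
print("concepts without a column:", missing)
```

An empty result means every dimension and metric referenced by the slice has a column in the referenced table.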
  8. School Enrollment Slice
     Dimensions: School, Year. Metrics: Male, Female.

     School                          Male  Female  Year
     Saint Bridgids National School  377   447     2009
     Saint Bridgids National School  475   392     2010
     Balscadden National School      98    133     2009
     Balscadden National School      126   102     2010
     …                               …     …       …
  9. DSPL (contd.): Topics
     • Topics classify concepts hierarchically and are used by applications to help users navigate to your data.

       <topic id="Male_indicators">
         <info>
           <name><value>Male Students Enrollment</value></name>
         </info>
       </topic>
       <topic id="Female_indicators">
         <info>
           <name><value>Female Students Enrollment</value></name>
         </info>
       </topic>
  10. Data Cleansing

      School Enrollment 2009
      School_Roll_No  Short_Name      Level    Male  Female
      00697S          ST BRIDGIDS NS  Primary  377   447
      01170G          NAUL NS         Primary  40    61
      …               …               …        …     …

      School Enrollment 2010
      School_Roll_No  Short_Name      Level    Male  Female
      00697S          ST BRIDGIDS NS  Primary  475   392
      01170G          NAUL NS         Primary  58    40
      …               …               …        …     …

      School_Enrollment_Slice.csv
      School  Male  Female  Year
      00697S  377   447     2009
      00697S  475   392     2010
      01170G  40    61      2009
      01170G  58    40      2010
      …       …     …       …

      Schools.csv
      School  Name                                 Latitude  Longitude
      00697S  Saint Bridgids National School       53.37514  -6.36221
      01170G  S N Na H Aille Naul National School  53.57887  -6.28564
      …       …                                    …         …
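
The cleansing step on slide 10 amounts to stacking the yearly enrollment tables into one long table keyed by school and year. The sketch below is one possible way to do that with the Python standard library; it is not taken from the deck. The two input lists contain only the rows visible on the slide, and the function names `build_slice_rows` and `to_csv` are hypothetical. The output header follows the `School_Enrollment_Slice.csv` table as shown on the slide.

```python
import csv
import io

# Rows visible on slide 10; the real source files contain more schools.
ENROLLMENT_2009 = [
    {"School_Roll_No": "00697S", "Short_Name": "ST BRIDGIDS NS",
     "Level": "Primary", "Male": 377, "Female": 447},
    {"School_Roll_No": "01170G", "Short_Name": "NAUL NS",
     "Level": "Primary", "Male": 40, "Female": 61},
]
ENROLLMENT_2010 = [
    {"School_Roll_No": "00697S", "Short_Name": "ST BRIDGIDS NS",
     "Level": "Primary", "Male": 475, "Female": 392},
    {"School_Roll_No": "01170G", "Short_Name": "NAUL NS",
     "Level": "Primary", "Male": 58, "Female": 40},
]

SLICE_FIELDS = ["School", "Male", "Female", "Year"]


def build_slice_rows(tables_by_year):
    """Stack yearly tables into slice rows, one row per (school, year)."""
    rows = []
    for year, table in tables_by_year.items():
        for record in table:
            rows.append({
                "School": record["School_Roll_No"],  # roll number is the key
                "Male": record["Male"],
                "Female": record["Female"],
                "Year": year,
            })
    # Sort by school, then year, matching the row order shown on the slide.
    rows.sort(key=lambda row: (row["School"], row["Year"]))
    return rows


def to_csv(rows):
    """Serialize slice rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=SLICE_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


slice_rows = build_slice_rows({2009: ENROLLMENT_2009, 2010: ENROLLMENT_2010})
slice_csv = to_csv(slice_rows)
print(slice_csv)
```

The descriptive columns (`Short_Name`, `Level`) are dropped here because the slice table on the slide carries only the school key, the two counts, and the year; school names and coordinates live in `Schools.csv`.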
  11. Deployment

      <slice id="enrolment_slice">
        <dimension concept="school"/>
        <dimension concept="time:year"/>
        <metric concept="Male"/>
        <metric concept="Female"/>
        <table ref="enrolment_slice_table"/>
      </slice>

      <table id="enrolment_slice_table">
        <column id="school" type="string"/>
        <column id="Male" type="integer"/>
        <column id="Female" type="integer"/>
        <column id="year" type="date" format="yyyy"/>
        <data>
          <file format="csv" encoding="utf-8">School_Enrollment_Slice.csv</file>
        </data>
      </table>

      • Compressed: CSV files + metadata
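
The deployment step on slide 11 bundles the CSV files and the XML metadata into one compressed file. A minimal sketch of that packaging step with Python's `zipfile` module follows. The file name `dataset.xml`, the archive name `school_enrollment.zip`, and the function `bundle_dataset` are assumptions made for illustration, and the file contents written here are placeholders rather than the real dataset.

```python
import pathlib
import tempfile
import zipfile


def bundle_dataset(source_dir, archive_path):
    """Zip every .xml and .csv file in source_dir into a flat archive."""
    source_dir = pathlib.Path(source_dir)
    members = sorted(
        path for path in source_dir.iterdir()
        if path.suffix.lower() in {".xml", ".csv"}
    )
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in members:
            # arcname keeps the archive flat (no directory prefix).
            archive.write(path, arcname=path.name)
    return [path.name for path in members]


# Demonstration in a temporary directory with placeholder file contents.
with tempfile.TemporaryDirectory() as tmp:
    work = pathlib.Path(tmp)
    # Hypothetical metadata file name; the deck does not name the XML file.
    (work / "dataset.xml").write_text("<dspl/>", encoding="utf-8")
    (work / "Schools.csv").write_text(
        "School,Name,Latitude,Longitude\n", encoding="utf-8")
    (work / "School_Enrollment_Slice.csv").write_text(
        "School,Male,Female,Year\n", encoding="utf-8")

    archive_path = work / "school_enrollment.zip"
    bundled = bundle_dataset(work, archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        archived_names = sorted(archive.namelist())

print(archived_names)
```

Keeping the archive flat means the file names referenced in the `<file>` elements of the metadata resolve directly against the archive contents.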
