Maintaining scholarly standards in the digital age: Publishing historical gazetteers and statistics as Linked Open Data

This presentation: (1) Discusses why providing detailed attributions of individual contributions is essential to large-scale sharing of historical research data; (2) Provides a short introduction to Linked Open Data; (3) Introduces the PastPlace Gazetteer API (Application Programming Interface), explaining components of the RDF it generates using the example of Oxford, UK; (4) Notes that most open data projects use the Creative Commons "must acknowledge" license (CC-BY) while not actually acknowledging contributors within their RDF, then shows how we do it; (5) Introduces the separate PastPlace Datafeed API, which implements the W3C Data Cube Vocabulary.

Published in: Technology
Maintaining scholarly standards in the digital age: Publishing historical gazetteers and statistics as Linked Open Data

  1. Maintaining scholarly standards in the digital age: Publishing historical gazetteers and statistics as Linked Open Data
     Humphrey Southall & Michael Stoner (University of Portsmouth, UK)
  2. Structure of presentation
     • Why attribution really matters
     • Introduction to Linked Open Data
     • Case study 1: PastPlace gazetteer API
     • Adding acknowledgments to PastPlace API
     • Case study 2: VoB statistical datafeed API
       – Method for attaching acknowledgments to PastPlace directly applicable to datafeed
  3. Sorry: nothing about boundaries today
     • Not sure why I included boundary data in title
     • No work done to expose DBDs as linked data
       – Whereas we have Linked Data APIs to gazetteer and statistics running today
       – See Glen Hart and Catherine Dolbear, Linked Data: A Geographic Perspective (CRC Press, 2013)
     • We would have a major problem exposing DBDs as Linked Open Data, as we depend on licensing income from them to fund staff
       – In process of licensing UK DBDs to US universities
  4. Three conflicting pressures …
     Traditional scholarship:
     • Cares deeply about citations and acknowledgment
     • Most historians
     • Often anally retentive
     • Uncommercial
     Data Liberation:
     • Data wants to be free
     • Computer scientists
     • Often indifferent to data quality
     • Uncommercial
     Need to pay staff:
     • Everyone thinks we have core funding but we don’t
     • University commercial office
     • What level won’t we sink to?
     • Commercial
     Great Britain Historical GIS Team
  5. Three conflicting pressures …
     Traditional scholarship:
     • Cares deeply about citations and acknowledgment
     • Most historians
     • Often anally retentive
     • Uncommercial
     Data Liberation:
     • Data wants to be free
     • Computer scientists
     • Often indifferent to data quality
     • Uncommercial
     Need to pay staff:
     • Everyone thinks we have core funding but we don’t
     • University commercial office
     • What level won’t we sink to?
     • Commercial
     Great Britain Historical GIS Team
     Today’s presentation is mostly about reconciling the conflict between traditional scholarship and data liberation
  6. But we need data as well as money
     • Almost all our digital boundaries created by us
     • But much of our statistical content originally computerized by other researchers
       – And almost all content for the PastPlace global gazetteer comes from others
     • Arguably, gathering in data is the only way to build a global historical resource going back more than 100 years
  7. Why people give us data
     • They want to see it mapped
       – Which is maybe another reason for not completely opening up boundary data …
     • They want to analyze it jointly with other data we hold
     • They want it to go to a good home
       – Powerful motivator for academics nearing or past retirement
     • But they need it to be still their data
       – Real issue is attributions, not them retaining copyright
  8. Linked Open Data Definitions
     • Open Data: Data is open if anyone is free to access, use, modify, and share it, subject, at most, to measures that preserve provenance and openness.
     • Linked Data: The technical term used to describe the best practice of exposing, sharing and connecting items of data on the semantic web using unique resource identifiers (URIs) and resource description framework (RDF). Not to be confused with data linking.
     • RDF: a W3C standard … the foundation of several technologies for modelling distributed knowledge and is meant to be used as the basis of the Semantic Web
       – All definitions from data.gov.uk
  9. “Openness” Five Star Rating
     Star rating | Definition                                     | Example | No. of datasets
     (none)      | Unavailable or not openly licensed             |         | 210
     ★           | Unstructured data                              | PDF     | 7909
     ★★          | Structured data but proprietary format         | Excel   | 1090
     ★★★         | Structured data in open format                 | CSV     | 412
     ★★★★★       | Linked Data: data URIs and linked to other data | RDF     | 210
     • From data.gov.uk – see https://data.gov.uk/data/search
  10. Data.gov.uk home page
  11. Data.gov.uk home page
  12. Linked open data cloud as of August 2014
  13. Linked open data cloud as of August 2014
  14. Gazetteers as Linked Open Data (1)
     • It would be both unrealistic and undesirable for academics to try to establish a quite new gazetteer for general historical use
       – Existing digital gazetteers well established, and their size would take too long to match
       – And we want to connect the past to the present
     • Rather, we should add historical place name attestations to an existing “modern” gazetteer
  15. Gazetteers as Linked Open Data (2)
     • Both central hubs of LOD cloud are gazetteers
     • GeoNames is very large global gazetteer
       – Assembled from diverse sources but largest are USGS and NGA gazetteers
       – GeoNames under personal control of Marc Wick
     • DBpedia is Linked Data version of Wikipedia
       – Wikipedia arguably most widely used gazetteer
       – Includes over 2 million geographically-located entities
       – DBpedia project not part of Wikimedia Foundation
       – Uses same textual identifiers as Wikipedia, and there is a different DBpedia for each language edition of Wikipedia
  16. Introducing PastPlace
     • Based not on DBpedia or GeoNames but on Wikidata
       – Project of Wikimedia Foundation
       – Provides core language-neutral spine to Wikipedia, using a single set of numerical IDs
       – Not under control of single individual
       – Would a newer LoD cloud figure show it?
     • PastPlace also includes GeoNames identifiers, as Wikidata project had already done this
     • Presentation based mainly on example of “Oxford”
  17. Oxford in Wikipedia
  18. Oxford search results in Wikidata
     • Very visibly not a gazetteer search interface
  19. Oxford city page in Wikidata
  20. Oxford as Wikidata RDF
         @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
         @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
         @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
         @prefix owl: <http://www.w3.org/2002/07/owl#> .
         @prefix wikibase: <http://wikiba.se/ontology-beta#> .
         @prefix wdata: <https://www.wikidata.org/wiki/Special:EntityData/> .
         @prefix wd: <http://www.wikidata.org/entity/> .
         @prefix wds: <http://www.wikidata.org/entity/statement/> .
         @prefix wdref: <http://www.wikidata.org/reference/> .
         [Big chunk of header cut out]
         wdata:Q34217 a schema:Dataset ;
             schema:about wd:Q34217 ;
             cc:license <http://creativecommons.org/publicdomain/zero/1.0/> ;
             schema:softwareVersion "0.0.1" ;
             schema:version "269756320"^^xsd:integer ;
             schema:dateModified "2015-11-07T21:03:27Z"^^xsd:dateTime .
         wd:Q34217 a wikibase:Item ;
             rdfs:label "Oxford"@lb ;
             skos:prefLabel "Oxford"@lb ;
             schema:name "Oxford"@lb ;
             rdfs:label "牛津"@zh ;
             skos:prefLabel "牛津"@zh ;
             schema:name "牛津"@zh ;
             rdfs:label "Oxford"@jv ;
     • To get this machine readable version, go to https://www.wikidata.org/wiki/Special:EntityData/Q34217.ttl rather than https://www.wikidata.org/entity/Q34217
     • This particular way of organizing RDF is N3, or Turtle
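
The URL pattern on slide 20 can be sketched as a small helper. This is only an illustration: the base URL and the `.ttl` suffix come from the slide, while the function name `entity_data_url` and its ID check are assumptions made for this example.

```python
from urllib.parse import quote

# Base URL as given on the slide; everything else here is illustrative.
ENTITY_DATA_BASE = "https://www.wikidata.org/wiki/Special:EntityData/"


def entity_data_url(entity_id: str, fmt: str = "ttl") -> str:
    """Build the machine-readable URL for a Wikidata item such as Q34217 (Oxford)."""
    # Item IDs are "Q" followed by digits; reject anything else early.
    if not (entity_id.startswith("Q") and entity_id[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {entity_id!r}")
    return f"{ENTITY_DATA_BASE}{quote(entity_id)}.{quote(fmt)}"


if __name__ == "__main__":
    print(entity_data_url("Q34217"))
```

Fetching the resulting URL returns the Turtle serialization excerpted on the slide, whereas the plain entity URL is aimed at browsers.
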
  21. Oxford in PastPlace: header
     • We reference so many namespaces because we are trying to invent as little as possible
     • Instead, we reference existing authorities – the basic linked data approach, and good scholarship
         # baseURI: http://auo.gbhgis.geog.port.ac.uk/uri/#
         @base <http://auo.gbhgis.geog.port.ac.uk/uri/#> .
         @prefix obs: <http://gbhgis.geog.port.ac.uk/> .
         @prefix dcam: <http://purl.org/dc/dcam/> .
         @prefix year: <http://gbhgis.geog.port.ac.uk/> .
         @prefix owl: <http://www.w3.org/2002/07/owl#> .
         @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
         @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
         @prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
         @prefix ref-auo: <http://gbhgis.geog.port.ac.uk/> .
         @prefix ref-indicator: <http://gbhgis.geog.port.ac.uk/> .
         @prefix dcterms: <http://purl.org/dc/terms/> .
         @prefix value: <http://gbhgis.geog.port.ac.uk/> .
         @prefix dataSet: <http://purl.org/linked-data/cube#> .
         @prefix ref-aou: <http://gbhgis.geog.port.ac.uk/> .
         @prefix hgisMeaningDDI: <http://gbhgis.geog.port.ac.uk/> .
         @prefix foaf: <http://xmlns.com/foaf/0.1/> .
         @prefix cc: <http://creativecommons.org/ns#> .
         @prefix refTime: <http://purl.org/linked-data/sdmx/2009/dimension#> .
         @prefix fabio: <http://purl.org/spar/fabio/> .
         @prefix hgisArea: <http://gbhgis.geog.port.ac.uk/> .
         @prefix unitMeasure: <http://purl.org/linked-data/sdmx/2009/attribute#> .
         @prefix lawd: <http://lawd.info/ontology/> .
         @prefix DataStructureDefinition: <http://purl.org/linked-data/cube#> .
         @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
         @prefix marcrel: <http://www.loc.gov/loc.terms/relators/> .
         @prefix Observation: <http://obs.gbhgis.geog.port.ac.uk/uri/#> .
         @prefix auo: <http://auo.gbhgis.geog.port.ac.uk/uri/#> .
         @prefix ref-period: <http://gbhgis.geog.port.ac.uk/> .
         @prefix dc: <http://purl.org/dc/elements/1.1/> .
  22. Oxford in PastPlace: defining and locating entity
     • seeAlso and sameAs link our entity to other linked data entities
     • Nearby relationships support our own public interface
     • Coordinates are a small part of the whole!
         auo:PastP_Place_OXFORD_34217
             a auo:PastP_Place ;
             rdfs:label "OXFORD"^^xsd:string ;
             rdfs:seeAlso <http://www.wikidata.org/entity/Q34217> ,
                 <http://www.visionofbritain.org.uk/place/887> ,
                 <http://www.workhouses.org.uk/Oxford> ,
                 <http://dbpedia.org/resource/Oxford> ;
             auo:hasNearbyPlace auo:PastP_Nearby_940669 , auo:PastP_Nearby_2094727 ,
                 auo:PastP_Nearby_3090863 , auo:PastP_Nearby_1997905 ,
                 auo:PastP_Nearby_1953777 , auo:PastP_Nearby_2891392 ,
                 auo:PastP_Nearby_5468919 , auo:PastP_Nearby_3399484 ,
                 auo:PastP_Nearby_2650730 , auo:PastP_Nearby_2584506 ;
             auo:hasPlaceId "34217"^^xsd:int ;
             auo:isInContainer auo:PastP_Place_THE_UNITED_KINGDOM ;
             owl:sameAs <http://dbpedia.org/resource/Oxford> ,
                 <http://sws.geonames.org//about.rdf> ;
             geo:lat "51.751944444444"^^xsd:double ;
             geo:lon "-1.2577777777778"^^xsd:double .
  23. Oxford in PastPlace: adding names
     • Some names are simply from Wikidata, with 2014 as attestation date
     • Others come from our historical sources, including statistical reports
         auo:PastP_Name_Oxford_eng_2014
             rdfs:label "Oxford"^^xsd:string ;
             dcterms:date "2014"^^xsd:string ;
             dcterms:language "eng"^^xsd:string ;
             dcterms:source auo:Wikipedia .
         auo:PastP_Name_オクスフォード_jpn_2014
             rdfs:label "オクスフォード"^^xsd:string ;
             dcterms:date "2014"^^xsd:string ;
             dcterms:language "jpn"^^xsd:string ;
             dcterms:source auo:Wikipedia .
         auo:PastP_Name_THE_UNIVERSITY_OF_OXFORD_eng_1851
             rdfs:label "THE UNIVERSITY OF OXFORD"^^xsd:string ;
             dcterms:date "1851"^^xsd:string ;
             dcterms:language "eng"^^xsd:string ;
             dcterms:source <#SRC_GB1851POP2_M[1]> .
         auo:PastP_Name_ALDATES_ST_eng_1831
             rdfs:label "ALDATES ST"^^xsd:string ;
             dcterms:date "1831"^^xsd:string ;
             dcterms:language "eng"^^xsd:string ;
             dcterms:source <#SRC_GB1831ABS_M[1]> , marcrel:GATLEYD_par_1831 .
         auo:PastP_Name_Oxfford_eng_1699
             rdfs:label "Oxfford"^^xsd:string ;
             dcterms:date "1695-1702"^^xsd:string ;
             dcterms:language "eng"^^xsd:string ;
             dcterms:source auo:Fiennes .
         auo:PastP_Name_Oxenford_eng_1610
             rdfs:label "Oxenford"^^xsd:string ;
             dcterms:date "1610"^^xsd:string ;
             dcterms:language "eng"^^xsd:string ;
             dcterms:source auo:Camden , marcrel:SUTTOND_camden .
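
One way a client might work with the name attestations on slide 23 is to order them by date. The sketch below is not part of the PastPlace API: the `Attestation` class and `chronological` helper are invented for illustration, and it assumes a `dcterms:date` value is either a single year or a "from-to" range, as in the slide's examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    """One attested place name, mirroring the properties shown on the slide."""
    label: str      # rdfs:label
    date: str       # dcterms:date: a year ("1610") or a range ("1695-1702")
    language: str   # dcterms:language
    source: str     # dcterms:source

    def first_year(self) -> int:
        # For a range such as "1695-1702", sort on the opening year.
        return int(self.date.split("-")[0])


# Values copied from the slide's Turtle for Oxford.
ATTESTATIONS = [
    Attestation("Oxford", "2014", "eng", "auo:Wikipedia"),
    Attestation("Oxfford", "1695-1702", "eng", "auo:Fiennes"),
    Attestation("Oxenford", "1610", "eng", "auo:Camden"),
]


def chronological(names):
    """Return attestations ordered from earliest to latest."""
    return sorted(names, key=Attestation.first_year)


if __name__ == "__main__":
    for name in chronological(ATTESTATIONS):
        print(name.date, name.label, name.source)
```

Sorting on the attestation date rather than the publication date matters for sources like Fiennes, whose text dates from 1695-1702 but was published much later.
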
  24. Oxford in PastPlace: identifying sources
     • Mostly conventional bibliographic references using the standard Dublin Core metadata elements
     • Attestation dates sometimes very different from publication dates, as with Fiennes
         auo:Wikipedia dcterms:creator "Wikimedia Foundation Inc." ;
             dcterms:identifier "http://www.wikidata.org/entity/Q328" ;
             dcterms:issued "2001/2005" ;
             dcterms:publisher "Wikimedia Foundation Inc." ;
             dcterms:title "Wikipedia" ;
             fabio:hasRepresentation "http://en.wikipedia.org/wiki/Main_Page" .
         <#SRC_GB1831ABS_M[1]>
             dcterms:identifier "http://www.wikidata.org/entity/Q5058981" ;
             dcterms:issued "1833" ;
             dcterms:publisher "His Majesty's Stationery Office" ;
             dcterms:publisher.place "London" ;
             dcterms:title "1831 Census of Great Britain, Abstract of answers, Table [1], Population Abstract" .
         auo:Fiennes dcterms:creator "Celia Fiennes" ;
             dcterms:issued "1888" ;
             dcterms:publisher "Field and Tuer, The Leadenhall Press" ;
             dcterms:publisher.place "London" ;
             dcterms:title "Through England on a Side Saddle in the Time of William and Mary" .
         auo:Camden dcterms:creator "William Camden" ;
             dcterms:identifier "http://www.wikidata.org/entity/Q12899074" ;
             dcterms:issued "1610" ;
             dcterms:publisher "George Bishop and John Norton" ;
             dcterms:publisher.place "London" ;
             dcterms:title "Britain, or, a Chorographicall Description of the most flourishing Kingdomes, England, Scotland, and Ireland" .
  25. Acknowledging contributors
     Standard approach with open data is to use a Creative Commons license, and most commonly one requiring acknowledgment. Examples shown: Wikipedia, Geonames, Open Street Map.
  26. CC Licence
     • xx
  27. CC Licence
     • But if the digital representation of the historical source does not identify the underlying creators, how can users acknowledge them?
  28. Acknowledging contributors in PastPlace
     • Again, uses existing standards, notably Friend of a Friend
     • But identifying their role was harder …
     • NB last example here shows approach but is from earlier version of API
         auo:PastP_Name_ALDATES_ST_eng_1831
             rdfs:label "ALDATES ST"^^xsd:string ;
             dcterms:date "1831"^^xsd:string ;
             dcterms:language "eng"^^xsd:string ;
             dcterms:source <#SRC_GB1831ABS_M[1]> ,
                 marcrel:GATLEYD_par_1831 .
         auo:PastP_Name_Oxenford_eng_1610
             rdfs:label "Oxenford"^^xsd:string ;
             dcterms:date "1610"^^xsd:string ;
             dcterms:language "eng"^^xsd:string ;
             dcterms:source auo:Camden ,
                 marcrel:SUTTOND_camden .
         pleiades:hasName [
             rdfs:label "BRIDE ST"@eng ;
             dcterms:date "1831" ;
             dcterms:source "SRC_GB1831ABS_M[1]" ;
             marcrel:TRC [
                 rdf:type foaf:Person ;
                 foaf:name "David Allan Gatley" ;
                 foaf:mbox "<d.a.gatley@staffs.ac.uk>" ;
                 foaf:homepage "<http://www.staffs.ac.uk/staff/profiles/dag1.jsp>" ;
             ] ;
         ] ;
  29. Identifying contributors’ roles
     • Based on Library of Congress's MARC Code List for Relators with one addition (DIG):
       – (http://id.loc.gov/vocabulary/relators)
     ID  | Role        | Description
     AUT | Author      | Use for a person or corporate body chiefly responsible for intellectual or artistic content of a work, usually printed text
     EDT | Editor      | Use for a person who prepares for publication a work not primarily his/her own, such as by elucidating text, adding introductory matter, or technically directing an editorial staff.
     TRC | Transcriber | Use for a person who prepares a handwritten or typewritten copy from original material, including from dictated or orally recorded material.
     DIG | Digitizer   | Use for a person who computerized the data from an earlier transcription. Use only when a transcriber is also identified.
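
The rule attached to the added DIG code ("use only when a transcriber is also identified") lends itself to a simple check. The following is a minimal sketch under that reading of the slide; the `ROLES` table copies the four codes shown, and `validate_roles` is a hypothetical helper, not something the PastPlace API is known to provide.

```python
# The four relator codes from the slide: three MARC codes plus the added DIG.
ROLES = {
    "AUT": "Author",
    "EDT": "Editor",
    "TRC": "Transcriber",
    "DIG": "Digitizer",
}


def validate_roles(codes):
    """Return a list of problems with the role codes attached to one record."""
    problems = [f"unknown role code: {code}" for code in codes if code not in ROLES]
    # Slide rule: a digitizer is only recorded alongside a transcriber.
    if "DIG" in codes and "TRC" not in codes:
        problems.append("DIG requires a TRC (transcriber) to be identified as well")
    return problems


if __name__ == "__main__":
    print(validate_roles(["TRC", "DIG"]))
    print(validate_roles(["DIG"]))
```
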
  30. Developing a Linked Data API for historical statistics
     • Work funded by Arts & Humanities Research Council’s “Humanities Big Data” program
     • Originally expected to be a UK contribution to Collaborative for Historical Information and Analysis
     • Now stand-alone project creating Linked Data API to existing Vision of Britain data structure
  31. GBH GIS Statistical Content
     • Data from every Census of Population 1801-2001
       – Parish-level population counts
       – Age by sex by district
       – Occupations, industries, housing conditions, etc …
     • Vital statistics
       – Decennial data: cause of death by age by sex by district
     • Votes for each candidate in each constituency in each election 1832-2005
     • Farm census: crop acreages and numbers of animals, by county, at least decennial since 1869
     • Unemployment and poor law data, by town/district
     • … a diverse but very ordinary set of social statistics
  32. … and all these data are in one table
     • Eliminates the evil concept of “datasets”
       – One absolutely central goal of our work was to be able to work with census data as time series
       – Actually, some notion of datasets survives: nCubes
     • Enables data to be presented easily as both time-series and cross-sections
     • Greatly simplifies writing programs accessing data
     • Database is indefinitely extensible without adding database tables
       – NB architecture enables us to add quite new themes or geographical areas, not just more British census data
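
The point of slide 32, that a single table can serve both time series and cross-sections, can be illustrated with a toy version. Everything in this sketch is assumed for illustration: the row layout, the unit names and the numbers are invented and do not reflect the actual GBH GIS schema or real census figures; only the cell reference style (`TOT_POP:now`) follows the slides.

```python
from typing import NamedTuple


class Value(NamedTuple):
    """One row of a single 'all data values' table (illustrative layout only)."""
    unit: str     # reporting area
    cell: str     # what was measured, e.g. "TOT_POP:now"
    year: int     # when
    value: float


# Invented rows for illustration; not real census figures.
TABLE = [
    Value("unit_A", "TOT_POP:now", 1851, 1000.0),
    Value("unit_A", "TOT_POP:now", 1861, 1200.0),
    Value("unit_B", "TOT_POP:now", 1851, 500.0),
    Value("unit_B", "AREA_HECTARES:total", 1851, 250.0),
]


def time_series(table, unit, cell):
    """One unit, one measure, all years: a time series."""
    return sorted((r.year, r.value) for r in table if r.unit == unit and r.cell == cell)


def cross_section(table, cell, year):
    """One measure, one year, all units: a cross-section."""
    return {r.unit: r.value for r in table if r.cell == cell and r.year == year}


if __name__ == "__main__":
    print(time_series(TABLE, "unit_A", "TOT_POP:now"))
    print(cross_section(TABLE, "TOT_POP:now", 1851))
```

Both views are just different filters over the same rows, which is why adding a new theme or area needs new rows rather than new tables.
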
  33. GBH GIS Data Architecture Overview
     Diagram labels: Data; DDS; Date Object; Data Documentation System; Gazetteer/Admin Unit Ontology; Source?; Where?; What?; When?; Source Documentation System; Ack. Thanks
  34. GBH GIS Data Architecture Overview
     Diagram labels: Data; DDS; Date Object; Data Documentation System; Gazetteer/Admin Unit Ontology; Source?; Where?; What?; When?; Source Documentation System; Ack. Thanks; Time; Space
  35. Towards an API for UK census data:
     • CAIRD: Census Aggregate Information Resource Demonstrator
       – Funded by ESRC census programme in 2008-9, led by MIMAS/CDU but with GBH GIS as collaborators
     • ONS 2011 Census Web Services Working Group
       – Fairly informal, meeting mainly in 2009-10
     • These led to ….
     • InFuse: New MIMAS/UK Data Service interface, which sits on top of a currently private API
     • ONS experimental OpenAPI
       – Launched November 2014
       – https://www.ons.gov.uk/ons/apiservice/web/apiservice
  36. Data cube vocab spec
     6th September
  37. VoB DataCube API Phase 1: Dumping all 14m. data values as RDF
     • Completed autumn 2014
     • Defining basic format
       – Broadly based on W3C Data Cube Vocabulary
       – Additional details follow Irish census gateway
     • Main issue is choosing vocabularies to link to:
       – RDF Schema (W3C)
       – SDMX (Statistical Data and Metadata eXchange)
       – GB Historical GIS (we need to implement this!)
     • NB this stage is of limited usefulness by itself, as result is a single file of > 100 Gb
  38. VoB Datacube API Phase 2: Adding a sub-setting mechanism
     • Currently implemented:
       – Geography:
         • One or more specific unit IDs (reporting areas)
         • One or more unit types (broadly, GIS coverages)
       – Cell references
         • Specific measures, e.g. TOT_POP:now or AGESEX_85UP:male/70_74
       – Date range
     • To be added:
       – nCube
         • All components of a classification or cross-classification
       – Rates
         • i.e. extracting pairs of data values which we have associated in a rate definition, e.g. TOT_POP:now / AREA_HECTARES:total
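
The sub-setting options on slide 38 map onto query parameters that appear in URLs elsewhere in the deck (`units`, `yearfrom`, `yearto`, `cellref`). A minimal sketch of a request builder follows; the helper `datacube_url` is hypothetical, and whether the live service accepts all of these parameters together is an assumption, since the slides only show them in separate examples.

```python
from urllib.parse import urlencode

# Endpoint as listed on the "Web sites" slide.
BASE = "http://data.pastplace.org/datacube"


def datacube_url(units=(), cellref=None, yearfrom=None, yearto=None):
    """Build a datacube query URL from the sub-setting options on the slides."""
    params = {}
    if units:
        # Several unit IDs are sent as one comma-separated value.
        params["units"] = ",".join(str(u) for u in units)
    if yearfrom is not None:
        params["yearfrom"] = yearfrom
    if yearto is not None:
        params["yearto"] = yearto
    if cellref:
        params["cellref"] = cellref
    # Keep ":", "/" and "," unescaped so URLs match the form used on the slides.
    query = urlencode(params, safe=":/,")
    return f"{BASE}?{query}" if query else BASE


if __name__ == "__main__":
    print(datacube_url(cellref="TOT_POP:now", yearfrom=1881, yearto=1881))
    print(datacube_url(units=[10188982, 10076810]))
```
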
  39. Sample GBHGIS RDF, from http://testapi.pastplace.org/datacube?units=10188982,10076810
         @prefix sdmx-subject: <http://purl.org/linked-data/sdmx/2009/subject#> .
         @prefix qb: <http://purl.org/linked-data/cube#> .
         @prefix obs: <http://obs.gbhgis.geog.port.ac.uk/uri/#> .
         @prefix sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#> .
         @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
         @prefix org: <http://xmlns.com/foaf/0.1/> .
         @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
         @prefix gbhgis: <http://gbhgis.geog.port.ac.uk/> .
         @prefix admingeo: <http://data.ordnancesurvey.co.uk/ontology/admingeo/> .
         @prefix sdmx-dimension: <http://purl.org/linked-data/sdmx/2009/dimension#> .
         @prefix dc: <http://purl.org/dc/elements/1.1/> .
         obs:Observation-848458 a qb:Observation ;
             gbhgis:ref-auo "10076810" ;
             gbhgis:ref-auo-type "MOD_DIST" ;
             gbhgis:ref-period "1851" ;
             qb:dataset "http://dataset.gbhgis.geog.port.ac.uk/uri/#AGESEX_85UP:male/70_74" ;
             sdmx-measure:obsValue "166"^^<http://www.w3.org/2001/XMLSchema#decimal> .
         obs:Observation-1048478 a qb:Observation ;
             gbhgis:ref-auo "10076810" ;
             gbhgis:ref-auo-type "MOD_DIST" ;
             gbhgis:ref-period "1981" ;
             qb:dataset "http://dataset.gbhgis.geog.port.ac.uk/uri/#AREA_HECTARES:total" ;
             sdmx-measure:obsValue 21095.704 .
         obs:Observation-1259294 a qb:Observation ;
             gbhgis:ref-auo "10188982" ;
             gbhgis:ref-auo-type "MOD_DIST" ;
             gbhgis:ref-period "2001" ;
  40. VoB DataCube API Phase 3: Adding a discovery mechanism
     • Letting you find relevant nCubes, cells, unit types and units!
     • Broader queries based on date ranges, statistical keywords and “geography”
     • Geography can be queried by:
       – Bounding box: specific units which overlap
       – Point: specific units containing
       – Name: units named after place
     • Geography problems:
       – We have nearly 200 “kinds” of unit in system!
       – 22% of our units with statistics have point but no polygon
       – Units named after place don’t always contain it!
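
The two spatial query types on slide 40 (bounding-box overlap and point containment) can be sketched with plain rectangle tests. This is a simplification chosen for the example: it compares bounding boxes rather than true unit polygons, the `BBox` type and the sample extent are invented, and the slides do not say how the service implements these queries. The Oxford coordinates are the ones given on slide 22.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Axis-aligned box in degrees (longitude west/east, latitude south/north)."""
    west: float
    south: float
    east: float
    north: float


def overlaps(a: BBox, b: BBox) -> bool:
    """True if the two boxes share any area or touch."""
    return (a.west <= b.east and b.west <= a.east
            and a.south <= b.north and b.south <= a.north)


def contains_point(box: BBox, lon: float, lat: float) -> bool:
    """True if the point lies inside the box or on its edge."""
    return box.west <= lon <= box.east and box.south <= lat <= box.north


# Hypothetical extent for a unit around Oxford; not taken from the gazetteer.
UNIT_EXTENT = BBox(west=-1.31, south=51.71, east=-1.17, north=51.80)

# Oxford's coordinates as published in the PastPlace RDF on slide 22.
OXFORD_LON, OXFORD_LAT = -1.2577777777778, 51.751944444444

if __name__ == "__main__":
    print(contains_point(UNIT_EXTENT, OXFORD_LON, OXFORD_LAT))
    print(overlaps(UNIT_EXTENT, BBox(-1.2, 51.7, -1.0, 51.9)))
```

A unit that has a point but no polygon can still take part in a bounding-box query if it is treated as a zero-size box, for example `BBox(lon, lat, lon, lat)`, though it can never "contain" any other place.
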
  41. … but most historians will use these resources via conventional web interfaces …
     This is the PastPlace web app
  42. The PastPlace app is designed to work equally well on tablets and phones
     Eating our own dogfood: accesses gazetteer database entirely via the API
  43. Web sites:
     • PastPlace API:
       – http://data.pastplace.org/search
       – http://data.pastplace.org/search?q=oxford&format=n3
     • PastPlace gazetteer site:
       – http://www.pastplace.org
     • Datacube API
       – http://data.pastplace.org/datacube
       – Total populations for everywhere in 1881:
       – http://data.pastplace.org/datacube?yearfrom=1881&yearto=1881&cellref=TOT_POP:now
     • We need people to start using these APIs
     • About us:
       – http://www.gbhgis.org
