
DBpedia as Gaeilge Chapter

Slides from the first DBpedia as Gaeilge chapter meeting, held as a session of the DBpedia Community Meeting 2015 in Dublin

Published in: Data & Analytics

  1. DBpedia as Gaeilge Chapter Meeting, 9th January 2015
  2. @en LOD Cloud
  3. DBpedia ?????? @ga LOD Cloud
  4. Creating DBpedia as Gaeilge Chapter
  5. Wikipedia Infoboxes → DBpedia Mappings → DBpedia Triples → Applications. DBpedia as Gaeilge Chapter Workflow
  6. Wikipedia Infoboxes → DBpedia Mappings → DBpedia Triples → Applications. DBpedia as Gaeilge. Wikipedia Structure & Infoboxes
  7. How Data is Structured in Wikipedia
  8. SOURCE: http://stats.wikimedia.org/EN/SummaryGA.htm
  9. Terry Pratchett Vicipéid Page SOURCE: http://ga.wikipedia.org/wiki/Terry_Pratchett
  10. Infobox (Bosca Sonraí Scríbhneoir) SOURCE: http://ga.wikipedia.org/wiki/Terry_Pratchett
  11. Terry Pratchett: Infobox (Bosca Sonraí Scríbhneoir), Vicipéid Infobox Template SOURCE: http://ga.wikipedia.org/wiki/Terry_Pratchett
  12. Editing the Infobox: Visual UI SOURCE: http://ga.wikipedia.org/wiki/Terry_Pratchett
  13. From Wikipedia to DBpedia…
  14. Wikipedia Infoboxes → DBpedia Mappings → DBpedia Triples → Applications. DBpedia as Gaeilge
      • What are DBpedia mappings?
      • How to create those mappings?
      • What is the current status?
  15. What is the DBpedia Ontology? Terry Pratchett is an Artist of subclass … Writer subclass has label: @en: writer, @ga: scríbhneoir, @fr: écrivain SOURCE: http://mappings.dbpedia.org/server/ontology/classes/
  16. Ontology Labels & Comments
      • Current ontology: 685 classes, 2795 properties
      • Starting point: label & comment translations for the classes?
      • These labels and comments are available to all DBpedia queries, not just those from ga.dbpedia
  17. Mapping Process ?
  18. SOURCE: http://mappings.dbpedia.org/server/statistics/ga/
  19. Wikipedia Infoboxes → DBpedia Mappings → DBpedia Triples → Applications. DBpedia as Gaeilge. Extracting Triples
      • DBpedia Extraction Framework
      • SPARQL Endpoint
  20. Extracted Terry Pratchett Triples SOURCE: http://ga.wikipedia.org/wiki/Terry_Pratchett SOURCE: http://ga.dbpedia.org/sparql
  21. Raw Extractions (not from Ontology Mappings)
  22. SPARQL Endpoint Interface SOURCE: http://ga.dbpedia.org/sparql
  23. Query Result SOURCE: http://ga.dbpedia.org/sparql
  24. SPARQL Endpoint SOURCE: http://ga.dbpedia.org/sparql
  25. SPARQL Endpoint SOURCE: http://ga.dbpedia.org/sparql
  26. DBpedia as Gaeilge Use Cases
  27. Wikipedia Infoboxes → DBpedia Mappings → DBpedia Triples → Applications. DBpedia as Gaeilge
  28. A Linked Data proof-of-concept SOURCE: http://apps.dri.ie/locationLODer/
  29. Video: http://www.bbc.com/news/entertainment-arts-25324655
      Article: http://www.irishtimes.com/culture/books/david-butler-great-writers-enrich-experience-even-the-mundane-1.2082759
      Image: http://tvtropes.org/pmwiki/pmwiki.php/Creator/TerryPratchett
      http://ga.dbpedia.org/resource/Terry_Pratchett
      Archives
  30. What can the Chapter do?
  31. Wikipedia Infoboxes → DBpedia Mappings → DBpedia Triples → Applications. DBpedia as Gaeilge. What can we do?
      • Adding Infoboxes to Vicipéid
      • Translating DBpedia Ontology (Labels & Properties)
      • Creating Wikipedia Infobox to Ontology Mappings
      • Create ga.dbpedia.org instance (Insight)
      • Creating applications
  32. Who can help?
      • Linked Data specialists
      • Irish-speaking software developers
      • Vicipéid community, students, translators-in-training
      • Translators, editors, linguists, cultural scholars
  33. Groups & Possible Tasks: Cumas Gaeilge (Irish-language ability); Linked Data Knowledge; Editors; Translators; Cultural Scholars; Linguists; Translators in training; Students; Irish-speaking Software Developers; Data Scientists; Linked Data Academics; Vicipéid Editors
  34. Initial Work on DBpedia Gaeilge
      ✓ Instance ga.dbpedia created
      ✓ Chapter website created
      ✓ Experiments with automatic translations of ontology class labels
  35. SMT system: English to Irish
      English label | Irish label | Comments
      canadian football team | foireann sacair cheanada | Correct grammatical case; football translated as soccer
      database | bunachar sonraí | Correct
      television personality | pearsantacht teilifíse | Correct grammatical case for 2 nouns together
      broadcast network | craoladh líonra | Specific term required here: 'líonra craolacháin' on focal.ie, but stáisiún in general use. English word order.
      government agency | rialtas an ghníomhaireacht | English word order, changes meaning entirely; incorrect case used
      military structure | struchtúr míleata | Correct word order but ambiguous description
      radio program | raidió ríomhchlár | English word order; domain computing instead of media for program
  36. Next Steps & Discussion?
  37. Contacts
      http://ga.dbpedia.org
      Bianca Pereira, bianca.pereira@insight-centre.org
      Caoilfhionn Lane, caoilfhionn.lane@insight-centre.org
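
The label lookup described on slide 15 and the SPARQL endpoint shown on slides 22 to 25 can be illustrated with a small sketch. It is not taken from the slides: the function names `build_label_query` and `build_request_url` are hypothetical, and it assumes that the endpoint accepts the standard SPARQL Protocol `query` parameter over GET and that the dataset stores `rdfs:label` values for the resource URI from slide 29. The code only builds the query and the request URL; it makes no network call.

```python
from urllib.parse import urlencode

# Endpoint URL as given on the slides; it is not assumed to be reachable today.
ENDPOINT = "http://ga.dbpedia.org/sparql"


def build_label_query(resource_uri: str, lang: str = "ga") -> str:
    """Build a SPARQL query for the rdfs:label values of one resource in one language.

    Hypothetical helper: it assumes labels are stored with rdfs:label
    and carry a language tag such as "ga" or "en".
    """
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?label WHERE {\n"
        f"  <{resource_uri}> rdfs:label ?label .\n"
        f'  FILTER (lang(?label) = "{lang}")\n'
        "}"
    )


def build_request_url(query: str, endpoint: str = ENDPOINT) -> str:
    """Encode the query as a SPARQL Protocol GET request URL (no request is sent)."""
    return f"{endpoint}?{urlencode({'query': query})}"


# Resource URI taken from slide 29.
query = build_label_query("http://ga.dbpedia.org/resource/Terry_Pratchett")
url = build_request_url(query)
print(query)
print(url)
```

Passing `lang="en"` instead would request the English label, which mirrors the point on slide 16 that translated labels sit alongside the existing ones.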