
Every Identity, its Ontology


To be useful, Linked Open Data requires shared identities and the reuse of their identifiers (URIs). This presentation argues that exact identity matching is both theoretically and practically impossible, and proposes some practical considerations for how to create an actual web of data.
Presented as an invited seminar at UC Berkeley, February 24th, 2017.

Published in: Technology

Every Identity, its Ontology

  1. @azaroth42 rsanderson@getty.edu IIIF: Interoperability Every Identity, Its Ontology
     Every Identity, its Ontology
     Robert Sanderson, Semantic Architect, J. Paul Getty Trust
     rsanderson@getty.edu / @azaroth42
  2. The shared identity of the concept of the fictional person Dr Strangelove:
     How I learned to stop worrying and love inconsistency
  3. Overview
     • Linked Open Data and Identity
     • Philosophical Challenges
     • Practical Challenges
     • Practical Philosophy
     • A Philosophy of Practicality
  4. Linked Open Data’s Potential
     Linked Open Data achieves its potential when institutions:
     • link outside of their own data (⭐⭐⭐⭐⭐),
     • trust other organizations to manage, publish and maintain data which they use
  5. Linked Open Data’s Challenges
     Commonly cited:
     • Amount of data to transform
     • Data is mostly “strings”, not “things”
     • Cost of new management system
     • Cost of new business workflows
     • Difficulty of data enrichment
     • Institutional reluctance to embrace change, trust, imperfection
  6. Identity
     We need to understand the entity before we can reuse its identifier!
     Questions:
     1. What constitutes “identity”?
     2. How does one describe entities?
     3. How does one discover identifiers?
  7. LOD Identity Fundamentals
     • Open World Assumption
       • What is not stated is unknown, not false
       • No single agent or observer has complete knowledge in a distributed system
     • Identifier space is infinite
       • No formal character limit for IRIs
       • Even practical limit is very large (65536 ^ length)
  8. LOD Identity Fundamentals
     • IRIs are globally unique
     • IRIs used for identifying entities and relationships
     • No identity for instance of a relationship
     • Only one contextual identity (named graph) per statement, with inconsistent use
     • Anyone may make assertions about any entity
  9. Every Identity, …
  10. … some Philosophy
      • RDF falls in Plato’s “Universals” space
      • Same relationship had by many entities
      • No relationship instances
      • Fictional entities and relationships ok
      http://www.getty.edu/art/collection/objects/249050/
  11. 1. Indiscernibility of Identicals
      for each object a:
          for each object b:
              if a === b:
                  for each property P:
                      P(a) === P(b)
      Or … owl:sameAs
  12. Open World Ramifications
      If we know that      a owl:sameAs b
      And discover that    a property x
      Then we know that    b property x
      The rule is an effect of identity; it doesn’t help us determine identity.
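The rule on slides 11 and 12 can be sketched in a few lines of Python. This is a minimal illustration and not a reasoner: triples are plain tuples, `propagate_same_as` is a hypothetical helper name, and only statements with the entity in subject position are copied.

```python
# Hypothetical sketch: triples are (subject, predicate, object) tuples.
SAME_AS = "owl:sameAs"


def propagate_same_as(triples):
    """Copy every statement about a subject onto subjects asserted owl:sameAs it.

    Identity must already be asserted. Nothing here decides whether two
    entities are in fact the same.
    """
    known = set(triples)
    changed = True
    while changed:
        changed = False
        # owl:sameAs is symmetric, so use each asserted pair in both directions.
        pairs = {(s, o) for s, p, o in known if p == SAME_AS}
        pairs |= {(o, s) for s, o in pairs}
        for a, b in pairs:
            # Iterate over a copy because new statements are added to `known`.
            for s, p, o in list(known):
                if s == a and p != SAME_AS and (b, p, o) not in known:
                    known.add((b, p, o))
                    changed = True
    return known


facts = {
    ("a", "owl:sameAs", "b"),
    ("a", "property", "x"),
}
inferred = propagate_same_as(facts)
print(("b", "property", "x") in inferred)  # True
```

The function only ever consumes `owl:sameAs` statements. It never produces one, which is the point of slide 12: the rule follows from identity and gives no way to establish it.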
  13. 2. Identity of Indiscernibles
      object a === object b if:
          for each property P:
              P(a) === P(b)
      Or: If two entities share all of their properties, they are the same entity.
  14. Open World Ramifications (1)
      Uh-oh!
      • There are infinite (potential) properties
      • We cannot compute indiscernibility as the for loop on the properties would run forever
      len(Ψ) = ∞
      Indiscernibility: (∀ P ∈ Ψ)(P(a) = P(b)) → a = b
  15. Open World Ramifications (2)
      Uh-oh!!
      • There are infinite (potential) properties
      • [Imagine the loop could run in zero time]
      • Any different property would prevent identity
      • The likelihood of encountering indiscernibles is 1/∞ … or 0
  16. Open World Ramifications (3)
      Uh-oh!!!
      • There are infinite (potential) properties
      • Any property not asserted is just not known locally and could be known elsewhere
      • To compute, you need complete knowledge of an infinite set of instances and infinite properties, and zero cost comparison.
  17. Escaping the Infinite Loop?
      But …
      • Finite asserted properties
      • Finite set of publishers
      • Finite changes over time
      Can’t we iterate over only the properties actually asserted?
  18. Escaping the Infinite Loop?
      Still need the big-triplestore-in-the-sky with all assertions from all publishers.
      Answer: Google can do it!
      Google, will you run a big triplestore for us?
  19. Google is Disinclined to Acquiesce to your Request
  20. Escaping the Infinite Loop?
      Also trivial to construct a failing case:
      let Ψ = [rdfs:label]
      a rdfs:label “Unknown”
      b rdfs:label “Unknown”
      Should not conclude that a owl:sameAs b
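The failing case on slide 20 is easy to reproduce. The sketch below assumes triples are plain tuples and uses hypothetical helper names (`asserted_properties`, `naive_indiscernible`). It applies the Identity of Indiscernibles to asserted properties only, which is the shortcut slide 17 asks about.

```python
from collections import defaultdict


def asserted_properties(triples, subject):
    """Collect the asserted predicate -> set of values for one subject."""
    props = defaultdict(set)
    for s, p, o in triples:
        if s == subject:
            props[p].add(o)
    return dict(props)


def naive_indiscernible(triples, a, b):
    """True when a and b have exactly the same asserted properties.

    Under the open world assumption this only compares what happens to be
    stated locally, so a True result does not establish identity.
    """
    return asserted_properties(triples, a) == asserted_properties(triples, b)


triples = [
    ("a", "rdfs:label", "Unknown"),
    ("b", "rdfs:label", "Unknown"),
]
print(naive_indiscernible(triples, "a", "b"))  # True, yet a and b may be unrelated
```

Two records that merely share a placeholder label pass the check, so restricting the loop to asserted properties makes it terminate without making its answer trustworthy.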
  21. (╯ರ ~ ರ)╯︵ ┻━┻
  22. (╯ರ ~ ರ)╯︵ ┻━┻
      angry tableflip
  23. Practical Philosophy: John Locke
      “You cannot know an entity’s identity, only its qualities.” (paraphrased)
      This rings true:
      <urn:uuid:493650E7-ACBB-40EC-B141-4F2B6C660A71>
  24. More Properties, More Identity?
      Identity is a relationship that admits of degree:
      • Less than 100% identity is resemblance
      • The more resemblance, the more certain the identity relation
      skos:exactMatch
      • “high degree of confidence that the concepts can be used interchangeably across a wide range of applications”
      skos:closeMatch
      • “sufficiently similar that they can be used interchangeably in some applications”
  25. … its Ontology
  26. Resemblance?
      • Given “sufficient resemblance”, we can conclude identity for practical purposes
      • Resemblance is via shared properties
      • To compute resemblance, we must understand the properties shared by candidate entities
      • Properties are given as predicates in LOD
      • Need for shared ontology?
  27. Porridge Too Hot? Too Cold?
      Too few properties:
      • Sufficiency of resemblance impossible
      Too many properties:
      • Amount of information overwhelming
      • More likely to run into incompatible properties
  28. _:Porridge crm:P51_has_former_or_current_owner _:Papa Bear?
      Understanding can then be increased by not only looking at the one entity, but where it fits within the graph of connected entities.
      Now you have many resemblance problems.
  29. Graph Isomorphism
      Graphs unlikely to have the same shape, even with a shared ontology.
      Different organizations:
      • know different information
      • are from different domains
      • have different foci
      • have different contexts for the work
  30. Costs / Values
      Reuse
      • Philosophically Infinite
      • Automated: Expensive
      • Manual: Very Expensive
      Reinvention
      • Free as in Kittens!
      Cheap, Fast, Good: Pick One! And forget about picking Cheap!
  31. Every Identity, its Ontology
      In the absence of continuous community pressure, demonstration of value, and in-house expertise, even well-intentioned organizations will create their own identities and ontologies for describing entities.
  32. Example: Lewis Carroll
      Cultural Heritage Sector
      • Getty ULAN
      • Library of Congress NAF
      • Bibliothèque nationale de France
      • Deutsche Nationalbibliothek
      • British Library
      • ISNI
      • VIAF
      • SNAC
      • …
      Industry
      • MusicBrainz (LinkedBrainz)
      • IMDb (LinkedMDB)
      • DBpedia
      • Wikidata
      • Google / Freebase
      • Genealogics
      • Quora
      • ReadSocial
      • …
  33. Practical Philosophy
  34. Perfect is the Enemy of the Good
      We could stop requiring perfection in our use of others’ data:
      • skos:exactMatch, not owl:sameAs
      • Data that is good enough
      • And contribute improvements!
      • Persistence, not Permanence
      • Target is Comprehension, not Inference
  35. Sufficiency of Resemblance
      We could publish a set of rules per class by which sufficiency of resemblance can be determined:
      • Which properties must overlap?
      • Which properties must be exactly the same?
      • Which properties can be ignored?
      • Which relationships must match?
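One way to picture the per-class rules of slide 35 is as a small data structure plus a check. Everything below is a hypothetical sketch: the `ResemblanceRule` class, its field names, the property names (`name`, `birthYear`, `note`) and the sample records are illustrative assumptions, not a published rule format. Relationships are treated here like any other property.

```python
from dataclasses import dataclass, field


@dataclass
class ResemblanceRule:
    """Hypothetical per-class rule set; field names are illustrative only."""
    must_equal: set = field(default_factory=set)    # values must be exactly the same
    must_overlap: set = field(default_factory=set)  # at least one shared value
    ignore: set = field(default_factory=set)        # never compared


def sufficiently_resembles(a, b, rule):
    """Apply a rule to two descriptions mapping property name -> set of values.

    A required property missing on either side counts as insufficient
    evidence, since under the open world assumption it is merely unknown.
    """
    for prop in rule.must_equal - rule.ignore:
        if prop not in a or prop not in b or a[prop] != b[prop]:
            return False
    for prop in rule.must_overlap - rule.ignore:
        if not (a.get(prop, set()) & b.get(prop, set())):
            return False
    return True


# Illustrative rule for a "person" class.
person_rule = ResemblanceRule(
    must_equal={"birthYear"},
    must_overlap={"name"},
    ignore={"note"},
)

record_a = {
    "name": {"Lewis Carroll", "Charles Lutwidge Dodgson"},
    "birthYear": {"1832"},
    "note": {"author"},
}
record_b = {"name": {"Lewis Carroll"}, "birthYear": {"1832"}, "note": {"mathematician"}}
record_c = {"name": {"Lewis Carroll"}, "birthYear": {"1932"}}

print(sufficiently_resembles(record_a, record_b, person_rule))  # True
print(sufficiently_resembles(record_a, record_c, person_rule))  # False
```

In keeping with slide 34, a positive result from a check like this would support asserting skos:exactMatch between the two records, not owl:sameAs.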
  36. Resemblance Services
      We could publish services to make it easier to discover and reconcile our identities:
      • Auto-complete / type-ahead
      • OpenRefine reconciliation
      • Embeddable widgets
  37. Shared Infrastructure
      We could contribute to shared infrastructure for discovery and change management:
      • Shared infrastructure, decentralized publication
      • Notifications when data changes
      • Notifications when identities are used
      • With links back from the identity
      • Separate publishing / discovery concerns
  38. Philosophy of Practicality
  39. Five Laws of LOD
      • Linked Open Data is for Use
      • Every Developer, her Data
      • Every Data, its Application
      • Save the time of the Developer
      • LOD [Community] is a growing Organism
  43. Patrick Hochstenbach, @hochstenbach
  44. https://www.flickr.com/photos/harris77/3357537737
  45. Linked Open Usable Data!
      • Strict identity matching is impossible
      • Target is skos:exactMatch, not owl:sameAs
      • Shared ontologies are more important than precision
      • Target is comprehension, not inference
      • Build services & infrastructure to enable reconciliation
      • Target audience of LOD is Developers
      Pick Usable not Perfect!
  46. Thank You!
      Rob Sanderson
      rsanderson@getty.edu / @azaroth42
  47. Discuss!
