Persistent Identifiers, NeIC workshop August 2015 in Oslo
Dag Endresen, GBIF Norway, UiO Natural History Museum
  
The purpose of identifiers
…is to name things, making it possible to refer to them.
  
Name ambiguity:
Many things (in GBIF) are named 123

Catalog number: 123; GBIF ID: 543392241; urn:catalog:CAS:BOT:123; Bigelowia juncea
Catalog number: 123; GBIF ID: 1030591721; UAMb:Herb:123; Sphagnum girgensohnii
Catalog number: 123; GBIF ID: 893477175; Parides erithalion
Catalog number: 123; GBIF ID: 1050327334; Cinchona ledgeriana
Catalog number: 123; GBIF ID: 231564351; Umbrina canariensis
Catalog number: 123; GBIF ID: 931031820; Bromus kalmii
Catalog number: 123; GBIF ID: 283363; urn:occurrence:Arctos:MVZ:Egg:123:164; Mercurialis ovata
Catalog number: 123; GBIF ID: 896547722; urn:occurrence:Arctos:MVZ:Egg:123:164; Contopus sordidulus veliei
  
When is the identifier “good enough”?

Unique and persistent - within a given context.

“The common experience is that an identifier is created within a system or within a context, and that at a later date it needs to be used in another or larger context” (Karen Coyle 2006).

Expanding context:
1.  Within one museum collection (catalog number).
2.  Within a network between museum collections (collection code + catalog number).
3.  Within a biodiversity information network (institution code + collection/dataset code + catalog number).
4.  On the Internet (e.g. http URI, DOI, LSID, etc.)
5.  … larger contexts are possible to imagine in the future!
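
The expanding contexts listed above can be sketched as progressively longer composite keys. This is only an illustration, not GBIF's implementation: the function name `qualified_id` and the colon-joined layout are assumptions, and the codes are read off the `urn:catalog:CAS:BOT:123` and `UAMb:Herb:123` examples on the name-ambiguity slide.

```python
from typing import Optional


def qualified_id(catalog_number: str,
                 collection_code: Optional[str] = None,
                 institution_code: Optional[str] = None) -> str:
    """Join whichever codes are available into one colon-separated key.

    The colon-separated layout is a hypothetical choice for this sketch.
    """
    parts = [institution_code, collection_code, catalog_number]
    return ":".join(p for p in parts if p)


# Context 1: inside a single collection the catalog number is enough.
local_a = qualified_id("123")
local_b = qualified_id("123")

# Context 3: across institutions the same catalog number needs qualifying.
global_a = qualified_id("123", "BOT", "CAS")
global_b = qualified_id("123", "Herb", "UAMb")

print(local_a == local_b)    # the bare catalog numbers collide
print(global_a == global_b)  # the qualified keys stay distinct
```

Each wider context adds a qualifier, which is also why such keys break as soon as one of the qualifiers is renamed.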
  
Expanding context…
[Figure labels: Internet, Museum, Identifier]
  
Identifiers for museum collections

The longevity of museums leads to:

“The need to use identifiers from our past in the current highly-networked digital systems” (Karen Coyle 2006 [talking about libraries]).

Specify a namespace for the identifiers?
•  URI – uniform resource identifier (unique in the context of the web).
•  URN – uniform resource name (name not tied to location).
•  URL – uniform resource locator (network location as identifier).
•  PURL – persistent URL (commitment to service longevity).

Something else…?
•  DOI – digital object identifier
•  ARK – archival resource key
•  UUID – universally unique identifier
  
•  Persistent Identifier (PID)
•  Globally Unique Identifier (GUID)
•  Uniform Resource Identifier (URI)
•  Persistent Uniform Resource Locator (PURL)
•  Life Science Identifier (LSID)
•  Digital Object Identifier (DOI)
•  Handle system (Handle)
•  Archival Resource Key (ARK, EZID)
•  Universally Unique Identifier (UUID)
•  …
  
Photo: Smithsonian National Museum of Natural History, USNM-445024-Eutoxeres-aquila
PURL
Reuse existing identifiers
  
•  Globally unique
•  Scalability, number of IDs
•  Community acceptance
•  Long-term life-cycle
•  Resolvable, resolution service(s)
•  Cost per identifier
•  People-friendly or machine-friendly
•  Solution for the generation of new IDs
   –  Central generation, PID issuer
   –  Distributed generation at source
  
•  A UUID is a 16-octet (128-bit) number, written as 36 characters.
•  Example: 41d9cbb4-4590-4265-8079-ca44d46d27c3
•  The probability of one duplicate would be about 50% if every person on earth created 600 million UUIDs.
•  Allows for easy generation at source in a distributed network.
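
The size and collision claims above can be checked with a short sketch. It assumes random (version 4) UUIDs, which carry 122 random bits, and uses the standard birthday-bound approximation; the names `collision_probability` and `n_half` are illustrative only.

```python
import math
import uuid

# Generate a random (version 4) UUID locally, with no central issuer involved.
specimen_id = uuid.uuid4()
text = str(specimen_id)  # canonical form: 32 hex digits plus 4 hyphens

# Assumption: version-4 UUIDs, where 122 of the 128 bits are random
# (the remaining 6 bits encode the version and variant).
RANDOM_BITS = 122
space = 2 ** RANDOM_BITS


def collision_probability(n: float) -> float:
    """Birthday-bound approximation: chance of at least one duplicate among n UUIDs."""
    return 1.0 - math.exp(-(n * n) / (2.0 * space))


# Number of UUIDs at which the approximation reaches 50 %.
n_half = math.sqrt(2.0 * space * math.log(2.0))

print(len(text), len(specimen_id.bytes))  # characters, octets
print(f"{n_half:.2e}")
```

Under these assumptions the 50 % threshold lands on the order of 10^18 UUIDs, the same order of magnitude as the per-person figure on the slide multiplied by the world population.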
  
http – PURL – UUID
http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3
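
A minimal sketch of how such an identifier could be assembled and taken apart again. The prefix and the UUID are the ones shown on the slide; the helper names `to_purl` and `from_purl` are hypothetical.

```python
import uuid

PURL_PREFIX = "http://purl.org/nhmuio/id/"  # prefix as shown on the slide


def to_purl(specimen_uuid: uuid.UUID) -> str:
    """Build the http PURL for a specimen UUID (hypothetical helper)."""
    return PURL_PREFIX + str(specimen_uuid)


def from_purl(purl: str) -> uuid.UUID:
    """Recover the UUID from a PURL; raises ValueError if it does not fit."""
    if not purl.startswith(PURL_PREFIX):
        raise ValueError(f"not an NHM-UiO PURL: {purl!r}")
    # uuid.UUID itself raises ValueError on a malformed UUID string.
    return uuid.UUID(purl[len(PURL_PREFIX):])


example = uuid.UUID("41d9cbb4-4590-4265-8079-ca44d46d27c3")
purl = to_purl(example)
print(purl)
```

Keeping the UUID as the only variable part means the identifier can be minted at the source without asking the PURL service for a new name.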
  
[Diagram labels: Identifier, Resolver, Location, Specimen]
The resolver is a system to resolve locations from identifiers, enabling retrieval even when the location changes.
http://purl.org/nhmuio/id/[UUID]
http://gbif.no/resolver/[UUID]
No-information object (http redirect)
http 303 redirect
  
http://purl.org/nhmuio/id/UUID → http://gbif.no/resolver/UUID
http://purl.org/gbifnorway/id/UUID → http://gbif.no/resolver/UUID
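
The two redirects above can be sketched as a pure function plus a lookup table. This is an assumption-laden illustration, not the actual purl.org or gbif.no code: `redirect_for`, `locate`, and the `example.org` location are hypothetical, while the paths, the resolver base and the 303 status come from the slides.

```python
from typing import Dict, Optional, Tuple

RESOLVER_BASE = "http://gbif.no/resolver/"       # redirect target shown on the slide
PURL_PATHS = ("/nhmuio/id/", "/gbifnorway/id/")  # both PURL paths shown on the slide


def redirect_for(path: str) -> Tuple[int, Optional[str]]:
    """Return (status, Location header) for a request path on the PURL host."""
    for prefix in PURL_PATHS:
        if path.startswith(prefix) and len(path) > len(prefix):
            # 303 See Other: the identifier names a thing, the target describes it.
            return 303, RESOLVER_BASE + path[len(prefix):]
    return 404, None


# Hypothetical table kept by the resolver. Only this table is edited when
# the data moves; the published identifier stays the same.
locations: Dict[str, str] = {
    "41d9cbb4-4590-4265-8079-ca44d46d27c3": "http://example.org/specimens/O-L-000014",
}


def locate(uuid_text: str) -> Optional[str]:
    """Look up the current location for a UUID, or None if unknown."""
    return locations.get(uuid_text.lower())


print(redirect_for("/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3"))
print(locate("41D9CBB4-4590-4265-8079-CA44D46D27C3"))
```

Because both PURL paths funnel into one resolver, the institution can change its PURL domain or its hosting without reissuing identifiers.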
  
Including machine-readable formats
  
Catalog number: O-L-000014
http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3
  
UUID QR codes for museum objects at NHM-UiO provide:

•  Machine-readable identifiers (using a simple smartphone or a barcode reader).
•  New and efficient workflows for collection management.
•  Deployment of stable identifiers appropriate for databasing.
  
http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3 (machine friendly)
Catalog number: O-L-000014 (human friendly)
Efficient workflow routines
  
http://gbif.no/transcribe/
  
Some key challenges for the group work
	
  
•  Many of the original source datasets indexed by GBIF are regularly updated and re-indexed by the GBIF portal. Without stable and persistent identifiers, information on the same herbarium specimen (or species observation) is sometimes included more than once, leading to duplicated information: duplicated in the sense of more than one (unlinked) data record for the same Real World entity.
  
•  Without stable and persistent identifiers for herbarium specimens (and species observations) it is difficult to link the same data record indexed at different re-indexing cycles of the GBIF portal. When a previously indexed data record is not re-identified in a new version of a given dataset, the record is deleted from the portal, and the link to previous versions of this data record is lost.
  
•  A composite key identifier (such as the Darwin Core triplet) based on a combination of the metadata attributes for institution code (dwc:institutionCode), collection code (dwc:collectionCode), and the local specimen identifier (dwc:catalogNumber) is generally used as the specimen identifier in GBIF. However, all three metadata attributes can (and do) sometimes change.
  
•  What could be a best practice guideline for identifier resolution? Is it useful to define and agree on a (set of) common and well-defined response format(s)? Is it useful to provide recommendations for a set of metadata profiles with a clear set of defined metadata attributes? Or would more general principles and more open recommendations be more likely to stand the test of time and remain relevant with the emergence of new information infrastructure technologies?
  
•  Challenges, pros and cons of reusing object identifiers and metadata attribute terms declared by others without full control of how these objects and terms are maintained. Objects and concepts declared for a particular purpose will often not match exactly the needs of another purpose. How to optimally reuse each other's OWL ontologies, metadata vocabularies and data object models?
  
•  Should identifiers identify the Real World physical objects, the entities that the collection curators and users of the information care about? Or should the identifier be assigned to database records? Real World entities will not have a signature byte-sequence and will rely on interpretation of when an object is considered to be the same thing.
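
The re-indexing problem in the first three challenges can be made concrete with a small sketch. The field names follow Darwin Core terms; the record values, the split of `O-L-000014` into codes, the renamed collection code and the helper `lost_links` are all invented for illustration and do not describe how the GBIF portal actually matches records.

```python
from typing import Callable, Dict, Hashable, List, Set, Tuple

Record = Dict[str, str]


def triplet(rec: Record) -> Tuple[str, str, str]:
    """Darwin Core triplet used as a composite key."""
    return (rec["institutionCode"], rec["collectionCode"], rec["catalogNumber"])


def lost_links(old: List[Record], new: List[Record],
               key: Callable[[Record], Hashable]) -> Set[Hashable]:
    """Keys present in the old index that cannot be found in the new one."""
    return {key(r) for r in old} - {key(r) for r in new}


SPECIMEN = "41d9cbb4-4590-4265-8079-ca44d46d27c3"

# Two indexing cycles of the same specimen; in this made-up scenario the
# collection code was renamed between the cycles.
cycle_1 = [{"occurrenceID": SPECIMEN, "institutionCode": "O",
            "collectionCode": "L", "catalogNumber": "000014"}]
cycle_2 = [{"occurrenceID": SPECIMEN, "institutionCode": "O",
            "collectionCode": "Lichens", "catalogNumber": "000014"}]

by_triplet = lost_links(cycle_1, cycle_2, triplet)
by_uuid = lost_links(cycle_1, cycle_2, lambda r: r["occurrenceID"])

print(by_triplet)  # the old triplet no longer matches anything
print(by_uuid)     # the persistent identifier still links both cycles
```

Matching on the triplet treats the renamed record as deleted and re-created, whereas matching on a persistent identifier keeps the history of the record intact.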
  
gbif-drift@nhm.uio.no

Dag Endresen
dag.endresen@nhm.uio.no

Christian Svindseth
christian.svindseth@nhm.uio.no

Gary Larson, 1987

Workshop in Oslo, 26th Aug
  

Persistent identifiers for museum specimens, NeIC workshop, August 2015

  • 1. Persistent  Iden+fiers,  NeIC  workshop  August  2015  in  Oslo   Dag  Endresen,  GBIF  Norway,  UiO  Natural  History  Museum  
  • 2. The  purpose  of  iden.fiers                …is  to  name  things,                making  it  possible  to  refer  to  them.   2  
  • 3. Name  ambiguity:     Many  things  (in  GBIF)  are  named  123   3   Catalog  number:  123   GBIF  ID:  543392241   urn:catalog:CAS:BOT:123   Bigelowia  juncea   Catalog  number:  123   GBIF  ID:  1030591721   UAMb:Herb:123   Sphagnum  girgensohnii   Catalog  number:  123   GBIF  ID:  893477175   Parides  erithalion   Catalog  number:  123   GBIF  ID:  1050327334   Cinchona  ledgeriana   Catalog  number:  123   GBIF  ID:  231564351   Umbrina  canariensis   Catalog  number:  123   GBIF  ID:  931031820   Bromus  kalmii   Catalog  number:  123   GBIF  ID:  283363   urn:occurrence:Arctos:MVZ:Egg:123:164   Mercurialis  ovata   Catalog  number:  123   GBIF  ID:  896547722   urn:occurrence:Arctos:MVZ:Egg:123:164   Contopus  sordidulus  veliei  
  • 4. When  is  the  iden.fier  “good  enough”?     Unique  and  persistent  -­‐  within  a  given  context.     “The  common  experience  is  that  an  idenEfier  is  created  within   a  system  or  within  a  context,  and  that  at  a  later  date  it  needs   to  be  used  in  another  or  larger  context”  (Karen  Coyle  2006).     Expanding  context:   1.  Within  one  museum  collec+on  (catalog  number).   2.  Within  a  network  between  museum  collec+ons  (collec+on  code  +   catalogue  number).   3.  Within  biodiversity  informa.on  network  (ins+tu+on  code  +   collec+on/dataset  code  +  catalogue  number).   4.  At  the  Internet  (e.g.  hep  URI,  DOI,  LSID,  etc…)   5.  …  larger  contexts  are  possible  to  imagine  in  the  future!!   4  
  • 5. Expanding  context…   5   Internet     Museum   Iden+fier  
  • 6. Iden.fiers  for  museum  collec.ons     The  longevity  of  museums  lead  to:     “The  need  to  use  iden3fiers  from  our  past  in  the  current  highly-­‐ networked  digital  systems”  (Karen  Coyle  2006  [talking  about  libraries]).     Specify  a  namespace  for  the  iden+fiers?   •  URI  –  uniform  resource  iden+fier  (unique  in  the  context  of  the  web).   •  URN  –  uniform  resource  name  (name  not  +ed  to  loca+on).   •  URL  –  uniform  resource  locator  (network  loca+on  as  iden+fier).   •  PURL  –  persistent  URL  (commitment  to  service  longevity).     Something  else…?   •  DOI  –  digital  object  iden+fier   •  ARK  –  archival  resource  key   •  UUID  –  universal  unique  iden+fier   6  
  • 7. •  Persistent  Iden+fier  (PID)   •  Globally  Unique  Iden+fier  (GUID)   •  Universal  Resource  Iden+fier  (URI)   •  Persistent  Uniform  Resource  Locator  (PURL)   •  Life  Science  Iden+fier  (LSID)   •  Digital  Object  Iden+fier  (DOI)   •  Handle  system  (Handle)   •  Archival  Resource  Key  (ARK,  EZID)   •  Universally  Unique  Iden+fier  (UUID)   •  …   7  
  • 8. Photo:  Smithsonian  Na+onal  Museum  of  Natural  History,  USNM-­‐445024-­‐Eutoxeres-­‐aquila   PURL   Reuse  exis3ng  iden3fiers   8  
  • 9. • Globally unique • Scalability, number of IDs • Community acceptance • Long-term life-cycle • Resolvable, resolution service(s) • Cost per identifier • People-friendly or machine-friendly • Solution for the generation of new IDs – Central generation, PID issuer – Distributed generation at source
  • 10. • A UUID is a 16-octet (128-bit) number, written as 36 characters. • Example: 41d9cbb4-4590-4265-8079-ca44d46d27c3 • The probability of one duplicate would be about 50% if every person on earth created 600 million UUIDs. • Allows for easy generation at source in a distributed network.
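A minimal sketch of these UUID properties, using Python's standard `uuid` module. It assumes random (version 4) UUIDs, which is the version of the example above, and uses the standard birthday approximation for the duplicate estimate. The figure of 122 random bits follows from a version-4 UUID fixing 6 of its 128 bits; the helper name is hypothetical.

```python
import math
import uuid

# Generate a random (version 4) UUID locally, with no central issuer involved.
new_id = uuid.uuid4()
text = str(new_id)
print(text, len(new_id.bytes), "octets,", len(text), "characters")

# The example UUID from the slide is itself a version-4 UUID.
example = uuid.UUID("41d9cbb4-4590-4265-8079-ca44d46d27c3")
print("example version:", example.version)

RANDOM_BITS = 122  # 128 bits minus 6 fixed version/variant bits


def uuids_for_collision_probability(p, random_bits=RANDOM_BITS):
    """Birthday approximation: how many random UUIDs give probability p of a duplicate."""
    return math.sqrt(2.0 * 2.0 ** random_bits * math.log(1.0 / (1.0 - p)))


n50 = uuids_for_collision_probability(0.5)
per_person = n50 / 600e6  # people needed if each one creates 600 million UUIDs
print(f"UUIDs for a 50% chance of one duplicate: {n50:.2e}")
print(f"people creating 600 million UUIDs each:  {per_person:.2e}")
```

Under these assumptions the 50% point lies at roughly 2.7 × 10^18 UUIDs, which at 600 million UUIDs per person corresponds to a few billion people, the same order of magnitude as the claim above.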
  • 11. http – PURL – UUID: http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3
  • 12. Identifier, Resolver, Location, Specimen. The resolver is a system to resolve locations from identifiers, enabling retrieval even when the location changes. http://purl.org/nhmuio/id/[UUID] http://gbif.no/resolver/[UUID] Non-information object (http redirect): http 303 redirect
  • 13. http://purl.org/nhmuio/id/UUID → http://gbif.no/resolver/UUID; http://purl.org/gbifnorway/id/UUID → http://gbif.no/resolver/UUID
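The PURL-to-resolver mapping shown on these two slides can be sketched as a pure function that returns the redirect a resolver would send. The two PURL prefixes and the resolver prefix are taken from the slides; the function name `resolve` and the 404 answer for unrecognised input are assumptions, not a description of the actual purl.org or gbif.no services.

```python
import uuid

# Prefixes as shown on the slides.
PURL_PREFIXES = (
    "http://purl.org/nhmuio/id/",
    "http://purl.org/gbifnorway/id/",
)
RESOLVER_PREFIX = "http://gbif.no/resolver/"


def resolve(purl):
    """Return (status, location) for a PURL of the form <prefix><UUID>.

    Hypothetical sketch: a known PURL gets a 303 redirect to the resolver,
    anything else gets 404 (an assumed behaviour).
    """
    for prefix in PURL_PREFIXES:
        if purl.startswith(prefix):
            try:
                # Validate and normalise the UUID part of the PURL.
                specimen_id = uuid.UUID(purl[len(prefix):])
            except ValueError:
                return 404, None
            return 303, RESOLVER_PREFIX + str(specimen_id)
    return 404, None


status, location = resolve(
    "http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3"
)
print(status, location)
```

Because the identifier only names the specimen, the resolver target can change later without the PURL itself changing.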
  • 14. Including machine-readable formats
  • 15. Catalog number: O-L-000014; http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3
  • 16. UUID QR codes for museum objects at NHM-UiO provide: • Machine-readable identifiers (using a simple smartphone or a barcode reader). • New and efficient workflows for collection management. • Deployment of stable identifiers appropriate for databasing.
  • 17. http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3 (machine friendly). Catalog number: O-L-000014 (human friendly). Efficient workflow routines.
  • 19. Some key challenges for the group work • Many of the original source datasets indexed by GBIF are regularly updated and re-indexed by the GBIF portal. Without stable and persistent identifiers, information on the same herbarium specimen (or species observation) is sometimes included more than once, leading to duplicated information: more than one (unlinked) data record for the same Real World entity. • Without stable and persistent identifiers for herbarium specimens (and species observations) it is difficult to link the same data record across different re-indexing cycles of the GBIF portal. When a previously indexed data record is not re-identified in a new version of a given dataset, the record is deleted from the portal, and the link to previous versions of this data record is lost. • A composite key identifier (such as the Darwin Core triplet), based on a combination of the metadata attributes for institution code (dwc:institutionCode), collection code (dwc:collectionCode), and the local specimen identifier (dwc:catalogNumber), is generally used as the specimen identifier in GBIF. However, all three metadata attributes can (and do) sometimes change. • What could be a best practice guideline for identifier resolution? Is it useful to define and agree on a (set of) common and well-defined response format(s)? Is it useful to provide recommendations for a set of metadata profiles with a clear set of defined metadata attributes? Or would more general principles and more open recommendations be more likely to stand the test of time and remain relevant with the emergence of new information infrastructure technologies?
• Challenges, pros and cons of reusing object identifiers and metadata attribute terms declared by others, without full control of how these objects and terms are maintained. Objects and concepts declared for a particular purpose will often not exactly match the needs of another purpose. How to optimally reuse each other's OWL ontologies, metadata vocabularies and data object models? • Should identifiers identify the Real World physical objects, the entities that the collection curators and users of the information care about? Or should the identifier be assigned to database records? Real World entities will not have a signature byte-sequence and will rely on interpretation of when an object is considered to be the same thing.
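The re-indexing problem described in these challenges can be illustrated with a short sketch: two snapshots of the same specimen record are linked once by the Darwin Core triplet and once by a stable identifier. The `Record` class, the `link` helper and the institution and collection code values are hypothetical; only the catalog number and the PURL come from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    institution_code: str
    collection_code: str
    catalog_number: str
    stable_id: str  # e.g. a UUID-based PURL assigned at source

    @property
    def triplet(self):
        """Composite key in the style of the Darwin Core triplet."""
        return (self.institution_code, self.collection_code, self.catalog_number)


def link(old, new, key):
    """Pair records from two indexing cycles that share the same key."""
    index = {key(r): r for r in old}
    return [(index[key(r)], r) for r in new if key(r) in index]


PURL = "http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3"

# Same specimen in two cycles; the collection code was renamed in between
# (the code values "O", "L" and "Lichen" are made up for this example).
cycle_1 = [Record("O", "L", "O-L-000014", PURL)]
cycle_2 = [Record("O", "Lichen", "O-L-000014", PURL)]

by_triplet = link(cycle_1, cycle_2, key=lambda r: r.triplet)
by_stable_id = link(cycle_1, cycle_2, key=lambda r: r.stable_id)
print(len(by_triplet), "linked by triplet;", len(by_stable_id), "linked by stable identifier")
```

With the triplet as key, the renamed collection code makes the second snapshot look like a new record, whereas the stable identifier still links the two.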
  • 20. gbif-drift@nhm.uio.no. Dag Endresen, dag.endresen@nhm.uio.no. Christian Svindseth, christian.svindseth@nhm.uio.no. Gary Larson, 1987. Workshop in Oslo 26th Aug