Towards FAIR Open Science with PID Kernel Information: the RPID Testbed
Beth Plale
School of Informatics, Computing and Engineering
Data To Insight Center
Indiana University
Basarim 2017, Istanbul, Turkey, 15 Sep 2017
The ideas expressed here have been shaped through conversations in the Research Data Alliance (RDA). Special thanks to Peter Wittenburg, Tobias Weigel, and Larry Lannom.
Ideas are being put into action through a US NSF-funded project called the Robust PID (RPID) Testbed
Project partners include:
Beth Plale, Robert Quick, Robert McDonald, Indiana University
Bridget Almas, Tufts University
Larry Lannom, CNRI
The opinions expressed here are those of the author alone and do not represent the views of the US National Science Foundation
Scientific data today is baskets of apples across random orchards
Discovery is a blind man’s bluff game
Commitment to data as it ages is a mere hope
Cartoon credit: Auke Herrema
The Internet is a worldwide network of connected computers. Computers have an IP address that uniquely identifies the device on the network.
Imagine a worldwide network of data objects. Data objects persist (until they don’t). Objects are findable, accessible, interoperable, and usable (especially reusable)
Guiding abstraction for Data Sharing:
Identifies entities and stakeholders
Of interest to technologists and policy makers alike
Fecher B, Friesike S, Hebing M (2015) What Drives Academic Data Sharing? PLOS ONE 10(2): e0118053.
https://doi.org/10.1371/journal.pone.0118053
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0118053
This piece is actually a network
[Figure: network of data objects labeled A through G]
Network of independent, globally unique and persistent Data Objects that have relationships between them that we should exploit
Data Object Layer
[Figure: Data Objects are part of Repositories]
In reality, Data Objects reside in repositories
Data Objects reside in repositories but should not be completely controlled by repositories
Open science
Open science is an umbrella term for transparent science with ease of access to all products from beginning to end
Image credit: Gema Bueno de la Fuente, CC-BY
Open science
Risk in defining open science too broadly
Open science must respect boundaries set by law or decency: licenses, copyright, human subjects privacy
Open Science is increasingly connected to the FAIR principles:
Findable
Accessible
Interoperable
Reusable
FAIR Guiding Principles
1. To be Findable, any Data Object should be uniquely and persistently identifiable
1.1. The same Data Object should be re-findable at any point in time, thus Data Objects should be persistent, with emphasis on their metadata
1.2. A Data Object should minimally contain basic machine-actionable metadata that allows it to be distinguished from other Data Objects
1.3. Identifiers for any concept used in Data Objects should therefore be Unique and Persistent
FAIR Guiding Principles, cont.
2. Data is Accessible in that it can always be obtained by machines and humans
2.1 Upon appropriate authorization
2.2 Through a well-defined protocol
2.3 Thus, machines and humans alike will be able to judge the actual accessibility of each Data Object
FAIR Guiding Principles, cont.
3. Data Objects can be Interoperable only if:
3.1. (Meta)data is machine-actionable
3.2. (Meta)data formats utilize shared vocabularies and/or ontologies
3.3. (Meta)data within the Data Object should thus be both syntactically parseable and semantically machine-accessible
FAIR Guiding Principles, cont.
4. For Data Objects to be Re-usable, additional criteria are:
4.1 Data Objects should be compliant with principles 1-3
4.2 (Meta)data should be sufficiently well-described and rich that it can be automatically (or with minimal human effort) linked or integrated, like-with-like, with other data sources
4.3 Published Data Objects should refer to their sources with rich enough metadata and provenance to enable proper citation
Our vision
• Starts with a data network based on the Digital Object Architecture (DOA), a distributed architecture of services spread worldwide that together identify and resolve digital objects
• DOA was first espoused by Internet founder Robert Kahn in the mid-’80s
• DOA is a network of Handle servers at its core
The Digital Object Architecture serves as base infrastructure only. DOA is silent on issues of modeling data objects themselves: their content, their relationship to their own metadata, and the relationships between data objects
For object modeling we turn to the FAIR principles and PID Kernel Information
Data Object Model based on FAIR principles
Data modeling questions address issues:
1) What goes into a data object?
2) Should a data object include its metadata or should the metadata be a new object or both?
3) What kind of metadata should be considered?
4) What is the granularity of a data object?
5) Where does kernel information come in?
Persistent IDs are the backbone of data sharing [primary and secondary use]
• Persistent IDs (PID)
-- names a data object with a name that is globally unique
-- a data object can be metadata, data, or a digital proxy to a physical object
-- is persistent over time
plale@indiana.edu
PID makeup
• Handles have a prefix assigned to a Local Handle Server
• Suffix is under control of the Local Handle Server
• e.g., the RPID testbed assigns only temporary test handles:
– 11723.1.test, 11723.2.test, ... 11723.8.test: assigned for internal use
– 11723.9.test.<proj name>: assigned to projects; avoids collisions within the LHS namespace
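The prefix/suffix split can be sketched in a few lines. This is a minimal illustration, assuming the standard Handle syntax in which the first "/" separates prefix from suffix; the function names, the suffix `dataset-42`, and the project name `myproj` are hypothetical and not taken from the RPID testbed.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    prefix: str  # naming authority, e.g. "11723.9.test.myproj" (hypothetical)
    suffix: str  # local name under control of the Local Handle Server


def parse_handle(handle: str) -> Handle:
    """Split a handle at the first "/" into prefix and suffix."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    return Handle(prefix, suffix)


def is_project_prefix(prefix: str) -> bool:
    """True for testbed prefixes of the form 11723.9.test.<proj name>."""
    head = "11723.9.test."
    return prefix.startswith(head) and len(prefix) > len(head)
```

Because each project gets its own prefix, two projects can pick the same suffix without colliding in the LHS namespace.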
• The Handle System allows key-value information to be stored at a Local Handle Server
-- names a Data Object with a name that is globally unique
-- a Data Object can be metadata, data, or a digital proxy to a physical object
-- is persistent over time
Handle resolution in a Digital Object Architecture
[Figure: a Client, using the PIT API / SDK, queries the Handle System and the Data Type Registry Service]
Global Handle Servers (GHS), scale: [80…100]
Local Handle Service (LHS), scale: [1000…5000]: stores PID kernel information
Data Type Registry Service (DTR), scale: [1..10]: stores type definitions for kernel information
Q: prefix authority → Local Handle Service IP
Q: local handle → Handle information
Q: DTR with Profile PID → DTR Profile Definition (e.g., PID to Profile, URL to target)
Client output: Filtered PIDs → Trusted PIDs
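The three queries in the figure can be walked through with in-memory stand-ins. This is a sketch under stated assumptions, not the Handle System protocol or the PIT API: the dictionaries, host name, handles, field names, and the `resolve` function are all hypothetical, and a real client would make a network call at each step.

```python
# Hypothetical in-memory stand-ins for the three services in the figure.
GLOBAL_HANDLE_SERVERS = {
    # prefix -> address of the Local Handle Service that is its authority
    "11723.9.test.myproj": "lhs.example.org",
}

LOCAL_HANDLE_SERVICES = {
    "lhs.example.org": {
        # handle -> handle information (key-value pairs)
        "11723.9.test.myproj/dataset-42": {
            "URL": "https://repo.example.org/dataset-42",
            "profile": "11723.1.test/profile-1",
        },
    },
}

DATA_TYPE_REGISTRY = {
    # profile PID -> profile definition (the keys a record should carry)
    "11723.1.test/profile-1": {"keys": ["URL", "profile"]},
}


def resolve(handle: str) -> dict:
    """Follow the three queries: prefix authority, local handle, DTR profile."""
    prefix = handle.split("/", 1)[0]
    # Q1: which Local Handle Service is the authority for this prefix?
    lhs_address = GLOBAL_HANDLE_SERVERS[prefix]
    # Q2: ask that Local Handle Service for the handle information.
    info = LOCAL_HANDLE_SERVICES[lhs_address][handle]
    # Q3: look up the profile definition named in the handle information.
    profile = DATA_TYPE_REGISTRY[info["profile"]]
    return {"lhs": lhs_address, "info": info, "profile": profile}
```

Calling `resolve("11723.9.test.myproj/dataset-42")` touches each tier once, which mirrors why the global tier can stay small while Local Handle Services number in the thousands.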
What should go into the PID Kernel Information?
PID Kernel Information is a small amount of information stored at the resolver (Local Handle Server) in the PID record of a PID
Inspiration: take the FAIR principles as guide: how far can PID Kernel Information aid in implementing FAIR?
Kernel Information is Cached
• By FAIR principle 1.1, a Local Handle Server is not a metadata repository, so it cannot serve as the authoritative source for any form of metadata for a data object
• Thus Kernel Information is a cached copy of metadata that is stored and stewarded elsewhere
• FAIR principle 1.1: The same Data Object should be re-findable at any point in time, thus Data Objects should be persistent, with emphasis on their metadata
Promising candidate for Kernel Information is Provenance
Imagine a world where PIDs identify just about everything:
-> Internet of Things
-> Movie clips
-> Smart city sensor data
-> Pages from digitized books
-> Baby food containers
Further imagine an Internet-scale data client that is handed a list of 100,000,000 PIDs.
How does the client quickly sift through the list to find research data objects?
Further suppose the client is able to winnow the list down to just research data objects; how does it then quickly discard fakes?
PID Kernel Information use case: a client filters a list of millions of PIDs to identify research data and makes a simple determination of trust
[Figure: Client, Handle System, and Data Type Registry Service]
Global Handle Registry
Local Handle Service [1000…5000]: stores PID kernel information
Data Type Registry Service: stores type definitions for kernel information
Q: prefix authority → Local Handle Service IP
Q: local handle → Handle information
Q: DTR with Profile PID → DTR Profile Definition
Client output: Filtered PIDs → Trusted research PIDs
A client working with PID Kernel Information looks at each PID in the list and accepts those that have:
-- a Kernel Information profile stored in the Data Type Registry (DTR),
-- a profile that is associated with RDA (in some unspecified manner),
-- PID Kernel Information holding a tiny amount of data provenance from which a basic sense of trust is derived
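The acceptance test can be sketched as a filter over kernel information records. This is a minimal illustration under stated assumptions: the record layout, the field names `profile` and `provenance`, the `RDA_PROFILES` set (standing in for the "unspecified" association with RDA), and all PIDs are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical: profile PIDs registered in the Data Type Registry.
DTR_PROFILES = {"11723.1.test/profile-1", "11723.1.test/profile-2"}
# Hypothetical: the subset of those profiles associated with RDA.
RDA_PROFILES = {"11723.1.test/profile-1"}

KernelInfo = Dict[str, str]


def is_trusted_research_pid(ki: Optional[KernelInfo]) -> bool:
    """Apply the three acceptance checks to one kernel information record."""
    if not ki:
        return False  # no kernel information at all
    profile = ki.get("profile")
    if profile not in DTR_PROFILES:
        return False  # profile is not stored in the DTR
    if profile not in RDA_PROFILES:
        return False  # profile is not associated with RDA
    return bool(ki.get("provenance"))  # some provenance must be present


def filter_pids(records: Dict[str, Optional[KernelInfo]]) -> List[str]:
    """Return the PIDs whose kernel information passes all three checks."""
    return [pid for pid, ki in records.items() if is_trusted_research_pid(ki)]
```

Each check reads only the small record held at the resolver, so in this sketch the client never contacts a repository while sifting the list.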
  
Kernel Information for FAIR Accessibility
• By FAIR principle 2, Kernel Information conveys accessibility information, making it easier to navigate direct data object access
• Includes privacy or legal restrictions on a data object that may limit access to, say, the object’s metadata alone.
FAIR Principle 2. Data is Accessible in that it can always be obtained by machines and humans
2.1 Upon appropriate authorization
2.2 Through a well-defined protocol
2.3 Thus, machines and humans alike will be able to judge the actual accessibility of each Data Object
PID Kernel Information use case: filter a list of a million PIDs to identify research data; make a simple determination of trust
[Figure: Client, Handle System, Data Type Registry Service, and Repository Access]
Global Handle Registry
Local Handle Service
Data Type Registry Service [1..10]
Q: prefix authority → Local Handle Service IP
Q: local handle → Handle information + PID Kernel Information
Q: DTR with Profile PID → DTR Profile Definition for PID Kernel Information
Repository Access: retrieve the data object as per the access and rights restrictions in PID KI
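One way to picture the repository access step is a client-side check made before contacting the repository. This is a sketch only: the `access` field, its three values, and the function name are assumptions for illustration and are not defined by the source.

```python
from enum import Enum


class Access(Enum):
    OPEN = "open"                    # object and metadata retrievable
    METADATA_ONLY = "metadata-only"  # restrictions limit access to metadata
    RESTRICTED = "restricted"        # authorization needed first


def plan_retrieval(kernel_info: dict, authorized: bool = False) -> str:
    """Decide what to request from the repository, using kernel info alone."""
    # Unknown or missing values are treated as restricted, the safe default.
    try:
        access = Access(kernel_info.get("access", "restricted"))
    except ValueError:
        access = Access.RESTRICTED
    if access is Access.OPEN or authorized:
        return "object"
    if access is Access.METADATA_ONLY:
        return "metadata"
    return "none"
```

The design choice here is that the decision uses only what is cached in the PID record, so a machine client can judge accessibility without a round trip to the repository.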
PID Kernel Information Summary
• Exploration driven by identifying and evaluating the minimal information that can go into Kernel Information to help make Data Objects FAIR and less dependent on the repository system to enforce FAIRness
• Long term goal: Smart data objects
• Kernel Information has potential to spawn a new ecosystem of data services for smart data objects
RPID testbed
• Suite of software services for use by community
– Data type registry (RDA)
– PIT API (RDA)
– Handle service
• Exploratory services
– PID Kernel Information
– Mapping CTS URNs to handles
– Packaging for use by others
• Help and advice
• User advisory group
[Figure: RPID Testbed, a 36-month testbed: Data Type Registry; Handle Service (prefix: 11723); Service Installation; Testing for Reproducibility]
Who can use the Testbed
The RPID testbed is open for research, education, non-profit, or pre-competitive use.
Summary: Foundational infrastructure for data sharing is FAIR-inspired Digital Object Architecture with PID Kernel Information
• In conclusion, this work proposes
– Level 1a data resolution: Digital Object Architecture [Kahn]
– Level 1b high level data filtering: PID Kernel Information
– Level 2: FAIR principles as data object layer
• Thus contributes to Open Science with foundational infrastructure enabling a new ecosystem of data services
• Follow work at:
– https://github.com/rpidproject
– RDA PID Kernel Information Working Group
– Reach us at rpid-l@iu.edu
Acknowledgements: this work was funded in part by the National Science Foundation under grants 1659310 and 1349002

More Related Content

What's hot

Clustering of Deep WebPages: A Comparative Study
Clustering of Deep WebPages: A Comparative StudyClustering of Deep WebPages: A Comparative Study
Clustering of Deep WebPages: A Comparative Study
ijcsit
 
Preservation Metadata
Preservation MetadataPreservation Metadata
Preservation Metadata
DigitalPreservationEurope
 
Ir 01
Ir   01Ir   01
Information Security in Big Data : Privacy and Data Mining
Information Security in Big Data : Privacy and Data MiningInformation Security in Big Data : Privacy and Data Mining
Information Security in Big Data : Privacy and Data Mining
wanani181
 
Convolutional Neural Networks
Convolutional Neural Networks Convolutional Neural Networks
Convolutional Neural Networks
MichaelRodriguesdosS1
 
data mining for security application
data mining for security applicationdata mining for security application
data mining for security applicationbharatsvnit
 
香港六合彩
香港六合彩香港六合彩
香港六合彩
shujia
 
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinaiDataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
datascienceiqss
 
2016 BE Final year Projects in chennai - 1 Crore Projects
2016 BE Final year Projects in chennai - 1 Crore Projects 2016 BE Final year Projects in chennai - 1 Crore Projects
2016 BE Final year Projects in chennai - 1 Crore Projects
1crore projects
 
Digital library and metadata
Digital library and metadataDigital library and metadata
Digital library and metadataramncsi
 
Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniques
Kausar Mukadam
 
EXPLAIN REMOTE ACCESS TO LIBRARY RESOURCES? DESCRIBE ANY ONE SOFTWARE AVAILAB...
EXPLAIN REMOTE ACCESS TO LIBRARY RESOURCES? DESCRIBE ANY ONE SOFTWARE AVAILAB...EXPLAIN REMOTE ACCESS TO LIBRARY RESOURCES? DESCRIBE ANY ONE SOFTWARE AVAILAB...
EXPLAIN REMOTE ACCESS TO LIBRARY RESOURCES? DESCRIBE ANY ONE SOFTWARE AVAILAB...
`Shweta Bhavsar
 
Data Sharing and the Polar Information Commons
Data Sharing and the Polar Information CommonsData Sharing and the Polar Information Commons
Data Sharing and the Polar Information CommonsKaitlin Thaney
 
Big Data Repository for Structural Biology: Challenges and Opportunities by P...
Big Data Repository for Structural Biology: Challenges and Opportunities by P...Big Data Repository for Structural Biology: Challenges and Opportunities by P...
Big Data Repository for Structural Biology: Challenges and Opportunities by P...
datascienceiqss
 
Electronic Discovery 101 - From ESI to the EDRM
Electronic Discovery 101 - From ESI to the EDRMElectronic Discovery 101 - From ESI to the EDRM
Electronic Discovery 101 - From ESI to the EDRM
Rob Robinson
 
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in CloudEnabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
IOSR Journals
 
Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...
Peter Conradie
 
"Analysis of Different Text Classification Algorithms: An Assessment "
"Analysis of Different Text Classification Algorithms: An Assessment ""Analysis of Different Text Classification Algorithms: An Assessment "
"Analysis of Different Text Classification Algorithms: An Assessment "
ijtsrd
 

What's hot (20)

Clustering of Deep WebPages: A Comparative Study
Clustering of Deep WebPages: A Comparative StudyClustering of Deep WebPages: A Comparative Study
Clustering of Deep WebPages: A Comparative Study
 
AIDA ICITET
AIDA ICITETAIDA ICITET
AIDA ICITET
 
Preservation Metadata
Preservation MetadataPreservation Metadata
Preservation Metadata
 
Ir 01
Ir   01Ir   01
Ir 01
 
Information Security in Big Data : Privacy and Data Mining
Information Security in Big Data : Privacy and Data MiningInformation Security in Big Data : Privacy and Data Mining
Information Security in Big Data : Privacy and Data Mining
 
Convolutional Neural Networks
Convolutional Neural Networks Convolutional Neural Networks
Convolutional Neural Networks
 
data mining for security application
data mining for security applicationdata mining for security application
data mining for security application
 
香港六合彩
香港六合彩香港六合彩
香港六合彩
 
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinaiDataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
DataTags: Sharing Privacy Sensitive Data by Michael Bar-sinai
 
2016 BE Final year Projects in chennai - 1 Crore Projects
2016 BE Final year Projects in chennai - 1 Crore Projects 2016 BE Final year Projects in chennai - 1 Crore Projects
2016 BE Final year Projects in chennai - 1 Crore Projects
 
Digital library and metadata
Digital library and metadataDigital library and metadata
Digital library and metadata
 
Research on ontology based information retrieval techniques
Research on ontology based information retrieval techniquesResearch on ontology based information retrieval techniques
Research on ontology based information retrieval techniques
 
EXPLAIN REMOTE ACCESS TO LIBRARY RESOURCES? DESCRIBE ANY ONE SOFTWARE AVAILAB...
EXPLAIN REMOTE ACCESS TO LIBRARY RESOURCES? DESCRIBE ANY ONE SOFTWARE AVAILAB...EXPLAIN REMOTE ACCESS TO LIBRARY RESOURCES? DESCRIBE ANY ONE SOFTWARE AVAILAB...
EXPLAIN REMOTE ACCESS TO LIBRARY RESOURCES? DESCRIBE ANY ONE SOFTWARE AVAILAB...
 
Data Sharing and the Polar Information Commons
Data Sharing and the Polar Information CommonsData Sharing and the Polar Information Commons
Data Sharing and the Polar Information Commons
 
Big Data Repository for Structural Biology: Challenges and Opportunities by P...
Big Data Repository for Structural Biology: Challenges and Opportunities by P...Big Data Repository for Structural Biology: Challenges and Opportunities by P...
Big Data Repository for Structural Biology: Challenges and Opportunities by P...
 
Electronic Discovery 101 - From ESI to the EDRM
Electronic Discovery 101 - From ESI to the EDRMElectronic Discovery 101 - From ESI to the EDRM
Electronic Discovery 101 - From ESI to the EDRM
 
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in CloudEnabling Use of Dynamic Anonymization for Enhanced Security in Cloud
Enabling Use of Dynamic Anonymization for Enhanced Security in Cloud
 
Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...Exploring Process Barriers to Release Public Sector Information in Local Gove...
Exploring Process Barriers to Release Public Sector Information in Local Gove...
 
Ediscovery 101
Ediscovery 101Ediscovery 101
Ediscovery 101
 
"Analysis of Different Text Classification Algorithms: An Assessment "
"Analysis of Different Text Classification Algorithms: An Assessment ""Analysis of Different Text Classification Algorithms: An Assessment "
"Analysis of Different Text Classification Algorithms: An Assessment "
 

Similar to Towards FAIR Open Science with PID Kernel Information: RPID Testbed

FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
Carole Goble
 
Komatsoulis internet2 executive track
Komatsoulis internet2 executive trackKomatsoulis internet2 executive track
Komatsoulis internet2 executive track
George Komatsoulis
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptx
Getu Tadele
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Anita de Waard
 
Data commons bonazzi bd2 k fundamentals of science feb 2017
Data commons bonazzi   bd2 k fundamentals of science feb 2017Data commons bonazzi   bd2 k fundamentals of science feb 2017
Data commons bonazzi bd2 k fundamentals of science feb 2017
Vivien Bonazzi
 
BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands
Vivien Bonazzi
 
Overview of Big Data by Sunny
Overview of Big Data by SunnyOverview of Big Data by Sunny
Overview of Big Data by Sunny
DignitasDigital1
 
Dynamic Data Analytics for the Internet of Things: Challenges and Opportunities
Dynamic Data Analytics for the Internet of Things: Challenges and OpportunitiesDynamic Data Analytics for the Internet of Things: Challenges and Opportunities
Dynamic Data Analytics for the Internet of Things: Challenges and Opportunities
PayamBarnaghi
 
DataCite and its Members: Connecting Research and Identifying Knowledge
DataCite and its Members: Connecting Research and Identifying KnowledgeDataCite and its Members: Connecting Research and Identifying Knowledge
DataCite and its Members: Connecting Research and Identifying Knowledge
ETH-Bibliothek
 
Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things Semantic technologies for the Internet of Things
Semantic technologies for the Internet of Things
PayamBarnaghi
 
From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent S...
From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent S...From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent S...
From Data Platforms to Dataspaces: Enabling Data Ecosystems for Intelligent S...
Edward Curry
 
Unit 2
Unit 2Unit 2
BrightTALK - Semantic AI
BrightTALK - Semantic AI BrightTALK - Semantic AI
BrightTALK - Semantic AI
Semantic Web Company
 
Intelligent Data Processing for the Internet of Things
Intelligent Data Processing for the Internet of Things Intelligent Data Processing for the Internet of Things
Intelligent Data Processing for the Internet of Things
PayamBarnaghi
 
John morrissey c3 dis fair working data.pptx
John morrissey c3 dis fair working data.pptxJohn morrissey c3 dis fair working data.pptx
John morrissey c3 dis fair working data.pptx
ARDC
 
How to make data more usable on the Internet of Things
How to make data more usable on the Internet of ThingsHow to make data more usable on the Internet of Things
How to make data more usable on the Internet of ThingsPayamBarnaghi
 
DataCite – Bridging the gap and helping to find, access and reuse data – Herb...
DataCite – Bridging the gap and helping to find, access and reuse data – Herb...DataCite – Bridging the gap and helping to find, access and reuse data – Herb...
DataCite – Bridging the gap and helping to find, access and reuse data – Herb...
OpenAIRE
 
Working with real world data
Working with real world dataWorking with real world data
Working with real world dataPayamBarnaghi
 
How to Make Your Content Smarter
How to Make Your Content SmarterHow to Make Your Content Smarter
How to Make Your Content Smarter
Bianca Pereira
 
FSCI Persistent Identifiers
FSCI Persistent IdentifiersFSCI Persistent Identifiers
FSCI Persistent Identifiers
ARDC
 

Similar to Towards FAIR Open Science with PID Kernel Information: RPID Testbed (20)

FAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practiceFAIRy stories: the FAIR Data principles in theory and in practice
FAIRy stories: the FAIR Data principles in theory and in practice
 
Komatsoulis internet2 executive track
Komatsoulis internet2 executive trackKomatsoulis internet2 executive track
Komatsoulis internet2 executive track
 
VODAN Africa IN.pptx
VODAN Africa IN.pptxVODAN Africa IN.pptx
VODAN Africa IN.pptx
 
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
Research Object Composer: A Tool for Publishing Complex Data Objects in the C...
 
Data commons bonazzi bd2 k fundamentals of science feb 2017
Data commons bonazzi   bd2 k fundamentals of science feb 2017Data commons bonazzi   bd2 k fundamentals of science feb 2017
Data commons bonazzi bd2 k fundamentals of science feb 2017
 
BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands BD2K and the Commons : ELIXR All Hands
BD2K and the Commons : ELIXR All Hands
 
Overview of Big Data by Sunny
Overview of Big Data by SunnyOverview of Big Data by Sunny
Overview of Big Data by Sunny
 
Dynamic Data Analytics for the Internet of Things: Challenges and Opportunities
Dynamic Data Analytics for the Internet of Things: Challenges and OpportunitiesDynamic Data Analytics for the Internet of Things: Challenges and Opportunities
Dynamic Data Analytics for the Internet of Things: Challenges and Opportunities
 
DataCite and its Members: Connecting Research and Identifying Knowledge
Towards FAIR Open Science with PID Kernel Information: RPID Testbed

  • 1. Towards FAIR Open Science with PID Kernel Information: the RPID Testbed. Beth Plale, School of Informatics, Computing and Engineering, Data To Insight Center, Indiana University. Basarim 2017, Istanbul, Turkey, 15 Sep 2017.
  • 2. The ideas expressed here have been shaped through conversations in the Research Data Alliance (RDA). Special thanks to Peter Wittenburg, Tobias Weigel, and Larry Lannom. The ideas are being put into action through a US NSF-funded project called the Robust PID (RPID) Testbed. Project partners include Beth Plale, Robert Quick, and Robert McDonald (Indiana University); Bridget Almas (Tufts University); and Larry Lannom (CNRI). The opinions expressed here are those of the author alone and do not represent the views of the US National Science Foundation.
  • 3. Scientific data today is baskets of apples across random orchards. Discovery is a game of blind man's bluff. Commitment to data as it ages is a mere hope. Cartoon credit: Auke Herrema
  • 4. The Internet is a worldwide network of connected computers. Each computer has an IP address that uniquely identifies the device on the network. Imagine a worldwide network of data objects. Data objects persist (until they don't). Objects are findable, accessible, interoperable, and usable (especially reusable).
  • 5. Guiding abstraction for data sharing: identifies entities and stakeholders; of interest to technologists and policy makers alike.
  • 6. Fecher B, Friesike S, Hebing M (2015) What Drives Academic Data Sharing? PLOS ONE 10(2): e0118053. https://doi.org/10.1371/journal.pone.0118053 http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0118053
  • 7. Fecher B, Friesike S, Hebing M (2015) What Drives Academic Data Sharing? PLOS ONE 10(2): e0118053. https://doi.org/10.1371/journal.pone.0118053 http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0118053 Figure annotation: this piece is actually a network.
  • 8. [Figure: Data Object Layer, with nodes A through G linked by relationships such as "is part of".] A network of independent, globally unique, and persistent Data Objects that have relationships between them that we should exploit.
  • 9. [Figure: Data Objects inside Repositories.] In reality, Data Objects reside in repositories, but they should not be completely controlled by repositories.
  • 10. Open science is an umbrella term for transparent science with ease of access to all products from beginning to end. Image credit: Gema Bueno de la Fuente, CC-BY
  • 11. Open science: there is a risk in defining open science too broadly. Open science must respect boundaries set by law or decency: licenses, copyright, human subjects privacy. Open science is increasingly connected to the FAIR principles: Findable, Accessible, Interoperable, Reusable.
  • 12. FAIR Guiding Principles. 1. To be Findable, any Data Object should be uniquely and persistently identifiable. 1.1. The same Data Object should be re-findable at any point in time, thus Data Objects should be persistent, with emphasis on their metadata. 1.2. A Data Object should minimally contain basic machine-actionable metadata that allows it to be distinguished from other Data Objects. 1.3. Identifiers for any concept used in Data Objects should therefore be Unique and Persistent.
  • 13. FAIR Guiding Principles, cont. 2. Data is Accessible in that it can always be obtained by machines and humans: 2.1. Upon appropriate authorization. 2.2. Through a well-defined protocol. 2.3. Thus, machines and humans alike will be able to judge the actual accessibility of each Data Object.
  • 14. FAIR Guiding Principles, cont. 3. Data Objects can be Interoperable only if: 3.1. (Meta)data is machine-actionable. 3.2. (Meta)data formats utilize shared vocabularies and/or ontologies. 3.3. (Meta)data within a Data Object should thus be both syntactically parseable and semantically machine-accessible.
  • 15. FAIR Guiding Principles, cont. 4. For Data Objects to be Re-usable, additional criteria are: 4.1. Data Objects should be compliant with principles 1-3. 4.2. (Meta)data should be sufficiently well-described and rich that it can be automatically (or with minimal human effort) linked or integrated, like-with-like, with other data sources. 4.3. Published Data Objects should refer to their sources with rich enough metadata and provenance to enable proper citation.
  • 16. Our vision • Starts with a data network based on the Digital Object Architecture (DOA), a distributed architecture of services spread worldwide that together identify and resolve digital objects • DOA was first espoused by Internet founder Robert Kahn in the mid-1980s • At its core, DOA is a network of Handle servers
  • 17. The Digital Object Architecture serves as base infrastructure only. DOA is silent on issues of modeling data objects themselves: their content, their relationship to their own metadata, and the relationships between data objects. For object modeling we turn to the FAIR principles and PID Kernel Information.
  • 18. Data Object Model based on FAIR principles. Data modeling questions to address: 1) What goes into a data object? 2) Should a data object include its metadata, or should the metadata be a new object, or both? 3) What kind of metadata should be considered? 4) What is the granularity of a data object? 5) Where does kernel information come in?
  • 19. Persistent IDs are the backbone of data sharing [primary and secondary use]
  • 20. • Persistent ID (PID) -- names a data object with a name that is globally unique -- the data object can be metadata, data, or a digital proxy to a physical object -- is persistent over time plale@indiana.edu
  • 21. PID makeup • Handles have a prefix assigned to a Local Handle Server (LHS) • The suffix is under control of the Local Handle Server • e.g., the RPID testbed assigns only temporary test handles: – 11723.1.test, 11723.2.test, ... 11723.8.test: assigned for internal use – 11723.9.test.<proj name>: assigned to projects; avoids collisions within the LHS namespace
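The prefix and suffix split on the PID makeup slide can be sketched in a few lines of Python. This is a minimal illustration, not RPID code: it assumes the standard Handle System form `prefix/suffix`, treats `11723.9.test.<proj name>` as a per-project prefix, and the function names and the `demo` project are hypothetical.

```python
def split_handle(handle: str) -> tuple[str, str]:
    """Split a handle into (prefix, suffix) at the first '/'."""
    prefix, sep, suffix = handle.partition("/")
    if not sep or not prefix or not suffix:
        raise ValueError(f"not a prefix/suffix handle: {handle!r}")
    return prefix, suffix


def project_prefix(project: str) -> str:
    """Build a per-project test prefix (assumed reading of the slide)."""
    return f"11723.9.test.{project}"


def mint_handle(project: str, local_name: str) -> str:
    """Combine a project prefix with a locally chosen suffix."""
    return f"{project_prefix(project)}/{local_name}"


if __name__ == "__main__":
    handle = mint_handle("demo", "obj-1")  # "demo" is a made-up project name
    print(handle)
    print(split_handle(handle))
```

Because each project mints suffixes only under its own prefix, two projects can reuse the same local name without colliding.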
  • 22. • The Handle System allows key-value information to be stored at a Local Handle Server -- names a Data Object with a name that is globally unique -- the Data Object can be metadata, data, or a digital proxy to a physical object -- is persistent over time
  • 23. [Figure: Handle resolution in a Digital Object Architecture. The Client (PIT API, SDK) queries the Global Handle Servers of the Handle System for the prefix authority and receives the Local Handle Service IP. It queries the Local Handle Service, which stores PID kernel information, with the local handle and receives the handle information. It queries the Data Type Registry Service, which stores type definitions for kernel information, with the Profile PID and receives the DTR Profile Definition (e.g., PID to Profile, URL to target). The Client turns filtered PIDs into trusted PIDs. Scale annotations: [1000…5000] LHS; [80…100] GHS; [1..10].]
  • 24. What should go into the PID Kernel Information? PID Kernel Information is a small amount of information stored at the resolver (Local Handle Server) in the PID record of a PID. Inspiration: take the FAIR principles as a guide: how far can PID Kernel Information aid in implementing FAIR?
  • 25. Kernel Information is Cached • By FAIR principle 1.1, a Local Handle Server is not a metadata repository, so it cannot serve as the authoritative source for any form of metadata for a data object • Thus Kernel Information is a cached copy of metadata that is stored and stewarded elsewhere • FAIR principle 1.1: The same Data Object should be re-findable at any point in time, thus Data Objects should be persistent, with emphasis on their metadata
  • 26. A promising candidate for Kernel Information is provenance. Imagine a world where PIDs identify just about everything: -> Internet of Things -> Movie clips -> Smart city sensor data -> Pages from digitized books -> Baby food containers
  • 27. Further imagine an Internet-scale data client that is handed a list of 100,000,000 PIDs. How does the client quickly sift through the list to find research data objects? Further suppose the client is able to winnow the list down to just research data objects: how does it then quickly discard fakes?
  • 28. [Figure: PID Kernel Information use case: the Client filters a list of millions of PIDs to identify research data and makes a simple determination of trust. The Client queries the Global Handle Registry of the Handle System for the prefix authority and receives the Local Handle Service IP; queries the Local Handle Service ([1000…5000]), which stores PID kernel information, with the local handle and receives the handle information; and queries the Data Type Registry Service, which stores type definitions for kernel information, with the Profile PID and receives the DTR Profile Definition. Filtered PIDs become trusted research PIDs.]
  • 29. A client working with PID Kernel Information looks at each PID in the list and accepts those for which: -- a Kernel Information profile is stored in the Data Type Registry (DTR), -- that profile is associated with RDA (in some unspecified manner), -- the PID Kernel Information holds a tiny amount of data provenance from which a basic sense of trust is derived
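The three acceptance tests on the client slide can be sketched as a filter over a list of PIDs. Everything concrete here is an assumption made for illustration: the record field names `profile` and `provenance`, the idea of representing "associated with RDA" as a set of endorsed profile PIDs (the slide leaves the manner unspecified), and the `lookup` callable that stands in for fetching kernel information from a local handle service.

```python
from typing import Callable, Iterable, Iterator, Optional


def is_trusted_research_pid(
    record: Optional[dict],
    registered_profiles: set[str],
    rda_profiles: set[str],
) -> bool:
    """Apply the three checks from the slide to one kernel information record."""
    if not record:
        return False
    profile = record.get("profile")
    if profile not in registered_profiles:  # profile stored in the DTR
        return False
    if profile not in rda_profiles:         # profile associated with RDA
        return False
    return bool(record.get("provenance"))   # some provenance is present


def filter_pids(
    pids: Iterable[str],
    lookup: Callable[[str], Optional[dict]],
    registered_profiles: set[str],
    rda_profiles: set[str],
) -> Iterator[str]:
    """Yield only the PIDs whose kernel information passes all checks."""
    for pid in pids:
        if is_trusted_research_pid(lookup(pid), registered_profiles, rda_profiles):
            yield pid


if __name__ == "__main__":
    records = {
        "p/1": {"profile": "dtr/rda-kip", "provenance": "derived from p/0"},
        "p/2": {"profile": "dtr/other", "provenance": "unknown"},
        "p/3": {"profile": "dtr/rda-kip"},
    }
    kept = filter_pids(records, records.get, {"dtr/rda-kip", "dtr/other"}, {"dtr/rda-kip"})
    print(list(kept))
```

Written as a generator, the filter never holds the full list in memory, which matters if the input really is on the order of 100,000,000 PIDs.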
  • 30. Kernel Information for FAIR Accessibility • By FAIR principle 2, Kernel Information conveys accessibility information, thus making it easier to navigate direct data object access • Includes privacy or legal restrictions on a data object that may limit access to, say, the object's metadata alone. FAIR Principle 2. Data is Accessible in that it can always be obtained by machines and humans: 2.1. Upon appropriate authorization. 2.2. Through a well-defined protocol. 2.3. Thus, machines and humans alike will be able to judge the actual accessibility of each Data Object.
  • 31. [Figure: PID Kernel Information use case: filter a list of a million PIDs to identify research data; make a simple determination of trust. The Client queries the Global Handle Registry of the Handle System for the prefix authority and receives the Local Handle Service IP; queries the Local Handle Service with the local handle and receives the handle information plus PID Kernel Information; and queries the Data Type Registry Service ([1..10]) with the Profile PID and receives the DTR Profile Definition for PID Kernel Information. Repository Access: retrieve the data object as per the access and rights restriction in the PID KI.]
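The repository access step, in which a client consults the access and rights restriction in the PID Kernel Information before going to the repository, might look like the sketch below. The `access` field and its three values are hypothetical; the talk does not define a vocabulary for access restrictions, so this only illustrates the decision a client could make from a cached record.

```python
from enum import Enum


class Access(Enum):
    OPEN = "open"
    METADATA_ONLY = "metadata-only"
    CLOSED = "closed"


def plan_retrieval(kernel_info: dict, authorized: bool = False) -> str:
    """Return what the client may fetch: 'object', 'metadata', or 'none'."""
    try:
        access = Access(kernel_info.get("access", "closed"))
    except ValueError:
        # Unknown values fall back to the most restrictive setting.
        access = Access.CLOSED
    if access is Access.OPEN or authorized:
        return "object"
    if access is Access.METADATA_ONLY:
        return "metadata"
    return "none"


if __name__ == "__main__":
    print(plan_retrieval({"access": "open"}))
    print(plan_retrieval({"access": "metadata-only"}))
    print(plan_retrieval({"access": "metadata-only"}, authorized=True))
```

Making the decision from kernel information lets a client skip repository round trips for objects it could never retrieve anyway.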
  • 32. PID Kernel Information Summary • Exploration is driven by identifying and evaluating the minimal information that can go into Kernel Information to help make Data Objects FAIR and less dependent on the repository system to enforce FAIRness • Long-term goal: smart data objects • Kernel Information has the potential to spawn a new ecosystem of data services for smart data objects
  • 33. RPID testbed • Suite of software services for use by the community – Data type registry (RDA) – PIT API (RDA) – Handle service • Exploratory services – PID Kernel Information – Mapping CTS URNs to handles – Packaging for use by others • Help and advice • User advisory group
  • 34. [Figure: RPID Testbed: Data Type Registry; Handle Service (prefix 11723); service installation; testing for reproducibility; 36-month testbed.]
  • 35. Who can use the Testbed: The RPID testbed is open for research, education, non-profit, or pre-competitive use.
  • 36. Fecher B, Friesike S, Hebing M (2015) What Drives Academic Data Sharing? PLOS ONE 10(2): e0118053. https://doi.org/10.1371/journal.pone.0118053 http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0118053 Summary: Foundational infrastructure for data sharing is FAIR-inspired Digital Object Archit with PID Kernel Info
  • 37. • In conclusion, this work proposes: – Level 1a, data resolution: Digital Object Architecture [Kahn] – Level 1b, high-level data filtering: PID Kernel Information – Level 2: FAIR principles as data object layer • It thus contributes to Open Science with foundational infrastructure enabling a new ecosystem of data services • Follow the work at: – https://github.com/rpidproject – RDA PID Kernel Information Working Group – Reach us at rpid-l@iu.edu Acknowledgements: this work was funded in part by the National Science Foundation under grants 1659310 and 1349002.