Research Data Services:
Towards a Framework for Incidental Collaboratories

Anita de Waard
VP Research Data Collaborations, Elsevier RDS
Jericho, VT, USA
  
Brief bio:
•  Background:
    –  Low-temperature physics (Leiden & Moscow)
    –  Joined Elsevier in 1988 as publisher in solid state physics
    –  1991: arXiv => publishers will go out of business very soon!
•  1997-now: Disruptive Technologies Director, focus on better representation of scientific knowledge:
    –  Identifying key knowledge elements in articles (linguistics thesis)
    –  Building claim-evidence networks (through collaborations)
    –  Helping build communities to accelerate rate of change (Force11)
•  Starting 1/1/2013: VP Research Data Collaborations - why?
    –  Douglas Engelbart’s thinking: connect minds!
    –  My (non-biologist’s) understanding of biology:
  
The big problem in biology:
Interspecies variability: A specimen is not a species
Gene expression variability: Knowing genes is not knowing how they are expressed
Microbiome: An animal is an ecosystem
Systems biology: A whole is more than the sum of its parts

Reductionist science doesn’t work for living systems!

Image: http://en.wikipedia.org/wiki/File:Duck_of_Vaucanson.jpg
  
Statistics to the rescue!
With enough observations, trends and anomalies can be detected:
•  “Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far.”
            The Human Microbiome Project Consortium, Structure, function and diversity of the healthy human microbiome, Nature 486, 207–214 (14 June 2012) doi:10.1038/nature11234
•  “The large sample size – 4,298 North Americans of European descent and 2,217 African Americans – has enabled the researchers to mine down into the human genome.”
            Nidhi Subbaraman, Nature News, 28 November 2012, High-resolution sequencing study emphasizes importance of rare variants in disease.
•  “A profile unique for a DNA sample source is obtained … a series of numbers are generated which can be used as a bar code for that DNA source. A registry of bar codes would make it easy to compare DNA samples”
            Roland M. Nardone, Ph.D., Eradication of Cross-Contaminated Cell Lines: A Call for Action, http://www.sivb.org/publicPolicy_Eradication.pdf
  
Enable ‘incidental collaboratories’:
•  Collect: store data at the level of the experiment:
    –  Accessible through a single interface
    –  Add enough metadata to know what was done/seen
•  Connect: allow analyses over:
    –  Similar experiment types
    –  Experiments done with/on similar biological ‘things’:
         •  Species, strains, systems, cells
         •  Anatomical components (e.g. spleen, hypothalamus)
         •  Antibodies, biomarkers, bioactive chemicals, etc.
•  Keep:
    –  Long-term preservation of data and software (Olive)
    –  Fulfill Data Management Plan requirements
    –  Allow gated access, if needed
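
The Collect and Connect steps can be illustrated with a minimal sketch. It assumes each experiment is stored as one record tagged with the biological ‘things’ it involves. The record fields, lab names, and function names below are hypothetical and are not part of any RDS system described in these slides.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExperimentRecord:
    """Hypothetical record: one experiment plus the metadata describing it."""
    record_id: str
    lab: str
    experiment_type: str
    entities: frozenset = field(default_factory=frozenset)  # species, anatomy, antibodies...


def connect_by_entity(records):
    """Collect the ids of all records that mention each biological 'thing'."""
    index = defaultdict(set)
    for record in records:
        for entity in record.entities:
            index[entity].add(record.record_id)
    return index


def cross_lab_matches(records):
    """Return the entities that show up in records from more than one lab."""
    labs_per_entity = defaultdict(set)
    for record in records:
        for entity in record.entities:
            labs_per_entity[entity].add(record.lab)
    return {entity for entity, labs in labs_per_entity.items() if len(labs) > 1}


# Made-up example records, for illustration only.
records = [
    ExperimentRecord("exp-1", "lab-a", "patch-clamp", frozenset({"mouse", "hypothalamus"})),
    ExperimentRecord("exp-2", "lab-b", "immunostaining", frozenset({"mouse", "spleen"})),
    ExperimentRecord("exp-3", "lab-a", "patch-clamp", frozenset({"rat", "hypothalamus"})),
]

shared_across_labs = cross_lab_matches(records)
```

Here `cross_lab_matches` flags the entities on which two labs could collaborate without having planned to, which is one way to read ‘incidental’. A real service would presumably map free-text labels to shared identifiers before comparing them.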
  	
  
Problem: biological research is quite insular
•  Biology is small: because objects/equipment are 10^-5 to 10^2 m, you can work alone (‘King’ and ‘subjects’).
•  Biology is messy: it doesn’t happen behind a terminal.
•  Biology is competitive: different people with similar skill sets, vying for the same grants.
•  In summary: it does not promote inherent collaboration (vs., for instance, big physics or astronomy).

[Figure: diagram labeled Prepare, Observe, Analyze, Communicate, Ponder]
  
Try to pop the ‘lab bubble’!
Labs go from being information islands, to being ‘sensors in a network’.

[Figure: diagram with three sets of Prepare, Analyze, Communicate labels, three Observations labels, and one Think label]
  
Some objections, and rebuttals:
•  Objection: “But our lab notebooks are all on paper”
    –  Rebuttal: Develop smart phone/tablet apps for data input
•  Objection: “I need to see a direct benefit from something I spend my time on”
    –  Rebuttal: Develop ‘data manipulation dashboard’ for PI to allow better access to full experimental output for his/her lab
•  Objection: “I am afraid other people might scoop my discoveries”
    –  Rebuttal: Develop intra-lab data communication systems first and allow timed/granular data export
•  Objection: “I want things to be peer reviewed before I expose them”
    –  Rebuttal: Allow reviewers access to experimental database before publication (of data or paper)
•  Objection: “I don’t really trust anyone else’s data – well, except for the guys I went to Grad School with…”
    –  Rebuttal: Add a social networking component to this data repository so you know who (to the individual) created that data point.
  	
  
Elsevier Research Data Services: Goals
1.  Help increase the amount of data shared from the lab, enabling incidental collaboratories
2.  Help increase the value of the data shared by increasing annotation, normalization, and provenance, enabling enhanced interoperability
3.  Help measure and deliver credit for shared data to the researchers, the institute, and the funding body, enabling more sustainable platforms
  
RDS Guiding Principles:
•  In principle, all open data stays open and URLs, front end etc. stay where they are (i.e. with repository)
•  Collaboration is tailored to data repositories’ unique needs/interests and of a ‘service-model’ type:
    –  Aspects where collaboration is needed are discussed
    –  A collaboration plan is drawn up using a Service-Level Agreement: agree on time, conditions, etc.
    –  All communication, finance, IPR etc. is completely transparent at all times.
•  Very small (2/3 people) department; immediate communication; instant deployment of ideas
  
RDS Approach:
•  Collaborate and build on relationships with data repositories (life science, earth science, others)
•  Integrate with other content sources, if possible
•  Build annotation and standardisation tools and processes to implement this
•  Develop next-generation infrastructure solutions for back-end integration
•  Explore creative revenue opportunities
  
NIF Antibody Registry:
Problem:
•  95 antibodies were identified in 8 papers
•  52 did not contain enough information to determine the antibody used
•  Some provided details in another paper
•  Failed to give species, vendor, catalog #
Solution #1:
•  Journals ask authors to provide antibody catalog number
•  Link to NIF Registry from manufacturers/vendors’ sites
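
The check a journal might run on an antibody citation can be sketched as follows. This is an illustration only: it assumes the three details named in the problem list (species, vendor, catalog number) are the required fields. The field names, function names, and sample values are made up and do not reflect the NIF Registry's actual schema or API.

```python
# Assumed required fields, taken from the "Failed to give" bullet in the slide.
REQUIRED_FIELDS = ("species", "vendor", "catalog_number")


def missing_fields(citation):
    """List the required fields that are absent or blank in a citation dict."""
    return [
        name
        for name in REQUIRED_FIELDS
        if not str(citation.get(name) or "").strip()
    ]


def is_identifiable(citation):
    """A citation counts as identifiable only if no required field is missing."""
    return not missing_fields(citation)


# Hypothetical citations; the values are placeholders, not real products.
citations = [
    {"target": "protein-x", "species": "rabbit",
     "vendor": "ExampleVendor", "catalog_number": "EX-0001"},
    {"target": "protein-y", "species": "mouse",
     "vendor": "", "catalog_number": None},
]

incomplete = [c["target"] for c in citations if not is_identifiable(c)]
```

A submission system could show the output of `missing_fields` to the author before the paper is accepted, rather than leaving readers to guess which antibody was used.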
  
Solution #2:
•  Pilot with a lab:
  	
  
Let’s start with the Urban Lab
•  Getting antibodies
•  And messy bits
•  From the notebook
•  Into Nathan Urban’s command center
•  By providing
    –  7” Tablets
    –  Links to IgorPro
    –  A dashboard UI
  
My questions to you:
•  Thoughts on this approach:
     –  In principle?
     –  In practice?
•  Do you see serious hurdles:
     –  Are we overlapping with other initiatives; if so, are we complementary?
     –  How does this connect to libraries/local repositories?
     –  Are there sensitivities/pain points we are overlooking?
•  Where to start:
     –  How to collaborate?
     –  Who to talk to – funding agencies, societies: who else?
     –  Thoughts on data repositories/platforms to connect to?
  	
  
Your questions to me?
a.dewaard@elsevier.com
http://elsatglabs.com/labs/anita/
http://www.slideshare.net/anitawaard

Thanks go to:
•  Anita Bandrowski and Maryann Martone, NIF
•  Nathan Urban, Shreejoy Tripathy, CMU
•  David Marques, SVP RDS
  

The Narrative Structure of Research Articles, or, Why Science is Like a Fairy...Anita de Waard
 


Towards Incidental Collaboratories; Research Data Services

  • 1. Research Data Services: Towards a Framework for Incidental Collaboratories
      Anita de Waard
      VP Research Data Collaborations, Elsevier RDS
      Jericho, VT, USA
  • 2. Brief bio:
      • Background:
          – Low-temperature physics (Leiden & Moscow)
          – Joined Elsevier in 1988 as publisher in solid state physics
          – 1991: ArXiV => publishers will go out of business very soon!
      • 1997-now: Disruptive Technologies Director, focus on better representation of scientific knowledge:
          – Identifying key knowledge elements in articles (linguistics thesis)
          – Building claim-evidence networks (through collaborations)
          – Help build communities to accelerate rate of change (Force11)
      • Starting 1/1/2013: VP Research Data Collaborations - why?
          – Douglas Engelbart’s thinking: connect minds!
          – My (non-biologist’s) understanding of biology:
  • 3. The big problem in biology:
      Interspecies variability: A specimen is not a species
      Gene expression variability: Knowing genes is not knowing how they are expressed
      Microbiome: An animal is an ecosystem
      Systems biology: A whole is more than the sum of its parts
      Reductionist science doesn’t work for living systems!
      http://en.wikipedia.org/wiki/File:Duck_of_Vaucanson.jpg
  • 4. Statistics to the rescue!
      With enough observations, trends and anomalies can be detected:
      • “Here we present resources from a population of 242 healthy adults sampled at 15 or 18 body sites up to three times, which have generated 5,177 microbial taxonomic profiles from 16S ribosomal RNA genes and over 3.5 terabases of metagenomic sequence so far.”
        The Human Microbiome Project Consortium, Structure, function and diversity of the healthy human microbiome, Nature 486, 207–214 (14 June 2012) doi:10.1038/nature11234
      • “The large sample size — 4,298 North Americans of European descent and 2,217 African Americans — has enabled the researchers to mine down into the human genome.”
        Nidhi Subbaraman, Nature News, 28 November 2012, High-resolution sequencing study emphasizes importance of rare variants in disease.
      • “A profile unique for a DNA sample source is obtained … a series of numbers are generated which can be used as a bar code for that DNA source. A registry of bar codes would make it easy to compare DNA samples”
        Roland M. Nardone, Ph.D., Eradication of Cross-Contaminated Cell Lines: A Call for Action, http://www.sivb.org/publicPolicy_Eradication.pdf
  • 5. Enable ‘incidental collaboratories’:
      • Collect: store data at the level of the experiment:
          – Accessible through a single interface
          – Add enough metadata to know what was done/seen
      • Connect: allow analyses over:
          – Similar experiment types
          – Experiments done with/on similar biological ‘things’:
              • Species, strains, systems, cells
              • Anatomical components (e.g. spleen, hypothalamus)
              • Antibodies, biomarkers, bioactive chemicals, etc.
      • Keep:
          – Long-term preservation of data and software (Olive)
          – Fulfill Data Management Plan requirements
          – Allow gated access, if needed
  • 6. Problem: biological research is quite insular
      • Biology is small: because objects/equipment are 10^-5 – 10^2 m, you can work alone (‘King’ and ‘subjects’).
      • Biology is messy: it doesn’t happen behind a terminal.
      • Biology is competitive: different people with similar skill sets, vying for the same grants.
      • In summary: it does not promote inherent collaboration (vs., for instance, big physics or astronomy).
      [Figure labels: Prepare, Ponder, Observe, Communicate, Analyze]
  • 7. Try to pop the ‘lab bubble’!
      Labs go from being information islands, to being ‘sensors in a network’.
      [Figure labels: Prepare, Analyze, Communicate, Think, Observations]
  • 8. Some objections, and rebuttals:
      • Objection: “But our lab notebooks are all on paper”
        Rebuttal: Develop smart phone/tablet apps for data input
      • Objection: “I need to see a direct benefit from something I spend my time on”
        Rebuttal: Develop ‘data manipulation dashboard’ for PI to allow better access to full experimental output for his/her lab
      • Objection: “I am afraid other people might scoop my discoveries”
        Rebuttal: Develop intra-lab data communication systems first and allow timed/granular data export
      • Objection: “I want things to be peer reviewed before I expose them”
        Rebuttal: Allow reviewers access to experimental database before publication (of data or paper)
      • Objection: “I don’t really trust anyone else’s data – well, except for the guys I went to Grad School with…”
        Rebuttal: Add a social networking component to this data repository so you know who (to the individual) created that data point.
  • 9. Elsevier Research Data Services: Goals
      1. Help increase the amount of data shared from the lab, enabling incidental collaboratories
      2. Help increase the value of the data shared by increasing annotation, normalization, provenance, enabling enhanced interoperability
      3. Help measure and deliver credit for shared data, the researchers, the institute, and the funding body, enabling more sustainable platforms
  • 10. RDS Guiding Principles:
      • In principle, all open data stays open and URLs, front end etc. stay where they are (i.e. with repository)
      • Collaboration is tailored to data repositories’ unique needs/interests and of a ‘service-model’ type:
          – Aspects where collaboration is needed are discussed
          – A collaboration plan is drawn up using a Service-Level Agreement: agree on time, conditions, etc.
          – All communication, finance, IPR etc. is completely transparent at all times.
      • Very small (2/3 people) department; immediate communication; instant deployment of ideas
  • 11. RDS Approach:
      • Collaborate and build on relationships with data repositories (life science, earth science, others)
      • Integrate with other content sources, if possible
      • Build annotation and standardisation tools and processes to implement this
      • Develop next-generation infrastructure solutions for back-end integration
      • Explore creative revenue opportunities
  • 12. NIF Antibody Registry:
      Problem:
      • 95 antibodies were identified in 8 papers
      • 52 did not contain enough information to determine the antibody used
      • Some provided details in another paper
      • Failed to give species, vendor, catalog #
      Solution #1:
      • Journals ask authors to provide antibody catalog nr
      • Link to NIF Registry from manufacturers/vendors’ sites
      Solution #2:
      • Pilot with a lab:
  • 13. Let’s start with the Urban Lab
      • Getting antibodies
      • And messy bits
      • From the notebook
      • Into Nathan Urban’s command center
      • By providing
          – 7” Tablets
          – Links to IgorPro
          – A dashboard UI
  • 14. My questions to you:
      • Thoughts on this approach:
          – In principle?
          – In practice?
      • Do you see serious hurdles:
          – Are we overlapping with other initiatives; if so, are we complementary?
          – How does this connect to libraries/local repositories?
          – Are there sensitivities/pain points we are overlooking?
      • Where to start:
          – How to collaborate?
          – Who to talk to – funding agencies, societies: who else?
          – Thoughts on data repositories/platforms to connect to?
  • 15. Your questions to me?
      a.dewaard@elsevier.com
      http://elsatglabs.com/labs/anita/
      http://www.slideshare.net/anitawaard
      Thanks go to:
      • Anita Bandrowski and Maryann Martone, NIF
      • Nathan Urban, Shreejoy Tripathy, CMU
      • David Marques, SVP RDS