SPARC digital repositories meeting
          baltimore, md
       17 november 2008

            john wilbanks
creative com...
i.a.n.a.r.e
i.a.n.a.r.e

(i am not a repository expert)
why is there a disconnect between planning
    to share and the actual sharing?
disruptive processes can’t be
     planned in advance.
disruptive processes can’t be
      planned in advance.

planned innovation tends to be
    incremental, and slow.
disruptive processes can’t be
      planned in advance.

planned innovation tends to be
    incremental, and slow.

     ....
process change comes more slowly than
      information product change
process change comes more slowly than
         information product change




    disruptive processes on the network
come...
1.
stable systems are resistant to change on
              multiple levels.
©
creative expression
the container, not the facts.
the container, not the facts.

but   © locks the container.
IGFBP-5 plays a role in the
regulation of cellular senescence
via a p53-dependent pathway
and in aging-associated
vascular...
IGFBP-5 plays a role in the
regulation of cellular senescence
via a p53-dependent pathway
and in aging-associated
vascular...
indexing: disallowed.




 http://orpheus-1.ucsd.edu/acq/license/cdlelsevier2004.pdf
the pre-existing system has blocks in
place to prevent process disruption.
creative
 work?
what do these
 ideas mean in
   a world of
integrated data?
40 minutes per year
nih policy.
i can has
repository staff?
Dorothea Salo, http://cavlec.yarinareth.net/2008/10/31/miniature-disasters-and-minor-catastrophes/
tension between meeting the demands
of adding content and providing
services
the existing system is robust against
              disruption



the existing system is robust against
              disr...
this is how evolved systems resist change:
 at multiple levels, with multiple fail-safes.
2.
reports from the front lines: building a
  commons is really, really hard.
Open Access Content
“running code”
c
>1000 journals under CC

  image from the public library of science
  licensed to the public under CC-BY 3.0
running policy code
    (w. SPARC)
+
+        +         +
+   is it legal?   +
+        +         +
a protocol, not a license
conflicts with the protection instinct
conflicts with the protection instinct

the protection instinct is frequently an instinct to
               protect “freedo...
solves the legal problem
but not the container
      problem.
building a web for data:
  the “semantic web”
making computers understand links between documents



                     links to
       Web page                    We...
making computers understand relationships between concepts




                          causes
        drinking coffee   ...
http://ontology.foo.org/causes



                                          causes
          drinking coffee              ...
use the web to
           integrate information
            from different places
             and different names
“coffee...
(too much work for
      coffee)
(distributed, networked
approaches start to look
       pretty good)
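A minimal sketch of the coffee slides in Python, for illustration only. The `http://ontology.foo.org/` namespace is the slides' own placeholder, and the `coffee` and `causes` URIs appear on the slides; the underscored spellings of the two concept URIs and the helper names `resolve` and `effects_of` are assumptions made for this sketch.

```python
# Placeholder namespace taken from the slides; not a real ontology.
BASE = "http://ontology.foo.org/"

# Different names, from different places, all pointing at one shared URI.
LABEL_TO_URI = {
    "coffee": BASE + "coffee",
    "cafe": BASE + "coffee",
    "kopi": BASE + "coffee",
}

# A statement is a (subject, predicate, object) triple of URIs.
# The underscored concept URIs are assumed spellings for this sketch.
TRIPLES = [
    (BASE + "drinking_coffee", BASE + "causes", BASE + "feel_awake"),
]


def resolve(label):
    """Return the shared URI for a local name, or None if it is unknown."""
    return LABEL_TO_URI.get(label.strip().lower())


def effects_of(subject_uri):
    """List every object linked to subject_uri by the 'causes' predicate."""
    causes = BASE + "causes"
    return [o for s, p, o in TRIPLES if s == subject_uri and p == causes]
```

Once every local name resolves to the same URI, statements made in different places about "cafe" and "kopi" can be merged without anyone agreeing on a single word first. Maintaining that mapping by hand is the "too much work for coffee" part.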
web 2.0, science 3.0, what about making
          Google work better?
over 200
   years at
one paper/day
what you want is
    a list of genes.

not a list of documents.
Open Source
Data Integration
a repository of ontologies,
namespaces, and integrated
         databases.
DRD1, 1812      adenylate cyclase activation
ADRB2, 154      adenylate cyclase activation
ADRB2, 154      arrestin mediate...
e pluribus unum.
we can transform complex queries into links


            prefix go: <http://purl.org/obo/owl/GO#>
    prefix rdfs: <http:...
we can transform complex queries into links
http://hcls1.csail.mit.edu:8890/sparql/?query=prefix%20go%3A%20%3Chttp%3A%2F%2F...
we can transform complex queries into links
we can help scholars “remix” queries
  prefix go: <http://purl.org/obo/owl/GO#>
  prefix rdfs: <http://www.w3.org/2000/01/rd...
we can build a corpus of queries as links
we can re-use cultural tools for scholarship
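The query-as-link slides can be sketched with Python's standard library. This is an illustration under stated assumptions: the endpoint, the `format` and `maxrows` parameters, and the two MeSH identifiers come from the slides, but the query below is a shortened stand-in for the slide's query, the helper names are hypothetical, and the endpoint may no longer be reachable (nothing here contacts it).

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Endpoint as shown on the slide; this sketch only builds links to it.
ENDPOINT = "http://hcls1.csail.mit.edu:8890/sparql/"

# Shortened stand-in for the slide's query (assumption: one graph only).
QUERY = """prefix mesh: <http://purl.org/commons/record/mesh/>
prefix sc: <http://purl.org/science/owl/sciencecommons/>
select ?gene
where
{ graph <http://purl.org/commons/hcls/pubmesh>
  { ?paper ?p mesh:D017966 .
    ?article sc:identified_by_pmid ?paper.
    ?gene sc:describes_gene_or_gene_product_mentioned_by ?article.
  }
}"""


def query_to_link(query, maxrows=50):
    """Percent-encode a query into a link, as on the slide."""
    params = {"query": query, "format": "", "maxrows": maxrows}
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return ENDPOINT + "?" + urlencode(params, quote_via=quote)


def link_to_query(link):
    """Recover the readable query text from a link."""
    return parse_qs(urlsplit(link).query)["query"][0]


def remix(link, old_term, new_term):
    """Swap one term in a linked query and return the new link."""
    return query_to_link(link_to_query(link).replace(old_term, new_term))


link = query_to_link(QUERY)
# Pyramidal neurons (D017966) swapped for cancer (D009369), as on the slides.
cancer_link = remix(link, "mesh:D017966", "mesh:D009369")
```

Because the whole query lives in the URL, it can be bookmarked, cited, emailed, or edited like any other link, which is what makes a corpus of queries possible.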
3.
two futures: a network of repositories, or a
              bunch of islands?
simple + open = WIN
content

 code

physical
knowledge

 content

  code

 physical
open copyright, balanced incentives, and
        distributed workloads
Some faculty have contributed to their IR as open
access advocates who believed in the importance of
freely accessible sch...
what questions can only a network
    of populated IRs answer?
4.
institutions have to provide a stable
   foundation for the knowledge web.
process revolutions: the network
                         Huntington’s


Parkinson’s




                                 ...
institutional revolutions: the network
                      Huntington’s


 Parkinson’s




                             ...
the infrastructure for this is very, very shaky.
prefix dc: <http://purl.org/dc/elements/1.1/>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
prefix rdfs: <http://www.w3...
what are the odds that the organizations making the
 namespaces will be here in 50 years? 100 years?
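One way to make the dependency concrete is to count how many distinct hosts the prefix declarations rely on. The sketch below uses the six prefixes from this slide; the function name `namespace_hosts` and the regular expression are assumptions made for illustration, not part of the talk.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# The six prefix declarations shown on the slide.
PREFIXES = """prefix dc: <http://purl.org/dc/elements/1.1/>
prefix skos: <http://www.w3.org/2004/02/skos/core#>
prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
prefix owl: <http://www.w3.org/2002/07/owl#>
prefix sc: <http://purl.org/science/owl/sciencecommons/>
prefix foaf: <http://xmlns.com/foaf/0.1/>"""


def namespace_hosts(text):
    """Count prefix declarations per host name."""
    uris = re.findall(r"prefix\s+\w+:\s*<([^>]+)>", text)
    return Counter(urlsplit(uri).hostname for uri in uris)


hosts = namespace_hosts(PREFIXES)
for host, count in hosts.most_common():
    print(host, count)
```

Six namespaces resolve through only three hosts, so every query written against these prefixes depends on whoever keeps those three names alive.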
Huntington’s
Huntington’s


Parkinson’s




                                ALS




     Multiple
     Sclerosis


                 Aut...
Huntington’s


Parkinson’s




                 library
                                          ALS




     Multiple
  ...
“In any case, it is clear that a library containing all possible
 books, arranged at random, is equivalent (as a source of...
exponential content growth
our brain capacity
[chart: vertical axis 0 to 5.00, horizontal axis 1990 to 2002]
but if we can work together...
conclusion?
don’t wait.
use existing systems.
hack around problems.
create new ways to measure.
invest in your repository staff.
free as in speech
free as in speech
 free as in beer
free as in speech
 free as in beer
free as in a puppy
free as in speech
                                         free as in beer

Average Cost Of 100 Pound Dog
                ...
thank you

wilbanks@creativecommons.org

  http://sciencecommons.org
Sparc DR Meeting
Published in: Education, Technology
Transcript of "Sparc DR Meeting"

  1. 1. SPARC digital repositories meeting baltimore, md 17 november 2008 john wilbanks creative commons / science commons
  2. 2. i.a.n.a.r.e
  3. 3. i.a.n.a.r.e (i am not a repository expert)
  4. 4. why is there a disconnect between planning to share and the actual sharing?
  5. 5. disruptive processes can’t be planned in advance.
  6. 6. disruptive processes can’t be planned in advance. planned innovation tends to be incremental, and slow.
  7. 7. disruptive processes can’t be planned in advance. planned innovation tends to be incremental, and slow. ...and not innovative.
  8. 8. process change comes more slowly than information product change
  9. 9. process change comes more slowly than information product change disruptive processes on the network come from people hacking, not planning to hack.
  10. 10. 1. stable systems are resistant to change on multiple levels.
  11. 11. © creative expression
  12. 12. the container, not the facts.
  13. 13. the container, not the facts. but © locks the container.
  14. 14. IGFBP-5 plays a role in the regulation of cellular senescence via a p53-dependent pathway and in aging-associated vascular diseases
  15. 15. IGFBP-5 plays a role in the regulation of cellular senescence via a p53-dependent pathway and in aging-associated vascular diseases
  16. 16. indexing: disallowed. http://orpheus-1.ucsd.edu/acq/license/cdlelsevier2004.pdf
  17. 17. the pre-existing system has blocks in place to prevent process disruption.
  18. 18. creative work?
  19. 19. what do these ideas mean in a world of integrated data?
  20. 20. 40 minutes per year
  21. 21. nih policy.
  22. 22. i can has repository staff?
  23. 23. Dorothea Salo, http://cavlec.yarinareth.net/2008/10/31/miniature-disasters-and-minor-catastrophes/
  24. 24. tension between meeting the demands of adding content and providing services
  25. 25. the existing system is robust against disruption the existing system is robust against disruption
  26. 26. this is how evolved systems resist change: at multiple levels, with multiple fail-safes.
  27. 27. 2. reports from the front lines: building a commons is really, really hard.
  28. 28. Open Access Content
  29. 29. “running code”
  30. 30. c >1000 journals under CC image from the public library of science licensed to the public under CC-BY 3.0
  31. 31. running policy code (w. SPARC)
  32. 32. +
  33. 33. + + + + is it legal? + + + +
  34. 34. a protocol, not a license
  35. 35. conflicts with the protection instinct
  36. 36. conflicts with the protection instinct the protection instinct is frequently an instinct to protect “freedom”
  37. 37. solves the legal problem
  38. 38. but not the container problem.
  39. 39. building a web for data: the “semantic web”
  40. 40. making computers understand links between documents links to Web page Web page
  41. 41. making computers understand relationships between concepts causes drinking coffee feel awake
  42. 42. http://ontology.foo.org/causes causes drinking coffee feel awake http://ontology.foo.org/drinking coffee http://ontology.foo.org/feel awake h
  43. 43. use the web to integrate information from different places and different names “coffee” “cafe” coffee http://ontology.foo.org/coffee “kopi”
  44. 44. (too much work for coffee)
  45. 45. (distributed, networked approaches start to look pretty good)
  46. 46. web 2.0, science 3.0, what about making Google work better?
  47. 47. over 200 years at one paper/day
  48. 48. what you want is a list of genes. not a list of documents.
  49. 49. Open Source Data Integration
  50. 50. a repository of ontologies, namespaces, and integrated databases.
  51. 51. DRD1, 1812      adenylate cyclase activation
          ADRB2, 154      adenylate cyclase activation
          ADRB2, 154      arrestin mediated desensitization of G-protein coupled receptor protein signaling pathway
          DRD1IP, 50632   dopamine receptor signaling pathway
          DRD1, 1812      dopamine receptor, adenylate cyclase activating pathway
          DRD2, 1813      dopamine receptor, adenylate cyclase inhibiting pathway
          GRM7, 2917      G-protein coupled receptor protein signaling pathway
          GNG3, 2785      G-protein coupled receptor protein signaling pathway
          GNG12, 55970    G-protein coupled receptor protein signaling pathway
          DRD2, 1813      G-protein coupled receptor protein signaling pathway
          ADRB2, 154      G-protein coupled receptor protein signaling pathway
          CALM3, 808      G-protein coupled receptor protein signaling pathway
          HTR2A, 3356     G-protein coupled receptor protein signaling pathway
          DRD1, 1812      G-protein signaling, coupled to cyclic nucleotide second messenger
          SSTR5, 6755     G-protein signaling, coupled to cyclic nucleotide second messenger
          MTNR1A, 4543    G-protein signaling, coupled to cyclic nucleotide second messenger
          CNR2, 1269      G-protein signaling, coupled to cyclic nucleotide second messenger
          HTR6, 3362      G-protein signaling, coupled to cyclic nucleotide second messenger
          GRIK2, 2898     glutamate signaling pathway
          GRIN1, 2902     glutamate signaling pathway
          GRIN2A, 2903    glutamate signaling pathway
          GRIN2B, 2904    glutamate signaling pathway
          ADAM10, 102     integrin-mediated signaling pathway
          GRM7, 2917      negative regulation of adenylate cyclase activity
          LRP1, 4035      negative regulation of Wnt receptor signaling pathway
          ADAM10, 102     Notch receptor processing
          ASCL1, 429      Notch signaling pathway
          HTR2A, 3356     serotonin receptor signaling pathway
          ADRB2, 154      transmembrane receptor protein tyrosine kinase activation (dimerization)
          PTPRG, 5793     transmembrane receptor protein tyrosine kinase signaling pathway
          EPHA4, 2043     transmembrane receptor protein tyrosine kinase signaling pathway
          NRTN, 4902      transmembrane receptor protein tyrosine kinase signaling pathway
          CTNND1, 1500    Wnt receptor signaling pathway
  52. 52. e pluribus unum.
  53. 53. we can transform complex queries into links
      (slide callouts: Mesh: Pyramidal Neurons; Pubmed: Journal Articles; Entrez Gene: Genes; GO: Signal Transduction)

      prefix go: <http://purl.org/obo/owl/GO#>
      prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      prefix owl: <http://www.w3.org/2002/07/owl#>
      prefix mesh: <http://purl.org/commons/record/mesh/>
      prefix sc: <http://purl.org/science/owl/sciencecommons/>
      prefix ro: <http://www.obofoundry.org/ro/ro.owl#>

      select ?genename ?processname
      where
      {  graph <http://purl.org/commons/hcls/pubmesh>
           { ?paper ?p mesh:D017966 .
             ?article sc:identified_by_pmid ?paper.
             ?gene sc:describes_gene_or_gene_product_mentioned_by ?article.
           }
         graph <http://purl.org/commons/hcls/goa>
           { ?protein rdfs:subClassOf ?res.
             ?res owl:onProperty ro:has_function.
             ?res owl:someValuesFrom ?res2.
             ?res2 owl:onProperty ro:realized_as.
             ?res2 owl:someValuesFrom ?process.
         graph <http://purl.org/commons/hcls/20070416/classrelations>
           {{?process <http://purl.org/obo/owl/obo#part_of> go:GO_0007166}
             union
            {?process rdfs:subClassOf go:GO_0007166 }}
             ?protein rdfs:subClassOf ?parent.
             ?parent owl:equivalentClass ?res3.
             ?res3 owl:hasValue ?gene.
            }
         graph <http://purl.org/commons/hcls/gene>
           { ?gene rdfs:label ?genename }
         graph <http://purl.org/commons/hcls/20070416>
           { ?process rdfs:label ?processname}
      }
  54. 54. we can transform complex queries into links http://hcls1.csail.mit.edu:8890/sparql/?query=prefix%20go%3A%20%3Chttp%3A%2F%2Fpurl.org%2Fobo%2Fowl%2FGO%23%3E%0Aprefix%20rdfs%3A %20%3Chttp%3A%2F%2Fwww.w3.org%2F2000%2F01%2Frdf-schema%23%3E%0Aprefix%20owl%3A%20%3Chttp%3A%2F%2Fwww.w3.org%2F2002% 2F07%2Fowl%23%3E%0Aprefix%20mesh%3A%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Frecord%2Fmesh%2F%3E%0Aprefix%20sc%3A%20% 3Chttp%3A%2F%2Fpurl.org%2Fscience%2Fowl%2Fsciencecommons%2F%3E%0Aprefix%20ro%3A%20%3Chttp%3A%2F%2Fwww.obofoundry.org%2Fro %2Fro.owl%23%3E%0A%0Aselect%20%3Fgenename%20%3Fprocessname%0Awhere%0A%7B%20%20graph%20%3Chttp%3A%2F%2Fpurl.org% 2Fcommons%2Fhcls%2Fpubmesh%3E%0A%20%20%20%20%20%7B%20%3Fpaper%20%3Fp%20mesh%3AD017966%20.%0A%20%20%20%20%20%20% 20%3Farticle%20sc%3Aidentified_by_pmid%20%3Fpaper.%0A%20%20%20%20%20%20%20%3Fgene%20sc% 3Adescribes_gene_or_gene_product_mentioned_by%20%3Farticle.%0A%20%20%20%20%20%7D%0A%20%20%20graph%20%3Chttp%3A%2F% 2Fpurl.org%2Fcommons%2Fhcls%2Fgoa%3E%0A%20%20%20%20%20%7B%20%3Fprotein%20rdfs%3AsubClassOf%20%3Fres.%0A%20%20%20%20% 20%20%20%3Fres%20owl%3AonProperty%20ro%3Ahas_function.%0A%20%20%20%20%20%20%20%3Fres%20owl%3AsomeValuesFrom%20%3Fres2.% 0A%20%20%20%20%20%20%20%3Fres2%20owl%3AonProperty%20ro%3Arealized_as.%0A%20%20%20%20%20%20%20%3Fres2%20owl% 3AsomeValuesFrom%20%3Fprocess.%0A%20%20%20graph%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2F20070416%2Fclassrelations%3E %0A%20%20%20%20%20%7B%7B%3Fprocess%20%3Chttp%3A%2F%2Fpurl.org%2Fobo%2Fowl%2Fobo%23part_of%3E%20go%3AGO_0007166%7D% 0A%20%20%20%20%20%20%20union%0A%20%20%20%20%20%20%7B%3Fprocess%20rdfs%3AsubClassOf%20go%3AGO_0007166%20%7D%7D%0A %20%20%20%20%20%20%20%3Fprotein%20rdfs%3AsubClassOf%20%3Fparent.%0A%20%20%20%20%20%20%20%3Fparent%20owl% 3AequivalentClass%20%3Fres3.%0A%20%20%20%20%20%20%20%3Fres3%20owl%3AhasValue%20%3Fgene.%0A%20%20%20%20%20%20%7D%0A% 20%20%20graph%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2Fgene%3E%0A%20%20%20%20%20%7B%20%3Fgene%20rdfs%3Alabel%20% 
3Fgenename%20%7D%0A%20%20%20graph%20%3Chttp%3A%2F%2Fpurl.org%2Fcommons%2Fhcls%2F20070416%3E%0A%20%20%20%20%20%7B% 20%3Fprocess%20rdfs%3Alabel%20%3Fprocessname%7D%0A%7D&format=&maxrows=50
  55. 55. we can transform complex queries into links
  56. 56. we can help scholars “remix” queries
      (slide callouts: Mesh: Cancer; GO: Ribosomal Protein)

      prefix go: <http://purl.org/obo/owl/GO#>
      prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      prefix owl: <http://www.w3.org/2002/07/owl#>
      prefix mesh: <http://purl.org/commons/record/mesh/>
      prefix sc: <http://purl.org/science/owl/sciencecommons/>
      prefix ro: <http://www.obofoundry.org/ro/ro.owl#>

      select ?genename ?processname
      where
      {  graph <http://purl.org/commons/hcls/pubmesh>
           { ?paper ?p mesh:D009369 .
             ?article sc:identified_by_pmid ?paper.
             ?gene sc:describes_gene_or_gene_product_mentioned_by ?article.
           }
         graph <http://purl.org/commons/hcls/goa>
           { ?protein rdfs:subClassOf ?res.
             ?res owl:onProperty ro:has_function.
             ?res owl:someValuesFrom ?res2.
             ?res2 owl:onProperty ro:realized_as.
             ?res2 owl:someValuesFrom ?process.
         graph <http://purl.org/commons/hcls/20070416/classrelations>
           {{?process <http://purl.org/obo/owl/obo#part_of> go:GO_0006610}
             union
            {?process rdfs:subClassOf go:GO_0006610 }}
             ?protein rdfs:subClassOf ?parent.
             ?parent owl:equivalentClass ?res3.
             ?res3 owl:hasValue ?gene.
            }
         graph <http://purl.org/commons/hcls/gene>
           { ?gene rdfs:label ?genename }
         graph <http://purl.org/commons/hcls/20070416>
           { ?process rdfs:label ?processname}
      }
  57. 57. we can build a corpus of queries as links
  58. 58. we can re-use cultural tools for scholarship
  59. 59. 3. two futures: a network of repositories, or a bunch of islands?
  60. 60. simple + open = WIN
  61. 61. content code physical
  62. 62. knowledge content code physical
  63. 63. open copyright, balanced incentives, and distributed workloads
  64. 64. Some faculty have contributed to their IR as open access advocates who believed in the importance of freely accessible scholarship for their research community or their university. Perhaps most important to the viability of IRs, however, were the faculty who found that the IR could solve a particular information problem they faced in the everyday practice of scholarship.
  65. 65. what questions can only a network of populated IRs answer?
  66. 66. 4. institutions have to provide a stable foundation for the knowledge web.
  67. 67. process revolutions: the network Huntington’s Parkinson’s ALS Multiple Sclerosis Autism
  68. 68. institutional revolutions: the network Huntington’s Parkinson’s ALS Multiple Sclerosis Autism
  69. 69. the infrastructure for this is very, very shaky.
  70. 70. prefix dc: <http://purl.org/dc/elements/1.1/> prefix skos: <http://www.w3.org/2004/02/skos/core#> prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> prefix owl: <http://www.w3.org/2002/07/owl#> prefix sc: <http://purl.org/science/owl/sciencecommons/> prefix foaf: <http://xmlns.com/foaf/0.1/>
  71. 71. what are the odds that the organizations making the namespaces will be here in 50 years? 100 years?
  72. 72. Huntington’s
  73. 73. Huntington’s Parkinson’s ALS Multiple Sclerosis Autism
  74. 74. Huntington’s Parkinson’s library ALS Multiple Sclerosis Autism
  75. 75. “In any case, it is clear that a library containing all possible books, arranged at random, is equivalent (as a source of information) to a library containing zero books.” http://en.wikipedia.org/wiki/The_Library_of_Babel
  76. 76. exponential content growth
  77. 77. our brain capacity 5.00 3.75 2.50 1.25 0 1990 1994 1998 2002
  78. 78. but if we can work together...
  79. 79. conclusion?
  80. 80. don’t wait.
  81. 81. use existing systems.
  82. 82. hack around problems.
  83. 83. create new ways to measure.
  84. 84. invest in your repository staff.
  85. 85. free as in speech
  86. 86. free as in speech free as in beer
  87. 87. free as in speech free as in beer free as in a puppy
  88. 88. free as in speech free as in beer Average Cost Of 100 Pound Dog free as in a puppy Over A Year Good Quality Dog Food $70 x 12 = $840 Dog Accessories (collar, leash, etc.) $30 Dog Toys $30 - $50 Vaccines $35 Flea, Tick, & Heartworm Prevention $320 Dog Treats $200 Boarding $100 - $200 (at $15 - $20 a day) Emergency Costs $0 - $2500 or more Total $1375 or much more
  89. 89. thank you wilbanks@creativecommons.org http://sciencecommons.org