CENDI wilbanks
Talk given to the meeting of the CENDI group in early November 2013. CENDI is a volunteer-powered membership organization that serves the federal information community - that is, all those who create, manage, aggregate, organize, and provide access to federally-funded data and publications resulting from the nation’s $150 billion annual investment in federal R&D. Member organizations represent a cross-section of federal data and publication providers, including libraries, data centers, aggregators, information technology developers, and content management providers.

Transcript of "CENDI wilbanks"

  1. 1. the policy environment. it is not sufficient.
  2. http://www.systemswiki.org/images/8/8a/Wisdom.png
  3. “is it open?” is perhaps not the right frame.
  4. accessibility, adaptability, ease of mastery, leverage
  5. accessibility: EASY TO USE, NO OPEN LICENSE. adaptability, ease of mastery, leverage
  6. 17
  7. 19
  8. accessibility: NO OPEN LICENSE, DOWNLOAD AVAILABLE, DOCUMENTATION IN PDF. adaptability, ease of mastery, leverage
  9. 2. doing research in the open: early returns. it is not sufficient.
  10. “how accurately can we predict if a female breast cancer survivor will develop a second tumor?”
  11. may the best (statistical) model win
  12. code sharing a prerequisite.
  13. accuracy of model jumped three orders of magnitude in nine days.
  14. 76% accurate. 27
  15. (not a biologist) 28
  16. 21 february 2013, 17 april 2013, ongoing...
  17. SHOW ME THE CODE!
  18. ...
  19. ...
  20. ...
  21. ...
  22. ...
  23. if we don’t have the article in machinable form with rights to transform? doesn’t happen.
  24. can we predict clinical utility from genetics of arthritis?
  25. can we predict scores on alzheimer’s cognitive tests from existing data?
  26. accessibility 25 THREE OPTIONS TO DOWNLOAD, NO CLEAR LICENSE, PRIVACY RESTRICTIONS, METADATA 25 ease of mastery 0 adaptability 25 25 leverage
  27. accessibility: IMPACT OF PRIVATE INTERVENTION. adaptability, ease of mastery, leverage
  28. 68 core projects
  29. 248 researchers
  30. 28 institutions
  31. 1070 datasets
  32. 1723 results
  33. Omberg, et al. Nature Genetics
  34. colorectal cancer subtyping
  35. [figure: analysis groups (A-F), datasets (1-6), subtypes]
  36. [figure: analysis groups (A-G, ...), datasets (1-6), subtypes]
  37. analysis groups: G
  38. [figure: analysis groups (A-G, ...), datasets (1-6), subtypes]
  39. 3. research and culture are on a collision course, driven by data.
  40. tension between anonymity and utility.
  41. “more like plutonium than gold”
  42. tension between expectation and reuse.
  43. 68% want their data shared for science
  44. tension between value of individual and value of aggregate.
  45. $.50 to $2.50 for SSN, birthdate, etc.
  46. $5 to $15 for credit, background checks.
  47. ~40 records for $2100
  48. tension between “research” data and “consumer” data.
  49. https://www.scienceexchange.com/
  50. it’s likely that we will end up with a data network effect of some sort.
  51. a. the incremental institution.
  52. b. the walled garden.
  53. c. big networks of small things.
  54. thank you! @wilbanks wilbanks@nitrd.gov