Deepak Singh, Ph.D.
 
 
 
 
Picture via Eole under a CC-BY-NC-SA license
?
Via Reavel under a CC-BY-NC-ND license
biz dev manager
 
 
resizable compute capacity
scalable web sites
number crunching
but
life science industry
software
informatics
scientific programmer
product manager
strategist
 
 
 
opinions
lots of opinions
 
career choices
software development
informatics
computing
data
open data
http://mndoci.com
http://c2cbio.com or on iTunes
http://bioscreencast.com
By jasarcadia under a CC-BY-NC-ND license
A meme (pronounced /miːm/) consists of any idea or behavior that can pass from one person to another by learning or imitation
 
big data
collective intelligence
the new science
By ~Prescott under a CC-BY-NC license
datasets
many datasets
Pfam, GenBank, Ensembl, PDB, many others
 
manageable
download
 
 
 
 
data management is not data storage
smart
 
context
Via Nature Reviews Cancer
technology
technology ? ? ? ?
technology technology technology technology
Back of the room
listening
toxicologists
 
 
experiment design
holistic
systems biology
the s*&t hits the fan
Image courtesy Matt Wood
 
 
 
genome #1
$3 billion
15 years
 
1000 genomes
http://www.1000genomes.org/
By bitterlysweet under a CC-BY-NC-ND license
75 TB / week
600 GB – 6 TB / run
200 TB drive
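A quick back-of-the-envelope sketch of what these figures imply. It assumes the 75 TB / week rate is sustained and reads "200 TB drive" as the total storage on hand; both are interpretations of the slides, not stated facts, and the function names are made up for illustration.

```python
# Figures taken from the slides above. Treating 200 TB as the total
# available capacity and 75 TB/week as a constant rate are assumptions.
WEEKLY_OUTPUT_TB = 75.0
STORAGE_TB = 200.0


def weeks_until_full(capacity_tb, rate_tb_per_week):
    """Weeks before a fixed capacity fills at a constant ingest rate."""
    return capacity_tb / rate_tb_per_week


def yearly_volume_pb(rate_tb_per_week, weeks_per_year=52):
    """Data produced in a year, in decimal petabytes (1 PB = 1000 TB)."""
    return rate_tb_per_week * weeks_per_year / 1000.0


weeks = weeks_until_full(STORAGE_TB, WEEKLY_OUTPUT_TB)
print(f"{STORAGE_TB:.0f} TB fills in {weeks:.1f} weeks (about {weeks * 7:.0f} days)")
print(f"One year at this rate: {yearly_volume_pb(WEEKLY_OUTPUT_TB):.1f} PB")
```

Under those assumptions the storage lasts less than three weeks, which is the capacity-planning problem the following slides turn to.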
schema
too big to fit on a wall
implications
Via Barack Obama under a CC-BY-NC-SA license
 
utilization
capacity planning
data availability
data access
collaboration
computation
typical informatics workflow
 
 
 
 
 
distribute everything
distributed data
distributed computing
Via bionicteaching under a CC-BY license
Via bionicteaching under a CC-BY license
services everywhere
data services
application services
api
available everywhere
available all the time
 
 
 
 
 
 
 
 
sensors
adverse event reporting
research streaming
 
computing everywhere
Via Laughing Squid under a CC-BY-NC-ND license
 
 
collective intelligence is a shared or group intelligence that emerges from the collaboration and competition of many individuals.
 
networked future of science
 
protective
A biologist would rather share their toothbrush than their (gene) names -- Mike Ashburner (Cambridge)
 
 
wisdom
look elsewhere
Wherever you work most of the smart people are somewhere else -- Bill Joy
TIMTOWTDI
data
data finds the data (Source: Jeff Jonas)
data finds the data, then people find people (Source: Jon Udell)
important
world wide web
giant global graph
 
search
traverse link graph
people
 
present
future
data in context
linked data
the artist formerly known as
the semantic web
 
entity extraction
 
follow the graph
let the data find the data
and then
people will find the people
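A minimal sketch of "follow the graph": a breadth-first walk over linked records that ends by surfacing the people attached to the data. The graph, the identifiers, and the `follow_graph` helper are all hypothetical, made up for illustration; real linked data would use URIs and RDF rather than a Python dictionary.

```python
from collections import deque

# Hypothetical toy graph: each record lists the records it links to.
LINKS = {
    "paper:1": ["gene:A", "dataset:X"],
    "gene:A": ["protein:P"],
    "dataset:X": ["gene:A", "person:alice"],
    "protein:P": ["person:bob"],
    "person:alice": [],
    "person:bob": [],
}


def follow_graph(start, links):
    """Breadth-first traversal; returns records in the order they are reached."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbour in links.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return order


reached = follow_graph("paper:1", LINKS)
# Once the data has found the data, the people connected to it fall out.
people = [record for record in reached if record.startswith("person:")]
print(reached)
print(people)
```

Starting from a single paper, the walk reaches the gene, dataset, and protein records first, and only then the two people linked to them.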
information overload
 
filter failure
human trust networks
many ways
scientific social networks
 
why?
put people first
 
communities around data
http://ecolicommunity.org
http://ebird.org/content/ebird/
micro-communities
 
 
 
little segue
“Bursty Work”
loosely distributed collaborations
computational problems
 
 
 
back on track
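I define Web 2.0 as the design of systems that harness network effects to get better the more people use them, or more colloquially, as “harnessing collective intelligence.” This includes explicit network-enabled collaboration, to be sure, but it should encompass every way that people connected to a network create synergistic effects -- Tim O’Reilly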
web as platform
data driven platform
people driven platform
 
 
 
bayesian filter
find relevant information
huge amounts of data
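One way a "bayesian filter" for finding relevant information could look: a small naive Bayes text classifier with add-one smoothing. This is a sketch under stated assumptions, not the system the talk refers to; the class name, the two labels, and the training sentences are invented for illustration, and both labels must be trained before scoring.

```python
import math
from collections import Counter


class NaiveBayesFilter:
    """Tiny naive Bayes classifier over whitespace-separated words."""

    LABELS = ("relevant", "irrelevant")

    def __init__(self):
        self.word_counts = {label: Counter() for label in self.LABELS}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(text.lower().split())
        self.doc_counts[label] += 1

    def score(self, text, label):
        """Log prior plus summed log likelihoods, with add-one smoothing."""
        vocab = set()
        for counts in self.word_counts.values():
            vocab.update(counts)
        total_words = sum(self.word_counts[label].values())
        log_prob = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in text.lower().split():
            count = self.word_counts[label][word]
            log_prob += math.log((count + 1) / (total_words + len(vocab)))
        return log_prob

    def is_relevant(self, text):
        return self.score(text, "relevant") > self.score(text, "irrelevant")


# Invented training examples, purely illustrative.
nb = NaiveBayesFilter()
nb.train("genome sequencing data pipeline", "relevant")
nb.train("protein structure data", "relevant")
nb.train("conference parking information", "irrelevant")
nb.train("cafeteria lunch menu", "irrelevant")

print(nb.is_relevant("genome sequencing data"))
print(nb.is_relevant("lunch menu"))
```

With more labelled examples the same structure scales to larger streams, though a production filter would need better tokenization and evaluation than this toy.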
architect for innovation
google visualization api
structured data
multiple sources
connected to the web
platform
create
share
re-use
visualizations
create
share
re-use
create
share
re-use
 
 
mashups
 
 
collect
analyze
remix
repurpose
only way
open data
obey web standards
xml
json
rdf
all this stuff
new models
research
collaboration
business
exciting times
Via The Opportunity Agenda under a CC-BY-NC-SA license
the door is open
take the step
Acknowledgements: Matt Wood, Carole Goble, Larry Lessig, The Biogang
 
Science Big, Science Connected

Talk at Virginia Tech, Nov 14, 2008

Published in: Technology, Education

Transcript of "Science Big, Science Connected"

  1. 1. Deepak Singh, Ph.D.
  2. 6. Picture via Eole under a CC-BY-NC-SA license
  3. 7. ?
  4. 8. Via Reavel under a CC-BY-NC-ND license
  5. 9. biz dev manager
  6. 12. resizable compute capacity
  7. 13. scalable web sites
  8. 14. number crunching
  9. 15. but
  10. 16. life science industry
  11. 17. software
  12. 18. informatics
  13. 19. scientific programmer
  14. 20. product manager
  15. 21. strategist
  16. 25. opinions
  17. 26. lots of opinions
  18. 28. career choices
  19. 29. software development
  20. 30. informatics
  21. 31. computing
  22. 32. data
  23. 33. open data
  24. 34. http://mndoci.com
  25. 35. http://c2cbio.com or on iTunes
  26. 36. http://bioscreencast.com
  27. 37. By jasarcadia under a CC-BY-NC-ND license
  28. 38. A meme (pronounced /miːm/) consists of any idea or behavior that can pass from one person to another by learning or imitation
  29. 40. big data
  30. 41. collective intelligence
  31. 42. the new science
  32. 43. By ~Prescott under a CC-BY-NC license
  33. 44. datasets
  34. 45. many datasets
  35. 46. Pfam, GenBank, Ensembl, PDB, many others
  36. 48. manageable
  37. 49. download
  38. 54. data management is not data storage
  39. 55. smart
  40. 57. context
  41. 58. Via Nature Reviews Cancer
  42. 59. technology
  43. 60. technology ? ? ? ?
  44. 61. technology technology technology technology
  45. 62. Back of the room
  46. 63. listening
  47. 64. toxicologists
  48. 67. experiment design
  49. 68. holistic
  50. 69. systems biology
  51. 70. the s*&t hits the fan
  52. 71. Image courtesy Matt Wood
  53. 75. genome #1
  54. 76. $3 billion
  55. 77. 15 years
  56. 79. 1000 genomes
  57. 80. http://www.1000genomes.org/
  58. 81. By bitterlysweet under a CC-BY-NC-ND license
  59. 82. 75 TB / week
  60. 83. 600 GB – 6 TB / run
  61. 84. 200 TB drive
  62. 85. schema
  63. 86. too big to fit on a wall
  64. 87. implications
  65. 88. Via Barack Obama under a CC-BY-NC-SA license
  66. 90. utilization
  67. 91. capacity planning
  68. 92. data availability
  69. 93. data access
  70. 94. collaboration
  71. 95. computation
  72. 96. typical informatics workflow
  73. 102. distribute everything
  74. 103. distributed data
  75. 104. distributed computing
  76. 105. Via bionicteaching under a CC-BY license
  77. 106. Via bionicteaching under a CC-BY license
  78. 107. services everywhere
  79. 108. data services
  80. 109. application services
  81. 110. api
  82. 111. available everywhere
  83. 112. available all the time
  84. 121. sensors
  85. 122. adverse event reporting
  86. 123. research streaming
  87. 125. computing everywhere
  88. 126. Via Laughing Squid under a CC-BY-NC-ND license
  89. 129. collective intelligence is a shared or group intelligence that emerges from the collaboration and competition of many individuals.
  90. 131. networked future of science
  91. 133. protective
  92. 134. A biologist would rather share their toothbrush than their (gene) names -- Mike Ashburner (Cambridge)
  93. 137. wisdom
  94. 138. look elsewhere
  95. 139. Wherever you work most of the smart people are somewhere else -- Bill Joy
  96. 140. TIMTOWTDI
  97. 141. data
  98. 142. data finds the data (Source: Jeff Jonas)
  99. 143. data finds the data, then people find people (Source: Jon Udell)
  100. 144. important
  101. 145. world wide web
  102. 146. giant global graph
  103. 148. search
  104. 149. traverse link graph
  105. 150. people
  106. 152. present
  107. 153. future
  108. 154. data in context
  109. 155. linked data
  110. 156. the artist formerly known as
  111. 157. the semantic web
  112. 159. entity extraction
  113. 161. follow the graph
  114. 162. let the data find the data
  115. 163. and then
  116. 164. people will find the people
  117. 165. information overload
  118. 167. filter failure
  119. 168. human trust networks
  120. 169. many ways
  121. 170. scientific social networks
  122. 172. why?
  123. 173. put people first
  124. 175. communities around data
  125. 176. http://ecolicommunity.org
  126. 177. http://ebird.org/content/ebird/
  127. 178. micro-communities
  128. 182. little segue
  129. 183. “Bursty Work”
  130. 184. loosely distributed collaborations
  131. 185. computational problems
  132. 189. back on track
  133. 190. I define Web 2.0 as the design of systems that harness network effects to get better the more people use them, or more colloquially, as “harnessing collective intelligence.” This includes explicit network-enabled collaboration, to be sure, but it should encompass every way that people connected to a network create synergistic effects -- Tim O’Reilly
  134. 191. web as platform
  135. 192. data driven platform
  136. 193. people driven platform
  137. 197. bayesian filter
  138. 198. find relevant information
  139. 199. huge amounts of data
  140. 200. architect for innovation
  141. 201. google visualization api
  142. 202. structured data
  143. 203. multiple sources
  144. 204. connected to the web
  145. 205. platform
  146. 206. create
  147. 207. share
  148. 208. re-use
  149. 209. visualizations
  150. 210. create
  151. 211. share
  152. 212. re-use
  153. 213. create
  154. 214. share
  155. 215. re-use
  156. 218. mashups
  157. 221. collect
  158. 222. analyze
  159. 223. remix
  160. 224. repurpose
  161. 225. only way
  162. 226. open data
  163. 227. obey web standards
  164. 228. xml
  165. 229. json
  166. 230. rdf
  167. 231. all this stuff
  168. 232. new models
  169. 233. research
  170. 234. collaboration
  171. 235. business
  172. 236. exciting times
  173. 237. Via The Opportunity Agenda under a CC-BY-NC-SA license
  174. 238. the door is open
  175. 239. take the step
  176. 240. Acknowledgements: Matt Wood, Carole Goble, Larry Lessig, The Biogang