Into the Wonderful
Towards a Virtual Institute
you are here
Data
Lots of data
Lots of data, lots of people
Lots of data, lots of people, lots of compute
Lots of data, lots of people, lots of compute, lots of uses
Lots of data, lots of people, lots of compute, lots of uses, lots and lots and lots and lots...
Trillionics
A platform for science
1    Get

2   Select

3   Work

4   Save
Work is the killer app
get here quickly
Work = publications
Problematic for complex data
1   Get: flat files / databases

2   Select: scripts / directories

3   Work: interesting

4   Save: flat files / databases
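The four steps above can be sketched as a chain of small functions. This is only an illustration under stated assumptions: the flat-file contents, the column names, the quality threshold and the "work" performed (a mean) are all made up for the example and do not come from the slides.

```python
import csv
import io

# Hypothetical flat-file contents; a real pipeline would read a file or query a database.
RAW = "sample,quality\ns1,12\ns2,35\ns3,40\n"

def get(text):
    """Get: parse flat-file text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def select(rows, min_quality=30):
    """Select: keep only rows that pass an (assumed) quality threshold."""
    return [row for row in rows if int(row["quality"]) >= min_quality]

def work(rows):
    """Work: the interesting part; here simply the mean quality of the selection."""
    values = [int(row["quality"]) for row in rows]
    return {"n": len(values), "mean_quality": sum(values) / len(values)}

def save(result):
    """Save: serialise the result back to flat-file (CSV) text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=list(result), lineterminator="\n")
    writer.writeheader()
    writer.writerow(result)
    return out.getvalue()

saved = save(work(select(get(RAW))))
print(saved)
```

Only `work` is specific to the science; the other three steps are plumbing that every project tends to rewrite.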
Get   Filter   Work   Save
[Diagram: Get, Filter, Work and Save, first in sequence, then scattered out of order with additional Get and Work steps]
Virtualise
Data platform
Get   Save
Work
App platform
Data accessible via services
Applications accessible via services
Data platform
Get / Save
Work
Projects / SNP calling
App platform
Distribute
Data platform
Hinxton   Get / Save   San Diego
Work
App platform
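Distributed storage
Virtualised services
Application programming interfaces
Getters   Filters   Savers   Work

Distributed storage
Virtualised services
Application programming interfaces
Getters   Filters   Savers   Work Work Work Work Work Work Work Work Work Work Work Work Work Work Work Work Work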
A distributed mindset
map/reduce
1. map
@a = [ 1, 2, 3 ]
@result = []

for each $value in @a
  push @result, map($value)
end

sub map($incoming)
  return ($incoming * 10)
end
2. reduce
reduce(@result)

sub reduce($r)
  <transform $r>
end
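The two pseudocode steps above translate fairly directly into runnable code. A minimal sketch in Python: the map step multiplies by 10 as on the slide, while the reduce step is an assumption (a running sum), because the slide leaves the transform unspecified. The names `mapper` and `combine` are chosen for the example.

```python
from functools import reduce

values = [1, 2, 3]

def mapper(incoming):
    """Map step from the slide: multiply each value by 10."""
    return incoming * 10

def combine(total, value):
    """Assumed reduce step (left open on the slide): a running sum."""
    return total + value

# 1. map: each value is transformed on its own
mapped = [mapper(v) for v in values]

# 2. reduce: the mapped values are folded into a single result
total = reduce(combine, mapped, 0)

print(mapped, total)
```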
independent of array size!
independent of each other!
independent
distribute across virtual machines!
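Because each map call depends only on its own input, the work can be split into chunks and handed to separate workers. The sketch below uses a local thread pool purely as a stand-in for virtual machines; the chunk count, the helper names and the summing reduce step are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

def mapper(incoming):
    """Per-value work; depends on nothing but its own input."""
    return incoming * 10

def map_chunk(chunk):
    """What one worker (standing in for one virtual machine) would run."""
    return [mapper(v) for v in chunk]

def split(values, n_chunks):
    """Cut values into roughly equal contiguous slices."""
    size = -(-len(values) // n_chunks)  # ceiling division
    return [values[i:i + size] for i in range(0, len(values), size)]

values = list(range(1, 101))
chunks = split(values, 4)

# Each chunk is mapped independently; execution order does not matter.
with ThreadPoolExecutor(max_workers=4) as pool:
    mapped_chunks = list(pool.map(map_chunk, chunks))

# Reduce: fold the per-chunk results into one total.
total = reduce(lambda acc, chunk: acc + sum(chunk), mapped_chunks, 0)
print(total)
```

Growing the array only means more chunks, not a different program, which is the point the slides make about independence from array size.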
Prerequisites
Open data
easy to get at data
software as a service
Open APIs
Beyond SQL
Accessibility
East coast
24/7
Accessibility
Down the corridor
West coast
Reliability
Build for flux
Authentication
Privacy
Less software
Distribute everything
Replicate everything
Speed. Redundancy.
Will it scale?
Oh yes
New York Times
11 million TIFs
24 hours
$500
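Taking the three figures on the slides above at face value, a little arithmetic shows what they imply per file. The variable names are chosen for this example, and the numbers are the slide's, not independently checked.

```python
files = 11_000_000   # TIFs processed, as stated on the slide
hours = 24           # wall-clock time, as stated on the slide
cost_usd = 500       # total cost, as stated on the slide

files_per_second = files / (hours * 3600)
cost_per_file_usd = cost_usd / files
files_per_dollar = files / cost_usd

print(f"{files_per_second:.0f} files/s")
print(f"${cost_per_file_usd:.6f} per file")
print(f"{files_per_dollar:.0f} files per dollar")
```

That works out to roughly 127 files per second, or 22,000 files for every dollar spent.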
Google, Yahoo!, Amazon
We are here
We need to start now
2 X
150 Tb/week
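To get a feel for the weekly figure above, it helps to convert it into a sustained rate. This sketch assumes that "Tb" on the slide means terabytes (10^12 bytes); if terabits were meant, every result would be eight times smaller.

```python
TB = 10**12  # assumption: "Tb" on the slide means terabytes, not terabits
PB = 10**15

weekly_bytes = 150 * TB
seconds_per_week = 7 * 24 * 3600

# Sustained rate needed just to keep up with the incoming data
rate_mb_per_s = weekly_bytes / seconds_per_week / 10**6

# How long until a petabyte has accumulated at this pace
weeks_to_petabyte = PB / weekly_bytes

print(f"{rate_mb_per_s:.0f} MB/s sustained")
print(f"{weeks_to_petabyte:.1f} weeks per petabyte")
```

Under that assumption the data arrives at about 248 MB/s around the clock, and a petabyte accumulates in under seven weeks.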
We need to start now
as in, like, yesterday
Petabyte journal club

 foomongers.org.uk
Thank you
GREENISGOOD.CO.UK
An introduction to cloud computing from a scientific research perspective.

Published in: Technology, Business