Genomes on Rails
  has_many :sequences
Hello
➊
Previously

    ➋
Production

    ➌
 Process
➊ Previously
The human genome


   15 years to decode
     3 billion letters
$3 billion
$3 billion ++
Race for the prize
Open data
Open source
Perl
Lots of Perl
Lots of Perl
 ~4500 modules
Onwards!
40 species
Map evolutionary
     space
Compare genomes
compare species
Compare genomes
compare species
Compare genomes

 compare individuals
More Perl
~1500 modules
Quantum leap!
1000 personal
  genomes
beyond 23andme
1000 personal
  genomes
Hypertension
Diabetes
Coronary heart disease
Bipolar disorder
Malaria
➋ Production
Register projects


Register samples


  Sample prep


  Sequencing


    Analysis
Change!
Flexible data capture
Virtual fields
Sample


   Name
  Organism
Concentration
class Sample < ActiveRecord::Base
  has_many :descriptors
  has_many :descriptor_values
end
Key value pairs
Faster than you’d think
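The key/value idea behind the virtual fields can be sketched in plain Ruby. This is only an illustration, not the Sanger code: the slide shows an ActiveRecord model, whereas `FlexibleSample`, `Descriptor` and `DescriptorValue` below are hypothetical in-memory stand-ins, and the field values are made up.

```ruby
# Plain-Ruby sketch of "virtual fields" stored as key/value pairs.
# All class names here are hypothetical; nothing is persisted to a database.
Descriptor = Struct.new(:name)
DescriptorValue = Struct.new(:descriptor, :value)

class FlexibleSample
  attr_reader :descriptor_values

  def initialize
    @descriptor_values = []
  end

  # Set (or overwrite) the value stored for a named field.
  def []=(field, value)
    existing = @descriptor_values.find { |dv| dv.descriptor.name == field }
    if existing
      existing.value = value
    else
      @descriptor_values << DescriptorValue.new(Descriptor.new(field), value)
    end
  end

  # Look up a field by name; returns nil when the field was never captured.
  def [](field)
    found = @descriptor_values.find { |dv| dv.descriptor.name == field }
    found && found.value
  end

  # Names of all fields captured so far, in insertion order.
  def fields
    @descriptor_values.map { |dv| dv.descriptor.name }
  end
end

sample = FlexibleSample.new
sample["Name"] = "sample-1"          # illustrative values only
sample["Organism"] = "Homo sapiens"
sample["Concentration"] = 12.5
puts sample.fields.inspect
```

A new field is just another pair, so adding one needs no schema change. The cost is a lookup by name on every read.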
Change!
V1               V2


   Sample          Sample


   Name             Name
  Organism        Organism
Concentration   Concentration
Rationalize!
V1               V2


   Sample          Sample


   Name             Name
  Organism        Organism
Concentration   Concentration
Mapping!
V1               V3


   Sample          Sample


   Name             Name
  Organism         Species
Concentration   Concentration
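One way to picture the mapping step is a rename table applied to each record. A minimal sketch, assuming records are plain hashes; `V1_TO_V3` and `map_fields` are hypothetical names, and only the Organism to Species rename comes from the slide.

```ruby
# Hypothetical rename table: V1 field name => V3 field name.
V1_TO_V3 = { "Organism" => "Species" }.freeze

# Translate a record (a Hash of field => value) into the newer field names.
# Fields without an entry in the table keep their original name.
def map_fields(record, mapping)
  record.each_with_object({}) do |(field, value), mapped|
    mapped[mapping.fetch(field, field)] = value
  end
end

# Illustrative values only.
v1 = { "Name" => "sample-1", "Organism" => "Homo sapiens", "Concentration" => 12.5 }
v3 = map_fields(v1, V1_TO_V3)
puts v3.keys.inspect
```

Keeping the renames in data rather than in code means each new version only adds table entries.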
Pipeline management
Workflow

 Task 1        Task 2        Task 3

  Name          Name         Name
Operator     Serial number   Passed
Instrument   Kit
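The workflow slide can be read as an ordered list of tasks, each with its own set of fields to capture. The sketch below is an assumption about that shape, not the actual pipeline code: `Task`, `Workflow` and `missing_fields` are hypothetical, the workflow name and captured values are made up, and the field lists follow the slide's first two rows.

```ruby
# Hypothetical model: a workflow is an ordered list of tasks,
# and each task names the fields an operator must fill in.
Task = Struct.new(:name, :fields)

Workflow = Struct.new(:name, :tasks) do
  # Fields the named task requires that are absent from the captured data.
  def missing_fields(task_name, captured)
    task = tasks.find { |t| t.name == task_name }
    raise ArgumentError, "unknown task: #{task_name}" unless task
    task.fields.reject { |field| captured.key?(field) }
  end
end

workflow = Workflow.new("Example workflow", [
  Task.new("Task 1", ["Name", "Operator"]),
  Task.new("Task 2", ["Name", "Serial number"]),
  Task.new("Task 3", ["Name", "Passed"])
])

# Illustrative capture for Task 2 with the serial number still missing.
puts workflow.missing_fields("Task 2", { "Name" => "run-1" }).inspect
```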
Throughput!
320Tb 450 CPU
320Tb 450 CPU   Archive
75   Tb
pilot study!
Multiple apps
Multiple instances
Loosely coupled
Loose coupling is hard
Deployment
Maintenance
Monitoring
Hard to maintain
  separation
Support novel science
Single code base
nginx reverse proxy
fairnginx
Mongrel
Fast deployment
Automate everything
Play well with others!



 Interoperability!
Legacy databases
RESTful services
Generate API stubs
SCALE!
Trillionics
2   X
150Tb per week
Over 6 months
More hardware
400 additional nodes
additional 360 Tb
Towards a
Virtual Institute
Lots of data
Lots of data, lots of
      people
Lots of data, lots of
people, lots of compute
Lots of data, lots of
people, lots of compute,
      lots of uses
Lots of data, lots of
 people, lots of compute,
lots of uses, lots and lots
   and lots and lots...
➌ Process
Concept Requirements Development   Product
takes too long
Concept Requirements Development       Product
takes too long
Concept Requirements Development        Product




       these change
Plan                 Development




 REVIEW        Concept




What we need              Get ready
Focused
Project owner is key
Weekly releases
More flexible
Less time
Better transparency
Less software
Sequencing informatics
Thank you
GREENISGOOD.CO.UK
Genomes On Rails

Originally given at RailsConf, this talk outlines how the Wellcome Trust Sanger Institute is using Ruby and Rails as part of their new sequencing platform.

Published in: Technology

Transcript of "Genomes On Rails"

  1. Genomes on Rails has_many :sequences
  2. Hello
  3. ➊ Previously ➋ Production ➌ Process
  4. ➊ Previously
  5. The human genome 15 years to decode 3 billion letters
  6. $3 billion
  7. $3 billion ++
  8. Race for the prize
  9. Open data
  10. Open source
  11. Perl
  12. Lots of Perl
  13. Lots of Perl ~4500 modules
  14. Onwards!
  15. 40 species
  16. Map evolutionary space
  17. Compare genomes
  18. compare species Compare genomes
  19. compare species Compare genomes compare individuals
  20. More Perl ~1500 modules
  21. Quantum leap!
  22. 1000 personal genomes
  23. beyond 23andme 1000 personal genomes
  24. Hypertension
  25. Diabetes
  26. Coronary heart disease
  27. Bipolar disorder
  28. Malaria
  29. ➋ Production
  30. Register projects Register samples Sample prep Sequencing Analysis
  31. Change!
  32. Flexible data capture
  33. Virtual fields
  34. Sample Name Organism Concentration
  35. class Sample < ActiveRecord::Base has_many :descriptors has_many :descriptor_values end
  36. Key value pairs
  37. Faster than you’d think
  38. Change!
  39. V1 V2 Sample Sample Name Name Organism Organism Concentration Concentration Origin Quality metric
  40. Rationalize!
  41. V1 V2 Sample Sample Name Name Organism Organism Concentration Concentration Origin Quality metric
  42. Mapping!
  43. V1 V3 Sample Sample Name Name Organism Species Concentration Concentration Origin Origin Quality metric
  44. Pipeline management
  45. Workflow Task 1 Task 2 Task 3 Name Name Name Operator Serial number Passed Instrument Kit
  46. Throughput!
  47. 320Tb 450 CPU
  48. 320Tb 450 CPU Archive
  49. 75 Tb
  50. pilot study!
  51. Multiple apps
  52. Multiple instances
  53. Loosely coupled
  54. Loose coupling is hard
  55. Deployment
  56. Maintenance
  57. Monitoring
  58. Hard to maintain separation
  59. Support novel science
  60. Single code base
  61. nginx reverse proxy
  62. fairnginx
  63. Mongrel
  64. Fast deployment
  65. Automate everything
  66. Play well with others! Interoperability!
  67. Legacy databases
  68. RESTful services
  69. Generate API stubs
  70. SCALE!
  71. Trillionics
  72. 2 X
  73. 150Tb per week
  74. Over 6 months
  75. More hardware
  76. 400 additional nodes
  77. additional 360 Tb
  78. Towards a Virtual Institute
  79. Lots of data
  80. Lots of data, lots of people
  81. Lots of data, lots of people, lots of compute
  82. Lots of data, lots of people, lots of compute, lots of uses
  83. Lots of data, lots of people, lots of compute, lots of uses, lots and lots and lots and lots...
  84. ➌ Process
  85. Concept Requirements Development Product
  86. takes too long Concept Requirements Development Product
  87. takes too long Concept Requirements Development Product these change
  88. Plan Development REVIEW Concept What we need Get ready
  89. Focused
  90. Project owner is key
  91. Weekly releases
  92. More flexible
  93. Less time
  94. Better transparency
  95. Less software
  96. Sequencing informatics
  97. Thank you
  98. GREENISGOOD.CO.UK