
2013 11-13-SoftwareSustainabilityInstitute@Manchester

Published in: Education, Technology

  1. 1. Ant genomics & Bioinformatics for emerging model organisms @yannick__ http://yannick.poulet.org
  2. 2. © Alex Wild & others
  3. 3. Animal biomass (Brazilian rainforest), from Fittkau & Klinge 1973. Chart categories: soil fauna excluding earthworms, ants & termites; spiders; earthworms; mammals; ants & termites; other insects; birds; reptiles; amphibians.
  4. 4. 454 Illumina SOLiD...
  5. 5. 454 Illumina SOLiD...
  6. 6. 454 Illumina SOLiD... This changes everything.
  7. 7. Any lab can sequence anything! 454 Illumina SOLiD... This changes everything.
  8. 8. Major questions
  9. 9. Major questions Which genes are involved in social behaviour?
  10. 10. Major questions Which genes are involved in social behaviour? Which genes make it possible for queens to live up to 30 years?
  11. 11. We need great tools.
  12. 12. Tools for genomics work on emerging model organisms
  13. 13. Tools for genomics work on emerging model organisms Antgenomes.org
  14. 14. Tools for genomics work on emerging model organisms Antgenomes.org SequenceServer.com BLAST made easy
  15. 15. [Slide background: wall of raw DNA sequence, beginning ACGACACGGCGCGTCGAGCGTGCACAGAG...] Sequencing a genome is now cheap & easy.
  16. 16. [Slide background: wall of raw DNA sequence, beginning ACGACACGGCGCGTCGAGCGTGCACAGAG...] Sequencing a genome is now cheap & easy. But making it useable is hard.
  17. 17. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT (Yandell & Ence 2013 NRG)
  18. 18. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence (Yandell & Ence 2013 NRG)
  19. 19. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: (Yandell & Ence 2013 NRG)
  20. 20. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: 20% failure rate: (Yandell & Ence 2013 NRG)
  21. 21. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: 20% failure rate: • missing pieces (Yandell & Ence 2013 NRG)
  22. 22. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: 20% failure rate: • missing pieces • extra pieces (Yandell & Ence 2013 NRG)
  23. 23. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: 20% failure rate: • missing pieces • extra pieces • incorrect merging (Yandell & Ence 2013 NRG)
  24. 24. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: 20% failure rate: • missing pieces • extra pieces • incorrect merging • incorrect splitting (Yandell & Ence 2013 NRG)
  25. 25. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: 20% failure rate: • missing pieces • extra pieces • incorrect merging • incorrect splitting Visual inspection... and manual fixing required. (Yandell & Ence 2013 NRG)
  26. 26. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: 20% failure rate: • missing pieces • extra pieces • incorrect merging • incorrect splitting Visual inspection... and manual fixing required. 1 gene = 20 minutes to 3 days (Yandell & Ence 2013 NRG)
  27. 27. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: 20% failure rate: • missing pieces • extra pieces • incorrect merging • incorrect splitting Visual inspection... and manual fixing required. 1 gene = 20 minutes to 3 days 15,000 genes * 20 species = impossible. (Yandell & Ence 2013 NRG)
  28. 28. Monica Dragan https://github.com/monicadragan/GeneValidator
  29. 29. Monica Dragan https://github.com/monicadragan/GeneValidator
  30. 30. Monica Dragan https://github.com/monicadragan/GeneValidator
  31. 31. Gene prediction Dozens of software algorithms: dozens of predictions TTTTtACCTGTTTTtGAAAAGGTAATTTTCTTTAGATATATACAGTTTGTAATaTTAGGTATTTTATAAACAGTGTGTATATTTCTTACAATATAAAAGACACAATTGCAAACTAGCATGATTGTAAACAATTGCTAAACGGATCAATATAAATTAAAATTGTAATATTAAGTATCAAACCGATAATTTTT Evidence Consensus: 20% failure rate: • missing pieces • extra pieces • incorrect merging • incorrect splitting Visual inspection... and manual fixing required. 1 gene = 20 minutes to 3 days 15,000 genes * 20 species = impossible. (Yandell & Ence 2013 NRG)
  32. 32. Crowd-sourcing the visual inspection + correction. Anurag Priyam http://afra.sbcs.qmul.ac.uk
  33. 33. Crowd-sourcing the visual inspection + correction. [Workflow diagram: Begin, Curate, Being curated, Submit] Anurag Priyam http://afra.sbcs.qmul.ac.uk
  34. 34. Crowd-sourcing the visual inspection + correction. Scientific software + Facebook + Points + Badges + Redundancy [Workflow diagram: Begin, Curate, Being curated, Submit] Anurag Priyam http://afra.sbcs.qmul.ac.uk
  35. 35. My aim is to make bioinformatics software:
  36. 36. My aim is to make bioinformatics software: Shared
  37. 37. My aim is to make bioinformatics software: Shared Robust
  38. 38. My aim is to make bioinformatics software: Shared Robust User-friendly
  39. 39. With the fellowship I will:
  40. 40. With the fellowship I will: • Organize a local Software-Carpentry-type event.
  41. 41. With the fellowship I will: • Organize a local Software-Carpentry-type event. • Include software & reproducibility best practices on two new MSc & existing BSc.
  42. 42. With the fellowship I will: • Organize a local Software-Carpentry-type event. • Include software & reproducibility best practices on two new MSc & existing BSc. • Promote/lobby in line with SSI’s mission (internally/talks/conf/networking/publications/web).
  43. 43. With the fellowship I will: • Organize a local Software-Carpentry-type event. • Include software & reproducibility best practices on two new MSc & existing BSc. • Promote/lobby in line with SSI’s mission (internally/talks/conf/networking/publications/web). • Seek funds to create great sustainable bioinformatics software.
  44. 44. Thanks! @yannick__ y.wurm@qmul.ac.uk http://yannick.poulet.org
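
The workload claim on slide 27 (15,000 genes * 20 species, at 20 minutes to 3 days per gene) can be checked with a back-of-the-envelope sketch. The gene count, species count, failure rate and per-gene times come from the slides. The length of a working year (8 hours a day, 220 days a year), the reading of "3 days" as three 8-hour working days, and the function name are assumptions added here for illustration.

```python
# Back-of-the-envelope estimate of manual curation effort.
# Figures marked "slides" are quoted in the talk; the rest are assumptions.

GENES_PER_SPECIES = 15_000            # slides
SPECIES = 20                          # slides
FAILURE_RATE = 0.20                   # slides: "20% failure rate"
MIN_MINUTES_PER_GENE = 20             # slides: lower bound per gene
MAX_MINUTES_PER_GENE = 3 * 8 * 60     # assumption: "3 days" = three 8-hour working days

# Assumption: one curator works 8 hours a day, 220 days a year.
WORK_MINUTES_PER_YEAR = 8 * 60 * 220


def curator_years(n_genes: int, minutes_per_gene: int) -> float:
    """Convert a number of genes and a per-gene time into curator-years."""
    return n_genes * minutes_per_gene / WORK_MINUTES_PER_YEAR


total_genes = GENES_PER_SPECIES * SPECIES
failed_genes = round(total_genes * FAILURE_RATE)

# Inspecting every gene versus fixing only the ones that were mispredicted.
for label, n in (("all genes", total_genes), ("failed predictions only", failed_genes)):
    low = curator_years(n, MIN_MINUTES_PER_GENE)
    high = curator_years(n, MAX_MINUTES_PER_GENE)
    print(f"{label}: {n} genes, {low:.1f} to {high:.1f} curator-years")
```

Under these assumptions, inspecting every gene at the 20-minute lower bound already comes to roughly 57 curator-years, which is in line with the slide's verdict of "impossible" for a single lab.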

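The "Redundancy" item on slide 34 suggests that several contributors curate the same gene. The slides do not say how redundant submissions are reconciled, so the following is only one possible sketch: an exact-match majority vote. The gene-model representation (a tuple of exon start/end pairs), the function name `consensus_model` and the `min_agreement` threshold are all hypothetical and are not taken from Afra.

```python
from collections import Counter
from typing import Optional, Sequence, Tuple

# Hypothetical representation: a gene model is a tuple of (start, end) exon coordinates.
GeneModel = Tuple[Tuple[int, int], ...]


def consensus_model(
    submissions: Sequence[GeneModel], min_agreement: int = 2
) -> Optional[GeneModel]:
    """Return the most frequently submitted model if at least
    `min_agreement` curators submitted it identically, else None."""
    if not submissions:
        return None
    model, votes = Counter(submissions).most_common(1)[0]
    return model if votes >= min_agreement else None


# Three curators look at the same gene; two of them agree exactly.
submissions = [
    ((100, 200), (300, 400)),
    ((100, 200), (300, 400)),
    ((100, 250), (300, 400)),
]
accepted = consensus_model(submissions)
print(accepted)
```

A gene with no agreeing pair of submissions returns `None` here, which in a real system would presumably send it back for further curation. Ties between equally popular models are not handled specially in this sketch.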