Virus Hunting in French Guiana

Lab meeting presentation about my work doing viral metagenomics in French Guiana

Rat by Francisca Arévalo from The Noun Project
Bat by Adam Heller from The Noun Project

Published in: Science, Business

Virus Hunting in French Guiana

  1. 1. Virus Hunting in French Guiana (Nacho Caballero)
  2. 2. French Guiana
  3. 3. Rodents, Bats
  4. 4. Rodents, Bats, Leishmania
  5. 5. Capture
  6. 6. Capture → Isolate viral particles
  7. 7. Capture → Isolate viral particles → Extract RNA
  8. 8. Capture → Isolate viral particles → Extract RNA → Sequence
  9. 9. Estimated read coverage: Rodents (plot of % reads with coverage smaller than x)
  10. 10. Estimated read coverage: Rodents (plot of % reads with coverage smaller than x)
  11. 11. Estimated read coverage: Rodents, Bats (plot of % reads with coverage smaller than x)
  12. 12. How can we estimate the coverage without a reference genome? (diagram: Read)
  13. 13. How can we estimate the coverage without a reference genome? (diagram: Read)
  14. 14. How can we estimate the coverage without a reference genome? (diagram: Read, K-mers)
  15. 15. How can we estimate the coverage without a reference genome?
  16. 16. How can we estimate the coverage without a reference genome? (k-mer counts: 1 1 1 1 1 1 1)
  17. 17. k-mer counts: 7 8 10 8 11 3 6
  18. 18. Median k-mer count ≈ Read coverage (k-mer counts: 7 8 10 8 11 3 6)
  19. 19. k-mers make it possible to align without a reference
  20. 20. Problem: each sequencing error introduces k erroneous k-mers
  21. 21. Problem: each sequencing error introduces k erroneous k-mers
  22. 22. Over a threshold, additional reads are redundant (k-mer counts: 7 8 10 8 11 3 6)
  23. 23. Solution: digital normalization reduces redundancy and errors (k-mer counts: 5 5 5 5 5 3 5)
  24. 24. Assembly
  25. 25. Assembly: SPAdes
  26. 26. Assembly → Alignment
  27. 27. Assembly → Alignment: BLAST
  28. 28. Assembly → Alignment → Taxonomy
  29. 29. Assembly → Alignment → Taxonomy: NCBI
  30. 30. Problem: 67% of contigs in rodent dataset (serum) align to human sequences
  31. 31. Problem: 67% of contigs in rodent dataset (serum) align to human sequences. Viruses listed: Night-heron coronavirus HKU19 (1 kb); Simian hemorrhagic fever virus (300 bp); Equine arteritis virus (3.7 kb); Possum nidovirus; Rodent hepacivirus; Chipmunk parvovirus; Theiler's disease-associated virus; Reticuloendotheliosis virus; Mosquito VEM Anellovirus SDBVL A; Porcine reproductive and respiratory syndrome virus; Dragonfly-associated circular virus 1; Gemycircularvirus 3; Rodent pegivirus; Cyclovirus PK5510; Hypericum japonicum associated circular DNA virus
  32. 32. Problem: 92% of contigs in bat dataset (droppings) don’t align to anything in NCBI. Viruses listed: Pig stool associated circular ssDNA virus (1 kb); Avian gyrovirus 2; Torque teno sus virus 1a; Mosquito VEM virus SDBVL G; Turdivirus 3
  33. 33. Problem: 95% of contigs in rodent dataset 2 (serum, spleen) align to mouse sequences (2). Viruses listed: Lymphocytic choriomeningitis virus (7 kb); Hepatitis C virus; Amphotropic murine leukemia virus; Murid herpesvirus 1; Mosquito VEM Anellovirus SDBVL A; Rat retrovirus SC1; Mason-Pfizer monkey virus (retrovirus); Eidolon helvum parvovirus 2; Periplaneta fuliginosa densovirus (also a parvovirus); Moloney murine sarcoma virus; Sclerotinia sclerotiorum hypovirulence associated DNA virus 1
  34. 34. 7 out of 10 samples contained more than 1 kb of Leishmania RNA virus (94% identity; 5 kb genome)
  35. 35. Lessons
  36. 36. Lessons: Assume that 50% of your samples are going to fail.
  37. 37. Lessons: Assume that 50% of your samples are going to fail. Design a small experiment, then iterate.
  38. 38. Lessons: Assume that 50% of your samples are going to fail. Design a small experiment, then iterate. Come up with excuses to learn.
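
Slides 12 to 23 describe estimating read coverage from k-mer counts, the cost of a sequencing error in k-mers, and digital normalization. The following is a minimal Python sketch of those three ideas, not the pipeline used in the talk. The function names, the toy reads, `k=4`, and `cutoff=5` are all assumptions made for illustration; only the count vector `7 8 10 8 11 3 6` comes from the slides.

```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return every overlapping substring of length k in the read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def median_kmer_count(read, counts, k):
    """Estimate a read's coverage as the median count of its k-mers."""
    read_kmers = kmers(read, k)
    if not read_kmers:
        return 0
    return median(counts[km] for km in read_kmers)


def digital_normalization(reads, k, cutoff):
    """Keep a read only if its estimated coverage is still below the cutoff.

    Counts are updated only with reads that are kept, so once a region
    reaches the cutoff, further reads from it are treated as redundant.
    """
    counts = Counter()
    kept = []
    for read in reads:
        if median_kmer_count(read, counts, k) < cutoff:
            kept.append(read)
            counts.update(kmers(read, k))
    return kept, counts


def erroneous_kmers(true_read, observed_read, k):
    """Count k-mers that differ between two equal-length reads, offset by offset."""
    pairs = zip(kmers(true_read, k), kmers(observed_read, k))
    return sum(a != b for a, b in pairs)


# Per-k-mer counts shown on the slides; their median is the coverage estimate.
slide_counts = [7, 8, 10, 8, 11, 3, 6]
estimated_coverage = median(slide_counts)

# Hypothetical toy data: one read seen 20 times, another seen twice.
reads = ["ACGTTGCAAG"] * 20 + ["TTGACCGGAT"] * 2
kept, counts = digital_normalization(reads, k=4, cutoff=5)

# One substitution in the middle of a read (G -> C at index 5).
n_bad = erroneous_kmers("ACGTTGCAAG", "ACGTTCCAAG", k=4)

print(estimated_coverage, len(kept), n_bad)
```

In this toy run the highly covered read is capped at the cutoff while the rare read is kept in full, which is the redundancy reduction slide 23 refers to. The single substitution changes every k-mer that overlaps it, matching the point on slide 20.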
