1. Genomic Analysis and Comparative Proteomics of
Temperate Mycobacteriophages
By: Carmen A. Aguirre
The College of St. Scholastica
2. Background
• Bacteriophages are viruses that infect and replicate
within bacteria
• Mycobacteriophages specifically target
mycobacteria (e.g. M. smegmatis, M. tuberculosis)
• There are two types of phages:
Lytic: obligate lysis of host
Temperate: lytic or lysogenic cycle
Applications:
• Alternatives to antibiotics for resistant infections
• Biocontrol agents in agriculture
• Valuable research tool
• Untapped reservoir of genetic diversity
3. Overview
• Introduction to SEA-PHAGES program
• Methods of Isolation
• General Overview of Annotated Mycobacteriophages
• Overview of comparative genomics and stoperator
sequences
• General Overview of Mass Spec Techniques
• Mass Spectrometry Protein Identification
• Future Directions
4. SEA-PHAGES Program
SEA Phage Hunters Advancing Genomics and
Evolutionary Science
• National experiment in bacteriophage genomics
• Students isolate, name, sequence, and analyze newly-
discovered bacteriophages isolated on M. smegmatis
and other hosts
• Research-based curricula
• Advance science education on a national scale
• Authentic scientific discovery
5. Methods
[Flowchart: phage isolation and sequencing workflow. Step labels recovered from the slide, in inferred order:]
• Collect soil sample
• Enrichment or direct plating
• Streak method (≥ 3 times)
• Serial dilution / titer
• Spot test
• MTL harvest and titer
• Web plate
• Empirical test
• 10-plate infection
• HTL harvest and titer
• From HTL: archive, electron microscopy, isolate DNA
• Restriction digest and quality control gel
• Sequencing center: approved genomic DNA
• In silico: genome sequence
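The serial dilution / titer step in this workflow comes down to a short calculation. The sketch below uses the standard plaque-assay relation (PFU/mL = plaques / (dilution × volume plated)). The function name and the example numbers are hypothetical and are not taken from the slides.

```python
def titer_pfu_per_ml(plaque_count, dilution, volume_plated_ml):
    """Estimate lysate titer (PFU/mL) from one countable plate.

    dilution is the fraction of the original lysate in the plated sample,
    e.g. 1e-6 for the 10^-6 tube of a serial dilution.
    """
    if plaque_count < 0 or dilution <= 0 or volume_plated_ml <= 0:
        raise ValueError("counts must be >= 0; dilution and volume must be > 0")
    return plaque_count / (dilution * volume_plated_ml)


# Hypothetical plate: 52 plaques from 10 µL (0.01 mL) of the 10^-6 dilution.
example_titer = titer_pfu_per_ml(52, 1e-6, 0.01)
print(f"{example_titer:.1e} PFU/mL")
```

Plates with very few or very many plaques give unreliable estimates, so in practice one picks the dilution whose plate is comfortably countable.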
6. CSS Phage Stats
Total phages isolated during 3 years = 34
5 sequenced and annotated
4 in GenBank
Clusters are groups of phages that share nucleotide sequence
similarity
Cluster A: 14    Cluster B: 2    Cluster C: 1    Cluster E: 1
Families are groups of phages sharing similar electron
microscopy morphology
Myoviridae: double-stranded DNA genomes with contractile
tails
Siphoviridae: double-stranded DNA genomes and long,
flexible, non-contractile tails
Podoviridae: double-stranded DNA genomes and short,
stubby, non-contractile tails
9. Stoperator Sequences
• DNA sequences that bind repressors and thereby prevent RNA elongation (transcription)
• Often found before genes that promote lysis
[Figure: DNA with repressor bound to stoperator]
10. Stoperator Sequences
• In many phages, multiple stoperator sequences are found in the genome, with
most located in the second half of the genome
• Stoperator sequences have polarity (directionality) that correlates with direction of
transcription of target gene
• Usually found in intergenic region near target gene
Stoperator polarity and location in mycobacteriophage L5
Brown et al. The EMBO Journal Vol.16 No.19 pp.5914–5921
11. Stoperators in our Phages
Hetaeria
• Consensus sequence generated using 3 matches and 25 pseudo
matches to original search sequence
• 27/28 sequences found within relevant genes at the tail end
QuinnKiro
• 18 putative sequences
• Mostly at the end of genome
• Core TCAAG mutated in the three matches within Lysin genes
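One way to derive a consensus from a set of aligned matches is a per-position majority vote, sketched below. This is an illustration only: the slides do not state how the consensus was computed, the function name is hypothetical, and the three input 13-mers are chosen for demonstration rather than being the actual match set.

```python
from collections import Counter


def consensus(sequences):
    """Return the per-column majority base of equal-length aligned sequences.

    Columns where the top two bases tie are reported as 'N'.
    """
    if not sequences:
        raise ValueError("need at least one sequence")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must all have the same length")
    result = []
    for column in zip(*sequences):
        ranked = Counter(base.upper() for base in column).most_common()
        if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
            result.append("N")  # ambiguous column
        else:
            result.append(ranked[0][0])
    return "".join(result)


# Illustrative input: one exact site and two single-mismatch variants.
demo_sites = ["GTGCGATGTCAAG", "GTGCGATGTCAAC", "GCGCGATGTCAAG"]
print(consensus(demo_sites))
```

A sequence logo conveys more than a single consensus string because it shows how strongly each position is conserved.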
12. Unique QuinnKiro stoperator results
• NCBI:
- Sequence GTGCGATGTCAAG found 11 times
• DNAMaster:
- Additional pseudo matches found around Lysin A and B genes
- Multiple stoperator sites found within Lysin A and B
- Core TCAAG sequence mutated in one case
- Sequence polarity reversed in one match
[Figure panels: sites within Lysin A (4490-6052); sites within and after Lysin B (6052-7017)]
13. Comparative Genomics
• Looked for our consensus sequence in other clusters
• Recorded how many exact and pseudo matches were found
• Investigated the gene functions of the likely target genes
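The search for exact and pseudo matches can be sketched as a sliding-window scan of both strands. The code assumes that a "pseudo match" means a site differing from the consensus by a small number of mismatches; the slides do not define the term, so `max_mismatches` is an assumption. The function names and the toy genome are hypothetical, and this is not the NCBI or DNAMaster procedure itself.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(genome, motif, max_mismatches=2):
    """Scan both strands for windows within max_mismatches of motif.

    Returns (1-based start, strand, matched window, mismatches) tuples,
    where strand '-' means the reverse complement of the motif matched,
    i.e. the site has the opposite polarity.
    """
    genome, motif = genome.upper(), motif.upper()
    patterns = (("+", motif), ("-", reverse_complement(motif)))
    k = len(motif)
    hits = []
    for start in range(len(genome) - k + 1):
        window = genome[start:start + k]
        for strand, pattern in patterns:
            mismatches = sum(a != b for a, b in zip(window, pattern))
            if mismatches <= max_mismatches:
                hits.append((start + 1, strand, window, mismatches))
    return hits


# Toy genome (hypothetical): one exact forward site and one reverse-strand
# site carrying a single mismatch.
toy_genome = ("AAAA" + "GTGCGATGTCAAG" + "CCCC"
              + reverse_complement("GTGCGATGTCAAC") + "TTTT")
hits = find_sites(toy_genome, "GTGCGATGTCAAG", max_mismatches=1)
for hit in hits:
    print(hit)
```

Reporting the strand alongside each hit makes it straightforward to compare site polarity with the direction of transcription of the neighboring gene.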
14.
Phage (Cluster)   Sequence        Hits   Gene Function
QuinnKiro (A)     GTGCGATGTCAAG   18     Lysin A, B
Caelakin (A)      GTGCGATGTCAAG   17     Lysin B, DNA Pol.
Hetaeria (B)      GTGCGATGTCAAG   28     Helicase, Terminase
ZygoTaiga (C)     GTGCGATGCCGAG   14     Nucleotide Binding Protein, Hydrolase
Hawkeye (D)       GCGCGATGTCAAG   3      DNA Pol. III
Bruin (E)         GCGCGATGTGGAC   2      DNAB-like Helicase
Drago (F)         GTGCGATGCCAAC   2      NKF
Avrafan (G)       GTGCGAGGTCGAG   1      Lysin B
Damien (H)        GTGCGATGTCCCG   3      NKF
Babsiella (I)     GCGCGATGTCAAC   1      HiCa antitoxin
15. Stoperator Genomics Conclusions
• Stoperator sequence GTGCGATGTCAAG found in QuinnKiro, Caelakin, and
Hetaeria
• Sequence found in nearly all clusters of phages, including non-temperate ones
• Stoperator sequences lie near LysA and LysB genes or areas that promote lysis/assembly
• Most sequences are in the second half of the genome, near genes with no known function
QuinnKiro
• Exhibited some unusual stoperator sequences associated with the lysin genes:
- Multiple sequences embedded in coding region
- Apparent reversal of polarity
Speculation:
• What is the purpose of stoperator sequences in lytic phages?
• An outcome of genetic mosaicism?
• Was the common mycobacteriophage ancestor temperate?
16. Phage protein identification through
mass spectrometry
Goals:
- Detect phage proteins
- Verify annotation
- How does host (M. smegmatis) protein expression respond to phage infection/prophage
presence?
Methods:
- First attempt: harvest raw lysate or PEG-precipitated phage sample
- Second attempt: infected cell pellet
- Analyze digested proteins by triple TOF mass spec
- Analyze the peptides with the Scaffold program
17. PEG Precipitation Protocol Summary
• Able to identify proteins that are abundant and
unique
• Several phage structural proteins were identified
• Sample concentration may need to be increased to
achieve better detection
• Many orphan peptides remain unidentified (likely
M. smegmatis proteins)
18. Infected Cell Pellet Protocol
[Flowchart. Step labels recovered from the slide, in inferred order:]
• Grow smeg cultures of varying concentrations (~12 hrs)
• Run samples and find one of OD = 0.4
• Infect culture with lysate at MOI = 10 (~3-3.5 hrs)
• Pellet cells
• Subject sample to triple TOF mass spec
• Peptides with >1000 counts/sec fragmented
• Detected peptides analyzed with Scaffold Viewer
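Infecting at MOI = 10 means adding ten plaque-forming units per host cell, so the lysate volume follows from the cell count and the lysate titer. The sketch below shows that arithmetic. The function name, the cell density, the culture volume, and the titer are all hypothetical placeholders, since the slides give only the OD and the MOI.

```python
def lysate_volume_ml(moi, cells_per_ml, culture_volume_ml, titer_pfu_per_ml):
    """Volume of lysate (mL) needed to infect a culture at the given MOI.

    MOI = (PFU added) / (cells present), so
    volume = MOI * total cells / titer.
    """
    if min(moi, cells_per_ml, culture_volume_ml, titer_pfu_per_ml) <= 0:
        raise ValueError("all inputs must be positive")
    total_cells = cells_per_ml * culture_volume_ml
    return moi * total_cells / titer_pfu_per_ml


# Hypothetical numbers: 10 mL culture, an assumed 1e8 cells/mL at the chosen
# OD, and a high-titer lysate of 1e10 PFU/mL.
needed_ml = lysate_volume_ml(moi=10, cells_per_ml=1e8,
                             culture_volume_ml=10, titer_pfu_per_ml=1e10)
print(f"{needed_ml:.2f} mL of lysate")
```

The OD-to-cell-density conversion has to be calibrated for the strain and spectrophotometer in use; the value above only keeps the example concrete.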
19. Infected Cell Pellet Analysis: Hetaeria
• Able to identify proteins that are abundant and unique
• Proteins functioning in assembly, structure, genome
replication, and other roles
• Able to detect over half of the predicted proteins
• Able to cover a significant percentage of protein sequence
• Able to identify amino acid modifications
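Percent sequence coverage can be read as the fraction of a predicted protein's residues that fall inside at least one identified peptide. The sketch below computes that figure under this assumption. The function name, the protein sequence, and the peptides are hypothetical, and Scaffold's own calculation may differ in detail.

```python
def sequence_coverage(protein, peptides):
    """Percent of residues in protein covered by at least one peptide.

    Peptides are located by exact substring match; overlapping peptides
    are only counted once per residue.
    """
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for peptide in peptides:
        if not peptide:
            continue  # ignore empty strings
        start = protein.find(peptide)
        while start != -1:
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = protein.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(protein)


# Hypothetical 22-residue protein and two identified peptides.
demo_protein = "MKTAYIAKQRQISFVKSHFSRQ"
demo_peptides = ["TAYIAK", "SFVK"]
demo_coverage = sequence_coverage(demo_protein, demo_peptides)
print(f"{demo_coverage:.1f}% coverage")
```

A peptide that covers the predicted N-terminus is the kind of evidence that can confirm an annotated start site.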
24. Summary of Hetaeria Mass Spec Data
• Over half of in silico predicted proteins detected
• Over a quarter of start sites verified
• Proteins of varying functions identified
• Ability to locate amino acid modifications
• Ability to identify size of proteins and variety
• Increased sequence coverage
• Increased understanding of the lytic phage
25. Currently
• Waiting on mass spec data on infected cell pellets for QuinnKiro
(A3) and Brusacoram (P) (temperate phages?)
• May yield interesting data on proteins involved in switching from
lysogeny
• Compare identified proteins between lytic and temperate
phages
• Preparing Caelakin clear plaque mutant with increased phage
protein expression
• Examine how growth time and conditions affect phage and
host protein expression
• Immunity repressors?
26. Acknowledgements
• Dr. Daniel Westholm, Professor
• The College of St. Scholastica SEA-PHAGES Research Students
• Sequencing
• Virginia Commonwealth University (Hetaeria, Severus, ZygoTaiga)
• North Carolina State University (QuinnKiro, Caelakin)
• SEA-PHAGES
• Howard Hughes Medical Institute
• University of Pittsburgh, Hatfull Lab
• Mass Spectrometry
• Mayo Clinic Proteomic Core Lab
• Electron Microscopy
• Mayo Clinic Microscopy Lab
Editor's Notes
Necessary?
Over 50% is the threshold for similarity in cluster groups
Update?
Stoperators bind repressors and prevent transcription. They are different from operators
Cluster A phages share a common TCAAG core sequence
NCBI blastn of the phage genome identified multiple regions where this sequence was found
The putative stoperator locations and sequences were then refined using the DNAMaster sequence scan tool
The number, locations, and specific sequences of the sites were recorded for each phage
Created a consensus stoperator sequence for each phage
The WebLogo program gives consensus sequence plots
Not polarity dependent?
Does mutation affect the function of the stoperator?
Can the repressor still bind?
Update to our current protocols and move some to future work?
Change this to pellet protocol? Include both?
Some of these occurred post-translationally, others during sample prep
Able to isolate more proteins in lytic vs lysogenic
Increasing genome coverage is the overall goal, and we can do that with our pelleted cells
We have sent out two temperate phages, and while we do not expect a high yield of proteins, we may find some interesting data on proteins involved in switching from lysogeny