Presentation on the molecular biology of coronaviruses, covering taxonomy, history of the viruses, the various viral proteins (their structure, importance and roles), the viral life cycle, the infection process, disease development, pathogenicity, replication and translation in coronaviruses, possible sites for vaccine development, available treatments, cures and drugs, and various studies on coronavirus infection and cures.
3. Introduction
• Latin – Corona, meaning “Crown” or “Wreath”, refers to the characteristic appearance of
the virion under the electron microscope.
• Most distinctive feature of this viral family is genome size: coronaviruses have the
largest genomes among all RNA viruses, including those RNA viruses with
segmented genome.
• They generally infect their host in species-specific manner, and infection can be
acute or persistent.
• Infections are mainly transmitted by respiratory and fecal-oral routes.
• Coronaviruses are a family of enveloped RNA viruses that
are distributed widely among mammals and birds, causing
principally respiratory or enteric diseases but in some cases
neurological illness or hepatitis.
4. Epidemics caused by hCoV
• In 2002, the SARS-CoV-1 outbreak produced 8,422 reported cases across 29
countries and territories, with a case fatality rate (CFR) of 11%.
• MERS-CoV caused 3 outbreaks up to 2020. About 2,519 cases have been
reported so far across 21 countries and territories, with a 35%
case fatality rate (CFR).
• The ongoing SARS-CoV-2 outbreak has been observed in 227 countries and
territories, with 13,070,097 reported cases and a 4.4% case fatality rate
(CFR) as of 14 July 2020.
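The CFR figures above are simply deaths divided by reported cases. As a rough illustration, the Python sketch below back-calculates the approximate number of deaths implied by the rounded case counts and CFR values quoted on this slide. The function name `implied_deaths` is made up for this example, and the results are approximations derived from rounded percentages, not official death counts.

```python
def implied_deaths(cases: int, cfr_percent: float) -> int:
    """Approximate deaths implied by a case count and a CFR given in percent."""
    return round(cases * cfr_percent / 100)


# Case counts and CFR values exactly as quoted on the slide.
outbreaks = {
    "SARS-CoV-1": (8_422, 11),
    "MERS-CoV": (2_519, 35),
    "SARS-CoV-2 (14 July 2020)": (13_070_097, 4.4),
}

for name, (cases, cfr) in outbreaks.items():
    print(f"{name}: about {implied_deaths(cases, cfr):,} deaths implied")
```

Because the CFR values are rounded, the implied counts are only accurate to about the same number of significant figures as the percentages themselves.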
5. Taxonomy
• Coronaviruses are currently classified as one of the genera in the family
Coronaviridae which comes under order Nidovirales.
• The most salient features that all nidoviruses have in common are:
  • Gene expression through transcription of a set of multiple 3’ nested
  subgenomic RNAs.
  • Expression of the replicase polyprotein via ribosomal frameshifting.
• Members of the coronavirus family have been sorted into 4 groups based on
antigenic relationships and sequence comparisons of the entire viral genome (or of as
much genomic sequence as is available).
7. • Alpha coronaviruses (type 1) include species such as feline CoV, FECV (Feline Enteric
Coronavirus) and FIPV (Feline Infectious Peritonitis Virus), porcine TGEV
(Transmissible Gastroenteritis Virus), porcine PEDV (Porcine Epidemic Diarrhea Virus),
PRCoV (Porcine Respiratory Coronavirus) and canine CoV. Alpha coronaviruses
also include human CoVs such as HCoV-229E and HCoV-NL63, as well as various bat
coronaviruses.
• Beta coronaviruses infect a wide range of mammals and include murine hepatitis
coronavirus (MHV), bovine coronavirus (BCoV) and human CoVs such as
HCoV-HKU1, HCoV-OC43, SARS-CoV, MERS-CoV and SARS-CoV-2.
• Gamma coronaviruses are specific to birds, e.g. infectious bronchitis coronavirus (IBV) and
turkey CoV, with the one exception of a beluga whale CoV.
• The Delta coronavirus genus was created in 2012 and groups various coronaviruses
from mammals and birds (HKU11, HKU12, HKU13).
8. Virion morphology
• Virus is roughly spherical and
moderately pleomorphic.
• Virus Diameter : average 80-120nm
• Spike length : 17-20nm from surface
• HE protein length : 5-10nm from
surface
• 3 components in the viral envelope (S, M,
E proteins)
Fig. Corona virion structure shown with structural
proteins.
9. Spike protein (S)
The S glycoprotein (formerly called E2) mediates
receptor attachment and fusion of the viral and host cell
membranes.
The S protein is a very large, N-exo, C-endo
transmembrane protein that assembles into trimers
to form the distinctive surface spikes of CoVs.
The S protein is inserted into the endoplasmic reticulum
(ER) via an amino-terminal signal peptide.
The molecular weight of the full-length protein monomer
falls in the range of 150-200 kDa. The S molecule is
highly glycosylated, and this modification is
extensively N-linked.
S protein ectodomains have 19-39 potential
consensus glycosylation sites, but a comprehensive
mapping of actual glycosylation has not yet been
reported for any CoV.
Millet et al., 2015
10. • In most group 2 and in all group 3 CoVs, the S protein is cleaved by a trypsin-like host
protease into 2 polypeptides, S1 and S2, of roughly equal size. Peptide sequencing
has shown that cleavage occurs following the last residue in a highly basic motif (RRAHR in
MHV, KRRSRR in BCoV). Similar cleavage sites were predicted in group 2 CoV S proteins,
except that of SARS-CoV.
• S1 domain is the most divergent region of molecule both across and within the 3 CoV
groups. Even among the strains and isolates of a single CoV species, sequence of S1
varies extensively. By contrast, most conserved part of molecule across the 3 CoV
groups is S2 sequence and the region of start of transmembrane domain.
11. (A) Different stages of coronavirus entry where host cellular proteases may activate coronavirus spikes. (B) Schematic drawing of the three-dimensional (3D) structure of the coronavirus spike. (C) Schematic drawing of the 1D structure of the coronavirus spike. NTD (D) Sequence comparison of the spike proteins from SARS-CoV-2, SARS-CoV, and two bat SARS-like coronaviruses in a region at the S1/S2 boundary.
Jian Shang et al, 2020
12. Membrane protein (M)
• The M glycoprotein (formerly called E1) is
the most abundant constituent of
coronaviruses and gives the virion
envelope its shape. Its
preglycosylated polypeptide form
ranges from 25-30 kDa (221-262 AA).
• It is also regarded as the central
organizer of CoV assembly,
interacting with all other coronaviral
structural proteins.
13. • Homotypic interactions between M proteins are the major driving force
behind virion envelope formation, but alone they are not sufficient for the formation
of virus-like particles (VLPs).
• Interaction between the M and E proteins leads to the formation of VLPs, and interactions
between M and S and between M and N ultimately promote completion of viral assembly.
• A segment of some 25 residues encompassing the end of the 3rd transmembrane
domain and the start of the endodomain is the most conserved region, and
the ectodomain is the least conserved part of the M molecule.
14. Envelope Protein (E)
• It is one of the small structural proteins, but also most enigmatic. It is composed
of approximately 76-109 amino acids.
• E is a short protein with an amino terminus of 7-12 AA, followed by a
large hydrophobic transmembrane domain (TMD) of 25 AA and then a
hydrophilic carboxy-terminal tail (39-76 AA).
• Carboxy terminal tail of molecule is situated in interior of the virion (i.e.
Cytoplasmic)
Schoeman et al.,2019
15. • E is abundantly expressed inside the infected cell, but only a small portion is
incorporated into the virion envelope. The reason for this is still unknown.
• The majority of the protein is localized at the sites of intracellular trafficking, viz. the ER,
Golgi and ERGIC, where it participates in virion assembly and budding.
• E protein sequences are extremely divergent across the 4 groups and also among the
members of a single group.
16. Nucleocapsid Protein (N)
• N is a phosphoprotein ranging from 43-50 kDa.
• It is the component of the helical nucleocapsid
and is thought to bind the genomic RNA in a
beads-on-a-string fashion. It plays an important
role in virion structure, replication and
transcription of CoVs.
• N protein is divided into three conserved
domains, which are separated by 2 highly
variable spacer regions.
• Domains 1 and 2 constitute most of the
molecule and are rich in arginines and lysines.
Domain 3 has a net negative charge resulting
from an excess of acidic over basic residues.
17. • A significant portion of the stability of the nucleocapsid may derive from N-N
monomer interactions. Both sequence-specific and nonspecific modes of RNA
binding by N protein have been assayed in vitro
• Specific RNA substrates that have been identified for N protein include the
positive-sense transcription-regulating sequence, regions of the 3’ UTR and the N gene,
and the genomic RNA packaging signal.
18. Accessory proteins
• Accessory genes are interspersed among
the set of canonical genes: replicase, S, E,
M and N.
• The number of accessory genes
present in a CoV varies from as few
as one (PEDV and HCoV-NL63)
to as many as eight (SARS-CoV).
• In some cases, accessory genes
can be entirely embedded in
another ORF, as the internal (I)
gene found within the N gene of
many group 2 CoVs.
19. • HE (hemagglutinin-esterase, formerly called E3) is the most extensively characterised
accessory protein and is the fourth constituent of the membrane envelope in
many group 2 CoVs.
• HE forms a second set of small spikes. It was first identified as a hemagglutinin in
HEV and BCoV.
• The HE monomer has an N-exo, C-endo transmembrane topology with an amino-
terminal signal peptide.
• Monomers of HE protein range from 60-65 kDa.
• The HE protein acts as a cofactor for the S protein.
20. Genome
• The genome is a non-segmented, single-stranded RNA molecule of positive sense, that
is, the same sense as mRNA.
• The genome is extremely large, ranging in length from 27.3 kb (HCoV-229E) to 31.3 kb
(MHV). CoV genomes are among the largest mature RNA molecules known to
biology.
• Structurally they resemble most eukaryotic mRNAs in having 5’ caps and 3’ poly
(A) tails, but they contain multiple ORFs.
• The genes for the 4 structural proteins account for less than one-third of the
coding capacity of the genome and are clustered at the 3’ end.
• A single gene, which encodes the viral replicase, occupies the 5’-most two-thirds of
the genome.
• The invariant gene order in all members of the CoV family is 5’-Replicase-S-E-M-N-3’.
22. Function of CoV non-structural proteins
Protein | Function
nsp1 | Cellular mRNA degradation and blocks host cell translation.
nsp2 | No known function.
nsp3 | Large transmembrane protein; interacts with N protein.
nsp4 | Important for proper structure of DMVs.
nsp5 | Cleaves viral polyprotein.
nsp6 | No known function.
nsp7 | Forms complex with nsp8; may act as processivity clamp for RNA polymerase.
nsp8 | Processivity clamp for RNA polymerase; may act as primase.
nsp9 | RNA binding protein.
nsp10 | Cofactor for nsp16 and nsp14.
23. Protein | Function
nsp11 | No known function.
nsp12 | RdRp
nsp13 | RNA helicase, 5′ triphosphatase
nsp14 | N7 MTase and 3′-5′ exoribonuclease (ExoN)
nsp15 | No known function.
nsp16 | 2′-O-MT
Anthony R. Fehr and Stanley Perlman., 2015
25. Receptors, Receptor recognition and Entry
• Viruses primarily bind to the receptor on the cell surface via the S protein. Pairings of
CoVs and their corresponding receptors are generally species-specific, but this
adaptation is mutable.
• S1 binds to the receptor, which leads to conformational changes in S2 that
mediate fusion between the virion and cell membranes.
• Fusion of the cell and virion requires cleavage of S into S1 and S2. Some CoVs fuse
with the plasma membrane, while others appear to enter through receptor-mediated
endocytosis and then fuse with the membrane of the acidified endosome.
• Receptor binding and cleavage-mediated conformational changes in S1 lead to
the formation of a “trimer of dimer” bundle of the fusion peptide.
26. • SARS-CoV and SARS-CoV-2 use ACE2 as their receptor. ACE2 has been shown to have a role
in the renin-angiotensin system, which regulates blood pressure.
(Sandrine Belouzard et al, 2012)
Group | Virus | Receptor
Alphacoronavirus | Transmissible gastroenteritis virus (TGEV) | Aminopeptidase N
Alphacoronavirus | Porcine Epidemic Diarrhea Coronavirus (PEDV) | Aminopeptidase N
Alphacoronavirus | Human coronavirus 229E | Aminopeptidase N
Alphacoronavirus | Human coronavirus NL63 | Angiotensin-converting enzyme 2 (ACE2)
Betacoronavirus | Bovine coronavirus (BCoV) | Neu5,9Ac2
Betacoronavirus | Mouse hepatitis virus (MHV) | Murine carcinoembryonic antigen related molecule (mCEACAM1)
Betacoronavirus | Human SARS-CoV | ACE2
Betacoronavirus | Human SARS-CoV-2 | ACE2
Gammacoronavirus | Various avian CoV | Still unknown
Deltacoronavirus | Thrush coronavirus HKU12 | Still unknown
27. Fig. Mode of entry of SARS-CoV into the cell (Graham Simmons et al, 2013)
28. Replication and Transcription
• Once the viral genome has entered the cytoplasm, translation of the
replicase gene begins.
• Replicase gene products are encoded by 2 very large ORFs, ORF1a and ORF1b,
which are translated into 2 large polyproteins, pp1a and pp1ab, by a ribosomal
frameshifting mechanism. These are cotranslationally processed by 2 or 3
proteinases to yield 16 nonstructural proteins.
• These 16 proteins catalyze the events leading to the formation of double
membrane vesicles (DMVs) and also form the Replication and Transcription Complex
(RTC).
29. • CoV genome replication is a process of continuous synthesis of positive-strand
gRNA that uses a full-length complementary negative strand as template.
• Transcription, in contrast, includes a discontinuous step with internal initiation and
premature termination, which produces sgmRNAs.
• sgmRNAs carry a common leader sequence at their 5’ end, which is also present at
the 5’ end of the gRNA.
30. Ribosomal frameshifting
• Expression of large replicase gene is
regulated by Ribosomal frameshifting
mechanism.
• There is a small (43nt) overlap
between ORF1a (11.9kb) and ORF1b
(8.1kb) and there are no sgRNA that
could serve as mRNA for ORF1b. This
arrangement is found in all CoVs.
• In IBV, ribosomal frameshifting was
found to depend on two genomic
RNA elements: a heptanucleotide
“slippery sequence” (UUUAAAC) and
a downstream, hairpin-type
pseudoknot.
31. • Pseudoknot impedes the progress of the elongating
ribosome with some fixed probability.
• This delay required for ribosome to melt out
secondary structural element allows the
simultaneous slippage of P and A site tRNA by one
base in -1 direction.
• The spacing between the slippery sequence and the
pseudoknot region is critical for the frameshifting
mechanism.
• Around 25-30% of frameshifting was measured in
IBV studies.
• The end result of ribosomal frameshifting-mediated
translation of the replicase gene is the synthesis of 2
very large polyproteins, pp1a and pp1ab.
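The -1 slippage described above can be sketched in a few lines of Python. This is a toy illustration, not a model of a real genome: the short RNA string `toy_rna` and the helper names `codons` and `frameshift_codons` are made up for this example, and only the slippery sequence UUUAAAC comes from the slide. The sketch assumes the slippery site sits in the X_XXY_YYZ register (U_UUA_AAC in the incoming frame), so that after the two tRNAs slip back by one base the last nucleotide of the heptamer is read again as the first base of the next codon.

```python
SLIPPERY = "UUUAAAC"  # heptanucleotide slippery sequence quoted for IBV


def codons(rna: str) -> list[str]:
    """Split an RNA string into complete triplets, dropping any leftover bases."""
    return [rna[i:i + 3] for i in range(0, len(rna) - 2, 3)]


def frameshift_codons(rna: str, slippery: str = SLIPPERY):
    """Return (codons read without a shift, codons read after a -1 shift)."""
    site = rna.find(slippery)
    if site < 0:
        raise ValueError("no slippery sequence found")
    if (site - 2) % 3 != 0:
        raise ValueError("slippery sequence is not in the X_XXY_YYZ register")
    end = site + len(slippery)
    upstream = codons(rna[:end])                  # read in the incoming frame
    zero_frame = upstream + codons(rna[end:])     # ribosome did not slip
    minus_one = upstream + codons(rna[end - 1:])  # last base of the site is re-read
    return zero_frame, minus_one


# Hypothetical toy sequence: start codon, one codon overlapping the site, then the site.
toy_rna = "AUGGCUUUAAACGGGUUUAGAUAA"
zero_frame, minus_one = frameshift_codons(toy_rna)
print("no shift :", " ".join(zero_frame))
print("-1 shift :", " ".join(minus_one))
```

In the toy sequence the two readings share the same codons up to the slippery site and diverge immediately after it, which is how a single RNA can give rise to both a shorter and a longer replicase polyprotein.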
32. Fig. Coronavirus RNA synthesis by a discontinuous step. The nested set of positive- and negative-strand RNAs
produced during replication and transcription is shown, using MHV as an example. The inset shows details of
the arrangement of leader and body copies of the transcription-regulating sequence (TRS). Paul S. Masters, 2006
34. • The transcription process is controlled by Transcription Regulatory Sequences (TRSs)
located at the 3’ end of the leader sequence (TRS-L) and preceding each viral gene (TRS-B).
• Each TRS includes a conserved core sequence (CS), 6-7 nt in length, and variable 5’
and 3’ flanking regions. [ 5’-AAACGAAC-3’ in the case of SARS-CoV ]
• Because the CS in the leader sequence (CS-L) is identical to the CS preceding each mRNA coding sequence
(CS-B), CS-L can base pair with the complementary cCS-B on the nascent negative strand.
• Base pairing between CS-L and cCS-B drives the template switch of the nascent
negative-strand RNA to the leader.
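The role of the shared core sequence can be illustrated with a short Python sketch. Everything except the core sequence string quoted on the slide is an assumption for illustration: `toy_genome` is an invented sequence, not real viral RNA, and the helper names `find_all` and `reverse_complement` are hypothetical. The sketch locates every copy of the core sequence (the first is treated as CS-L, the rest as CS-B) and checks that the negative-strand copy of a CS-B is complementary to CS-L.

```python
CORE = "AAACGAAC"  # core sequence as quoted on the slide for SARS-CoV

COMPLEMENT = str.maketrans("ACGU", "UGCA")


def find_all(seq: str, motif: str) -> list[int]:
    """Return the start index of every occurrence of motif in seq."""
    positions = []
    i = seq.find(motif)
    while i != -1:
        positions.append(i)
        i = seq.find(motif, i + 1)
    return positions


def reverse_complement(rna: str) -> str:
    """Sequence of the opposite strand, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


# Invented toy genome: a leader copy of the core sequence plus two body copies.
toy_genome = "GGAUU" + CORE + "CCCCC" + CORE + "GGGGG" + CORE + "UUU"

leader_pos, *body_pos = find_all(toy_genome, CORE)
print("CS-L at", leader_pos, "| CS-B at", body_pos)

# The nascent negative strand carries the complement of each CS-B (cCS-B).
ccs_b = reverse_complement(CORE)
# cCS-B can pair with CS-L exactly when its complement equals CS-L.
print("cCS-B:", ccs_b, "| pairs with CS-L:", reverse_complement(ccs_b) == CORE)
```

Because every body copy is identical to the leader copy, each cCS-B on the nascent negative strand can pair with CS-L, which is the precondition for the template switch.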
35. Fig. CoV discontinuous transcription, which is regulated by long-distance RNA-RNA
interactions that lead to the template switch, with the involvement of the nucleocapsid protein.
Isabel Sola et al, 2015
36. Regulation of protein stoichiometric ratios
• Expression of structural and accessory proteins must be tightly regulated to
ensure appropriate ratios of viral proteins.
• Multiple factors regulate transcription by modulating template switch
frequency during discontinuous transcription.
• The most important factor is complementarity between TRS-L and TRS-B; other factors
include TRS secondary structure, proximity to the 3’ end, and RNA-RNA and RNA-protein
interactions.
• Cell proteins also modulate sgmRNA ratios. In IBV, N protein was shown to recruit
a cellular helicase (DDX1) to the viral RTC, facilitating RTC read-through and synthesis of
long sgmRNAs.
37. Cis acting elements and their role
• Specific cis acting elements are required for CoV RNA synthesis which are located
in highly structured 5’ and 3’ untranslated region (UTR).
• First studied in BCoV using defective interfering RNAs.
• The 5’ UTR consists of RNA elements that form stem-loop (SL) structures, which
show varying degrees of conservation among different groups of CoVs.
• In the 3’ UTR, immediately after the N gene, there are 2 overlapping essential RNA
structures: a bulged stem-loop (BSL) and a hairpin-like pseudoknot (PK).
38. Fig. Coronavirus cis-acting RNA elements. The higher-order RNA structures indicated in the diagram are mainly based on
studies done in betacoronaviruses. Isabel Sola et al ,2015
39. CoV proofreading system
• These viruses encode a set of RNA modifying proteins that are not present in
other viruses.
• Proofreading is carried out by nsp14, which has an N-terminal ExoN
domain and also contains a C-terminal N7 methyltransferase (N7-MTase) domain.
• Nsp14 is part of the RTC core complex formed by nsp12, nsp7 and nsp8, and is involved in
removal of misincorporated nucleotides.
• Interestingly, nsp10 is able to enhance ExoN activity up to 35-fold in vitro, and
it acts as a cofactor for nsp14 and also for the nsp16 methyltransferase.
40. Genome packaging
• Selective incorporation of positive-sense genomic RNA into assembled virions is
brought about by the packaging signal (PS) sequence present in the viral genome.
The position and sequence of the PS vary across CoV species.
• In β-CoVs the PS has been mapped to a 190 nt segment of rep1b (around 20 kb
downstream of the 5’ end of the genome). These signals were mapped
using defective-interfering (DI) RNAs and mutants with altered genes.
• The PS is recognized by N molecules, and the N and M proteins regulate packaging
of genomic RNA.
Paul S. Masters.,2006
41. Assembly and release
• Newly synthesized viral structural proteins are inserted into the ER, from which
they move along the secretory pathway into the ER-Golgi intermediate compartment
(ERGIC).
• Viral genomes encapsidated by N protein bud into ERGIC membranes containing
viral structural proteins and form mature virions.
• Virions are transported to the cell surface in vesicles and released by
exocytosis.
• In a few CoVs, S protein that does not get assembled into virions mediates fusion between the infected
cell and adjacent uninfected cells. This forms giant multinucleated cells,
which allow the virus to spread within the infected organism without being
detected or neutralized by virus-specific antibodies.
42. Diagnosis, Treatment and Prevention
• In most cases of self-limited infection, diagnosis is unnecessary as the
disease will naturally run its course. But it is very important in several
CoV outbreaks where the virus continues to circulate.
• RT-PCR has become the best method for diagnosis of HCoVs, as it is able to
detect all the HCoVs, including SARS-CoV-2.
• Serological assays are important in cases where RNA is difficult to
isolate or no longer present, and for epidemiological studies.
• To date, there are no antiviral therapeutics that specifically target hCoVs,
so treatments are only supportive.
43. • SARS and MERS outbreak stimulated research on identification of
antiviral targets such as proteases, polymerase and entry proteins.
• Only limited options are available for prevention of CoV infection.
Vaccines are approved only for IBV, TGEV and canine CoV.
• Many vaccines have been developed, including recombinant attenuated
viruses, live virus vectors and individual proteins expressed from DNA
plasmids.
44. Fig. Viral Lifecycle and Potential Drug Targets of SARS-CoV-2 (James et al., 2020)
45. Treatment | Molecule | Cell line / model used | Reference
Host protein inhibitors | K11777 (targets cathepsin-mediated cell entry) | 293T cells of mouse | Zhou et al, 2015.
Viral protein inhibitor | Ribavirin; Remdesivir | In-vitro inhibition in mice | Mommatin et al, 2013; Calvin J. Gordon et al, 2020
Monoclonal antibody | 5H10 ab targeting proteolytic cleavage site | KM mice | Wang et al, 2016.
IFNs | Subtype beta-1b, alpha-n1 and n3 | Monkey | Stockman et al, 2006; Stroher et al, 2004.
Chinese treatment | Glycyrrhizin (liquorice component) | Macaque | Cinatl et al, 2003.
Subunit vaccine | S protein expressed in recombinant Ankara virus | Mice, rabbit and monkeys | Chen et al, 2005.
Attenuated viral vaccine | Urbani strain SARS CoV | Mice | Graham et al, 2012.
46. • Therapeutic SARS-CoV neutralizing antibodies have also been generated, but
their efficiency is very low.
• CoV vaccines are not always used because they are not very
effective, and there are also reports of novel pathogenic
strains emerging via recombination with circulating strains.
• It is therefore recommended to limit the spread of the virus by isolating
infected individuals from the community.
47. SARS-CoV-2
• Also referred to as China virus, Wuhan virus, COVID-19 virus.
• Known to cause coronavirus disease 2019 (COVID-19), the
respiratory illness responsible for the 2019 pandemic.
• It is believed to have zoonotic origins and has close genetic
similarity to bat coronaviruses, but there is no evidence yet to
link an intermediate animal reservoir.
• It is a member of subgenus sarbecovirus with ~70% genetic
similarity to SARS-CoV and has 96% similarity to bat CoV.
48. • Six amino acids have been shown to be critical for binding to the ACE2 receptor and for
determining host range in SARS-CoV, i.e. Y442, L472, N479, D480, T487 and Y491, which
correspond to L455, F486, Q493, S494, N501 and Y505 in SARS-CoV-2.
• SARS-CoV-2 has an RBD that can bind with high affinity to ACE2 of humans, ferrets and
cats.
Kristian G Andersen et al, 2020
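The residue correspondence above can be written as a small lookup table. The Python sketch below pairs the six SARS-CoV positions with their SARS-CoV-2 counterparts exactly as listed on the slide and reports which positions keep the same amino acid. The variable and function names are made up for this example, and the first character of each label is taken to be the one-letter amino acid code.

```python
# Residue labels exactly as listed on the slide, in matching order.
SARS_COV = ["Y442", "L472", "N479", "D480", "T487", "Y491"]
SARS_COV_2 = ["L455", "F486", "Q493", "S494", "N501", "Y505"]

# Lookup table: SARS-CoV residue -> corresponding SARS-CoV-2 residue.
residue_map = dict(zip(SARS_COV, SARS_COV_2))


def same_amino_acid(a: str, b: str) -> bool:
    """True if two residue labels start with the same one-letter amino acid code."""
    return a[0] == b[0]


conserved = [(a, b) for a, b in residue_map.items() if same_amino_acid(a, b)]
changed = [(a, b) for a, b in residue_map.items() if not same_amino_acid(a, b)]

print("same amino acid:", conserved)
print("different amino acid:", changed)
```

Of the six listed positions, only the last pair carries the same amino acid in both viruses; the other five differ.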
49. • The spike protein of SARS-CoV-2 has a potential polybasic (furin) cleavage
site at the S1–S2 boundary through the insertion of 12 nucleotides.
• This change in the cleavage site allows effective cleavage by furin and
other proteases and has a role in determining viral infectivity and host
range.
• RaTG13, sampled from a Rhinolophus affinis bat, is ~96% identical
overall to SARS-CoV-2, and some pangolin coronaviruses exhibit strong
similarity to SARS-CoV-2 in the RBD, including all six key RBD residues.
• The SARS-CoV-2 spike protein showed 10-20% higher affinity for ACE2.
50. Fig. Spike Protein models for SARS-S and SARS-2-S (Hoffmann et al., 2020)
51. Pathogenesis
• Development of pneumonia symptoms in case of COVID-19 is due to
hyperactive immune system.
• Macrophages engulf a few viruses, present them to helper T cells and
release interleukin-1 and TNF-alpha.
• Fluids along with neutrophils and T-cells enter into alveoli due to
increased permeability which leads to pneumonia like symptoms.
• Helper T cells release interferons and stimulate cytotoxic T cells to act
on infected cells.
52. • Interferons and cytotoxic T cells fail to identify infected cells, as the virus
uses DMVs for viral replication and transcription.
• As viral load and infection rate increase inside the alveoli, the macrophages
release IL-1 and TNF-alpha, which in turn exacerbate pneumonia.
• Increased fluid inside the alveoli causes a decrease in O2 partial pressure in
the blood, leading to hypoxemia.
• Cells start dying due to decreased oxygen supply.
• CoVs show tissue tropism towards organs expressing the ACE2 enzyme, i.e.
lungs, small intestine, cardiac muscle, liver and kidney.
53. Coloured scanning electron micrographs of cells infected with SARS-COV-2.
Source: National Institute of Allergy and Infectious Diseases (NIAID)
54. Testing for SARS-COV-2
1. RT-PCR
• Uses reverse transcription to obtain DNA, followed by PCR to amplify
that DNA, creating enough to be analyzed.
• Time consuming, costly but with high accuracy.
2. Antigen test
• A nasopharyngeal swab is exposed to paper strips containing artificial
antibodies designed to bind to coronavirus antigens.
• Antigens bind to the strips and give a visual readout.
• Positive results from antigen tests are highly accurate, but there is a
higher chance of false negatives.
• Faster and cheaper method.
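For the RT-PCR test, the phrase "creating enough to be analyzed" comes down to exponential amplification: under ideal conditions each PCR cycle doubles the number of DNA copies. The Python sketch below shows that arithmetic. The function names and the `efficiency` parameter are assumptions for illustration (an efficiency of 1.0 means perfect doubling per cycle), and the starting copy number and target are arbitrary example values, not figures from any specific assay.

```python
import math


def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after a number of cycles; efficiency 1.0 means doubling each cycle."""
    return initial_copies * (1 + efficiency) ** cycles


def cycles_to_reach(initial_copies: float, target_copies: float,
                    efficiency: float = 1.0) -> int:
    """Smallest whole number of cycles needed to reach at least target_copies."""
    return math.ceil(math.log(target_copies / initial_copies) / math.log(1 + efficiency))


# Arbitrary example: 10 template copies after reverse transcription.
print(pcr_copies(10, 30))        # copies after 30 ideal cycles
print(cycles_to_reach(10, 1e9))  # cycles needed to exceed one billion copies
```

A sample that starts with fewer template copies needs more cycles to reach the same amount of DNA, which is why the number of cycles required for detection reflects how much viral RNA was in the swab.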