2. Introduction
• The first report of interspaced palindromic repeat sequences was made by Ishino et al. (Osaka
University) who detected five 29 bp repeats with 32 bp spacers near the iap gene of Escherichia coli .
• Further reports were made for Mycobacterium tuberculosis, Haloferax spp. and Archaeoglobus
fulgidus ; however, they were not detected in eukaryotic or virus sequences .
• The CRISPR locus, first observed in Escherichia coli, is present in about 84% of archaea and 45% of
bacteria according to the most recent update of the CRISPRdb.
• The source of the spacers was a sign that the CRISPR/cas system could have a role in adaptive
immunity in bacteria (Horvath P, & Barrangou R (2010), Science).
• The first publication proposing a role of CRISPR-Cas in microbial immunity, by the researchers at
the University of Alicante, predicted a role for the RNA transcript of spacers on target recognition in
a mechanism (Mojica FJ et al 2005).
7/27/2018
3. • The first direct evidence that CRISPR/Cas could protect bacteria from phages or plasmids was
provided by two key studies.
– Firstly, Barrangou et al. obtained phage resistant mutants following phage challenge of
Streptococcus thermophilus. The phage resistant strains had incorporated new spacers
matching to the viral genome while removal or addition of spacers matching to the phages
resulted in phage sensitivity or resistance, respectively.
– Subsequently, Marraffini and Sontheimer showed that CRISPR/Cas systems can also
prevent both conjugation and transformation of plasmids in Staphylococcus epidermidis.
– A 2010 study showed that CRISPR-Cas cuts both strands of phage and plasmid DNA in S.
thermophilus (Garneau JE et al 2010)
• In 2008, Brouns and Van der Oost identified a complex of Cas proteins (called Cascade) that
in E. coli cut the CRISPR RNA precursor within the repeats into mature spacer-containing RNA
molecules (crRNA), which remained bound to the protein complex.
7/27/2018
4. CRISPR (Clustered Regularly Interspaced Short Palindromic Repeat)
• It is a family of DNA sequences in bacteria and archaea and having response to viral and plasmid
challenges.
• The CRISPR is an array of short repeated sequences separated by spacers with unique sequences.
• The sequences contain short sequence of DNA from viruses that have attacked the prokaryote. These
are used by the prokaryote to detect and destroy DNA from similar viruses during subsequent attacks.
These sequences play a key role in a prokaryotic defense system ie. Adaptive immunity.
• CRISPR loci are transcribed, and the long primary transcript is processed into a library of short CRISPR-
derived RNAs (crRNAs) that each contain a sequence complementary to a previously encountered
invading nucleic acid.
• CRISPRs provide heritable ‘adaptive immune system’ that has a genetic memory of past genetic
incursions.
7/27/2018
5. Architecture and composition of CRISPR loci
• CRISPR/Cas systems comprise the following critical elements:
– a CRISPR array,
– the cas genes
• Generally, CRISPR repeat sequences are highly conserved within a given CRISPR locus.
• Highly diverse and hypervariable spacer sequences
• The size of CRISPR repeats and spacers varies between 23 to 47 bp and 21 to 72 bp, respectively.
7/27/2018
6. CRISPR Arrays: Repeats, Spacers and Leader Sequence
• In the CRISPR array, repeats alternate with spacer sequences.
• Within arrays, the repeats are typically identical in terms of length and sequence, but small
differences do occur.
• In particular, the repeat at the end of an array is often deviates more from the consensus
sequence.
• Repeats are usually between 23-47 bp in length.
• Many repeat sequences show palindromes or short inverted repeats and a predicted stable
secondary structure in the form of a stem-loop .
• However, those repeats, which are not palindromic and are predicted to be unstructured
and it has implications for the mechanism of pre-crRNA processing.
7/27/2018
7. SPACER
• Spacer sequences are mostly unique within a genome that confer the sequence-specific
immunity against extra-chromosomal agents.
• Many spacers have been matched to sequences originating from extra-chromosomal
sources such as phage or plasmids and other transferable elements.
• The sequences in the foreign genome from which spacers are derived are termed
protospacers and the majority of spacers are phage- or plasmid-derived.
• Spacer length can vary slightly throughout an array (typically by 1–2 nt) and spacer lengths
up to 72 bp have been reported, but usually the size is similar to that of the repeats in the
same array.
• New spacers are usually added at one side of the CRISPR, making the CRISPR a
chronological record of the viruses the cell and its ancestors have encountered.
• By adding new spacers new viruses can be recognized. The spacers are used as
recognition elements to find matching virus genomes and destroy them.7/27/2018
8. Leader sequence
• The third component of the CRISPR array is the leader sequence, which is located upstream of the
first repeat.
• The CRISPR leader, defined as a low-complexity, A/T-rich, noncoding sequence, located immediately
upstream of the first repeat, likely acts as a promoter for the transcription of the repeat-spacer array
into a CRISPR transcript, the pre-crRNA.
• This AT-rich sequence is about 200–500 bp long and includes the promoter necessary for
transcription of the array.
• The leader region is also important for the acquisition of new spacers.
7/27/2018
9. Cas Proteins
• Genes present in close proximity of CRISPR array that encode the CRISPR associated (Cas) proteins .
• Cas proteins provide the enzymatic machinery required for the acquisition of new spacers from, and
targeting, invading elements.
• The Cas proteins are a highly diverse group. Many are predicted or identified to interact with nucleic
acids; e.g. as nucleases, helicases and RNA-binding proteins.
• CRISPR/Cas systems are currently classified into type I, II and III, based on the phylogeny and
presence of particular Cas proteins.
• The Cas proteins are important for the differentiation between both the major CRISPR/Cas types and
the subtypes. The subtype specific proteins of some systems form a complex involved in targeting and
interference, which in the type I-E system is referred to as Cascade (CRISPR-associated complex for
antiviral defence).
• Two main groups of Cas proteins can be distinguished.
– The first group, which includes the core Cas proteins, is found across multiple types or subtypes.
• Cas1 and Cas2 (Cas2 is sometimes fused to another Cas protein) are found in every Cas operon across the three main types,
and Cas1 is considered the universal marker of CRISPR/Cas systems .
– Cas3, Cas9, and Cas6 are each specific for one major type, serving as signature proteins for type I, II, and III,
respectively .7/27/2018
10. CRISPR/Cas Mechanism
• The mechanism of CRISPR/Cas defence involves three stages- spacer
acquisition, expression (crRNA biogenesis) and interference.
• The first stage, adaptation, leads to insertion of new spacers in the
CRISPR locus.
• In the second stage, expression, the system gets ready for action by
expressing the cas genes and transcribing the CRISPR into a long
precursor CRISPR RNA (pre-crRNA). The pre-crRNA is subsequently
processed into mature crRNA by Cas proteins and accessory factors.
• In the third and last stage, interference, target nucleic acid is recognized
and destroyed by the combined action of crRNA and Cas proteins. i.e.
crRNA-Cas ribonucleoprotein complex targets the invading nucleic acid
for degradation.
7/27/2018
11. Adaptation via Spacer Acquisition
• New spacers are acquired from phages and plasmids i.e. Short pieces of DNA homologous to virus
or plasmid sequences and with duplication of the repeat, are added to the leader proximal end of
the CRISPR array.
• Characteristic length of approximately 30 bp, at the leader side of a CRISPR locus.
• Selection of spacer precursors (protospacers) is done by recognition of Proto-spacer-adjacent
motifs (PAMs).
• The insertion of new spacers has been experimentally demonstrated in several CRISPR-
Cas subtypes; Type I-A (Sulfolobus solfataricus, and Sulfolobus islandicus), I-B (Haloarcula
hispanica), I-E (E. coli) and I-F (Pseudomonas aeruginosa and Pectobacterium atrosepticum) and
Type II-A (S. thermophilus and a Streptococcus pyogenes system expressed in Staphylococcus
aureus).
• The first demonstration of spacer incorporation from phages and plasmids in the laboratory
was in the type II-A system of S. Thermophilus.
• The most well characterized system is the E. coli type I-E.
7/27/2018
13. Model of CRISPR/Cas adaptation (based on type I-E systems)
There are two types of spacer acquisition; naïve, when the invader has not
been previously encountered, and primed, when there is a pre-existing
record of the invader in the CRISPR
(a) Naïve acquisition.
Cas1 and Cas2 are required for acquisition of new spacers.
The first repeat at the leader end of the CRISPR array is duplicated
and incorporates a new spacer sequence from a protospacer.
The final nucleotide of the repeat is not duplicated but is provided
by the PAM nucleotide immediately adjacent to the protospacer.
(b) Priming acquisition.
Expression of the CRISPR array and generation of the crRNAs
against the foreign DNA results in binding / targeting, which is hypothesized
to aid in the generation of precursors for integration (possibly single- or
double-stranded).
Productive interference is not required for priming. Priming
acquisition requires Cascade-crRNA-Cas3 and Cas1-Cas2 and results in new
spacers derived from the same strand as the initial spacer.
7/27/2018
14. Cas Proteins Required for Adaptation
• In the E. coli type I-E system, Cas1 and Cas2 are required for spacer incorporation from plasmids and
phages and are conserved across all CRISPR/Cas types. Both Cas1 and Cas2 are nucleases and
mutations in the active site of Cas1 abolishes spacer integration in E. coli.
• Cas1 is a metal-dependent endonuclease and the Pseudomonas Cas1 cleaves dsDNA, generating ~80 bp
fragments , larger than expected for integration (~32–33 nt), indicating that other factors are involved.
• Cas2 from S. solfataricus cleaves ssRNA at U-rich regions in vitro, but ssRNA or ssDNA binding or
cleavage was not detected for Desulfovibrio vulgaris Cas2.
• Cas1 and Cas2 from E. coli form a complex where one Cas2 dimer binds two Cas1 dimers. Formation of
the complex is required for spacer acquisition but Cas2 nuclease activity is dispensable. Cas1
preferentially binds CRISPR DNA in a Cas2-dependent manner, further supporting a direct role in spacer
acquisition.
• Recently, the Bacillus halodurans Cas2 dimer was shown to contain Mg2+ -dependent endonuclease
activity against dsDNA and generated 120 bp products .
7/27/2018
15. • Spacer selection appears guided by certain sequence elements in the target. Analysis of target
sequences has revealed a short motif next to the target sequence called protospacer adjacent
motif (PAM) that is crucial for discrimination between self and non-self.
• The nuclease activity of Cas3 was proposed to be required for the feedback loop, generating
precursors for incorporation following the initial targeting event .
• However, spacers that match the foreign DNA but do not support targeting, due to single
nucleotide mutations, still promote priming .
• Priming is proposed to allow adaptation to phage and plasmids that mutate and escape the
initial spacers and multiple spacers would increase resistance and decrease the frequency of
escape.
• In both the Type I-E and Type II-A system, it is demonstrated that parts of the leader and one
repeat are required for spacer integration. Further, the leader-proximal repeat serves as
template for synthesis of the new repeat, probably by a strand separation mechanism.
7/27/2018
16. Protospacer Selection and Incorporation into CRISPR Arrays
• The minimal CRISPR requirements for spacer acquisition in the type I-E system are a
single ‘repeat’ and 60 bp of upstream sequence.
• As the CRISPR promoter is not contained within this 60 bp, this suggests that CRISPR
transcription is not required for incorporation, but indicates this region might be
recognized by Cas1 and/or Cas2 to enable the directionality of integration at the leader
end of the CRISPR array.
• In type I-E arrays with multiple repeats, the leader proximal repeat is the one that is
duplicated .
• In addition, the protospacer added to the CRISPR array contains the last nt of the PAM
motif, which becomes the final nt of the 5' repeat .
• Therefore, only the first 28 nt of the repeat are duplicated and thus, this unit has been
called the ‘duplicon’ .
7/27/2018
17. • This duplicon mechanism cannot apply to all CRISPR/Cas types since the
last nucleotide of some PAMs (e.g., CCN in S. solfataricus) is not
conserved, so could not contribute to the conserved repeats.
• In S. thermophilus, examination of the protospacers in the phage
genomes enabled the identification of a short 3' protospacer adjacent
motif (PAM).
• PAM sequences have since been identified via bioinformatics for other
CRISPR/Cas systems.
• In Type I and II systems interference requires the presence of a PAM
sequence and perfect protospacer-crRNA complementarity in the so-
called seed region, located adjacent to the PAM. The presence of a PAM
triggers “non-self activation”, which prevents the systems from attacking
its own CRISPR locus.7/27/2018
18. Expression, crRNA Generation and Interference
• Following the acquisition of spacers, the CRISPR array is expressed as a long pre-crRNA that is
processed into crRNAs .
• The mature crRNAs form part of a ribonucleoprotein complex, which targets and degrades the
foreign genetic material.
7/27/2018
20. Type I CRISPR–Cas systems
• Typical type I loci contain the cas3 gene.
• Currently, six subtypes of the Type I system are identified (Type I-A through Type I-F) that have a
variable number of cas genes.
• Here, regions within each repeat form a stem-loop, which is bound by the processing
endoribonuclease (belong to the group of Cas6 proteins include Cas6b in type I-B , Cas5d in type
I-D , Cas6e (CasE, Cse3) in type I-E and Cas6f (Csy4) in type I-F ).
• The pre-crRNA is cleaved in a sequence-specific manner at the downstream base of the stem-
loop, producing the mature crRNA consisting of a short 5' repeat handle followed by the spacer
sequence and the stem-loop of the next repeat .
• Encodes a large protein with separate helicase and DNase activities in addition to the proteins that
form Cascade-like complexes.
• RAMP;
• Repeat-associated mysterious proteins and Large superfamily of Cas proteins.
• Processes long spacer-repeat-containing transcript into a mature crRNA.
• Contain at least one RNA recognition motif (RRM/ ferredoxin-fold domain) and a characteristic glycine-rich
loop.
• Some possess sequence- or structure-specific RNase activity.
7/27/2018
21. • The crRNA and the endoribonuclease stay associated, and might serve as
nucleation point for the formation of Cascade .
• Cascade is a ribonucleoprotein complex formed by the subtype-specific Cas
proteins and the crRNA, initially identified in type I-E, but later also found in
type I-C and I-F.
• Based on the data from other type I systems, the type I-A Cascade is likely to
consist of the Cas6 endoribonuclease bound to the crRNA, a backbone of six
Cas7 proteins and a Cas5a/Csa5 heterodimer.
• During the final step of interference, the type I Cascade containing a specific
crRNA recognizes and binds the appropriate target DNA .
• In type I-E systems, a short loop within CasA recognizes potential targets via
the PAM sequence and it is proposed to bind them before the target DNA is
incorporated into Cascade
• In Type I-E systems, e.g. in E. coli, the pre-crRNA is cleaved in a metal-
independent mechanism by the Cas6e endoribonuclease.
7/27/2018
22. • In addition to the PAM requirement in the target, a non-contiguous 7 nt
sequence (nucleotides in position 1–5 and 7–8) within the first 8 nt of the
spacer, relative to the 5' end of the spacer within the crRNA, are essential for
initial target recognition and hence are termed the ‘seed sequence’ .
• Cascade with bound crRNA specifically binds to negatively supercoiled target
DNA (negative supercoiled topology serves as energy source for formation of
an R-loop).
• Furthermore, Cascade induces DNA bending before Cas3 is recruited for
subsequent DNA degradation .
• Cas3 is the signature protein of type I systems and possesses ATP-dependent
helicase activity as well as the ability to cleave ssDNA via an HD nuclease
domain.
• In the Type I-A system the Cas3 nuclease and helicase domains are encoded
as separate genes.
• It was revealed that, in all Type I systems the two domains act together to
processively degrade the double stranded DNA target.7/27/2018
23. Type II CRISPR–Cas systems
• Include HNH-type system.
• A very different mechanism of CRISPR expression and crRNA maturation
(Streptococcus pyogenes) .
• An abundant short trans-encoded transcript is essential for processing of pre-
crRNA into mature crRNA.
• Cas9, a single, very large protein generates crRNA and cleaves the target DNA.
It contains at least two nuclease domains;
– RuvC-like nuclease domain
– HNH (or McrA-like) nuclease domain.
• Possess endonuclease activity.
• pre-crRNA is cleaved by a mechanism that involves duplex formation between
a tracrRNA and part of the repeat in the pre-crRNA.
• Cleavage is catalysed by RNase III.
7/27/2018
24. • Therefore, these RNAs are termed trans-activating crRNAs (tracrRNA). These tracrRNAs
contain a 25 bp stretch with almost perfect complementarity to the type II CRISPR repeats
within the pre-crRNA .
• The type II repeats do not form stem-loops and this deficiency is overcome by pairing with the
tracrRNA .
• In addition to the tracrRNA, Cas9 (Csn1) and host RNase III are needed for the processing of
pre-crRNA into mature crRNA.
• The requirement of RNase III is the first discovery of a non-Cas protein being an essential part of
CRISPR/Cas defense machinery.
• Cas9 is not only involved in crRNA processing but, along with the tracrRNA:crRNA hybrid,
recognizes and degrades the target .
• Type II systems have been divided into subtypes II-A and II-B but recently a third, II-C, has been
suggested.
• The csn2 and cas4 genes, both encoding proteins involved in adaptation are present in Type II-A
and the Type II-B, respectively,
7/27/2018
26. • Furthermore, similar to the seed sequence in type I systems, complementarity
between crRNA and target over a 13 bp stretch proximal to the PAM is required
for interference and hence, phages with mutations in this region of the
protospacer region can evade interference .
• Cas9 structures from S. pyogenes and Actinomyces naeslundii reveal separate
lobes for target recognition and nuclease activity, accommodating the crRNA-
DNA heteroduplex in a positively charged groove at their interface.
• The recognition lobe is important for binding crRNA and target DNA, and the
nuclease lobe contains the HNH and RuvC nuclease domains that cleave the
complementary and non-complementary strands of the target, respectively.
• As the crRNA-induced reorientation of structural lobes facilitates DNA substrate
binding, loading of crRNA is proposed as a key step in Cas9 activation.
• In complex with the tracrRNA:crRNA structure, Cas9 binds the target dsDNA and
creates a double strand break by cleaving the complementary and non
complementary strand with its HNH- and RuvC-like domains, respectively .
7/27/2018
27. Type III CRISPR – Cas systems
• In type III CRISPR/Cas systems, expression and interference are predominantly
similar to those described for type I systems and the significant difference is
that the endoribonuclease Cas6 does not bind to a repeat stem-loop but to the
first bases of the repeat and that further crRNA maturation steps occur.
• Two sub-types:
– III-A (also known as Mtube or CASS6)
• Can target plasmids
– III-B (also known as the polymerase–RAMP module)
• Can target RNA
• At least two RAMPs involved in CRISPR transcript processing, in addition to
Cas6.
7/27/2018
28. • Indeed, not all repeats in those systems contain a strong palindrome and those associated with
type III-A are predicted to be mostly non-structured.
• In Pyrococcus furiosus, the binding site of Cas6 is located within nt 2–8 of the repeat and
cleaves at a distal site between bases 22 and 23 resulting in crRNAs with an 8 nt repeat handle
on the 5' end, the spacer sequence and 22 nt of the following repeat at the 3' end but this RNA
is not the final mature crRNA product.
• Cas6, unlike the type I endoribonucleases, only transfers the crRNA to the targeting complex
termed CMR for type III-B where they are further processed .
• In the type III-A system of S. epidermidis, a similar mechanism for processing by Cas6 was
observed and the csm2, csm3 and csm5 genes were required for further 3' processing and
crRNA maturation .
• The P. furiosus CMR complex includes the full subset of Cmr proteins (Cmr1–6) and can contain
two different species of crRNA , both with the 8 nt 5' end derived from the repeat and either
31 nt or 37 nt of the spacer .
7/27/2018
29. • The Csm complex in Type III-A typically includes six different proteins but the nuclease is not yet
identified.
• Early findings indicated that Csm complexes target DNA and Cmr complexes targets RNA .
• In Thermus thermophilus and S. thermophilus the Csm complex targets RNA, and
in T. thermophilus, which harbors both Type III-A and III-B systems, the Csm and Cmr complexes
share crRNA.
• Targeting of RNA and DNA by the same Cmr complex has been demonstrated in S. islandicus
• Base pairing between the last 14 nt of the crRNA and the RNA target seem to be essential for
cleavage. Target RNA cleavage occurs 14 nt upstream from the 3' end of the two mature crRNA
species, generating two cleavage sites .
• The ability of CRISPR/Cas systems to target DNA raised the question of how they avoid targeting
their own CRISPR arrays, which have perfect complementarity to the crRNAs they produce.
• For the type III-A system, this is avoided by requiring spacer:protospacer complementarity and
an absence of base-pairing to the 5' handle in the crRNA. This ensures that only non-self targets
are licensed for degradation, but whether the same principle applies to other types remains
unknown.
7/27/2018
30. • No PAMs are detected for Type III systems, instead the discrimination
between self and non-self is achieved by an extension of crRNA base pairing
into the repeat region of host DNA, which results in “self inactivation”, a
fundamentally different process to the PAM recognition used by Type I and
II systems.
• Overall comparison of the three types of CRISPR-Cas systems shows that
Type I and III systems share similarities in pre-crRNA processing as well as in
the structures of the crRNP complexes formed.
• All Type I and III systems utilize a Cas6 protein for pre-crRNA processing.
7/27/2018
32. • Targeted genome editing using customized nucleases provides a general method for
inducing targeted deletions, insertions and precise sequence changes in a broad range of
organisms and cell types.
• A crucial first step for performing targeted genome editing is the creation of a DNA
double-stranded break (DSB).
• Nuclease-induced DSBs can be repaired by;
– Non-homologous end joining (NHEJ)
• Efficient introduction of indels.
– Homology-directed repair (HDR)
• Specific point mutations or to insert desired sequences
7/27/2018
33. Genome editing
• Type II systems has been adapted for inducing sequence-specific DSBs and targeted
genome editing.
• Two components;
• Cas9 nuclease
• Guide RNA (gRNA), consisting of a fusion of a crRNA and a fixed tracrRNA
• Twenty nucleotides at the 5′ end of the gRNA direct Cas9 to a specific target DNA site
using standard RNA-DNA complementarity base-pairing rules.
• Target sites must lie immediately 5′ of a PAM sequence that matches the canonical
form 5′-NGG.
• Thus, Cas9 nuclease activity can be directed to any DNA sequence of the form N20-
NGG simply by altering the first 20 nt of the gRNA to correspond to the target DNA
sequence.
7/27/2018
34. • Cas9-induced DSBs have been used to introduce NHEJ-mediated
indel mutations as well as to stimulate HDR.
• Its simplicity has also inspired the generation of large gRNA
libraries using array-based oligonucleotide synthesis.
• In contrast to short hairpin RNA libraries, which mediate only gene
knockdown, these gRNA libraries have been used with Cas9
nuclease to generate libraries of cells with knockout mutations.
7/27/2018
35. Construction of CRISPR/Cas9 Vectors
• Plasmids for constructing customised gRNA- and Cas9-expressing vectors are available from
Addgene (https://www.addgene.org) and several other commercial companies including Life
Technologies, OriGene, and System BioSciences.
• The construction procedure only involves insertion of annealed oligonucleotides into the vectors,
which is much simpler than the procedures for ZFNs or TALENs.
• The gRNA and Cas9 can be expressed using either separate vectors or a single combined vector.
• pX330, a single vector expressing both gRNA and Cas9 nuclease with human U6 and chicken beta-
actin hybrid (CBh) promoters, respectively, was originally developed by the Feng Zhang laboratory
and has been very widely used for cell and animal genome editing.
7/27/2018
37. In bacterial type 2 CRISPR cas system,
the site specificity is defined by
complementery base pairing of a
small CRISPR RNA(crRNA
After annealing to a transactiviting
CRISPR RNA(tracrRNA) the crRNA
directly guides the cas9 endonuclease
to cleave the targeted DNA sequence.
The crRNA–transcr- RNA
heteroduplex could be replaced by one
chimeric RNA (so-called guide RNA
(gRNA)) and the gRNA could be
programmed to target specific sites.
Jinek et al.,2012
7/27/2018
38. • When applying the CRISPR/Cas system in genome editing, only two components are needed, namely a chimeric
guide RNA (gRNA) mimicking the crRNA–tracrRNA complex, and a Cas9 protein with nuclease activity (Fig. 2.2 ) .
• Although the targeting specificity is mainly dependent on the gRNA sequence, Cas9 also requires a few particular
bases, known as a protospacer adjacent motif (PAM).
• The PAM sequences vary among species. For example, Streptococcus pyogenes Cas9 (SpCas9) requires 5′-NGG-3′,
Streptococcus thermophilus Cas9 (StCas9) requires 5′-NNAGAAW-3′, and Neisseria meningitidis Cas9 (NmCas9)
requires 5′-NNNNGATT-3′ .
• Currently, SpCas9 is the most widely used for genetic engineering.
• The SpCas9- gRNA complex is known to initially seek out the PAM sequence in the genome, and subsequently
unwind the double-stranded DNA and form DNA-RNA base-pairing in a directional manner .
• When introducing a DSB, two nuclease domains, HNH and RuvC, independently induce a nick at the Watson and
Crick strands, resulting in a linear DSB between the bases at 3- and 4-bp upstream of the PAM sequence .
7/27/2018
40. • The gRNA structure is another important factor for CRISPR/Cas9-based genome editing.
• Although crRNA and tracrRNA can be separately transcribed like the naturally- occurring
CRISPR/Cas system, a chimeric gRNA structure is rather simple and often leads to high
activity .
• A chimeric gRNA consists of a crRNA-derived region at the 5′ end and a tracrRNA-derived
region at the 3′ end, and various modifications have been adopted in both regions by several
groups .
• Basically, the DNA-recognition sequence in the crRNA region is 20-bp long, but the addition
or truncation of a few bases can reportedly improve the specificity .
• The 3′ end of the crRNA region and the 5′ end of the tracrRNA region are generally linked
with four nucleotides (5′-GAAA-3′) to form a major stem loop, known as a tetraloop .
• The tracrRNA region has additional minor loops on the 3′ side, and these sequences are
known to be important for high gRNA expression .
7/27/2018
41. Targeting Specificity of CRISPR/Cas9
• As described earlier, the approximately 20-bp gRNA sequence and 3-bp PAM sequence of
SpCas9 define the targeting specificity of CRISPR/Cas9.
• Regarding the PAM sequence, SpCas9 has the ability to bind to 5′-NGA-3′ and 5′-NAG-3′ sites
as well as 5′-NGG-3′.
• Regarding the gRNA targeting sequence, the specificity decreases with increasing distance
from the PAM site.
• The sequence extending up to 12 bp adjacent to the PAM site is called the seed sequence,
and has relatively high targeting specificity .
• In some types of cultured cells, especially immortalized cell lines such as U2OS, HEK293T, and
K562, highly frequent off-target mutations have been observed by many groups.
• However, in normal cells such as mouse embryonic stem (mES) cells and organisms such as
mice and rats, the levels of induced off-target mutations do not seem to be as high .
7/27/2018
42. • In principle, the sequence limitation for targeting with CRISPR/Cas9
is only a PAM site.
• In practice, other actual limitations are as follows: 1) sequences
harboring poly-T should be avoided as gRNA target sequences,
because poly-T can work as a transcriptional terminator; and 2) the
number of potential off-target sites should be minimized.
• Currently, various tools for designing CRISPR/Cas9 target sites and
predicting potential off-target sites are available on the web. CRISPR
Design Tool (http:// crispr.mit.edu/ ), developed by the Feng Zhang
laboratory at MIT (Ran et al. 2013 ), is supposedly the most widely
used resource for designing and assessing gRNA target sequences .
7/27/2018
Target site gRNA design and Off-target effect prediction
43. 7/27/2018
• CRISPR Design Tool is used not only for design, but also for searching for off-target sites in the genomes of certain
species, including humans, rats, mice, zebrafish, flies, and nematodes. ZiFiT Targeter (http://zifi
t.partners.org/ZiFiT/) available on the website of the Zinc Finger Consortium is also commonly used for designing
gRNAs .
• Other web resources include E-CRISP ( http://www.e-crisp.org/E-CRISP/ ), CRISPRdirect ( http:// crispr.dbcls.jp/ ),
CRISPR Optimal Target Finder ( http://tools.fl ycrispr.molbio.wisc. edu/targetFinder/ ) (Gratz et al. 2014), CasOT (
http://eendb.zfgenetics.org/casot/ ) (Xiao et al. 2014 ), Cas-OFFinder ( http://www.rgenome.net/cas-offi nder/ )
(Bae et al. 2014 ), CHOPCHOP ( https://chopchop.rc.fas.harvard.edu/ ) (Montague et al. 2014 ), sgRNAcas9 (
http://www.biootools.com/col.jsp?id=103 ) (Xie et al. 2014 ), and CRISPy (http://staff.biosustain.dtu.dk/laeb/crispy/
) (Ronda et al. 2014 ).
45. CRISPR/Cas9 induced high rates (88–100%) of mutagenesis in the target
protein-encoding sites of MSTN.
MSTN-edited fry had more muscle cells (p < 0.001) than controls, and the
mean body weight of gene-edited fry increased by 29.7%.
46. • Evaluation of growth in myostatin (MSTN)-mutated one-month-old channel catfish fry. Body weight (A)
and body length (B) of mutant (blue) and wild type (red) (n = 330). (C,D) Representative images of the
ventral cross-sectional area of the epaxial muscle of wild-type (WT) (C) and mutant (individuals with
frame-shift mutation from MSTN-Mix group) (D), shown by Hematoxylin and Eosin (H&E) staining. Scale
bar in (C,D): 25 µm.
The first description of what would later be called CRISPR is from Osaka University researcher Yoshizumi Ishino and his colleagues in 1987. They accidentally cloned part of a CRISPR together with the iap gene, the target of interest. The organization of the repeats was unusual because repeated sequences are typically arranged consecutively along DNA.[19][20] They studied the relation of "iap" to the bacterium E. coli. The function of the interrupted clustered repeats was not known at the time
Barrangou and Horvath showed that Cas9 from the S. thermophilus CRISPR system can also be reprogrammed to target a site of their choosing by changing the sequence of its crRNA. These advances fueled efforts to edit genomes with the modified CRISPR-Cas9 system.
How are spacers actually integrated in the CRISPR array? Cas1 nuclease activity is required for nicking the CRISPR array in E. coli, and Cas1 is possibly responsible for the integration of the new spacer [48]. An in vitro study with Type I-E Cas1 and Cas2 confirm that the complex can insert DNA fragments into a CRISPR array by a mechanism reminiscent of retroviral integrases and transposases [31]. Whether or not the mechanism for spacer integration is conserved between the different systems remains to be determined.
In summary, available details of CRISPR adaptation are beginning to come to light, but there are still many questions regarding the exact mechanisms of protospacer recognition, spacer generation, repeat duplication, integration of the new spacer and the roles of the Cas proteins.
The palindromic nature of many CRISPR repeats is important to determine the position and direction of spacer integration into the array [31]. It is indicated that palindromic repeats form cruciform DNA structures that recruits Cas1 and Cas2 (Fig. 2) [31], [48], and such structures are known to be a target for Cas1 cleavage [42]. Interestingly, in vitro spacer integration can also be performed at other sequences predicted to form cruciform structures, in the absence of repeats [31]. Taken together, spacer integration is directed both by sequence and structure of the CRISPR.
A self / non-self mechanism does occur, since the incorporation of spacers from the chromosome is rare .
It is possible that DNA modification, such as by a restriction-modification-like mechanism, might play a role.