Pinky Sheetal V, M.Tech Bioinformatics

Experiment 2: Genome Rearrangement

Aim: To analyze the genome rearrangement between two chromosomal genomes using the SPRING tool.

Introduction:

The genome of an organism consists of a long string of DNA, cut into a small number of segments called chromosomes. Genes are stretches of the DNA sequence that are responsible for encoding proteins. Each gene has an orientation, either forward or backward, depending on the direction in which it is supposed to be read. A chromosome can thus be abstracted as an ordered set of oriented genes. In higher organisms, chromosomes are linear (their DNA sequence has a beginning and an end), but in lower organisms such as bacteria, the chromosome is circular (its DNA sequence has no beginning or end).

The most common and most studied mutations operating on DNA sequences are local: they affect only a very small stretch of the DNA sequence. These mutations include nucleotide substitutions (where one nucleotide is substituted for another), as well as nucleotide insertions and deletions. Most phylogenetic studies have been based on these types of mutations.

Genome rearrangement is a different class of mutation, affecting very large stretches of DNA sequence. A genome rearrangement occurs when a chromosome breaks at two or more locations (called the breakpoints), and the pieces are reassembled, but in the "wrong" order. This results in a DNA sequence that has essentially the same features as the original sequence, except that the order of these features has been modified. If the chromosome breaks occur in non-functional sequence, the rearrangement is unlikely to have any deleterious effects. On the other hand, a rearrangement whose breakpoints fall in functional sequence (e.g. genes) will almost certainly make the gene dysfunctional, rendering the organism unlikely to survive. Consequently, almost all genome rearrangements that become fixed in future generations involve intergenic breakpoints.

Tools: SPRING

SPRING (http://algorithm.cs.nthu.edu.tw/tools/SPRING/) is a tool for the analysis of genome rearrangement between two chromosomal genomes using reversals and/or block-interchanges. SPRING takes two or more chromosomes as its input and then computes a minimum series of reversals and/or block-interchanges between any two input chromosomes for transforming one chromosome into the other. The input of SPRING can be either bacterial-size sequences or gene/landmark orders. If the input is a set of chromosomal sequences, SPRING automatically searches for identical landmarks, which are homologous/conserved regions shared by all input sequences. SPRING also computes the breakpoint distance between every pair of chromosomes, which can be compared with the rearrangement distance to check whether the two measures are correlated. In addition, SPRING shows phylogenetic trees that are reconstructed from the rearrangement and breakpoint distance matrices.

LCBs (Locally Collinear Blocks) are the identical landmarks described above, that is, homologous/conserved regions shared by all input sequences. Basically, an LCB is a collinear set of multi-MUMs (exactly matching subsequences shared by all chromosomes considered, which occur only once in each chromosome and are bounded on either side by mismatched nucleotides). In practice, an LCB may correspond to a homologous region of sequence that is shared by all genomes and does not contain any genome rearrangements.
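The abstraction of a chromosome as an ordered set of oriented genes can be illustrated with a small sketch. It is not SPRING's algorithm: it only shows, under simplifying assumptions, what a single reversal does to a signed gene order and how breakpoints can be counted against a reference order. The function names `apply_reversal` and `breakpoint_count` are hypothetical, the chromosome is assumed to be linear, and the reference order is assumed to be 1, 2, ..., n with every gene in forward orientation.

```python
def apply_reversal(genome, i, j):
    """Reverse the segment genome[i..j] (inclusive) and flip each gene's sign.

    A positive number is a gene read forward, a negative number a gene read
    backward. Reversing a segment inverts both its order and its orientation.
    """
    segment = [-gene for gene in reversed(genome[i:j + 1])]
    return genome[:i] + segment + genome[j + 1:]


def breakpoint_count(genome):
    """Count breakpoints of a signed linear gene order against 1..n.

    The order is framed with 0 at the start and n + 1 at the end. Two
    neighbouring genes a, b form a conserved adjacency when b - a == 1,
    which covers both (+k, +(k+1)) and (-(k+1), -k). Every other
    neighbouring pair is counted as a breakpoint.
    """
    n = len(genome)
    framed = [0] + list(genome) + [n + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


# Hypothetical example: genes 2 and 3 sit in an inverted block.
rearranged = [1, -3, -2, 4]
print(breakpoint_count(rearranged))        # breakpoints before the reversal
restored = apply_reversal(rearranged, 1, 2)
print(restored, breakpoint_count(restored))
```

In this toy example one reversal of the inverted block restores the reference order, so the number of breakpoints drops to zero. Real rearrangement distances require searching for a minimum series of such operations, which this sketch does not attempt.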
Protocol:
1. Retrieve the query sequences from NCBI:
   Sequence 1: >gi|363806631|emb|FQ976726.3| Sus scrofa chromosome Y clone WTSI_1061-69M2
   Sequence 2: >gi|363806635|emb|FQ976728.4| Sus scrofa chromosome Y clone WTSI_1061-70O20
2. Input the sequences into SPRING.
3. View the results.

Result:

Interpretation:

S1 LCB:
LCB Number, Left and Right End Coordinates of LCB (LCB Length, LCB Weight)
1. 28262 : 28855 (594, 115)
2. 28973 : 33435 (4463, 1022)
The total LCBs cover 13.92% of the genome.

S2 LCB:
LCB Number, Left and Right End Coordinates of LCB (LCB Length, LCB Weight)
1. -26444 : -25669 (776, 115)
2. 12656 : 22329 (9674, 1022)
The total LCBs cover 29.19% of the genome.

Since both coordinates of LCB 1 in S2 are negative, this block is an inverted region located on the opposite strand of the given sequence.
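
The reading of the LCB listings above can be checked with a short sketch. It assumes each LCB line has the layout "number. left : right (length, weight)" as it appears in this report; the helper name `parse_lcb` and the dictionary keys are hypothetical and are not part of SPRING. The sketch recomputes each block's span from its end coordinates and marks a block as lying on the opposite strand when both coordinates are negative, following the interpretation given above.

```python
import re

# Assumed layout of one LCB line: "1. -26444 : -25669 (776, 115)"
LCB_PATTERN = re.compile(r"(-?\d+)\s*:\s*(-?\d+)\s*\((\d+),\s*(\d+)\)")


def parse_lcb(line):
    """Parse one LCB line into coordinates, length, weight, span and strand."""
    match = LCB_PATTERN.search(line)
    if match is None:
        raise ValueError(f"not an LCB line: {line!r}")
    left, right, length, weight = (int(value) for value in match.groups())
    return {
        "left": left,
        "right": right,
        "length": length,
        "weight": weight,
        # Span recomputed from the end coordinates (inclusive on both ends).
        "span": abs(abs(right) - abs(left)) + 1,
        # Both coordinates negative is read here as an inverted block.
        "strand": "-" if left < 0 and right < 0 else "+",
    }


s1_lines = ["1. 28262 : 28855 (594, 115)", "2. 28973 : 33435 (4463, 1022)"]
s2_lines = ["1. -26444 : -25669 (776, 115)", "2. 12656 : 22329 (9674, 1022)"]

for name, lines in (("S1", s1_lines), ("S2", s2_lines)):
    blocks = [parse_lcb(line) for line in lines]
    for number, block in enumerate(blocks, start=1):
        consistent = block["span"] == block["length"]
        print(name, number, block["strand"], block["length"], consistent)
    print(name, "bases covered by LCBs:", sum(b["length"] for b in blocks))
```

For the four blocks reported here, the recomputed span matches the listed length in every case, and only LCB 1 of S2 is flagged as inverted.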