Bio info 5

bioinformatics

Published in: Education, Technology

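  • Solve the system (1) a + b = 24, (2) a + c = 28, (3) b + c = 32. From (1), a = 24 - b. Substituting into (2): 24 - b + c = 28, so c - b = 4 and c = 4 + b. Substituting into (3): b + 4 + b = 32, so 2b = 28 and b = 14. Now put the value of b into (1): a = 24 - 14 = 10, and then c = 4 + 14 = 18.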
  • The evolutionary distance is expressed as the number of nucleotide differences per nucleotide site for each sequence pair. For example, sequences 1 and 2 are 20 nucleotides in length and have four differences, corresponding to an evolutionary distance of 4/20 = 0.2. Note that this analysis assumes that there are no multiple substitutions (also called multiple hits). Multiple substitution occurs when a single site undergoes two or more changes (e.g. the ancestral sequence … ATGT … gives rise to two modern sequences: … AGGT … and … ACGT …). There is only one nucleotide difference between the two modern sequences, but there have been two nucleotide substitutions. If this multiple hit is not recognized then the evolutionary distance between the two modern sequences will be significantly underestimated. To avoid this problem, distance matrices for phylogenetic analysis are usually constructed using mathematical methods that include statistical devices for estimating the amount of multiple substitution that has occurred.
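The calculation in this note can be sketched in a few lines of Python. The function names and the two 20-nucleotide example sequences are made up for illustration. The notes do not name a correction method, so the Jukes-Cantor formula is used here only as one well-known example of a correction for multiple substitutions.

```python
import math

def p_distance(seq1, seq2):
    """Fraction of sites that differ between two aligned, equal-length sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for x, y in zip(seq1, seq2) if x != y)
    return differences / len(seq1)

def jukes_cantor(p):
    """Jukes-Cantor corrected distance (assumed model, not named in the notes)."""
    if p >= 0.75:
        raise ValueError("p-distance too large for the Jukes-Cantor correction")
    return -0.75 * math.log(1 - 4 * p / 3)

# Hypothetical sequences: 20 nucleotides, 4 differing sites
seq1 = "ACGTACGTACGTACGTACGT"
seq2 = "TCGTAGGTACATACGAACGT"

p = p_distance(seq1, seq2)
print(p)                          # 0.2
print(round(jukes_cantor(p), 4))  # 0.2326
```

The corrected value is slightly larger than the raw 0.2, which matches the point of the note: ignoring multiple hits underestimates the distance.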
  • The lengths of the branches indicate the degree of difference between the genes represented by the nodes.
  • Therefore, one may transfer functional information from one protein to another if both possess a certain degree of similarity. However, this process must be carried out critically, as similar proteins may yet perform different functions, despite, for example, having arisen from a common ancestor.
  • Homology is not quantifiable: either two sequences are homologous or they are not. The identity or similarity of two sequences is, however, quantifiable.
  • Orthologs can be defined as "genes that have diverged after a speciation event... [that] tend to have similar function" (Fulton et al. 2006). Thus, orthologs are genes whose encoded proteins fulfill similar roles in different species. The importance of orthologs is quite simply seen when imagining a hypothetical comparison of two genes, A and B, that encode proteins with similar functions in two different species (human and chimp, for example). If one compares the two protein sequences encoded by the orthologs, the truly critical parts of the gene will be conserved. What has remained constant can probably be interpreted as crucial to the functioning; what has changed, minor. 
  • Two alpha chains plus two beta chains constitute HbA, which in normal adult life comprises about 97% of the total hemoglobin; alpha chains combine with delta chains to constitute HbA-2, which with HbF (fetal hemoglobin) makes up the remaining 3% of adult hemoglobin. Alpha thalassemias result from deletions of each of the alpha genes as well as deletions of both HBA2 and HBA1. LOCUS CONTROL REGION (LCR): Many thalassemias result from mutations in the coding regions of the globin genes, but a few were shown to map to a 12-kb region upstream of the β-globin gene cluster, the region now called the LCR. The ability of mutations in the LCR to cause thalassemia is a clear indication that disruption of the LCR results in a loss of globin gene expression.
  • Unlike identity, similarity is not as simple to calculate. Before similarity can be determined, it must first be defined how similar the building blocks of sequences are to each other. This is done with the help of similarity matrices, also known as substitution or scoring matrices. Similarity matrices specify the probability with which a sequence transforms into another sequence over time; this probability depends on the elapsed time and the mutation rate of nucleotides.
  • Here, one assumes that the four nucleotides do not show any similarity to one another, and therefore only identical nucleotides are factored into the similarity scoring.
  • One reason for this is the triplet-based genetic code (see Chap. 2). For an exchange of aspartic acid to glutamic acid to occur, only a mutation of the last nucleotide in the triplet codon is required.
  • For an exchange of aspartic acid to glutamic acid to occur, only a mutation of the last nucleotide in the triplet codon is required. In contrast, a complete mutation of the whole triplet has to occur in order to exchange aspartic acid for tryptophan (GAT/GAC to TGG).
  • An exchange of aspartic acid for tryptophan, therefore, could greatly alter the tertiary structure of a protein and consequently its function. Such striking amino acid exchanges accompanied by a loss of function rarely happen.
  • Fig. 4.2. Use of the BLOSUM62 matrix for the construction of an optimal amino acid alignment. Two potential alignments for each are represented, whereby the optimal alignment is shown in green.
  • THERE IS ASYMMETRY BETWEEN NUCLEIC ACIDS AND THEIR PRODUCTS i.e. PROTEINS
  • IT IS THE SIMPLEST, BUT NOT THE ONLY OR THE MOST CORRECT, SOLUTION. Here, one assumes that the four nucleotides do not show any similarity to one another, and therefore only identical nucleotides are factored into the similarity scoring.
  • Fig. 4.2. Scoring matrices allow the computation of optimal alignments. (a) Use of an identity matrix for the construction of an optimal nucleotide alignment. (b) Use of the BLOSUM62 matrix for the construction of an optimal amino acid alignment. Two potential alignments for each are represented, whereby the optimal alignment is shown in green.
  • Sometimes, interest may focus solely on aligning the most similar stretches within two sequences (a local alignment). With this approach, protein domains and motifs (e.g., ATP binding sites, DNA binding domains, N-glycosylation sites) can be identified. In principle, a local alignment is calculated in the same way as a global alignment, using a substitution matrix and the introduction and extension of gaps.
  • Fig. 4.4. Calculation of a global alignment of two similar protein sequences. (a) Both sequences are compared in a two-dimensional matrix and the similarity of the amino acids is determined using similarity matrices. Each alignment can be described as a path through the two-dimensional matrix, starting with the highest-scoring amino acid pair at the N-terminus. (b) By adding the values, corresponding scores for the different paths are obtained. The alignment with the highest score is considered optimal (shown in red). (c) The optimal alignment is obtained by the introduction of a gap and contains 10 amino acids, of which seven are identical. Using the BLOSUM62 similarity matrix and a gap penalty of 1.0, a score of 31.0 is achieved.
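The path-through-a-matrix idea in this caption can be sketched as a small dynamic-programming routine (the Needleman-Wunsch approach). This is only an illustration: instead of BLOSUM62 it uses a simple match/mismatch score, and the function name, default scores and example sequences are assumptions, so it will not reproduce the score of 31.0 from the figure.

```python
def global_align(s, t, match=1, mismatch=-1, gap=-1):
    """Global alignment of s and t with a simple scoring scheme (assumed values)."""
    n, m = len(s), len(t)
    # score[i][j] = best score for aligning s[:i] with t[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if s[i - 1] == t[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,  # align two residues
                              score[i - 1][j] + gap,       # gap in t
                              score[i][j - 1] + gap)       # gap in s
    # Trace one optimal path back from the bottom-right corner
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and s[i - 1] == t[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            top.append(s[i - 1]); bottom.append(t[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(s[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(t[j - 1]); j -= 1
    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))

best, row1, row2 = global_align("GATTACA", "GATACA")
print(best)  # 5: six matches and one gap
print(row1)
print(row2)
```

Replacing the match/mismatch lookup with a lookup in a substitution matrix would give the kind of scoring the figure describes.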
  • What are your findings? You were given an option of 25 amino acids and asked to prepare a suitable matrix.
  • The use of an outgroup to root a phylogenetic tree: The tree of human, chimpanzee, gorilla and orangutan genes is rooted with a baboon gene because we know from the fossil record that baboons split away from the primate lineage before the time of the common ancestor of the other four species. For more information on phylogenetic analysis of humans and other primates see
  • The design and angles of the phylogenetic tree do not change the evolutionary distance among the various taxa represented. Naeem
  • Is This tree Rooted?
  • This Tree is Rooted ?
  • Fig. 4.6. Phylogenetic tree of dopamine receptor sequences. The evolutionary relationship between the sequences is reflected by the length of the branches. Dopamine receptor sequences of invertebrates (Dm, Drosophila melanogaster; Ag, Anopheles gambiae; Am, Apis mellifera) are compared with those of humans (Hs, Homo sapiens). Three clear clusters are formed. As a control, the phylogenetically distant sequence of the Dm histamine receptor was not found in any of the clusters.
  • Orthologous: Refers to homologous genes located in the genomes of different organisms.
  • hap·lo·type n. 1. The set of alleles that determine different antigens but are closely linked on one chromosome and inherited as a unit, providing a distinctive genetic pattern used in histocompatibility testing. 2. The antigenic phenotype determined by closely linked genes inherited as a unit from one parent. The American Heritage® Medical Dictionary Copyright © 2007, 2004 by Houghton Mifflin Company. Published by Houghton Mifflin Company. All rights reserved. Haplotype: A set of alleles (an alternative form of a gene that can occupy a particular place on a chromosome) of a group of closely linked genes which are usually inherited as a unit. Mentioned in: Human Leukocyte Antigen Test
  • The pre-molecular view was that the great apes (chimpanzees, gorillas and orangutans) formed a clade separate from humans, and that humans diverged from the apes at least 15-30 MYA. Mitochondrial DNA, most nuclear DNA-encoded genes, and DNA/DNA hybridization all show that bonobos and chimpanzees are related more closely to humans than either are to gorillas.
  • Figure 16.14 Different interpretations of the evolutionary relationships between humans, chimpanzees and gorillas. See the text for details. Abbreviation: Myr, million years.
  • Abstract: Human immunodeficiency virus type 1 (HIV-1) transmission from infected patients to health-care workers has been well documented, but transmission from an infected health-care worker to a patient has not been reported. After identification of an acquired immunodeficiency syndrome (AIDS) patient who had no known risk factors for HIV infection but who had undergone an invasive procedure performed by a dentist with AIDS, six other patients of this dentist were found to be HIV-infected. Molecular biologic studies were conducted to complement the epidemiologic investigation. Portions of the HIV proviral envelope gene from each of the seven patients, the dentist, and 35 HIV-infected persons from the local geographic area were amplified by polymerase chain reaction and sequenced. Three separate comparative genetic analyses (genetic distance measurements, phylogenetic tree analysis, and amino acid signature pattern analysis) showed that the viruses from the dentist and five dental patients were closely related. These data, together with the epidemiologic investigation, indicated that these patients became infected with HIV while receiving care from a dentist with AIDS.
  • LIFE CYCLE OF HIV, A RETROVIRUS
  • Retrovirus genomes accumulate mutations relatively quickly because reverse transcriptase, the enzyme that copies the RNA genome contained in the virus particle into the DNA version that integrates into the host genome (see Section 2.4.2), lacks an efficient proofreading activity (Section 13.2.2) and so tends to make errors when it carries out RNA-dependent DNA synthesis. This means that the molecular clock runs rapidly in retroviruses.
  • This means that the molecular clock runs rapidly in retroviruses, and genomes that diverged quite recently display sufficient nucleotide dissimilarity for a phylogenetic analysis to be carried out. Even though the evolutionary period we are interested in is less than 100 years, HIV and SIV genomes contain sufficient data for their relationships to be inferred by phylogenetic analysis
  • In molecular biology, real-time polymerase chain reaction, also called quantitative real time polymerase chain reaction (Q-PCR/qPCR/qrt-PCR) or kinetic polymerase chain reaction (KPCR), is a laboratory technique based on the PCR, which is used to amplify and simultaneously quantify a targeted DNA molecule. For one or more specific sequences in a DNA sample, real-time PCR enables both detection and quantification. The quantity can be either an absolute number of copies or a relative amount when normalized to DNA input or additional normalizing genes. The procedure follows the general principle of polymerase chain reaction; its key feature is that the amplified DNA is detected as the reaction progresses in real time. This is a new approach compared to standard PCR, where the product of the reaction is detected at its end. Two common methods for detection of products in real-time PCR are: (1) non-specific fluorescent dyes that intercalate with any double-stranded DNA, and (2) sequence-specific DNA probes consisting of oligonucleotides that are labeled with a fluorescent reporter which permits detection only after hybridization of the probe with its complementary DNA target. Frequently, real-time PCR is combined with reverse transcription to quantify messenger RNA and non-coding RNA in cells or tissues.
  • Figure 16.15 The phylogenetic tree reconstructed from HIV and SIV genome sequences. The AIDS epidemic is due to the HIV-1M type of immunodeficiency virus. ZR59 is positioned near the root of the star-like pattern formed by genomes of this type. Based on Wain-Hobson (1998). These simian immunodeficiency viruses (SIVs) are not pathogenic in their normal hosts.

    1. 1. Bioinformatics Lecture #5. Dr. Naeem Ud Din Khattak, Professor, Department of Zoology, Islamia College Peshawar (Chartered University)
    2. 2. Phylogenetic Tree Construction
    3. 3. • The mutation distance: the minimal number of nucleotides that would need to be altered in order for the gene for one protein to code for the other. • Example: ACTGAT and TCTATC, aligned with gaps as ACTGAT- and TCT-ATC
    4. 4. The construction of the tree • Assume proteins A, B and C, and their mutation distances: A-B = 24, A-C = 28, B-C = 32. • There are two questions: 1. Which pair does one join together first? 2. What are the lengths of edges a, b, and c?
    5. 5. Which pair does one join together first? • Simply choose the pair with the smallest mutation distance. With A-B = 24, A-C = 28 and B-C = 32, A and B are joined first.
    6. 6. What are the lengths of legs a, b, and c? With A-B = 24, A-C = 28 and B-C = 32, where a, b and c are the legs leading to A, B and C: a + b = 24, a + c = 28, b + c = 32, which gives a = 10, b = 14, c = 18.
    7. 7. • i. a + b = 24 ii. a + c = 28 iii. b + c = 32 • From i, a = 24 - b; put the value of a in ii: • 24 - b + c = 28; c - b = 28 - 24; c - b = 4; c = 4 + b • Put the value of c in iii: b + 4 + b = 32; 2b + 4 = 32; 2b = 32 - 4; • b = 28/2 = 14 • Now put the value of b in i: a = 24 - 14 = 10, and then c = 4 + 14 = 18
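The same substitution can be written once in closed form and checked with a short Python sketch. The function names and the dictionary layout are illustrative choices, not part of the lecture; the distances are the ones from the slides.

```python
def closest_pair(distances):
    """Return the pair of taxa with the smallest mutation distance."""
    return min(distances, key=distances.get)

def three_point_branch_lengths(d_ab, d_ac, d_bc):
    """Solve a + b = d_ab, a + c = d_ac, b + c = d_bc for (a, b, c)."""
    a = (d_ab + d_ac - d_bc) / 2
    b = (d_ab + d_bc - d_ac) / 2
    c = (d_ac + d_bc - d_ab) / 2
    return a, b, c

# Mutation distances from the slides
distances = {("A", "B"): 24, ("A", "C"): 28, ("B", "C"): 32}

print(closest_pair(distances))                 # ('A', 'B')
print(three_point_branch_lengths(24, 28, 32))  # (10.0, 14.0, 18.0)
```

Adding the three equations gives 2(a + b + c) = 84, so each leg is half of (sum of the two distances that include it, minus the third), which is what the function computes.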
    8. 8. • Note that this analysis assumes that there are no multiple substitutions. • A multiple substitution occurs when a single site undergoes two or more changes, e.g. the ancestral sequence … ATGT … gives … AGGT … and … ACGT ….
    9. 9. Phylogenetic Tree Terminology • Terminal nodes (A, B, C, D, E): represent the TAXA (genes, populations, species, etc.) used to infer the phylogeny • Branches or lineages • Ancestral node or ROOT of the tree • Internal nodes or divergence points: represent hypothetical ancestors of the taxa. Based on lectures by C-B Stewart and by Tal Pupko
    10. 10. Phylogenetic trees diagram the evolutionary relationships between the taxa (Taxon A, Taxon B, Taxon C, Taxon D, Taxon E). ((A,(B,C)),(D,E)) = the above phylogeny as nested parentheses. Based on lectures by C-B Stewart and by Tal Pupko
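To make the nested-parentheses notation concrete, here is a minimal Python sketch that reads such a string and lists the group of taxa under each internal node. It handles only plain taxon names, with no branch lengths or internal labels, and the function names are invented for this example.

```python
def parse_tree(text):
    """Parse a string like '((A,(B,C)),(D,E))' into nested tuples of taxon names."""
    text = text.strip().rstrip(";").replace(" ", "")
    pos = 0

    def parse_node():
        nonlocal pos
        if text[pos] == "(":
            pos += 1                      # skip '('
            children = [parse_node()]
            while text[pos] == ",":
                pos += 1                  # skip ','
                children.append(parse_node())
            if text[pos] != ")":
                raise ValueError("expected ')' at position %d" % pos)
            pos += 1                      # skip ')'
            return tuple(children)
        start = pos
        while pos < len(text) and text[pos] not in "(),":
            pos += 1
        return text[start:pos]            # a leaf (taxon name)

    return parse_node()

def leaves(node):
    """All taxon names below a node."""
    if isinstance(node, str):
        return [node]
    names = []
    for child in node:
        names.extend(leaves(child))
    return names

def clades(node):
    """Sorted taxon list for every internal node, root first."""
    if isinstance(node, str):
        return []
    groups = [sorted(leaves(node))]
    for child in node:
        groups.extend(clades(child))
    return groups

tree = parse_tree("((A,(B,C)),(D,E))")
print(tree)
for group in clades(tree):
    print(group)
```

Each pair of parentheses corresponds to one internal node, so the groups printed are the root, (A, B, C), (B, C) and (D, E).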
    11. 11. Clade. ((A,(B,C)),(D,E)): B and C are more closely related to each other than either is to A, and A, B, and C form a clade that is a sister group to the clade composed of D and E. If the tree has a time scale, then D and E are the most closely related. Based on lectures by C-B Stewart and by Tal Pupko
    12. 12. Sequence Comparisons
    13. 13. • Nature acts conservatively, i.e., it does not develop a new kind of biology for every life form but continuously changes and adapts a proven general concept. • Novel functionalities do not appear because a new gene has suddenly arisen but are developed and modified during evolution. • Thus, alleles of a gene found in a population arise from a common ancestor gene and are therefore HOMOLOGOUS
    14. 14. Homology is not a measure of similarity; rather, it means that sequences have a shared evolutionary history and, therefore, possess a common ancestral sequence (Tatusov et al. 1997). • An all-or-none phenomenon
    15. 15. Orthologs • Homologous proteins from different species that possess the same function (e.g., corresponding kinases in a signal transduction pathway in humans and mice) are called orthologs. Paralogs • Homologous proteins that have different functions in the same species (e.g., two kinases in different signal transduction pathways of humans) are termed paralogs.
    16. 16. • A visual representation of orthologs (and some other commonly confused terms, paralogs and homologs)
    17. 17. Orthologs: "genes that have diverged after a speciation event... [that] tend to have similar function" (Fulton et al. 2006). Thus, orthologs are genes whose encoded proteins fulfill similar roles in different species.
    18. 18. • Homology is not quantifiable. • The similarity and identity of two sequences, however, ARE.
    19. 19. Identity • The ratio of the number of identical amino acids or nucleotides to the total number of amino acids or nucleotides. Example: two sequences of 20 nucleotides with 4 differences have an identity of 16/20 = 0.8; the fraction of differing sites is 4/20 = 0.2.
    20. 20. Similarity • Unlike identity, similarity is not as simple to calculate. Before similarity can be determined, it must first be defined how similar the building blocks of sequences are to each other. • This is done with the help of similarity matrices, which specify the probability with which a sequence transforms into another sequence over time. • This probability depends on the elapsed time and the mutation rate of nucleotides.
    21. 21. • For nucleotide sequences the simplest solution is an identity matrix ( Fig. 4.2a).
    22. 22. • For protein sequences, an identity matrix is not sufficient to describe biological and evolutionary processes.• Amino acids are not exchanged with the same probability as might be conceived theoretically.• YOU CAN RECALL THE SYNONYMOUS AND NON-SYNONYMOUS MUTATIONS
    23. 23. • For example, an exchange of aspartic acid for glutamic acid is frequently observed; • an exchange of aspartic acid for tryptophan is seen rarely.
    24. 24. • A second reason for the mutation of aspartic acid to glutamic acid to occur more often is that both have similar properties. • In contrast, aspartic acid and tryptophan are chemically different: the hydrophobic tryptophan is frequently found in the center of proteins, whereas the hydrophilic aspartic acid occurs more often at the surface.
    25. 25. • Amino acid substitution matrices, therefore, describe the probability with which amino acids are exchanged in the course of evolution. • The most commonly used amino acid scoring matrices are the PAM (Point Accepted Mutation; Dayhoff et al. 1978) and BLOSUM (Blocks Substitution Matrix; Henikoff and Henikoff 1992) groups.
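As a rough illustration of how an exchange probability turns into a matrix entry, the sketch below computes a log-odds score: how often a pair of amino acids is seen aligned, compared with how often it would be expected by chance. The function name is invented and all frequencies are made-up numbers, not values from PAM or BLOSUM. The scaling to half-bit units follows the convention commonly described for BLOSUM and is an assumption here.

```python
import math

def log_odds_score(observed_pair_freq, freq_a, freq_b, scale=2.0):
    """Rounded log-odds score: positive if the pair occurs more often than chance."""
    expected = freq_a * freq_b  # chance frequency of the pair
    return round(scale * math.log2(observed_pair_freq / expected))

# Hypothetical frequencies, for illustration only
# A pair seen more often than expected (e.g. chemically similar residues)
print(log_odds_score(0.02, 0.05, 0.05))   # 6
# A pair seen less often than expected (e.g. chemically different residues)
print(log_odds_score(0.001, 0.05, 0.05))  # -3
```

Frequently exchanged pairs therefore receive positive scores and rarely exchanged pairs negative ones, which is the pattern described for aspartic acid versus glutamic acid and tryptophan.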
    26. 26. Tryptophan (Trp, W): hydrophobic. Aspartic acid (Asp, D) and glutamic acid (Glu, E): hydrophilic, electrically charged (negative).
    27. 27. NUCLEOTIDE AND AMINO ACID SEQUENCES ARE EVOLUTIONARILY DIFFERENT, SO WE NEED DIFFERENT CRITERIA AND MATRICES TO ANALYZE THEM
    28. 28. • (Fig. 4.2a) For nucleotide sequences the simplest solution is an identity matrix
    29. 29. (Fig. 4.2b) For amino acid sequences we need similarity matrices. Scores of the two candidate alignments: 65 and 19
    30. 30. Calculation of a global alignment of two similar protein sequences.
    31. 31. Calculation of a global alignment of two similar protein Sequences
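    32. 32. Identity • The ratio of the number of identical amino acids or nucleotides to the total number of amino acids or nucleotides. Example: two sequences of 20 nucleotides with 4 differences have an identity of 16/20 = 0.8; the fraction of differing sites is 4/20 = 0.2.
    33. 33. Identity • The ratio of the number of identical amino acids or nucleotides to the total number of amino acids or nucleotides. Example: two sequences of 20 nucleotides with 4 differences have an identity of 16/20 = 0.8; the fraction of differing sites is 4/20 = 0.2.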
    34. 34. • Using MEGA to Calculate Mutation Distance
    35. 35. Outgroup to root a phylogenetic tree • The tree of human, chimpanzee, gorilla and orangutan genes is rooted with a baboon gene because we know from the fossil record that baboons split away from the lineage leading to the other four species earlier in geological time. • Let's see members of this group
    36. 36. Outgroup [Figure: tree of Chimp, Human, Gorilla and Orangutan (scale 0.01), and tree of Chimp, Human, Gorilla, Orangutan and Baboon (scale 0.02)]
    37. 37. Outgroup [Figure: tree of Kiwi, Ostrich, Swan, Ring-necked Pheasant, Silver Pheasant, Song Sparrow, Parrot and Lizard]
    38. 38. The design of the phylogenetic TREE does not change the evolutionary distance among the various taxa represented. [Figure: tree of Kiwi, Struthio camelus, Swan, Song Sparrow, Ring-necked Pheasant, Silver Pheasant, Parrot]
    39. 39. The design of the phylogenetic TREE does not change the evolutionary distance among the various taxa represented. [Figure: tree of Kiwi, Struthio camelus, Swan, Song Sparrow, Ring-necked Pheasant, Silver Pheasant, Parrot]
    40. 40. Types of trees: rooted trees. [Figure label: Common Ancestor]
    41. 41. Types of trees: an unrooted tree represents the same phylogeny without the root node
    42. 42. Fig. 4.6. Phylogenetic tree of dopamine receptor sequences.
    43. 43. Gene trees are not the same as species trees
    44. 44. Examples of what can be inferred from phylogenetic trees (DNA, protein): 1. Which species are the closest living relatives of modern humans? 2. Did the infamous Florida Dentist infect his patients with HIV? 3. What is the relation between HIV and SIV?
    45. 45. Relatives of modern humans? • Mitochondrial DNA, most nuclear DNA-encoded genes, and DNA/DNA hybridization: tree of Humans, Chimpanzees, Bonobos, Gorillas, Orangutans (time axis 14 to 0 MYA) • The pre-molecular view: tree of Gorillas, Chimpanzees, Bonobos, Orangutans, Humans (time axis 15-30 to 0 MYA)
    46. 46. 2. Did the Florida Dentist infect his patients with HIV? Phylogenetic tree of HIV sequences from the DENTIST, his Patients, & Local HIV-infected People. Tree tips, in order: DENTIST, Patient C, Patient A, Patient G, Patient B, Patient E, Patient A, DENTIST, Local control 2, Local control 3, Patient F, Local control 9, Local control 35, Local control 3, Patient D. Yes: the HIV sequences from these patients fall within the clade of HIV sequences found in the dentist. No: Patient F and Patient D. From Ou et al. (1992) and Page & Holmes (1998). Based on lectures by C-B Stewart and by Tal Pupko
    47. 47. 3. Relating Human HIV to Simian SIV retroviruses • Human immunodeficiency virus 1 (HIV-1): pathogenic • SIVs are not pathogenic in their normal hosts
    48. 48. The structure of HIV: CD4 proteins on surface; phospholipid membrane; matrix; capsid; viral RNA; viral enzymes (reverse transcriptase, integrase, protease). Image from: Medical Art Service, Munich / Wellcome Images.
    49. 49. HIV's replication cycle: HIV attaches to CD4 receptors on T-cell; viral core of enzymes and RNA injected into cell; DNA transcribed from viral RNA; double-stranded DNA produced; viral integrase: DNA integrates with host chromosome; transcription; viral RNA; viral proteins; viral protease cuts up proteins; new virus assembled; new virus leaves cell
    50. 50. Retrovirus genomes accumulate mutations relatively quickly • Reverse transcriptase lacks efficient proofreading, so it makes errors when it carries out RNA-dependent DNA synthesis. • The molecular clock runs rapidly in retroviruses.
    51. 51. • Genomes that diverged quite recently display sufficient nucleotide dissimilarity for a phylogenetic analysis to be carried out. • Even over less than 100 years, HIV and SIV genomes contain sufficient data.
    52. 52. • The starting point for this phylogenetic analysis is RNA extracted from virus particles. RT-PCR
    53. 53. RT-PCR: Reverse transcription polymerase chain reaction (RT-PCR) is a variant of polymerase chain reaction (PCR). It is a laboratory technique commonly used in molecular biology where an RNA strand is reverse transcribed into its DNA complement (complementary DNA, or cDNA) using the enzyme reverse transcriptase, and the resulting cDNA is amplified using PCR.
    54. 54. • This tree has a number of interesting features. First, it shows that different samples of HIV-1 have slightly different sequences, the samples as a whole forming a tight cluster, almost a star-like pattern, that radiates from one end of the unrooted tree.
    55. 55. • This star-like topology implies that the global AIDS epidemic began with a very small number of viruses, perhaps just one, which have spread and diversified since entering the human population. • The closest relative to HIV-1 among primates is the SIV of chimpanzees, the implication being that this virus jumped across the species barrier between chimps and humans and initiated the AIDS epidemic.
    56. 56. • However, this epidemic did not begin immediately: a relatively long uninterrupted branch links the center of the HIV-1 radiation with the internal node leading to the relevant SIV sequence, suggesting that after transmission to humans, HIV-1 underwent a latent period when it remained restricted to a small part of the global human population, presumably in Africa, before beginning its rapid spread to other parts of the world.
    57. 57. • Other primate SIVs are less closely related to HIV-1, but one, the SIV from sooty mangabey, clusters in the tree with the second human immunodeficiency virus, HIV-2. • It appears that HIV-2 was transferred to the human population independently of HIV-1, and from a different simian host. HIV-2 is also able to cause AIDS, but has not, as yet, become globally epidemic.
    58. 58. REFERENCES • http://www.bio.davidson.edu/Courses/Molbio/MolStudents/spring2010/Rydberg/Orthologs.html
