Good afternoon, everyone. First, I would like to thank you for agreeing to serve on my candidacy exam committee. The topic I chose to present is horizontal gene transfer between the fungal and plant kingdoms.
Here is the outline of my presentation. I will begin by briefly outlining what horizontal gene transfer (HGT) is, how it occurs, and why it matters. Then I will summarize and discuss the key findings of the paper I chose, a Plant Cell paper published by Richards et al. in 2009. The main goal of this paper is to test whether horizontal gene transfers have occurred between plants and fungi, using phylogenomic analysis of the proteomes of diverse plants, fungi, and other organisms.
To explain what horizontal gene transfer is, I first need to talk about vertical gene transfer. Vertical gene transfer refers to the passage of genes from one generation to the next through sexual or asexual reproduction. Horizontal gene transfer, in contrast, does not require reproduction and involves the movement of genetic material across species barriers. Most genes transferred through HGT probably fail to function, because the recipient organism degrades the genes, dies from mutations caused by their introduction, or lacks the ability to turn them on. In some cases, however, the introduced genes survive and become fixed in the recipient genome, especially when they help the recipient survive under harmful conditions, such as the presence of antibiotics, or allow it to use certain substrates as nutrients. In short, HGT is a mechanism that provides opportunities for rapid genome innovation. HGT appears to be a common mechanism of genome evolution in prokaryotes, but its role and frequency in eukaryotes have not yet been studied extensively. The rapid increase in genome sequencing has now opened up opportunities for studying HGT in the evolution of eukaryotes.
I would like to introduce two examples of HGT in eukaryotes. The first example is the plant-pathogenic fungus Fusarium oxysporum. Certain strains of the asexual species F. oxysporum have lineage-specific (LS) chromosomes. These chromosomes cover about one-quarter of the genome, and many genes on the LS chromosomes are closely related to pathogenicity. The presence of these chromosomes makes the strains carrying them pathogenic to a particular host plant. Horizontal transfer of some of these chromosomes from a tomato-pathogenic strain to a nonpathogenic strain has been demonstrated experimentally in the laboratory, and the recipient strains became pathogenic to tomato. The second example is carotenoid biosynthesis in the pea aphid, Acyrthosiphon pisum. Animals usually obtain carotenoids from food. However, Nancy Moran at the University of Arizona discovered a set of carotenoid biosynthetic genes in this insect (animal kingdom). Phylogenetic analyses imply that these genes were transferred from a fungus to an ancestor of aphid species, allowing aphids to synthesize carotenoids.
Table 1 presents the six plant genomes used in this study. This row is the total number of proteins. The second row is the number of plant proteins that are highly similar to fungal proteins, as determined by BLASTp. Transposable elements were excluded from this study. (5586)
Table 2 presents the nine candidate genes of potential HGT discovered by Richards et al. There are five fungi-to-plant transfers and four plant-to-fungi transfers. To find these nine candidates, they developed two pipelines: an HGT detection pipeline based on BLAST searches and gene clustering with OrthoMCL, and a phylogenomic analysis pipeline.
Angiospermae (flowering plants)
Arabidopsis: 20-25 cm tall, 3 mm in diameter. Genome of 135 million bp, 5 chromosomes, 33410 protein-coding transcripts; one of the smallest genomes among plants; 2000, Arabidopsis Genome Initiative. T-DNA insertions, ATMT. Light microscopy analysis. Large number of ecotypes (genetically distinct geographic varieties or populations adapted to specific environmental conditions). T-DNA-mediated transformation was first published in 1986.
Populus trichocarpa: known as black cottonwood or California poplar. Full genome sequence in 2006. 30-50 m tall, diameter of over 2 m. Dioecious; male and female catkins are borne on separate trees. Model species for trees. Genome: 403 Mb, 45778 protein-coding transcripts, 19 chromosomes; mitochondrial genome (803,000 bp, 52 genes), chloroplast genome (157,000 bp, 101 genes).
Oryza sativa: Asian rice. Genome of 372 Mb, 12 chromosomes. Easily genetically modified. (Temperate japonica, tropical japonica, aromatic, indica.) China, 11,500 BP. Model organism for cereal biology (JGI). When I compare the number of transcripts, the data are not consistent (51258 protein-coding transcripts).
Sorghum bicolor: 697 Mbp, 2n = 20 chromosomes, 36,338 protein-coding transcripts (JGI).
Selaginella moellendorffii: spikemosses are among the few surviving members of the lycophytes, an ancient group of plants whose origin can be traced back 400 million years. Only three families of lycophytes survive today, including the Selaginellaceae. 212.5 Mbp, 27 chromosomes.
Physcomitrella patens: a species of moss, a basal lineage of land plants that diverged before the acquisition of well-developed vasculature. 480 Mb, 27 chromosomes, 39727 loci, 39796 protein-coding transcripts (JGI).
In their HGT detection pipeline, the proteins encoded by the fully sequenced genomes of six plant species were used as query sequences and compared with proteins encoded by fungi, algae, protists, animals, and prokaryotes. Each plant protein sequence was compared with the proteins in the target sequence database using BLASTp, and the best similarity hit from each species was extracted with an e-value cutoff of 10^-20. They then identified the query proteins that picked up a fungal protein as the best hit. Sequences with close similarity to transposons (Repbase) were excluded because they wanted to study the effect of HGT on the putative functional proteomes of fungi and plants. As a result, 5586 proteins were identified through this analysis. These proteins were clustered with OrthoMCL, resulting in 943 clusters and 746 singletons, for a total of 1,689 sequence groups.
The main concern with using BLASTp to identify HGT candidates is that BLAST was not designed to accurately determine the evolutionary relationship between two sequences. Using BLASTp alone can therefore miss some candidate genes, because the closest BLAST hit is often not the nearest neighbor on a phylogenetic tree. To address this possibility, they also repeated the clustering step using the entire predicted proteome of rice, producing 3177 cluster groups.
The combined set of 4,866 sequence groups was subjected to phylogenomic analyses. Here, they again used BLASTp to add highly homologous sequences from the target database, and then built 4866 phylogenetic trees to identify potential HGT-like events. The basic topology criterion was this: gene or gene family phylogenies in which a plant gene branches within a cluster of sequences from fungal taxa (or vice versa) were identified. After this analysis, only 38 phylogenies remained.
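The BLAST-based screening step can be illustrated with a minimal sketch. This is not the authors' code: the hit tuples, protein names, and the function `fungal_best_hit_queries` are hypothetical, and only the 10^-20 cutoff and the group counts come from the notes above.

```python
E_VALUE_CUTOFF = 1e-20  # cutoff quoted in the notes

# Hypothetical toy hits: (plant query, subject protein, subject kingdom, e-value)
hits = [
    ("plantA", "fungus_p1", "fungi", 1e-45),
    ("plantA", "bact_p9", "prokaryote", 1e-30),
    ("plantB", "animal_p3", "animal", 1e-60),
    ("plantB", "fungus_p7", "fungi", 1e-25),
    ("plantC", "fungus_p2", "fungi", 1e-10),  # fails the cutoff
]


def fungal_best_hit_queries(hits, cutoff=E_VALUE_CUTOFF):
    """Return queries whose best (lowest e-value) passing hit is fungal."""
    best = {}
    for query, subject, kingdom, evalue in hits:
        if evalue > cutoff:
            continue  # hit is too weak to be considered
        if query not in best or evalue < best[query][2]:
            best[query] = (subject, kingdom, evalue)
    return sorted(q for q, (_, kingdom, _) in best.items() if kingdom == "fungi")


candidates = fungal_best_hit_queries(hits)
print(candidates)

# Bookkeeping of the sequence groups reported in the notes
blast_groups = 943 + 746            # OrthoMCL clusters + singletons
all_groups = blast_groups + 3177    # plus the rice orthology clusters
print(blast_groups, all_groups)
```

In this toy data only plantA is kept: the best hit of plantB is an animal protein, and the only hit of plantC does not pass the cutoff. A real run would read tabular BLASTp output and would also need the Repbase filter, which is omitted here.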
(The HGT detection pipeline ends here, in the middle of the right box.) Thirty-five trees were derived from the proteins identified through the BLAST-based survey, two were from the rice orthology clusters, and one candidate phylogeny was discovered by both methods. Before conducting more detailed analyses, they added more protein sequence data from GenBank and the TBestDB database (an EST database) and manually adjusted the sequence alignments for a more robust analysis. Adding the additional sequences produced better-quality phylogenies, judging from multiple statistical methods. The final test for HGT was whether the resulting phylogeny was supported by one of four lines of evidence, which I will summarize in the following slides. In the end, they discovered 9 candidate genes.
The first line of evidence is much the same as the previous criterion: a gene phylogeny in which a plant gene sequence branches within a cluster of sequences from fungal taxa (or vice versa). Here, however, the topology needs to be strongly supported by bootstrap values. With this criterion, they found three candidates. Although these cases seem pretty clear, they also ran an alternative topology test, the SH (Shimodaira-Hasegawa) test, using CONSEL. The SH test evaluates the difference in likelihood score between the maximum likelihood tree and every other tree compared. This test was especially important when bootstrap values were weak, as in the second line of evidence. Even when the topology is not well supported by bootstrap values, the alternative topology test provides p-values with which the other trees can be rejected at a certain confidence level. When the p-value of an alternative tree is below 0.05, we can reject it at the 95% confidence level.
=================================================
According to Thomas Buckley's 2001 MBE paper, the SH test has two advantages. First, the SH test simultaneously compares multiple topologies and corrects the corresponding p-values to accommodate the multiplicity of testing. Second, the SH test is correct when applied to a posteriori hypotheses (what is a posteriori?). Null hypothesis: the compared trees are not different. We can reject the null hypothesis at a certain confidence level.
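The topology criterion from the first line of evidence (a plant sequence branching within a cluster of fungal sequences) can be sketched as a small check on a rooted tree. This is only an illustration under stated assumptions: trees are written as nested tuples, leaf labels use a hypothetical `kingdom:name` format, and bootstrap support and the SH test are ignored.

```python
def leaves(tree):
    """Flatten a nested-tuple tree into its list of leaf labels."""
    if isinstance(tree, str):
        return [tree]
    return [leaf for child in tree for leaf in leaves(child)]


def kingdom(leaf):
    """Kingdom part of a hypothetical 'kingdom:name' leaf label."""
    return leaf.split(":", 1)[0]


def is_nested_within(tree, guest="plant", host="fungi"):
    """True if a guest leaf branches inside a clade made only of guest and
    host leaves, with host leaves both next to it and in a sibling clade."""
    if isinstance(tree, str):
        return False
    if {kingdom(l) for l in leaves(tree)} == {guest, host}:
        child_sets = [{kingdom(l) for l in leaves(c)} for c in tree]
        mixed = any(s == {guest, host} for s in child_sets)
        pure_host = any(s == {host} for s in child_sets)
        if mixed and pure_host:
            return True
    return any(is_nested_within(child, guest, host) for child in tree)


# Plant sequence nested inside the fungal cluster: HGT-like topology
nested = ((("plant:Pp", "fungi:asco1"), "fungi:asco2"), "prokaryote:b1")
# Plant sequence merely sister to the fungal cluster: not flagged
sister = ((("fungi:asco1", "fungi:asco2"), "plant:Pp"), "prokaryote:b1")

print(is_nested_within(nested), is_nested_within(sister))
```

Swapping the arguments (`guest="fungi"`, `host="plant"`) gives the "vice versa" case, a fungal sequence nested within plants.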
1a, L-Fucose permease, sugar transporter (Fungi > Plant)
The fungal taxonomy is also consistent. Does the bryophyte need sugar? Homeostasis.
According to the Gunn study in 1994, the L-fucose permease sugar transporter from the bryophyte P. patens represents a distinct family of sugar transporters when compared with other sugar transporter families. L-Fucose is also present in appreciable quantities in soil environments. In particular, the methylpentose L-fucose is a constituent of many glycoproteins and glycolipids synthesized by microorganisms.
What is a glycoprotein? A protein with oligosaccharide chains (glycans) covalently attached to polypeptide side chains. Glycoproteins are often important integral membrane proteins and play an important role in cell-cell interactions. Glycolipids are lipids with a carbohydrate attached; their role is to provide energy or to serve as markers for cellular recognition. What is a carbohydrate? An organic compound, Cm(H2O)n, here attached to a lipid. Carbohydrates are found on the outer surface of all eukaryotic cell membranes. Homeostasis
============ EBI Supplementary ================
L-Fucose (6-deoxy-L-galactose) is a monosaccharide found in glycoproteins and cell wall polysaccharides. L-Fucose is used in bacteria through an inducible pathway mediated by at least four enzymes: a permease, an isomerase, a kinase, and an aldolase, which are encoded by fucP, fucI, fucK, and fucA, respectively. Parent: Major facilitator superfamily. GO: transmembrane transport.
Why does the plant need sugar(?) Can it not synthesize sugar(?)
The third line of evidence is prokaryote-mediated transfer. Here you can see that the prokaryotic and fungal sequences are evolutionarily quite close, with strong bootstrap values. If we find a plant gene in this cluster, it is highly likely to be an HGT from fungi to plants within a transfer chain that involves prokaryotes. The prokaryote tagged-chain transfer hypothesis says that, in the absence of strong phylogenetic tree topology support, a putative eukaryote-to-eukaryote HGT can be inferred when it is rooted by an uncontroversial case of prokaryote-to-eukaryote HGT. 3a (siderophore biosynthesis) is a clear case, because non-pathogenic bacteria living in low-iron environments are known to have this function. Siderophore biosynthesis is also known to be important for fungal pathogenicity, and it was most likely transferred to the plant through a chain of transfers involving bacteria.
The last line of evidence is pretty similar to evidence 1 and 2, but it focuses more on gene families. 4A and 4B have strong bootstrap values, whereas 4C has weak bootstrap support. 4A and 4B are gene families that are restricted to a diverse collection of plant genomes but are also present in a single fungal species and a very low number of prokaryotic genomes. 4C is an additional gene family present only in plants and two closely related leotiomycetes. Using these four lines of evidence, nine candidate genes were found.
=========== Supplementary ====================
What is bootstrap in phylogeny? One of the most commonly used tests of the reliability of an inferred tree is Felsenstein's bootstrap test. Given m sequences, each with n amino acids, a phylogenetic tree is reconstructed using some tree-building method. Then n sites (alignment columns) are randomly chosen with replacement, giving rise to a new alignment of m rows and n columns. These constitute a new set of sequences. A tree is then reconstructed from the new sequences using the same tree-building method as before, and its topology is compared with that of the original tree. Each interior branch that differs from the original tree is given a score of 0, and each that is the same is given a score of 1. The procedure of resampling the sites and reconstructing the tree is repeated a hundred times, and the percentage of replicates in which an interior branch scores 1 is its bootstrap value. Generally, if the bootstrap value is more than 95%, the interior branch is considered highly reliable, according to Nei and Kumar.
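The resampling idea behind the bootstrap can be sketched in a few lines. To keep the example self-contained, a nearest-neighbour lookup by Hamming distance stands in for real tree building; that substitution, the toy alignment, and all function names are assumptions made for illustration only.

```python
import random


def resample_columns(alignment, rng):
    """Draw n alignment columns with replacement; return the new alignment."""
    n = len(alignment[0])
    cols = [rng.randrange(n) for _ in range(n)]
    return ["".join(seq[c] for c in cols) for seq in alignment]


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbour(alignment, index):
    """Index of the sequence closest to alignment[index] (ties: lowest index)."""
    others = [i for i in range(len(alignment)) if i != index]
    return min(others, key=lambda i: hamming(alignment[index], alignment[i]))


def bootstrap_support(alignment, index, replicates=100, seed=0):
    """Percentage of replicates that recover the original nearest neighbour."""
    rng = random.Random(seed)
    original = nearest_neighbour(alignment, index)
    score = 0
    for _ in range(replicates):
        replicate = resample_columns(alignment, rng)
        # Score 1 if the grouping is the same as in the original data, else 0
        score += nearest_neighbour(replicate, index) == original
    return original, 100.0 * score / replicates


# Hypothetical toy alignment: m = 4 sequences, n = 8 sites
toy = ["MKTAYIAK", "MKTAYLAK", "MRSGWLPQ", "MRSGFLPE"]
print(bootstrap_support(toy, 0))
```

In a real analysis each replicate alignment would be passed to the same tree-building method as the original data, and support would be tallied per interior branch rather than per nearest neighbour.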
3a, iucA/iucC family protein, siderophore biosynthesis (Fungi > Plant)
They also noticed that the gene tree is not consistent with the current fungal taxonomy. Dictyostelium (soil-living amoeba).
Here is an example of HGT involving siderophore biosynthesis. I will explain on the next slide why siderophores are important in fungi and plants. As you can see, the lycophyte sequence is grouped within many fungal species. You can also see that the prokaryotic sequences group closely, with high bootstrap support. Resistance, homeostasis
=================== Discussion in paper ======================
The siderophore biosynthetic protein containing the iucA/iucC domain represents a putatively transferred function that confers an advantage by providing a means of taking up iron from the environment. Acquisition of siderophores allows fungal pathogens to obtain critical iron present at low concentrations; however, these functions evolved in non-pathogenic bacteria living in non-host-associated, iron-poor environments. Recent reports have suggested that siderophore production and homeostasis are important for fungal pathogens of both animals and plants.
According to the 2008 Annual Review of Phytopathology article “Siderophores in Fungal Physiology and Virulence”, siderophores are essential to fungal homeostasis and pathogenicity. They allow fungi to acquire iron from the environment. It was interesting that the lycophyte also likely acquired siderophore biosynthesis from bacteria. Siderophores are carriers of iron. The iron-loaded siderophores are then recognized by specific receptors on the outer membrane of the cell. In fungi, the Fe-siderophore complex is transferred to the cell membrane. Why is iron important to the homeostasis of the cell? It exists in all living organisms. Examples of iron-containing proteins found in higher organisms include hemoglobin and cytochromes. Iron binding is very important to the function of these proteins. Iron is the metal ion incorporated into the heme complex. Iron is also important in plant defense mechanisms. One example is the typical ferritin, which consists of 24 subunits and can accommodate up to 4,000 iron atoms.
4a, DUF239 domain protein (Plant > Fungi); 4b, Phosphate-responsive 1 family protein (Plant > Fungi)
DUF: domain of unknown function. According to Richards et al., the plant-to-fungi HGT hypothesis is based on the proposition that this gene family is a Plantae-specific innovation that has then undergone transfer to the fungus Batrachochytrium dendrobatidis.
4b, gene family! This gene family has numerous plant-specific paralogs. Plant chloroplasts in particular require a lot of phosphate, since they synthesize sugar by photosynthesis. Phosphate is also a very important component of the DNA backbone. Chytrid fungus B. dendrobatidis, bacterium S. usitatus. This protein is suggested to be involved in phosphate-induced plant cell division. It is hard to predict the function at this time. Nutrient deficiency. Homeostasis and symbiosis (long-term relationship conferring gene family).
Why phosphate? Phosphates are most commonly found in the form of adenosine phosphates (AMP, ADP, ATP) and in DNA and RNA. The phosphate backbone is very important. Phosphates supply a large amount of energy to living organisms. The addition or removal of phosphate from proteins in all cells is a pivotal strategy in the regulation of metabolic processes. Why? Phosphate is also useful in animal cells as a buffering agent. Phosphate salts commonly used for preparing buffer solutions at cell pH include Na2HPO4. Nutrients, homeostasis
=================== EBI Supplementary ====================
DUF239: This domain is found in a number of Arabidopsis thaliana and other plant proteins of unknown function. A small number of the proteins that contain this domain are annotated as carboxyl-terminal proteinase-like.
IPR006766, Phosphate-induced protein 1: This entry represents a family of conserved plant proteins. A conserved region in these proteins was identified in a phosphate-induced protein of unknown function.
The putative functions of the nine candidate genes were discussed. Of the nine putative HGTs identified, four were conserved hypothetical proteins or proteins with domains of unknown function; therefore, putative functional annotation was not possible (3b, 4a, 4c).
1a: FucP facilitates the uptake of L-fucose (6-deoxy-L-galactose); sugar transporter; homeostasis, symbiosis.
1b: a putative two-domain zinc-binding alcohol dehydrogenase is encoded by an HGT candidate transferred from the Plantae clade to the chytrid B. dendrobatidis.
1c: membrane transporter gene (MFS1); single-polypeptide secondary carrier; signal transduction.
2: phospholipase/carboxylesterase protein family. Its members are capable of hydrolyzing carboxylic ester bonds and have broad substrate specificity across the protein family.
3a: siderophores are low-molecular-mass iron chelators employed by both fungi and bacteria for iron uptake and storage.
4a: According to Richards et al., the plant-to-fungi HGT hypothesis is based on the proposition that this gene family is a Plantae-specific innovation that has then undergone transfer to the fungus Batrachochytrium dendrobatidis.
4b: gene family! This gene family has numerous plant-specific paralogs. Plant chloroplasts in particular require a lot of phosphate, since they synthesize sugar by photosynthesis. Phosphate is also a very important component of the DNA backbone. Chytrid fungus B. dendrobatidis, bacterium S. usitatus. This protein is suggested to be involved in phosphate-induced plant cell division. It is hard to predict the function at this time. Nutrient deficiency.
4c: They mentioned that this putative transfer should be treated with caution and needs more sequence data from a wider range of species. If it is true, I guess it is related to resistance and pathogenicity: a trigger of plant resistance genes. What is a zinc finger? Zinc fingers were first identified as a DNA-binding motif in TFIIIA from Xenopus laevis (African clawed frog). According to the EBI description, they bind DNA, RNA, protein, and lipid substrates. Zinc finger-containing proteins function in gene transcription, translation, mRNA trafficking, cytoskeleton organisation, and so on.
This diagram shows the overall picture of HGT between plants and fungi. Interestingly, there are no HGTs in angiosperms. In conclusion, HGT between plants and fungi has played only a very minor role in their evolution (less than 0.53%). Compared with the 10-20% of HGT in prokaryotes, it is very rare. The five transfer events from fungi to plants were all acquisitions by a lycophyte or a bryophyte. Why, then, do these early-diverging lineages acquire genes from others?
Taphrinomycotina
Chytridiomycota: the most primitive of the fungi, mostly saprobic (degrading chitin and keratin). The cell wall in chytrids is composed of chitin.
Zygomycete: zygote fungi; terrestrial in habitat, living in soil or on decaying plant or animal material. Zygomycota contains approximately 1% of the described species of true fungi. The most familiar representatives include the fast-growing molds that we encounter on spoiled strawberries and other fruits high in sugar content. Zygomycota live as saprophytes on substrates such as fruit, soil, and dung, and as harmless inhabitants of arthropod guts. Some are also pathogens of other fungi (mycoparasites).
Basidiomycete: 30,000 described species, which is 37% of the described species of true fungi. Mushroom (sexual reproductive structure). Terrestrial ecosystems, freshwater, and marine habitats. Many Basidiomycota obtain nutrition by decaying dead organic matter, including wood and leaf litter. Carbon cycle! Dikaryon (each cell in the thallus contains two haploid nuclei resulting from a mating event). The basidium is the cell in which karyogamy (nuclear fusion) and meiosis occur, and on which haploid basidiospores are formed.
Ascomycete: sac fungi, monophyletic, ascus. It is within the ascus that nuclear fusion and meiosis take place.
First, current genome sampling could introduce some bias. For example, did the sequenced organisms come from the same ecological niche? Is the number of sampled genomes sufficient? This bias should diminish as the number of available whole genomes increases. Second, TEs play a significant role in genome evolution, yet they were excluded. For example, TEs in Fusarium oxysporum are very important as pathogenicity factors. Schaack et al., in a 2010 review in Trends in Ecology and Evolution, argued that the introduction of transposable elements by horizontal transfer into eukaryotic genomes has been a major force of genomic variation. I also found a number of typos and errors in the paper. Finally, I would like to learn the two pipelines they developed and apply them to HGT between oomycetes and fungi. I think this research was truly a pioneering project based on strong phylogenetic analysis and bioinformatics.
HGT was first described in Japan in 1959 and 1960. Akiba and Ochiai described transferable drug resistance in Shigellae. Later, they also suggested that multiple drug resistance may be transferred from multiple-drug-resistant E. coli to Shigellae.
1a, L-Fucose permease, sugar transporter (Fungi > Plant)
Does the bryophyte need sugar? The L-fucose permease sugar transporter from the bryophyte P. patens represents a distinct family of sugar transporters when compared with other sugar transporter families, and L-fucose is present in appreciable quantities in soil environments.
============ EBI Supplementary ================
L-Fucose (6-deoxy-L-galactose) is a monosaccharide found in glycoproteins and cell wall polysaccharides. L-Fucose is used in bacteria through an inducible pathway mediated by at least four enzymes: a permease, an isomerase, a kinase, and an aldolase, which are encoded by fucP, fucI, fucK, and fucA, respectively. Parent: Major facilitator superfamily. GO: transmembrane transport.
Why does the plant need sugar(?) Can it not synthesize sugar(?)
1b, Zinc-binding alcohol dehydrogenase (Plant > Fungi)
Do fungi (Chytridiomycete) need zinc? Members of this entry form a distinct subset of the larger family of oxidoreductases that includes zinc-binding alcohol dehydrogenases and NADPH:quinone reductases. The gene neighbourhood of members of this family is not conserved, and it appears that no members are characterised. Sequence alignments reveal 6 invariant cysteine residues and one invariant histidine.
1c, Major facilitator superfamily, membrane transporter (Fungi > Plant)
Putative transporter protein (single-polypeptide secondary carrier). What kind of things will be transported?
================ EBI Supplementary =================
Transporters can be grouped into two classes, primary and secondary carriers. The primary active transporters drive solute accumulation or extrusion by using ATP hydrolysis, photon absorption, electron flow, substrate decarboxylation, or methyl transfer. If charged molecules are unidirectionally pumped as a consequence of the consumption of a primary cellular energy source, an electrochemical potential results. This potential can then be used to drive the active transport of additional solutes via secondary carriers. Among the different transporters, the two largest families that occur ubiquitously in all classifications of organisms are the ATP-Binding Cassette (ABC) primary transporter superfamily (see PDOC00185) and the Major Facilitator Superfamily (MFS). The MFS transporters are single-polypeptide secondary carriers capable only of transporting small solutes in response to chemiosmotic ion gradients [1, 2]. They function as uniporters, symporters, or antiporters. In addition, their solute specificities are diverse. MFS proteins contain 12 transmembrane regions (with some variations).
2. Phospholipase/carboxylesterase family protein (Fungi > Plant), Ascomycete > Lycophyte
IPR003140: This family consists of both phospholipases and carboxylesterases with broad substrate specificity, and is structurally related to alpha/beta hydrolases.
3a, iucA/iucC family protein, siderophore biosynthesis (Fungi > Plant)
Discussion in paper: The siderophore biosynthetic protein containing the iucA/iucC domain represents a putatively transferred function that confers an advantage by providing a means of taking up iron from the environment. Acquisition of siderophores allows fungal pathogens to obtain critical iron present at low concentrations; however, these functions evolved in non-pathogenic bacteria living in non-host-associated, iron-poor environments. Recent reports have suggested that siderophore production and homeostasis are important for fungal pathogens of both animals and plants.
3b, Unknown/conserved hypothetical protein (Fungi>Plant) Ascomycetes > Bryophyte
4a, DUF239 domain protein (Plant > Fungi); 4b, Phosphate-responsive 1 family protein (Plant > Fungi)
=================== EBI Supplementary ====================
DUF239: This domain is found in a number of Arabidopsis thaliana and other plant proteins of unknown function. A small number of the proteins that contain this domain are annotated as carboxyl-terminal proteinase-like.
IPR006766, Phosphate-induced protein 1: This entry represents a family of conserved plant proteins. A conserved region in these proteins was identified in a phosphate-induced protein of unknown function.
4c. Unknown/conserved hypothetical protein with similarity to zinc finger (C2H2-type) protein Plant > Fungi
Table 1 presents the nine HGT candidate genes. There are five fungi-to-plant transfers and four plant-to-fungi transfers. We will discuss them further after going over the methods. To find these nine candidates, they developed two pipelines: an HGT detection pipeline with BLAST and OrthoMCL, and a phylogenomic analysis pipeline.
Here, I present some challenging questions. First, how long ago did these HGT events occur? To find the answer, we need to consider the theory of molecular and genome evolution. By comparing the sequence variation between the donor's protein and the recipient's protein, we could make a hypothetical estimate of the time of transfer. Second, can DNA move back and forth between donors and recipients? This seems relatively easy in prokaryotic HGT, but it is a really challenging question for eukaryotic transfers. For example, in prokaryote-mediated transfers, eukaryotic genes have an exon-intron structure, so how can a transferred gene with that structure be recognized in a prokaryote? If we can find a directionality in HGT, or learn that there is no such directionality, it can be applied to theories of genome evolution, because HGT is a source of rapid change in organisms. Third, does the environment affect gene transfer? In HGT, coexistence in the same ecological niche would be the first and most important condition, because the organisms need to interact with each other. However, even when they interact, some environmental conditions can affect the interaction. Fourth, there are probably more mechanisms of HGT in nature. For example, TE-associated transfer was introduced and has been intensively investigated recently; there are probably other mechanisms as well. Finally, can the transferred genes become fixed at the population level? These five questions are fundamental and have significant implications for understanding genome evolution.
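For the first question, a rough way to turn sequence divergence into a time estimate is a strict molecular clock. The sketch below is only an illustration under strong assumptions that are not from the paper: a Poisson correction for multiple substitutions, a constant rate in both lineages, and a hypothetical substitution rate supplied by the user.

```python
import math


def poisson_distance(p):
    """Poisson-corrected distance d = -ln(1 - p), where p is the proportion
    of amino acid sites that differ between donor and recipient proteins."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    return -math.log(1.0 - p)


def time_since_transfer(p, rate_per_site_per_year):
    """Strict-clock estimate T = d / (2r): both lineages accumulate
    substitutions after the transfer, hence the factor of 2."""
    return poisson_distance(p) / (2.0 * rate_per_site_per_year)


# Hypothetical numbers: 18% differing sites, rate of 1e-9 per site per year
print(f"{time_since_transfer(0.18, 1e-9):.3g} years")
```

Transferred genes may well evolve at unusual rates in their new genome, so an estimate like this would be a first guess at best.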