Comparative genomics to the rescue: How complete is your plant genome sequence?
1. Comparative genomics to the rescue
How complete is your plant genome sequence?
Klaas Vandepoele
Ghent University - VIB, Belgium
5th Plant Genomics & Gene Editing Congress
16-17 March 2017, Amsterdam
plaza_genomics
2. Plant genome sequencing is booming
New and faster sequencing technologies
Generating a draft genome sequence has become cheap
The number of published plant genomes grows exponentially
>150 published plant genomes (credit: Usadel lab)
3. From read data to knowledge
The basic genome analysis toolkit:
Genome assembly
Structural annotation shows where genes are
Functional annotation tells you what genes do
Data availability of genome sequence & gene annotation
Facilitate biological discovery
4. Yet another “draft” plant genome
What is the quality and completeness of plant genome sequences?
The N50 denotes that 50% of the total assembly length is contained in scaffolds of length N50 or longer
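The N50 definition above can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: the function name `n50` and the scaffold lengths are made up for the example.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are sorted from longest to shortest and summed until the
    running total reaches at least half of the total assembly length;
    the length of the scaffold that crosses that point is the N50.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), total = 300
scaffold_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(scaffold_lengths))  # prints 70: scaffolds of 80 and 70 hold 150 of 300
```

A single number like this says nothing about gene content, which is why the following slides turn to gene space completeness.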
6. Quality of a genome: what to expect?
Transcript mapping
7. Transcript mapping: tools & settings
• The transcript mapping score is stable (standard deviation < 1%) in bin sizes of at least 3,000 ESTs
• It is challenging to obtain a correct estimate of the assembled gene space. The estimate is influenced by mapping tools and coverage cutoffs
Intron-aware transcript mapping using GMAP
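One way to check how stable a transcript mapping score is for a given bin size is to subsample the library repeatedly and look at the spread of the scores. The sketch below assumes the mapping itself has already been done (for example with GMAP) and reduced to one mapped/unmapped flag per EST. The function names, the synthetic library and the number of bins are assumptions made for illustration, and the toy data do not reproduce the results on the slide.

```python
import random
import statistics


def mapping_score(flags):
    """Percentage of transcripts that mapped (flags are True/False)."""
    return 100.0 * sum(flags) / len(flags)


def binned_scores(flags, bin_size, n_bins, seed=0):
    """Mapping scores of n_bins random subsamples of bin_size transcripts."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [mapping_score(rng.sample(flags, bin_size)) for _ in range(n_bins)]


# Synthetic library: 20,000 ESTs of which 95% map (made-up numbers)
flags = [True] * 19000 + [False] * 1000

scores = binned_scores(flags, bin_size=3000, n_bins=50)
spread = statistics.stdev(scores)
print(f"mean = {statistics.mean(scores):.2f}%, sd = {spread:.2f}%")
```

Repeating this for several bin sizes shows from which library size onward the score stops depending on the sample drawn.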
8. Transcript mapping: library size
• If the libraries contain more than 10,000 ESTs, the EST mapping scores for A. thaliana libraries converge to the same value as for subsampling bins of >10,000 ESTs.
• RNA-Seq de novo assembled transcripts can lead to over-estimation of the expected number of genes (allelic transcripts, splice variants and fragmented transcripts) or to under-estimation due to the failure to reconstruct low-abundance transcripts
9. Estimating gene space completeness along an evolutionary scale
[Figure: expected gene spaces along an evolutionary scale, from evolutionarily conserved to species-specific, influenced by within-species diversity and between-species diversity. Labels on the species tree of life: CEGMA (248, single copy); BUSCO (952, single copy); PLAZA CoreGF (7k gene families); PLAZA CoreGF (3k gene families); transcript mapping.]
13. Evaluation
• Arabidopsis and Oryza have consistently high completeness scores
• Over-estimation of completeness by CEGMA
• Lolium: discrepancy between genome and gene set completeness
14. Improving Lolium gene annotation
# loci
2 Transcriptomes, aligned with GenomeThreader
    de novo assembly: 300k
    Orthology-guided assembly: 80k
4 Proteomes, aligned with GenomeThreader
    Brachypodium distachyon: 16k
    Oryza sativa: 11k
    Sorghum bicolor: 11k
    Zea mays: 10k
2 Annotation sets
    Byrne et al. (2015): 28k
    ab initio predictions: 41k
EVM consensus: 39,967
Haas et al. (2008), Gremme et al. (2005), Ruttink et al. (2013)
15. Updated completeness scores Lolium
[Figure: completeness score (%), axis from 75 to 100, for Byrne et al. (2015) and the EVM consensus. Labels on the species tree of life: CEGMA (248, single copy); BUSCO (952, single copy); PLAZA CoreGF (7k gene families); PLAZA CoreGF (3k gene families); transcript mapping.]
>900 new coreGF loci found in the genome!
16. Evaluation
• Arabidopsis and Oryza have consistently high completeness scores
• Over-estimation of completeness by CEGMA
• Lolium: discrepancy between genome and gene set completeness
• Cicer: EST mapping score much lower than BUSCO gene set or coreGF score
More than half of the unmapped sequences are of non-plant origin (mostly from Fusarium oxysporum)
Proper taxonomic binning of expected transcripts is essential!
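The effect of taxonomic binning on a mapping score can be illustrated with a toy example. The sketch assumes every transcript already carries a mapped/unmapped flag and a taxonomic label from some prior classification step. The record layout, the `Viridiplantae` label used as the plant bin and all counts are hypothetical and are not the data behind the Cicer result.

```python
from collections import Counter

PLANT_LINEAGE = "Viridiplantae"  # assumed label for transcripts of plant origin


def mapping_score(records):
    """Percentage of transcripts flagged as mapped to the genome."""
    return 100.0 * sum(r["mapped"] for r in records) / len(records)


def make(n, mapped, lineage, species):
    """Build n identical toy transcript records."""
    return [{"mapped": mapped, "lineage": lineage, "species": species}
            for _ in range(n)]


# Toy library: 7 plant transcripts (6 mapped) plus 3 unmapped contaminants
records = (
    make(6, True, "Viridiplantae", "Cicer sp.")
    + make(1, False, "Viridiplantae", "Cicer sp.")
    + make(2, False, "Fungi", "Fusarium oxysporum")
    + make(1, False, "Metazoa", "unidentified")
)

# Keep only transcripts of plant origin before scoring
plant = [r for r in records if r["lineage"] == PLANT_LINEAGE]

# Tally the origin of unmapped non-plant transcripts
contaminants = Counter(
    r["species"] for r in records
    if r["lineage"] != PLANT_LINEAGE and not r["mapped"]
)

raw_score = mapping_score(records)
clean_score = mapping_score(plant)
top_contaminant = contaminants.most_common(1)[0][0]

print(f"raw score:   {raw_score:.1f}%")    # 60.0%
print(f"clean score: {clean_score:.1f}%")  # 85.7%
print(f"most frequent contaminant: {top_contaminant}")
```

In this toy library the low raw score comes from contamination rather than from missing gene space, which is the pattern the slide describes.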
17. Guidelines to assess the quality of a new genome sequence
1. Estimate genome size using different methods
2. Define and evaluate the expected gene space based on transcript mapping AND evolutionary conservation
    • Cleaning and mapping transcripts
    • Prefer coreGF/BUSCO over CEGMA to model expected conserved genes
3. Large differences in completeness scores between genome assembly and annotated gene set can point to gene prediction issues
4. To perform cross-species genome comparisons, focus on genomes with complete and contiguous assemblies
Veeckman, E., Ruttink, T., and Vandepoele, K. (2016). Are We There Yet? Reliably Estimating the Completeness of Plant Genome Sequences. Plant Cell 28, 1759-1768.
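Guidelines 2 and 3 can be illustrated with a simplified completeness score: the percentage of expected conserved gene families that are detected. This is only a sketch under stated assumptions. It uses an unweighted fraction, which may differ from how the published coreGF score is computed, and the family identifiers and the 10-point threshold are arbitrary choices made for the example.

```python
def completeness(expected, found):
    """Percentage of expected gene families that were detected."""
    expected = set(expected)
    return 100.0 * len(expected & set(found)) / len(expected)


# Ten hypothetical core gene families, GF001 .. GF010
core = {f"GF{i:03d}" for i in range(1, 11)}

# Families detected in the genome assembly vs. in the annotated gene set
in_genome = core - {"GF010"}
in_gene_set = core - {"GF008", "GF009", "GF010"}

genome_score = completeness(core, in_genome)      # 90.0
gene_set_score = completeness(core, in_gene_set)  # 70.0
gap = genome_score - gene_set_score

THRESHOLD = 10.0  # arbitrary cutoff in percentage points, for illustration only
if gap > THRESHOLD:
    print(f"genome {genome_score:.0f}% vs gene set {gene_set_score:.0f}%: "
          "possible gene prediction issues")
```

Families found in the assembly but absent from the gene set are natural candidates for re-annotation.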
18. • Gene family annotation and phylogenetic trees
• Traceable functional annotation (GO/InterPro/MapMan)
• Colinearity and synteny
• Integrative gene orthology inference
Highly integrative platform to translate knowledge from model to crop
• 55 species/genomes
• Highly scalable design
• Web-based mobile user interface
• Integrated Workbench for analysis of sets of genes
http://bioinformatics.psb.ugent.be/plaza/
19. Coverage gene function information
Blue = primary GO; green = GO projection (orthology + homology)
Gene descriptions
Gene Ontology (Biological Process)
20. TRAPID: analysis of non-model transcriptomes
Homology-based ORF detection, including frameshift correction
Gene family assignment
Functional annotation based on Gene Ontology and/or protein domains
Two reference databases: PLAZA 2.5 and OrthoMCL-DB
Applications
Sugar cane, wheat, Crocus sativus, conifers, Coffea arabica, Prunus
Dinoflagellates, diatoms, worms, fishes
SRA Viridiplantae transcriptomic
Van Bel, … & Vandepoele, Genome Biology 2013
21. Drought Tolerance Conferred to Sugarcane by Association with Gluconacetobacter diazotrophicus: A Transcriptomic View of Hormone Pathways
Vargas et al., PLoS One 2014
22. Further reading
Veeckman, E., Ruttink, T., and Vandepoele, K. (2016). Are We There Yet? Reliably Estimating the Completeness of Plant Genome Sequences. Plant Cell 28, 1759-1768.
Proost, S., Van Bel, M. … and Vandepoele, K. (2015). PLAZA 3.0: an access point for plant comparative genomics. Nucleic Acids Res 43 (Database issue), D974-D981.
Vandepoele, K. (2017). A Guide to the PLAZA 3.0 Plant Comparative Genomic Database. Methods Mol Biol 1533, 183-200.
Van Bel, M., Proost, S., Van Neste, C., Deforce, D., Van de Peer, Y., and Vandepoele, K. (2013). TRAPID: an efficient online tool for the functional and comparative analysis of de novo RNA-Seq transcriptomes. Genome Biol 14, R134.
plaza_genomics
Code freely available to efficiently compute coreGF completeness score
Want a free PDF? Check out PLAZA poster
23. PLAZA 3.0 user statistics (2016)
>11,000 users (+13%), 370K page views (+30%)
Users from >95 countries
Intensively used by academia (>400 citations) and industry
25. PLAZA Workbench
Create a custom gene set (~experiment) using gene identifiers or BLAST
External/internal gene IDs (e.g. AN3, AT5G28640, GRMZM2G180246_T01)
BLAST interface can be used to map sequence data from a non-model species to a reference species present in PLAZA
A toolbox is available to analyze user-defined gene sets (~experiment)
2,132 registered users processed 11,875 Workbench experiments