Experimental Designs in Next Generation Sequencing
Introduction
Types of experimental designs
Basic NGS chemistry
Tools used in NGS
Good and Bad experimental designs
Pasteur Institute User Story - Cheminfo Stories 2020 Day 5 (ChemAxon)
Here, we present an updated version of iPPI-DB, our manually curated database of PPI modulators. In this release, the data model, the graphical interface and the tools to query the database have been completely redesigned. We used Chemaxon MarvinJS and JChem library to support this development. We added new PPI modulators, new PPI targets, and extended our focus to stabilizers of PPIs as well. Finally, we introduce a web application relying on crowdsourcing for the maintenance of the database. This application can be used outside of our group to collaboratively maintain iPPI-DB within a community of curators.
Presentation given at the NBT / ECCB 2020, presenting COMBINE standards. Also providing links to related projects, introducing open model repositories and giving some hints for creating reusable models.
Introduction to Gene Mining Part A: BLASTn-off! (adcobb)
In this lesson, students will learn to use bioinformatics portals and tools to mine plant versions of human genes. Student handout and teacher resource materials are available at www.Araport.org, Teaching Resources (Community tab). Suitable for grades 9-12 or first year undergraduate students.
ECCMID 2016 - How to build actionable virulome databases (João André Carriço)
Talks given at the Session SY024 - Controversies in interpreting whole genome sequence data
9-April-2016 : http://eccmidlive.org/#resources/how-can-we-design-actionable-virulome-databases
Branch: An interactive, web-based tool for building decision tree classifiers (Benjamin Good)
A crucial task in modern biology is the prediction of complex phenotypes, such as breast cancer prognosis, from genome-wide measurements. Machine learning algorithms can sometimes infer predictive patterns, but there is rarely enough data to train and test them effectively and the patterns that they identify are often expressed in forms (e.g. support vector machines, neural networks, random forests composed of 10s of thousands of trees) that are highly difficult to understand. In addition, it is generally unclear how to include prior knowledge in the course of their construction.
Decision trees provide an intuitive visual form that can capture complex interactions between multiple variables. Effective methods exist for inferring decision trees automatically but it has been shown that these techniques can be improved upon via the manual interventions of experts. Here, we introduce Branch, a new Web-based tool for the interactive construction of decision trees from genomic datasets. Branch offers the ability to: (1) upload and share datasets intended for classification tasks (in progress), (2) construct decision trees by manually selecting features such as genes for a gene expression dataset, (3) collaboratively edit decision trees, (4) create feature functions that aggregate content from multiple independent features into single decision nodes (e.g. pathways) and (5) evaluate decision tree classifiers in terms of precision and recall. The tool is optimized for genomic use cases through the inclusion of gene and pathway-based search functions.
Branch enables expert biologists to easily engage directly with high-throughput datasets without the need for a team of bioinformaticians. The tree building process allows researchers to rapidly test hypotheses about interactions between biological variables and phenotypes in ways that would otherwise require extensive computational sophistication. In so doing, this tool can both inform biological research and help to produce more accurate, more meaningful classifiers.
A prototype of Branch is available at http://biobranch.org/
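Item (5) in the abstract above mentions evaluating decision tree classifiers in terms of precision and recall. The following is a minimal sketch of that evaluation, not Branch's own code: the hand-built tree `toy_tree`, the gene names `GENE_A` and `GENE_B`, the thresholds, and the sample values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, Sequence, Tuple


def precision_recall(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float]:
    """Return (precision, recall) for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def toy_tree(sample: Dict[str, float]) -> int:
    """Hypothetical manually built tree with two expression thresholds."""
    if sample["GENE_A"] > 5.0:
        return 1
    if sample["GENE_B"] > 2.0:
        return 1
    return 0


# Made-up expression values and labels (1 = poor prognosis, 0 = good).
samples = [
    {"GENE_A": 6.1, "GENE_B": 0.5},
    {"GENE_A": 4.0, "GENE_B": 3.0},
    {"GENE_A": 2.0, "GENE_B": 1.0},
    {"GENE_A": 7.5, "GENE_B": 2.5},
    {"GENE_A": 3.0, "GENE_B": 1.5},
]
labels = [1, 0, 0, 1, 1]

predictions = [toy_tree(s) for s in samples]
precision, recall = precision_recall(labels, predictions)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision penalizes false alarms and recall penalizes missed positives, so reporting both gives a fairer picture of a manually edited tree than accuracy alone when the classes are imbalanced.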
KnetMiner provides an easy-to-use web interface to visualisation and data mining tools for the discovery and evaluation of candidate genes from large-scale integrations of public and private data sets. It addresses the needs of scientists who generally lack the time and technical expertise to review all relevant information available in the literature, from key model species and from a potentially wide range of related biological databases. We have previously developed genome-scale knowledge networks (GSKNs) for multiple crop and animal species (Hassani-Pak et al. 2016). The KnetMiner web server searches and evaluates millions of relations and concepts within the GSKNs in real time to determine if direct or indirect links between genes and trait-based keywords can be established. KnetMiner accepts as user inputs search terms in combination with a gene list and/or genomic regions. It produces a table of ranked candidate genes and allows users to explore the output in interactive genome and network map visualisation tools that have been optimised for web use on desktop and mobile devices. The KnetMiner web server and the GSKNs provide a step forward towards systematic and evidence-based gene discovery.
Event: Plant and Animal Genomes conference 2012
Speaker: Rachael Huntley
The Gene Ontology (GO) is a well-established, structured vocabulary used in the functional annotation of gene products. GO terms are used to replace the multiple nomenclatures used by scientific databases that can hamper data integration. Currently, GO consists of more than 35,000 terms describing the molecular function, biological process and subcellular location of a gene product in a generic cell. The UniProt-Gene Ontology Annotation (UniProt-GOA) database (1) provides high-quality manual and electronic GO annotations to proteins within UniProt. By annotating well-studied proteins with GO terms and transferring this knowledge to less well-studied and novel proteins that are highly similar, we offer a valuable contribution to the understanding of all proteomes. UniProt-GOA provides annotated entries for over 387,000 species and is the largest and most comprehensive open-source contributor of annotations to the GO Consortium annotation effort. Annotation files for various proteomes are released each month, including human, mouse, rat, zebrafish, cow, chicken, dog, pig, Arabidopsis and Dictyostelium, as well as a file for the multiple species within UniProt. The UniProt-GOA dataset can be queried through our user-friendly QuickGO browser (2) or downloaded in a parsable format via the EBI (3) and GO Consortium FTP (4) sites. The UniProt-GOA dataset has increasingly been integrated into tools that aid in the analysis of large datasets resulting from high-throughput experiments thus assisting researchers in biological interpretation of their results. The annotations produced by UniProt-GOA are additionally cross-referenced in databases such as Ensembl and NCBI Entrez Gene.
1 http://www.ebi.ac.uk/GOA
2 http://www.ebi.ac.uk/QuickGO
3 ftp://ftp.ebi.ac.uk/pub/databases/GO/goa
4 ftp://ftp.geneontology.org/pub/go/gene-associations
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic... (Elia Brodsky)
This workshop will address critical issues related to Transcriptomics data:
Processing raw Next Generation Sequencing (NGS) data:
1. Next Generation Sequencing data preprocessing:
Trimming technical sequences
Removing PCR duplicates
2. RNA-seq based quantification of expression levels:
Conventional pipelines (looking at known transcripts)
Identification of novel isoforms
Analysis of Expression Data Using Machine Learning:
3. Unsupervised analysis of expression data:
Principal Component Analysis
Clustering
4. Supervised analysis:
Differential expression analysis
Classification, gene signature construction
5. Gene set enrichment analysis
The workshop will include hands-on exercises utilizing public domain datasets:
breast cancer cell lines transcriptomic profiles (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-10-r110),
patient-derived xenograft (PDX) mouse model of tumor and stroma transcriptomic profiles (http://www.oncotarget.com/index.php?journal=oncotarget&page=article&op=view&path[]=8014&path[]=23533), and
processed data from The Cancer Genome Atlas samples (https://cancergenome.nih.gov/).
Team: The workshops are designed by the researchers at the Tauber Bioinformatics Research Center at University of Haifa, Israel in collaboration with academic centers across the US. Technical support for the workshops is provided by the Pine Biotech team. https://edu.t-bio.info/a-critical-approach-to-transcriptomic-data-analysis/
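To make the "Differential expression analysis" step in the outline above more concrete, here is a minimal sketch of its simplest ingredient, a log2 fold change between two groups of replicates. It is not the workshop's pipeline: the gene names and counts are invented, the pseudocount of 1 is an assumption, and a real analysis would also need normalization and a statistical test.

```python
import math
from statistics import mean
from typing import Dict, List, Sequence, Tuple


def log2_fold_change(
    control: Sequence[float], treated: Sequence[float], pseudocount: float = 1.0
) -> float:
    """log2 ratio of mean expression (treated / control).

    The pseudocount avoids division by zero for unexpressed genes.
    """
    return math.log2((mean(treated) + pseudocount) / (mean(control) + pseudocount))


# Invented counts: gene -> (control replicates, treated replicates).
counts: Dict[str, Tuple[List[int], List[int]]] = {
    "GENE_UP": ([6, 7, 8], [30, 31, 32]),
    "GENE_FLAT": ([14, 15, 16], [15, 15, 15]),
    "GENE_DOWN": ([62, 63, 64], [14, 15, 16]),
}

lfc = {gene: log2_fold_change(ctrl, trt) for gene, (ctrl, trt) in counts.items()}

# Rank genes by the magnitude of change, largest first.
ranked = sorted(lfc, key=lambda gene: abs(lfc[gene]), reverse=True)
for gene in ranked:
    print(f"{gene}\t{lfc[gene]:+.2f}")
```

A fold change alone says nothing about replicate variability, which is why it is usually reported together with a significance estimate.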
Drug discovery and development is a long and expensive process that has so notoriously bucked Moore's law over time that it now has its own law, Eroom's Law (Moore spelled backwards). It is estimated that the attrition rate of drug candidates is up to 96% and that the average cost to develop a new drug has reached almost $2.5 billion in recent years. One of the major causes of the high attrition rate is drug safety, which accounts for 30% of drug failures. Even after a drug is approved for the market, it can be withdrawn due to safety problems. Therefore, evaluating drug safety extensively and as early as possible becomes all the more important for accelerating drug discovery and development. This talk provides a high-level overview of the current process of rational drug design, which has been in place for many decades, and covers some of the major areas where the application of AI, deep learning and ML-based techniques has had the most gains. Specifically, this talk covers a variety of drug safety related AI and ML-based techniques currently in use, which can generally be divided into three main categories: (1) classification, (2) regression and (3) read-across. The talk will also cover how a hierarchical classification methodology can simplify the problem of assessing the toxicity of any given chemical compound. We will also address recent progress in predictive models and techniques built for various toxicities, and cover some publicly available databases, tools and platforms that make them easy to leverage. We will also compare and contrast various modeling techniques, including deep learning techniques, and their accuracy using recent research. Finally, the talk will address some of the remaining challenges and limitations in the area of drug safety assessment.
Whole Cell Volkswagen Summer School - a SEMS Project (MarkusWolfien)
"Combining standards for today's models"
A summary for a planned project for a summer school hosted by SEMS of University of Rostock in late 2014. We would like to transcribe the "Whole Cell - Mycoplasma genitalium Model" of Karr et al. (2012a) into standard formats to show its power and reusability.
For further information about the project and our group please visit:
>> https://sems.uni-rostock.de/workshops/volkwagen-summer-school-project/ <<
If you are interested in our project, please subscribe and get the latest news at:
>> https://groups.google.com/forum/#!forum/wholecell-symposium <<
Best regards,
Markus Wolfien
A collaborative model for bioinformatics education: combining biologically i... (Elia Brodsky)
Presented at the 6th Annual LA Conference on Computational Biology & Bioinformatics
Authors:
Kimberlee Mix*, Patricia Dorn*, Donald Hauber*, Scott McDermott**, Ryan Harvey** , Jack LeBien***, Sahil Sethi***, Julia Panov***, Avi Titievsky****, Elia Brodsky***
Departments of Biological Sciences*, Mathematics and Computer Science**, Loyola University New Orleans, 6363 St Charles Avenue, New Orleans, LA 70118
Pine Biotech, Inc***, 1441 Canal St. New Orleans, LA 70112
Tauber Bioinformatics Research Center****, University of Haifa Multi Purpose Building Room 225A Mount Carmel, Haifa 3498838 ISRAEL
Despite the growing impact of bioinformatics in the biological science community, integration of an on-site bioinformatics curriculum is cost prohibitive for many universities due to the necessary infrastructure and computational resources. Furthermore, many programs prioritize the technical aspects of bioinformatics over the biological concepts and logic of analyses, thus limiting the emphasis on critical thinking, problem solving, and in-depth inquiry. To address the gap in bioinformatics education and train students to approach complex biomedical problems, we present a new model for curriculum development that combines our unique online learning environment with traditional pedagogical approaches delivered through academic partnerships. The T-BioInfo platform (https://t-bio.info) allows users to combine computational analysis modules into pipelines to develop solutions for ‘omics data and machine learning problems. State-of-the-art tools for analysis, integration, and visualization of data are offered through a user-friendly interface. In parallel, online educational modules provide a theoretical framework for the analysis methods and experimental techniques. This model for bioinformatics training was implemented at Loyola University New Orleans, a liberal arts institution, for the first time in January 2018. Twelve undergraduate students and five faculty members participated in a new one-semester bioinformatics course. After completing a core set of online modules and pipelines, students conducted team research projects on topics such as patient derived xenograft (PDX) models, immune responses in cancer, and precision medicine. Gains in critical thinking and problem-solving skills were observed and participants were enthusiastic about engaging in bioinformatics research. In conclusion, our collaborative model for bioinformatics education combines best-practices in online and in-class learning with a powerful computational platform. 
This model could be implemented in undergraduate and graduate curricula to enhance research, build partnerships with industry, and strengthen the scientific workforce.
Building bioinformatics resources for the global community (ExternalEvents)
http://www.fao.org/about/meetings/wgs-on-food-safety-management/en/
Building bioinformatics resources for the global community. Presentation from the Technical Meeting on the impact of Whole Genome Sequencing (WGS) on food safety management and GMI-9, 23-25 May 2016, Rome, Italy.
Machine Learning Based Approaches for Cancer Classification Using Gene Expres... (mlaij)
The classification of different types of tumor is of great importance in cancer diagnosis and drug discovery. Earlier studies on cancer classification had limited diagnostic ability. The recent development of DNA microarray technology has made it possible to monitor the expression of thousands of genes simultaneously. Using this abundance of gene expression data, researchers are exploring the possibilities of cancer classification. A number of methods have been proposed with good results, but many issues still need to be addressed. This paper presents an overview of various cancer classification methods and evaluates these proposed methods based on their classification accuracy, computational time and ability to reveal gene information. We have also introduced and evaluated various proposed gene selection methods. In this paper, several issues related to cancer classification are also discussed.
Speaker: Benedict C. S. Cross, PhD, Team leader (Discovery Screening), Horizon Discovery
CRISPR–Cas9 mediated genome editing provides a highly efficient way to probe gene function. Using this technology, thousands of genes can be knocked out and their function assessed in a single experiment. We have conducted over 150 of these complex and powerful screens and will use our experience to guide you through the process of screen design, performance and analysis.
We'll be discussing:
• How to use CRISPR screening for target ID and validation, understanding drug MOA and patient stratification
• The screen design, quality control and how to evaluate success of your screening program
• Horizon’s latest developments to the platform
• Horizon’s novel approaches to target validation screening
Clustering Approaches for Evaluation and Analysis on Formal Gene Expression C... (rahulmonikasharma)
The enormous generation of biological data and the need to analyze that data led to the emergence of the field of bioinformatics. Data mining is used to derive and analyze data by exploring the hidden patterns in biological data. Although data mining can be used to analyze biological data such as genomic and proteomic data, here Gene Expression (GE) data is considered for evaluation. GE data is generated from microarrays such as DNA and oligo microarrays. The generated data is analyzed through the clustering techniques of data mining. This study implements the basic clustering approach K-Means and various other clustering approaches such as hierarchical clustering, SOM, CLICK and a basic fuzzy clustering approach. Finally, a comparative study of those approaches leads to the most effective approach for cluster analysis of GE data. The experimental results show that the proposed algorithm achieves higher clustering accuracy and takes less clustering time than existing algorithms.
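The K-Means approach named in the abstract above can be sketched in a few lines. This is a generic illustration of the standard assign-then-update loop, not the study's implementation: the expression profiles are invented, and seeding the centroids with the first `k` points is an assumption made to keep the example deterministic.

```python
from math import dist
from statistics import mean
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_means(
    points: Sequence[Point], k: int, iterations: int = 100
) -> Tuple[List[int], List[Point]]:
    """Cluster points into k groups; returns (assignments, centroids)."""
    # Assumption: seed centroids with the first k points (deterministic).
    centroids: List[Point] = [tuple(p) for p in points[:k]]
    assignment: List[int] = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: dist(p, centroids[c])) for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(mean(col) for col in zip(*members))
    return assignment, centroids


# Invented expression profiles: one gene per row, two conditions per gene.
profiles = [
    (1.0, 1.2), (5.0, 5.1), (0.9, 1.0),
    (5.2, 4.9), (1.1, 0.8), (4.8, 5.0),
]
assignment, centroids = k_means(profiles, k=2)
print(assignment)
print(centroids)
```

Because the outcome of K-Means depends on the initial centroids, practical comparisons usually repeat the run with several random seeds and keep the best result.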
DEVELOPING CRYO-ELECTRON MICROSCOPY OF BIOMOLECULES IN WATER (GuttiPavan)
Cryo-electron microscopy (Cryo-EM) is a type of transmission electron microscopy that allows the specimen of interest to be viewed at cryogenic temperatures (-150°C).
Following years of improvement, the cryo-electron microscope has become a valuable tool for viewing and studying the 3D structures of various biological molecules in water.
Brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
Nutraceutical market, scope and growth: Herbal drug technologyLokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market—which includes goods like functional meals, drinks, and dietary supplements that provide health advantages beyond basic nutrition—is growing significantly. As healthcare expenses rise, the population ages, and people want natural and preventative health solutions more and more, this industry is increasing quickly. Further driving market expansion are product formulation innovations and the use of cutting-edge technology for customized nutrition. With its worldwide reach, the nutraceutical industry is expected to keep growing and provide significant chances for research and investment in a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
Seminar of U.V. Spectroscopy by SAMIR PANDASAMIR PANDA
Spectroscopy is a branch of science dealing the study of interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflect spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light received by the analyte.
Cancer cell metabolism: special Reference to Lactate PathwayAADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy we need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cell utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two smaller molecules - a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Kreb's cycle. The Kreb's cycle allows cells to “burn” the pyruvates made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Kreb's - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELL:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
introduction to WARBERG PHENOMENA:
WARBURG EFFECT Usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose than do normal cells from outside.
Otto Heinrich Warburg (; 8 October 1883 – 1 August 1970) In 1931 was awarded the Nobel Prize in Physiology for his "discovery of the nature and mode of action of the respiratory enzyme.
WARNBURG EFFECT : cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is highly conserved process of posttranscriptional gene silencing by which double stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) is reported in a wide range of eukaryotes ranging from worms, insects, mammals and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993 Rosalind Lee (Victor Ambros lab) was studying a non- coding gene in C. elegans, lin-4, that was involved in silencing of another gene, lin-14, at the appropriate time in the
development of the worm C. elegans.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
Types of RNAi ( non coding RNA)
MiRNA
Length (23-25 nt)
Trans acting
Binds with target MRNA in mismatch
Translation inhibition
Si RNA
Length 21 nt.
Cis acting
Bind with target Mrna in perfect complementary sequence
Piwi-RNA
Length ; 25 to 36 nt.
Expressed in Germ Cells
Regulates trnasposomes activity
MECHANISM OF RNAI:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The RISC-docked, single-stranded RNA then pairs with the homologous mRNA and destroys it.
THE RISC COMPLEX:
RISC is large(>500kD) RNA multi- protein Binding complex which triggers MRNA degradation in response to MRNA
Unwinding of double stranded Si RNA by ATP independent Helicase
Active component of RISC is Ago proteins( ENDONUCLEASE) which cleave target MRNA.
DICER: endonuclease (RNase Family III)
Argonaute: Central Component of the RNA-Induced Silencing Complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN :
1.PAZ(PIWI/Argonaute/ Zwille)- Recognition of target MRNA
2.PIWI (p-element induced wimpy Testis)- breaks Phosphodiester bond of mRNA.)RNAse H activity.
MiRNA:
The Double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression .
Slide 1: Title Slide
Extrachromosomal Inheritance
Slide 2: Introduction to Extrachromosomal Inheritance
Definition: Extrachromosomal inheritance refers to the transmission of genetic material that is not found within the nucleus.
Key Components: Involves genes located in mitochondria, chloroplasts, and plasmids.
Slide 3: Mitochondrial Inheritance
Mitochondria: Organelles responsible for energy production.
Mitochondrial DNA (mtDNA): Circular DNA molecule found in mitochondria.
Inheritance Pattern: Maternally inherited, meaning it is passed from mothers to all their offspring.
Diseases: Examples include Leber’s hereditary optic neuropathy (LHON) and mitochondrial myopathy.
Slide 4: Chloroplast Inheritance
Chloroplasts: Organelles responsible for photosynthesis in plants.
Chloroplast DNA (cpDNA): Circular DNA molecule found in chloroplasts.
Inheritance Pattern: Often maternally inherited in most plants, but can vary in some species.
Examples: Variegation in plants, where leaf color patterns are determined by chloroplast DNA.
Slide 5: Plasmid Inheritance
Plasmids: Small, circular DNA molecules found in bacteria and some eukaryotes.
Features: Can carry antibiotic resistance genes and can be transferred between cells through processes like conjugation.
Significance: Important in biotechnology for gene cloning and genetic engineering.
Slide 6: Mechanisms of Extrachromosomal Inheritance
Non-Mendelian Patterns: Do not follow Mendel’s laws of inheritance.
Cytoplasmic Segregation: During cell division, organelles like mitochondria and chloroplasts are randomly distributed to daughter cells.
Heteroplasmy: Presence of more than one type of organellar genome within a cell, leading to variation in expression.
Slide 7: Examples of Extrachromosomal Inheritance
Four O’clock Plant (Mirabilis jalapa): Shows variegated leaves due to different cpDNA in leaf cells.
Petite Mutants in Yeast: Result from mutations in mitochondrial DNA affecting respiration.
Slide 8: Importance of Extrachromosomal Inheritance
Evolution: Provides insight into the evolution of eukaryotic cells.
Medicine: Understanding mitochondrial inheritance helps in diagnosing and treating mitochondrial diseases.
Agriculture: Chloroplast inheritance can be used in plant breeding and genetic modification.
Slide 9: Recent Research and Advances
Gene Editing: Techniques like CRISPR-Cas9 are being used to edit mitochondrial and chloroplast DNA.
Therapies: Development of mitochondrial replacement therapy (MRT) for preventing mitochondrial diseases.
Slide 10: Conclusion
Summary: Extrachromosomal inheritance involves the transmission of genetic material outside the nucleus and plays a crucial role in genetics, medicine, and biotechnology.
Future Directions: Continued research and technological advancements hold promise for new treatments and applications.
Slide 11: Questions and Discussion
Invite Audience: Open the floor for any questions or further discussion on the topic.
Experimental Designs in Next Generation Sequencing
1. Presented by
Aishwarya Mohorikar
Pawan Gutti
Msc -1 Biochemistry
The Institute of Science Mumbai
4/4/2018 1
Experimental Designs in Next Generation
Sequencing
2. • Introduction
• Types of experimental designs
• Basic NGS chemistry
• Tools used in NGS
• Good and Bad experimental designs
3. • Next-generation sequencing (NGS), massively parallel sequencing, and deep sequencing are related terms for a DNA sequencing technology that has revolutionized genomic research.
4. • There are three types of experimental designs in NGS:
• Paired-end vs. single-end
• Multiplexed
• Mate-pair
5. Principle
The Basics of NGS Chemistry
In principle, the concept behind NGS technology is similar to capillary electrophoresis (CE) sequencing. DNA polymerase catalyzes the incorporation of fluorescently labeled deoxyribonucleotide triphosphates (dNTPs) into a DNA template strand during sequential cycles of DNA synthesis.
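The cycle-by-cycle idea can be sketched in a few lines of Python. This is only a toy illustration, not a model of any real instrument: in each cycle the base complementary to the next template position is "incorporated" and recorded, as if its fluorescent label had been imaged. The function name, the error-free assumption, and the example template are all hypothetical.

```python
# Toy model of sequencing by synthesis: one labeled base is read per cycle.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sequence_by_synthesis(template, cycles):
    """Return the bases incorporated opposite `template` over `cycles` cycles.

    Illustrative assumptions: the template is read from its first character
    onward, there are no errors, and exactly one base is added per cycle.
    """
    read = []
    for base in template[:cycles]:
        # The polymerase adds the complementary nucleotide; in a real run
        # the base call would come from the fluorescent signal.
        read.append(COMPLEMENT[base])
    return "".join(read)


if __name__ == "__main__":
    print(sequence_by_synthesis("ATGCCGTA", cycles=5))  # prints TACGG
```

The read length is capped by the number of cycles, which is why the cycle count is a parameter here.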
6. Contd
Library Preparation
The sequencing library is prepared by random fragmentation of the DNA or cDNA sample, followed by 5′ and 3′ adapter ligation.
Cluster Generation
For cluster generation, the library is loaded into a flow cell, where fragments are captured on a lawn of surface-bound oligos complementary to the library adapters.
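As a rough illustration of the library preparation step, the following Python sketch cuts a sequence into pieces of random length and attaches an adapter to each end. The adapter sequences, fragment size range, and function names are invented placeholders and do not correspond to any real kit.

```python
import random

# Placeholder adapter sequences, invented for illustration only.
ADAPTER_5 = "AAAA"
ADAPTER_3 = "TTTT"


def fragment(dna, min_len, max_len, rng):
    """Cut `dna` into consecutive pieces of random length.

    The final piece may be shorter than `min_len` if the sequence runs out.
    """
    pieces = []
    start = 0
    while start < len(dna):
        size = rng.randint(min_len, max_len)
        pieces.append(dna[start:start + size])
        start += size
    return pieces


def ligate_adapters(pieces):
    """Attach the 5' and 3' placeholder adapters to every fragment."""
    return [ADAPTER_5 + piece + ADAPTER_3 for piece in pieces]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    sample = "".join(rng.choice("ACGT") for _ in range(60))
    for molecule in ligate_adapters(fragment(sample, 10, 20, rng)):
        print(molecule)
```

Because every library molecule carries the same adapters, oligos complementary to those adapters can capture any fragment regardless of its insert sequence.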
7. Contd
Sequencing
Illumina sequencing-by-synthesis (SBS) technology uses a proprietary reversible terminator-based method that detects single bases as they are incorporated into DNA template strands.
Data Analysis
During data analysis and alignment, the newly identified sequence reads are aligned to a reference genome. Following alignment, many variations of analysis are possible, such as single nucleotide polymorphism (SNP) or insertion-deletion (indel) identification, read counting for RNA methods, phylogenetic or metagenomic analysis, and more.
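To show what "align, then look for variants" means at its simplest, here is a hedged Python sketch. It places each read at the ungapped position with the fewest mismatches and reports the remaining mismatches as candidate single-base differences. Real aligners and variant callers are far more sophisticated (indexing, gaps, base qualities, read depth); the function names and the example sequences are hypothetical.

```python
def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def align(read, reference):
    """Return (position, mismatches) of the best ungapped placement.

    Ties are resolved in favor of the leftmost position.
    """
    best_pos, best_mm = 0, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        mm = hamming(read, reference[pos:pos + len(read)])
        if mm < best_mm:
            best_pos, best_mm = pos, mm
    return best_pos, best_mm


def call_mismatches(reads, reference):
    """List (reference_position, ref_base, read_base) for each mismatch.

    A single mismatching read is only a candidate; a real SNP call would
    require support from many independent reads.
    """
    candidates = set()
    for read in reads:
        pos, _ = align(read, reference)
        for offset, base in enumerate(read):
            ref_base = reference[pos + offset]
            if base != ref_base:
                candidates.add((pos + offset, ref_base, base))
    return sorted(candidates)


if __name__ == "__main__":
    reference = "ACGTTGCAAGCTTACG"
    reads = ["GTTGCA", "CAAGGTTA"]
    print(call_mismatches(reads, reference))
```

Positions are 0-based offsets into the reference string in this sketch.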
9. • Integrated Genome Browser
Integrated Genome Browser (IGB) is an open-source genome browser, a visualization tool used to observe biologically interesting patterns in genomic data sets, including sequence data, gene models, alignments, and data from DNA microarrays.
IGB reads data in dozens of formats, including BAM, BED, Affymetrix CHP, FASTA, GFF, GTF, PSL, SGR, and WIG.
10. Contd
• Galaxy
Galaxy was originally written for biological data analysis, particularly genomics. The set of available tools has been greatly expanded over the years, and Galaxy is now also used for gene expression, genome assembly, proteomics, epigenomics, transcriptomics, and a host of other disciplines in the life sciences.
Galaxy is available at https://usegalaxy.org.
11. • A better approach for an experiment with six control and six experimental samples is not to sequence all samples from one group in a single lane, but to make sure each lane contains samples from both the control and experimental groups.
• That is where randomization comes in. One good NGS design is to randomly pick three samples from each of the control and experimental groups and sequence them in one lane, then sequence the remaining six samples in the second lane.
• In this design, even if one lane goes rogue, it affects the control and experimental group samples equally, and we still have one more “well-behaved” lane containing both groups. In statistical parlance, the lane effect and the experimental group effect are no longer confounded.
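The randomized, balanced design described on this slide can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name, the sample IDs, and the group labels are hypothetical, and each group is simply shuffled and dealt out evenly across lanes.

```python
import random


def assign_lanes(groups, n_lanes, seed=None):
    """Randomly split each group's samples evenly across lanes.

    `groups` maps a group name to its sample IDs. Returns a list of lanes,
    each a list of (group, sample) pairs.
    """
    rng = random.Random(seed)
    lanes = [[] for _ in range(n_lanes)]
    for group, samples in groups.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        # Deal the shuffled samples round-robin so every lane receives
        # (almost) the same number of samples from this group.
        for i, sample in enumerate(shuffled):
            lanes[i % n_lanes].append((group, sample))
    return lanes


if __name__ == "__main__":
    design = {
        "control": [f"C{i}" for i in range(1, 7)],
        "treated": [f"T{i}" for i in range(1, 7)],
    }
    for number, lane in enumerate(assign_lanes(design, n_lanes=2, seed=1), 1):
        print(f"lane {number}:", lane)
```

With six samples per group and two lanes, each lane ends up with three control and three treated samples, so a failure of one lane hits both groups equally.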
12. • A naive design is to put all six samples from the same group in a single lane (or to sequence them all on the same day). For example, multiplex all six control samples and sequence them in one lane, and multiplex all six experimental group samples and sequence them in the other lane.
• This is a bad design despite the fact that it has six biological replicates and is multiplexed.
The reason it is a bad design is the same as the good old saying, “Don’t put all your eggs in one basket.”