2. [Workflow diagram] WCMC, UC Davis
STUDY DESIGN
SAMPLING
EXTRACTION
DATA ACQUISITION: Separation, Detection
DATA PROCESSING: File Conversion, Baseline Correction, Peak Detection, Deconvolution, Adduct Annotation, Alignment, Gap Filling
COMPOUND IDENTIFICATION: Molecular Formula ID, Structure ID, MS Library Search, Database Search, In silico Fragmentation
STATISTICS: Normalization, Univariate Analysis (Parametric, Nonparametric), Multivariate Analysis (Unsupervised, Supervised)
BIOLOGICAL INTERPRETATION: Pathway Mapping, Network Enrichment
VALIDATION
3. Question:
Which metabolites from my list can be mapped to metabolic pathways?
Approach: Database mapping
Results:
A. Table of pathways with statistics
B. Maps overlaid with detected metabolites
C. Metabolites not mapped to pathways
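The database-mapping approach above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `PATHWAYS` table is a tiny hand-written subset rather than a database download, `map_to_pathways` is a hypothetical helper name, and `C99999` is a made-up identifier standing in for a compound that has no pathway. The sketch produces result A (pathways with their hits) and result C (unmapped metabolites); result B needs the database's own map viewer.

```python
# Toy pathway -> compound table, hand-written for illustration only.
# A real analysis would take the memberships from a pathway database.
PATHWAYS = {
    "map00010 Glycolysis / Gluconeogenesis": {"C00031", "C00022"},
    "map00020 Citrate cycle (TCA cycle)": {"C00022", "C00158"},
}


def map_to_pathways(compound_ids, pathways=PATHWAYS):
    """Return (hits per pathway, compounds found in no pathway)."""
    query = set(compound_ids)
    hits = {}
    for name, members in pathways.items():
        found = sorted(members & query)
        if found:
            hits[name] = found
    mapped = {cid for found in hits.values() for cid in found}
    unmapped = sorted(query - mapped)
    return hits, unmapped


# "C99999" is a hypothetical ID used to show an unmapped compound.
hits, unmapped = map_to_pathways(["C00031", "C00022", "C99999"])
for name, found in hits.items():
    print(name, len(found), found)
print("Not mapped:", unmapped)
```

Note that one compound can be reported under several pathways, which is why the hit lists overlap.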
5. Prepare the compound list
Sort the statistical results by p-value.
Select the KEGG IDs of compounds with p-value < 0.05 (use the filter in Excel).
Example study: “spring_2018_metabolomics_course_pathway_example.xlsx” in the PathwayAnalysis_example folder
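For readers who prefer a script to the Excel filter, the same selection might look like the sketch below. The column names `kegg_id` and `p_value`, the helper name `select_kegg_ids`, and the example rows are all assumptions made for illustration; they are not taken from the example spreadsheet.

```python
def select_kegg_ids(rows, cutoff=0.05):
    """Sort rows by p-value and keep KEGG IDs below the cutoff.

    Rows without a KEGG ID are skipped, because they cannot be
    submitted to a pathway mapping tool.
    """
    ranked = sorted(rows, key=lambda row: row["p_value"])
    return [
        row["kegg_id"]
        for row in ranked
        if row["p_value"] < cutoff and row["kegg_id"]
    ]


# Hypothetical statistical results (not from the course dataset).
rows = [
    {"name": "compound_a", "kegg_id": "C00031", "p_value": 0.20},
    {"name": "compound_b", "kegg_id": "C00022", "p_value": 0.01},
    {"name": "compound_c", "kegg_id": "", "p_value": 0.001},
    {"name": "compound_d", "kegg_id": "C00158", "p_value": 0.04},
]

selected = select_kegg_ids(rows)
print("\n".join(selected))  # one ID per line, ready to paste into a mapping tool
```

Passing a different `cutoff` (for example 0.10) shows how the size of the input list depends on the chosen threshold.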
6. Four major databases for pathway mapping
• KEGG
• Reactome
• HMDB
• ConsensusPathDB
For a comprehensive list of pathway databases, go to http://www.pathguide.org/
7. KEGG: Kyoto Encyclopedia of Genes and Genomes
Go to http://www.genome.jp/kegg/
KEGG is a collection of databases dealing with genomes, biological pathways, diseases, drugs, and chemical substances.
KEGG is utilized for bioinformatics research and education, including data analysis in genomics, metagenomics, metabolomics and other omics studies, modeling and simulation in systems biology, and translational research in drug development.
10. Copy and paste the KEGG IDs into this box
Click on Execute
Change the organism if needed
11. KEGG results
Click on a pathway to get the map with the compounds overlaid on it.
12. KEGG global metabolic map
Red dots are the compounds present in the metabolite list.
13. Green boxes are enzymes mapped to the human genome.
White boxes are enzymes not mapped to the human genome.
A red circle shows the presence of the compound in the metabolite list.
14. Reactome pathway mapping
www.reactome.org
“REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database.
Our goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of
pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology
and education. Founded in 2001, the Reactome project is led by Lincoln Stein of OICR, Peter D’Eustachio
of NYULMC, Henning Hermjakob of EMBL-EBI, and Guanming Wu of OHSU.”
19. Zooming in and double clicking will show the map of a pathway
Metabolite in the list
20. MetaboAnalyst
Pathway Mapping
“To provide a user-friendly, web-based analytical pipeline for high-throughput metabolomics
studies. In particular, MetaboAnalyst aims to offer a variety of commonly used procedures for
metabolomic data processing, normalization, multivariate statistical analysis, as well as data
annotation. The current implementation focuses on exploratory statistical analysis, functional
interpretation, and advanced statistics for translational metabolomics studies.”
http://www.metaboanalyst.ca/
26. Consensus PathDB
Metabolic Network Mapping
“ConsensusPathDB-human integrates interaction networks in Homo sapiens including binary and complex
protein-protein, genetic, metabolic, signaling, gene regulatory and drug-target interactions, as well as
biochemical pathways. Data originate from currently 32 public resources for interactions (listed below) and
interactions that we have curated from the literature. The interaction data are integrated in a complementary
manner (avoiding redundancies), resulting in a seamless interaction network containing different types of
interactions.”
http://cpdb.molgen.mpg.de/
36. Key points:
• Not all detected metabolites have KEGG identifiers.
• Not all compounds with KEGG IDs can be mapped to metabolic pathways.
• Pathway maps may not show all the reactions known for a metabolite.
• Several compounds appear in many pathways.
• Online tools can map a metabolite list to pathway maps and perform pathway over-representation analysis for metabolites.
• Online network visualization of a large number of metabolites is not efficient.
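Pathway over-representation analysis, mentioned in the key points above, is commonly based on a hypergeometric test. The sketch below shows that calculation using only the Python standard library. It is an assumption that a given online tool uses exactly this statistic, and the function name `hypergeom_pvalue` and the numbers in the example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: compounds in the background (reference) set
    K: background compounds annotated to the pathway
    n: compounds in the submitted list
    k: submitted compounds that fall in the pathway
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical numbers: 5 of 5 submitted compounds hit a 5-member
# pathway out of a 10-compound background.
p = hypergeom_pvalue(k=5, n=5, K=5, N=10)
print(p)
```

The p-value depends on the background size N, so the same hit count can look more or less significant depending on which reference set the tool assumes.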
37. Major pathway databases
How many pathways do we know so far?
KEGG - 495
MetaCyc - 2453
Reactome - 2000
HMDB - 613
WikiPathways - 789
Note: Pathway definitions are manual and vary across different databases.
40. Which pathway does glucose belong to?
KEGG DB
Reactome DB
Pathway boundaries are arbitrary.
41. Does everyone care about pathways?
Chemical-similarity maps of small molecules and metals in human blood
Rappaport, Stephen M., et al. "The blood exposome and its role in discovering causes of disease." Environmental Health Perspectives (Online) 122.8 (2014): 769.
[Figure panels: node size = number of pathways; node size = epidemiological studies]
• Epidemiologists and systems biologists/biochemists don't study the same compounds.
• Epidemiologists seem to study compounds that have fewer known pathways.
43. Which p-value: < 0.05 or < 0.10?
[Figure panels: P < 0.10; P < 0.05]
Decide the p-value cutoff to get the list of input metabolites.
44. Metabolite pathway analysis: how about pathway topology?
Should they have the same score?
Topology scoring
If affected compounds are in close biochemical proximity, the pathway should be given a weighted score.
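One way to turn the proximity idea above into a number is sketched below. The slide does not define a formula, so this is purely an assumed scoring: the pathway is treated as an undirected graph of compounds, and the score is the mean inverse shortest-path distance between all pairs of affected compounds. The names `shortest_path_length` and `proximity_score` and the five-compound chain are hypothetical.

```python
from collections import deque
from itertools import combinations


def shortest_path_length(graph, start, goal):
    """Breadth-first search; returns the number of reaction steps or None."""
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor == goal:
                return dist + 1
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # the two compounds are not connected


def proximity_score(graph, affected):
    """Mean of 1/distance over all pairs of affected compounds (assumed metric)."""
    pairs = list(combinations(sorted(affected), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for a, b in pairs:
        dist = shortest_path_length(graph, a, b)
        if dist:  # unconnected pairs contribute nothing
            total += 1.0 / dist
    return total / len(pairs)


# Hypothetical linear pathway A - B - C - D - E.
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
print(proximity_score(chain, {"A", "B"}))  # adjacent compounds
print(proximity_score(chain, {"A", "E"}))  # compounds at opposite ends
```

Under this assumed metric, two pathways with the same number of affected compounds no longer receive the same score: the one whose hits sit next to each other scores higher.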
45. Use pathway maps for focused biochemical visualization
Fahrmann et al. 2017
Gene Ontology TCA map
- Include gene names or symbols in the map
46. Use pathway maps for focused biochemical visualization
Pathway maps provided by databases are not used in publications. Authors draw their own maps and show data on them.
https://www.nature.com/articles/cddis2011123
47. How to create these customized maps?
(Re)read biochemistry books and draw them
OR
48. How to create these customized maps?
Use biochemical databases
Biochemical databases: KEGG, HMDB, MetaCyc, WikiPathways, Reactome
Metabolites, Reactions, Genes, Proteins
Metabolomics dataset
Pathway layout
Cytoscape + PowerPoint
**No automated way exists to make pretty-looking customized pathway maps.
49. Next questions:
• How to map all the identified metabolites into a biochemical network?
• How to show all the known reactions for a metabolite?
• How to map publishable network graphs?