The document introduces analyzing methylation data from Reduced Representation Bisulfite Sequencing (RRBS) experiments using the R package methylKit. It begins with an introduction and outline, then covers downloading example RRBS data, basics of R including vectors, matrices, and data frames, genomics and R packages for working with genomic intervals, and using methylKit to analyze RRBS methylation data.
This document introduces analyzing methylation data from Reduced Representation Bisulfite Sequencing (RRBS) experiments using the R package methylKit. It begins with an overview of basic R operations and data structures. Next, it discusses relevant genomics packages in Bioconductor like GenomicRanges and IRanges that are useful for working with genomic intervals. Finally, it demonstrates how to use methylKit to analyze RRBS methylation data, including working with annotated methylation events.
RNA-seq for DE analysis: detecting differential expression - part 5 | BITS
Part 5 of the training session 'RNA-seq for differential expression analysis' considers the algorithm used for detecting differential expression between conditions. See http://www.bits.vib.be
This document discusses biochemical network mapping and visualization. It begins by describing the process of creating a metabolic network graph with nodes representing metabolites and edges representing reactions. While metabolic databases can provide information on known reactions, not all detected metabolites may be present. The document then introduces MetaMapp as an approach to map all detected metabolites into a network graph by combining information on known biochemical reactions with chemical similarity. Cytoscape software allows visualization and analysis of these network graphs. In conclusion, MetaMapp can be used to incorporate all identified metabolites into biochemical modules to aid in interpretation of omics data.
A Knowledge Graph for Reaction & Synthesis Prediction (AstraZeneca) | Neo4j
The document summarizes the development of a reaction knowledge graph (RKG) by AstraZeneca to predict novel chemical reactions and assist in synthesis planning. Key points include:
1) The RKG integrates reaction and molecule data from multiple sources and enriches the data with identifiers, templates, and other metadata.
2) Graph analytics and machine learning models are used to provide insights into reaction patterns and predict new reactions and syntheses.
3) Future work includes expanding the RKG with AstraZeneca's full collection and developing link prediction workflows to further support computer-aided synthesis prediction.
This document provides an overview of pairwise sequence alignment and BLAST. It discusses how pairwise alignment works using substitution matrices to assign homology between sites. It demonstrates the dynamic programming approach to pairwise alignment calculation and describes how local alignments are identified. The document also introduces BLAST and how it uses word matching to rapidly identify similar sequences in a database and then performs local alignments on matching regions.
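The word-matching step mentioned above can be sketched as follows. This is a simplified illustration, not BLAST itself: it finds only exact shared words, whereas BLAST also scores similar "neighborhood" words and then extends each hit into a local alignment. The function name `find_seeds` and the word length `w=3` are assumptions made for this example.

```python
from collections import defaultdict


def find_seeds(query, subject, w=3):
    """Return (query_pos, subject_pos) pairs where an exact word of length w is shared.

    Hypothetical helper for illustration; exact matching only.
    """
    # Index every word of length w in the subject sequence by its start position.
    index = defaultdict(list)
    for j in range(len(subject) - w + 1):
        index[subject[j:j + w]].append(j)

    # Look up each query word in the index to collect seed hits.
    seeds = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            seeds.append((i, j))
    return seeds


if __name__ == "__main__":
    print(find_seeds("GATT", "AGATC"))
```

Each seed marks a region where a local alignment could then be attempted, which is why a word index lets a search skip most of a database.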
The document discusses the Smith-Waterman algorithm for local sequence alignment. It describes how the algorithm uses dynamic programming to find the optimal local alignment between two sequences without allowing for negative scores. The key steps are initialization of a score matrix, filling the matrix using match/mismatch scores and a gap penalty, and tracing back through the matrix to determine the highest-scoring alignment. An example application of the algorithm aligns the sequences "GATGTAG" and "GAGATGTGC".
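The three steps described above (initialization, matrix fill, traceback) can be sketched as below. The scoring values (match = 2, mismatch = -1, linear gap penalty = -2) are assumptions chosen for this example; the summary does not state which values the original document uses, and the function name `smith_waterman` is likewise illustrative.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for one optimal local alignment.

    Scoring parameters are assumed values, not taken from the source document.
    """
    rows, cols = len(a) + 1, len(b) + 1
    # Initialization: the first row and column stay at zero.
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill: each cell is the best of diagonal, up, left, or zero (no negative scores).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Traceback: start at the highest-scoring cell and stop at the first zero.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    print(smith_waterman("GATGTAG", "GAGATGTGC"))
```

With these assumed scores, the two example sequences share the ungapped stretch GATGT, which scores 10. Different scoring values can produce a different optimal alignment.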
This document summarizes Shivendra Kumar's class presentation on SNP genotyping using KASP. It introduces SNP genotyping and the KASP platform. It describes using KASP to genotype a wheat mapping population derived from a cross between an introgression line containing stripe rust resistance genes and a susceptible cultivar. KASP markers were developed and used to map the resistance genes. One candidate resistance gene was identified and further analyzed through expression studies and development of a linked KASP marker. Recombinants were identified and confirmed through additional KASP genotyping.
Bioinformatics is the application of information technology to analyze biological data. This document provides an overview of bioinformatics, including publicly available genome sequences from 1998, promises for applications in medicine and biotechnology, the need for bioinformaticians to analyze growing biological databases, common bioinformatics tasks like sequence analysis and molecular modeling, and important databases like GenBank, SwissProt, and NCBI.
This document provides an introduction and overview of common methods for processing and analyzing next generation sequencing (NGS) data, including mapping NGS reads and de novo assembly of NGS reads. It discusses various NGS applications such as RNA-Seq, epigenetics, structural variation detection, and metagenomics. Key steps in read alignment such as choosing an alignment program and viewing alignments are outlined. Considerations for choosing an alignment program based on library type, read type, and platform are also reviewed. Popular alignment programs including Bowtie, BWA, TopHat, and Novoalign are mentioned.
This document summarizes Md. Bipul Hossen's presentation on comparing clustering techniques for gene expression data in bioinformatics. It introduces the topic and objectives, which are to find significant clusters in gene expression data and compare hierarchical and k-means clustering methods. It then provides details on different clustering algorithms and evaluation metrics. Results show complete linkage with Euclidean distance performed best for Affymetrix data, while k-means performed best for cDNA data. The conclusions state the comparative study helps document performance of these methods. Future work could involve comparing to other clustering techniques.
de Bruijn Graph Construction from Combination of Short and Long Reads | Sikder Tahsin Al-Amin
This document describes the de Bruijn graph approach for genome assembly using a combination of short and long reads. It discusses key terminology, the motivation for this approach, and how an A-Bruijn graph is constructed and used to find the genomic path. It also covers error correction in draft genomes assembled using this method and potential areas for further development, such as calculating the likelihood ratio of different solid string sets and applying bridging and merging techniques.
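As a point of reference, a classic de Bruijn graph can be built from reads as sketched below. This is the textbook k-mer construction, not the A-Bruijn variant for mixed short and long reads that the document describes; the function name `de_bruijn` and the choice of k are assumptions for this example.

```python
from collections import defaultdict


def de_bruijn(reads, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Illustrative textbook construction; no error correction or read merging.
    """
    graph = defaultdict(list)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer contributes one edge from its prefix to its suffix.
            graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


if __name__ == "__main__":
    print(de_bruijn(["ACGT", "CGTA"], 3))
```

A genome reconstruction then corresponds to a path through this graph that uses the edges, which is where sequencing errors and repeats make the problem hard.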
RNASeq - Analysis Pipeline for Differential Expression | Jatinder Singh
RNA-Seq is a technique that uses next generation sequencing to sequence RNA transcripts and quantify gene expression levels. It can be used to estimate transcript abundance, detect alternative splicing, and compare gene expression profiles between healthy and diseased tissue. Computational challenges include read mapping due to exon-exon junctions and normalization of read counts. Key steps in RNA-Seq analysis include read mapping, transcript assembly, counting and normalizing reads, and detecting differentially expressed genes.
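To make the normalization step concrete, the sketch below scales raw read counts to counts per million (CPM). CPM is only one common choice and is an assumption here; the summary does not say which normalization the slides use, and the function name `counts_per_million` and the gene names are hypothetical.

```python
def counts_per_million(counts):
    """Scale raw per-gene read counts so that they sum to one million.

    `counts` maps gene name -> raw read count for a single sample.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no mapped reads")
    # Dividing by library size makes samples of different depth comparable.
    return {gene: n * 1_000_000 / total for gene, n in counts.items()}


if __name__ == "__main__":
    sample = {"geneA": 100, "geneB": 300}  # hypothetical counts
    print(counts_per_million(sample))
```

CPM corrects for sequencing depth only; it does not account for gene length or for composition effects between samples.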
This workshop is intended for those who are interested in, and are in the planning stages of, conducting an RNA-Seq experiment. Topics to be discussed will include:
* Experimental Design of RNA-Seq experiment
* Sample preparation, best practices
* High throughput sequencing basics and choices
* Cost estimation
* Differential Gene Expression Analysis
* Data cleanup and quality assurance
* Mapping your data
* Assigning reads to genes and counting
* Analysis of differentially expressed genes
* Downstream analysis/visualizations and tables
This set of slides is based on the presentation I gave at ACM DataScience camp 2014. It is suitable for those who are still new to R. It covers a few basic data manipulation techniques, and then goes into the basics of using the dplyr package (Hadley Wickham) #rstats #dplyr
AGRF in conjunction with EMBL Australia recently organised a workshop at Monash University Clayton. This workshop was targeted at beginners and biologists who are new to analysing Next-Gen Sequencing data. The workshop also aimed to provide users with a snapshot of bioinformatics and data analysis tips on how to begin to analyse project data. An introduction to RNA-seq data analysis was presented by AGRF Senior Bioinformatician Dr. Sonika Tyagi.
Presented: 1st August 2012
This document provides an overview of DNA methylation analysis. It begins with background on DNA methylation functions and diseases. It then discusses methods for measuring DNA methylation status, including bisulfite sequencing. The document reviews steps for DNA methylation data analysis using tools like methylKit in R. It presents a case study example of analyzing DNA methylation data from human stem cells and fibroblasts. Alignment, quality control, differential methylation analysis and visualization are discussed.
The document discusses FASTA, a sequence alignment software tool. It describes the history and development of FASTA, which was originally designed for protein sequence similarity searching and later expanded to support DNA and translated DNA searches. FASTA uses local sequence alignment and heuristic methods to quickly search databases and find similar sequences. It supports various types of searches for protein, nucleotide, and translated sequences.
InterPro is a database that classifies proteins into families, domains, and sequence features based on their structural and functional properties. It integrates predictive models from several member databases to annotate unknown protein sequences. Protein signatures like patterns, profiles, fingerprints and hidden Markov models are generated from multiple sequence alignments and used by InterPro for classification. AlphaFold is an artificial intelligence system that can predict protein three-dimensional structures directly from amino acid sequences, representing a major advance in solving the protein folding problem.
Multi-omics data integration methods: kernel and other machine learning appro... | tuxette
The document discusses multi-omics data integration methods, particularly kernel methods. It describes how kernel methods transform data into similarity matrices between samples rather than relying on variable space. Multiple kernel integration approaches are presented that combine multiple similarity matrices into a consensus kernel in an unsupervised manner, such as through a STATIS-like framework that maximizes the similarity between kernels. Examples of applications to datasets from the TARA Oceans expedition are given.
Zinc finger proteins bind DNA through zinc finger motifs. Each motif contains a beta sheet and alpha helix coordinated by a zinc ion. Early research found zinc fingers bind DNA in the major groove, with fingers 2-5 binding DNA directly and fingers 1-2 interacting through protein-protein interactions. Later, zinc finger proteins were engineered to cut DNA at specific sites, with cleavage occurring near the binding site. This led to the founding of Sangamo Biosciences to develop a modular approach to engineering zinc fingers to target desired DNA sequences. Zinc finger nucleases can introduce double-strand breaks to promote genome editing through homology directed repair.
Swiss-Prot is a protein database that provides highly annotated and non-redundant protein sequences. It features annotation of proteins, minimal redundancy, integration with other databases, and documentation. Swiss-Prot provides organized data and information on proteins and can be used as a starting point for protein research by allowing searches with various strings.
Bioinformatics is an interdisciplinary field involving biology, computer science, mathematics and statistics. It addresses large-scale biological problems from a computational perspective. Common problems include modeling biological processes at the molecular level and making inferences from collected data. A bioinformatics solution typically involves collecting statistics from biological data, building a computational model, solving a computational problem, and testing the algorithm. Bioinformatics plays a role in areas like structural genomics, functional genomics and nutritional genomics. It is used for applications such as transcriptome analysis, drug discovery, cheminformatics analysis, and more. It is an important tool in fields like molecular medicine, gene therapy, microbial genome applications, antibiotic resistance, and evolutionary studies. Biological databases are important for organizing
This presentation discusses protein structure prediction using Rosetta. It begins with an overview of the Critical Assessment of Protein Structure Prediction (CASP) experiments and notes that Rosetta is one of the top performing free-modeling servers. The presentation then describes the basic ab initio protocol used by Rosetta, which involves fragment insertion, scoring, and refinement. It also discusses limitations and success rates. Key aspects of the Rosetta energy functions and sampling algorithms are presented. Examples of specific Rosetta applications including low-resolution modeling and refinement are provided.
The document provides an overview of Chip-seq data analysis. It discusses the Chip-seq technology, visualization of genomic data, command line analysis including quality checking, alignment, peak calling, annotation, and motif finding. It also discusses downstream analysis such as comparing samples, analyzing region occupancy, and web resources for Chip-seq analysis.
This document introduces bioinformatics and discusses some of its key concepts and applications. It defines bioinformatics as an interdisciplinary field that combines computer science, statistics and engineering to study and process biological data. It describes some basic cell components like DNA, RNA and proteins, and how genetics and the genetic code work. It also provides a brief history of bioinformatics, highlighting projects like the Human Genome Project. Finally, it outlines several applications of bioinformatics like phylogenetic analysis, drug design, microarray analysis and protein-protein interaction networks.
This document summarizes a study that used the BigLD algorithm to partition haplotype blocks in chromosome 21 of the NARAC genomic dataset. The researchers:
1) Applied the BigLD algorithm and three other methods (FGT, CIT, SSLD) to detect haplotype blocks in a portion of chromosome 21.
2) Analyzed and compared the blocks detected by each method based on parameters like block size, number of blocks, and genomic coverage.
3) Found that BigLD produced the fewest and largest blocks, indicating more robust partitioning compared to the other methods.
With the unprecedented growth of chemical databases, some now incorporating several hundred billion synthetically feasible chemicals, modelers face no shortage of chemicals to process. Importantly, such "Big Chemical Data" offers enormous opportunities for discovering novel bioactive molecules. However, the current generation of cheminformatics software tools is not capable of handling, characterizing, and processing such extremely large chemical libraries. In this presentation, we will discuss the rationale and the main challenges (theoretical and technical) for screening very large repositories of compounds in the current context of drug discovery. We will present several proof-of-concept studies regarding the screening of extremely large libraries (1+ billion compounds) using our novel GPU-accelerated cheminformatics platform to identify molecules with defined bioactivity. Overall, we will show that GPU computing represents an effective and inexpensive architecture to develop, employ, and validate a new generation of cheminformatics methods and tools ready to process billions of compounds.
This document outlines a project studying the assembly of viral capsids through site-directed mutagenesis and directed evolution. It provides background on the structure and genome of bacteriophage T7, which assembles its icosahedral capsid through both in vivo and in vitro mechanisms. Recent results included designing mutations to the major and minor capsid proteins to alter capsid size, comparing genome and protein sequences, and performing initial experiments, such as spot tests and titering on wildtype and mutant T7, that showed viability differences. Plans for continuing the project in spring include determining phage concentrations, applying directed evolution with chemicals, performing site-directed mutagenesis, and cloning phage genes into plasmids.
The document describes a computational study aimed at expediting drug discovery by identifying novel protein-protein interaction (PPI) ligands. The study used computational chemistry programs to test ligands against protein structures and identify those that overlay protein secondary structures with a root-mean-square deviation (RMSD) below 0.5 angstroms. Many ligands were found to successfully mimic protein secondary structures with low RMSD values, supporting the hypothesis that novel PPI ligands can be identified in this manner. The results indicate this computational approach may speed up drug discovery by targeting PPIs rather than single protein inhibition.
Characterization of monoclonal antibodies and Antibody drug conjugates by Sur...Merck Life Sciences
Watch the presentation of this webinar: https://bit.ly/3Pjpjvr
Highlights of this webinar:
- Surface plasmon resonance as a powerful tool for biologic characterization including mAbs and ADCs.
- SPR allows rapid binding analysis in real time without using labels for SARS-CoV-2 receptor binding domain mutations.
- Kinetic data is indicative of possible neutralizing activity allowed assessment of neutralizing ability of therapeutic monoclonal antibodies.
- The application can provide preliminarily efficacy information and facilitated mAbs/ACDs candidate selection process
Detailed description:
Characterization of therapeutic monoclonal antibodies (mAbs) or Antibody drug conjugates (ADCs) is challenging due to their ability to bind to a variety of proteins via their Fc and Fab domains, giving rise to diverse biological functions associated with each domain. The Fc domain of mAbs interacts with Fc receptors with varying affinities, which can influence biological processes such as Complement-dependent cytotoxicity (CDC) and Antibody-dependent cellular cytotoxicity (ADCC), transcytosis, phagocytosis, and/or serum half-life.
An important characteristic of an antibody is its Fc effector function. Antibodies can be engineered to obtain desired binding of the Fc region to Fc receptors expressed on effector cells. Hence, it is crucial to evaluate the binding interaction of mAbs/ADC with Fc receptors in the early phase of drug development to understand the potential biological activity of the product in vivo.
Surface Plasmon Resonance (SPR) is a powerful technique to establish binding kinetics in real-time, label free, and high sensitivity with low sample consumption. Along with target antigen binding, it is crucial to evaluate the binding interaction of antibodies and ADCs with Fc receptors. Our SPR case studies investigated the impact on binding kinetics of ADCs with different linkers and the binding interactions of SARS-CoV-2 spike protein variants and evaluated the neutralizing ability of therapeutic mAbs. SPR characterisation can be facilitated in all stages of the product life cycle to ensure the quality and safety of mAbs and ADCs.
Characterization of monoclonal antibodies and Antibody drug conjugates by Sur...MilliporeSigma
The document discusses characterization of antibodies and antibody-drug conjugates (ADCs) using surface plasmon resonance (SPR). It provides details on:
1. Using SPR to characterize binding kinetics of ADCs and determine effects of different linker types and drug-antibody ratios on antigen binding. SPR shows reduced but detectable binding for ADCs versus the unconjugated antibody.
2. An application of SPR to study binding interactions of SARS-CoV-2 spike protein and mutants with the ACE2 receptor and anti-spike antibodies. This can aid understanding of viral mutations and inform vaccine and drug development.
3. SPR is proposed as a method to screen binding kinetics of spike protein mutants to evaluate effects
The document outlines the goals of the Epigenetics Project to discover chemical probes for epigenetic targets. Significant progress has been made, including developing probes for the HMT G9a and bromodomain-containing proteins in the BET subfamily. Assays have been established for many epigenetic protein families with several probes in development. Collaborations with academic and pharmaceutical partners have contributed screening capabilities and medicinal chemistry support.
OECD Webinar | Assessing the dispersion stability and dissolution (rate) of n...OECD Environment
On Thursday 25 February 2021, Anne Gourmelon (Environment Directorate, OECD), Kathrin Schwirn (German Environment Agency, Umweltbundesamt, UBA); Frank von der Kammer (University of Vienna) Research and Development Center) and Doris Völker (German Environment Agency, Umweltbundesamt, UBA) presented the scope, content, and use of the Test Guideline No. 318: Dispersion Stability of Nanomaterials in Simulated Environmental Media and its accompanying Guidance Document. Further discussions focused on the scope of the upcoming Test Guideline.
The increased production and wide usage of manufactured nanomaterials suggest a higher probability of finding them in the environment. Therefore, testing the dissolution rate and dispersion stability for toxicity assessment are of paramount importance for adequate hazard assessment.
GIAB provides benchmark reference materials and datasets to improve confidence in genome sequencing and variant calling. It has characterized variants in 7 human genomes across different reference builds. Best practices for benchmarking include using appropriate stratifications, validation tools, and metrics interpretation to evaluate variant calling accuracy. Current efforts focus on developing benchmarks using diploid genome assemblies.
This document describes computational techniques used to design novel competitive inhibitors of the E. coli 5'-methylthioadenosine/S-adenosylhomocysteine nucleosidase (MTN) enzyme. It utilized core hopping to generate 10,000 structures by varying the core while keeping functional groups constant. Docking and binding energies were calculated for subsets of compounds down to the top 8 ligands. Results show several compounds have more favorable predicted binding than the control TDI inhibitor, warranting further optimization and testing of lead compounds.
QSAR STUDY ON READY BIODEGRADABILITY OF CHEMICALS. Presented at the 3rd Chemo...Kamel Mansouri
The goal of this study was to predict ready biodegradation of
chemicals by QSAR modeling. The dataset used for this purpose was
produced by the Japanese Ministry of International Trade and Industry
(MITI) with experimental results according to the OECD test guideline
301C. Molecular descriptors from Dragon 6 were calculated. Variable
selection coupled with classification methods were applied to find the
most predictive models with low cross-validation error rate. The best
models were after that validated using the preselected test set to check
its prediction reliability and for further analysis.
1) The document discusses strategies for designing targeted arrays to screen nuclear receptor ligands, including defining nuclear receptor chemical space and using x-ray crystallography data to assess potential ligands.
2) Virtual arrays are generated and analyzed using shape and pharmacophore matching tools like ROCS to prioritize arrays based on similarity to known receptor ligands.
3) Limitations of the current approach include using a single protein structure, not accounting for flexibility, and limitations of computational docking. Advancing the methods could improve array design for orphan receptors.
We previously reported a CRISPR-mediated knock-in strategy into introns of Drosophila genes, generating an attP-FRT-SA T2A-GAL4-polyA-3XP3-EGFP-FRT-attP transgenic library for multiple uses (Lee et al., 2018a). The method relied on double stranded DNA (dsDNA) homology donors with ~1 kb homology arms. Here, we describe three new simpler ways to edit genes in flies. We create single stranded DNA (ssDNA) donors using PCR and add 100 nt of homology on each side of an integration cassette, followed by enzymatic removal of one strand. Using this method, we generated GFP-tagged proteins that mark organelles in S2 cells. We then describe two dsDNA methods using cheap synthesized donors flanked by 100 nt homology arms and gRNA target sites cloned into a plasmid. Upon injection, donor DNA (1 to 5 kb) is released from the plasmid by Cas9. The cassette integrates efficiently and precisely in vivo. The approach is fast, cheap, and scalable.
The Genome in a Bottle Consortium provides reference samples and genomic data to enable the development and benchmarking of genome analysis tools, and they have recently published new resources from linked and long read sequencing including 10x Genomics, PacBio, and Oxford Nanopore data, which have expanded their small variant benchmark set and enabled the development of a structural variant benchmark.
This document describes an experimental study that aims to optimize weld bead geometry for gas metal arc welding (GMAW) of AISI 446 steel. The study uses a Taguchi design of experiments approach combined with Grey relational analysis to solve the multi-objective optimization problem. Welding voltage, speed, wire feed rate, and gas flow rate are selected as input parameters, while bead width, height, and penetration are the output quality targets. Experiments are conducted according to an L9 orthogonal array. Bead geometry is measured and normalized data is used to calculate Grey relational grades and determine the optimal welding parameters. Analysis of variance is also used to identify the most significant factors.
Process Parameter Optimization of Bead geometry for AISI 446 in GMAW Process ...iosrjce
This document describes an experimental study that aims to optimize weld bead geometry for gas metal arc welding (GMAW) of AISI 446 steel. The study uses a Taguchi design of experiments approach combined with Grey relational analysis to solve the multi-objective optimization problem. Welding voltage, speed, wire feed rate, and gas flow rate were selected as input parameters, with bead width, height, and penetration as the output quality targets. Experiments were conducted according to an L9 orthogonal array, and the bead geometry measurements were analyzed using Grey relational grade to determine the optimal welding conditions.
This document describes an experimental study that aims to optimize weld bead geometry for gas metal arc welding (GMAW) of AISI 446 steel. The study uses a Taguchi design of experiments approach combined with Grey relational analysis to solve the multi-objective optimization problem. Welding voltage, speed, wire feed rate, and gas flow rate were selected as input parameters, with bead width, height, and penetration as the output quality targets. Experiments were conducted according to an L9 orthogonal array, and the bead geometry measurements were analyzed using Grey relational grade to determine the optimal welding conditions.
This document provides an overview of the course BIONF/BENG 203: Functional Genomics. It discusses the grading breakdown, course outline, sources of functional genomic data including expression data from microarrays and RNA-Seq, proteomic data from mass spectrometry, protein-protein interaction data, and systematic phenotyping data. High-throughput methods for measuring these various types of omics data are also summarized.
Similar to EuroBioc 2018 - metyhlKit overview (20)
EuroBioc 2018 - methylKit overview
1. methylKit, DNA methylation analysis from high-throughput bisulfite sequencing data
Alexander Gosdschan
PhD Student
Akalin Group, BIMSB MDC
bioinformatics.mdc-berlin.de
Bioconductor Europe Meeting 2018
3. Read in data
##         chrBase   chr    base strand coverage freqC  freqT
## 1 chr21.9764539 chr21 9764539      R       12 25.00  75.00
## 2 chr21.9764513 chr21 9764513      R       12  0.00 100.00
## 3 chr21.9820622 chr21 9820622      F       13  0.00 100.00
## 4 chr21.9837545 chr21 9837545      F       11  0.00 100.00
## 5 chr21.9849022 chr21 9849022      F      124 72.58  27.42
from pre-called txt files (e.g. cytosineReport or coverage files from the Bismark aligner): methRead()
from Bismark BAM reads (supported through Rhtslib): processBismarkAln()
flat-file database: dbtype = "tabix" (methyl*DB, supported through Rsamtools)
in-memory: methylRaw / methylRawList
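A minimal sketch of the in-memory route, assuming methylKit is installed and using the example CpG files bundled with the package. The sample IDs, the treatment vector, and the hg18 assembly label are illustrative choices, not taken from the slide.

```r
library(methylKit)

# Example pre-called methylation files bundled with methylKit
# (two test and two control samples).
file.list <- list(
  system.file("extdata", "test1.myCpG.txt",    package = "methylKit"),
  system.file("extdata", "test2.myCpG.txt",    package = "methylKit"),
  system.file("extdata", "control1.myCpG.txt", package = "methylKit"),
  system.file("extdata", "control2.myCpG.txt", package = "methylKit")
)

# Read all four files into an in-memory methylRawList.
myobj <- methRead(
  file.list,
  sample.id = list("test1", "test2", "ctrl1", "ctrl2"),  # illustrative IDs
  assembly  = "hg18",
  treatment = c(1, 1, 0, 0),                             # 1 = test, 0 = control
  context   = "CpG"
)

# First rows of the first sample (a methylRaw object).
head(getData(myobj[[1]]))
```

To use the flat-file route instead, the same call should accept dbtype = "tabix" together with a dbdir output directory, in which case the data are stored on disk rather than held in memory.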
4. Summarize statistics on samples
getMethylationStats(methylRaw)
methylation statistics per base
summary:
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.00   20.00   82.79   63.17   94.74  100.00
percentiles:
        0%       10%       20%       30%       40%       50%       60%       70%
   0.00000   0.00000   0.00000  48.38710  70.00000  82.78556  90.00000  93.33333
       80%       90%       95%       99%     99.5%     99.9%      100%
  96.42857 100.00000 100.00000 100.00000 100.00000 100.00000 100.00000
getCoverageStats(methylRaw)
read coverage statistics per base
summary:
    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   10.00   16.00   26.00   34.45   39.00  630.00
percentiles:
      0%     10%     20%     30%     40%     50%     60%     70%     80%     90%
  10.000  11.000  14.000  17.000  20.000  26.000  30.000  36.000  42.000  60.000
     95%     99%   99.5%   99.9%    100%
  78.750 195.800 328.300 441.945 630.000
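Summaries of this kind can be produced roughly as follows. This is a sketch that assumes the example objects loaded by data(methylKit), in particular methylRawList.obj. The choice of the second sample is arbitrary, and the numbers printed will not necessarily match those on the slide.

```r
library(methylKit)
data(methylKit)  # example objects shipped with the package, incl. methylRawList.obj

# Pick one sample; each list element is a methylRaw object.
sample2 <- methylRawList.obj[[2]]

# Print text summaries; set plot = TRUE to draw histograms instead.
getMethylationStats(sample2, plot = FALSE, both.strands = FALSE)
getCoverageStats(sample2, plot = FALSE, both.strands = FALSE)
```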
5. Segmentation
Segmenting the methylome into sections of CpGs with similar methylation profiles using change-point analysis, followed by clustering using a mixture modeling approach.
methSeg(object) - segmentation of GRanges, methylDiff, methylRaw (supported through fastseg)
Comparison of features identified using methylKit change-point based segmentation on Human IMR90 methylome with published PMDs identified with MethPipe (Song et al., 2013b), (Lister et al., 2009, Gaidatzis et al., 2014)
Wreczycka & Gosdschan (2017)
6. Compare Samples
Merge all samples into one object that holds only the base-pair locations covered in all samples:
unite(methylRawList) → methylBase
getCorrelation(methylBase)
clusterSamples(methylBase)
PCASamples(methylBase)
assocComp - batch effect correction
tileMethylCounts - tiling windows analysis
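The comparison steps above can be sketched as follows, again assuming the example methylRawList.obj from data(methylKit). The clustering settings and the 1000 bp window size are illustrative values, not recommendations from the slides.

```r
library(methylKit)
data(methylKit)  # provides methylRawList.obj (example samples)

# Keep only bases covered in every sample; the result is a methylBase object.
meth <- unite(methylRawList.obj, destrand = FALSE)

# Pairwise correlation of percent methylation between samples (printed).
getCorrelation(meth, plot = FALSE)

# Hierarchical clustering of samples; with plot = FALSE the tree is returned.
hc <- clusterSamples(meth, dist = "correlation", method = "ward", plot = FALSE)

# Principal component analysis of the samples (draws a plot).
PCASamples(meth)

# Tiling windows: summarize counts over 1000 bp non-overlapping windows
# (window and step sizes are illustrative).
tiles <- tileMethylCounts(methylRawList.obj, win.size = 1000, step.size = 1000)
```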
7. Differential Analysis
Testing for differential methylation using either Fisher’s exact test or a Chisq test for a logistic regression model (depending on the sample size per set), with p-value adjustment using the SLIM method (Wang, Tuominen, and Tsai 2011):
calculateDiffMeth(methylBase) → methylDiff
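To make the single-sample-per-group case concrete, the sketch below runs Fisher's exact test on one CpG in plain base R. The read counts are invented for illustration, and the code is not methylKit's internal implementation. It only shows the kind of 2 x 2 comparison of methylated (C) and unmethylated (T) read counts that such a test performs.

```r
# Hypothetical read counts at one CpG: methylated (C) and unmethylated (T).
counts <- matrix(
  c(9, 3,     # test sample:    9 C reads, 3 T reads
    2, 10),   # control sample: 2 C reads, 10 T reads
  nrow = 2,
  dimnames = list(read = c("C", "T"), sample = c("test", "control"))
)

# Percent methylation per sample and the difference (test - control).
pct_meth  <- 100 * counts["C", ] / colSums(counts)
meth_diff <- unname(pct_meth["test"] - pct_meth["control"])

# Fisher's exact test on the 2 x 2 table of read counts.
p_value <- fisher.test(counts)$p.value

print(pct_meth)
print(meth_diff)
print(p_value)
```

With replicates in each group, a per-base logistic regression takes the place of this single-table test, and the resulting p-values are then adjusted for multiple testing.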
getMethylDiff - filtering differential bases
calculateDiffMeth(..., mc.cores=2) - use multiple cores
Optional correction for overdispersion if more variability is present in the data than assumed by the binomial distribution:
calculateDiffMeth(methylBase,overdispersion="MN")
Covariates can be included in the analysis to separate the influence of the covariates from the treatment effect via the logistic regression model. This tests whether the full model is better than the model with the covariates only.
covariates=data.frame(age=c(30,80,30,80))
calculateDiffMeth(methylBase,covariates=covariates)
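Put together, the calls on this slide might be used as in the sketch below. It assumes the example methylBase.obj from data(methylKit), which holds two test and two control samples. The cutoffs of 25% and 0.01 are the ones that appear on the annotation slide, and the ages are the illustrative covariate values shown above.

```r
library(methylKit)
data(methylKit)  # provides methylBase.obj (2 test + 2 control samples)

# Per-base test for differential methylation; returns a methylDiff object.
myDiff <- calculateDiffMeth(methylBase.obj)

# Keep bases with an absolute methylation difference above 25%
# and a q-value below 0.01.
myDiff25p <- getMethylDiff(myDiff, difference = 25, qvalue = 0.01)

# Same test with overdispersion correction.
myDiff.od <- calculateDiffMeth(methylBase.obj, overdispersion = "MN")

# Adjust for a covariate: one row per sample, in sample order.
covariates <- data.frame(age = c(30, 80, 30, 80))
myDiff.cov <- calculateDiffMeth(methylBase.obj, covariates = covariates)
```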
8. Annotation
Use the genomation package to annotate differentially methylated regions/bases based on gene annotation.
Presentation on Friday: Session VI - Katarzyna Wreczycka
first read the gene BED file:
gene.obj=readTranscriptFeatures(system.file("extdata", "refseq.hg18.bed.txt", package = "methylKit"))
then get all differentially methylated bases:
myDiff25p=getMethylDiff(methylDiff,difference=25, qvalue=0.01)
now annotate differentially methylated CpGs with promoter/exon/intron using annotation data:
diffAnn=annotateWithGeneParts(as(myDiff25p,"GRanges"), gene.obj)
finally visualize the annotation:
plotTargetAnnotation(diffAnn,precedence=TRUE, main="differential methylation annotation")
9. Acknowledgements
BIMSB: Altuna Akalin, Katarzyna Wreczycka
Bioconductor Team
Code:
- https://github.com/al2na/methylKit
Blog:
- http://zvfak.blogspot.com/search/label/methylKit
Support:
- https://groups.google.com/forum/#!forum/methylkit_discussion