PICS: Pathway Informed Classification System for cancer analysis using gene e... (David Craft)
We introduce PICS (Pathway Informed Classification System) for classifying cancers based on tumor sample gene expression levels. The method clearly separates a pan-cancer dataset by tissue of origin and can also sub-classify individual cancer datasets into distinct survival classes. Gene expression values are collapsed into pathway scores that reveal which biological activities are most useful for clustering cancer cohorts into sub-types. Variants of the method allow it to be used on datasets with or without non-cancerous samples. Activity levels of all types of pathways, broadly grouped into metabolic, cellular processes and signaling, and immune system, are useful for separating the pan-cancer cohort. In the clustering of specific cancer types, certain pathway types become more valuable depending on the site being studied: signaling pathways dominate for lung cancer, signaling and metabolic pathways for pancreatic cancer, and immune system pathways for melanoma. This work suggests the utility of pathway-level genomic analysis and points toward using pathway classification to predict the efficacy and side effects of drugs and radiation.
Identifying novel and druggable targets in a triple negative breast cancer ce... (Thermo Fisher Scientific)
In this study, we developed a CRISPR/Cas9-based high-throughput loss-of-function screen for identifying target genes responsible for tumor proliferation and growth in TNBC. Our initial focus was to identify essential kinases in the MDA-MB-231 cell line using the Invitrogen™ LentiArray™ Human Kinase CRISPR Library, which targets 840 kinases with up to 4 different gRNAs per protein kinase for complete gene knockout. This functional screen identified over 90 protein kinases that are essential for cell viability and cell proliferation. Ten of these hits (CDK1, CDK2, CDK8, CDK10, CDK11A, CDK19, CDK19, CDC7, EPHA2 and WEE1) are well-known targets validated in the literature. We are currently validating the novel hits through target gene sequencing, western blotting and target-specific small molecule kinase inhibitors.
Intelligent Systems for Cancer Genomics (AIS305) - AWS re:Invent 2018 (Amazon Web Services)
One of the most exciting frontiers in science is building automated systems that use existing biomedical data to understand and ultimately treat human disease. The key difficulty in the case of cancer is that it is a highly heterogeneous disease, making it challenging to uncover which molecular alterations in tumors are important for the disease and to predict how an individual will respond to treatment. This talk presents an overview of integrative computational methods for analyzing cancer genomes that leverage a diverse range of complementary data in order to extract biomedically relevant insights.
Applications of protein array in diagnostics and genomic and proteomic (Susan Rey)
Microarray technology can simultaneously analyze thousands of parameters in a single experiment. Micro-spots of capture molecules are fixed in rows on a solid support and exposed to samples containing the corresponding binding molecules. Complex formation in each micro-spot can be detected by a readout system based on fluorescence, chemiluminescence, mass spectrometry, radioactivity or electrochemistry. Miniaturized and parallelized binding assays, whose analytical power can be further extended by microarray gene expression analysis, are sensitive. These systems can be used to detect the degree of hybridization when immobilized DNA microarray probes are exposed to a complementary target. Protein arrays have so far demonstrated applications in enzyme-substrate, DNA-protein and different types of protein-protein interactions. In this post, we discuss capture-molecule-ligand analysis, its theoretical advantages and disadvantages, and its influence on diagnostics, genomics and proteomics.
Resolving Ambiguity in Target ID Screens - CRISPR-Cas9 Based Essentiality Pro... (Candy Smellie)
Pathfinder Target Essentiality Assay Service
A new CRISPR-Cas9 based medium throughput assay service for validation of target gene essentiality
Can be used to resolve ambiguous screening results
Can also provide information on drug target suitability
This assay developed at Horizon will enable you to identify genes essential for the growth of specific cancer cell lines.
It can be used to definitively resolve ambiguous screening results.
Or to provide information on target suitability – by testing essentiality in “normal” cells, or in cancer subtypes different to the proposed patient population
This is the Powerpoint presentation from my recent presentation at the TTP LabTech US Acumen Users Group Meeting (UGM) held at the British Consulate-General in Cambridge, MA on May 18, 2010
the document is about chromosomal analysis technique named array CGH technology, the complete procedure and the result interpretation of chromosomal variation
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic... (Elia Brodsky)
This workshop will address critical issues related to Transcriptomics data:
Processing raw Next Generation Sequencing (NGS) data:
1. Next Generation Sequencing data preprocessing:
Trimming technical sequences
Removing PCR duplicates
2. RNA-seq based quantification of expression levels:
Conventional pipelines (looking at known transcripts)
Identification of novel isoforms
Analysis of Expression Data Using Machine Learning:
3. Unsupervised analysis of expression data:
Principal Component Analysis
Clustering
4. Supervised analysis:
Differential expression analysis
Classification, gene signature construction
5. Gene set enrichment analysis
The workshop will include hands-on exercises utilizing public domain datasets:
breast cancer cell lines transcriptomic profiles (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-10-r110),
patient-derived xenograft (PDX) mouse model of tumor and stroma transcriptomic profiles (http://www.oncotarget.com/index.php?journal=oncotarget&page=article&op=view&path[]=8014&path[]=23533), and
processed data from The Cancer Genome Atlas samples (https://cancergenome.nih.gov/).
Team: The workshops are designed by the researchers at the Tauber Bioinformatics Research Center at University of Haifa, Israel in collaboration with academic centers across the US. Technical support for the workshops is provided by the Pine Biotech team. https://edu.t-bio.info/a-critical-approach-to-transcriptomic-data-analysis/
A micro-array is a tool for analyzing gene expression that consists of a small membrane or glass slide containing samples of many genes arranged in a regular pattern.
This was made by me while I was in Masters. I have made few animations. I hope it makes understanding better.
The content is made by searching through internet and referencing books. I do not claim any content in whole presentation except the animations made on the subject.
The analysis of proteins and messenger RNA is commonly used in the comparison of gene expression patterns in tissues or cells of different types and under distinct conditions. In gene expression analysis, normalization is a critical step as it guarantees the validity of downstream analyses. Data preprocessing is an indispensable step in the extraction and normalization of microarray gene expression data. The normalization of gene expression data is essential in ensuring accurate inferences. A number of normalization methods in high throughput sequencing studies are being employed. The preprocessing activity begins by a careful analysis of the gene expression data and usually involves the classification of many raw signal intensities into one expression value. The Robust Multiarray Average (RMA) is a normalization approach for microarrays that involves background correction, normalization and summarization of probe levels information without using MM probes (Lim et al., 2007). It is an algorithm commonly used in the creation of an expression matrix for Affymetrix data and is one of the most commonly used modes of preprocessing to normalize gene expression data. Values of raw intensity are initially background corrected and log2 transformed before being normalized. In order to generate an expression measure for probe sets on each array, a linear model is fitted to the normalized data.
Integrative analysis of transcriptomics and proteomics data with ArrayMining ... (Natalio Krasnogor)
These slides are part of a presentation I gave on March 2010 at the BioInformatics and Genome Research Open Club at the Weizmann Institute of Science, Israel.
In these slides my student and I describe two web-applications for microarray and gene/protein set analysis,
ArrayMining.net and TopoGSA. These use ensemble and consensus methods as well as the
possibility of modular combinations of different analysis techniques for an integrative view of
(microarray-based) gene sets, interlinking transcriptomics with proteomics data sources. This integrative process uses tools from different fields, e.g. statistics, optimisation and network
topological studies. As an example for these integrative techniques, we use a microarray
consensus-clustering approach based on Simulated Annealing, which is part of the ArrayMining.net
Class Discovery Analysis module, and show how this approach can be combined in a modular
fashion with a prior gene set analysis. The results reveal that improved cluster validity indices can be obtained by merging the two methods, and provide pointers to distinct sub-classes within pre-defined tumour categories for a breast cancer dataset by the Nottingham Queens Medical Centre.
In the second part of the talk, I show how results from a supervised
microarray feature selection analysis on ArrayMining.net can be investigated in further detail with
TopoGSA, a new web-tool for network topological analysis of gene/protein sets mapped on a
comprehensive human protein-protein interaction network. I discuss results from a TopoGSA
analysis of the complete set of genes currently known to be mutated in cancer.
Utilization of NGS to Identify Clinically-Relevant Mutations in cfDNA: Meet t... (QIAGEN)
Pancreatic cancer is a uniquely lethal malignancy characterized by frequent mutations in KRAS, CDKN2A, SMAD4, TP53 and many others. We have shown that KRAS mutation can be detected in cell-free, circulating tumor DNA (ctDNA) isolated from the plasma in a subset of patients and is associated with poor prognosis. The ability to simultaneously detect multiple pancreatic cancer-specific mutations in ctDNA would open a new avenue for detection of clinically-relevant mutations. In this study, we performed ultra-deep sequencing of ctDNA from advanced pancreatic cancer patients prior to treatment with Gemcitabine and Erlotinib following target enrichment. Somatic, non-synonymous variants were identified in 29 different genes at allele frequencies typically less than 0.5%. Updated results of ultra-deep NGS analysis will be presented.
In-silico structure activity relationship study of toxicity endpoints by QSAR... (Kamel Mansouri)
Several thousand chemicals were tested in hundreds of toxicity-related in-vitro high-throughput screening (HTS)
bioassays through the EPA’s ToxCast and Tox21 projects. However, this chemical set only covers a portion of the chemical
space of interest for environmental risk assessment, leading to a need to fill data gaps with other methods. A cost effective
and reliable approach to fulfill this task is to build quantitative structure-activity relationships (QSARs).
In this work, a subset of 1877 chemicals from ToxCast were used to build QSAR models. These models will be applied
to predict values for multiple ToxCast assays in a larger environmental database of ~30K chemical structures.
Based on a clustering study by Sipes et al. (2013), the initial molecular targets of this effort consisted of a set of 18
NovaScreen G-protein coupled receptor (GPCR) assays. These assays are part of the aminergic category that showed the
highest number of actives within the ToxCast portfolio. Classification methods including SOM, SVM, PLSDA and kNN, were
tested. These methods were coupled to variable selection techniques such as genetic algorithms that were applied in order
to select the best representative molecular descriptors based on statistical fitness functions. The obtained models were
validated and their prediction ability measured. The models that showed good results will be applied within the limits of
their established chemical space defined by the applicability domain.
2. Part 1: Cluster analysis on part of the NCI60 data.
Part 2: Cluster analysis on kinase genes of the Golub data using the neighbor joining method.
3. Hierarchical Clustering
A connectivity-based clustering approach.
A whole family of methods that differ in the way distances are computed.
The result is represented as a dendrogram.
4. NCI60 data
• Dataset of gene expression profiles.
• The format is a list containing two elements:
data: a 64x6830 matrix of gene expression values (64 cell lines, 6830 genes).
labs: a vector giving the cancer type of each cell line. The 9 cancer types are leukemia, melanoma, lung, colon, CNS, ovarian, renal, breast and prostate cancers.
5. Computations on NCI60
• PCA
• Cluster analysis using complete, average and single linkage methods on:
• Set 1: Breast cancer and ovarian cancer cell lines (metastasis).
• Set 2: Colon cancer and prostate cancer cell lines (metastasis).
• Set 3: Colon cancer and renal cancer cell lines (no association found).
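The three linkage methods listed above can be compared with base R's `dist()` and `hclust()`. The following is a minimal sketch on simulated data, not the analysis from the slides: the matrix `x`, its dimensions, the mean shift between groups and the row names are all assumptions chosen to mimic a 7-versus-6 breast/ovarian subset.

```r
set.seed(42)

# Hypothetical stand-in for a breast/ovarian subset of NCI60:
# 13 samples (rows) x 50 genes (columns), with the second group shifted.
x <- rbind(matrix(rnorm(7 * 50, mean = 0), nrow = 7),
           matrix(rnorm(6 * 50, mean = 3), nrow = 6))
rownames(x) <- c(paste0("BREAST_", 1:7), paste0("OVARIAN_", 1:6))

# Euclidean distances between samples
d <- dist(x)

# Fit one tree per linkage method
methods <- c(complete = "complete", average = "average", single = "single")
fits <- lapply(methods, function(m) hclust(d, method = m))

# Cut each tree into two clusters so the assignments can be compared
groups <- lapply(fits, cutree, k = 2)

# Draw the three dendrograms side by side
op <- par(mfrow = c(1, 3))
for (m in names(fits)) {
  plot(fits[[m]], main = paste(m, "linkage"), xlab = "", sub = "")
}
par(op)
```

Calling `table(groups$complete, sub("_.*", "", rownames(x)))` cross-tabulates the cluster assignment against the sample labels, which is a quick way to see whether a linkage method recovers the two groups.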
6. PCA Computations.
The prcomp() function outputs the standard deviation of each principal component.
Squaring these standard deviations gives the variance explained by each principal component.
Proportion of variance explained (PVE) by each principal component = variance explained by that principal component / total variance explained by all principal components.
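The PVE calculation can be written in a few lines of base R. This is a minimal sketch on a random matrix standing in for the NCI60 data: the matrix `x` and its size are assumptions, and `scale. = TRUE` is an assumed choice, since the slides do not say whether the genes were scaled.

```r
set.seed(1)

# Hypothetical stand-in for the expression matrix: 64 samples x 100 genes
x <- matrix(rnorm(64 * 100), nrow = 64)

# PCA; prcomp() returns the standard deviation of each component in $sdev
pr <- prcomp(x, scale. = TRUE)

# Variance of each principal component
pc_var <- pr$sdev^2

# Proportion of variance explained, and its running total
pve <- pc_var / sum(pc_var)
cum_pve <- cumsum(pve)

head(round(pve, 3))
```

Plotting `pve` and `cum_pve` against the component index, for example with `plot(pve, type = "o")`, gives the scree-style view of how quickly the explained variance falls off.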
7. Barplot of PCA on a part of NCI60 data
[Figure: bar plot of the variances of the principal components of the NCI60 data; y-axis "Variances", 0 to 1000.]
8. Screeplot
It is more informative to plot the PVE and the cumulative PVE of each principal component.
While each of the first 5 principal components explains a substantial amount of variance, there is a marked decrease in the variance explained by the later principal components.
9. [Figure: three dendrograms of the breast (BREAST) and ovarian (OVARIAN) cancer cell lines, produced by hclust on the distance object DtBO with complete, average and single linkage; y-axis "Height".]
In complete linkage, one of the ovarian cancer cell lines is very closely related to a breast cancer cell line, and the two form a separate cluster together.
10. Clustering entire NCI60 data by Euclidean and
Maximum distance methods.
“Maximum” metric method: the distance between two samples (x and y) is the largest absolute difference between any pair of their corresponding components, max |x_i - y_i|.
Samples separated by a small maximum distance are clustered together, just as with Euclidean distance; only the definition of distance changes.
We can use both metrics to examine the relatedness of breast cancer and ovarian cancer cell lines.
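In R, `dist()` with `method = "maximum"` returns the largest absolute coordinate-wise difference between two rows. A small sketch with two made-up vectors (the values are illustrative only) shows how it differs from the Euclidean distance:

```r
# Two made-up expression profiles with three genes each
x <- c(1, 5, 2)
y <- c(4, 1, 2)
m <- rbind(x, y)

# Maximum (Chebyshev) distance: max(|1-4|, |5-1|, |2-2|) = 4
d_max <- dist(m, method = "maximum")

# Euclidean distance: sqrt(3^2 + 4^2 + 0^2) = 5
d_euc <- dist(m, method = "euclidean")

print(as.numeric(d_max))  # 4
print(as.numeric(d_euc))  # 5
```

Either distance object can be passed to `hclust()`; the choice of metric changes which samples count as close, not the rule that close samples are merged first.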
15. Conclusion of part 1.
• Complete linkage cluster analysis of ovarian cancer and breast cancer cell lines shows a close relatedness between the two.
• Cluster analysis on the other two subsets (colon cancer and prostate cancer; colon cancer and renal cancer) shows that the methods clearly group cell lines within a single cancer type.
• Single linkage tends to yield trailing clusters, onto which individual samples attach one by one.
• Complete and average linkage tend to yield more balanced clusters.
16. Part 2: Cluster analysis of kinase genes in Golub data using Neighbor joining method.
Why kinase genes?
Kinases modify other proteins by chemically adding a phosphate group to them. Phosphorylation can switch a protein's activity on or off.
Kinases regulate the majority of cellular signal transduction pathways. Errors in signaling pathways are responsible for diseases such as cancer and autoimmunity.
Some kinase inhibitors are used in treating cancer.
The aim is to identify closely related kinase genes through cluster analysis.
Such closely related kinases often have similar structure and function.
A kinase inhibitor designed for one kinase might therefore also work for a group of closely related kinases or a whole kinase family.
17. Why Neighbor Joining Method?
• The NJ method is statistically consistent under many models of evolution.
• Unlike UPGMA, neighbor joining does not assume that all lineages evolve at the same rate.
• Well suited to assigning individuals to groups, which often correspond to self-identified geographical ancestry.
18. Computations:
• Get only the kinase genes from the Golub data.
> library("multtest"); data(golub)
> o <- grep("kinase", golub.gnames[,2])
> length(o)
[1] 139
There are 139 kinase genes.
• Use a two-sample t-test to select genes with an experimental effect.
> # gol.fac is not defined on the slide; assumed here to be the ALL/AML class factor built from golub.cl
> gol.fac <- factor(golub.cl, levels=0:1, labels=c("ALL","AML"))
> pt <- apply(golub, 1, function(x) t.test(x ~ gol.fac)$p.value)
> oo <- o[pt[o] < 0.01]
> kin <- golub[oo,]
> dim(kin)
[1] 28 38
This yields 28 genes.
• Perform NJ clustering on the 28 kinase genes.
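The slide does not show the call used for the neighbor joining step. One plausible way to do it, assuming the `ape` package is installed, is to pass a distance matrix between genes to `ape::nj()`. The sketch below uses a random matrix `kin_demo` as a hypothetical stand-in for `kin` (28 genes by 38 samples) so that it runs without the Golub data; the Euclidean distance is also an assumption.

```r
# Assumption: the 'ape' package is available; the slides do not name the package used.
library(ape)

set.seed(1)

# Hypothetical stand-in for `kin`: 28 genes (rows) x 38 samples (columns)
kin_demo <- matrix(rnorm(28 * 38), nrow = 28,
                   dimnames = list(paste0("gene", 1:28), NULL))

# Euclidean distances between genes (rows of the matrix)
d <- dist(kin_demo)

# Neighbor joining tree; nj() returns an unrooted tree of class "phylo"
tree <- nj(d)

# Plot with small tip labels so 28 gene names remain readable
plot(tree, cex = 0.6)
```

With the real data, `kin_demo` would be replaced by `kin`, with row names set from the gene descriptions in `golub.gnames` so that the tips of the tree are labeled by gene.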
19. Tip labels of the tree:
PRKCD Protein kinase C, delta
PFKP Phosphofructokinase, platelet
Fructose 6-phosphate,2-kinase/fructose 2,6-bisphosphatase
PRKCQ Protein kinase C-theta
Protein tyrosine kinase related mRNA sequence
PRKAR1A CAMP-dependent protein kinase regulatory subunit type I
DCK Deoxycytidine kinase
BLK Protein-tyrosine kinase blk
Protein kinase inhibitor [human, neuroblastoma cell line SH-SY-5Y, mRNA, 2147 nt]
Serine kinase mRNA
CSNK1D Casein kinase 1, delta
Protein kinase C-binding protein RACK7 mRNA, partial cds
Protein kinase ATR mRNA
Hematopoietic progenitor kinase (HPK1) mRNA
CaM kinase II isoform mRNA
ITPKB Inositol 1,4,5-trisphosphate 3-kinase B
DAGK1 Diacylglycerol kinase, alpha (80kD)
RPL7A Neurotrophic tyrosine kinase, receptor, type 1
MST1R Protein-tyrosine kinase RON
mRNA (clone C-2k) mRNA for serine/threonine protein kinase
Nucleoside-diphosphate kinase
Ndr protein kinase
Phosphatidylinositol 3-kinase
DNA-dependent protein kinase catalytic subunit (DNA-PKcs) mRNA
CALM1 Calmodulin 1 (phosphorylase kinase, delta)
DAGK4 Diacylglycerol kinase delta
GB DEF = T-lymphocyte specific protein tyrosine kinase p56lck (lck) abberant mRNA
PRKCB1 Protein kinase C, beta 1
NJ clustering on the 28 kinase genes.
20. Conclusion:
The two tyrosine kinase genes cluster close to each other: “GB DEF = T-lymphocyte specific protein tyrosine kinase p56lck (lck) abberant mRNA” and “Protein tyrosine kinase related mRNA sequence”.
This result could inform the design of a kinase inhibitor that might act on all the related kinases and help treat certain types of cancer.
21. Biochemical techniques to test the above analysis:
Perform automated chain-termination or Maxam-Gilbert DNA sequencing for each of the above closely related genes.
Also obtain the protein encoded by each gene and sequence it.
Perform multiple sequence alignment (MSA) with the help of an online tool.
Compare the neighbor joining tree obtained by the above computation with the phylogenetic tree produced by the MSA tool from the sequences obtained through wet lab analysis.
22. References:
• Klein RL, Brown AR, Gomez-Castro CM, Chambers SK, Cragun JM, Grasso-LeBeau L, Lang JE. Ovarian Cancer
Metastatic to the Breast Presenting as Inflammatory Breast Cancer: A Case Report and Literature Review. J
Cancer 2010; 1:27-31. doi:10.7150/jca.1.27. Available from http://www.jcancer.org/v01p0027.htm
• Malumbres M, and Barbacid M. 2007 Feb 17th, Cell cycle Curr Opin Genet Dev. 60-5 [PMID: 17208431]
• Saitou N, and Nei M. 1987 July 4th The neighbor-joining method: a new method for reconstructing
phylogenetic trees. Mol Biol Evol. 406-25 [PMID: 3447015]
• Bibliography:
• Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani, Feb11th, 2013 An Introduction to
Statistical Learning: with Applications in R, pp377-419.
• Wim P. Krijnen (2009) Applied Statistics for Bioinformatics using R.