One of the most exciting frontiers in science is building automated systems that use existing biomedical data to understand and ultimately treat human disease. The key difficulty in the case of cancer is that it is a highly heterogeneous disease, making it challenging to uncover which molecular alterations in tumors are important for the disease and to predict how an individual will respond to treatment. This talk presents an overview of integrative computational methods for analyzing cancer genomes that leverage a diverse range of complementary data in order to extract biomedically relevant insights.
Developing a framework for detection of low frequency somatic genetic alt... - Ronak Shah
Cancer is a complex, heterogeneous disease of the genome. Most cancers result
from an accumulation of multiple genetic alterations that lead to dysfunction of cancer-associated
genes and pathways. Recent advances in sequencing technology have enabled comprehensive
profiling of genetic alterations in cancer. We have established a targeted sequencing platform
(IMPACT: Integrated Mutation Profiling of Actionable Cancer Targets) using hybridization capture and
next-generation sequencing (NGS) technology, which can reveal mutations, indels and copy number
alterations involving 340 cancer-related genes.
The IMPACT of INDEL realignment: Detecting insertions and deletions longer th... - Ronak Shah
Cancer is a disease of the genome: most of its forms result from a buildup of genetic alterations that, directly or indirectly, allow the patient’s cells to proliferate without restraint. For decades, identifying and targeting cancer mutations for treatment was impractical due to the limitations of sequencing technology. However, the rise of high-throughput next-generation sequencing (NGS) tools has allowed researchers to rapidly and cheaply sequence large, targeted regions of DNA. MSK-IMPACT (Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets), a sequencing platform with an associated computational pipeline, takes advantage of improvements in sequencing technology to analyze tumor specimens for clinically actionable variants in 341 cancer-associated genes. Critical to IMPACT’s efficacy is the detection of somatic DNA alterations like INDELs, which are insertions or deletions of nucleotides. Current sequence aligners have difficulty accurately mapping reads (short, overlapping DNA sequences) containing more than a single base change, let alone reads containing INDELs. This flaw necessitates the use of INDEL realigners, which rearrange reads in regions where INDELs might exist in order to identify them more easily. Currently, the INDEL realignment software associated with MSK-IMPACT’s computational pipeline, the Genome Analysis Toolkit’s (GATK) IndelRealigner, can only efficiently resolve INDELs shorter than 30 base pairs, which limits the platform’s reliability for INDEL detection. Thus, we tested and compared the performance of a new INDEL realigner called ABRA (Assembly Based Re-Aligner) to that of GATK’s IndelRealigner.
BioNetVisA 2018 ECCB workshop
From biological network reconstruction to data visualization and analysis in molecular biology and medicine.
http://eccb18.org/workshop-2/
https://bionetvisa.github.io/
An understanding of genetics and epigenetics is essential to keep up with the paradigm shift that is underway. Personalized medicine and gene therapy will converge in the days to come.
This review highlights traditional approaches as well as current advancements in the analysis of gene expression data from a cancer perspective.
Due to improvements in biometric instrumentation and automation, it has become easier to collect large amounts of experimental data in molecular biology.
Analysis of such data is extremely important, as it leads to knowledge discovery that can be validated by experiments. Conventionally, complex genetic diseases have been diagnosed based on non-molecular characteristics such as the kind of tumor tissue, pathological characteristics, and clinical phase.
Microarray data are characterized by high dimensionality and noise, which have been the reasons for ineffective and imprecise results. Several machine learning and data mining techniques are presently applied to identify cancer using gene expression data.
While differences in efficiency do exist, none of the well-established approaches is uniformly superior to the others. The quality of an algorithm is important, but is not in itself a guarantee of the quality of a specific data analysis.
http://kaashivinfotech.com/
http://inplanttrainingchennai.com/
http://inplanttraining-in-chennai.com/
http://internshipinchennai.in/
http://inplant-training.org/
http://kernelmind.com/
This document covers array CGH, a chromosomal analysis technique: the complete procedure and the interpretation of results for chromosomal variation.
Identification of cancer drivers across tumor types - Nuria Lopez-Bigas
Thousands of tumor genomes/exomes are being sequenced as part of the International Cancer Genome Consortium (ICGC), The Cancer Genome Atlas (TCGA) and other initiatives. This opens the possibility to have, for the first time, a comprehensive picture of mutations, genes and pathways involved in the cancer phenotype across tumor types. We have developed computational methods able to identify signals of positive selection in the pattern of tumor somatic mutations, which point to genes and pathways directly involved in the development of the tumors. We have applied these approaches to 3025 tumors from 12 different cancer types of the TCGA Pan-Cancer project, identifying 291 high-confidence cancer driver genes acting on those tumors (Tamborero et al 2013). We have also developed IntOGen-mutations (http://www.intogen.org/mutations), a novel web platform for cancer genome interpretation, which analyses not only TCGA pan-cancer data but all mutation data from ICGC and other initiatives. The resource allows users to identify driver mutations, genes and pathways acting on more than 6000 tumors originating in 17 different cancer sites and to analyze newly sequenced tumor genomes. Among the novel cancer drivers identified there are chromatin regulatory factors and splicing factors, which are emerging as important genes in cancer development and are regarded as interesting candidates for novel targets for cancer treatment. In my talk I will summarize all these recent findings.
More info: http://bg.upf.edu/blog/2013/10/my-slides-on-identification-of-cancer-drivers-across-tumor-types/
In Silico Prescription of Anticancer Drugs Reveals Targeting Opportunities - Nuria Lopez-Bigas
Large efforts dedicated to sequencing thousands of tumor genomes/exomes are expected to lead to significant improvements in precision cancer medicine. However, high inter-tumor heterogeneity is a major obstacle on the road to developing an arsenal of targeted cancer drugs to treat most cancer patients. Therefore, it is critical to understand the current scope of anti-cancer targeted drugs for different tumor types in order to use them with the highest efficacy, and to define priorities for the development of new ones. We have developed a novel methodology to interpret the genomes of a cohort of tumor samples and to assess their therapeutic opportunities. Starting with somatic mutations detected across the cohort, the methodology identifies the driver genes, highlights those that dominate the clonal landscape of the tumors and determines their mode of action. It then does an in-silico prescription of approved and candidate targeted drugs to each patient in the cohort. The application of this approach to a cohort of 6795 cancer samples of 28 different tumor types showed that the fraction of patients that could benefit from prescribed FDA-approved drugs is strikingly small. Nevertheless, it improves significantly if repurposing opportunities are taken into consideration, with large differences between tumor types. In addition, we identify 80 therapeutically unexploited cancer genes, tightly bound by pre-clinical small molecules or potentially suitable for molecule binding. The resource created with this analysis is also intended to provide interpretation of newly sequenced cancer genomes and to design pan-cancer and tumor type specific sequencing panels for efficient early cancer detection and clinical insight.
More details at http://www.intogen.org
Proteogenomic analysis of human colon cancer reveals new therapeutic opportun... - Gul Muneer
We performed the first proteogenomic study on a prospectively collected colon cancer cohort. Comparative proteomic and phosphoproteomic analysis of paired tumor and normal adjacent tissues produced a catalog of colon cancer-associated proteins and phosphosites, including known and putative new biomarkers, drug targets, and cancer/testis antigens. Proteogenomic integration not only prioritized genomically inferred targets, such as copy-number drivers and mutation-derived neoantigens, but also yielded novel findings. Phosphoproteomics data associated Rb phosphorylation with increased proliferation and decreased apoptosis in colon cancer, which explains why this classical tumor suppressor is amplified in colon tumors and suggests a rationale for targeting Rb phosphorylation in colon cancer. Proteomics identified an association between decreased CD8 T cell infiltration and increased glycolysis in microsatellite instability-high (MSI-H) tumors, suggesting glycolysis as a potential target to overcome the resistance of MSI-H tumors to immune checkpoint blockade. Proteogenomics presents new avenues for biological discoveries and therapeutic development.
Assessing the clinical utility of cancer genomic and proteomic data across tu... - Gul Muneer
Molecular profiling of tumors promises to advance the clinical
management of cancer, but the benefits of integrating
molecular data with traditional clinical variables have not been
systematically studied. Here we retrospectively predict patient
survival using diverse molecular data (somatic copy-number
alteration, DNA methylation and mRNA, microRNA and protein
expression) from 953 samples of four cancer types from The
Cancer Genome Atlas project. We find that incorporating
molecular data with clinical variables yields statistically
significantly improved predictions (FDR < 0.05) for three
cancers but those quantitative gains were limited (2.2–23.9%).
Additional analyses revealed little predictive power across
tumor types except for one case. In clinically relevant genes,
we identified 10,281 somatic alterations across 12 cancer types
in 2,928 of 3,277 patients (89.4%), many of which would
not be revealed in single-tumor analyses. Our study provides
a starting point and resources, including an open-access
model evaluation platform, for building reliable prognostic and
therapeutic strategies that incorporate molecular data.
Sequencing 60,000 Samples: An Innovative Large Cohort Study for Breast Cancer... - QIAGEN
This slide deck focuses on the design of a large cohort study for assessing breast cancer risk and how an innovative digital sequencing approach is able to solve the previously unmet challenges of this type of NGS study design. Our speaker, Dr. Fergus J. Couch of the Mayo Clinic, presents the design of this NCI-funded project, which comprises the sequencing of 60,000 samples to assess the risk of breast cancer through association with targeted genes. The design and size of the study require an accurate, robust and high-throughput sequencing method. The investigators are using a digital DNA sequencing approach from QIAGEN that incorporates molecular barcodes to tag and remove PCR duplicates and increase NGS assay sensitivity. The approach also uses proprietary chemistry that enables uniform sequencing to efficiently utilize sequencing power and deliver optimized results.
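The idea of using molecular barcodes to remove PCR duplicates can be illustrated with a minimal sketch. It is not QIAGEN's implementation: the read fields (`chrom`, `start`, `barcode`, `qual`) and the rule of keeping the highest-quality copy are assumptions made purely for illustration.

```python
from collections import defaultdict

def collapse_by_barcode(reads):
    """Keep one read per (chromosome, start, barcode) group.

    Each read is a dict with hypothetical keys: 'chrom', 'start',
    'barcode' (the molecular tag) and 'qual' (a mean base quality).
    """
    groups = defaultdict(list)
    for read in reads:
        key = (read["chrom"], read["start"], read["barcode"])
        groups[key].append(read)
    # Reads sharing a key are treated as PCR copies of one original
    # molecule; the highest-quality copy is kept as the representative.
    return [max(group, key=lambda r: r["qual"]) for group in groups.values()]

reads = [
    {"chrom": "chr17", "start": 100, "barcode": "ACGT", "qual": 30},
    {"chrom": "chr17", "start": 100, "barcode": "ACGT", "qual": 35},  # PCR duplicate
    {"chrom": "chr17", "start": 100, "barcode": "TTAG", "qual": 28},  # distinct molecule
]
unique = collapse_by_barcode(reads)
print(len(unique))
```

Without the barcode, all three reads would look like duplicates because they share a position; the barcode is what lets the third read be kept as a distinct molecule.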
PICS: Pathway Informed Classification System for cancer analysis using gene e... - David Craft
We introduce PICS (Pathway Informed Classification System) for classifying cancers based on tumor sample gene expression levels. The method clearly separates a pan-cancer dataset by tissue of origin and is also able to sub-classify individual cancer datasets into distinct survival classes. Gene expression values are collapsed into pathway scores that reveal which biological activities are most useful for clustering cancer cohorts into sub-types. Variants of the method allow it to be used on datasets that do and do not contain non-cancerous samples. Activity levels of all types of pathways, broadly grouped into metabolic, cellular processes and signaling, and immune system, are useful for separating the pan-cancer cohort. In the clustering of specific cancer types, certain pathway types become more valuable depending on the site being studied. For lung cancer, signaling pathways dominate; for pancreatic cancer, signaling and metabolic pathways; and for melanoma, immune system pathways are the most useful. This work suggests the utility of pathway level genomic analysis and points in the direction of using pathway classification for predicting the efficacy and side effects of drugs and radiation.
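As a rough illustration of collapsing gene expression values into pathway scores, the sketch below averages the expression of each pathway's member genes. The abstract does not state how PICS computes its scores, so the mean is an assumption, and the gene names, values and pathway memberships are hypothetical.

```python
from statistics import mean

def pathway_scores(expression, pathways):
    """Collapse per-gene expression into one score per pathway.

    expression: dict mapping gene -> expression value for one sample
    pathways:   dict mapping pathway name -> list of member genes
    The score is the plain mean of the measured member genes
    (an assumed scoring rule, not necessarily the one PICS uses).
    """
    scores = {}
    for name, genes in pathways.items():
        values = [expression[g] for g in genes if g in expression]
        if values:  # skip pathways with no measured genes
            scores[name] = mean(values)
    return scores

# Hypothetical sample and pathway definitions, for illustration only.
sample = {"EGFR": 8.0, "KRAS": 6.0, "HK2": 3.0, "PKM": 5.0}
pathways = {"signaling": ["EGFR", "KRAS"], "metabolic": ["HK2", "PKM", "LDHA"]}
scores = pathway_scores(sample, pathways)
print(scores)
```

Each sample is then described by a handful of pathway scores rather than thousands of gene values, which is the representation on which clustering would operate.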
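Presenter: Park Hye-jin (박혜진), Syntekabio (신테카바이오)
Presentation date: January 2018
With recent advances in science and technology, the era of precision medicine, which provides diagnosis and treatment optimized for each individual patient, has arrived. The core of precision medicine lies in obtaining information useful for diagnosis by processing big data such as genomes and analyzing it with AI. In cancer diagnosis and treatment, where precision medicine is most actively applied, drug activity prediction using genomic information at the molecular level (kinase mutations) and the cellular level (cancer cell lines) is expected to help with drug resistance diagnosis and new drug development. A drug activity prediction model that applies deep learning to molecular-level (kinase mutation) data can predict interactions between targeted anticancer drugs and their target proteins with higher accuracy than existing models. In addition, a drug activity prediction model using cell-level (cancer cell line) data can be useful not only for predicting a patient's drug resistance but also for research on new drug development and drug repositioning.
Presented paper: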
https://www.nature.com/articles/s41598-018-27214-6
A micro-array is a tool for analyzing gene expression that consists of a small membrane or glass slide containing samples of many genes arranged in a regular pattern.
I made this during my master's studies. I made a few animations and hope they make the subject easier to understand.
The content was compiled by searching the internet and consulting books. I do not claim any of the content in the presentation as my own except the animations made on the subject.
Mastering RNA-Seq (NGS Data Analysis) - A Critical Approach To Transcriptomic... - Elia Brodsky
This workshop will address critical issues related to Transcriptomics data:
Processing raw Next Generation Sequencing (NGS) data:
1. Next Generation Sequencing data preprocessing:
Trimming technical sequences
Removing PCR duplicates
2. RNA-seq based quantification of expression levels:
Conventional pipelines (looking at known transcripts)
Identification of novel isoforms
Analysis of Expression Data Using Machine Learning:
3. Unsupervised analysis of expression data:
Principal Component Analysis
Clustering
4. Supervised analysis:
Differential expression analysis
Classification, gene signature construction
5. Gene set enrichment analysis
The workshop will include hands-on exercises utilizing public domain datasets:
breast cancer cell lines transcriptomic profiles (https://genomebiology.biomedcentral.com/articles/10.1186/gb-2013-14-10-r110),
patient-derived xenograft (PDX) mouse model of tumor and stroma transcriptomic profiles (http://www.oncotarget.com/index.php?journal=oncotarget&page=article&op=view&path[]=8014&path[]=23533), and
processed data from The Cancer Genome Atlas samples (https://cancergenome.nih.gov/).
Team: The workshops are designed by the researchers at the Tauber Bioinformatics Research Center at University of Haifa, Israel in collaboration with academic centers across the US. Technical support for the workshops is provided by the Pine Biotech team. https://edu.t-bio.info/a-critical-approach-to-transcriptomic-data-analysis/
Cancer recognition from dna microarray gene expression data using averaged on... - IJCI JOURNAL
Cancer is a leading cause of death, responsible for around 13% of all deaths worldwide. Cancer
incidence rate is growing at an alarming rate in the world. Despite the fact that cancer is preventable and
curable in early stages, the vast majority of patients are diagnosed with cancer very late. Therefore, it is of
paramount importance to prevent and detect cancer early. Nonetheless, conventional methods of detecting
and diagnosing cancer rely solely on skilled physicians, with the help of medical imaging, to detect certain
symptoms that usually appear in the late stages of cancer. The microarray gene expression technology is a
promising technology that can detect cancerous cells in early stages of cancer by analyzing gene
expression of tissue samples. The microarray technology allows researchers to examine the expression of
thousands of genes simultaneously. This paper describes a state-of-the-art machine learning based
approach called averaged one-dependence estimators with subsumption resolution to tackle the problem of
recognizing cancer from DNA microarray gene expression data. To lower the computational complexity
and to increase the generalization capability of the system, we employ an entropy-based gene selection
approach to select relevant genes that are directly responsible for cancer discrimination. This proposed
system has achieved an average accuracy of 98.94% in recognizing and classifying cancer over 11
benchmark cancer datasets. The experimental results demonstrate the efficacy of our framework.
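Entropy-based gene selection can be sketched as ranking genes by information gain. The abstract does not give the paper's exact criterion, so the median split used to discretize expression, the function names and the toy data below are all assumptions for illustration.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def information_gain(values, labels):
    """Entropy reduction from splitting samples at the (upper) median value."""
    threshold = sorted(values)[len(values) // 2]
    low = [lab for v, lab in zip(values, labels) if v < threshold]
    high = [lab for v, lab in zip(values, labels) if v >= threshold]
    remainder = sum(
        len(part) / len(labels) * entropy(part) for part in (low, high) if part
    )
    return entropy(labels) - remainder

def select_genes(expression, labels, k):
    """Return the k genes whose split is most informative about the labels."""
    ranked = sorted(
        expression,
        key=lambda gene: information_gain(expression[gene], labels),
        reverse=True,
    )
    return ranked[:k]

# Toy data: gene_a separates the classes, gene_b does not.
labels = ["tumor", "tumor", "normal", "normal"]
expression = {
    "gene_a": [9.1, 8.7, 2.0, 2.4],
    "gene_b": [5.0, 5.2, 5.1, 4.9],
}
print(select_genes(expression, labels, 1))
```

Only the selected genes would then be passed to the classifier, which is how such a filter lowers computational cost.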
Visual Exploration of Clinical and Genomic Data for Patient Stratification - Nils Gehlenborg
Talk presented at the Simons Foundation Biotech Symposium "Complex Data Visualization: Approach and Application" (12 September 2014)
http://www.simonsfoundation.org/event/complex-data-visualization-approach-and-application/
In this talk I describe how we integrated a sophisticated computational framework directly into the StratomeX visualization technique to enable rapid exploration of tens of thousands of stratifications in cancer genomics data, creating a unique and powerful tool for the identification and characterization of tumor subtypes. The tool can handle a wide range of genomic and clinical data types for cohorts with hundreds of patients. StratomeX also provides direct access to comprehensive data sets generated by The Cancer Genome Atlas Firehose analysis pipeline.
http://stratomex.caleydo.org
Applications of Next generation sequencing in Drug Discovery - vjain38
This presentation gives an overview of Next Generation Sequencing (NGS) technology and its current applications in the drug discovery process.
Cardiotoxicity is unfortunately a common side effect of many modern chemotherapeutic agents. The mechanisms that underlie these detrimental effects on heart muscle, however, remain unclear. The Drug Toxicity Signature Generation Center at ISMMS aims to address this unresolved issue by providing a bridge between molecular changes in cells and the prediction of pathophysiological effects. I will discuss ongoing work in which we use next-generation sequencing to quantify changes in gene expression that occur in cardiac myocytes after they are treated with potentially toxic chemotherapeutic agents. I will focus in particular on the computational pipeline we are developing that integrates sophisticated sequence alignment, statistical and network analysis, and dynamical mathematical models to develop novel predictions about the mechanisms underlying drug-induced cardiotoxicity.
Jaehee Shim is a Ph.D. candidate in the Biophysics and Systems Pharmacology Program at Icahn School of Medicine at Mount Sinai (ISMMS). As a part of her Ph.D. studies, she is building dynamical prediction models based on analysis of gene expression data generated by the Drug Toxicity Signature Generation Center at ISMMS. She received her B.S. in Biochemistry from the University of Michigan-Dearborn. Prior to starting her Ph.D., Jaehee worked at the ISMMS Genomics Core with a team of senior scientists and gained experience in improving and troubleshooting RNA sequencing protocols using Next Generation Sequencing platforms.
Interrogating differences in expression of targeted gene sets to predict brea... (Enrique Moreno Gonzalez)
Genomics provides opportunities to develop precise tests for diagnostics, therapy selection and monitoring. From analyses of our studies and those of published results, 32 candidate genes were identified, whose expression appears related to clinical outcome of breast cancer. Expression of these genes was validated by qPCR and correlated with clinical follow-up to identify a gene subset for development of a prognostic test.
Similar to Intelligent Systems for Cancer Genomics (AIS305) - AWS re:Invent 2018 (20)
How to build forecasting services using ML and deep learn... (Amazon Web Services)
Forecasting is an important process for many companies and is used in many areas to accurately predict the growth and distribution of a product, the use of the resources needed on production lines, financial presentations, and much more. Amazon uses advanced forecasting techniques, and some of these services have been made available to all AWS customers.
In this session we show how to pre-process data that contains a time component and then use an algorithm that, based on the type of data analyzed, produces an accurate forecast.
Big Data for Startups: how to build Big Data applications in Server... mode (Amazon Web Services)
The variety and volume of data created every day is growing ever faster and represents a unique opportunity to innovate and create new startups.
However, managing large amounts of data can seem complex: building large-scale Big Data clusters looks like an investment that only established companies can afford. But the elasticity of the Cloud and, in particular, Serverless services let us break through these limits.
Let's see how to develop Big Data applications quickly, without worrying about infrastructure, devoting all our resources to developing our ideas and creating innovative products.
You can now use Amazon Elastic Kubernetes Service (EKS) to run Kubernetes pods on AWS Fargate, the serverless compute engine built for containers on AWS. This makes it easier than ever to build and run your Kubernetes applications in the AWS cloud. In this session we present the main features of the service and how to deploy your application in a few steps.
Twenty years ago Amazon went through a radical transformation aimed at increasing the pace of innovation. Over this period we learned how changing our approach to application development allowed us to greatly increase agility and release velocity and, ultimately, to build more reliable and scalable applications. In this session we explain how we define modern applications and how building modern apps affects not only the application architecture but also the organizational structure, the development release pipelines, and even the operating model. We also describe common approaches to modernization, including the approach used by Amazon.com itself.
How to spend up to 90% less with containers and Spot Instances (Amazon Web Services)
Container usage keeps growing.
When designed correctly, container-based applications are very often stateless and flexible.
The AWS services ECS, EKS, and Kubernetes on EC2 can take advantage of Spot Instances, for an average saving of 70% compared with On-Demand Instances. In this session we look at the characteristics of Spot Instances and how easily they can be used on AWS. We also learn how Spreaker uses Spot Instances to run different kinds of applications, in production, at a fraction of the on-demand cost!
In recent months, many customers have been asking us the question – how to monetise Open APIs, simplify Fintech integrations and accelerate adoption of various Open Banking business models. Therefore, AWS and FinConecta would like to invite you to Open Finance marketplace presentation on October 20th.
Event Agenda :
Open banking so far (short recap)
• PSD2, OB UK, OB Australia, OB LATAM, OB Israel
Intro to Open Finance marketplace
• Scope
• Features
• Tech overview and Demo
The role of the Cloud
The Future of APIs
• Complying with regulation
• Monetizing data / APIs
• Business models
• Time to market
One platform for all: a Strategic approach
Q&A
Make your startup's offering stand out in the market with the Machine Lea... services (Amazon Web Services)
To create value and build a distinctive, recognizable offering, successful startups know how to combine established technologies with innovative, purpose-built components.
AWS provides ready-to-use services and, at the same time, lets you customize and create the differentiating elements of your offering.
Focusing on Machine Learning technologies, we look at how to select the artificial intelligence services offered by AWS and, with the help of a demo, how to build custom Machine Learning models using SageMaker Studio.
OpsWorks Configuration Management: automate the management and deployments of... (Amazon Web Services)
With the traditional approach to IT, it was difficult for many years to implement DevOps techniques, which until now have often involved manual activities that occasionally caused application downtime and interrupted users' work. With the advent of the cloud, DevOps techniques are now within everyone's reach at low cost for any kind of workload, ensuring greater system reliability and resulting in significant improvements in business continuity.
AWS offers AWS OpsWorks as a Configuration Management tool that aims to automate and simplify the management and deployment of EC2 instances by means of Chef and Puppet workloads.
Learn how to use AWS OpsWorks to ensure the reliability of your application installed on EC2 instances.
Microsoft Active Directory on AWS to support your Windows Workloads (Amazon Web Services)
Want to know the options for running Microsoft Active Directory on AWS? When moving Microsoft workloads to AWS, it is important to consider how to deploy Microsoft Active Directory to support Group Policy management, authentication, and authorization. In this session, we discuss the options for deploying Microsoft Active Directory on AWS, including AWS Directory Service for Microsoft Active Directory and deploying Active Directory on Windows on Amazon Elastic Compute Cloud (Amazon EC2). We cover topics such as integrating your on-premises Microsoft Active Directory environment with the cloud and using SaaS applications, such as Office 365, with AWS Single Sign-On.
From facial recognition to the detection of fraud or manufacturing defects, image and video analysis based on artificial intelligence techniques is evolving and being refined at a rapid pace. In this webinar we explore the possibilities offered by AWS services for applying state-of-the-art computer vision techniques to real-world scenarios.
Amazon Web Services and VMware are hosting a free virtual event next Wednesday, October 14, from 12:00 to 13:00, dedicated to VMware Cloud ™ on AWS, the on-demand service that lets you run applications in cloud environments based on VMware vSphere® and access a wide range of AWS services, making full use of the AWS cloud while protecting existing VMware investments.
Many organizations take advantage of the cloud by migrating their Oracle workloads, gaining significant benefits in agility and cost efficiency.
Migrating these workloads can create complexity when modernizing and refactoring applications, on top of the performance risks that can be introduced when applications are moved out of on-premises data centers.
Build your first serverless ledger-based app with QLDB and NodeJS (Amazon Web Services)
Many companies today build applications with ledger-type functionality, for example to verify the history of credits and debits in banking transactions or to track the supply chain flow of their products.
These solutions are built on ledger databases, which provide a transparent, immutable, and cryptographically verifiable transaction log, but they are complex and costly tools to manage.
Amazon QLDB eliminates the need to build complex custom systems by providing a fully managed serverless ledger database.
In this session we look at how to build a complete serverless application that uses the features of QLDB.
With the rise of microservice architectures and rich mobile and web applications, APIs are more important than ever for giving end users an outstanding user experience. In this session we learn how to tackle modern API design challenges with GraphQL, an open source API query language used by Facebook, Amazon, and others, and how to use AWS AppSync, a managed serverless GraphQL service on AWS. We dig into several scenarios to understand how AppSync can help address these use cases by creating modern APIs with real-time and offline data update capabilities.
We also learn how Sky Italia uses AWS AppSync to deliver real-time sports updates to users of its web portal.
Oracle Database and VMware Cloud™ on AWS: myths to debunk (Amazon Web Services)
Many organizations take advantage of the cloud by migrating their Oracle workloads, gaining significant benefits in agility and cost efficiency.
Migrating these workloads can create complexity when modernizing and refactoring applications, on top of the performance risks that can be introduced when applications are moved out of on-premises data centers.
In these slides, AWS and VMware experts present simple, practical tips for easing and simplifying the migration of Oracle workloads and accelerating the move to the cloud; they take a closer look at the architecture and show how to make full use of VMware Cloud ™ on AWS.
Amazon Elastic Container Service (Amazon ECS) is a highly scalable container management service that simplifies the management of Docker containers through an orchestration layer that controls deployment and the container lifecycle. In this session we present the main features of the service, the reference architectures for different workloads, and the simple steps needed to quickly migrate one or more of your containers.
5. Cancer Genome Landscapes
32 cancer subtypes
11,315 patient samples
• 22 cancer subtypes
• 20,487 patient samples
• Many mutations per cancer genome
• Only a few mutations within an individual “drive” the cancer
Mutations per 1 million DNA bases
6. Cancer Genome Landscapes
❶ Discover “causal” cancer driver mutations and genes
❷ Predict drug response
8. Uncovering Cancer Genes In The Context of Other Information
• Population Genomic Data: 8,608,691 varying sites, 60,706 individuals
• Probabilistic Sequence Patterns: 16,230 “domains” cover 88% of human proteins (figure labels: WW, bZIP, Bromo, ZF)
• Protein Structures: >127,000 PDB structures
• Biological Networks: ~300,000 interactions
9. Proteins Function Through Interactions
protein 1, protein 2, protein 3, ..., protein 20K
Interaction types: protein−RNA, protein−ion, protein−DNA, protein−protein, protein−small molecule
13. 22,712 total genes in human
• 13,923 (61%): computationally inferred interaction site info
• 2,871 (13%): genes w/ structural knowledge of any interaction sites
MISILRRGLLVLLAAFPLLALAVQTPHEVVQSTTNELLGDLKANKE
Partial, per-position 0 to 1 interaction potential
14. Uncovering Significantly Mutated Binding Sites
[Figure: protein from N- to C-terminus with per-position binding potentials (0 to 1), regions with modeled interactions and with no known interactions, and somatic mutations]
• Xi: sum of binding potentials where mutations land
• Analytically compute mean and variance of Xi
• Z-score: (Xi − E[Xi]) / sqrt(Var[Xi])
• ~7X speedup per shuffle
• Typically >1,000 shuffles
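The Z-score step on this slide can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption that the slides do not state: each of the m mutations is taken to land independently and uniformly at random along the protein, so the null mean and variance of Xi have a closed form and no shuffling is needed. The function name `binding_site_zscore` and the toy inputs are hypothetical; the null model actually used by PertInInt may differ.

```python
import math


def binding_site_zscore(potentials, mutated_positions):
    """Z-score of the summed binding potential at mutated positions.

    potentials: per-position binding potentials, each between 0 and 1.
    mutated_positions: 0-based positions hit by somatic mutations
        (repeat a position if it is mutated more than once).

    Assumed null model (not from the slides): every mutation lands
    independently and uniformly at random on the protein.
    """
    n = len(potentials)
    m = len(mutated_positions)
    # Observed statistic: sum of binding potentials where mutations land.
    observed = sum(potentials[p] for p in mutated_positions)
    # Mean and variance of the potential at one uniformly random position.
    mu = sum(potentials) / n
    var = sum((w - mu) ** 2 for w in potentials) / n
    # A sum of m independent draws has mean m * mu and variance m * var.
    if m == 0 or var == 0:
        return 0.0
    return (observed - m * mu) / math.sqrt(m * var)


# Toy protein: the last two positions form a binding site and both are mutated.
print(binding_site_zscore([0.0, 0.0, 1.0, 1.0], [2, 3]))
```

A large positive score means the mutations fall on binding positions more often than uniform placement would predict. Computing the moments in closed form is what removes the need for repeated shuffles in this sketch.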
17. PertInInt Identifies Cancer-Relevant Genes
[Figure: enrichment of genes in the Cancer Gene Census (y-axis, 0 to 30) versus gene rank (x-axis, 1 to 200) for Frequency based, Conservation, Domain, Interaction, and All]
~10 minutes to process 10,000+ tumor samples (2.4-2.7 GHz processor, <4 GB RAM)
18. PertInInt In Summary
• Perturbed interactions predictive for cancer genes
• Integrative framework identifies cancer-relevant genes
– Novel and distinct mutational avenues for driver genes
• Alternate way to prioritize mutations in an individual’s cancer
20. Tumor Growth in the Presence of Drug “X”
Illustration obtained from Verschoor et al. 2013
Compound is more effective
Compound is less effective
21. Drug Effectiveness Varies Across Tumor Cells
Source: Genomics of Drug Sensitivity in Cancer (GDSC)
Activity of 250 compounds on 960 cell lines
~160K drug-cell pairings
23. Our Data
Activity of drugs on diverse cell lines
Gene expression measurements on untreated cells
Chemical structure of drugs
Goal: Predict activity of drugs on a tumor using gene expression profiles and drug features
New tumors, new drugs!
24. Genomics Data Has Modular Structure
Illustration obtained from https://rgd.mcw.edu/rgdweb/pathway/pathwayRecord.html?acc_id=PW:0000
Can we use this modular information to aid in our predictions?
25. Solution
Use modular knowledge of cellular function
Starting feature space: 960 cell lines x 20K features
(Resistant/Sensitive)
26. Approach: Autoencoders
Neural network approach to obtain a reduced feature space using a guided modular genomics approach.
Use gene set autoencoded features for prediction
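As a rough illustration of gene-set-guided feature reduction, the sketch below trains one very small autoencoder per gene set and uses its bottleneck value as that gene set's feature. Everything specific here is an assumption made for the example, not a detail from the slides: the autoencoder is linear with tied weights and a single bottleneck unit, it is trained by plain full-batch gradient descent, and the gene names, gene sets, and expression values are toy data.

```python
def _dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))


def train_tied_autoencoder(samples, lr=0.01, epochs=500):
    """Fit code z = w.x and reconstruction z * w by minimizing squared error."""
    dim = len(samples[0])
    w = [0.1] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for x in samples:
            z = _dot(w, x)                              # encode
            r = [xi - wi * z for xi, wi in zip(x, w)]   # residual x - decode(z)
            wr = _dot(w, r)
            for j in range(dim):
                # Gradient of |x - w (w.x)|^2 with respect to w[j].
                grad[j] -= 2.0 * (z * r[j] + wr * x[j]) / len(samples)
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w


def reconstruction_error(samples, w):
    """Mean squared reconstruction error over the samples."""
    total = 0.0
    for x in samples:
        z = _dot(w, x)
        total += sum((xi - wi * z) ** 2 for xi, wi in zip(x, w))
    return total / len(samples)


def gene_set_features(expression, gene_sets):
    """Return one autoencoded feature per gene set for every cell line.

    expression: list of {gene: value} dicts, one per cell line.
    gene_sets: {set_name: [gene, ...]} (hypothetical pathway membership).
    """
    columns = []
    for genes in gene_sets.values():
        samples = [[cell[g] for g in genes] for cell in expression]
        w = train_tied_autoencoder(samples)
        columns.append([_dot(w, x) for x in samples])
    return [list(row) for row in zip(*columns)]


# Toy data: four cell lines, five genes grouped into two gene sets.
expression = [
    {"A": t, "B": 2 * t, "C": 3 * t, "D": 2 * u, "E": u}
    for t, u in [(-1.0, 1.0), (-0.5, -1.0), (0.5, 0.5), (1.0, -0.5)]
]
gene_sets = {"set_1": ["A", "B", "C"], "set_2": ["D", "E"]}
features = gene_set_features(expression, gene_sets)
print(len(features), "cell lines x", len(features[0]), "features")
```

Each cell line ends up with one value per gene set instead of one per gene, which is the kind of reduction the slide describes. A practical model would likely use nonlinear layers, wider bottlenecks, and a standard deep learning library.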
27. Merging Multiple Genomic Sources
Mutation within known cancer genes (CGCs)
Reduced set of gene expression values
28. The Other Half: Structure of Drugs
Features: 2D structural descriptors (chemical subgroups) and physical features (e.g., size, charge)
Starting space: 250 drug compounds
30. Physical Features
1444 PaDEL physicochemical features from SMILES strings.
Molecular free energy, volume, topology.
Apply autoencoders to reduce feature space (90 features)
35. Summary
• Biologically-guided deep net approach to predict response to drugs
• By training model across drugs and tumors, can make predictions for new drugs & tumors
• Ultimate goals:
– Personalized oncology
– In silico drug development