CytoGPS (CytoGenetic Pattern Sleuth)
Arka Pattanayak
Zachary Abrams
Informatics Research & Development,
Dept. of Biomedical Informatics at The Ohio State
University
08/28/2013
MOTIVATION : DATA
• Complex chromosomal aberration data: structure and knowledge.
• Inherently descriptive grammar: International System for human Cytogenetic Nomenclature (ISCN).
SOLUTION : CytoGPS
• Parse karyotypes using Context-Free Grammar (CFG) rules.
• Extract morphological phrases.
• Map phrases to abstract biological meta-model.
APPLICATIONS : MULTIPLE
• Discovery of important, obfuscated patterns in cytogenetic data.
• Targeted treatment.
• In-silico drug studies.
CytoGPS: 3-month status report
MOTIVATION
Existing cytogenetic data:
• Structured.
• ISCN-conformant.
• Multi-dimensional.
Minimal exploitation due to its informational complexity:
• Syntactic variability.
• Information density.
• Human error.
The Rulebook
CytoGPS: CytoGenetic Pattern Sleuth
CytoGPS Platform
• Smart Parser of Karyotypes: EBNF Grammar Rules; Parser Generators and Parse Tree Visitors.
• Biologically Abstracted Meta-Model: Mapper DSL.
• Genetic Pattern Matching: ML Algorithms; Phenotype-phenotype Matching.
Parsing Complex Karyotypes using SPoK (Smart Parser of Karyotypes)
SPoK (Smart Parser of Karyotypes)
• Enables in-silico analyses of complex karyotypes.
• Based on well-studied fundamentals in computational parsing (CFG, EBNF).
• Disease-agnostic.
• Multi-disciplinary effort: Biomedical Informatics, Cytogenetics, Hematology.
• ~76% of 3000 publicly available ISCN 2009 karyotypes were successfully parsed with this method.
SPoK: Context-Free Grammar Rules
SPoK: Parser Generation
Deterministic Parser
ANTLR
SPoK: A Parse Tree Showing the Morphological Deconstruction of a Complex Karyotype
46,XY,del(17)(p12),t(12;15)(p13;q20)
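The parse tree itself did not survive extraction, so here is a minimal sketch of the kind of decomposition it shows for the karyotype above. It is a regular-expression stand-in, not the EBNF grammar or ANTLR-generated parser the deck describes; the function name `parse_karyotype`, the dictionary keys, and the restriction to aberrations of the form `kind(chromosomes)(bands)` are assumptions made for illustration.

```python
import re

# Hypothetical pattern: an aberration written as kind(chromosomes)(bands),
# e.g. del(17)(p12) or t(12;15)(p13;q20). Real ISCN is far richer than this.
ABERRATION = re.compile(
    r"^(?P<kind>[a-z]+)\((?P<chroms>[^()]+)\)\((?P<bands>[^()]+)\)$"
)


def parse_karyotype(text):
    """Split a simple ISCN karyotype into count, sex chromosomes, aberrations."""
    count, sex, *rest = text.split(",")
    aberrations = []
    for item in rest:
        match = ABERRATION.match(item)
        if match is None:
            raise ValueError(f"unsupported aberration: {item!r}")
        chroms = match.group("chroms").split(";")
        bands = match.group("bands").split(";")
        if len(chroms) != len(bands):
            raise ValueError(f"chromosome/band mismatch in {item!r}")
        # Pair each chromosome with its band, e.g. "17" + "p12" -> "17p12".
        breakpoints = [c + b for c, b in zip(chroms, bands)]
        aberrations.append({"kind": match.group("kind"),
                            "breakpoints": breakpoints})
    return {
        "chromosome_count": int(count),
        "sex_chromosomes": sex,
        "aberrations": aberrations,
    }


parsed = parse_karyotype("46,XY,del(17)(p12),t(12;15)(p13;q20)")
```

A grammar-driven parser would reject or recover from malformed input in a principled way; this sketch simply raises `ValueError` on anything outside the assumed shape.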
Functional Abstraction using LossGainFusion (Biologically Abstracted Meta-Model)
LGF (Biologically Abstracted Meta-Model)
• Abstraction of ISCN aberrations observed in chromosomal bands to their biologically functional outcomes.
• Using a custom Domain-Specific Language (DSL).
• Karyotype complexity-agnostic.
• Human-readable karyotypes to machine-readable construct.
• ~90% of parsed karyotypes were successfully mapped using this model.
LGF: Understanding Oncogenic Effects with an Abstracted Meta-Model
LGF: Morphological Decomposition of Karyotypes
1. Complete karyotype: 46,XY,del(17)(p12),t(12;15)(p13;q20)
2. Chromosomal aberrations: del(17)(p12) and t(12;15)(p13;q20)
3. ID and chromosomal locations: del 17p12 and t 12p13 15q20
Resulting codes: del1:L and t2:F,F
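The codes on this slide (`del1:L`, `t2:F,F`) appear to follow the pattern aberration ID, number of locations, then one outcome letter per location. A minimal sketch under that reading follows; interpreting `L` as Loss and `F` as Fusion is an inference from the name LossGainFusion, and the table `OUTCOME_BY_KIND` and function `to_lgf` are hypothetical names that cover only the two cases shown here.

```python
# Hypothetical lookup covering only the two aberration types on the slide.
# "L" is read as Loss and "F" as Fusion; both readings are assumptions.
OUTCOME_BY_KIND = {"del": "L", "t": "F"}


def to_lgf(kind, breakpoints):
    """Encode an aberration as <kind><n locations>:<outcome per location>."""
    outcome = OUTCOME_BY_KIND[kind]
    outcomes = ",".join(outcome for _ in breakpoints)
    return f"{kind}{len(breakpoints)}:{outcomes}"


deletion_code = to_lgf("del", ["17p12"])
translocation_code = to_lgf("t", ["12p13", "15q20"])
```

Derivative chromosomes, as in the more complex example, need per-location outcomes that differ from one another, which a single lookup per aberration type cannot express.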
LGF: Morphological Decomposition of Karyotypes (more complex example)
der(4)t(4;13)(p14;p18)
der(4)   t(4;13)(p14;p18)
t 4p14 13p18   der 4
A B C D E
A+C=F
B, D, E add up to 3
F3:B,D,E
We don't need B, so we don't put an annotation at that location. We need to put the biological response for D and E in their respective locations.
F3:,FL,FG
LGF: Domain-Specific Language for Mapping ISCN Aberrations to the Meta-Model.
Genetic Pattern Matching
GPM: Genetic Pattern Matching using C.A.R.T. (Classification And Regression Tree) Algorithm
Features (band locations) on X-axis, karyotypes on Y-axis:
1p36.3  1p36.2  1p36.1  …  1q44  2p25  …  yp12
1       1       0       0  0     0     0  0
1       1       0       0  0     0     0  0
1       1       0       0  0     0     0  0
0       0       1       0  0     0     0  0
0       0       1       0  0     0     0  0
0       0       1       0  0     0     0  0
0       0       1       0  0     0     0  0
0       0       0       0  1     1     0  0
0       0       0       0  1     1     0  0
First cut
Second cut
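To make the matrix and the idea of a "cut" concrete, here is a minimal sketch that builds the same binary band-by-karyotype matrix and picks the single feature whose split gives the lowest weighted Gini impurity, which is the criterion CART uses for classification. The class labels are hypothetical (the slide shows none), the function names are illustrative, and the result is not claimed to match the cuts drawn on the slide.

```python
from collections import Counter


def build_matrix(karyotype_bands, features):
    """One row per karyotype, one 0/1 column per band location."""
    return [[1 if band in bands else 0 for band in features]
            for bands in karyotype_bands]


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(matrix, labels):
    """Return (column index, weighted impurity) of the best binary split."""
    best = None
    for j in range(len(matrix[0])):
        left = [y for row, y in zip(matrix, labels) if row[j] == 0]
        right = [y for row, y in zip(matrix, labels) if row[j] == 1]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if best is None or score < best[1]:
            best = (j, score)
    return best


# Band columns named on the slide (elided columns are left out).
features = ["1p36.3", "1p36.2", "1p36.1", "1q44", "2p25"]
# Affected bands per karyotype, following the nine rows of the slide matrix.
karyotype_bands = (
    [{"1p36.3", "1p36.2"}] * 3 + [{"1p36.1"}] * 4 + [{"1q44", "2p25"}] * 2
)
# Hypothetical class labels, assumed only so that a split can be scored.
labels = ["A"] * 3 + ["B"] * 4 + ["C"] * 2

matrix = build_matrix(karyotype_bands, features)
first_cut = best_split(matrix, labels)
```

Applying `best_split` again to each side of the first cut would grow the tree one level deeper, which is how a second cut arises.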
Applied Biomedical Informatics using CytoGPS: A Case Study
Case Study: In-silico Drug Studies
• Raw ISCN Karyotypes.
• Parse.
• Machine-readable construct.
• Map ISCN aberration to gene-set.
• Map gene-set to known chemical reagent databases.
• An end-to-end in-silico solution for drug studies.
• Significant cost savings.
• Rapid.
• Flatter learning curve to operate such a system compared to wet-lab testing.
Case Study: Map ISCN Aberration to Gene-Set
• Ensembl @see: http://beta.rest.ensembl.org/documentation/info/feature_region
• RESTful web service endpoint
• Speaks JSON
• RESTful request looks like this: http://beta.rest.ensembl.org/feature/region/human/17:15700000-16000000?feature=gene;content-type=application/json
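A minimal sketch of how such a request URL could be assembled from a chromosome and a coordinate range. The host and path are taken from the slide as written and may no longer be live; the function name `ensembl_region_url` and its parameters are assumptions for illustration, and the sketch builds the URL only, without sending a request.

```python
def ensembl_region_url(chromosome, start, end, species="human",
                       base="http://beta.rest.ensembl.org"):
    """Build the gene-feature request URL in the form shown on the slide."""
    region = f"{chromosome}:{start}-{end}"
    return (f"{base}/feature/region/{species}/{region}"
            "?feature=gene;content-type=application/json")


# Coordinates copied from the slide's example request.
url = ensembl_region_url("17", 15700000, 16000000)
```

Translating a band such as 17p12 into base-pair coordinates needs a separate band-to-coordinate lookup, which this sketch does not attempt.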
Case Study: Extracting Genetic Information
Zachary Abrams
Lori Dalton, PhD
Philip R. O. Payne, PhD
Arka Pattanayak
Raj Muthusamy, PhD
Nyla Heerema, PhD
William Kenworthy
Sarah Yousef
Alex Mysiw
Yuxiang Kou
Michael Berkovich


Editor's Notes

  1. ISCN: International System for Human Cytogenetic Nomenclature - Defines grammatical rules for encoding observed chromosomal aberrations.
  2. UNHANDLED CASE: The vast majority of karyotypes that were not parsed contained a '?', so no location information was available. The parse rate is ~95% when no '?' is encountered.
  3. EBNF CFGs.
  4. Talk briefly about mapping failures and provide possible solutions.