1
ELIS – Multimedia Lab
Big Data Guy…
the story so far
2
I. CV
   I. Electromagnetics for fusion
   II. Bioinformatics: iADHoRe & BLSSpeller
   III. Hadoop@Telenet
II. @MMLab
III. BLSSpeller:
   I. Motivation
      I. Genetics
      II. Motif Discovery
   II. Algorithm
   III. Validation & Future work
Outline
3
CV
4
EM Wave solver for Nuclear Fusion
• Solving Maxwell’s equations in highly inhomogeneous anisotropic plasma
5
HPC toolbox:
Integral Equations
MoM
FFT
FMM
CGsolver
FEM
FDTD
Cold plasma solution
6
Collinear regions in GHMs
Improved sensitivity by using aligned gene segment profiles
iADHoRe 3.0
7
Visualization suite
8
C++
MPI
Pthreads
Alignment
Clustering
Ensembl Dataset
9
Hox
10
• Parallel algorithm to uncover sequence motifs in DNA sequences of closely related species
[Figure labels: +, OR]
Current Research: BLSSpeller
11
@Telenet: Data Warehousing in a Hadoop production environment
12
@MMLAB
13
• Status file:///C:/Users/ddewitte/Dropbox/BDS_share/index.html
Teaching: BDS course
14
[Figure labels: SPARQL, Public medical data, Aggregates]
• Ontoforce: Federated query engine for biomedical datasets:
• Virtuoso is currently the bottleneck
• Federated Querying
• Alternative architectures
• Some datasets >> Virtuoso
Project work
15
You …
Are …
The …
Greatest …
=> more powerpoints
MMLab Sales…
16
BLSSPELLER: MOTIVATION
Exhaustive comparative discovery of conserved cis-regulatory elements
17
• DNA stores the information to build proteins…
• Proteins are generated in a two-step process: transcription & translation
• Update: there should be more arrows!
One slide on genetics
18
• RNA polymerase: DNA to RNA
• Recruited by transcription factors which bind to gene promoter => where are the sites?
Transcription factor binding sites
19
• Original approach:
  - Search for motifs in the promoter sequences of sets of coregulated genes
  - But: Coexpression IS NOT EQUAL TO Coregulation
• Phylogenetic approach
  - Compare promoter sequences of genes which are homologous between similar species
  - Approach: use genome alignments
Which data!?
20
Which binding site model?
MM
IUPAC
PWM
21
Word-based Exact Algorithms
Nondeterministic Algorithms: EM, Gibbs Sampling, …
Algorithm?
22
Alignment comes with a sacrifice
23
BLSSPELLER ALGORITHM
24
• Comparative motif finding
• Exhaustive algorithm => no heuristics
• Word-based motif model
• Alignment-free (Alignment-based as a bonus)
How are we different?
25
Preprocessing: gene families
26
Local motif conservation
27
Every candidate gets a BLS score
28
Depth-first exhaustive motif enumeration for 15-character IUPAC alphabet
GST can be used for Branch & Bound purposes: Search space reduction
Motif discovery with GSTs
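The depth-first enumeration with branch & bound can be sketched as follows. This is a minimal illustration, not BLSSpeller's implementation: it assumes plain per-sequence position lists instead of a GST, and it bounds on the number of sequences containing the motif rather than on a BLS score. The function name `enumerate_motifs` and the parameter `min_support` are hypothetical.

```python
# Standard IUPAC nucleotide codes: 4 bases plus 11 degenerate symbols (15 in total).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def enumerate_motifs(sequences, length, min_support):
    """Depth-first enumeration of IUPAC motifs of the given length that
    occur in at least min_support of the sequences (hypothetical helper)."""
    results = []

    def extend(prefix, positions):
        # positions[i] holds the start indices in sequences[i] where prefix matches.
        if len(prefix) == length:
            results.append(prefix)
            return
        k = len(prefix)
        for symbol, bases in IUPAC.items():
            narrowed = [
                [s for s in starts if s + k < len(seq) and seq[s + k] in bases]
                for seq, starts in zip(sequences, positions)
            ]
            support = sum(1 for starts in narrowed if starts)
            # Bound: extending a motif can never increase its support,
            # so a branch below the threshold is cut off here.
            if support >= min_support:
                extend(prefix + symbol, narrowed)

    extend("", [list(range(len(seq))) for seq in sequences])
    return results
```

For two toy sequences "ACG" and "ACT" with `min_support=2`, the degenerate motif "ACK" (K = G or T) is reported, while "ACG" is pruned because it occurs in only one sequence.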
29
Genome wide conservation
30
Motif vs Background => Confidence
31
BLSSPELLER VALIDATION
32
• http://bioinformatics.intec.ugent.be/blsspeller/AFABGlobal.html
Comparing AF vs AB
33
• http://bioinformatics.intec.ugent.be/blsspeller/AFABHistograms.html
Validation: FDR
34
Retrieving a known maize regulator
35
Status?
36
• Extending BLSSpeller: treat it as a reverse index!
• Query engine
• OCR data
• Frontend
• Clustering & Visualizations
• Alternative paths (decide after Speller review)
• Scaling out SPARQL / Graph algorithms (Ontoforce)
• Stream analytics and querying (IoT)
• Medical images (Wesley)
New Research
37
Questions!?

ResearchMeeting15April2015

Editor's Notes

  1. Master's thesis: solving Maxwell’s equations in a fusion reactor. An RF antenna is used to pump energy into the plasma, a fully ionized gas that needs 10^7 K to ignite and produce energy (like a fire). Clean energy: D, T in => He + n out.
  2. Scattering on a cold plasma cylinder: this validated the algorithm, since an analytical solution could be derived for this case.
  3. Followed Jan Fostier to IBCN: iterative Alignment-based Detection of Homologous Regions. GHM (gene homology matrix): compares two chromosomes, which may also come from the same species or even be the same chromosome. Collinear regions = gene order and content conserved. Bottom-up clustering algorithm using a nonuniform distance (shortest along diagonals). Alignments are used as new ‘chromosomes’ for greater sensitivity (right panel).
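As a rough illustration of the GHM idea, the sketch below marks homologous gene pairs between two chromosomes and extracts strictly diagonal runs. It is a deliberate simplification and not the iADHoRe algorithm: it allows no gaps, ignores inverted (anti-diagonal) segments, and replaces the clustering step with a simple scan. The names `homology_points`, `diagonal_runs`, `family` and `min_len` are hypothetical.

```python
def homology_points(chrom_a, chrom_b, family):
    """Cells (i, j) of the gene homology matrix where gene i of chrom_a
    and gene j of chrom_b belong to the same gene family."""
    return {
        (i, j)
        for i, gene_a in enumerate(chrom_a)
        for j, gene_b in enumerate(chrom_b)
        if family[gene_a] == family[gene_b]
    }

def diagonal_runs(points, min_len=3):
    """Maximal runs (i, j), (i+1, j+1), ... containing at least min_len cells."""
    runs = []
    for (i, j) in sorted(points):
        if (i - 1, j - 1) in points:
            continue  # this cell continues a run that started earlier
        run = [(i, j)]
        while (run[-1][0] + 1, run[-1][1] + 1) in points:
            run.append((run[-1][0] + 1, run[-1][1] + 1))
        if len(run) >= min_len:
            runs.append(run)
    return runs
```

Each returned run is a candidate collinear region: a stretch where gene order and content agree between the two chromosomes.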
  4. Visualisation suite, mainly used to understand the algorithm (green boxes: p-value; red boxes: bad p-value; dots; blue: confidence interval for the regression line).
  5. “I just want to make things fast” (Joachim VH). Bad news: we were able to process the full Ensembl dataset of genomes in approximately 8 hours on a single node => no need for 4.0.
  6. Why compare 50 genomes? We managed to find a multiplicon spanning all species: the HOX cluster (= gold standard), known to relate to the body plan of an organism.
  7. After that, on my own: started new research based on a book I read about index structures. Goal: find DNA motifs with an exhaustive algorithm, which forced us to look at parallelisation and big computers.
  8. Left in March: Telenet, data retention project => identify people at the request of the police.
  9. Met Erik on a train; he convinced me to resign from my well-paid job.
  10. Big Data course: Visualisation, Big Data management (NoSQL, Hadoop, architectures, semantics, Spark, IoT streaming), Analytics (deep learning, scalable algorithms).
  11. Project with Ontoforce: design of a federated query engine. Status: fixing the federated engine of MMLab. Open question: federated querying connected to DISQOVER is not feasible. Many options remain possible, since Virtuoso is only used as a data adapter.
  12. I am often not here, and I often wonder what I am actually doing, but I think you should call it sales. We already have strong connections with IoT and Data Science; on the right are a number of companies we introduced ourselves to (companies that want to do, or are already doing, big data).
  13. More arrows: there are many different types of RNA, often an end product; proteins are in a dynamic equilibrium; proteins bind back to DNA to trigger the creation of new proteins. If one of the arrows breaks: cancer. Protein networks are often drawn = pathway analysis.
  14. Transcription: a molecular machine, recruited by transcription factors (= proteins). Where are the sites? In vitro versus in vivo is not the same!! (ChIP-seq)
  15. At first: compare promoters of coregulated genes. Next: compare promoters of genes which are related (or, mostly, use WGA, whole-genome alignments).
  16. The IUPAC motif model is the most exact motif model.
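To make the IUPAC model concrete, here is a minimal sketch that expands a degenerate motif into a regular expression and lists its occurrences in a sequence. The function names `iupac_to_regex` and `find_sites` and the example sequences are illustrative assumptions, not part of BLSSpeller; the code table is the standard IUPAC nucleotide alphabet.

```python
import re

# Standard IUPAC nucleotide codes: 4 bases plus 11 degenerate symbols (15 in total).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def iupac_to_regex(motif):
    """Translate an IUPAC motif into a regex built from character classes."""
    return "".join("[" + IUPAC[symbol] + "]" for symbol in motif.upper())

def find_sites(motif, sequence):
    """Return the start positions of all matches, overlapping ones included."""
    # A zero-width lookahead lets finditer report overlapping occurrences.
    pattern = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [match.start() for match in pattern.finditer(sequence.upper())]
```

For example, `find_sites("CAYGTG", "CACGTGCATGTG")` matches both CACGTG and CATGTG, because Y stands for C or T.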
  17. Word-based, exhaustive, … algorithms (based on MM) are better than randomized algorithms, but overall most of them are not very good.
  18. Current approaches work mainly with alignments, but these biologically identified binding sites are misaligned!!
  19. Set of 17724 gene families for which we have the promoter sequences (each time >=1 gene from each species).
  20. Alignment-free! Score phylogenetic conservation with the branch length score (BLS).
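A small sketch of a conservation score of this kind, assuming BLS is the branch length score as commonly defined: the summed branch length of the smallest subtree connecting the species in which a motif occurs, divided by the total branch length of the tree. The toy tree, its species names and its branch lengths are made up for illustration, and `branch_length_score` is a hypothetical helper.

```python
# Toy phylogeny as child -> (parent, branch length). All values are made up.
TREE = {
    "A": ("AB", 0.1),
    "B": ("AB", 0.1),
    "AB": ("root", 0.2),
    "C": ("root", 0.4),
    "D": ("root", 0.2),
}

def leaves_below(tree, node):
    """Set of leaf names in the subtree rooted at node."""
    children = [child for child, (parent, _) in tree.items() if parent == node]
    if not children:
        return {node}
    result = set()
    for child in children:
        result |= leaves_below(tree, child)
    return result

def branch_length_score(tree, species):
    """Fraction of the total branch length spanned by the given species."""
    species = set(species)
    total = sum(length for _, length in tree.values())
    spanned = 0.0
    for child, (_, length) in tree.items():
        below = leaves_below(tree, child) & species
        # A branch is part of the connecting subtree only if selected
        # species lie on both sides of it.
        if below and below != species:
            spanned += length
    return spanned / total
```

With this tree, a motif found only in the sister species A and B scores 0.2, while one found in A and C scores 0.7: conservation across more distant species counts for more.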
  21. Internal nodes hold bit vectors that record in which sequences a prefix occurs.
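The bit-vector idea can be sketched with an uncompressed, depth-limited suffix trie. This is a simplification of a true generalized suffix tree, and the names `Node`, `build_trie`, `occurrence_bits` and `max_len` are hypothetical. Each node stores an integer whose bit i is set when the path from the root to that node occurs in sequence i.

```python
class Node:
    """Trie node: outgoing edges plus a bit vector of sequence membership."""
    def __init__(self):
        self.children = {}
        self.seqs = 0  # bit i is set if this prefix occurs in sequence i

def build_trie(sequences, max_len):
    """Insert every suffix of every sequence, truncated to max_len characters."""
    root = Node()
    for index, seq in enumerate(sequences):
        for start in range(len(seq)):
            node = root
            for ch in seq[start:start + max_len]:
                node = node.children.setdefault(ch, Node())
                node.seqs |= 1 << index
    return root

def occurrence_bits(root, word):
    """Bit vector of the sequences containing word (0 if it occurs nowhere)."""
    node = root
    for ch in word:
        node = node.children.get(ch)
        if node is None:
            return 0
    return node.seqs
```

Because the bit vector of a node can only lose bits as the word grows longer, a search can stop descending as soon as the remaining set of sequences is too small to be interesting.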
  22. Genome-wide: take a given motif across all gene families => aggregate. In how many families does the motif occur at a certain BLS threshold?
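The aggregation step amounts to a threshold count per motif. A minimal sketch, assuming the per-family BLS scores of one motif are already available as a dictionary; the function name `families_at_threshold`, the family identifiers and the scores are made up for illustration.

```python
def families_at_threshold(bls_per_family, thresholds):
    """For each threshold, count the gene families in which the motif
    reaches at least that BLS score."""
    return {
        threshold: sum(1 for score in bls_per_family.values() if score >= threshold)
        for threshold in thresholds
    }

# Hypothetical scores of one motif in four gene families.
example_scores = {"fam1": 0.95, "fam2": 0.40, "fam3": 0.72, "fam4": 0.10}
counts = families_at_threshold(example_scores, [0.15, 0.5, 0.9])
```

Here `counts` maps each threshold to a family count, and the counts can only decrease as the threshold is raised.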
  23. Summarized in motif confidence charts