Towards using multimedia technology for biological data processing

Presentation given during the Ghent University Global Campus (GUGC) Research Seminar on January 19, 2015.

  1. Towards Using Multimedia Technology for Biological Data Processing
     Ghent University Global Campus (GUGC) Research Seminar
     Wesley De Neve, Ghent University – iMinds & KAIST
     Het Pand, Ghent, Belgium, January 19, 2015
  2. Background
     • Credentials
       - Master’s degree in computer science (2002), at Ghent University, Belgium
       - Ph.D. degree in computer science engineering (2007), at Ghent University, Belgium
     • Employment
       - Multimedia Lab @ Ghent University - iMinds, Belgium (since 2011)
       - Image and Video Systems Lab @ KAIST, Korea (since 2007)
  3. Teaching Activities
     • Informatics 1
     • Informatics 2
  4. Research Activities
     • Main track
       - machine learning for social media and video content understanding
     • Side track
       - compression of genomic data using video coding tools
  5. In what follows…
     • Compression of genomic data using video coding tools
  6. Context
     • DNA sequencing (digitization) is quickly becoming cheaper
  7. Problem Statement
     • Challenge: data handling
       - the ability of researchers to sequence DNA is outrunning their ability to store, transmit, and analyze DNA
     • Research question
       - can DNA be compressed with video coding tools in order to alleviate storage, transmission, and analysis problems?
  8. DNA Compression Framework
     [Diagram: input filter, encoding filter, output filter, and statistics component, connected by pipes]
  9. DNA Compression Framework
     • Input filter: software for reading DNA data from the hard disk or the network
  10. DNA Compression Framework
     • Encoding filter: software for compressing DNA data
  11. DNA Compression Framework
     • Output filter: software for writing DNA data to the hard disk or the network
  12. DNA Compression Framework
     • Statistics: software for gathering compression performance statistics
  13. Characteristics (1)
     • Modular and extensible
       - thanks to the use of the pipes and filters design pattern
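The pipes and filters pattern named on this slide can be sketched with Python generators. This is a minimal illustration, not the framework's actual code: the function names, the `Statistics` class, the `tap` helper, and the 2-bit nucleotide packing are all assumptions made for the example, since the slides do not describe the implementation.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical 2-bit codes; the slides do not say how nucleotides are encoded.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


class Statistics:
    """Statistics component: counts what flows through the pipes."""

    def __init__(self) -> None:
        self.symbols_in = 0
        self.bytes_out = 0


def input_filter(dna: str, block_size: int) -> Iterator[str]:
    """Input filter: read DNA data and emit it as fixed-size blocks."""
    for start in range(0, len(dna), block_size):
        yield dna[start:start + block_size]


def encoding_filter(blocks: Iterable[str]) -> Iterator[bytes]:
    """Encoding filter: pack each block at 2 bits per nucleotide."""
    for block in blocks:
        value = 0
        for base in block:
            value = (value << 2) | CODES[base]
        yield value.to_bytes((2 * len(block) + 7) // 8, "big")


def output_filter(chunks: Iterable[bytes]) -> List[bytes]:
    """Output filter: collect encoded chunks (a real one would write to disk)."""
    return list(chunks)


def tap(items: Iterable, record: Callable) -> Iterator:
    """Pipe with a side channel: pass items through and report each one."""
    for item in items:
        record(item)
        yield item


def run_pipeline(dna: str, block_size: int = 4) -> Tuple[List[bytes], Statistics]:
    stats = Statistics()

    def count_in(block: str) -> None:
        stats.symbols_in += len(block)

    def count_out(chunk: bytes) -> None:
        stats.bytes_out += len(chunk)

    blocks = tap(input_filter(dna, block_size), count_in)
    chunks = tap(encoding_filter(blocks), count_out)
    return output_filter(chunks), stats
```

Because each filter only consumes and produces an iterable, any one of them can be swapped out (for example, a different encoding filter) without touching the others, which is the modularity the slide refers to.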
  14. Characteristics (2)
     • Block-based compression
       - allows selecting the best compression tool per block (adaptivity)
       - enables random access, streaming, and parallel processing
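Per-block tool selection and random access can be illustrated as follows. This is a sketch under stated assumptions: `zlib`, `bz2`, and a raw pass-through stand in for the framework's compression tools, which the slides do not name, and the one-byte tool tag and the `(offset, length)` index are a hypothetical container layout.

```python
import bz2
import zlib
from typing import List, Tuple

# Stand-in tools (assumption): tool id -> (encode, decode).
TOOLS = {
    0: (lambda b: b, lambda b: b),          # raw pass-through
    1: (zlib.compress, zlib.decompress),
    2: (bz2.compress, bz2.decompress),
}


def encode_blocks(data: bytes, block_size: int) -> Tuple[bytes, List[Tuple[int, int]]]:
    """Encode each block with whichever tool yields the smallest payload.

    Returns the stream plus an index of (offset, length) per block.
    """
    stream = bytearray()
    index = []
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        tool_id, payload = min(
            ((tid, enc(block)) for tid, (enc, _dec) in TOOLS.items()),
            key=lambda candidate: len(candidate[1]),
        )
        index.append((len(stream), 1 + len(payload)))
        stream.append(tool_id)   # one-byte tag naming the tool used
        stream += payload
    return bytes(stream), index


def decode_block(stream: bytes, index: List[Tuple[int, int]], i: int) -> bytes:
    """Random access: decode block i without touching any other block."""
    offset, length = index[i]
    tool_id = stream[offset]
    payload = stream[offset + 1:offset + length]
    return TOOLS[tool_id][1](payload)
```

Since every block is self-contained, blocks can also be decoded in parallel or consumed as they arrive over a network, which matches the streaming and parallel-processing points on the slide.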
  15. Characteristics (3)
     [Diagram: axes Efficiency, Effectiveness, and Functionality, comparing the proposed solution with the state of the art (SOTA)]
     • Allowing for a flexible trade-off between efficiency, effectiveness, and functionality has always been a major design goal
  16. Experimental Results
     • Effectiveness: compression of the human Y chromosome
     • Efficiency: no meaningful measurements thus far

     Format                    File size (MB)
     No compression (FASTA)    18.70
     Binary                     7.01
     Huffman                    5.16
     Proposed framework (*)     4.26

     (*) Tom Paridaens, Yves Van Stappen, Wesley De Neve, Peter Lambert, Rik Van de Walle, Towards block-based compression of genomic data with random access functionality, Proceedings of the IEEE GlobalSIP 2014 Workshop on Genomic Signal Processing and Statistics
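The file sizes in the table can be turned into compression ratios and space savings relative to the uncompressed FASTA file. The snippet below uses only the four numbers from the slide. The bits-per-symbol figure is a rough estimate that assumes the FASTA file stores one byte per nucleotide and ignores headers and line breaks; that assumption is not stated in the slides.

```python
# File sizes in MB, copied from the Experimental Results slide.
SIZES_MB = {
    "No compression (FASTA)": 18.70,
    "Binary": 7.01,
    "Huffman": 5.16,
    "Proposed framework": 4.26,
}
BASELINE_MB = SIZES_MB["No compression (FASTA)"]


def compression_ratio(size_mb: float) -> float:
    """Uncompressed size divided by compressed size."""
    return BASELINE_MB / size_mb


def space_saving(size_mb: float) -> float:
    """Fraction of the original storage that is no longer needed."""
    return 1 - size_mb / BASELINE_MB


def bits_per_symbol(size_mb: float) -> float:
    """Estimate only: assumes one byte per nucleotide in the FASTA file."""
    return 8 * size_mb / BASELINE_MB


for name, size in SIZES_MB.items():
    print(
        f"{name:<24} {size:6.2f} MB  "
        f"ratio {compression_ratio(size):.2f}  "
        f"saving {space_saving(size):.1%}  "
        f"~{bits_per_symbol(size):.2f} bits/symbol"
    )
```

Relative to the FASTA file, the proposed framework's 4.26 MB corresponds to a ratio of roughly 4.4, or about 77% less storage.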
  17. Future Research (1)
     • Compression
       - integration of advanced entropy coding
       - support for the protein alphabet
       - performance optimizations (I/O, GPU)
     • Privacy protection
       - encryption
     • Streaming
     • Compressed-domain manipulation
       - download and decode only the part of the compressed genome that belongs to a particular gene (region of interest)
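The region-of-interest idea can be sketched as a mapping from genome coordinates to block numbers, so that only the blocks overlapping a gene are fetched and decoded. This is an illustration under assumptions: fixed-size blocks, half-open coordinates `[start, end)`, and a hypothetical `fetch_block` callback that downloads and decodes one block. None of these details come from the slides.

```python
from typing import Callable


def blocks_for_region(start: int, end: int, block_size: int) -> range:
    """Block numbers that overlap the half-open region [start, end)."""
    if not 0 <= start < end:
        raise ValueError("expected 0 <= start < end")
    return range(start // block_size, (end - 1) // block_size + 1)


def extract_region(
    fetch_block: Callable[[int], str],  # hypothetical: fetch and decode block i
    start: int,
    end: int,
    block_size: int,
) -> str:
    """Decode only the blocks covering [start, end) and trim to the region."""
    needed = blocks_for_region(start, end, block_size)
    decoded = "".join(fetch_block(i) for i in needed)
    offset = start - needed.start * block_size
    return decoded[offset:offset + (end - start)]
```

For a gene spanning positions 5 to 10 with a block size of 4, only blocks 1 and 2 are requested; block 0 and everything after block 2 are never downloaded.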
  18. Future Research (2)
     • From
       - What video coding technologies can be re-used in the context of DNA data compression?
     • To
       - What multimedia technologies can be re-used in the context of biological data processing?
  19. Thank you for your attention
     Any questions or comments?
  20. References
     [1] Tom Paridaens, Wesley De Neve, Peter Lambert, Rik Van de Walle. Genome Sequences as Media Files: Towards Effective, Efficient, and Functional Compression of Genomic Data. Proceedings of DCBIOSTEC 2014.
     [2] Tom Paridaens, Yves Van Stappen, Wesley De Neve, Peter Lambert, Rik Van de Walle. Towards block-based compression of genomic data with random access functionality. Proceedings of the IEEE GlobalSIP 2014 Workshop on Genomic Signal Processing and Statistics.
