Csr2011 june17 17_00_likhomanov

Transcript

  • 1. Two Combinatorial Criteria for BWT Images
    Konstantin M. Likhomanov, Arseny M. Shur
    Ural State University, Yekaterinburg, Russia
    K. M. Likhomanov, A. M. Shur (USU), Combinatorial Criteria for BWT Images, CSR2011, 1/9
  • 6. Burrows-Wheeler Transform (BWT)
    Introduced by M. Burrows and D. J. Wheeler, 1994.
    An illustration for BANANA (the rows are its cyclic shifts in lexicographic order):
      A B A N A N
      A N A B A N
      A N A N A B
      B A N A N A
      N A B A N A
      N A N A B A
    The last column gives bwt(BANANA) = NNBAAA.
    BWT always performs a permutation of the original word.
    It is not a bijection because it doesn't distinguish between cyclic shifts.
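The illustration above can be reproduced with a few lines of Python. This is a minimal sketch that sorts all cyclic shifts explicitly, so it is not the linear-time construction mentioned in the talk; the function name `bwt` is chosen here for illustration.

```python
def bwt(word: str) -> str:
    """Return the last column of the sorted matrix of cyclic shifts of `word`.

    Naive sketch: building and sorting all rotations is far from linear time.
    """
    rotations = sorted(word[i:] + word[:i] for i in range(len(word)))
    return "".join(rotation[-1] for rotation in rotations)


if __name__ == "__main__":
    print(bwt("BANANA"))  # the word from the slide
    print(bwt("NANABA"))  # a cyclic shift of BANANA has the same set of rotations
```

Since a word and its cyclic shifts produce the same sorted matrix, they share one image, which is why the transform is not a bijection.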
  • 9. BWT and context dependences
    Right contexts of these letters (the final m of each row):
      inimal length ...... m
      inimal letter ...... m
      inimal letter ...... m
      inimal letter ...... m
      inimal letter ...... m
      inimum letters ...... m
    Usage of BWT: reversibility + clustering effect + linear-time algorithms for both $\mathrm{bwt}$ and $\mathrm{bwt}^{-1}$ = good preprocessing for lossless data compressors (bzip2, 7-Zip, ...)
  • 12. Combinatorial properties of BWT
    Earlier results: words with "simple" BWT images
      • Description of the words having BWT images of the form $b^i a^j$ (S. Mantaci, A. Restivo, S. Sciortino, 2003)
      • Description of the words having BWT images of the form $c^i b^j a^k$ (J. Simpson, S. J. Puglisi, 2008)
      • Partial description of the words having BWT images of the form $a_n^{i_n} a_{n-1}^{i_{n-1}} \cdots a_1^{i_1}$ (A. Restivo, G. Rosone, 2009)
    Two general decision problems (over any fixed ordered alphabet):
      • Given a word $w$, is $w$ a BWT image?
      • Given a word $w = c_1 c_2 \cdots c_l$, is there a BWT image of the form $c_1^{p_1} \cdots c_l^{p_l}$ for some positive $p_1, \ldots, p_l$?
    We present efficient solutions for both problems.
  • 15. Checking whether a word is a BWT image
    The stable sorting of a word $w = b_1 \cdots b_k$ is the permutation $\sigma$ of the set $\{1, \ldots, k\}$ such that the string $\sigma(w) = b_{\sigma(1)} \cdots b_{\sigma(k)}$ is lexicographically sorted and the relative order of equal letters in $\sigma(w)$ is the same as in $w$.
    Theorem 1. A word $w = c_1^{p_1} \cdots c_k^{p_k}$, where $p_i > 0$ and $c_i \ne c_{i+1}$, is a BWT image if and only if the number of orbits of its stable sorting is equal to $\gcd(p_1, \ldots, p_k)$.
    Examples:
      w:        N N B A A A       B A N A N A
      σ(w):     A A A B N N       A A A B N N
      orbits:   (1 4 3 6 2 5)     (1 2 4) (3 5 6)  ×
    NNBAAA has one orbit and block lengths 2, 1, 3 with GCD 1, so it is a BWT image. BANANA has two orbits but GCD 1, so it is not (marked ×).
  • 16. Checking whether a word is a BWT image (continued)
    A linear-time algorithm checks whether $w$ is a BWT image and returns a preimage of $w$ "for free"; so this algorithm can be used in BWT-based compression schemes to check the correctness of decompression when calculating the inverse BWT.
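The criterion of Theorem 1 can be tried out directly. The following is a sketch under stated assumptions, not the authors' linear-time algorithm: it relies on Python's built-in stable sort (so it runs in O(k log k)), and the helper names `stable_sorting`, `orbits` and `is_bwt_image` are introduced here for illustration only.

```python
from functools import reduce
from itertools import groupby
from math import gcd


def stable_sorting(word: str) -> list[int]:
    """Return sigma as a list: sigma[i - 1] is the 1-based position in `word`
    of the letter that lands at position i of the sorted word.
    Python's sort is stable, so equal letters keep their relative order."""
    return [i + 1 for i in sorted(range(len(word)), key=lambda i: word[i])]


def orbits(sigma: list[int]) -> list[list[int]]:
    """Decompose the permutation sigma into its cycles (orbits)."""
    seen: set[int] = set()
    cycles = []
    for start in range(1, len(sigma) + 1):
        cycle = []
        j = start
        while j not in seen:
            seen.add(j)
            cycle.append(j)
            j = sigma[j - 1]
        if cycle:
            cycles.append(cycle)
    return cycles


def is_bwt_image(word: str) -> bool:
    """Theorem 1: compare the orbit count with the GCD of the block lengths."""
    block_lengths = [len(list(block)) for _, block in groupby(word)]
    return len(orbits(stable_sorting(word))) == reduce(gcd, block_lengths, 0)


if __name__ == "__main__":
    for candidate in ("NNBAAA", "BANANA", "BBAA"):
        print(candidate, orbits(stable_sorting(candidate)), is_bwt_image(candidate))
```

The word BBAA is an extra example not taken from the slides: it has two orbits and block lengths 2 and 2, and it is indeed the image of ABAB.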
  • 20. "Pumping" words to BWT images
    A word $w$ has a global ascent if $w = uv$ and each letter of $u$ is less than or equal to each letter of $v$.
    Theorem 2. Let $w = c_1 \cdots c_l$. A BWT image of the form $c_1^{p_1} \cdots c_l^{p_l}$, $p_i > 0$, exists if and only if $w$ has no global ascents.
    BANANA has no global ascents. One can check that BANNANA = bwt(AANANNB).
    In general, constructing such an image is not easy. We present a quadratic-time algorithm.
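The condition of Theorem 2 is easy to test in linear time. The sketch below assumes that both factors $u$ and $v$ in the definition of a global ascent are nonempty (the slide does not say so explicitly, but otherwise every word would have one); the name `has_global_ascent` is introduced here for illustration.

```python
def has_global_ascent(word: str) -> bool:
    """Return True if word = u v with u, v nonempty (an assumption) and
    every letter of u less than or equal to every letter of v."""
    n = len(word)
    if n < 2:
        return False
    # suffix_min[i] is the smallest letter of word[i:]
    suffix_min = list(word)
    for i in range(n - 2, -1, -1):
        suffix_min[i] = min(word[i], suffix_min[i + 1])
    prefix_max = word[0]  # largest letter of word[:cut]
    for cut in range(1, n):
        if prefix_max <= suffix_min[cut]:
            return True
        prefix_max = max(prefix_max, word[cut])
    return False


if __name__ == "__main__":
    print(has_global_ascent("BANANA"))  # the slide's example
    print(has_global_ascent("ACBA"))    # A | CBA is a global ascent
```

Comparing the running maximum of the prefix with the minimum of the suffix is enough, because "every letter of $u$ is at most every letter of $v$" is equivalent to $\max(u) \le \min(v)$.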
  • 24. Algorithm for constructing a BWT image
    Lemma. If a nonempty word has no global ascents, then we can delete one of its letters so that the new word will have no global ascents.
    Stable sorting moves blocks of letters as a whole.
    Lemma. If we change the length of some block of letters by its shift under a stable sorting, this changes neither the number of orbits of the sorting nor the GCD of the lengths of blocks.
    The idea of the algorithm: remove letters until we obtain a BWT image, then return them as zero-length blocks and resize those blocks by their shifts.
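The first lemma can be explored by brute force. The sketch below lists every position whose deletion leaves a word without global ascents; it only illustrates that such a position exists and says nothing about which letter the authors' algorithm actually removes. Both function names are hypothetical, and the factors of a global ascent are again assumed to be nonempty.

```python
def has_global_ascent(word: str) -> bool:
    """Quadratic brute-force check over all cuts word = u v (u, v nonempty)."""
    return any(max(word[:cut]) <= min(word[cut:]) for cut in range(1, len(word)))


def deletable_positions(word: str) -> list[int]:
    """0-based positions whose deletion yields a word with no global ascents."""
    return [
        i
        for i in range(len(word))
        if not has_global_ascent(word[:i] + word[i + 1:])
    ]


if __name__ == "__main__":
    print(deletable_positions("DACBA"))
    print(deletable_positions("DABA"))
```

For DACBA, deleting the leading D is not allowed, since ACBA splits as A | CBA; every other single deletion keeps the word free of global ascents.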
  • 27. An example
    Let us construct a BWT image of the form $D^? A^? C^? B^? A^?$.
    Deleting letters, we get DACBA → DABA → DAA (a BWT image).
      D A B^0 A
      A A B^0 D
    The block $B^0$ has zero shift and cannot be pumped.
    Auxiliary step: pump an existing block (the last A, having the shift 1).
      D A B^0 A A
      A A A B^0 D
    Now the block $B^0$ has shift 1 and can be pumped to get DABAA.
  • 28. An example (continued)
      D A C^0 B A A
      A A A B C^0 D
    The block $C^0$ has shift 2 and can be pumped to get DACCBAA.
    One can check that DACCBAA = bwt(AACBCAD).
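The shifts quoted in the example can be recomputed with a short sketch. The slides do not define "shift" formally; the code assumes it is the distance between the place where a zero-length block is inserted in the word and the place where stable sorting puts it, which reproduces the values 0, 1 and 2 above. The names `zero_block_shift` and `pump` and the sign convention are assumptions made for this illustration.

```python
def zero_block_shift(word: str, offset: int, letter: str) -> int:
    """Shift of an empty block of `letter` inserted before word[offset].

    Assumed reading: landing offset in the stably sorted word minus `offset`.
    """
    smaller = sum(1 for ch in word if ch < letter)
    equal_before = sum(1 for ch in word[:offset] if ch == letter)
    return smaller + equal_before - offset


def pump(word: str, offset: int, letter: str, times: int) -> str:
    """Insert `times` copies of `letter` before word[offset]."""
    return word[:offset] + letter * times + word[offset:]


if __name__ == "__main__":
    print(zero_block_shift("DAA", 2, "B"))    # B^0 in D A B^0 A
    print(zero_block_shift("DAAA", 2, "B"))   # after pumping the last A
    print(pump("DAAA", 2, "B", 1))            # resize B^0 by its shift
    print(zero_block_shift("DABAA", 2, "C"))  # C^0 in D A C^0 B A A
    print(pump("DABAA", 2, "C", 2))           # resize C^0 by its shift
```

Resizing each zero-length block by its shift follows the second lemma, so the orbit count and the GCD of block lengths stay matched at every step.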
  • 29. Open problems
      • Improve the complexity of the above algorithm
      • Construct the shortest possible BWT image with given order of letters