SlideShare a Scribd company logo
1 of 46
Download to read offline
Compressed Membership in Automata with
                            Compressed Labels

                                              Markus Lohrey
                                            Christian Mathissen

                                             University of Leipzig


                                              June 16, 2011




Lohrey, Mathissen (University of Leipzig)     Compressed Membership   June 16, 2011   1 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.
  Goal: Beat straightforward decompress and manipulate strategy.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.
  Goal: Beat straightforward decompress and manipulate strategy.
  In this talk: focus on compressed strings




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.
  Goal: Beat straightforward decompress and manipulate strategy.
  In this talk: focus on compressed strings
  Applications:
     ◮    all domains, where massive string data arise and have to be
          processed, e.g. bioinformatics




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Motivation

  Try to develop algorithms that directly work on compressed data.
  Goal: Beat straightforward decompress and manipulate strategy.
  In this talk: focus on compressed strings
  Applications:
     ◮    all domains, where massive string data arise and have to be
          processed, e.g. bioinformatics
     ◮    large (and highly compressible) strings often occur as intermediate
          data structures (e.g. in computational group theory, program analysis,
          verification).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   2 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.
  We write val(A) for the unique word generated by A.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.
  We write val(A) for the unique word generated by A.
  The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.
  We write val(A) for the unique word generated by A.
  The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n.
  One may have |val(A)| = 2n .



Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Dictionary-based compression algorithms (LZ77, LZ78) exploit text
  repetition.
  Straight-line programs are a general representation for compressed strings,
  which covers most dictionary-based algorithms.
  A straight-line program (SLP) over the alphabet Γ is a sequence of
  definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for
  some j, k < i .
  Alternatively: a context-free grammar that generates exactly one string.
  We write val(A) for the unique word generated by A.
  The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n.
  One may have |val(A)| = 2n .
  Thus, an SLP A can be seen as a compressed representation of val(A).

Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   3 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:

                                     A3 = A2 A1 = ab
                                     A4 = A3 A2 = aba
                                     A5 = A4 A3 = abaab
                                     A6 = A5 A4 = abaababa
                                     A7 = A6 A5 = abaababaabaab




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:

                                     A3 = A2 A1 = ab
                                     A4 = A3 A2 = aba
                                     A5 = A4 A3 = abaab
                                     A6 = A5 A4 = abaababa
                                     A7 = A6 A5 = abaababaabaab

  Relationship to dictionary-based compression (Rytter):




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:

                                     A3 = A2 A1 = ab
                                     A4 = A3 A2 = aba
                                     A5 = A4 A3 = abaab
                                     A6 = A5 A4 = abaababa
                                     A7 = A6 A5 = abaababaabaab

  Relationship to dictionary-based compression (Rytter):
     ◮    From an SLP A one can compute in polynomial time LZ77(val(A)).




Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Straight-line programs
  Example: A = (A1 := b,                    A2 := a,      Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7).
  Then val(A) = abaababaabaab:

                                     A3 = A2 A1 = ab
                                     A4 = A3 A2 = aba
                                     A5 = A4 A3 = abaab
                                     A6 = A5 A4 = abaababa
                                     A7 = A6 A5 = abaababaabaab

  Relationship to dictionary-based compression (Rytter):
     ◮    From an SLP A one can compute in polynomial time LZ77(val(A)).
     ◮    From LZ77(w ) one can compute in polynomial time an SLP A with
          val(A) = w .

Lohrey, Mathissen (University of Leipzig)    Compressed Membership               June 16, 2011   4 / 12
Algorithms on SLP-compressed strings

  The mother of all algorithms on SLP -compressed strings:




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   5 / 12
Algorithms on SLP-compressed strings

  The mother of all algorithms on SLP -compressed strings:


  Plandowski 1994
  The following problem can be solved in polynomial time:
  INPUT: SLPs A, B
  QUESTION: val(A) = val(B)?




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   5 / 12
Algorithms on SLP-compressed strings

  The mother of all algorithms on SLP -compressed strings:


  Plandowski 1994
  The following problem can be solved in polynomial time:
  INPUT: SLPs A, B
  QUESTION: val(A) = val(B)?


  Note: The decompress-and-compare strategy does not work here.
  We cannot compute val(A) and val(B) in polynomial time.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   5 / 12
Algorithms on SLP-compressed strings

  The mother of all algorithms on SLP -compressed strings:


  Plandowski 1994
  The following problem can be solved in polynomial time:
  INPUT: SLPs A, B
  QUESTION: val(A) = val(B)?


  Note: The decompress-and-compare strategy does not work here.
  We cannot compute val(A) and val(B) in polynomial time.

  Plandowski’s algorithm uses combinatorics on words, in particular the
  Periodicity-Lemma of Fine and Wilf.


Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   5 / 12
Improvements of Plandowski’s result


  Gasieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara,
  Takeda (mid 90’s)
  The following problem can be solved in polynomial time
  (fully compressed pattern matching):
  INPUT: SLPs P, T
  QUESTION: Is val(P) a factor of val(T), i.e., ∃x, y : val(T) = x val(P) y ?




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   6 / 12
Improvements of Plandowski’s result


  Gasieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara,
  Takeda (mid 90’s)
  The following problem can be solved in polynomial time
  (fully compressed pattern matching):
  INPUT: SLPs P, T
  QUESTION: Is val(P) a factor of val(T), i.e., ∃x, y : val(T) = x val(P) y ?


  The best known algorithm has a running time of O(|P| · |T|2 )
  (Lifshits 2006).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   6 / 12
Generalization: Compressed automata
  A compressed automata is an ordinary finite automaton, where every
  transition is labelled with an SLP.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   7 / 12
Generalization: Compressed automata
  A compressed automata is an ordinary finite automaton, where every
  transition is labelled with an SLP.

  Example: The compressed automaton
           Σ                Σ

                      q              A      q

  accepts all words that have val(A) as a factor.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   7 / 12
Generalization: Compressed automata
  A compressed automata is an ordinary finite automaton, where every
  transition is labelled with an SLP.

  Example: The compressed automaton
           Σ                Σ

                      q              A        q

  accepts all words that have val(A) as a factor.

  The size |A| of the compressed automaton A
  (with the set ∆ of transition triples) is

                                            |A| =                |A|.
                                                    (p,A,q)∈∆


Lohrey, Mathissen (University of Leipzig)     Compressed Membership     June 16, 2011   7 / 12
Compressed membership for compressed automata
  Compressed membership for compressed automata:
  INPUT: A compressed automaton A and an SLP B.
  QUESTION: val(B) ∈ L(A)?




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   8 / 12
Compressed membership for compressed automata
  Compressed membership for compressed automata:
  INPUT: A compressed automaton A and an SLP B.
  QUESTION: val(B) ∈ L(A)?


  Plandowski, Rytter 1999

     ◮    Compressed membership for compressed automata belongs to
          PSPACE.
     ◮    Compressed membership for compressed automata over a unary
          alphabet is NP-complete.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   8 / 12
Compressed membership for compressed automata
  Compressed membership for compressed automata:
  INPUT: A compressed automaton A and an SLP B.
  QUESTION: val(B) ∈ L(A)?


  Plandowski, Rytter 1999

     ◮    Compressed membership for compressed automata belongs to
          PSPACE.
     ◮    Compressed membership for compressed automata over a unary
          alphabet is NP-complete.


  Plandowski and Rytter conjectured that compressed membership for
  compressed automata is NP-complete (for every alphabet size).
Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   8 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).
                       |w |
  Let order(w ) = ⌊ period(w ) ⌋.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).
                       |w |
  Let order(w ) = ⌊ period(w ) ⌋.

  Example: Let w = abbabbab = (abb)2 ab.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).
                       |w |
  Let order(w ) = ⌊ period(w ) ⌋.

  Example: Let w = abbabbab = (abb)2 ab.
  Then period(w ) = 3 and order(w ) = 2.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
Some combinatorics on words

  The period of the word w ∈ Σ∗ is the smallest number p such that
  w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p
  (period(w ) = |w | if such a p does not exist).
                       |w |
  Let order(w ) = ⌊ period(w ) ⌋.

  Example: Let w = abbabbab = (abb)2 ab.
  Then period(w ) = 3 and order(w ) = 2.

  For a compressed automaton A, let:

             order(A) = max{order(val(A)) | A occurs as a label in A}
           period(A) = max{period(val(A)) | A occurs as a label in A}



Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   9 / 12
First main result


  Theorem 1
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) in time polynomial in (i) |A|, (ii) |B|, and (iii) order(A).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   10 / 12
First main result


  Theorem 1
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) in time polynomial in (i) |A|, (ii) |B|, and (iii) order(A).


  We use the following combinatorial fact:


     Let u, v ∈ Σ∗ and let p be a position in v . Then there exist at most
     order(u) many occurrences of u in v that “touch” position p.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   10 / 12
Second main result


  Theorem 2
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|,
  and (iii) period(A).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   11 / 12
Second main result


  Theorem 2
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|,
  and (iii) period(A).

  Generalizes the result of Plandowski & Rytter for a unary alphabet
  (NP-completeness of compressed membership for compressed automata).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   11 / 12
Second main result


  Theorem 2
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|,
  and (iii) period(A).

  Generalizes the result of Plandowski & Rytter for a unary alphabet
  (NP-completeness of compressed membership for compressed automata).
  A word w = an has period 1!




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   11 / 12
Second main result


  Theorem 2
  For a compressed automaton A and an SLP B, we can check
  val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|,
  and (iii) period(A).

  Generalizes the result of Plandowski & Rytter for a unary alphabet
  (NP-completeness of compressed membership for compressed automata).
  A word w = an has period 1!
  The theorem is proven by a reduction to the case of a unary alphabet.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   11 / 12
Open problems and a conjecture
  Conjecture 1
  Compressed membership for compressed automata is NP-complete.




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   12 / 12
Open problems and a conjecture
  Conjecture 1
  Compressed membership for compressed automata is NP-complete.


  Conjecture 2
  If val(B) ∈ L(A) for an SLP B and a compressed automaton A, then there
  exists an accepting run of A on val(B) (viewed as a word over the set of
  transition triples of A), which can be generated by an SLP of size
  poly(|B|, |A|).




Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   12 / 12
Open problems and a conjecture
  Conjecture 1
  Compressed membership for compressed automata is NP-complete.


  Conjecture 2
  If val(B) ∈ L(A) for an SLP B and a compressed automaton A, then there
  exists an accepting run of A on val(B) (viewed as a word over the set of
  transition triples of A), which can be generated by an SLP of size
  poly(|B|, |A|).

  Conjecture 2 implies Conjecture 1:
     ◮    Guess nondeterministically an SLP C of polynomial size.
     ◮    Check (in deterministic polynomial time), whether val(C) is an
          accepting run of A on val(B).

Lohrey, Mathissen (University of Leipzig)   Compressed Membership   June 16, 2011   12 / 12

More Related Content

More from CSR2011

Csr2011 june18 09_30_shpilka
Csr2011 june18 09_30_shpilkaCsr2011 june18 09_30_shpilka
Csr2011 june18 09_30_shpilkaCSR2011
 
Csr2011 june18 12_00_nguyen
Csr2011 june18 12_00_nguyenCsr2011 june18 12_00_nguyen
Csr2011 june18 12_00_nguyenCSR2011
 
Csr2011 june18 11_00_tiskin
Csr2011 june18 11_00_tiskinCsr2011 june18 11_00_tiskin
Csr2011 june18 11_00_tiskinCSR2011
 
Csr2011 june18 11_30_remila
Csr2011 june18 11_30_remilaCsr2011 june18 11_30_remila
Csr2011 june18 11_30_remilaCSR2011
 
Csr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCsr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCSR2011
 
Csr2011 june17 16_30_blin
Csr2011 june17 16_30_blinCsr2011 june17 16_30_blin
Csr2011 june17 16_30_blinCSR2011
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCSR2011
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCSR2011
 
Csr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCsr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCSR2011
 
Csr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCsr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCSR2011
 
Csr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCsr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCSR2011
 
Csr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCsr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCSR2011
 
Csr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminskiCsr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminskiCSR2011
 
Csr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCsr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCSR2011
 
Csr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCsr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCSR2011
 
Csr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCsr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCSR2011
 
Csr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCsr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCSR2011
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCSR2011
 
Csr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCsr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCSR2011
 
Csr2011 june16 11_30_georgiadis
Csr2011 june16 11_30_georgiadisCsr2011 june16 11_30_georgiadis
Csr2011 june16 11_30_georgiadisCSR2011
 

More from CSR2011 (20)

Csr2011 june18 09_30_shpilka
Csr2011 june18 09_30_shpilkaCsr2011 june18 09_30_shpilka
Csr2011 june18 09_30_shpilka
 
Csr2011 june18 12_00_nguyen
Csr2011 june18 12_00_nguyenCsr2011 june18 12_00_nguyen
Csr2011 june18 12_00_nguyen
 
Csr2011 june18 11_00_tiskin
Csr2011 june18 11_00_tiskinCsr2011 june18 11_00_tiskin
Csr2011 june18 11_00_tiskin
 
Csr2011 june18 11_30_remila
Csr2011 june18 11_30_remilaCsr2011 june18 11_30_remila
Csr2011 june18 11_30_remila
 
Csr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCsr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanov
 
Csr2011 june17 16_30_blin
Csr2011 june17 16_30_blinCsr2011 june17 16_30_blin
Csr2011 june17 16_30_blin
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhanin
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhanin
 
Csr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCsr2011 june17 12_00_morin
Csr2011 june17 12_00_morin
 
Csr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCsr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyi
 
Csr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCsr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonati
 
Csr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCsr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatov
 
Csr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminskiCsr2011 june17 15_15_kaminski
Csr2011 june17 15_15_kaminski
 
Csr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatovCsr2011 june17 14_00_bulatov
Csr2011 june17 14_00_bulatov
 
Csr2011 june17 12_00_morin
Csr2011 june17 12_00_morinCsr2011 june17 12_00_morin
Csr2011 june17 12_00_morin
 
Csr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyiCsr2011 june17 11_30_vyalyi
Csr2011 june17 11_30_vyalyi
 
Csr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonatiCsr2011 june17 11_00_lonati
Csr2011 june17 11_00_lonati
 
Csr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhaninCsr2011 june17 09_30_yekhanin
Csr2011 june17 09_30_yekhanin
 
Csr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanovCsr2011 june17 17_00_likhomanov
Csr2011 june17 17_00_likhomanov
 
Csr2011 june16 11_30_georgiadis
Csr2011 june16 11_30_georgiadisCsr2011 june16 11_30_georgiadis
Csr2011 june16 11_30_georgiadis
 

Recently uploaded

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...apidays
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businesspanagenda
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Angeliki Cooney
 

Recently uploaded (20)

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
Biography Of Angeliki Cooney | Senior Vice President Life Sciences | Albany, ...
 

Csr2011 june16 17_00_lohrey

  • 1. Compressed Membership in Automata with Compressed Labels Markus Lohrey Christian Mathissen University of Leipzig June 16, 2011 Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 1 / 12
  • 2. Motivation Try to develop algorithms that directly work on compressed data. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 3. Motivation Try to develop algorithms that directly work on compressed data. Goal: Beat straightforward decompress and manipulate strategy. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 4. Motivation Try to develop algorithms that directly work on compressed data. Goal: Beat straightforward decompress and manipulate strategy. In this talk: focus on compressed strings Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 5. Motivation Try to develop algorithms that directly work on compressed data. Goal: Beat straightforward decompress and manipulate strategy. In this talk: focus on compressed strings Applications: ◮ all domains, where massive string data arise and have to be processed, e.g. bioinformatics Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 6. Motivation Try to develop algorithms that directly work on compressed data. Goal: Beat straightforward decompress and manipulate strategy. In this talk: focus on compressed strings Applications: ◮ all domains, where massive string data arise and have to be processed, e.g. bioinformatics ◮ large (and highly compressible) strings often occur as intermediate data structures (e.g. in computational group theory, program analysis, verification). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 2 / 12
  • 7. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 8. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 9. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 10. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 11. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. We write val(A) for the unique word generated by A. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 12. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. We write val(A) for the unique word generated by A. The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 13. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. We write val(A) for the unique word generated by A. The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n. One may have |val(A)| = 2n . Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 14. Straight-line programs Dictionary-based compression algorithms (LZ77, LZ78) exploit text repetition. Straight-line programs are a general representation for compressed strings, which covers most dictionary-based algorithms. A straight-line program (SLP) over the alphabet Γ is a sequence of definitions A = (Ai := αi )1≤i ≤n , where either αi ∈ Γ or αi = Aj Ak for some j, k < i . Alternatively: a context-free grammar that generates exactly one string. We write val(A) for the unique word generated by A. The size of an SLP A = (Ai := αi )1≤i ≤n is |A| = n. One may have |val(A)| = 2n . Thus, an SLP A can be seen as a compressed representation of val(A). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 3 / 12
  • 15. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 16. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 17. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: A3 = A2 A1 = ab A4 = A3 A2 = aba A5 = A4 A3 = abaab A6 = A5 A4 = abaababa A7 = A6 A5 = abaababaabaab Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 18. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: A3 = A2 A1 = ab A4 = A3 A2 = aba A5 = A4 A3 = abaab A6 = A5 A4 = abaababa A7 = A6 A5 = abaababaabaab Relationship to dictionary-based compression (Rytter): Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 19. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: A3 = A2 A1 = ab A4 = A3 A2 = aba A5 = A4 A3 = abaab A6 = A5 A4 = abaababa A7 = A6 A5 = abaababaabaab Relationship to dictionary-based compression (Rytter): ◮ From an SLP A one can compute in polynomial time LZ77(val(A)). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 20. Straight-line programs Example: A = (A1 := b, A2 := a, Ai := Ai −1 Ai −2 for 3 ≤ i ≤ 7). Then val(A) = abaababaabaab: A3 = A2 A1 = ab A4 = A3 A2 = aba A5 = A4 A3 = abaab A6 = A5 A4 = abaababa A7 = A6 A5 = abaababaabaab Relationship to dictionary-based compression (Rytter): ◮ From an SLP A one can compute in polynomial time LZ77(val(A)). ◮ From LZ77(w ) one can compute in polynomial time an SLP A with val(A) = w . Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 4 / 12
  • 21. Algorithms on SLP-compressed strings The mother of all algorithms on SLP -compressed strings: Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 5 / 12
  • 22. Algorithms on SLP-compressed strings The mother of all algorithms on SLP -compressed strings: Plandowski 1994 The following problem can be solved in polynomial time: INPUT: SLPs A, B QUESTION: val(A) = val(B)? Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 5 / 12
  • 23. Algorithms on SLP-compressed strings The mother of all algorithms on SLP -compressed strings: Plandowski 1994 The following problem can be solved in polynomial time: INPUT: SLPs A, B QUESTION: val(A) = val(B)? Note: The decompress-and-compare strategy does not work here. We cannot compute val(A) and val(B) in polynomial time. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 5 / 12
  • 24. Algorithms on SLP-compressed strings The mother of all algorithms on SLP -compressed strings: Plandowski 1994 The following problem can be solved in polynomial time: INPUT: SLPs A, B QUESTION: val(A) = val(B)? Note: The decompress-and-compare strategy does not work here. We cannot compute val(A) and val(B) in polynomial time. Plandowski’s algorithm uses combinatorics on words, in particular the Periodicity-Lemma of Fine and Wilf. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 5 / 12
  • 25. Improvements of Plandowski’s result Gasieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara, Takeda (mid 90’s) The following problem can be solved in polynomial time (fully compressed pattern matching): INPUT: SLPs P, T QUESTION: Is val(P) a factor of val(T), i.e., ∃x, y : val(T) = x val(P) y ? Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 6 / 12
  • 26. Improvements of Plandowski’s result Gasieniec, Karpinski, Miyazaki, Plandowski, Rytter, Shinohara, Takeda (mid 90’s) The following problem can be solved in polynomial time (fully compressed pattern matching): INPUT: SLPs P, T QUESTION: Is val(P) a factor of val(T), i.e., ∃x, y : val(T) = x val(P) y ? The best known algorithm has a running time of O(|P| · |T|2 ) (Lifshits 2006). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 6 / 12
  • 27. Generalization: Compressed automata A compressed automata is an ordinary finite automaton, where every transition is labelled with an SLP. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 7 / 12
  • 28. Generalization: Compressed automata A compressed automata is an ordinary finite automaton, where every transition is labelled with an SLP. Example: The compressed automaton Σ Σ q A q accepts all words that have val(A) as a factor. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 7 / 12
  • 29. Generalization: Compressed automata A compressed automata is an ordinary finite automaton, where every transition is labelled with an SLP. Example: The compressed automaton Σ Σ q A q accepts all words that have val(A) as a factor. The size |A| of the compressed automaton A (with the set ∆ of transition triples) is |A| = |A|. (p,A,q)∈∆ Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 7 / 12
  • 30. Compressed membership for compressed automata Compressed membership for compressed automata: INPUT: A compressed automaton A and an SLP B. QUESTION: val(B) ∈ L(A)? Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 8 / 12
  • 31. Compressed membership for compressed automata Compressed membership for compressed automata: INPUT: A compressed automaton A and an SLP B. QUESTION: val(B) ∈ L(A)? Plandowski, Rytter 1999 ◮ Compressed membership for compressed automata belongs to PSPACE. ◮ Compressed membership for compressed automata over a unary alphabet is NP-complete. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 8 / 12
  • 32. Compressed membership for compressed automata Compressed membership for compressed automata: INPUT: A compressed automaton A and an SLP B. QUESTION: val(B) ∈ L(A)? Plandowski, Rytter 1999 ◮ Compressed membership for compressed automata belongs to PSPACE. ◮ Compressed membership for compressed automata over a unary alphabet is NP-complete. Plandowski and Rytter conjectured that compressed membership for compressed automata is NP-complete (for every alphabet size). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 8 / 12
  • 33. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 34. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). |w | Let order(w ) = ⌊ period(w ) ⌋. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 35. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). |w | Let order(w ) = ⌊ period(w ) ⌋. Example: Let w = abbabbab = (abb)2 ab. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 36. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). |w | Let order(w ) = ⌊ period(w ) ⌋. Example: Let w = abbabbab = (abb)2 ab. Then period(w ) = 3 and order(w ) = 2. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 37. Some combinatorics on words The period of the word w ∈ Σ∗ is the smallest number p such that w [k + p] = w [k] for all 1 ≤ k ≤ |w | − p (period(w ) = |w | if such a p does not exist). |w | Let order(w ) = ⌊ period(w ) ⌋. Example: Let w = abbabbab = (abb)2 ab. Then period(w ) = 3 and order(w ) = 2. For a compressed automaton A, let: order(A) = max{order(val(A)) | A occurs as a label in A} period(A) = max{period(val(A)) | A occurs as a label in A} Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 9 / 12
  • 38. First main result Theorem 1 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) in time polynomial in (i) |A|, (ii) |B|, and (iii) order(A). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 10 / 12
  • 39. First main result Theorem 1 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) in time polynomial in (i) |A|, (ii) |B|, and (iii) order(A). We use the following combinatorial fact: Let u, v ∈ Σ∗ and let p be a position in v . Then there exist at most order(u) many occurrences of u in v that “touch” position p. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 10 / 12
  • 40. Second main result Theorem 2 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|, and (iii) period(A). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 11 / 12
  • 41. Second main result Theorem 2 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|, and (iii) period(A). Generalizes the result of Plandowski & Rytter for a unary alphabet (NP-completeness of compressed membership for compressed automata). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 11 / 12
  • 42. Second main result Theorem 2 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|, and (iii) period(A). Generalizes the result of Plandowski & Rytter for a unary alphabet (NP-completeness of compressed membership for compressed automata). A word w = an has period 1! Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 11 / 12
  • 43. Second main result Theorem 2 For a compressed automaton A and an SLP B, we can check val(B) ∈ L(A) nondeterministically in time polynomial in (i) |A|, (ii) |B|, and (iii) period(A). Generalizes the result of Plandowski & Rytter for a unary alphabet (NP-completeness of compressed membership for compressed automata). A word w = an has period 1! The theorem is proven by a reduction to the case of a unary alphabet. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 11 / 12
  • 44. Open problems and a conjecture Conjecture 1 Compressed membership for compressed automata is NP-complete. Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 12 / 12
  • 45. Open problems and a conjecture Conjecture 1 Compressed membership for compressed automata is NP-complete. Conjecture 2 If val(B) ∈ L(A) for an SLP B and a compressed automaton A, then there exists an accepting run of A on val(B) (viewed as a word over the set of transition triples of A), which can be generated by an SLP of size poly(|B|, |A|). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 12 / 12
  • 46. Open problems and a conjecture Conjecture 1 Compressed membership for compressed automata is NP-complete. Conjecture 2 If val(B) ∈ L(A) for an SLP B and a compressed automaton A, then there exists an accepting run of A on val(B) (viewed as a word over the set of transition triples of A), which can be generated by an SLP of size poly(|B|, |A|). Conjecture 2 implies Conjecture 1: ◮ Guess nondeterministically an SLP C of polynomial size. ◮ Check (in deterministic polynomial time), whether val(C) is an accepting run of A on val(B). Lohrey, Mathissen (University of Leipzig) Compressed Membership June 16, 2011 12 / 12