A Note on Linear Programming Based Communication
                     Receivers

Shunsuke Horii, Tota Suko, Toshiyasu Matsushima, Shigeichi Hirasawa

                        Waseda University, Japan


                       September 15, 2011




Outline


1   Motivation

2   Problem description and preliminary materials

3   Review of earlier studies

4   Our proposition

5   Summary and future plans




Motivation
Problem: MAP estimation problems on graphical models
    Ex: MAP (ML) decoding problem of binary linear codes
    Linear programming (LP) based decoding has been attracting attention
    [Feldman et al. 2005]
Extension of LP decoding to other problems
    decoding for q-ary linear codes [Flanagan et al. 2008]
    application to Partial-Response (PR) channel [Taghavi et al. 2007]
    MAP estimation problems on factor graphs which don’t have
    multiple-degree non-indicator functions [Flanagan 2010]

Our research
LP-based inference algorithm for MAP estimation problems on factor
graphs which have multiple-degree non-indicator functions
    This class includes decoding problems of linear codes over
    multiple-access channels
Comparison to earlier studies
    Inference algorithm based on the LP relaxation
    Classify the problems by the functions in the factor graphs
         Indicator function: takes values only in {0, 1}
         Non-indicator function: any function that is not an indicator
         function (takes values in R+)
         Multiple-degree non-indicator function: non-indicator function which
         has more than one argument

      Factor graph with multiple-degree non-indicator functions (Our research)

      Factor graph without multiple-degree non-indicator functions (Flanagan 2010)
      Decoding problem of binary linear codes (Feldman et al. 2005)
      Decoding problem of q-ary linear codes (Flanagan et al. 2008)

      Decoding problem of linear codes over PR channel (Taghavi et al. 2007)
      Decoding problem of linear codes over multiple-access channel
Problem description

$x = (x_1, x_2, \cdots, x_N)$, $x_n \in X_n$ (finite set)
$X = \prod_{n=1}^{N} X_n$ (Cartesian product)

Problem: find $x \in X$ which maximizes $g(x)$
\[ g(x) = \prod_{a \in A} f_a(x_a) \]

$A$: discrete index set
$x_a$: argument of the function $f_a$
$x_a = (x_{a_1}, x_{a_2}, \cdots, x_{a_{|N(a)|}})$, $N(a) = \{a_1, a_2, \cdots, a_{|N(a)|}\}$

Assumption: the range of $f_a$ is $\mathbb{R}_+$ (non-negative real numbers)

In general, this is a computationally hard problem.
(Exhaustive search needs $O(2^N)$ evaluations of $g$ when $|X_n| = 2$)
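
The exhaustive search mentioned above can be sketched as follows. This is only an illustration: the function name `brute_force_max` and the two toy factors are assumptions made for the example and do not come from the slides.

```python
from itertools import product

def brute_force_max(domains, factors):
    """Return (best_x, best_value) maximizing the product of the factors.

    domains: list of finite sets X_n (as sequences), one per variable.
    factors: list of (indices, f) pairs; f takes the tuple x_a of the
             variables listed in indices and returns a non-negative number.
    """
    best_x, best_val = None, -1.0
    for x in product(*domains):          # enumerates X = X_1 x ... x X_N
        val = 1.0
        for idx, f in factors:
            val *= f(tuple(x[i] for i in idx))
        if val > best_val:
            best_x, best_val = x, val
    return best_x, best_val

# Hypothetical toy instance with N = 3 binary variables.
domains = [(0, 1)] * 3
factors = [
    ((0, 1), lambda xa: 2.0 if xa[0] == xa[1] else 1.0),
    ((1, 2), lambda xa: 3.0 if xa[0] != xa[1] else 1.0),
]
best_x, best_val = brute_force_max(domains, factors)
```

The loop visits every element of the Cartesian product, which is why the cost grows as 2^N for binary variables.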
Factor graph
Factor graph [F. R. Kschischang et al. 2001]
    A bipartite graph which expresses the structure of a factorization of a
    product of functions
    variable nodes: represent the variables $\{x_n\}_{n=1,2,\cdots,N}$
    factor nodes: represent the functions $\{f_a\}_{a \in A}$
    edge connections: variable node $x_n$ and factor node $f_a$ are connected
    iff $x_n$ is an argument of $f_a$

Example: $g(x_1, x_2, x_3, x_4) = f_A(x_1, x_2) f_B(x_2, x_3, x_4) f_C(x_2, x_4)$
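
As a small sketch of the edge rule, the bipartite structure of this example can be written down directly from the argument lists. The dictionary layout and the names `scopes`, `edges` and `var_neighbours` are assumptions chosen for illustration.

```python
# Argument lists of the example g(x1,x2,x3,x4) = fA(x1,x2) fB(x2,x3,x4) fC(x2,x4).
scopes = {"fA": (1, 2), "fB": (2, 3, 4), "fC": (2, 4)}

# One edge (variable index, factor name) for every argument of every factor.
edges = sorted((n, a) for a, scope in scopes.items() for n in scope)

# Factor nodes adjacent to each variable node x_1, ..., x_4.
var_neighbours = {
    n: sorted(a for a, scope in scopes.items() if n in scope)
    for n in range(1, 5)
}
```

Here x_2 is connected to all three factor nodes, while x_1 is connected only to fA.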




Classify the problems



We classify the functions $\{f_a\}_{a \in A}$ into two classes, indicator functions and
non-indicator functions
    $\{f_{I_j}\}_{j \in J}$: indicator functions in $\{f_a\}_{a \in A}$
    $\{f_{R_l}\}_{l \in L}$: non-indicator functions in $\{f_a\}_{a \in A}$
Then the factorization can be written as
\[ g(x) = \prod_{j \in J} f_{I_j}(x_{I_j}) \prod_{l \in L} f_{R_l}(x_{R_l}) \]




Example: Factor graph for binary linear codes
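ML decoding:
\[ \hat{x} = \arg\max_{x \in C} p(y|x) = \arg\max_{x \in \{0,1\}^N} \prod_{n=1}^{N} p(y_n|x_n) \prod_{m=1}^{M} f_{I_m}(x_{N(m)}) \]
$y$: received word, $C$: binary linear code, $N(m) = \{n : H_{m,n} = 1\}$,
$H = \{H_{m,n}\} \in \{0,1\}^{M \times N}$: parity-check matrix
\[ f_{I_m}(x_{N(m)}) = \begin{cases} 1 & \text{if } \sum_{n \in N(m)} x_n = 0 \bmod 2 \\ 0 & \text{otherwise} \end{cases} \]
[Figure: factor graph of the code. The factors $p(y_n|x_n)$ are the non-indicator functions; the factors $f_{I_m}(x_{N(m)})$ are the indicator functions.]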
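
A minimal sketch of this factorization, assuming a binary symmetric channel and a tiny parity-check matrix; both are hypothetical choices for the example, as are the names `parity_indicator`, `channel` and `ml_decode`. The decoder simply enumerates all binary words, so it is only meant to make the two kinds of factors concrete.

```python
from itertools import product

H = [[1, 1, 0],
     [0, 1, 1]]          # hypothetical 2 x 3 parity-check matrix
p = 0.1                  # assumed BSC crossover probability

def parity_indicator(x, row):
    """f_{I_m}: 1 if the bits checked by this row sum to 0 mod 2, else 0."""
    return 1 if sum(xn for xn, h in zip(x, row) if h) % 2 == 0 else 0

def channel(yn, xn):
    """p(y_n | x_n) for a binary symmetric channel (non-indicator factor)."""
    return 1 - p if yn == xn else p

def ml_decode(y):
    """Maximize prod_n p(y_n|x_n) * prod_m f_{I_m}(x_{N(m)}) over {0,1}^N."""
    best, best_val = None, -1.0
    for x in product((0, 1), repeat=len(y)):
        val = 1.0
        for yn, xn in zip(y, x):
            val *= channel(yn, xn)
        for row in H:
            val *= parity_indicator(x, row)
        if val > best_val:
            best, best_val = x, val
    return best

decoded = ml_decode((1, 0, 1))
```

The indicator factors zero out every word that violates a parity check, so the maximum is always attained at a codeword.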
Flanagan’s work


    LP relaxation for the problem of finding the $x$ that maximizes $g(x)$
    Covers the case in which every non-indicator function has only one argument
\[ x^* = \arg\max_{x \in X} \left( \prod_{j \in J} f_{I_j}(x_{I_j}) \prod_{l \in L} f_{R_l}(x_l) \right) = \arg\max_{x \in Q} \sum_{l \in L} \log f_{R_l}(x_l) \]

where $Q$ is defined as follows:
$Q_j = \{x_{I_j} : f_{I_j}(x_{I_j}) = 1\}$, $Q = \{x : x_{I_j} \in Q_j \ \forall j \in J\}$



Flanagan’s work

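Define the vector (mapping):
$\tau_n(x_n) = (\tau_{n:1}(x_n), \tau_{n:2}(x_n), \cdots, \tau_{n:|X_n|}(x_n)) \in \{0,1\}^{|X_n|}$, where
\[ \tau_{n:\alpha}(x_n) = \begin{cases} 1 & \text{if } x_n = \alpha \\ 0 & \text{otherwise} \end{cases} \]

Ex.) $X_n = \{1, 2, 3\}$, then $\tau_n(2) = (\tau_{n:1}(2), \tau_{n:2}(2), \tau_{n:3}(2)) = (0, 1, 0)$

Let $\tau(x) = (\tau_1(x_1), \tau_2(x_2), \cdots, \tau_N(x_N))$.
Define $\lambda_{l:\alpha} = \log f_{R_l}(\alpha)$ for all $\alpha \in X_l$, then
\[ x^* = \arg\max_{x \in Q} \sum_{l \in L} \log f_{R_l}(x_l) = \arg\max_{x \in Q} \sum_{l \in L} \sum_{\alpha \in X_l} \lambda_{l:\alpha} \tau_{l:\alpha}(x_l) \]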
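
The following sketch checks the identity used in the last step for a single variable: the inner product of $\lambda$ with the indicator vector picks out exactly $\log f_{R_l}(x_l)$. The table `f_R` is a made-up single-argument function, and the helper names are assumptions for the example.

```python
import math

def tau_n(xn, domain):
    """Indicator (one-hot) vector tau_n(x_n) over the finite set X_n."""
    return tuple(1 if xn == alpha else 0 for alpha in domain)

domain = (1, 2, 3)

# Hypothetical single-argument non-indicator function f_{R_l} on X_l.
f_R = {1: 0.2, 2: 0.5, 3: 0.3}
lam = [math.log(f_R[alpha]) for alpha in domain]     # lambda_{l:alpha}

def linear_score(xn):
    """sum over alpha of lambda_{l:alpha} * tau_{l:alpha}(x_n)."""
    return sum(l * t for l, t in zip(lam, tau_n(xn, domain)))
```

Because exactly one coordinate of the indicator vector is 1, the objective becomes linear in $\tau$, which is what makes an LP formulation possible.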


Flanagan’s work
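Define $P_j = \{\tau(x) : f_{I_j}(x_{I_j}) = 1\}$ and the polytope
$P = \mathrm{conv}\left(\bigcap_{j \in J} P_j\right)$ ($\mathrm{conv}(\cdot)$: convex hull of a set), then
\[ \max_{x \in Q} \sum_{l \in L} \sum_{\alpha \in X_l} \lambda_{l:\alpha} \tau_{l:\alpha}(x_l) = \max_{\tau \in P} \sum_{l \in L} \sum_{\alpha \in X_l} \lambda_{l:\alpha} \tau_{l:\alpha} \]

where $\tau$ is the variable that takes its value in the range of $\tau(x)$ and $\tau_{l:\alpha}$ is
an element of $\tau$.
The LP relaxed problem is derived as follows:
\[ \max_{\tau \in \tilde{P}} \sum_{l \in L} \sum_{\alpha \in X_l} \lambda_{l:\alpha} \tau_{l:\alpha} \]

where $\tilde{P} = \bigcap_{j \in J} \mathrm{conv}(P_j)$ is the relaxed polytope.

See [Flanagan, 2010] for details.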
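
The short derivation below is a sketch added for clarity rather than a step taken from the slides. Using only the definitions of $P_j$, $P$ and $\tilde{P}$, it spells out why optimizing over $\tilde{P}$ is a relaxation of optimizing over $P$.

```latex
\begin{align*}
% The intersection is contained in each P_{j'}, and conv preserves inclusion.
\bigcap_{j \in J} P_j \subseteq P_{j'}
  &\;\Rightarrow\;
  P = \mathrm{conv}\Bigl(\bigcap_{j \in J} P_j\Bigr) \subseteq \mathrm{conv}(P_{j'})
  \quad \text{for every } j' \in J, \\
% Hence P lies inside the intersection of the individual hulls.
  &\;\Rightarrow\;
  P \subseteq \bigcap_{j \in J} \mathrm{conv}(P_j) = \tilde{P}, \\
% The same linear objective over a larger feasible set can only increase.
\max_{\tau \in \tilde{P}} \sum_{l \in L} \sum_{\alpha \in X_l} \lambda_{l:\alpha} \tau_{l:\alpha}
  &\;\ge\;
  \max_{\tau \in P} \sum_{l \in L} \sum_{\alpha \in X_l} \lambda_{l:\alpha} \tau_{l:\alpha}.
\end{align*}
```

In particular, if a maximizer over $\tilde{P}$ happens to lie in $P$, it is also a maximizer over $P$.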
Aim of our research



    Flanagan's result cannot be applied if the factor graph has
    non-indicator functions with more than one argument. (We call
    these multiple-degree non-indicator functions.)
    Some problems have such functions (e.g., the decoding problem over a
    multiple-access channel)

Our Proposition
An LP relaxation for problems whose objective function contains
multiple-degree non-indicator functions.




Our Proposition

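Original problem:
\[ \arg\max_{x} \prod_{j \in J} f_{I_j}(x_{I_j}) \prod_{l \in L} f_{R_l}(x_{R_l}) \]

Introduce auxiliary variables and functions:

    $W_l = X_{l_1} \times X_{l_2} \times \cdots \times X_{l_{|N(l)|}}$
    $w_l$: variable which takes its value in $W_l$
    $w = (w_l)_{l \in L}$
\[ f_{I_l}(w_l, x_{R_l}) = \begin{cases} 1 & \text{if } w_l = x_{R_l} \\ 0 & \text{otherwise} \end{cases} \]

Note: $f_{I_l}$ is an indicator function.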


Our Proposition
Problem translation:
\[ g(x) = \prod_{j \in J} f_{I_j}(x_{I_j}) \prod_{l \in L} f_{R_l}(x_{R_l}) \]
\[ g'(x, w) = \prod_{j \in J} f_{I_j}(x_{I_j}) \prod_{l \in L} f_{I_l}(w_l, x_{R_l}) \prod_{l \in L} f_{R_l}(w_l) \]

If we regard $w_l$ as a single variable which takes its value in $W_l$, the function $f_{R_l}$
has only one argument.
Theorem
Let $(x^*, w^*)$ be a maximizer of the function $g'$. Then $x^*$ maximizes the function $g$.

    The problem is reduced to finding the maximizer of $g'$.
    The function $g'$ doesn't have any multiple-degree non-indicator
    functions.
    ⇒ We can use Flanagan's result.
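
The translation can be checked numerically on a toy instance. The sketch below is an assumption-laden illustration: the two functions in `f_R`, the parity indicator `f_I` and the scopes are invented for the example. It builds the auxiliary-variable objective and compares brute-force maximizers of both objectives.

```python
from itertools import product

X = (0, 1)                                  # every variable is binary here
N = 3
# Hypothetical multiple-degree non-indicator functions f_{R_l} with scopes R_l.
scopes = [(0, 1), (1, 2)]
f_R = [lambda a: 1.0 + a[0] + 2 * a[1],
       lambda a: 3.0 if a[0] != a[1] else 0.5]

def f_I(x):
    """One hypothetical indicator function f_{I_j}: even parity on all bits."""
    return 1 if sum(x) % 2 == 0 else 0

def g(x):
    val = f_I(x)
    for s, f in zip(scopes, f_R):
        val *= f(tuple(x[i] for i in s))
    return val

def g_prime(x, w):
    """Auxiliary objective: each f_{R_l} sees only the single variable w_l."""
    val = f_I(x)
    for s, f, wl in zip(scopes, f_R, w):
        val *= 1 if wl == tuple(x[i] for i in s) else 0   # f_{I_l}(w_l, x_{R_l})
        val *= f(wl)                                       # f_{R_l}(w_l)
    return val

W = [list(product(X, repeat=len(s))) for s in scopes]      # W_l = X_{l_1} x ...
x_star = max(product(X, repeat=N), key=g)
xw_star = max(((x, w) for x in product(X, repeat=N) for w in product(*W)),
              key=lambda xw: g_prime(*xw))
```

On this instance the x-part of `xw_star` coincides with `x_star`, as the theorem states. The price of the translation is that each `w_l` ranges over the whole product set `W_l`.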
Application to Gaussian-MAC
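Gaussian Multiple-Access Channel model:
[Figure: channel model. User $k$ sends a codeword $x_k \in C_k$, which is mapped to $1 - 2x_k$ and scaled to $a_k(1 - 2x_k)$, where $a_k$ is the amplitude of user $k$; Gaussian noise $\epsilon \sim N(0, \sigma^2 I)$ is added.]

Maximum Likelihood decoding rule:
\[ (\hat{x}_1, \hat{x}_2, \cdots, \hat{x}_K) = \arg\max_{x_1 \in C_1, x_2 \in C_2, \cdots, x_K \in C_K} \Pr(y|x_1, x_2, \cdots, x_K) \]

Likelihood function:
\[ \Pr(y|x_1, x_2, \cdots, x_K) = \prod_{n=1}^{N} \Pr(y_n|x_{1,n}, x_{2,n}, \cdots, x_{K,n}) \]
$N$: codeword length, $x_{k,n}$: $n$-th symbol of user $k$'s codeword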
Factor graph for MAC

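[Figure: factor graph for the MAC. The factors $\Pr(y_n|x_{1,n}, x_{2,n}, \cdots, x_{K,n})$ are non-indicator functions connected to the variable nodes $x_{k,n}$ (e.g. $x_{K,1}, x_{K,2}, \cdots, x_{K,N}$); each user's code constraints (e.g. user 1's code constraint) are indicator functions.]

\[ \Pr(y_n|x_{1,n}, x_{2,n}, \cdots, x_{K,n}) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{\left(y_n - \sum_{k=1}^{K} a_k(1 - 2x_{k,n})\right)^2}{2\sigma^2} \right\}
\]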
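
A minimal sketch of this likelihood and of exhaustive joint ML decoding. It assumes real-valued noise and that each user's symbol is scaled by its amplitude $a_k$ as in the channel model; the length-2 repetition codes, the observation `y` and the function names are hypothetical.

```python
import math
from itertools import product

def mac_likelihood(yn, bits, amps, sigma):
    """Pr(y_n | x_{1,n}, ..., x_{K,n}) for a real Gaussian MAC (assumed model)."""
    mean = sum(a * (1 - 2 * b) for a, b in zip(amps, bits))
    return (math.exp(-(yn - mean) ** 2 / (2 * sigma ** 2))
            / math.sqrt(2 * math.pi * sigma ** 2))

def joint_ml(y, codes, amps, sigma):
    """Exhaustive joint ML decoding over C_1 x ... x C_K."""
    def likelihood(words):
        # zip(*words) groups the n-th bit of every user's codeword.
        return math.prod(mac_likelihood(yn, bits, amps, sigma)
                         for yn, bits in zip(y, zip(*words)))
    return max(product(*codes), key=likelihood)

# Hypothetical toy setup: K = 2 users, length-2 repetition codes.
codes = [[(0, 0), (1, 1)], [(0, 0), (1, 1)]]
amps = (1.0, 1.5)
y = (-0.4, -0.6)          # noisy observation of x1 = (0, 0), x2 = (1, 1)
decoded = joint_ml(y, codes, amps, sigma=0.5)
```

Each per-symbol factor depends on K variables at once, which is exactly the multiple-degree non-indicator case.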



Simulation




Simulation condition:
    The channel model is Gaussian MAC
    Number of users K = 2
    The amplitudes are set to a1 = 1.0, a2 = 1.5
    The two user codes C1 , C2 are (60, 30) LDPC codes
    GLPK is used as the LP solver




Simulation
Simulation result:




Conclusion


Summary:
    We proposed an LP-based inference algorithm for MAP estimation
    problems on factor graphs which have multiple-degree non-indicator
    functions.
    As an example, we proposed an LP-based decoding algorithm for
    multiple-access channels.
Future work:
    Improve the inference algorithm by using combinatorial
    optimization algorithms.
    Apply the method to problems other than multiple-access channels.




Appendix
Another simulation result:




Appendix


Average decoding time (ms):

(60,30) LDPC codes, Eb/N0 = 6.0 dB

    Algorithm            Time (ms)
    SP (T1 = T2 = 10)    0.0520
    SP (T1 = T2 = 20)    0.1762
    LP                   0.1403

(100,50) LDPC codes, Eb/N0 = 6.0 dB

    Algorithm            Time (ms)
    SP (T1 = T2 = 20)    0.2818
    SP (T1 = T2 = 30)    0.6259
    LP                   0.4235




Appendix

Computational Complexity:
The computational complexity of LP is polynomial in the number of
variables and constraints.

For the MAC problem:
Number of variables: $O(N 2^K)$
Number of constraints: $O(NK + \sum_k M_k 2^{d_{\max}^{(k)}})$
($d_{\max}^{(k)}$: maximum row Hamming weight of $H^{(k)}$)

For the Gaussian MAC problem (footnote 1):
Number of variables: $O(N K^2)$
Number of constraints: $O(N K^2 + \sum_k M_k 2^{d_{\max}^{(k)}})$
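
To get a feel for the two variable-count bounds, the sketch below evaluates only their leading terms with all constants dropped. These are not exact LP sizes, and the function name and the chosen values of N and K are assumptions for illustration.

```python
def leading_terms(N, K):
    """Leading variable-count terms of the two bounds (constants dropped)."""
    return {"general MAC": N * 2 ** K, "Gaussian MAC": N * K ** 2}

# Codeword length N = 60 as in the simulation; K varied for illustration only.
for K in (2, 4, 8):
    print(K, leading_terms(60, K))
```

The exponential term is what the auxiliary variables $w_l$ over $\{0,1\}^K$ cost in general, while the Gaussian MAC bound grows only quadratically in K.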

   1 S. Horii, T. Matsushima, S. Hirasawa, "A Note on the Linear Programming
Decoding of Binary Linear Codes for Multiple-Access Channel," IEICE Trans.
Fundamentals, Vol. E94-A, No. 6, pp. 1230-1237.
Appendix
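Proof of Theorem
\[ g(x) = \prod_{j \in J} f_{I_j}(x_{I_j}) \prod_{l \in L} f_{R_l}(x_{R_l}) \]
\[ g'(x, w) = \prod_{j \in J} f_{I_j}(x_{I_j}) \prod_{l \in L} f_{I_l}(w_l, x_{R_l}) \prod_{l \in L} f_{R_l}(w_l) \]

Theorem
Let $(x^*, w^*)$ be a maximizer of the function $g'$. Then $x^*$ maximizes the function $g$.

Proof: Assume that there exists $x'$ such that $g(x') > g(x^*)$.
Let $w'_l = (x'_{l_1}, x'_{l_2}, \cdots, x'_{l_{|N(l)|}})$.
Then it holds that $f_{I_l}(w'_l, x'_{R_l}) = 1$ for all $l \in L$.
Then,
\[ g'(x', w') = g(x') > g(x^*) \ge g'(x^*, w^*) \]
where the last step holds because $g'(x^*, w^*)$ equals $g(x^*)$ if $w^*_l = x^*_{R_l}$ for all $l \in L$ and is $0$ otherwise.
This contradicts the assumption that $(x^*, w^*)$ maximizes $g'$.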