Network Coding for Distributed Storage Systems*
Presented by
Jayant Apte
ASPITRG
7/9/13 & 7/11/13
*Dimakis, A.G.; Godfrey,...
Network Coding for Distributed Storage Systems (Group Meeting Talk)

Reviews the work of Koetter and Medard and of Dimakis et al.
The former provides an algebraic framework for linear network coding. The latter reduces the so-called repair problem to a single-source multicast network coding problem and shows that there is a tradeoff between the amount of data stored in a distributed storage system and the amount of data that must be transferred to repair the system when a node (hard drive) fails.

  1. Network Coding for Distributed Storage Systems*
     Presented by Jayant Apte, ASPITRG, 7/9/13 & 7/11/13
     *Dimakis, A.G.; Godfrey, P.B.; Wu, Y.; Wainwright, M.J.; Ramchandran, K., "Network Coding for Distributed Storage Systems," IEEE Transactions on Information Theory, vol. 56, no. 9, pp. 4539–4551, Sept. 2010
  2. Outline
     ● Part 1
       – Single-source multicast linear network coding
     ● Part 2
       – The repair problem
       – Reduction of the repair problem to a single-source multicast network
       – Family of single-source multicast networks arising from the reduction
       – A lower bound on min-cuts (equivalently, by the max-flow min-cut theorem, on the max-flow and hence on the coding capacity of the network)
       – Minimization of storage bandwidth subject to this lower bound
  3-4. Some background on single-source multicast network coding
     *Koetter, R.; Medard, M., "An algebraic approach to network coding," IEEE/ACM Transactions on Networking, vol. 11, no. 5, pp. 782–795, Oct. 2003
  5-7. Max-Flow-Min-Cut Theorem
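The slide bodies for the max-flow min-cut theorem did not survive extraction, so the following is only an illustrative sketch, not the presenter's example. It runs a plain Edmonds-Karp max-flow on the classic unit-capacity butterfly network (the node names `s`, `a`, `b`, `c`, `d`, `t1`, `t2` are assumptions made for this sketch) and shows that each sink individually has max-flow 2, which is the rate that single-source multicast network coding aims to deliver to both sinks at once.

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Edmonds-Karp: repeatedly augment along a shortest residual path."""
    # Residual capacities, with reverse edges initialised to 0.
    residual = {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)
    neighbours = {}
    for u, v in residual:
        neighbours.setdefault(u, []).append(v)

    flow = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbours.get(u, []):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return flow  # no augmenting path left, so flow is maximal
        # Walk back from the sink, find the bottleneck, push that much flow.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        flow += bottleneck

# Butterfly network with unit-capacity edges (hypothetical node names).
butterfly = {
    ("s", "a"): 1, ("s", "b"): 1,
    ("a", "t1"): 1, ("b", "t2"): 1,
    ("a", "c"): 1, ("b", "c"): 1,
    ("c", "d"): 1,
    ("d", "t1"): 1, ("d", "t2"): 1,
}
flows = {t: max_flow(butterfly, "s", t) for t in ("t1", "t2")}
```

Each sink has only two incoming unit edges, so its min-cut is at most 2, and the computed max-flow meets that value, as the theorem says it must.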
  8. Some background on single-source multicast network coding
  9-10. Basic Network Model
  11. Local coding coefficients
  12. Global coding coefficients
  13. Matrix formulation
  14. The transfer matrix
  15. Proof of Theorem 2
  16. Proof of Theorem 3
  17. Some background on single-source multicast network coding
  18. Extension to multicast
  19. Part 2 - Outline
     ● Introduction
     ● The repair problem
     ● Reduction of the repair problem to a single-source multicast network
     ● Family of single-source multicast networks arising from the reduction
     ● A lower bound on min-cuts (equivalently, by the max-flow min-cut theorem, on the max-flow and hence on the coding capacity of the network)
     ● Minimization of storage bandwidth subject to this lower bound
  20. Distributed storage
     ● We are living in an internet age
     ● Demand for large-scale data storage has increased significantly
     ● Social networks and file and video sharing require seamless storage, access, and security for massive amounts of data
     ● Storage media (viz. hard drives) are individually unreliable
     ● Hence we introduce redundancy via erasure codes to improve reliability
  21-22. A storage code ((4,2) MDS)
     [Figure: a file is split into Fragment 1 (A1, A2) and Fragment 2 (B1, B2) and stored on four disks]
     Disk 1: A1, A2
     Disk 2: B1, B2
     Disk 3: A1+B1, A2+B2
     Disk 4: A2+B1, A1+A2+B2
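The MDS property of this (4,2) code (any two disks suffice to rebuild the file) can be checked mechanically. The sketch below assumes that "+" on the slide means bitwise XOR, i.e. addition over GF(2), and that Disk 3 holds A1+B1 and A2+B2 while Disk 4 holds A2+B1 and A1+A2+B2. Both are readings of the flattened figure, not something the transcript states outright. Each stored block is represented as a 4-bit mask over (A1, A2, B1, B2).

```python
from itertools import combinations

# Bit positions for the four file symbols (an encoding chosen for this sketch).
A1, A2, B1, B2 = 1, 2, 4, 8

# Blocks held by each disk, as XOR combinations of the file symbols.
disks = {
    1: [A1, A2],
    2: [B1, B2],
    3: [A1 ^ B1, A2 ^ B2],
    4: [A2 ^ B1, A1 ^ A2 ^ B2],
}

def gf2_rank(rows):
    """Rank of a set of bitmask rows over GF(2), by elimination."""
    rows = list(rows)
    rank = 0
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            # Clear that bit from every remaining row that has it.
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

# A pair of disks recovers the file iff its four blocks have full rank 4.
recoverable = {
    pair: gf2_rank(disks[pair[0]] + disks[pair[1]]) == 4
    for pair in combinations(disks, 2)
}
```

Under these assumptions all six disk pairs have rank 4, so losing any two of the four disks leaves the file recoverable.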
  24. Problem Definition
     ● Storage nodes are distributed and connected in a network
     ● Together they represent some storage code (MDS, or approximately MDS, such as LDPC)
     ● The repair problem arises when a storage node of the system fails
     ● The nodes that are still functioning are called active nodes
     ● A newcomer node, called the repair node, must connect to a subset of the active nodes, obtain information from them, and reconstruct the storage code, i.e., repair the code
     ● The objective is to minimize the amount of information transferred in this process
  25. Notation
  26. The repair problem
     [Figure: fragments y1, y2; storage nodes x1, x2, x3, x4; newcomer x5]
     Example: A (4,2) MDS code ( = repair bandwidth per node )
  27. The repair problem
     ● The data object (2 Mb) is divided into two fragments: y1, y2 (1 Mb each)
     ● 4 encoded fragments are generated: x1, x2, x3, x4 (1 Mb each)
     ● x4 fails; x5, the newcomer, needs to communicate with the existing nodes and create a new encoded packet
     ● Any two out of x1, x2, x3, x5 must suffice to recover the original data object
  28. The repair problem
     ● What (and how much) should x1, x2, x3 communicate to x5 such that are minimized?
     [Figure: fragments y1, y2; storage nodes x1, x2, x3, x4; newcomer x5]
     Example 1: A (4,2) MDS code
  29-30. Variants of the repair problem
     ● Exact repair: failed blocks are exactly regenerated, i.e., the newcomer node must reconstruct an exact replica of the encoded block on the failed node
     ● Functional repair: the newly generated data block need not be an exact replica of the encoded block on the failed node
     ● Exact repair of the systematic part: only the systematic part is repaired exactly, so an uncoded copy of the original file is always available
  31-32. Functional repair example (using RLNC)
     [Figure: each box is 0.5 Mb]
     File fragments: a1, b1, a2, b2
     Encoded data blocks:
       a1, b1
       a2, b2
       a1+b1+a2+b2, a1+2b1+a2+2b2
       a1+2b1+3a2+b2, 3a1+2b1+2a2+3b2
     Encoded repair packets (edge coefficients in parentheses):
       p1 = a1+2b1 (1, 2)
       p2 = 2a2+b2 (2, 1)
       p3 = 4a1+5b1+4a2+5b2 (3, 1)
     Repair node (edge coefficients 1, 1, 1, 1, 2, 2): 5a1+7b1+8a2+7b2 and 6a1+9b1+6a2+6b2
     Slide 32 adds the annotation: flow across this cut is the repair bandwidth
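The arithmetic in this functional repair example can be replayed as a small sketch. It reads the flattened figure as follows, which is an assumption: the three surviving nodes hold (a1, b1), (a2, b2) and (a1+b1+a2+b2, a1+2b1+a2+2b2), each sends one 0.5 Mb packet formed with the edge coefficients (1,2), (2,1) and (3,1), and the repair node mixes the three packets with coefficients (1,2,1) and (2,1,1). The slide does not name the field, so the sketch uses ordinary rational arithmetic. The names `combine`, `rank` and `stored` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def combine(coeffs, blocks):
    """Linear combination of coefficient vectors over (a1, b1, a2, b2)."""
    return [sum(c * b[j] for c, b in zip(coeffs, blocks)) for j in range(4)]

def rank(rows):
    """Rank over the rationals by Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(4):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

# Blocks on the three surviving nodes (assumed reading of the figure).
stored = {
    "node_a": [[1, 0, 0, 0], [0, 1, 0, 0]],      # a1, b1
    "node_b": [[0, 0, 1, 0], [0, 0, 0, 1]],      # a2, b2
    "node_c": [[1, 1, 1, 1], [1, 2, 1, 2]],      # two coded blocks
}

# One 0.5 Mb repair packet from each surviving node.
p1 = combine([1, 2], stored["node_a"])
p2 = combine([2, 1], stored["node_b"])
p3 = combine([3, 1], stored["node_c"])
packets = [p1, p2, p3]

# The repair node stores two fresh combinations of the packets.
stored["repaired"] = [combine([1, 2, 1], packets), combine([2, 1, 1], packets)]

# Functional repair succeeds if any two nodes still recover the file.
any_two_suffice = all(
    rank(stored[u] + stored[v]) == 4 for u, v in combinations(stored, 2)
)
repair_bandwidth_mb = Fraction(1, 2) * len(packets)
```

With these readings the packets and repaired blocks match the expressions on the slide, and the repair moves 1.5 Mb rather than the full 2 Mb file.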
  33-34. An attempt at solution
     [Figure: fragments y1, y2; storage nodes x1, x2, x3, x4; newcomer x5]
     Example 1: A (4,2) MDS code
     x5 recovers the original data object and creates a new independent linear combination
  35-36. Can we do better than this? YES!
  38. Reduction to information flow graph
  39. Example
     [Figure: information flow graph with source S, storage nodes x1 to x5 each split into an "in" and an "out" vertex, and data collector DC]
     Information flow graph corresponding to Example 1: A (4,2) MDS code. Node 4 has failed.
  40. Dynamic nature of information flow graph due to given failure pattern
     [Figure: same graph as slide 39]
     Information flow graph corresponding to Example 1: A (4,2) MDS code. Node 4 has failed.
  41. Family of information flow graphs
     [Figure: the graph of slide 39 extended with a second newcomer x6, split into "in" and "out" vertices]
     Information flow graph corresponding to Example 1: A (4,2) MDS code. Node 3 also failed, say a few minutes later.
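A minimal sketch of the single-failure information flow graph can make the role of the cut concrete. The edge capacities are not in the transcript, so the sketch assumes the usual construction from [1]: infinite-capacity edges from S to each original "in" vertex, an in-to-out edge of capacity alpha (storage per node), an edge of capacity beta from each helper's "out" vertex to the newcomer's "in" vertex, and infinite-capacity edges from the two nodes the data collector contacts. The parameter names `alpha` and `beta` and the helper functions are assumptions. The graph is tiny, so the min-cut is found by brute force over all partitions.

```python
from itertools import combinations

INF = float("inf")

def min_cut(edges, source, sink):
    """Brute-force minimum cut: try every partition separating source and sink."""
    inner = sorted({n for e in edges for n in e} - {source, sink})
    best = INF
    for size in range(len(inner) + 1):
        for chosen in combinations(inner, size):
            source_side = {source, *chosen}
            # Total capacity of edges leaving the source side.
            value = sum(c for (u, v), c in edges.items()
                        if u in source_side and v not in source_side)
            best = min(best, value)
    return best

def flow_graph(alpha, beta, collector_nodes):
    """Graph for the (4,2) example after x4 fails and x5 is repaired from x1..x3."""
    edges = {}
    for i in (1, 2, 3, 4):
        edges[("S", f"x{i}_in")] = INF            # source fills the original nodes
    for i in (1, 2, 3, 4, 5):
        edges[(f"x{i}_in", f"x{i}_out")] = alpha  # each node stores alpha
    for i in (1, 2, 3):
        edges[(f"x{i}_out", "x5_in")] = beta      # repair download from each helper
    for i in collector_nodes:
        edges[(f"x{i}_out", "DC")] = INF          # collector reads whole nodes
    return edges

def worst_case_cut(alpha, beta):
    """Smallest S-to-DC cut over all pairs of live nodes the collector may pick."""
    return min(min_cut(flow_graph(alpha, beta, pair), "S", "DC")
               for pair in combinations((1, 2, 3, 5), 2))
```

For example, `worst_case_cut(1, 0.5)` equals the 2 Mb file size, so 0.5 Mb from each of three helpers is enough in this graph, whereas `worst_case_cut(1, 0.25)` falls below 2 Mb and some data collector could not recover the file.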
  42. Lemma 1
  43. Outline
     ● The repair problem
     ● Reduction of the repair problem to a single-source multicast network
     ● Family of single-source multicast networks arising from the reduction
     ● A lower bound on min-cuts (equivalently, by the max-flow min-cut theorem, on the max-flow and hence on the coding capacity of the network)
     ● Minimization of storage bandwidth subject to this lower bound
  44-49. Information flow graph
     [Figure: information flow graph with source S]
  50. Proof
  51. WLOG
  53. Minimize subject to the lower bound
  54. Nature of constraint
  55-56. LHS of constraint as function of
  57. Solution to the optimization
  58-59. Simplification of solution
  60. Solution
  61. Minimum repair bandwidth
  62. Storage-Bandwidth Tradeoff
     Relationship between and [1]
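The equations on the optimization slides were lost in extraction. As a hedged sketch, the code below uses the min-cut bound as it is stated in [1], namely that the sum over i = 0..k-1 of min((d-i)·beta, alpha) must be at least the file size M, with total repair bandwidth gamma = d·beta. The symbols `alpha`, `beta`, `gamma`, `M`, `k`, `d` follow [1] and are assumptions with respect to the slides. For a given gamma it returns the smallest feasible storage per node, which traces out the storage-bandwidth tradeoff curve.

```python
from fractions import Fraction

def cut_bound(alpha, beta, k, d):
    """Left-hand side of the constraint: sum of min((d-i)*beta, alpha), i < k."""
    return sum(min((d - i) * beta, alpha) for i in range(k))

def min_storage(gamma, M, k, d):
    """Smallest alpha with cut_bound >= M for beta = gamma/d (assumes k <= d).

    Returns None when no alpha is large enough for this gamma.
    """
    gamma, M = Fraction(gamma), Fraction(M)
    beta = gamma / d
    caps = sorted((d - i) * beta for i in range(k))  # ascending saturation points
    if sum(caps) < M:
        return None
    # The bound is piecewise linear in alpha: on the j-th segment, j terms
    # are saturated at caps[:j] and the other k-j terms each equal alpha.
    saturated = Fraction(0)
    for j, cap in enumerate(caps):
        alpha = (M - saturated) / (k - j)
        if alpha <= cap:
            return alpha
        saturated += cap
    return None

# The (4,2) example: M = 2 Mb, k = 2, d = 3 helper nodes.
M, k, d = 2, 2, 3
msr_gamma = Fraction(M * d, k * (d - k + 1))            # minimum-storage point
mbr_gamma = Fraction(2 * M * d, 2 * k * d - k * k + k)  # minimum-bandwidth point
tradeoff = {g: min_storage(g, M, k, d) for g in (mbr_gamma, msr_gamma, Fraction(2))}
```

The two closed-form endpoints are the ones reported in [1]. For this example the minimum-storage point stores 1 Mb per node with 1.5 Mb of repair traffic, matching the functional repair example, while the minimum-bandwidth point stores 1.2 Mb per node and repairs with 1.2 Mb.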
  63. References
     ● [1] Dimakis, A.G.; Godfrey, P.B.; Wu, Y.; Wainwright, M.J.; Ramchandran, K., "Network Coding for Distributed Storage Systems," IEEE Transactions on Information Theory, vol. 56, no. 9, pp. 4539–4551, Sept. 2010
     ● [2] Koetter, R.; Medard, M., "An algebraic approach to network coding," IEEE/ACM Transactions on Networking, vol. 11, no. 5, pp. 782–795, Oct. 2003
     ● [3] Ho, T.; Lun, D., Network Coding: An Introduction, Cambridge University Press, New York, NY, USA, 2008
     ● [4] Dimakis, A.G.; Ramchandran, K.; Wu, Y.; Suh, C., "A Survey on Network Codes for Distributed Storage," Proceedings of the IEEE, vol. 99, no. 3, pp. 476–489, March 2011
