A Genetic Algorithm for
Clustered Steiner Tree Problem
(CluSteiner)
Nguyen Hai Phong, Le Minh Tu, Nguyen Viet Anh, and Do Tuan Anh
School of Information and Communication Technology, Hanoi University of Science and Technology, Vietnam
Table of Contents
1. Introduction
2. Related works
3. Problem formulation
4. Proposed algorithm
5. Computational results
6. Conclusion
7. Acknowledgement
Introduction
• The Inter-Domain Path Computation problem under Node-defined
Domain Uniqueness Constraint (IDPC-NDU) addresses the need to
route packets as efficiently and economically as possible in
large multi-domain networks.
Communication in multi-domain network
• The IDPC-DU can be applied to solve problems of finding the
shortest path in military communication or transportation.
Related works
• [1] L. Maggi, J. Leguay, J. Cohen, and P. Medagliani, “Domain clustering for inter-domain path
computation speed-up,” Networks, vol. 71, no. 3, pp. 252–270, 2018.
o Introduce two variants of the IDPC-DU and prove that they are NP-hard.
o Propose a dynamic programming approach whose complexity is $O(|V|^2 \, 2^{|D|} \, |D|^2)$ (where $|V|$ is
the number of nodes, and $|D|$ is the number of domains)
• [2] H. T. T. Binh, T. B. Thang, N. B. Long, N. V. Hoang, and P. D. Thanh, “Multifactorial
evolutionary algorithm for inter-domain path computation under domain uniqueness constraint,” in
2020 IEEE Congress on Evolutionary Computation (CEC). IEEE, 2020, pp. 1–8.
o Adopt the multifactorial evolutionary algorithm (MFEA) to solve the IDPC-EDU.
o Use a two-layer encoding based on Priority-based Encoding.
Problem formulation
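• Input:
o A weighted undirected graph $G = (V, E, w)$.
o A set of pairwise disjoint clusters $R = \{R_1, R_2, \dots, R_K\}$ with $R_i \subseteq V$ and $R_i \cap R_j = \emptyset$ for $i \neq j$.
• Output: $T = (V_T, E_T)$, a Steiner tree that spans all the
vertices in $R$.
• Objective: $\sum_{e_T \in E_T} w(e_T) \to \min$.
• Constraint: for each cluster $R_i$, a Steiner tree $T_i$ is a subtree of $T$ on $R_i$:
• $T_i = (R'_i, E_i)$, $R_i \subseteq R'_i$
• $R'_i \cap R'_j = \emptyset$, $\forall i \neq j$, $1 \le i, j \le K$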
An input graph of the CluSteiner
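The constraint above can be checked mechanically. The following is a minimal sketch, assuming each local tree is supplied as the set of vertices it uses; the function and parameter names (`local_trees_are_valid`, `clusters`, `local_tree_vertices`) are hypothetical and not taken from the slides. It only checks the two set conditions (coverage and disjointness), not that each local tree is actually connected.

```python
from itertools import combinations


def local_trees_are_valid(clusters, local_tree_vertices):
    """Check R_i <= R'_i for every cluster and R'_i & R'_j == {} for i != j.

    clusters: list of sets R_1..R_K (required vertices of each cluster).
    local_tree_vertices: list of sets R'_1..R'_K (vertices used by each local tree).
    """
    if len(clusters) != len(local_tree_vertices):
        return False
    # Every local tree must cover all vertices of its own cluster.
    for required, used in zip(clusters, local_tree_vertices):
        if not required <= used:
            return False
    # No two local trees may share a vertex.
    for a, b in combinations(local_tree_vertices, 2):
        if a & b:
            return False
    return True


clusters = [{1, 2}, {5, 6}]
print(local_trees_are_valid(clusters, [{1, 2, 3}, {5, 6, 4}]))  # True
print(local_trees_are_valid(clusters, [{1, 2, 3}, {5, 6, 3}]))  # False: vertex 3 is shared
```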
Problem formulation
Invalid and valid solutions (panels A, B, C, D)
Proposed algorithm: PGA
• If more than one edge goes from vertex 𝑖 to vertex 𝑗,
we keep only the one with the lowest weight and
discard the others.
• Apart from the domains containing the source and the
destination node, any domain whose indegree or
outdegree is equal to 0 is also removed.
Pre-filtering process
Simple graph after filtering
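The first pre-filtering rule can be sketched as follows. This is an illustrative sketch only: the function name `keep_lightest_parallel_edges`, the `(u, v, weight)` edge format, and the `directed` flag are assumptions, and the second rule (removing domains with zero indegree or outdegree) is not covered.

```python
def keep_lightest_parallel_edges(edges, directed=False):
    """Collapse parallel edges, keeping the lowest weight per vertex pair.

    edges: iterable of (u, v, weight) tuples.
    directed: if False, (u, v) and (v, u) are treated as the same edge.
    Returns a dict mapping each (u, v) pair to its lowest weight.
    """
    best = {}
    for u, v, w in edges:
        # For an undirected graph, normalise the pair so both directions collide.
        key = (u, v) if directed or u <= v else (v, u)
        if key not in best or w < best[key]:
            best[key] = w
    return best


raw = [(1, 2, 5), (2, 1, 3), (1, 2, 4), (2, 3, 7)]
print(keep_lightest_parallel_edges(raw))  # {(1, 2): 3, (2, 3): 7}
```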
Proposed algorithm: PGA
• Split the problem into two subproblems.
o Subproblem 1: find a local tree for each cluster such
that the local trees are mutually disjoint.
o Subproblem 2: establish the connections between the
local trees.
Two-level Genetic Algorithm
Proposed algorithm : PGA
Individual Representation
Simple graph G
Simple graph 𝐺𝐷
• Each chromosome is a permutation of 1 to K that gives the
order in which the clusters' local trees are built.
• Following the order on the chromosome, build the graph
$G_k = (V_k, E_k)$ for cluster k:
o $V_k = \{d_1, d_2, \dots, d_t\}$ consists of the terminal vertices of
cluster k and the free intermediate vertices
o $E_k = \{e_{ij}\}$ consists of the edges connecting two vertices
$d_i, d_j \in V_k$
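The construction of the per-cluster graph can be sketched as below. The slides do not define "free intermediate vertices" precisely; this sketch assumes they are the vertices that belong to no cluster and have not been used by an earlier local tree. The function name `build_cluster_graph` and the edge-dictionary format are hypothetical.

```python
def build_cluster_graph(edges, cluster, all_clusters, used_vertices):
    """Return (V_k, E_k) for one cluster.

    edges: dict {(u, v): weight} describing the input graph.
    cluster: set of terminal vertices of cluster k.
    all_clusters: list with the terminal sets of all clusters.
    used_vertices: vertices already taken by previously built local trees.
    """
    terminals_anywhere = set().union(*all_clusters)
    vertices_in_graph = {x for pair in edges for x in pair}
    # Assumption: a vertex is "free" if it is in no cluster and not yet used.
    free = vertices_in_graph - terminals_anywhere - set(used_vertices)
    v_k = set(cluster) | free
    # Keep only the edges with both endpoints inside V_k.
    e_k = {(u, v): w for (u, v), w in edges.items() if u in v_k and v in v_k}
    return v_k, e_k


edges = {(1, 3): 1, (3, 2): 1, (3, 4): 2, (4, 5): 1}
clusters = [{1, 2}, {5}]
print(build_cluster_graph(edges, clusters[0], clusters, set()))
```

Processing the clusters in chromosome order and adding each finished local tree's vertices to `used_vertices` is one way to keep later local trees disjoint from earlier ones.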
Proposed algorithm : PGA
• A chromosome is represented as an order of visited domains.
• A chromosome consists of an array of integers, each
element of which specifies a domain’s priority.
Individual Representation
A representation of a chromosome in graph
Simple graph 𝐺𝐷
Proposed algorithm : PGA
• After building the graph $G_k$ of cluster k, use the
SPH algorithm to find the local tree of that cluster.
• Once every cluster has its local tree, build a new
graph $G_0 = (V_0, E_0)$:
o $V_0 = \{d_1, d_2, \dots, d_k\}$ is the set of the clusters' local trees
together with the free intermediate vertices
o $E_0 = \{e_{ij}\}$ consists of the edges connecting two vertices
$d_i, d_j \in V_0$
• Apply Prim's algorithm on $G_0$.
Decoding method
Graph 𝐺′ which is constructed
from the above chromosome
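The local-tree step of the decoding can be sketched as follows. The slides do not spell out SPH; this sketch assumes it denotes the shortest path heuristic for Steiner trees, which grows a tree from one terminal by repeatedly attaching the nearest unconnected terminal along a shortest path. The function name `sph_steiner_tree` and the adjacency format are hypothetical, and the second step (Prim's algorithm on the contracted graph) is not shown.

```python
import heapq


def sph_steiner_tree(adj, terminals):
    """Shortest path heuristic: attach the nearest unconnected terminal each round.

    adj: dict {u: {v: weight}} for an undirected graph (both directions present).
    terminals: iterable of vertices that must be spanned.
    Returns (tree_edges, total_weight); tree_edges is a set of frozenset({u, v}).
    Raises ValueError if some terminal cannot be reached.
    """
    remaining = set(terminals)
    if not remaining:
        return set(), 0
    start = min(remaining)  # deterministic choice of the first terminal
    remaining.discard(start)
    tree_vertices = {start}
    tree_edges = set()
    total = 0
    while remaining:
        # Multi-source Dijkstra started from every vertex already in the tree.
        dist = {v: 0 for v in tree_vertices}
        parent = {}
        heap = [(0, v) for v in tree_vertices]
        heapq.heapify(heap)
        target = None
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue  # stale queue entry
            if u in remaining:
                target = u  # nearest terminal not yet in the tree
                break
            for v, w in adj.get(u, {}).items():
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    parent[v] = u
                    heapq.heappush(heap, (nd, v))
        if target is None:
            raise ValueError("some terminal is not reachable")
        # Walk back from the terminal to the tree, adding the path.
        node = target
        while node not in tree_vertices:
            prev = parent[node]
            tree_edges.add(frozenset((node, prev)))
            total += adj[prev][node]
            tree_vertices.add(node)
            node = prev
        remaining.discard(target)
    return tree_edges, total


graph = {
    1: {2: 1, 3: 5},
    2: {1: 1, 3: 1},
    3: {1: 5, 2: 1, 4: 1},
    4: {3: 1},
}
edges, weight = sph_steiner_tree(graph, [1, 4])
print(sorted(sorted(e) for e in edges), weight)
```

SPH is a heuristic, so the tree it returns is not guaranteed to be a minimum Steiner tree.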
Proposed algorithm : PGA
Apply Order Crossover (OX), which constructs
an offspring by copying a substring from one
parent and preserving the relative order of the
remaining elements from the other parent.
Crossover
OX generates two offspring
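A minimal sketch of OX on permutation chromosomes is given below. It follows the common textbook variant in which the remaining positions are filled starting just after the copied substring and wrapping around; whether the authors use exactly this variant is an assumption. The names `order_crossover` and `ox_pair` are hypothetical.

```python
import random


def order_crossover(parent_a, parent_b, start, end):
    """Build one OX child: copy parent_a[start:end], fill the rest in parent_b's order."""
    size = len(parent_a)
    child = [None] * size
    child[start:end] = parent_a[start:end]
    kept = set(parent_a[start:end])
    # Genes of parent_b in their relative order, read from the second cut point.
    fill = [g for g in list(parent_b[end:]) + list(parent_b[:end]) if g not in kept]
    # Empty positions, also visited from the second cut point with wrap-around.
    positions = list(range(end, size)) + list(range(0, start))
    for pos, gene in zip(positions, fill):
        child[pos] = gene
    return child


def ox_pair(parent_a, parent_b, rng=random):
    """Produce two offspring from one pair of random cut points."""
    start, end = sorted(rng.sample(range(len(parent_a) + 1), 2))
    return (order_crossover(parent_a, parent_b, start, end),
            order_crossover(parent_b, parent_a, start, end))


p1 = [1, 2, 3, 4, 5, 6, 7, 8, 9]
p2 = [9, 3, 7, 8, 2, 6, 5, 1, 4]
print(order_crossover(p1, p2, 3, 7))  # [3, 8, 2, 4, 5, 6, 7, 1, 9]
```

Because every gene is placed exactly once, each offspring is again a valid permutation of the clusters.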
Proposed algorithm : PGA
Use Swapping Mutation (SW), which swaps the positions
of two randomly chosen clusters on the chromosome.
Mutation
SW works on a chromosome
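The swap operator can be sketched in a few lines. The function name `swap_mutation` and the optional `rng` parameter are hypothetical; the sketch assumes the chromosome is a sequence of cluster indices.

```python
import random


def swap_mutation(chromosome, rng=random):
    """Return a copy of the chromosome with two distinct random positions exchanged."""
    mutated = list(chromosome)
    if len(mutated) < 2:
        return mutated  # nothing to swap
    i, j = rng.sample(range(len(mutated)), 2)
    mutated[i], mutated[j] = mutated[j], mutated[i]
    return mutated


print(swap_mutation([1, 2, 3, 4, 5], random.Random(0)))
```

The result is still a permutation, so no repair step is needed after mutation.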
Computational results
• Instance datasets are generated based on a set of Non-Euclidean CluSPT [1].
• The datasets are categorized into two kinds regarding dimensionality: small
instances, each of which has between 30 and 120 vertices, and large instances, each
of which has over 260 vertices.
Problem instances
[1] Thanh Pham Dinh. “CluSPT Instances”, Mendeley Data, V3 (2019). DOI: 10.17632/b4gcgybvt6.3.
Computational results
• To evaluate the performance of the proposed algorithm in detail, we perform three
experiments.
Computational results
• The SPH algorithm is compared with an MST algorithm in the previous work [1].
• SGA setups: the random mating probability (pc) is 0.8, the mutation rate (pm) is 0.05.
• All algorithms maintain a population with 100 individuals under 50000 evaluations.
Experimental setup
[10] D. Sudholt and C. Thyssen, “A simple ant colony optimizer for stochastic shortest path problems,” Algorithmica, vol.
64, no. 4, pp. 643–672, 2012
Computational results
• Let 𝐴 be the set of tested algorithms, and 𝐼 be the set of instances.
• Relative Percentage Difference (RPD):
o $S^{i}_{ar}$ is the cost value obtained by algorithm 𝑎 in 𝐴, for instance 𝑖 in 𝐼, on the 𝑟-th run.
o $B_i$ is the best-known solution value for instance 𝑖, among all algorithms and previous
results.
• The smaller the RPD value, the better the result found by the algorithm.
Experimental setup
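The RPD formula itself did not survive in the slide text. The sketch below assumes the usual definition, RPD = 100 × (S − B) / B, where S is the cost of one run and B the best-known value for the instance; the function names `rpd` and `mean_rpd` are hypothetical.

```python
def rpd(cost, best_known):
    """Relative percentage difference of one run's cost from the best-known value.

    Assumed definition: 100 * (cost - best_known) / best_known.
    """
    if best_known == 0:
        raise ValueError("best-known value must be non-zero")
    return 100.0 * (cost - best_known) / best_known


def mean_rpd(costs, best_known):
    """Average RPD over all runs of one algorithm on one instance."""
    return sum(rpd(c, best_known) for c in costs) / len(costs)


print(rpd(120, 100))                   # 20.0
print(mean_rpd([100, 110, 120], 100))  # 10.0
```

An RPD of 0 means the run matched the best-known value for that instance.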
Computational results
Experimental results
• Overall, the proposed algorithm
PGA improves on S-ACO by 49.8%
on average, and the largest gap
recorded is 83.3%.
The improvement percentage (PI) of PGA and S-ACO
Computational results
Experimental results
The obtained RPD values of all algorithms
Computational results
Experimental results
Statistical values from the Wilcoxon signed-rank test with α = 0.05;
OPT is the optimal solution
Conclusion
• Introduce a two-level algorithm termed PGA to grapple with the IDPC-NDU.
• A pre-filtering process and a new chromosome encoding method that decreases the
chromosome length to the number of domains are also integrated.
• Experiments and comparisons with several algorithms on various-sized data sets were
conducted to evaluate the proposal’s efficiency.
Acknowledgement
This research was sponsored by the U.S. Army Combat Capabilities Development
Command (CCDC) Pacific and CCDC Army Research Laboratory (ARL) under Contract
Number W90GQZ-93290007 for Huynh Thi Thanh Binh. Ta Bao Thang was funded by
Vingroup Joint Stock Company and supported by the Domestic Master/ Ph.D. Scholarship
Programme of Vingroup Innovation Foundation (VINIF), Vingroup Big Data Institute
(VINBIGDATA), code VINIF.2020.ThS.BK.01.
THANK YOU

Editor's Notes

  • #15 We first generated three parameters for each instance: the number of nodes, the number of domains, and the number of edges. After that, an optimal path p is created in which every edge has weight 1 and the number of domains on p is approximately the input graph’s domain number. Next, noise is added to the instance: for every node in p, besides random-weight edges, we add several random one-weight edges from that node to other nodes not in p, as well as some random edges whose weights are greater than the total cost of p. These traps make it harder for simple greedy algorithms to find the optimal solution. In Type 2 in particular, feasible paths whose length is less than three are removed.
  • #20 The first quartile (LQ) and third quartile (UQ) are the bottom and top of the box, respectively, and the line inside the box denotes the median. The inter-quartile range (IQR) is the difference between UQ and LQ. The smaller the RPD values, the better the algorithm performs. All statistics (LQ, UQ, median) of PGA are lower than those of S-ACO, which shows that the results of PGA are more stable and accurate than those of S-ACO on both small and large instances.
  • #21 Wilcoxon signed-rank tests are conducted on the obtained RPD values, as shown in Table IV, to bound how far the obtained solutions are from the optima. All obtained p-values are less than α = 0.05, which confirms a significant difference between each pair of algorithms. A positive decision (+) means that PGA outperforms the other algorithm with statistical significance, while a negative decision (-) means there is no evidence that the performance gap is statistically significant. The obtained statistical values indicate that the results of PGA are at most twice the optimal results.