ECE 565 FInal Project

Technology Mapping for Area
Minimization without breaking DAGs
into trees(updated red)
Lakshmi Yasaswi Kamireddy
RajKumar Balachandaran

Problem Formulation :
Technology Mapping for a DAG w/o breaking the DAG into trees
Heuristics for Area optimal Mapping
General ASICs hence the problem is very complex (NP - hard)
A cover is a collection of pattern graphs such that every node of the subject graph is
contained in one or more of the pattern graphs .The cover is further constrained so that each
input required by a pattern graph is actually an output of some other pattern.
For minimum area ,cost of cover is the sum of areas of gates in the cover .The technology
mapping problem is the optimization problem of finding a minimum cost covering of the
subject graph by choosing from the collection of pattern graphs for all gates in the library.

Introduction:
Library:
DAG:
TM:
Area cost = 23
Area cost = 19
Area cost = 14
http://www.ece.tamu.edu/~sunil

Earlier Algorithms for Area optimal mapping
Decomposing DAG into forest of trees and apply Technology Mapping for the forest of trees.
Disadvantage: Loss of some optimality
Advantage: Can be solved in polynomial time
Every node that has fan out greater than one will lead to creation of new tree
[K. Keutzer, IEEE DAC’87]

Tech Mapping for FPGAs [Flowmap-r]
 Maximum Fan out Free Cone (MFFCV) - is an FFC of v such that for any non-PI node w, if
output(w) is a subset of MFFCV , then w is a node element of MFFCV
Algorithm:
This algorithm is a Dynamic Programming based algorithm
first form MFFCs for each individual node starting from nodes connected to output nodes
that is all the internal nodes MFFCs should also be formed then the algorithms starts
[Cong & Ding,IEEE DAC’93]

Tech Mapping for FPGAs [Flowmap-r] contd
 Form all partitions P (X, X̅) in MFFCV such that X̅ is a FFCV , input (X̅) should be no more than K and hence the cut P of
MFFCV is K-feasible and can be mapped by K-input LUT
 For each k-feasible cut P = (X,X̅) of MFFCV
- Cover X̅ with a K-LUT LUTV
P and partition X= MFFCV – X̅ into a set of disjoint MFFCs MFFCV1
P, MFFCV2
P…..
MFFCVM
P
- Then we recursively compute area optimal DF-mapping of each MFFCVi
P (1 ≤ i ≤ m)
- Cost P = 1 + 𝑖=1
𝑚
𝑎𝑟𝑒𝑎(MFFCVi
P) is computed where area(MFFCVi
P) is the area of the optimal DF-mapping
 Choose the cut P with the least cost and this cost (P) gives the best area DF-mapping of MFFCV
 Repeat the algorithm for all other N- MFFCV MFFC’s
[Cong & Ding,IEEE DAC’93]

Tech Mapping for DAG w/o Breaking into Trees
 Extension of Flowmap-r --Approach1
 Divide and conquer with DP --Approach2
Why MFFC formation:
 Primary objective of forming MFFC is not to break DAGs into trees.
 Secondary objective is that forming MFFCs makes the bigger problem into a set
of sub problems i.e, each MFFC is a sub DAG of the bigger DAG.

The previous algorithm [Flowmap-r] was based on K-input LUTs but a standard library can have gates varying from 1 to K-
input. The following is the algorithm for technology mapping without breaking into trees using extension of Flowmap-r.
Algorithm:
1. Form all partitions P (X, X̅ ) in MFFCV such that X̅ is a FFCV , input (X̅) should be equal to K and check if at least 1
partition in set P matches a library gate and if it does then proceed to Step 2 .
• Else, clear the partition set P and form new partitions P (X, X̅) such that X̅̅ is a FFCV, input (X¯) should be equal to K-
1 and check if at least 1 partition in set P matches a library gate and if it does then proceed to Step 2
• Else keep on continuing with input(X̅) =K-2. Map able proceed to step 2
• Therefore finally if none of the before formed partition does not match a library gate then we will take only the root
node as our FFCv because a single node of any gate type is map able in the library.
2. For each k-feasible cut P = (X,X̅) of MFFCV
• Cover X̅ with a K-input Library gate and partition X= MFFCV – X̅ into a set of disjoint MFFCs MFFCV1P,
MFFCV2P….. MFFCVMP
• Then we recursively compute area optimal mapping of each MFFCViP (1 ≤ i ≤ m)
• Cost P = 1 + 𝑖=1
𝑚
𝑎𝑟𝑒𝑎(MFFCViP) is computed where area(MFFCViP) is the area of the optimal mapping
3. Choose the cut P with the least cost and this cost (P) gives the best area DF-mapping of MFFCV
4. Repeat the algorithm for all other N- MFFCV MFFC’s
Approach 1 – Extension of Flowmap-r

 Problem – Best area cover for DAG.
 Form MFFC s for each node in the circuit.
 Sub problem set 1- Best area cover for MFFCs of
output nodes and the other MFFCs that are left
over in the circuit.
 Divide the problem until leaf nodes which are the
nodes connected to Primary Inputs.
 Find the least cost area for the leaf node which will
be the gate itself.
 Then add the leaf node solutions with the node for
which the leaf nodes are inputs to form the sub
problem solution.
 Do this until you reach the output nodes and the
entire circuit is covered.
Sub problem 1
Sub problem 3
Sub problem 2
Approach 2 – Divide and conquer
The solution cost might be equal to the tree.
But the time complexity will be higher than the
tree.

In each sub problem again form sub problems until a
leaf problem is encountered (kind of a Divide and
Conquer approach)
Keep on dividing until we reach the leaf nodes.
Once we reach the leaf node find the best area gate to
map with the leaf node
Combine the solution of the leaf nodes(36,37) to
form the solution of the sub problem(38).
For each sub problem we get a best cost and we
combine with other sub problems(combine 38 and 39
to form 40) until we reach the main problem(41).
Approach 2 – Divide and conquer

MFFC Creation Algorithm
To get mffc, we perform the process in 2 steps.
• First generate mffc_dictionary, where for each node, we add itself and also, its
children with only one parent
• For children with shared parents, check if all the parents are in the grand parents,
then add child to each of grandparent
• Repeat the steps for every level on DAG and fill the mffc_dictionary
New slide

MFFC Pseudo Code
• def rget_mffc_single_parent(G, list_of_nodes, mffc_dict):
# temp list
temp_list = set()
for node in list_of_nodes:
# add node itself
mffc_dict[node].add(node)
# add parents of leaf to a new list
temp_list |= set(G.successors(node))
# check for children
children = G.predecessors(node)
if len(children) > 0:
for child in children:
if len(G.successors(child)) == 1:
mffc_dict[node] |= mffc_dict[child]
if len(temp_list) != 0:
rget_mffc(G, temp_list, mffc_dict)
return mffc_dict

MFFC Pseudo Code contd
• def rget_mffc(G, list_of_nodes, mffc_dict):
temp_list = set()
for node in list_of_nodes:
mffc_dict[node].add(node) //add node to itself
temp_list |= set(G.successors(node)) //get its parents
children = G.predecessors(node) //get its children for the current node
if len(children) > 0: //if it is not leaf node
for child in children:
parents_list = G.successors(child) //get its parent list
for parent in parents_list:
grand_parents_set |= set(G.successors(parent)) //get its grandparent
list
for grand_parent in grand_parents_set:
if set(parents_list).issubset(set(mffc_dict[grand_parent])):
mffc_dict[grand_parent].add(child)
• if len(temp_list) != 0:
rget_mffc(G, temp_list, mffc_dict)
return mffc_dict

Mappable MFFCs
• From the library for each logic expression extract variables and extract
operators based on parentheses.
• Pass inputs combinations to a parser and generate the truth tables
• This truth table parsing function is used to generate truth table for the
library gates
• So the outputs for all combination of inputs are compared with the
MFFC outputs and the map able MFFC are generated.
New slide

Best Possible Cover for minimum Area
• Our problem is to find the maximal cover (number of nodes) with minimal area.
• For the algorithm we have explained this turned out to be a weighted set cover problem which is one of the NP-complete
problems.
Set cover problem :Given a set of elements {1,2,...,m} (called the universe) and collection S of n sets whose union equals the
universe, the set cover problem is to identify the smallest sub-collection of S whose union equals the universe.
For example, consider the universe U={1,2,3,4,5} and the collection of sets S={{1,2,3},{2,4},{3,4},{4,5}}. Clearly the union
of S is U. However, we can cover all of the elements with the following, smaller number of sets: {{1,2,3},{4,5}}
In our case we find the min area MFFC to cover the entire DAG.
https://en.wikipedia.org/wiki/Set_cover_problem
New slide

Best Possible Cover for minimum Area
Inputs to the function are list of mappable MFFC ,G(V,E)
The nodes in the graph will be the universal set and the nodes in the MFFCs are the subsets.
Initially
area =0; // area of the nodes covered
Covered nodes=0 //number nodes covered
Create a priority queue sorted basing on least area MFFC
For each MFFC in list of MFFCs
Insert into priority queue
While covered nodes < length(number of nodes in DAG)
{
Pop an MFFC from priority queue
Update the area
Update covered nodes
Update sets with newly covered MFFC
{
For each newly covered MFFC
For each MFFC in the number of nodes in DAG
If this MFFC is not the popped from the priority queue
{Then discard this MFFC from the list of MFFCs
And push it to priority queue}
}
}
All the combinations are pushed
continuously into the priority queue
and the best is picked from it.
New slide

Sample results
27 --> 27
28 --> 28
29 --> 29
30 --> 27, 30
31 --> 29, 31
32 --> 32
33 --> 33
34 --> 34
35 --> 35
36 --> 36
37 --> 37
38 --> 36, 37, 38
39 --> 39
40 --> 36, 37, 38, 39, 40
41 --> 36, 37, 38, 39, 40, 41
42 --> 34, 42
43 --> 32, 33, 43
44 --> 27, 28, 29, 30, 31, 44
45 --> 45
46 --> 46
47 --> 32, 33, 34, 42, 43, 47
48 --> 27, 28, 29, 30, 31, 44, 45, 48
49 --> 49
50 --> 27, 28, 29, 30, 31, 44, 45, 48, 50
51 --> 32, 33, 34, 42, 43, 47, 51
52 --> 27, 28, 29, 30, 31, 32, 33, 34, 42, 43, 44, 45, 47, 48, 49, 50, 51, 52
53 --> 53
54 --> 27, 28, 29, 30, 31, 32, 33, 34, 42, 43, 44, 45, 47, 48, 49, 50, 51, 52, 53, 54
MFFC

1,BUFX1,1856.00,a
1,INVX1,928.00,!(a)
3,AND3X1,1856.00,a*(b*(c))
2,AND2X1,1392.00,a*(b)
4,AOI211X1,4176.00,!(a*(b+(c+(d)))+!a*(c+(d)))
3,AOI21X1,3248.00,!(a*(b+(c))+!a*(c))
4,AND4X1,2320.00,a*(b*(c*d))
4,AOI31X1,5104.00,!(a*(b*(c+(d))+!b*(d))+!a*(d))
6,AOI222X1,9744.00,!(a*(b+(c*(d+(e*(f)))+!c*(e*(f))))+!a*(c*(d+(e*(f)))+
!c*(e*(f))))
5,AOI221X1,7888.00,!(a*(b+(c*(d+(e))+!c*(e)))+!a*(c*(d+(e))+!c*(e)))
4,AOI22X1,4176.00,!(a*(b+(c*(d)))+!a*(c*(d)))
4,AOI2UM2X1,4640.00,!(a*(c*(d))+!a*((c*(d))+!b))
3,AOI2UM1X1,3712.00,!(a*(c)+!a*((c)+!b))
6,AOI33X1,8352.00,!(a*(b*(c+(d*(e*(f))))+!b*(c*(e*(f))))+!a*(c*(e*(f))))
4,AO22X1,4176.00,a*(b+(c*(d)))+!a*(c*(d))
3,AO21X1,3248.00,a*(b+(c))+!a*(c)
4,NAND4X1,3766.00,!(a*(b*(c*(d))))
4,NAND4UMX1,2784.00,a+(b+!(c*(d)))
4,NAND4BX1,2784.00,a+!(b*(c*(d)))
3,NAND3BX1,2320.00,a+!(b*(c))
2,NAND2BX1,1856.00,a+!(b)
4,NAND4X6,2320.00,!(a*(b*(c*(d))))
4,NOR4X1,2320.00,!(a+(b+(c+(d))))
3,NOR3X1,1856.00,!(a+(b+(c)))
2,NOR2X1,1392.00,!(a+(b))
4,NOR4UMX1,3248.00,!(((c+(d))+!b)+!a)
4,NOR4BX1,2784.00,!((b+(c+(d)))+!a)
3,NOR3BX1,2320.00,!((b+(c))+!a)
2,NOR2BX1,1856.00,!((b)+!a)
4,NOR4X6,2320.00,!(a+(b+(c+(d))))
4,OR4X1,2320.00,a+(b+(c+(d)))
3,OR3X1,1856.00,a+(b+(c))
2,OR2X1,1392.00,a+(b)
5,OAI221X1,7888.00,!(a*(c*(e)+!c*(d*(e)))+!a*(b*(c*(e)+!c*(d*(e)))))
4,OAI22X1,4176.00,!(a*(c+(d))+!a*(b*(c+(d))))
6,OAI33X1,7888.00,!(a*(d+(e+(f)))+!a*(b*(d+(e+(f)))+!b*(c*(d+(e+(f))))))
5,OAI32X1,6496.00,!(a*(c+(d))+!a*(b*(c+(d))+!b*(e*(c+(d)))))
5,OAI31X1,5104.00,!(a*(c)+!a*(b*(c)+!b*(d*(e))))
6,OAI222X1,9744.00,!(a*(c*(e+(f))+!c*(d*(e+(f))))+!a*(b*(c*(e+(f))+!c*(d
*(e+(f))))))
4,OA22X1,4176.00,a*(c+(d))+!a*(b*(c+(d)))
3,OA21X1,3248.00,a*(c)+!a*(b*(c))
3,OAI2UM1X1,4176.00,a*(b+!(c))+!a*!(c)
2,XNOR2X1,3248.00,a*(b)+!a*!(b)
3,XOR3X1,7888.00,a*(b*(c)+!b*!(c))+!a*!(b*(c)+!b*!(c))
2,XOR2X1,3248.00,!(a*(b)+!a*!(b))
3,XNOR3X4,7888.00,!(a*(b*(c)+!b*!(c))+!a*!(b*(c)+!b*!(c)))
3,XNOR3X2,7888.00,!(a*(b*(c)+!b*!(c))+!a*!(b*(c)+!b*!(c)))
4,type1,3248.00,!(!(!(a*(b)*c)*(d)))
2,type2,1392.00,!(a*(b))
3,type3,1856.00,!(a*(b*c))
2,type4,3248.00,(a*(!b))+(b*(!a))
2,type5,2784.00,(a*(!b))
New slideLibrary

Sample results
The area calculated is only for the gates that are mapped from the library in
Mappable MFFC’s and the type of gate it is mapped to and the area for mappable
part.
NOTE:Please ignore the percentage it is for our reference to check for how much
percentage of the file has completed processing.
Number of nodes 6
Number of edges 6
Number of inputs 5
MFFC 16.67% 5 [1, 1, 1, 0] [1, 1, 1, 0] type2 1392.0
MFFC 33.33% 7 [1, 1, 1, 0] [1, 1, 1, 0] type2 1392.0
MFFC 50.00% 6 [1, 1, 1, 0] [1, 1, 1, 0] type2 1392.0
MFFC 83.33% 8 [1, 1, 1, 0] [1, 1, 1, 0] type2 1392.0
Mappable MFFC's 4
[['5'], ['7'], ['6'], ['8']]
Area=5568.0
IPT_0 [label = IPT ];
NND_5 [gate = NND ];
IPT_0 -> NND_5 [ name = 0 ];
IPT_2 -> NND_5 [ name = 1 ];
IPT_2 -> NND_6 [ name = 2 ];
IPT_3 -> NND_6 [ name = 3 ];
IPT_1 -> NND_7 [ name = 4 ];
NND_6 -> NND_7 [ name = 5 ];
NND_6 -> NND_8 [ name = 6 ];
IPT_4 -> NND_8 [ name = 7 ];
NND_5 -> NND_9 [ name = 8 ];
NND_7 -> NND_9 [ name = 9 ];
NND_7 -> NND_10 [ name = 10 ];
NND_8 -> NND_10 [ name = 11 ];
10 --> 10, 8
5 --> 5
6 --> 6
7 --> 7
8 --> 8
9 --> 5, 9
Dot file
MFFC
Final OutputC17.bench New slide

IPT_0 -> NND_6 [ name = 0 ];
IPT_1 -> NND_6 [ name = 1 ];
IPT_2 -> NND_7 [ name = 2 ];
IPT_3 -> NND_7 [ name = 3 ];
IPT_4 -> NND_8 [ name = 4 ];
IPT_5 -> NND_8 [ name = 5 ];
NND_6 -> NND_9 [ name = 6 ];
NND_7 -> NND_9 [ name = 7 ];
NND_7 -> NND_10 [ name = 8 ];
NND_8 -> NND_10 [ name = 9 ];
NND_9 -> NND_11 [ name = 10 ];
NND_10 -> NND_11 [ name = 11 ];
10 --> 10, 8
11 --> 10, 11, 6, 7, 8, 9
6 --> 6
7 --> 7
8 --> 8
9 --> 6, 9
Number of nodes 6
Number of edges 6
Number of inputs 6
MFFC 33.33% 7 [1, 1, 1, 0] [1, 1, 1, 0] type2 1392.0
MFFC 50.00% 6 [1, 1, 1, 0] [1, 1, 1, 0] type2 1392.0
MFFC 83.33% 8 [1, 1, 1, 0] [1, 1, 1, 0] type2 1392.0
Mappable MFFC's 3
[['7'], ['6'], ['8']]
Area = 4176.0
C1 example we have taken
Dot file
MFFC
Final Output
For bigger files please check the input output folder
New slide

References
[1] K. Keutzer and D. Richards. Computational complexity of logic synthesis and optimization in
Proceedings of International Workshop on Logic Synthesis, 1989.
[2] K. Keutzer. Dagon: Technology binding and local optimization by dag matching in Proceedings of
24th ACM/IEEE Design Automation Conference, 1987.
[3] J Cong , Y Ding “On Area/Depth Trade-off in LUT-Based FPGA Technology Mapping”, Design
Automation Conference (DAC), 1993
[4] Performing Technology Mapping and Optimization by DAG covering: A review of Traditional
Approaches.

ECE 565 FInal Project

Recommended

Recommended

More Related Content

What's hot

What's hot (15)

Viewers also liked

Viewers also liked (9)

Similar to ECE 565 FInal Project

Similar to ECE 565 FInal Project (20)

More from Lakshmi Yasaswi Kamireddy

More from Lakshmi Yasaswi Kamireddy (9)

Recently uploaded

Recently uploaded (20)

ECE 565 FInal Project