By
Kailash Chand Meena
(13EC35032)
under the supervision of
Prof. Santanu Chattopadhyay
Department of Electronics and Electrical Communication
Engineering
IIT Kharagpur
1. Introduction:
➢ Application mapping is one of the most important dimensions of Network-on-Chip (NoC)
research. It affects the overall performance and power requirement of the system.
➢ Rapid progress in technology scaling makes transistors smaller and faster with each
generation, so the number of IP cores in a system keeps increasing, but the power
consumption of a transistor no longer scales down in proportion.
➢ The increasing number of IP cores in a multi-processor system-on-chip makes it more
challenging to find the optimum core-to-router mapping.
➢ A significant proportion of the power consumed is dissipated directly as heat, and an
increase in power density can lead to several other problems.
➢ Because application mapping can spread out high-power components, it is a potentially
good approach to mitigating the looming issue of hotspots in many-core processors.
Terminology in Application Mapping
• Application: An application consists of a set of tasks, each of which is implemented
by an IP core.
• IP Cores: The functional modules of an NoC are known as intellectual property (IP) cores.
• Hop count: The distance a message travels from the source router to the destination
router through the router fabric is measured in hops.
• Core Graph: An application can be represented as a core graph, with each vertex
representing an IP core and each directed edge representing the communication between
two cores. For example, the video application VOPD (video object plane decoder) consists
of 16 cores and DVOPD (dual video object plane decoder) consists of 32 cores.
Core Graph for VOPD
Bandwidth Unit: MB/s
Core Graph for VOPD generated by TGFF Tool
16
0 70 INF INF INF INF INF INF INF INF INF INF INF INF INF INF
70 0 362 INF INF INF INF INF INF INF INF INF INF INF INF INF
INF 362 0 362 INF INF INF INF INF INF INF INF INF INF INF INF
INF INF 362 0 362 INF INF INF INF INF INF INF INF INF INF 49
INF INF INF 362 0 357 INF INF INF INF INF INF INF INF INF 27
INF INF INF INF 357 0 353 INF INF INF INF 16 INF INF INF INF
INF INF INF INF INF 353 0 300 INF INF INF INF INF INF INF INF
INF INF INF INF INF INF 300 0 313 500 INF INF INF INF INF INF
INF INF INF INF INF INF INF 313 0 407 INF 16 INF INF INF INF
INF INF INF INF INF INF INF 500 407 0 INF INF INF INF INF INF
INF INF INF INF INF INF INF INF INF INF 0 16 INF INF 16 INF
INF INF INF INF INF 16 INF INF 16 INF 16 0 16 INF INF INF
INF INF INF INF INF INF INF INF INF INF INF 16 0 157 16 INF
INF INF INF INF INF INF INF INF INF INF INF INF 157 0 16 INF
INF INF INF INF INF INF INF INF INF INF 16 INF 16 16 0 INF
INF INF INF 49 27 INF INF INF INF INF INF INF INF INF INF 0
➢ Mesh Topology:
• The mesh topology is one of the most common network topologies because it provides a
regular structure with short interconnects, a high bisection width, and a modular
architecture for the NoC with equal-sized links.
2. What is the Application Mapping Problem?
• The core graph of an application is a directed graph $CG(C, E)$, with each vertex
$c_i \in C$ representing a core and each directed edge $e_{i,j} \in E$ representing the
communication between cores $c_i$ and $c_j$. The weight of edge $e_{i,j}$, denoted
$comm_{i,j}$, is the bandwidth requirement of the communication from $c_i$ to $c_j$.
• The NoC topology graph is a directed graph $TG(T, G)$, with each vertex $t_i \in T$
representing a node in the topology and each directed edge $g_{i,j} \in G$ representing a
physical link between vertices $t_i$ and $t_j$. The weight of edge $g_{i,j}$, denoted
$bw_{i,j}$, is the bandwidth available across that link.
• A mapping of the core graph $CG(C, E)$ onto the topology graph $TG(T, G)$ is defined by
a function $map: C \to T$ such that $\forall c_i \in C, \exists t_j \in T$ with
$map(c_i) = t_j$.
• The quality of such a mapping is defined in terms of the total communication cost of the
application under this mapping. The communication between each pair of cores can be
treated as the flow of a single commodity $d_k$, $k = 1, 2, \ldots, |E|$.
• The value of the commodity $d_k$ corresponding to the communication between cores $c_i$
and $c_j$ is equal to the bandwidth requirement $comm_{i,j}$. The quantity $X^k(i, j)$,
the value of commodity $d_k$ flowing through link $(t_i, t_j)$, is given by:
$$X^k(i, j) = \begin{cases} value(d_k), & \text{if link } (t_i, t_j) \in Path(source(d_k), destination(d_k)) \\ 0, & \text{otherwise} \end{cases}$$
• To ensure that the traffic on a link does not exceed its bandwidth, the following
constraint must be satisfied:
$$\sum_{k=1}^{|E|} X^k(i, j) \le bw_{i,j}, \quad \forall i, j \in \{1, 2, \ldots, |T|\}$$
• The communication cost between cores $c_i$ and $c_j$ is measured by:
$$Commcost_{i,j} = comm_{i,j} \times MD(map(c_i), map(c_j))$$
where $MD$ denotes the Manhattan distance (hop count) between the two routers in the mesh.
• The total communication cost of a mapping solution is calculated as:
$$CommCost = \sum_{(c_i, c_j) \in E} Commcost_{i,j}$$
3. Problem Statement:
• Given the properties of the application (in terms of its core graph) and of the NoC
architecture (in terms of its topology graph), the association between routers and cores
has to be determined such that the weighted communication cost (BW × hop count) of the
application and the peak temperature of the chip are minimized under a given routing
mechanism.
• The following are the inputs to the problem:
1. A task graph CG, representing the application.
2. A topology graph TG corresponding to the 2-D NoC.
3. Power profile of each core.
4. Power profile of each router and link.
5. Floorplan for the NoC.
• A core, together with its corresponding router, forms a tile. The tiles are identified
by the router's ID, so each tile has an associated power profile governed by its IP core,
router and links.
• The above problem has been solved using a Genetic Algorithm (GA).
4. Why Genetic Algorithm (GA)?
• GA offers several advantages over other stochastic strategies for optimizing the
application mapping problem, such as Simulated Annealing (SA) and Ant Colony
Optimization (ACO).
• In GA optimization, multiple solutions co-exist at every stage of the process, whereas
SA progresses with only one solution. GA generally produces solutions faster than SA and
ACO, which use only a limited population and limited resources.
• The proposed GA-based approach combines a local search method with a global (guided)
search method to balance exploration and exploitation.
• In the GA approach, chromosomes (mapping solutions) do not die: the local best of a
chromosome stays attached to that chromosome and is updated whenever that chromosome
identifies a better solution.
• In SA, by contrast, the search is unguided and some solutions are filtered out by the
selection criteria. Similarly, in ACO, random paths are selected for each ant (solution),
so the solution takes time to converge.
5. GA Formulation of the Application Mapping Problem:
5.1. Chromosome structure and initial population generation:
➢ The length of each chromosome equals the number of vertices in the core graph, and the
chromosome is encoded as a string of integers.
➢ Each gene pairs one vertex of the core graph (a core) with a randomly chosen node of the
mesh topology, and no two cores may occupy the same node.
➢ A chromosome can be represented efficiently as a 1-D array in which the indices represent
the router numbers and the cell values represent the cores associated with the
corresponding routers. Thus, a chromosome is a permutation of the core numbers in the
core graph.
Router:  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16
Core:   16  4  3  2 14  5  6  1 13 12  7  9 15 11  8 10
Chromosome structure and corresponding NoC mapping
A chromosome can conveniently be viewed as a 1-D array in which chromosome[i] records the
core mapped to the i-th router (node).
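The permutation encoding described above can be sketched in Python. This is only a minimal illustration, not the presenter's implementation: the function names, the `seed` parameter, and the use of 0 to mark a router without a core (needed when a benchmark has fewer cores than the mesh has routers) are all assumptions.

```python
import random
from typing import List, Optional


def random_chromosome(num_cores: int, num_routers: int, rng: random.Random) -> List[int]:
    """Return a list whose entry i is the core placed on router i + 1.

    Assumption: 0 marks a router that carries no core.
    """
    if num_cores > num_routers:
        raise ValueError("the mesh has fewer routers than the application has cores")
    genes = list(range(1, num_cores + 1)) + [0] * (num_routers - num_cores)
    rng.shuffle(genes)  # every core lands on a distinct router
    return genes


def initial_population(pop_size: int, num_cores: int, num_routers: int,
                       seed: Optional[int] = None) -> List[List[int]]:
    """Generate pop_size independent random mappings."""
    rng = random.Random(seed)
    return [random_chromosome(num_cores, num_routers, rng) for _ in range(pop_size)]


if __name__ == "__main__":
    # VOPD: 16 cores on a 4 x 4 mesh, so each chromosome is a permutation of 1..16.
    for chromosome in initial_population(pop_size=3, num_cores=16, num_routers=16, seed=1):
        print(chromosome)
```

Shuffling a list of distinct core numbers guarantees that no two cores share a router, which is the non-overlap condition stated above.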
5.2. Evaluation of the Fitness Value of a Chromosome from the Objective Function:
• The communication cost between cores $c_i$ and $c_j$ is measured by:
$$Commcost_{i,j} = comm_{i,j} \times MD(map(c_i), map(c_j))$$
• The total communication cost of a mapping solution is calculated as:
$$CommCost = \sum_{(c_i, c_j) \in E} Commcost_{i,j}$$
• Objective function of the i-th chromosome: F_obj[i] = CommCost
• Fitness of the i-th chromosome:
Fitness[i] = 1 / (1 + F_obj[i])
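As a worked illustration of these formulas, the sketch below evaluates the communication cost and fitness of a chromosome on a 2-D mesh. It assumes that routers are numbered row by row starting from 1 and that MD is the Manhattan distance between routers; the function names and the toy 2 × 2 example (which borrows three bandwidth values from the VOPD matrix) are hypothetical.

```python
from typing import Dict, List, Tuple


def manhattan_distance(router_a: int, router_b: int, mesh_cols: int) -> int:
    """Hop count between two routers, assuming row-major numbering from 1."""
    row_a, col_a = divmod(router_a - 1, mesh_cols)
    row_b, col_b = divmod(router_b - 1, mesh_cols)
    return abs(row_a - row_b) + abs(col_a - col_b)


def comm_cost(chromosome: List[int], edges: Dict[Tuple[int, int], float],
              mesh_cols: int) -> float:
    """Sum of bandwidth x hop count over all core-graph edges."""
    # chromosome[i] is the core on router i + 1, so invert it to look up routers by core.
    router_of = {core: index + 1 for index, core in enumerate(chromosome)}
    return sum(bandwidth * manhattan_distance(router_of[src], router_of[dst], mesh_cols)
               for (src, dst), bandwidth in edges.items())


def fitness(chromosome: List[int], edges: Dict[Tuple[int, int], float],
            mesh_cols: int) -> float:
    """Fitness[i] = 1 / (1 + F_obj[i]); a lower cost gives a higher fitness."""
    return 1.0 / (1.0 + comm_cost(chromosome, edges, mesh_cols))


if __name__ == "__main__":
    toy_edges = {(1, 2): 70, (2, 3): 362, (3, 4): 362}   # bandwidths in MB/s
    print(comm_cost([1, 2, 4, 3], toy_edges, mesh_cols=2))  # 794
    print(comm_cost([1, 2, 3, 4], toy_edges, mesh_cols=2))  # 1156
```

In the toy example the first mapping keeps every communicating pair one hop apart, so it has the lower cost and therefore the higher fitness.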
5.3. Chromosome Selection for the Next Generation using the Roulette Wheel:
• The selection probability of the i-th chromosome, in a population of N chromosomes, is:
$$P[i] = \frac{Fitness[i]}{\sum_{j=1}^{N} Fitness[j]}$$
• The cumulative probability for the k-th chromosome is:
$$C[k] = \sum_{i=1}^{k} P[i]$$
• Algorithm for the Roulette wheel selection process:
begin
    k ← 0;
    while (k < population size) do
        R[k] ← random(0,1);
        for (i = 0 to population size - 1) do
            if (R[k] < C[i]) then
                new_chromosome[k] ← chromosome[i];
                break;
            end;
        end;
        k ← k + 1;
    end;
    replace chromosome[] with new_chromosome[];
end;
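The roulette wheel procedure above could be written in Python roughly as follows. This is a sketch under stated assumptions: the function name is hypothetical, the selected chromosomes are copied into a new list so that the current population is not overwritten during selection, and a binary search (`bisect`) replaces the inner linear scan without changing which chromosome is picked.

```python
import bisect
import itertools
import random
from typing import List, Sequence


def roulette_select(population: Sequence[List[int]], fitness: Sequence[float],
                    rng: random.Random) -> List[List[int]]:
    """Draw len(population) chromosomes with probability proportional to fitness."""
    total = sum(fitness)
    # C[k] = P[0] + ... + P[k], with P[i] = Fitness[i] / sum of all fitness values.
    cumulative = list(itertools.accumulate(f / total for f in fitness))
    cumulative[-1] = 1.0  # guard against floating-point round-off in the last entry
    selected = []
    for _ in range(len(population)):
        r = rng.random()                         # R[k], uniform in [0, 1)
        i = bisect.bisect_right(cumulative, r)   # first index i with r < C[i]
        selected.append(list(population[i]))     # copy, so later swaps stay independent
    return selected


if __name__ == "__main__":
    pop = [[1, 2, 3, 4], [4, 3, 2, 1], [2, 1, 4, 3]]
    fit = [0.1, 0.6, 0.3]
    print(roulette_select(pop, fit, random.Random(0)))
```

Fitter chromosomes occupy a wider slice of the interval [0, 1), so they are copied into the next generation more often, although less fit ones can still be chosen.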
5.4. Crossover Operation over Chromosomes (Solutions):
• For the crossover process, a floating-point random number R[k] between 0 and 1 is
generated for each chromosome. Chromosome k is selected as a parent if
R[k] < crossover rate.
• After the parents are selected, the crossover point is determined by generating a random
integer between 1 and (number of cores in the core graph - 1).
• Algorithm:
begin
    k ← 0;
    while (k < population size) do
        R[k] ← random(0,1);
        if (R[k] < crossover rate) then
            select chromosome[k] as parent;
        end;
        k ← k + 1;
    end;
end;
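A possible Python rendering of the parent selection, together with one way of producing a child, is sketched below. The slides do not say how duplicate cores are avoided after the cut, so the repair step used here (keep the head of the first parent, then append the missing cores in the order they appear in the second parent) is an assumption, as are the function names. The sketch also assumes chromosomes are full permutations without repeated values.

```python
import random
from typing import List


def select_parents(pop_size: int, crossover_rate: float, rng: random.Random) -> List[int]:
    """Return the indices k for which R[k] < crossover rate."""
    return [k for k in range(pop_size) if rng.random() < crossover_rate]


def one_point_crossover(parent_a: List[int], parent_b: List[int], cut: int) -> List[int]:
    """Child = parent_a[:cut] followed by the remaining cores in parent_b's order.

    Assumed repair step: it keeps the child a valid permutation.
    """
    head = parent_a[:cut]
    used = set(head)
    tail = [core for core in parent_b if core not in used]
    return head + tail


if __name__ == "__main__":
    rng = random.Random(0)
    parent_a = [16, 4, 3, 2, 14, 5, 6, 1, 13, 12, 7, 9, 15, 11, 8, 10]
    parent_b = list(range(1, 17))
    cut = rng.randint(1, len(parent_a) - 1)  # crossover point in 1 .. (cores - 1)
    print(select_parents(pop_size=6, crossover_rate=0.25, rng=rng))
    print(one_point_crossover(parent_a, parent_b, cut))
```

A plain one-point exchange of tails would generally place some cores twice and drop others, which is why some form of repair or a permutation-aware operator is needed.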
5.5. Mutation Operation over Chromosomes:
• The number of mutations applied to the population is determined by the mutation rate
parameter.
• A mutation exchanges two randomly selected members (genes) of a chromosome.
• Total_members = number of cores in a chromosome × population size.
• Number of mutations = mutation rate × Total_members.
• Each mutation is located by generating a random integer between 1 and Total_members;
this integer identifies the chromosome and the gene position that will be mutated.
• Algorithm:
begin
    k ← 0;
    while (k < number of mutations) do
        R[k] ← random integer in [1, Total_members];
        a ← quotient of ((R[k] - 1) / core_num);
        select chromosome[a] for mutation;
        b ← remainder of ((R[k] - 1) / core_num);
        select position b in chromosome[a] for mutation;
        k ← k + 1;
    end;
end;
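The mutation step might look like this in Python. It is a sketch, not the presenter's code: the function name is hypothetical, and the choice of the second swap position uniformly at random within the same chromosome is an assumption based on the statement that a mutation exchanges two randomly selected members.

```python
import random
from typing import List


def mutate(population: List[List[int]], mutation_rate: float, rng: random.Random) -> int:
    """Apply swap mutations in place and return how many were performed."""
    core_num = len(population[0])
    total_members = core_num * len(population)
    num_mutations = int(mutation_rate * total_members)
    for _ in range(num_mutations):
        r = rng.randint(1, total_members)   # random gene number in 1 .. Total_members
        a, b = divmod(r - 1, core_num)      # chromosome index and position inside it
        other = rng.randrange(core_num)     # assumed: swap partner chosen at random
        chromosome = population[a]
        chromosome[b], chromosome[other] = chromosome[other], chromosome[b]
    return num_mutations


if __name__ == "__main__":
    pop = [[1, 2, 3, 4], [4, 3, 2, 1]]
    count = mutate(pop, mutation_rate=0.25, rng=random.Random(0))
    print(count, pop)
```

Because a mutation only swaps two entries, every chromosome remains a valid permutation afterwards.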
6. Control Over GA Iterations:
• In this approach, the GA is run several times to improve upon the best solution
($gsuper$) found in previous iterations. At the end of the $n$-th iteration of the GA, let
the best solution found for the $k$-th chromosome in this iteration be $lbest_n^k$, and
let the best solution found over the previous $n$ iterations be $gsuper_n$. The
$(n+1)$-th iteration of the GA starts with a new set of chromosomes; however, the
$lbest_n^k$ and $gsuper_n$ solutions are passed on from the $n$-th to the $(n+1)$-th
iteration.
• The GA stops when either of the following conditions is met:
1. The number of GA iterations exceeds a user-defined value. For this work, this limit is
set to 1000.
2. The fitness of the solution $gsuper_n$ found in previous iterations has not changed in
the last 30 runs.
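The two stopping conditions can be illustrated with a small driver loop. Everything named here is hypothetical: `run_once` stands for a single GA run that receives the best solution so far and returns a candidate with its fitness, and higher fitness is assumed to be better, as in Fitness = 1 / (1 + F_obj). The sketch does not model how the per-chromosome local bests are carried over.

```python
from typing import Callable, List, Optional, Tuple

Solution = List[int]


def run_with_stopping(run_once: Callable[[Optional[Solution]], Tuple[Solution, float]],
                      max_runs: int = 1000,
                      patience: int = 30) -> Tuple[Optional[Solution], float, int]:
    """Repeat GA runs until max_runs is reached or gsuper stalls for `patience` runs."""
    gsuper: Optional[Solution] = None
    gsuper_fitness = float("-inf")
    stale_runs = 0
    runs = 0
    while runs < max_runs and stale_runs < patience:
        candidate, candidate_fitness = run_once(gsuper)  # gsuper is passed to the next run
        runs += 1
        if candidate_fitness > gsuper_fitness:
            gsuper, gsuper_fitness = candidate, candidate_fitness
            stale_runs = 0          # improvement: restart the stall counter
        else:
            stale_runs += 1         # no change in the best fitness this run
    return gsuper, gsuper_fitness, runs


if __name__ == "__main__":
    calls = []

    def fake_run(best: Optional[Solution]) -> Tuple[Solution, float]:
        # Stand-in for a GA run: improves for five runs, then stays flat.
        calls.append(1)
        n = len(calls)
        return [n], min(n, 5) / 10.0

    print(run_with_stopping(fake_run))
```

With the stand-in above, the best fitness stops improving after the fifth run, so the loop ends once thirty further runs bring no change.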
7. Genetic Algorithm Formulation of Temperature-Aware Mapping:
7.1. Temperature Calculation:
• Heat generation in a chip is governed primarily by the energy dissipated by the tiles in
the silicon layer.
• The heat generated in the silicon layer flows towards the heat sink through the following
primary heat transfer path (PHTP): Silicon layer → Thermal Interface layer → Heat
Spreader → Heat Sink.
• Each of these layers is divided into several smaller blocks, as in the block model of
HotSpot.
• Each block in the Si layer is taken to correspond to a tile of the NoC. Thus, if the NoC
contains n tiles, the Si layer is divided into n blocks.
• The other layers of the PHTP, directly below the Si layer, are likewise divided into n
blocks each. Therefore, a total of (4 × n) such blocks are present in the thermal model.
• In addition to those 4n blocks, the Heat Spreader layer contains 4 extra peripheral
blocks and the Heat Sink layer contains 8 extra peripheral blocks. Hence the total number
of blocks in the thermal model of the chip (tot_blk) is (4 × n + 12).
• The compact thermal model (CTM) works on the principle of duality between thermal and
electrical quantities.
• Thermal resistances along the x, y and z directions:
$$TR_x = \frac{1}{k_{layer}} \left(0.5 \times \frac{D_x}{D_y \times D_z}\right)$$
$$TR_y = \frac{1}{k_{layer}} \left(0.5 \times \frac{D_y}{D_z \times D_x}\right)$$
$$TR_z = \frac{1}{k_{layer}} \left(0.5 \times \frac{D_z}{2 D_x \times D_y}\right)$$
• The following equation is solved to determine the temperature matrix
$[T]_{tot\_blk \times 1}$:
$$[C]_{tot\_blk \times tot\_blk} \times [T]_{tot\_blk \times 1} = [P]_{tot\_blk \times 1}$$
7.2. Fitness Calculation:
• The fitness of each chromosome is evaluated using the following expression:
$$Fitness = w \times \frac{CommCost}{CommCost_{max}} + (1 - w) \times \frac{T_{PeakChip}}{T_{MaxMap}}$$
• With w = 0 the GA minimizes only the chip temperature, and with w = 1 it minimizes only
the communication cost.
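The weighted expression can be checked with a few lines of Python. This sketch treats CommCost_max and T_MaxMap simply as normalization constants supplied by the caller, which is an assumption since the slides do not say how they are obtained; the function and parameter names are hypothetical. Because both terms are costs, a smaller value presumably indicates a better mapping.

```python
def weighted_fitness(comm_cost: float, comm_cost_max: float,
                     peak_temp: float, max_temp: float, w: float) -> float:
    """w * CommCost / CommCost_max + (1 - w) * T_PeakChip / T_MaxMap."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("the weight factor w must lie in [0, 1]")
    return w * comm_cost / comm_cost_max + (1.0 - w) * peak_temp / max_temp


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the result tables.
    for w in (0.0, 0.5, 1.0):
        print(w, weighted_fitness(comm_cost=50.0, comm_cost_max=100.0,
                                  peak_temp=300.0, max_temp=400.0, w=w))
```

Normalizing both terms keeps communication cost and temperature on comparable scales, so that w expresses the designer's relative preference between them.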
8. Simulation Results:
8.1. Comparison of Communication Cost for Benchmark Applications:
The applications are mapped onto 2-D mesh structures with the mesh sizes noted in Table I.
TABLE I
NoC Benchmarks and Their Mesh Sizes
Benchmark NoC    No. of Cores   2-D Mesh Size
DVOPD            32             8 × 4
VOPD             16             4 × 4
MPEG-4           12             4 × 4
PIP              8              4 × 2
MWD              12             4 × 4
263ENC-MP3DEC    12             4 × 4
MP3ENC-MP3DEC    13             4 × 4
263DEC-MP3DEC    14             4 × 4
TABLE II
Comparison of Communication Cost (Hops × BW) for NoC Benchmarks
Technique  DVOPD   VOPD   MPEG-4  PIP   MWD   263ENC-MP3DEC  MP3ENC-MP3DEC  263DEC-MP3DEC
NMAP       10253   4265   3672    640   184   230.407        18.171         20.073
PMAP       -       7054   6128    832   -     -              -              -
GMAP       -       5553   7849    704   -     -              -              -
PBB        -       4317   3763    640   -     -              -              -
MOCA       -       -      5246    -     -     -              -              -
BMAP       -       4351   6280    -     -     -              -              -
Onyx       -       4249   3612    -     -     -              -              -
CHMAP      -       4249   3977    -     -     -              -              -
CMAP       -       4281   3704    -     -     -              -              -
Elixir     -       4249   3640    -     -     -              -              -
LMAP       -       4189   4006    640   -     -              -              -
CastNet    -       4135   3852    -     -     -              -              -
GMAP       -       4300   3600    -     -     -              -              -
GAMR       -       -      3772    -     -     -              -              -
GBMAP      -       4217   3572    -     -     -              -              -
ACO        -       -      3633    -     -     -              -              -
PSMAP      9752    4119   3567    640   120   230.407        17.021         19.823
PSO        9602    4119   3567    640   120   230.407        17.021         19.823
ILP        -       4119   3567    640   120   230.407        17.021         19.823
Proposed   9688    4135   3633    640   120   230.442        17.115         20.021
8.2. Latency and Throughput for Benchmark Applications:
The SystemC-based Noxim simulator was used to calculate network latency and throughput.
TABLE III
Noxim Settings
Parameter                         Value
Buffer Depth                      6
Minimum and Maximum Packet Size   64 flits (32 bits per flit)
Routing                           Dimension-ordered (XY)
Selection Logic                   Random
Warm-up Time                      10000 clock cycles
Simulation Time                   20000 clock cycles
Traffic                           Table-based
TABLE IV
Latency and Throughput of NoC Benchmarks
Benchmark Application   Latency (Cycles)   Throughput (Flits/Cycle)
DVOPD                   80037.40           0.55
VOPD                    81198.16           0.59
MPEG-4                  79972.50           0.61
PIP                     89415.20           0.69
MWD                     89936.70           0.62
263ENC-MP3DEC           81962.10           0.75
MP3ENC-MP3DEC           80911.80           0.63
263DEC-MP3DEC           81850.10           0.62
8.3. Communication Cost for Bigger Applications:
To check the applicability of the GA-based approach to larger NoCs, the TGFF tool was used
to generate a few task graphs with 64, 128 and 256 cores. The 64-core, 128-core and
256-core NoCs are implemented on 8 × 8, 8 × 16 and 16 × 16 meshes respectively.
TABLE V
Communication Cost (Hops × BW) for Different TGFF Task Graphs
Size        Graph   NMAP        LMAP        PSMAP       PSO         Proposed
64 Cores    G1      55244.17    51344.40    50947.00    45598.64    53728.00
64 Cores    G2      44902.16    44005.16    42086.00    42086.00    43789.00
128 Cores   G3      70168.36    70168.36    67508.53    56721.63    69195.73
128 Cores   G4      343982.87   306761.00   285295.72   281405.75   315684.47
256 Cores   G5      -           -           -           -           542126.00
256 Cores   G6      -           -           -           -           655831.00
8.4. Comparison Among Different Thermal Mapping Techniques:
• To check the quality of the proposed GA-based approach for thermal-aware mapping, its
results are compared with those of ILP, TAPP and CoolMap.
TABLE VI
Comparison Among Different Thermal Mapping Techniques
(Cost = communication cost, Temp = peak temperature in Kelvin, w = weight factor)
                      ILP                 CoolMap             TAPP                Proposed
NoC             w     Cost      Temp      Cost      Temp      Cost      Temp      Cost      Temp
MWD             0     1986      346.14    1986      346.14    1986      346.14    1632      348.82
MWD             0.5   1258      355.73    1258      355.73    1258      355.73    1312      348.70
MWD             1     1248      353.51    1248      353.51    1248      353.51    1248      359.68
MPEG-4          0     7449.50   346.14    7449.50   346.14    7449.50   346.14    7849      338.84
MPEG-4          0.5   3643      355.73    3643      355.73    3643      355.73    3672      341.46
MPEG-4          1     3587      353.51    3587      353.51    3587      353.51    3633      343.33
263ENC-MP3DEC   0     551.61    346.14    551.61    346.14    551.61    346.14    270.51    350.33
263ENC-MP3DEC   0.5   310.94    355.73    310.94    355.73    310.94    355.73    230.46    351.52
263ENC-MP3DEC   1     230.43    353.51    230.43    353.51    230.43    353.51    230.44    351.50
8.5. Communication Cost and Peak Temperature of NoC Benchmarks:
TABLE VII
Communication Cost and Peak Temperature for Benchmark Applications
NoC             No. of Cores   Weight Factor   Comm_Cost   Peak Temp. (Kelvin)
DVOPD           32             0               10182       348.57
DVOPD           32             0.5             10072       348.68
DVOPD           32             1               9688        352.35
VOPD            16             0               4356        344.39
VOPD            16             0.5             4183        345.48
VOPD            16             1               4135        348.91
MPEG-4          12             0               7849        338.84
MPEG-4          12             0.5             3672        341.46
MPEG-4          12             1               3633        343.33
MWD             12             0               1632        348.82
MWD             12             0.5             1312        348.70
MWD             12             1               1248        359.67
263ENC-MP3DEC   12             0               270.51      350.33
263ENC-MP3DEC   12             0.5             230.46      351.52
263ENC-MP3DEC   12             1               230.44      351.50
MP3ENC-MP3DEC   13             0               19.60       343.85
MP3ENC-MP3DEC   13             0.5             19.46       343.87
MP3ENC-MP3DEC   13             1               17.12       344.20
263DEC-MP3DEC   14             0               20.42       342.78
263DEC-MP3DEC   14             0.5             20.32       344.80
263DEC-MP3DEC   14             1               20.02       350.87
• To check the applicability of the GA-based thermal-aware mapping approach at a larger
scale, a few task graphs were generated using the TGFF tool.
TABLE VIII
Communication Cost and Peak Temperature Reduction for Different TGFF Task Graphs
Task Graph   Comm_Cost (Hops × BW)   Peak Temp. Reduction (Kelvin)
Graph111     124732.77               92.55
Graph112     718853.43               92.09
Graph113     876083.87               96.97
Graph114     182443.65               92.56
Graph115     160572.93               94.38
Graph116     20306.87                92.37
Graph117     20306.87                97.57
Graph118     221245.67               90.66
8.6. Trading off Communication Cost and Peak Temperature:
• A trade-off is established between the NoC peak temperature and the communication cost.
The accompanying figure shows this trade-off for the benchmark application VOPD.
8.7. Imposing Thermal Safety by Temperature Constraints:
• In this experiment, thermal safety is imposed by taking the peak temperature as a
constraint. The experiment finds the mapping solution that fits the temperature budget.
TABLE IX
Communication Cost and Peak Temperature Constraints
VOPD                                       DVOPD
Tcons (Kelvin)  Comm_Cost  Tpeak (Kelvin)  Tcons (Kelvin)  Comm_Cost  Tpeak (Kelvin)
361             3612       358.87          360             9427       356.38
359             4888       356.26          356             10486      354.23
356             4899       351.07          359             10510      357.10
8.8. Dynamic Simulation of Thermal-Aware Mapping:
• The Noxim simulator was used for the simulation. An NoC is expected to have high
throughput and low latency.
TABLE X
Latency and Throughput of NoC Benchmarks
Benchmark NoC    Latency (Cycles)   Throughput (Flits/Cycle)
DVOPD            83735.70           0.53
VOPD             82398.14           0.57
MPEG-4           79998.50           0.59
PIP              89475.20           0.63
MWD              89963.70           0.61
263ENC-MP3DEC    81997.10           0.58
9. Conclusions:
• The proposed mapping approach produces a reasonable improvement in communication cost
compared to some of the previously reported strategies.
• The simulation results show that the proposed strategy performs better than NMAP for
NoCs with a higher number of cores.
• The communication model used in the proposed approach assumes that a message takes the
same amount of time to traverse every router. In practice, this may not be true.
• The proposed thermal-aware mapping approach has been found to improve the communication
cost and the peak temperature of the chip.
• A trade-off has also been established between communication cost and peak temperature,
so that designers can choose the solution that best suits their requirements.
• Experimental results show that the proposed thermal-aware mapping approach outperforms
many contemporary approaches reported in the literature.
10. Future Scope:
• The proposed mapping strategy can be extended to mapping and routing for NoC
architectures with other network topologies, such as ring and torus.
• The proposed thermal-aware mapping approach can be extended to 3-D mapping strategies
and to fault-tolerant and reliability-aware mapping techniques for 2-D as well as 3-D NoC
environments.
11. References:
• [1] S. Murali and G. De Micheli, "Bandwidth-constrained mapping of cores onto NoC
architectures," in Proc. Design, Automation and Test in Europe Conference and Exhibition,
vol. 2, Feb. 2004, pp. 896–901.
• [2] P. K. Sahu, K. Manna, T. Shah and S. Chattopadhyay, "A constructive heuristic for
application mapping onto mesh based Network-on-Chip," Journal of Circuits, Systems, and
Computers, vol. 24, no. 8, 2015, 1550126 (29 pages).
• [3] P. K. Sahu and S. Chattopadhyay, "A survey on application mapping strategies for
Network-on-Chip design," J. Syst. Archit., vol. 59, 2013, pp. 60–76.
• [4] P. K. Sahu, T. Shah, K. Manna and S. Chattopadhyay, "Application mapping onto
mesh-based Network-on-Chip using discrete particle swarm optimization," IEEE Transactions
on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 2, Feb. 2014.
• [5] J. Hu and R. Marculescu, "Energy-aware mapping for tile-based NoC architectures
under performance constraints," in Proc. Asia South Pacific Des. Autom. Conf., 2003,
pp. 233–239.
• [6] M. Moazzen, A. Reza and M. Reshadi, "CoolMap: A thermal-aware mapping algorithm for
application specific networks-on-chip," in Proc. Euromicro Conf. Digital Syst. Des.,
Sep. 2012, pp. 731–734.
• [7] D. Zhu, L. Chen, T. Pinkston and M. Pedram, "TAPP: Temperature-aware application
mapping for NoC-based many-core processors," in Proc. Des., Autom. Test Eur., 2015,
pp. 1241–1244.
• [8] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron and M. Stan,
"HotSpot: A compact thermal modeling methodology for early-stage VLSI design," IEEE
Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 5, 2006, pp. 501–513.
• [9] http://mehransoft.ir/wp-content/uploads/2014/05/Noxim_User_Guide.pdf
Thank You

Β 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
Β 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
Β 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
Β 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
Β 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
Β 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
Β 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
Β 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
Β 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
Β 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
Β 
Model Call Girl in Narela Delhi reach out to us at πŸ”8264348440πŸ”
Model Call Girl in Narela Delhi reach out to us at πŸ”8264348440πŸ”Model Call Girl in Narela Delhi reach out to us at πŸ”8264348440πŸ”
Model Call Girl in Narela Delhi reach out to us at πŸ”8264348440πŸ”soniya singh
Β 

Recently uploaded (20)

MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
Β 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
Β 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
Β 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
Β 
GDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentationGDSC ASEB Gen AI study jams presentation
GDSC ASEB Gen AI study jams presentation
Β 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
Β 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
Β 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
Β 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Β 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
Β 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
Β 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
Β 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
Β 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
Β 
β˜… CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
β˜… CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCRβ˜… CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
β˜… CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
Β 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Β 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Β 
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
9953056974 Call Girls In South Ex, Escorts (Delhi) NCR.pdf
Β 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
Β 
Model Call Girl in Narela Delhi reach out to us at πŸ”8264348440πŸ”
Model Call Girl in Narela Delhi reach out to us at πŸ”8264348440πŸ”Model Call Girl in Narela Delhi reach out to us at πŸ”8264348440πŸ”
Model Call Girl in Narela Delhi reach out to us at πŸ”8264348440πŸ”
Β 

Kailash(13EC35032)_mtp.pptx

  • 1. By Kailash Chand Meena (13EC35032), under the supervision of Prof. Santanu Chattopadhyay, Department of Electronics and Electrical Communication Engineering, IIT Kharagpur
  • 2. 1. Introduction:
      - Application mapping is one of the most important dimensions of Network-on-Chip (NoC) research. It affects the overall performance and power requirement of the system.
      - Rapid progress in technology scaling makes transistors smaller and faster with each generation, so the number of IP cores in a system increases, but the power consumption of a transistor no longer scales in proportion.
      - The increasing number of IP cores in a multi-processor system-on-chip makes it more challenging to find the optimum core-to-router mapping.
      - A significant proportion of the power consumed is dissipated directly as heat. An increase in power density can lead to several other issues.
      - Application mapping, with its ability to spread out high-power components, is potentially a good approach to mitigating the looming issue of hotspots in many-core processors.
  • 3. Terminology in Application Mapping
      - Application: An application consists of a set of tasks, each of which is implemented by an IP core.
      - IP cores: The functional modules of a NoC are known as intellectual property (IP) cores.
      - Hop count: The distance a message travels from the source router to the destination router through the router fabric is measured in hops.
      - Core graph: An application can be represented as a core graph, with each vertex representing an IP core and each directed edge representing the communication between two cores. For example, the video application VOPD (video object plane decoder) consists of 16 cores and DVOPD (dual video object plane decoder) consists of 32 cores.
  • 4. Core Graph for VOPD (figure; bandwidth unit: MB/s)
  • 5. Core Graph for VOPD generated by TGFF Tool (first line: number of cores; then the 16 × 16 bandwidth matrix, INF = no edge)
      16
      0 70 INF INF INF INF INF INF INF INF INF INF INF INF INF INF
      70 0 362 INF INF INF INF INF INF INF INF INF INF INF INF INF
      INF 362 0 362 INF INF INF INF INF INF INF INF INF INF INF INF
      INF INF 362 0 362 INF INF INF INF INF INF INF INF INF INF 49
      INF INF INF 362 0 357 INF INF INF INF INF INF INF INF INF 27
      INF INF INF INF 357 0 353 INF INF INF INF 16 INF INF INF INF
      INF INF INF INF INF 353 0 300 INF INF INF INF INF INF INF INF
      INF INF INF INF INF INF 300 0 313 500 INF INF INF INF INF INF
      INF INF INF INF INF INF INF 313 0 407 INF 16 INF INF INF INF
      INF INF INF INF INF INF INF 500 407 0 INF INF INF INF INF INF
      INF INF INF INF INF INF INF INF INF INF 0 16 INF INF 16 INF
      INF INF INF INF INF 16 INF INF 16 INF 16 0 16 INF INF INF
      INF INF INF INF INF INF INF INF INF INF INF 16 0 157 16 INF
      INF INF INF INF INF INF INF INF INF INF INF INF 157 0 16 INF
      INF INF INF INF INF INF INF INF INF INF 16 INF 16 16 0 INF
      INF INF INF 49 27 INF INF INF INF INF INF INF INF INF INF 0
  • 6. Mesh Topology:
      - The mesh is one of the most common network topologies because it provides a regular structure with short interconnects, a high bisection width, and a modular architecture for the NoC with equal-sized links.
  • 7. 2. What is the Application Mapping Problem?
      - The core graph of an application is a directed graph $CG(C, E)$, with each vertex $c_i \in C$ representing a core and each directed edge $e_{i,j} \in E$ representing the communication between cores $c_i$ and $c_j$. The bandwidth requirement of the communication from $c_i$ to $c_j$ is the weight of edge $e_{i,j}$ and is denoted by $comm_{i,j}$.
      - The NoC topology graph is a directed graph $TG(T, G)$, with each vertex $t_i \in T$ representing a node in the topology and each directed edge $g_{i,j}$ representing a physical link between vertices $t_i$ and $t_j$. The weight of edge $g_{i,j}$, denoted $bw_{i,j}$, represents the bandwidth across $g_{i,j}$.
      - A mapping of the core graph $CG(C, E)$ onto the topology graph $TG(T, G)$ is defined by the function $H: CG \to TG$ such that $\forall c_i \in C, \exists t_j \in T$ with $map(c_i) = t_j$.
      - The quality of such a mapping is defined in terms of the total communication cost of the application under this mapping. The communication between each pair of cores can be treated as the flow of a single commodity $d_k$, $k = 1, 2, \ldots, |E|$.
      - The value of commodity $d_k$ corresponding to the communication between cores $c_i$ and $c_j$ is equal to $comm_{i,j}$, the bandwidth requirement. The quantity $X^k(i, j)$, indicating the value of commodity $d_k$ flowing through link $(t_i, t_j)$, is given by
        $X^k(i, j) = \begin{cases} value(d_k), & \text{if link } (t_i, t_j) \in Path(source(d_k), destination(d_k)) \\ 0, & \text{otherwise} \end{cases}$
  • 8. Contd.
      - To ensure that the bandwidth demand does not exceed the limits of individual links, the following constraint must be satisfied:
        $\sum_{k=1}^{|E|} X^k(i, j) \le bw_{i,j}, \quad \forall i, j \in \{1, 2, \ldots, |T|\}$
      - The communication cost between cores $c_i$ and $c_j$ is measured by
        $Commcost_{i,j} = comm_{i,j} \times MD(map(c_i), map(c_j))$
      - The total communication cost of a mapping solution is calculated as
        $CommCost = \sum_{(c_i, c_j) \in E} Commcost_{i,j}$
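The cost formulas above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: MD is taken to be the Manhattan (hop) distance on a row-major 2-D mesh, routers are numbered from 0, and the names `manhattan`, `comm_cost`, `edges`, and `mapping` are hypothetical, not taken from the slides.

```python
from typing import Dict, List, Tuple


def manhattan(router_a: int, router_b: int, cols: int) -> int:
    """Hop count between two routers of a row-major mesh with `cols` columns.

    Assumption: MD in the slides is the Manhattan distance.
    """
    row_a, col_a = divmod(router_a, cols)
    row_b, col_b = divmod(router_b, cols)
    return abs(row_a - row_b) + abs(col_a - col_b)


def comm_cost(edges: Dict[Tuple[int, int], float],
              mapping: List[int], cols: int) -> float:
    """Sum of bandwidth x hop count over all core-graph edges.

    mapping[core] is the 0-based router that hosts `core`.
    """
    return sum(bw * manhattan(mapping[src], mapping[dst], cols)
               for (src, dst), bw in edges.items())


# Toy 4-core chain on a 2 x 2 mesh (illustrative numbers, not a benchmark).
edges = {(0, 1): 70, (1, 2): 362, (2, 3): 362}
identity = [0, 1, 2, 3]   # core i sits on router i
swapped = [0, 1, 3, 2]    # cores 2 and 3 exchange routers

print(comm_cost(edges, identity, cols=2))  # 1156
print(comm_cost(edges, swapped, cols=2))   # 794
```

Swapping two cores brings the heavily communicating pair (1, 2) next to each other, which is exactly the effect the mapping search tries to exploit.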
  • 9. 3. Problem Statement:
      - Given the properties of the application (in terms of its core graph) and of the NoC architecture (in terms of its topology graph), the association between routers and cores has to be determined such that the weighted communication cost (BW × hop count) of the application and the peak temperature of the chip are minimized under a given routing mechanism.
      - The inputs to the problem are:
          1. A core graph CG, representing the application.
          2. A topology graph TG corresponding to the 2-D NoC.
          3. The power profile of each core.
          4. The power profile of each router and link.
          5. The floorplan of the NoC.
      - A core together with its corresponding router forms a tile. Tiles are identified by the router's ID, so each tile has an associated power profile governed by its IP core, router, and links.
      - The problem has been solved using a Genetic Algorithm (GA).
  • 10. 4. Why Genetic Algorithm (GA)?
      - GA offers several advantages over other stochastic strategies for the application mapping problem, such as Simulated Annealing (SA) and Ant Colony Optimization (ACO).
      - In GA optimization, multiple solutions co-exist at every stage of the process, whereas SA progresses with only one solution. GA generally produces solutions faster than SA and ACO, which use only a limited population and limited resources.
      - The proposed GA-based approach combines a local search method with a global (guided) search method to balance exploration and exploitation.
      - In the GA approach, chromosomes (mapping solutions) do not die, because the local best of a chromosome remains attached to that chromosome and is updated whenever the chromosome finds a better solution.
      - In SA, by contrast, the population moves together in an unguided search and some solutions are filtered out by the selection criteria. Similarly, in ACO, random paths are selected for each ant (solution), so the solution takes longer to converge.
  • 11. 5. GA Formulation of the Application Mapping Problem: 5.1. Chromosome structure and initial population generation:
      - The length of each chromosome is equal to the number of vertices in the core graph, and the chromosome is encoded as a string of integers.
      - Each gene in the chromosome holds an integer that is assigned at random, and no two genes may hold the same value, so cores cannot overlap on a node of the mesh.
      - A chromosome can be represented efficiently as a 1-D array in which the indices are the router numbers and the cell values are the cores associated with the corresponding routers. Thus, a chromosome is a permutation of the core numbers in the core graph.
      Router: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
      Core:   16 4 3 2 14 5 6 1 13 12 7 9 15 11 8 10
  • 12. Chromosome structure and corresponding NoC mapping: A chromosome can conveniently be viewed as a 1-D array in which chromosome[i] records the core mapped to the i-th router (node).
      Router: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
      Core:   16 4 3 2 14 5 6 1 13 12 7 9 15 11 8 10
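A minimal sketch of this encoding, assuming 1-based core and router numbers as on the slide and one core per router. The helper names `random_chromosome`, `initial_population`, and `core_to_router` are hypothetical; the slides do not say how the random permutations are drawn, so a plain shuffle is used here.

```python
import random
from typing import Dict, List


def random_chromosome(num_cores: int, rng: random.Random) -> List[int]:
    """Random permutation of core numbers 1..num_cores.

    Position i (0-based) stands for router i + 1.
    """
    cores = list(range(1, num_cores + 1))
    rng.shuffle(cores)
    return cores


def initial_population(pop_size: int, num_cores: int,
                       seed: int = 0) -> List[List[int]]:
    """Build `pop_size` independent random chromosomes."""
    rng = random.Random(seed)
    return [random_chromosome(num_cores, rng) for _ in range(pop_size)]


def core_to_router(chromosome: List[int]) -> Dict[int, int]:
    """Invert the array: which 1-based router hosts each core."""
    return {core: index + 1 for index, core in enumerate(chromosome)}


# The chromosome shown on the slide.
example = [16, 4, 3, 2, 14, 5, 6, 1, 13, 12, 7, 9, 15, 11, 8, 10]
placement = core_to_router(example)
print(placement[16], placement[10])  # core 16 on router 1, core 10 on router 16
```

Keeping every chromosome a permutation means the "no overlap" rule holds by construction, so the genetic operators only need to preserve that property.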
  • 13. 5.2. Evaluation of the fitness value of a chromosome from the objective function:
      - The communication cost between cores $c_i$ and $c_j$ is measured by $Commcost_{i,j} = comm_{i,j} \times MD(map(c_i), map(c_j))$
      - The total communication cost of a mapping solution is calculated as $CommCost = \sum_{(c_i, c_j) \in E} Commcost_{i,j}$
      - $F\_obj[i] = CommCost$
      - Fitness of the i-th chromosome: $Fitness[i] = 1 / (1 + F\_obj[i])$
    5.3. Chromosome selection for the next generation using a roulette wheel:
      - The fitness probability of the i-th chromosome is $P[i] = Fitness[i] / \sum_{j=1}^{N} Fitness[j]$
      - The cumulative probability of the k-th chromosome is $C[k] = \sum_{i=1}^{k} P[i]$
  • 14. Contd.
      - Algorithm for the roulette wheel selection process:
          begin
            k ← 0;
            while (k < population size) do
              R[k] ← random(0,1);
              for (i = 0 to population size - 1) do
                if (R[k] < C[i]) then
                  new_chromosome[k] ← chromosome[i];
                  break;
                end;
              end;
              k ← k + 1;
            end;
          end;
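The selection loop above can be sketched in Python as follows. It is an illustration, not the author's code: the function name `roulette_select` is hypothetical, and a binary search (`bisect_right`) replaces the inner linear scan, which picks the same chromosome (the first i with R < C[i]).

```python
import random
from bisect import bisect_right
from itertools import accumulate
from typing import List, Sequence


def roulette_select(population: Sequence[List[int]],
                    objective_values: Sequence[float],
                    rng: random.Random) -> List[List[int]]:
    """Draw len(population) chromosomes with probability proportional to
    Fitness[i] = 1 / (1 + F_obj[i])."""
    fitness = [1.0 / (1.0 + f) for f in objective_values]
    total = sum(fitness)
    cumulative = list(accumulate(f / total for f in fitness))  # C[k]
    selected = []
    for _ in range(len(population)):
        r = rng.random()
        # First index whose cumulative probability exceeds r; the clamp
        # guards against round-off leaving C[-1] slightly below 1.
        index = min(bisect_right(cumulative, r), len(population) - 1)
        selected.append(list(population[index]))
    return selected


population = [[1, 2, 3], [2, 1, 3], [3, 2, 1]]
costs = [100.0, 400.0, 900.0]  # illustrative communication costs
print(roulette_select(population, costs, random.Random(0)))
```

The result is written to a new list rather than back into the population, so chromosomes that have not been examined yet are never overwritten during selection.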
  • 15. 5.4. Crossover operation on chromosomes (solutions):
      - For the crossover process, a floating-point random number between 0 and 1 is generated for each chromosome. Chromosome k is selected as a parent if R[k] < crossover rate.
      - After the parents are selected, the crossover point is determined by generating a random integer between 1 and (number of cores in the core graph - 1).
      - Algorithm:
          begin
            k ← 0;
            while (k < population size) do
              R[k] ← random(0,1);
              if (R[k] < crossover rate) then
                select chromosome[k] as parent;
              end;
              k ← k + 1;
            end;
          end;
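A possible Python sketch of parent selection and a one-point crossover is shown below. The slides do not say how duplicate cores are avoided after the cut, so the repair used here (keep the first parent's head, then append the missing cores in the order they appear in the second parent) is an assumption, as are the function names.

```python
import random
from typing import List


def select_parents(pop_size: int, crossover_rate: float,
                   rng: random.Random) -> List[int]:
    """Indices k with R[k] < crossover rate, as in the pseudocode."""
    return [k for k in range(pop_size) if rng.random() < crossover_rate]


def one_point_crossover(parent_a: List[int], parent_b: List[int],
                        rng: random.Random) -> List[int]:
    """Cut at a random point in [1, n - 1] and keep the child a permutation.

    Assumed repair: the tail is filled with the cores missing from the
    head, in the order in which they occur in parent_b.
    """
    cut = rng.randint(1, len(parent_a) - 1)
    head = parent_a[:cut]
    used = set(head)
    tail = [core for core in parent_b if core not in used]
    return head + tail


rng = random.Random(3)
parent_a = [1, 2, 3, 4, 5, 6]
parent_b = [6, 5, 4, 3, 2, 1]
child = one_point_crossover(parent_a, parent_b, rng)
print(child)
```

A plain one-point splice of two permutations would usually repeat some cores and drop others, which is why some repair step of this kind is needed for a permutation encoding.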
  • 16. 5.5. Mutation operation on chromosomes:
      - The number of chromosomes that undergo mutation in a population is determined by the mutation rate parameter.
      - In the mutation process, two randomly selected members of a chromosome are exchanged.
      - Total_members = number of cores in a chromosome × population size.
      - Mutation is done by generating a random integer between 1 and Total_members. If the generated random number is smaller than the mutation rate, the position of the gene is marked and it will be mutated.
      - Number of mutations = mutation rate × Total_members
      - Algorithm:
          begin
            k ← 0;
            while (k < number of mutations) do
              R[k] ← random integer in [1, total_members];
              a ← quotient of (R[k] / core_num);
              select chromosome[a] for mutation;
              b ← remainder of (R[k] / core_num);
              select position b in chromosome[a] for mutation;
              k ← k + 1;
            end;
          end;
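The mutation step can be sketched as a swap mutation in Python. Assumptions: indices are 0-based (so the random integer is drawn from 0 to total_members - 1, which keeps the quotient inside the population), the second gene of each swap is drawn uniformly from the same chromosome, and the name `swap_mutate` is hypothetical.

```python
import random
from typing import List


def swap_mutate(population: List[List[int]], mutation_rate: float,
                rng: random.Random) -> List[List[int]]:
    """Swap-mutate the population in place and return it."""
    core_num = len(population[0])
    total_members = core_num * len(population)
    num_mutations = int(mutation_rate * total_members)
    for _ in range(num_mutations):
        r = rng.randrange(total_members)   # 0-based gene index
        a, b = divmod(r, core_num)         # chromosome a, position b
        other = rng.randrange(core_num)    # partner position (assumed uniform)
        chromosome = population[a]
        chromosome[b], chromosome[other] = chromosome[other], chromosome[b]
    return population


population = [[1, 2, 3, 4], [4, 3, 2, 1], [2, 4, 1, 3]]
swap_mutate(population, mutation_rate=0.25, rng=random.Random(7))
print(population)
```

Because a mutation only exchanges two genes, every chromosome stays a valid permutation of the cores.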
  • 17. 6. Control over GA Iterations:
      - In this approach, the GA is run several times to improve upon the best solution ($gsuper$) found in previous iterations. At the end of the n-th iteration of the GA, let the best solution for the k-th chromosome found in this iteration be $lbest_n^k$ and the best solution found in the previous n iterations be $gsuper_n$. The (n+1)-th iteration starts with a new set of chromosomes; however, the $lbest_n^k$ and $gsuper_n$ solutions are passed on from the n-th to the (n+1)-th iteration.
      - The GA stops when either of the following conditions holds:
          1. The number of GA iterations exceeds a user-defined value. For this work, the limit is set to 1000.
          2. The fitness of the solution $gsuper_n$ has not changed in the last 30 runs.
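The two stopping rules can be sketched as a small driver loop. This is only an outline under assumptions: `run_ga_once` is a hypothetical callback that performs one GA run seeded with the best solution so far and returns a (solution, cost) pair, and lower cost is treated as better.

```python
from typing import Any, Callable, Optional, Tuple


def run_until_converged(
        run_ga_once: Callable[[Optional[Any]], Tuple[Any, float]],
        max_runs: int = 1000,
        patience: int = 30) -> Tuple[Optional[Any], float, int]:
    """Repeat GA runs until `max_runs` is reached or the best cost has not
    improved for `patience` consecutive runs."""
    best: Optional[Any] = None
    best_cost = float("inf")
    stale = 0
    runs = 0
    while runs < max_runs and stale < patience:
        candidate, cost = run_ga_once(best)  # best plays the role of gsuper
        runs += 1
        if cost < best_cost:
            best, best_cost, stale = candidate, cost, 0
        else:
            stale += 1
    return best, best_cost, runs


# Stand-in for a real GA run: improves five times, then stagnates.
fake_costs = iter([10.0, 9.0, 8.0, 7.0, 6.0])


def fake_run(previous_best):
    return "mapping", next(fake_costs, 6.0)


print(run_until_converged(fake_run))  # ('mapping', 6.0, 35)
```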
  • 18. 7. Genetic Algorithm Formulation of Temperature-Aware Mapping: 7.1. Temperature calculation:
      - The primary source of heat in a chip is the energy dissipated by the tiles in the silicon layer.
      - The heat generated in the silicon layer flows towards the heat sink through the primary heat transfer path (PHTP): Silicon layer → Thermal interface layer → Heat spreader → Heat sink.
      - Each of these layers is divided into several smaller blocks, as in the block model of HotSpot.
      - Each block in the Si layer is taken to correspond to one tile of the NoC. Thus, if the NoC contains n tiles, the Si layer is divided into n blocks.
      - The other layers of the PHTP, directly below the Si layer, are divided into n similar blocks each. Therefore, a total of (4 × n) such blocks are present in the thermal model.
      - In addition to those 4n blocks, the heat spreader layer contains 4 extra peripheral blocks and the heat sink layer contains 8 extra peripheral blocks. Hence the total number of blocks in the thermal model of the chip (tot_blk) is (4 × n + 12).
      - The compact thermal model (CTM) works on the principle of duality between thermal and electrical quantities.
  • 19. Contd.
      - Thermal resistance along the x, y, and z directions:
        $TR_x = \frac{1}{k_{layer}} \left( \frac{0.5 \times D_x}{D_y \times D_z} \right)$
        $TR_y = \frac{1}{k_{layer}} \left( \frac{0.5 \times D_y}{D_z \times D_x} \right)$
        $TR_z = \frac{1}{k_{layer}} \left( \frac{0.5 \times D_z}{2 D_x \times D_y} \right)$
      - The following equation is solved to determine the temperature matrix $[T]_{tot\_blk \times 1}$:
        $[C]_{tot\_blk \times tot\_blk} \times [T]_{tot\_blk \times 1} = [P]_{tot\_blk \times 1}$
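To make the last step concrete, the sketch below counts the blocks of the thermal model and solves a tiny system of the form [C][T] = [P] with Gaussian elimination. The slides do not describe how [C] is assembled or which solver is used, so the 2 × 2 matrix, the power vector, and the function names are purely illustrative.

```python
from typing import List


def total_blocks(n_tiles: int) -> int:
    """tot_blk = 4 layers x n tiles + 4 spreader + 8 sink peripheral blocks."""
    return 4 * n_tiles + 12


def solve_linear(c: List[List[float]], p: List[float]) -> List[float]:
    """Solve C * T = P by Gaussian elimination with partial pivoting."""
    n = len(p)
    aug = [row[:] + [rhs] for row, rhs in zip(c, p)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[r][j] -= factor * aug[col][j]
    t = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        known = sum(aug[i][j] * t[j] for j in range(i + 1, n))
        t[i] = (aug[i][n] - known) / aug[i][i]
    return t


print(total_blocks(16))  # 76 blocks for a 4 x 4 mesh
temps = solve_linear([[2.0, -1.0], [-1.0, 2.0]], [1.0, 0.0])
print(temps)  # approximately [0.667, 0.333]
```

For a real chip the system has tot_blk unknowns, and the peak of the entries belonging to the Si-layer blocks would be the peak temperature used by the fitness function.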
  • 20. 7.2. Fitness calculation:
      - The fitness of each chromosome is evaluated using the following expression:
        $Fitness = w \times \frac{CommCost}{CommCost_{max}} + (1 - w) \times \frac{T_{PeakChip}}{T_{MaxMap}}$
      - With w = 0 the GA minimizes the chip temperature only, and with w = 1 it minimizes the communication cost only.
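The weighted expression translates directly into a small function. In this sketch the argument names are hypothetical, the two normalizing constants are assumed to be known upper bounds supplied by the caller, and a smaller value is treated as better, in line with the "minimizes" wording above. The numbers in the example are made up.

```python
def weighted_fitness(comm_cost: float, comm_cost_max: float,
                     t_peak_chip: float, t_max_map: float,
                     w: float) -> float:
    """w * CommCost / CommCost_max + (1 - w) * T_PeakChip / T_MaxMap."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return (w * comm_cost / comm_cost_max
            + (1.0 - w) * t_peak_chip / t_max_map)


# Illustrative values only: cost ratio 0.5, temperature ratio 0.875.
for w in (0.0, 0.5, 1.0):
    print(w, weighted_fitness(4000.0, 8000.0, 350.0, 400.0, w))
# 0.0 -> 0.875, 0.5 -> 0.6875, 1.0 -> 0.5
```

Normalizing both terms to the range 0 to 1 keeps the weight w meaningful even though communication cost and temperature have very different magnitudes.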
  • 21. 8. Simulation Results: 8.1. Comparison of communication cost for benchmark applications: The applications are mapped onto 2-D mesh structures with the mesh sizes noted in Table I.
      TABLE I: NoC Benchmarks and Their Mesh Sizes
      Benchmark NoC | No. of Cores | 2-D Mesh Size
      DVOPD | 32 | 8 × 4
      VOPD | 16 | 4 × 4
      MPEG-4 | 12 | 4 × 4
      PIP | 8 | 4 × 2
      MWD | 12 | 4 × 4
      263ENC MP3DEC | 12 | 4 × 4
      MP3ENC MP3DEC | 13 | 4 × 4
      263DEC MP3DEC | 14 | 4 × 4
  • 22. TABLE II: Comparison of Communication Cost (Hops × BW) for NoC Benchmarks
      Mapping Technique | DVOPD | VOPD | MPEG-4 | PIP | MWD | 263ENC MP3DEC | MP3ENC MP3DEC | 263DEC MP3DEC
      NMAP | 10253 | 4265 | 3672 | 640 | 184 | 230.407 | 18.171 | 20.073
      PMAP | - | 7054 | 6128 | 832 | - | - | - | -
      GMAP | - | 5553 | 7849 | 704 | - | - | - | -
      PBB | - | 4317 | 3763 | 640 | - | - | - | -
      MOCA | - | - | 5246 | - | - | - | - | -
      BMAP | - | 4351 | 6280 | - | - | - | - | -
      Onyx | - | 4249 | 3612 | - | - | - | - | -
      CHMAP | - | 4249 | 3977 | - | - | - | - | -
      CMAP | - | 4281 | 3704 | - | - | - | - | -
      Elixir | - | 4249 | 3640 | - | - | - | - | -
      LMAP | - | 4189 | 4006 | 640 | - | - | - | -
      CastNet | - | 4135 | 3852 | - | - | - | - | -
      GMAP | - | 4300 | 3600 | - | - | - | - | -
      GAMR | - | - | 3772 | - | - | - | - | -
      GBMAP | - | 4217 | 3572 | - | - | - | - | -
      ACO | - | - | 3633 | - | - | - | - | -
      PSMAP | 9752 | 4119 | 3567 | 640 | 120 | 230.407 | 17.021 | 19.823
      PSO | 9602 | 4119 | 3567 | 640 | 120 | 230.407 | 17.021 | 19.823
      ILP | - | 4119 | 3567 | 640 | 120 | 230.407 | 17.021 | 19.823
      Proposed | 9688 | 4135 | 3633 | 640 | 120 | 230.442 | 17.115 | 20.021
  • 23. 8.2. Latency and throughput for benchmark applications: The SystemC-based Noxim simulator was used to calculate network latency and throughput.
      TABLE III: Noxim Settings
      Parameter | Value
      Buffer Depth | 6
      Minimum and Maximum Packet Size | 64 flits (32 bits per flit)
      Routing | Dimension ordered (XY)
      Selection Logic | Random
      Warm-up Time | 10000 clock cycles
      Simulation Time | 20000 clock cycles
      Traffic | Table based
  • 24. (Contd.)
      TABLE IV: Latency and Throughput of NoC Benchmarks
      Benchmark Application | Latency (Cycles) | Throughput (Flits/Cycle)
      DVOPD | 80037.40 | 0.55
      VOPD | 81198.16 | 0.59
      MPEG-4 | 79972.50 | 0.61
      PIP | 89415.20 | 0.69
      MWD | 89936.70 | 0.62
      263ENC MP3DEC | 81962.10 | 0.75
      MP3ENC MP3DEC | 80911.80 | 0.63
      263DEC MP3DEC | 81850.10 | 0.62
  • 25. 8.3. Communication cost for bigger applications: To check the applicability of the GA-based approach to larger NoCs, the TGFF tool was used to generate a few task graphs with 64, 128, and 256 cores. The 64-core, 128-core, and 256-core NoCs are implemented on 8 × 8, 8 × 16, and 16 × 16 meshes respectively.
      TABLE V: Communication Cost (Hops × BW) for Different TGFF Task Graphs
      Size | Task Graph | NMAP | LMAP | PSMAP | PSO | Proposed
      64 Cores | G1 | 55244.17 | 51344.40 | 50947.00 | 45598.64 | 53728.00
      64 Cores | G2 | 44902.16 | 44005.16 | 42086.00 | 42086.00 | 43789.00
      128 Cores | G3 | 70168.36 | 70168.36 | 67508.53 | 56721.63 | 69195.73
      128 Cores | G4 | 343982.87 | 306761.00 | 285295.72 | 281405.75 | 315684.47
      256 Cores | G5 | - | - | - | - | 542126.00
      256 Cores | G6 | - | - | - | - | 655831.00
  • 26. 8.4. Comparison among different thermal mapping techniques:
      - To check the quality of the proposed GA-based approach for thermal-aware mapping, its results are compared with those of ILP, TAPP, and CoolMap.
      TABLE VI: Comparison Among Different Thermal Mapping Techniques (Comm. = communication cost, Temp. = temperature in Kelvin)
      NoC | Weight Factor (w) | ILP Comm. | ILP Temp. | CoolMap Comm. | CoolMap Temp. | TAPP Comm. | TAPP Temp. | Proposed Comm. | Proposed Temp.
      MWD | 0 | 1986 | 346.14 | 1986 | 346.14 | 1986 | 346.14 | 1632 | 348.82
      MWD | 0.5 | 1258 | 355.73 | 1258 | 355.73 | 1258 | 355.73 | 1312 | 348.70
      MWD | 1 | 1248 | 353.51 | 1248 | 353.51 | 1248 | 353.51 | 1248 | 359.68
      MPEG-4 | 0 | 7449.50 | 346.14 | 7449.50 | 346.14 | 7449.50 | 346.14 | 7849 | 338.84
      MPEG-4 | 0.5 | 3643 | 355.73 | 3643 | 355.73 | 3643 | 355.73 | 3672 | 341.46
      MPEG-4 | 1 | 3587 | 353.51 | 3587 | 353.51 | 3587 | 353.51 | 3633 | 343.33
      263ENC-MP3DEC | 0 | 551.61 | 346.14 | 551.61 | 346.14 | 551.61 | 346.14 | 270.51 | 350.33
      263ENC-MP3DEC | 0.5 | 310.94 | 355.73 | 310.94 | 355.73 | 310.94 | 355.73 | 230.46 | 351.52
      263ENC-MP3DEC | 1 | 230.43 | 353.51 | 230.43 | 353.51 | 230.43 | 353.51 | 230.44 | 351.50
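  • 27. 8.5. Communication cost and peak temperature of NoC benchmarks:
      TABLE VII: Communication Cost and Peak Temperature for Benchmark Applications
      NoC | No. of Cores | Weight Factor | Comm_Cost | Peak Temp. (Kelvin)
      DVOPD | 32 | 0 | 10182 | 348.57
      DVOPD | 32 | 0.5 | 10072 | 348.68
      DVOPD | 32 | 1 | 9688 | 352.35
      VOPD | 16 | 0 | 4356 | 344.39
      VOPD | 16 | 0.5 | 4183 | 345.48
      VOPD | 16 | 1 | 4135 | 348.91
      MPEG-4 | 12 | 0 | 7849 | 338.84
      MPEG-4 | 12 | 0.5 | 3672 | 341.46
      MPEG-4 | 12 | 1 | 3633 | 343.33
      MWD | 12 | 0 | 1632 | 348.82
      MWD | 12 | 0.5 | 1312 | 348.70
      MWD | 12 | 1 | 1248 | 359.67
      263ENC-MP3DEC | 12 | 0 | 270.51 | 350.33
      263ENC-MP3DEC | 12 | 0.5 | 230.46 | 351.52
      263ENC-MP3DEC | 12 | 1 | 230.44 | 351.50
      MP3ENC-MP3DEC | 13 | 0 | 19.60 | 343.85
      MP3ENC-MP3DEC | 13 | 0.5 | 19.46 | 343.87
      MP3ENC-MP3DEC | 13 | 1 | 17.12 | 344.20
      263DEC-MP3DEC | 14 | 0 | 20.42 | 342.78
      263DEC-MP3DEC | 14 | 0.5 | 20.32 | 344.80
      263DEC-MP3DEC | 14 | 1 | 20.02 | 350.87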
  • 28. Contd.
      - To check the applicability of the GA-based thermal-aware mapping approach at a larger scale, a few task graphs are generated using the TGFF tool.
      TABLE VIII: Communication Cost and Peak Temperature Reduction for Different TGFF Task Graphs
      Task Graph | Comm_Cost (Hops × BW) | Peak Temp. Reduction (Kelvin)
      Graph111 | 124732.77 | 92.55
      Graph112 | 718853.43 | 92.09
      Graph113 | 876083.87 | 96.97
      Graph114 | 182443.65 | 92.56
      Graph115 | 160572.93 | 94.38
      Graph116 | 20306.87 | 92.37
      Graph117 | 20306.87 | 97.57
      Graph118 | 221245.67 | 90.66
  • 29. 8.6. Trading off communication cost and peak temperature:
      - A trade-off is established between NoC peak temperature and communication cost. The figure on this slide shows the trade-off between communication cost and peak temperature for the benchmark application VOPD.
  • 30. 8.7. Imposing thermal safety by temperature constraints:
      - In this experiment, thermal safety is imposed by taking the peak temperature as a constraint. The experiment finds the mapping solution that fits within the temperature budget.
      TABLE IX: Communication Cost and Peak Temperature Constraints (temperatures in Kelvin)
      VOPD Tcons | VOPD Comm_Cost | VOPD Tpeak | DVOPD Tcons | DVOPD Comm_Cost | DVOPD Tpeak
      361 | 3612 | 358.87 | 360 | 9427 | 356.38
      359 | 4888 | 356.26 | 356 | 10486 | 354.23
      356 | 4899 | 351.07 | 359 | 10510 | 357.10
  • 31. 8.8. Dynamic simulation of thermal-aware mapping:
      - The Noxim simulator has been used for this simulation. A NoC is expected to have high throughput and low latency.
      TABLE X: Latency and Throughput of NoC Benchmarks
      Benchmark NoC | Latency (Cycles) | Throughput (Flits/Cycle)
      DVOPD | 83735.70 | 0.53
      VOPD | 82398.14 | 0.57
      MPEG-4 | 79998.50 | 0.59
      PIP | 89475.20 | 0.63
      MWD | 89963.70 | 0.61
      263ENC-MP3DEC | 81997.10 | 0.58
  • 32. 9. Conclusions:
      - The proposed mapping approach produces a reasonable improvement in communication cost compared to some of the previously reported strategies.
      - The simulation results show that the proposed strategy performs better than NMAP for NoCs with a higher number of cores.
      - The communication model used in the proposed approach assumes that a packet takes the same amount of time to traverse every router. In practice, this may not be true.
      - The proposed thermal-aware mapping approach has been found to improve the communication cost and peak temperature of the chip.
      - A trade-off has also been established between communication cost and peak temperature, so that designers can choose the solution that best suits their requirements.
      - Experimental results show that the proposed thermal-aware mapping approach outperforms many contemporary approaches reported in the literature.
  • 33. 10. Future Scope:
      - The proposed mapping strategy can be extended to mapping and routing for NoC architectures with other network topologies, such as ring and torus.
      - The proposed thermal-aware mapping approach can be extended to 3-D mapping strategies, and to fault-tolerant and reliability-aware mapping techniques for 2-D as well as 3-D NoC environments.
  • 34. 11. References:
      - [1] S. Murali and G. De Micheli, "Bandwidth-constrained mapping of cores onto NoC architectures," in Proc. Design, Automation and Test in Europe Conference and Exhibition, vol. 2, Feb. 2004, pp. 896–901.
      - [2] P. K. Sahu, K. Manna, T. Shah, and S. Chattopadhyay, "A constructive heuristic for application mapping onto mesh based network-on-chip," Journal of Circuits, Systems, and Computers, vol. 24, no. 8, 2015, 1550126 (29 pages).
      - [3] P. K. Sahu and S. Chattopadhyay, "A survey on application mapping strategies for network-on-chip design," J. Syst. Archit., vol. 59, 2013, pp. 60–76.
      - [4] P. K. Sahu, T. Shah, K. Manna, and S. Chattopadhyay, "Application mapping onto mesh-based network-on-chip using discrete particle swarm optimization," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 22, no. 2, Feb. 2014.
      - [5] J. Hu and R. Marculescu, "Energy-aware mapping for tile-based NoC architectures under performance constraints," in Proc. Asia South Pacific Des. Autom. Conf., 2003, pp. 233–239.
      - [6] M. Moazzen, A. Reza, and M. Reshadi, "CoolMap: A thermal-aware mapping algorithm for application specific networks-on-chip," in Proc. Euromicro Conf. Digital Syst. Des., Sep. 2012, pp. 731–734.
      - [7] D. Zhu, L. Chen, T. Pinkston, and M. Pedram, "TAPP: Temperature-aware application mapping for NoC-based many-core processors," in Proc. Des., Autom. Test Eur., 2015, pp. 1241–1244.
      - [8] W. Huang, S. Ghosh, S. Velusamy, K. Sankaranarayanan, K. Skadron, and M. Stan, "HotSpot: A compact thermal modeling methodology for early-stage VLSI design," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 14, no. 5, 2006, pp. 501–513.
      - [9] http://mehransoft.ir/wp-content/uploads/2014/05/Noxim_User_Guide.pdf